Skip to content

Refine v2 web activity shell#2560

Merged
serrrfirat merged 2 commits intostagingfrom
henry/2542-routine-web-ux
Apr 17, 2026
Merged

Refine v2 web activity shell#2560
serrrfirat merged 2 commits intostagingfrom
henry/2542-routine-web-ux

Conversation

@henrypark133
Copy link
Copy Markdown
Collaborator

Summary

  • add an engine-v2-aware activity shell with a persistent compact jobs/missions ticker and tab badges for jobs/missions only
  • hide the legacy Routines tab in v2, keep mission progress derived from live thread activity, and rehydrate the shell from existing jobs/missions APIs
  • add focused browser coverage for the v2 shell and fix snapshot/SSE reconciliation issues for job and mission terminal states

Testing

  • cargo fmt
  • cargo clippy --all-targets --all-features -- -D warnings
  • node --check crates/ironclaw_gateway/static/app.js
  • python3 -m py_compile tests/e2e/scenarios/test_v2_activity_shell.py
  • tests/e2e/.venv/bin/pytest tests/e2e/scenarios/test_v2_activity_shell.py -v

Notes

  • WORKTREE_NOTES.md is intentionally left untracked and excluded.
  • One intermediate Playwright rerun failed due local disk exhaustion in the fixture build; the same focused file passed after rerunning once space/build artifacts were stable.

Copilot AI review requested due to automatic review settings April 17, 2026 00:19
@github-actions github-actions bot added scope: channel/web Web gateway channel size: XL 500+ changed lines risk: medium Business logic, config, or moderate-risk modules contributor: core 20+ merged PRs labels Apr 17, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces an ActiveWorkStore and a corresponding UI active work strip to provide real-time visibility into background jobs and missions. It includes integration with SSE for status updates, tab badges for active task counts, and logic to hide legacy Routines in favor of Missions when Engine V2 is active. Review feedback suggests refining the activity bar's visibility logic, fixing indentation, localizing hardcoded strings, adopting more robust state management for thread status, and optimizing the polling mechanism for mission progress.

Comment thread crates/ironclaw_gateway/static/app.js Outdated
Comment thread crates/ironclaw_gateway/static/app.js Outdated
Comment thread crates/ironclaw_gateway/static/app.js Outdated
Comment thread crates/ironclaw_gateway/static/app.js
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refines the web “activity shell” when engine v2 is enabled by adding a persistent jobs/missions activity strip, tab badges for active work, and mission progress derived from live thread activity, plus targeted Playwright coverage for the new shell behavior.

Changes:

  • Expose engine_v2 in /api/gateway/status and use it in the frontend to hide the legacy Routines tab in v2.
  • Add a persistent “active work strip” (jobs + missions) with tab count badges and mission/thread progress rendering.
  • Add focused Playwright e2e coverage for the v2 shell (routines hidden; activity strip persists across tab switches).

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/e2e/scenarios/test_v2_activity_shell.py Adds Playwright coverage for v2 shell behavior (routines hidden, strip persists).
src/channels/web/server.rs Adds engine_v2 boolean to gateway status response for frontend gating.
crates/ironclaw_gateway/static/style.css Styles tab badges, the active work strip, and mission progress states.
crates/ironclaw_gateway/static/index.html Adds active work strip container and a Missions “Progress” column header.
crates/ironclaw_gateway/static/app.js Implements v2-aware shell logic: tab normalization/hiding, persistent strip store, mission progress hydration and refresh.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread crates/ironclaw_gateway/static/app.js
Comment thread crates/ironclaw_gateway/static/app.js
Comment thread crates/ironclaw_gateway/static/index.html Outdated
Comment thread crates/ironclaw_gateway/static/app.js Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ff6fe39cc6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread crates/ironclaw_gateway/static/app.js Outdated
Copy link
Copy Markdown
Collaborator Author

Addressed the current web-shell review feedback in 8a23af97.

Resolved in this follow-up:

  • hide/localize the v2 activity strip and related labels
  • localize the new activity/status strings in the SSE/chat startup paths
  • replace blocked-state string matching with explicit entry state
  • fix active job snapshot reconciliation
  • batch thread metadata ingestion during loadThreads()
  • clear mission refresh timers/flags during connection cleanup
  • advance the mission refresh watermark even when a force-refresh finds an in-flight fetch
  • localize the Missions Progress header

I left the current 5s active-mission polling interval in place for now. The correctness issues around stale mappings / reschedule churn are fixed, but the broader polling-vs-push tradeoff still depends on adding better mission/thread linkage from the backend.

Copy link
Copy Markdown
Collaborator

@serrrfirat serrrfirat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed as paranoid architect. XSS protection is solid throughout — all innerHTML paths use escapeHtml or DOMPurify. Backend change is minimal (one bool field). Minor suggestions: truncate job_message content before storing in statusText, add eviction to the new Maps, and i18n the hardcoded 'Idle' string. None are blocking. Ship it.

@serrrfirat serrrfirat merged commit 4308628 into staging Apr 17, 2026
15 checks passed
@serrrfirat serrrfirat deleted the henry/2542-routine-web-ux branch April 17, 2026 05:34
@claude
Copy link
Copy Markdown

claude bot commented Apr 17, 2026

Code review

No issues found.

The PR refines the v2 web activity shell with new state management classes for tracking jobs and missions. Key observations:

  • Rust changes (server.rs): Minimal and safe — adds engine_v2 field to status response
  • JavaScript security: Uses escapeHtml() properly for all user-derived data in HTML strings
  • State management: Cache and timer cleanup are properly implemented; timers are nulled after completion
  • Test coverage: New E2E test scenario validates the activity shell behavior
  • Translations: i18n strings added for three locales

The implementation follows existing code patterns and maintains defensive null-checking throughout.

🤖 Generated with Claude Code

@claude
Copy link
Copy Markdown

claude bot commented Apr 17, 2026

Additional Review - Architectural Patterns

Found some architectural and pattern concerns:

  1. [MEDIUM] String-based state normalizationJobActivityEntry.normalizeState() and MissionActivityEntry.normalizeState() use hardcoded string comparisons ('pending', 'in_progress', 'running', 'completed', etc.) instead of enums. Per CLAUDE.md "prefer strong types over strings," this is prone to silent failures on typos or API contract changes.

  2. [MEDIUM] DRY violation in state logic — Both JobActivityEntry and MissionActivityEntry duplicate normalizeState(), formatStatus(), and shouldPreserveActiveStatus() methods with nearly identical logic. Consider extracting to a shared utility or mixin.

  3. [MEDIUM] Module-scoped globals tightly coupled to ActiveWorkStoreengineV2Enabled, currentMissionData, missionDetailCache, etc. (lines 104–115) are global module variables that should be encapsulated as instance state in ActiveWorkStore. This makes testing state transitions in isolation difficult.

  4. [MEDIUM] Nested conditionals for activity type dispatch — The render() method and SSE event handlers use inline if (item.kind === 'job') checks instead of polymorphism. Better to define variants and use visitor/dispatch patterns.

  5. [LOW] Implicit state fallback in CSSrender() defaults to 'done' if state is missing (line 523), but CSS only styles known states. A new backend state could render silently.

  6. [LOW] Test coverage gap — E2E test test_active_work_strip_survives_tab_switch() only verifies visibility, not that SSE events populate the bar with live status. Per testing rules, tests should drive SSE events (thinking, tool_started, etc.) and assert state transitions.


None of these block the PR, but they are architectural debts to consider in follow-ups.

@claude
Copy link
Copy Markdown

claude bot commented Apr 17, 2026

Performance & Production Issues

Found several performance and memory concerns that should be addressed:

Critical Issues

[HIGH:HIGH] Timer closure captures stale mission IDs

In scheduleActiveMissionMappingRefresh() (lines 6945-6956), the closure captures missionIds at scheduling time. If active missions change before the timer fires 5 seconds later, it refetches completed/removed missions while missing newly active ones. This causes unnecessary API calls and misses real-time updates.

Recommendation: Capture a snapshot of mission IDs in the timer callback itself, or use a volatile set that's re-queried at timer fire time.


[HIGH:MEDIUM] Unbounded cache growth in missionDetailCache

missionDetailCache (line 108) has no eviction policy. Over long sessions, the cache grows indefinitely with each mission object (including all thread metadata). This causes memory leaks proportional to total missions seen.

Recommendation: Implement an LRU eviction policy or cap cache size at ~100-200 entries.


Performance Issues

[MEDIUM:HIGH] DOM querySelectorAll in hot path without batching

refreshMissionProgressViews() (lines 6967-6981) performs 3 separate DOM scans and sets innerHTML on each node individually. When called from requestAnimationFrame during rapid SSE events, this causes excessive reflow/repaint cycles.

Recommendation: Batch DOM updates or use targeted class toggling instead of full innerHTML replacement.


[MEDIUM:MEDIUM] Activity bar recreates entire DOM tree on any mission update

ActiveWorkStore.render() (line 517) rebuilds the entire button tree via innerHTML = ...items.map(...).join(''). Even single status changes tear down and rebuild all buttons, losing focus state and regenerating DOM nodes.

Recommendation: Use DOM diffing or maintain button nodes and update their content/classes instead.


[MEDIUM:MEDIUM] Mission refresh cascade from SSE events

Each SSE event (thinking, tool_started, etc.) triggers scheduleMissionProgressViewsRefresh() → timer → multiple fetchMissionDetailForProgress() calls. With rapid events, timers queue up and create API spikes despite the 5-second throttle.

Recommendation: Use a debounced refresh with a longer idle period before triggering fetches.


Test Coverage

The E2E test doesn't validate:

  • Long-session memory behavior with mission churn
  • Concurrent SSE events + timer interactions
  • New missions tracked after initial timer fires

Recommendation: Add scenario tests for session memory growth and concurrent event handling.


Summary: Issues #1 and #2 should block merge (memory leak + missed mission tracking). Issues #3-5 are performance optimizations that improve UX but don't break functionality.

@claude
Copy link
Copy Markdown

claude bot commented Apr 17, 2026

Security Review

No critical security vulnerabilities found. The implementation properly handles untrusted data.

Key Security Findings

Proper XSS prevention: All untrusted SSE event data (statusText, messages, names, etc.) is properly escaped with escapeHtml() before DOM insertion.

Safe i18n substitution: Parameter substitution in i18n strings is escaped before rendering.

HTML rendering patterns: Functions like renderMissionProgressMarkup(), refreshMissionProgressViews(), and activity bar rendering all follow safe escaping practices.


Minor Recommendation

[MEDIUM:LOW] Input validation at store level

The statusText field stores untrusted SSE event data (tool names, job messages, status messages) without validation. While output escaping prevents XSS, adding input validation creates defense-in-depth and reduces future regression risk.

Recommendation: Add light validation at ActivityEntry intake time:

  • Trim length of statusText (max 200 chars)
  • Validate format if possible
  • This reduces maintenance burden if render paths are added later

Summary: Security posture is solid. This is a low-priority hardening improvement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

contributor: core 20+ merged PRs risk: medium Business logic, config, or moderate-risk modules scope: channel/web Web gateway channel size: XL 500+ changed lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants