Refine v2 web activity shell by henrypark133 · Pull Request #2560 · nearai/ironclaw

henrypark133 · 2026-04-17T00:19:20Z

Summary

add an engine-v2-aware activity shell with a persistent compact jobs/missions ticker and tab badges for jobs/missions only
hide the legacy Routines tab in v2, keep mission progress derived from live thread activity, and rehydrate the shell from existing jobs/missions APIs
add focused browser coverage for the v2 shell and fix snapshot/SSE reconciliation issues for job and mission terminal states

Testing

cargo fmt
cargo clippy --all-targets --all-features -- -D warnings
node --check crates/ironclaw_gateway/static/app.js
python3 -m py_compile tests/e2e/scenarios/test_v2_activity_shell.py
tests/e2e/.venv/bin/pytest tests/e2e/scenarios/test_v2_activity_shell.py -v

Notes

WORKTREE_NOTES.md is intentionally left untracked and excluded.
One intermediate Playwright rerun failed due local disk exhaustion in the fixture build; the same focused file passed after rerunning once space/build artifacts were stable.

gemini-code-assist

Code Review

This pull request introduces an ActiveWorkStore and a corresponding UI active work strip to provide real-time visibility into background jobs and missions. It includes integration with SSE for status updates, tab badges for active task counts, and logic to hide legacy Routines in favor of Missions when Engine V2 is active. Review feedback suggests refining the activity bar's visibility logic, fixing indentation, localizing hardcoded strings, adopting more robust state management for thread status, and optimizing the polling mechanism for mission progress.

Copilot

Pull request overview

This PR refines the web “activity shell” when engine v2 is enabled by adding a persistent jobs/missions activity strip, tab badges for active work, and mission progress derived from live thread activity, plus targeted Playwright coverage for the new shell behavior.

Changes:

Expose engine_v2 in /api/gateway/status and use it in the frontend to hide the legacy Routines tab in v2.
Add a persistent “active work strip” (jobs + missions) with tab count badges and mission/thread progress rendering.
Add focused Playwright e2e coverage for the v2 shell (routines hidden; activity strip persists across tab switches).

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
tests/e2e/scenarios/test_v2_activity_shell.py	Adds Playwright coverage for v2 shell behavior (routines hidden, strip persists).
src/channels/web/server.rs	Adds `engine_v2` boolean to gateway status response for frontend gating.
crates/ironclaw_gateway/static/style.css	Styles tab badges, the active work strip, and mission progress states.
crates/ironclaw_gateway/static/index.html	Adds active work strip container and a Missions “Progress” column header.
crates/ironclaw_gateway/static/app.js	Implements v2-aware shell logic: tab normalization/hiding, persistent strip store, mission progress hydration and refresh.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ff6fe39cc6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

henrypark133 · 2026-04-17T00:52:03Z

Addressed the current web-shell review feedback in 8a23af97.

Resolved in this follow-up:

hide/localize the v2 activity strip and related labels
localize the new activity/status strings in the SSE/chat startup paths
replace blocked-state string matching with explicit entry state
fix active job snapshot reconciliation
batch thread metadata ingestion during loadThreads()
clear mission refresh timers/flags during connection cleanup
advance the mission refresh watermark even when a force-refresh finds an in-flight fetch
localize the Missions Progress header

I left the current 5s active-mission polling interval in place for now. The correctness issues around stale mappings / reschedule churn are fixed, but the broader polling-vs-push tradeoff still depends on adding better mission/thread linkage from the backend.

serrrfirat

Reviewed as paranoid architect. XSS protection is solid throughout — all innerHTML paths use escapeHtml or DOMPurify. Backend change is minimal (one bool field). Minor suggestions: truncate job_message content before storing in statusText, add eviction to the new Maps, and i18n the hardcoded 'Idle' string. None are blocking. Ship it.

claude · 2026-04-17T05:50:19Z

Code review

No issues found.

The PR refines the v2 web activity shell with new state management classes for tracking jobs and missions. Key observations:

Rust changes (server.rs): Minimal and safe — adds engine_v2 field to status response
JavaScript security: Uses escapeHtml() properly for all user-derived data in HTML strings
State management: Cache and timer cleanup are properly implemented; timers are nulled after completion
Test coverage: New E2E test scenario validates the activity shell behavior
Translations: i18n strings added for three locales

The implementation follows existing code patterns and maintains defensive null-checking throughout.

🤖 Generated with Claude Code

claude · 2026-04-17T05:51:02Z

Additional Review - Architectural Patterns

Found some architectural and pattern concerns:

[MEDIUM] String-based state normalization — JobActivityEntry.normalizeState() and MissionActivityEntry.normalizeState() use hardcoded string comparisons ('pending', 'in_progress', 'running', 'completed', etc.) instead of enums. Per CLAUDE.md "prefer strong types over strings," this is prone to silent failures on typos or API contract changes.
[MEDIUM] DRY violation in state logic — Both JobActivityEntry and MissionActivityEntry duplicate normalizeState(), formatStatus(), and shouldPreserveActiveStatus() methods with nearly identical logic. Consider extracting to a shared utility or mixin.
[MEDIUM] Module-scoped globals tightly coupled to ActiveWorkStore — engineV2Enabled, currentMissionData, missionDetailCache, etc. (lines 104–115) are global module variables that should be encapsulated as instance state in ActiveWorkStore. This makes testing state transitions in isolation difficult.
[MEDIUM] Nested conditionals for activity type dispatch — The render() method and SSE event handlers use inline if (item.kind === 'job') checks instead of polymorphism. Better to define variants and use visitor/dispatch patterns.
[LOW] Implicit state fallback in CSS — render() defaults to 'done' if state is missing (line 523), but CSS only styles known states. A new backend state could render silently.
[LOW] Test coverage gap — E2E test test_active_work_strip_survives_tab_switch() only verifies visibility, not that SSE events populate the bar with live status. Per testing rules, tests should drive SSE events (thinking, tool_started, etc.) and assert state transitions.

None of these block the PR, but they are architectural debts to consider in follow-ups.

claude · 2026-04-17T05:51:16Z

Performance & Production Issues

Found several performance and memory concerns that should be addressed:

Critical Issues

[HIGH:HIGH] Timer closure captures stale mission IDs

In scheduleActiveMissionMappingRefresh() (lines 6945-6956), the closure captures missionIds at scheduling time. If active missions change before the timer fires 5 seconds later, it refetches completed/removed missions while missing newly active ones. This causes unnecessary API calls and misses real-time updates.

Recommendation: Capture a snapshot of mission IDs in the timer callback itself, or use a volatile set that's re-queried at timer fire time.

[HIGH:MEDIUM] Unbounded cache growth in missionDetailCache

missionDetailCache (line 108) has no eviction policy. Over long sessions, the cache grows indefinitely with each mission object (including all thread metadata). This causes memory leaks proportional to total missions seen.

Recommendation: Implement an LRU eviction policy or cap cache size at ~100-200 entries.

Performance Issues

[MEDIUM:HIGH] DOM querySelectorAll in hot path without batching

refreshMissionProgressViews() (lines 6967-6981) performs 3 separate DOM scans and sets innerHTML on each node individually. When called from requestAnimationFrame during rapid SSE events, this causes excessive reflow/repaint cycles.

Recommendation: Batch DOM updates or use targeted class toggling instead of full innerHTML replacement.

[MEDIUM:MEDIUM] Activity bar recreates entire DOM tree on any mission update

ActiveWorkStore.render() (line 517) rebuilds the entire button tree via innerHTML = ...items.map(...).join(''). Even single status changes tear down and rebuild all buttons, losing focus state and regenerating DOM nodes.

Recommendation: Use DOM diffing or maintain button nodes and update their content/classes instead.

[MEDIUM:MEDIUM] Mission refresh cascade from SSE events

Each SSE event (thinking, tool_started, etc.) triggers scheduleMissionProgressViewsRefresh() → timer → multiple fetchMissionDetailForProgress() calls. With rapid events, timers queue up and create API spikes despite the 5-second throttle.

Recommendation: Use a debounced refresh with a longer idle period before triggering fetches.

Test Coverage

The E2E test doesn't validate:

Long-session memory behavior with mission churn
Concurrent SSE events + timer interactions
New missions tracked after initial timer fires

Recommendation: Add scenario tests for session memory growth and concurrent event handling.

Summary: Issues #1 and #2 should block merge (memory leak + missed mission tracking). Issues #3-5 are performance optimizations that improve UX but don't break functionality.

claude · 2026-04-17T05:51:38Z

Security Review

No critical security vulnerabilities found. The implementation properly handles untrusted data.

Key Security Findings

✅ Proper XSS prevention: All untrusted SSE event data (statusText, messages, names, etc.) is properly escaped with escapeHtml() before DOM insertion.

✅ Safe i18n substitution: Parameter substitution in i18n strings is escaped before rendering.

✅ HTML rendering patterns: Functions like renderMissionProgressMarkup(), refreshMissionProgressViews(), and activity bar rendering all follow safe escaping practices.

Minor Recommendation

[MEDIUM:LOW] Input validation at store level

The statusText field stores untrusted SSE event data (tool names, job messages, status messages) without validation. While output escaping prevents XSS, adding input validation creates defense-in-depth and reduces future regression risk.

Recommendation: Add light validation at ActivityEntry intake time:

Trim length of statusText (max 200 chars)
Validate format if possible
This reduces maintenance burden if render paths are added later

Summary: Security posture is solid. This is a low-priority hardening improvement.

Refine v2 web activity shell

ff6fe39

Copilot AI review requested due to automatic review settings April 17, 2026 00:19

github-actions bot added scope: channel/web Web gateway channel size: XL 500+ changed lines risk: medium Business logic, config, or moderate-risk modules contributor: core 20+ merged PRs labels Apr 17, 2026

Copilot started reviewing on behalf of henrypark133 April 17, 2026 00:19 View session

gemini-code-assist bot reviewed Apr 17, 2026

View reviewed changes

Comment thread crates/ironclaw_gateway/static/app.js Outdated

Comment thread crates/ironclaw_gateway/static/app.js Outdated

Comment thread crates/ironclaw_gateway/static/app.js Outdated

Comment thread crates/ironclaw_gateway/static/app.js

Copilot AI reviewed Apr 17, 2026

View reviewed changes

Comment thread crates/ironclaw_gateway/static/app.js

Comment thread crates/ironclaw_gateway/static/app.js

Comment thread crates/ironclaw_gateway/static/index.html Outdated

Comment thread crates/ironclaw_gateway/static/app.js Outdated

chatgpt-codex-connector bot reviewed Apr 17, 2026

View reviewed changes

Comment thread crates/ironclaw_gateway/static/app.js Outdated

Polish v2 web activity shell

8a23af9

serrrfirat approved these changes Apr 17, 2026

View reviewed changes

serrrfirat merged commit 4308628 into staging Apr 17, 2026
15 checks passed

serrrfirat deleted the henry/2542-routine-web-ux branch April 17, 2026 05:34

github-actions bot mentioned this pull request Apr 17, 2026

chore: promote staging to staging-promote/fe5cdcee-24548466471 (2026-04-17 05:46 UTC) #2567

Merged

Conversation

henrypark133 commented Apr 17, 2026

Summary

Testing

Notes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

henrypark133 commented Apr 17, 2026

Uh oh!

serrrfirat left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

claude bot commented Apr 17, 2026

Code review

Uh oh!

claude bot commented Apr 17, 2026

Additional Review - Architectural Patterns

Uh oh!

claude bot commented Apr 17, 2026

Performance & Production Issues

Critical Issues

Performance Issues

Test Coverage

Uh oh!

claude bot commented Apr 17, 2026

Security Review

Key Security Findings

Minor Recommendation

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants