feat: new-project skill and template ref resolution for parallel tool calls by ilblackdragon · Pull Request #2353 · nearai/ironclaw

ilblackdragon · 2026-04-12T06:14:19Z

Summary

New-project skill: Renamed project-management → new-project so users can invoke /new-project <description>. Rewrote skill to use memory_write + mission_create directly instead of referencing nonexistent project_create/project_update tools. Includes goals and metrics when the project warrants them. Instructs sequential tool execution.
Template ref resolution: Some OpenAI-format models (e.g. Qwen) emit {{call_id.field}} template references in parallel tool call arguments. Added a resolution pass in LlmBridgeAdapter that resolves these from prior tool results in conversation history, preventing orphaned/broken mission creates.
Gateway enhancements: Project metrics types, mission cadence scheduling UI, and frontend project views with metrics and goal tracking.

Motivated by trace analysis (trace_20260411T133641.json) where Qwen 3.5-122B failed to create a project correctly via the old skill.

Project Detail Page

Test plan

All 30 bridge::llm_adapter tests pass (8 new template resolution tests)
cargo clippy clean
E2E screenshot test (test_project_detail.py) passes
Manual test: /new-project <description> creates workspace files and missions correctly
Verify template resolution with a Qwen model making parallel tool calls

🤖 Generated with Claude Code

gemini-code-assist

Code Review

This pull request introduces a 'Projects Control Room' (Engine V2) to facilitate autonomous project management, incorporating goals, metrics, and project-scoped missions. Key updates include new backend data structures, API endpoints for project overviews and dynamic widget discovery, and an extensive frontend dashboard. Feedback points out a mismatch between the project creation skill and the backend's UUID requirement, a logic flaw in template resolution that breaks early on unresolvable patterns, and missing fields in the project overview DTO required by the frontend. Additionally, improvements were suggested for metric update logic and CSS scoping for project-specific widgets to prevent UI leakage.

gemini-code-assist · 2026-04-12T06:16:57Z

+Create recurring missions scoped to the project. Use the **slug** as `project_id`:
+
+```
+mission_create(name: "...", goal: "...", cadence: "daily", project_id: "{slug}")
+```


There is a conflict between the skill instructions and the mission_create tool implementation. The skill instructs the agent to use the project slug as the project_id, but the mission_create tool (in effect_adapter.rs) expects a valid UUID. If the agent passes a slug, uuid::Uuid::parse_str will fail, and the mission will incorrectly default to the current thread's project instead of the intended one.

Suggested change

Create recurring missions scoped to the project. Use the **slug** as `project_id`:

```

mission_create(name: "...", goal: "...", cadence: "daily", project_id: "{slug}")

```

Create recurring missions scoped to the project. Use the project_id (UUID) returned by the project_create tool:

mission_create(name: "...", goal: "...", cadence: "daily", project_id: "{project_id}")

Fixed in b7fe8cc — mission_create now accepts project names/slugs and resolves them by matching against the user's projects. SKILL.md updated to use project name instead of slug.

gemini-code-assist · 2026-04-12T06:16:58Z

+    for _ in 0..50 {
+        let Some(start) = result.find("{{") else {
+            break;
+        };
+        let Some(end) = result[start..].find("}}") else {
+            break;
+        };
+        let end = start + end;
+        let ref_str = &result[start + 2..end]; // e.g. "chatcmpl-tool-9816a462feb22da1.project_id"
+
+        let resolved = if let Some(dot_pos) = ref_str.rfind('.') {
+            let call_id = &ref_str[..dot_pos];
+            let field = &ref_str[dot_pos + 1..];
+            tool_results
+                .iter()
+                .find(|(id, _)| id == call_id)
+                .and_then(|(_, json)| json.get(field))
+                .map(|v| match v {
+                    serde_json::Value::String(s) => s.clone(),
+                    other => other.to_string(),
+                })
+        } else {
+            None
+        };
+
+        match resolved {
+            Some(val) => {
+                result.replace_range(start..end + 2, &val);
+            }
+            None => {
+                // Can't resolve — stop to avoid infinite loop on the same pattern
+                break;
+            }
+        }
+    }


The resolve_template_refs function stops processing all templates in a string if it encounters a single unresolvable pattern (e.g., a non-existent call_id or a string that happens to contain {{ but isn't a template). This is because the loop breaks when resolved is None. It should instead skip the unresolvable pattern and continue searching for subsequent templates.

gemini-code-assist · 2026-04-12T06:16:58Z

+  if (proj && (proj.goals && proj.goals.length || proj.metrics && proj.metrics.length)) {
+    var gmHtml = '';
+    if (proj.goals && proj.goals.length) {
+      gmHtml += '<div class="cr-goals"><div class="cr-section-title">Goals</div>';
+      proj.goals.forEach(function(g) {
+        gmHtml += '<div class="cr-goal-item">' + escapeHtml(g) + '</div>';
+      });
+      gmHtml += '</div>';
+    }
+    // Metrics would come from project detail; overview doesn't include them yet.
+    document.getElementById('cr-drill-header').innerHTML += gmHtml;
+  }


The drillIntoProject function attempts to display project goals and metrics using the proj object from the cached overview. However, the ProjectOverviewEntry DTO returned by the /api/engine/projects/overview endpoint does not include goals or metrics fields. Consequently, these sections will never be rendered in the drill-in view. You should either include these fields in the overview response or fetch the full project detail in this function.

gemini-code-assist · 2026-04-12T06:16:58Z

+                if let Some(metrics) = params.get("metrics").and_then(|v| v.as_array()) {
+                    project.metrics = metrics
+                        .iter()
+                        .filter_map(|v| serde_json::from_value(v.clone()).ok())
+                        .collect();
+                }


When updating project metrics, ensure the logic sums the values with existing ones rather than overwriting them. Additionally, if a timestamp is missing, do not fall back to the current time (Utc::now()) to avoid issues with resource cleanup logic; instead, log a warning and skip the update for that metric. Please add tests to cover multi-event scenarios.

if let Some(new_metrics) = params.get("metrics").and_then(|v| v.as_array()) { for v in new_metrics { if let Ok(new_m) = serde_json::from_value::<ironclaw_engine::ProjectMetric>(v.clone()) { if let Some(existing) = project.metrics.iter_mut().find(|m| m.name == new_m.name) { existing.value += new_m.value; } else if new_m.updated_at.is_some() { project.metrics.push(new_m); } else { tracing::warn!("Skipping metric update due to missing timestamp: {}", new_m.name); } } } }

References

When accumulating metrics from multiple events, ensure the logic correctly sums the values rather than overwriting them. Add tests to cover multi-event scenarios.

When calculating the age of a resource for cleanup, if the creation timestamp is unknown or cannot be parsed, treat it as un-reapable by logging a warning and skipping it. Do not fall back to the current time.

gemini-code-assist · 2026-04-12T06:16:58Z

+        if (w.css) {
+          var style = document.createElement('style');
+          style.setAttribute('data-widget', manifest.id);
+          style.textContent = w.css; // Already scoped server-side? No — project widgets come raw.
+          // Scope it client-side using the same prefix convention.
+          var prefix = '[data-widget="' + manifest.id + '"]';
+          // Simple client-side scoping: prepend prefix to each rule.
+          // For production, use the server's scope_css. For now, inject as-is
+          // with the data-widget attribute providing natural scoping.
+          document.head.appendChild(style);
+        }


Project widgets are injected with their CSS added directly to the document head without scoping. If a widget contains global CSS rules (e.g., targeting body, h2, or common class names like .btn), it will leak and affect the entire application UI. Consider implementing the client-side scoping mentioned in the comments or ensuring the server-side scope_css logic is applied to project-specific widgets.

Copilot

Pull request overview

Adds engine-v2 “Projects control room” UX and supporting engine APIs, plus LLM-side tool-call argument template-reference resolution, and introduces a new /new-project skill.

Changes:

Added project overview + project widgets endpoints and updated gateway status to expose engine_v2_enabled.
Implemented project CRUD actions in the engine bridge and added project goals/metrics types.
Added a template-ref resolution pass in LlmBridgeAdapter for {{call_id.field}} patterns in tool-call JSON.

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 10 comments.

Show a summary per file

File	Description
src/channels/web/server.rs	Adds new engine routes (`/projects/overview`, `/projects/{id}/widgets`) and exposes `engine_v2_enabled` in gateway status.
src/channels/web/handlers/frontend.rs	Implements `project_widgets_handler` to discover/load project-scoped widgets from workspace.
src/channels/web/handlers/engine.rs	Adds optional `project_id` query filter for missions/threads and a new overview handler.
src/bridge/router.rs	Registers project capability/actions and implements projects overview aggregation + project DTO expansions (goals/metrics).
src/bridge/mod.rs	Re-exports newly added overview DTOs and query function.
src/bridge/llm_adapter.rs	Resolves `{{call_id.field}}` references in tool-call parameters and adds unit tests.
src/bridge/effect_adapter.rs	Adds handlers for `project_create`/`project_update`/`project_list` and supports `project_id` override on mission creation.
skills/new-project/SKILL.md	Introduces the `new-project` skill definition and procedure.
crates/ironclaw_gateway/static/style.css	Adds styling for the Projects control room UI.
crates/ironclaw_gateway/static/index.html	Replaces legacy missions tab with Projects tab (v2) and updates tab labels/buttons.
crates/ironclaw_gateway/static/i18n/en.js	Renames “Memory” tab to “Workspace” and adds “Projects” label.
crates/ironclaw_gateway/static/app.js	Implements Projects control room behavior, overview/drill-in/detail rendering, and widget loading.
crates/ironclaw_gateway/src/widget.rs	Adds new widget slots (`ProjectHeader`, `ProjectSection`) and updates slot serialization tests.
crates/ironclaw_gateway/src/bundle.rs	Makes `ResolvedWidget` serializable for JSON responses.
crates/ironclaw_engine/src/types/project.rs	Adds `ProjectMetric` + goals/metrics fields on `Project`.
crates/ironclaw_engine/src/runtime/mission.rs	Exposes `MissionManager::store()` for project CRUD via the mission manager.
crates/ironclaw_engine/src/lib.rs	Re-exports `ProjectMetric`.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-12T06:19:38Z

+Create recurring missions scoped to the project. Use the **slug** as `project_id`:
+
+```
+mission_create(name: "...", goal: "...", cadence: "daily", project_id: "{slug}")
+```


The skill instructs using the workspace slug as project_id for mission_create, but the engine expects project_id to be a UUID (see mission/project IDs parsed with Uuid::parse_str). As written, mission creation will fail to scope to the new project and will fall back to the current thread’s project (often Default). Update the skill to create the project first (e.g. via project_create) and then pass the returned UUID project_id, while still using the slug only for workspace paths.

Copilot · 2026-04-12T06:19:39Z

+        // Eval the JS module to register the widget.
+        try {
+          var api = typeof IronClaw !== 'undefined' ? IronClaw.api : null;
+          var fn = new Function('container', 'api', 'projectId', w.js);
+          fn(container, api, projectId);
+


new Function(...) is effectively eval and will be blocked by the gateway’s CSP (script-src does not include 'unsafe-eval'). This means project widgets will fail to mount in production. To keep CSP strict, load widget JS by injecting a <script type="module"> element carrying the page’s existing CSP nonce (or otherwise execute through a nonce-authorized script path) instead of using new Function.

Suggested change

// Eval the JS module to register the widget.

try {

var api = typeof IronClaw !== 'undefined' ? IronClaw.api : null;

var fn = new Function('container', 'api', 'projectId', w.js);

fn(container, api, projectId);

// Execute the widget through a nonce-authorized module script so it

// remains compatible with a strict CSP that disallows unsafe-eval.

try {

var api = typeof IronClaw !== 'undefined' ? IronClaw.api : null;

var nonceSource = document.querySelector('script[nonce]');

var nonce = nonceSource ? nonceSource.nonce || nonceSource.getAttribute('nonce') : '';

var widgetToken = 'cr-widget-' + manifest.id + '-' + Date.now() + '-' + Math.random().toString(36).slice(2);

var widgetRegistry = window.__ironclawProjectWidgetMounts || (window.__ironclawProjectWidgetMounts = {});

var script = document.createElement('script');

widgetRegistry[widgetToken] = {

container: container,

api: api,

projectId: projectId

};

script.type = 'module';

if (nonce) script.setAttribute('nonce', nonce);

script.setAttribute('data-widget', manifest.id);

script.addEventListener('error', function(err) {

delete widgetRegistry[widgetToken];

script.remove();

console.error('[projects] Failed to mount widget ' + manifest.id + ':', err);

container.innerHTML = '<div class="cr-empty">Widget error: ' + manifest.id + '</div>';

});

script.textContent =

'(function(){\n' +

' var registry = window.__ironclawProjectWidgetMounts || {};\n' +

' var ctx = registry[' + JSON.stringify(widgetToken) + '];\n' +

' if (!ctx) return;\n' +

' try {\n' +

' (function(container, api, projectId) {\n' +

w.js + '\n' +

' })(ctx.container, ctx.api, ctx.projectId);\n' +

' } finally {\n' +

' delete registry[' + JSON.stringify(widgetToken) + '];\n' +

' document.currentScript.remove();\n' +

' }\n' +

'})();';

document.head.appendChild(script);

Copilot · 2026-04-12T06:19:39Z

+pub async fn project_widgets_handler(
+    State(state): State<Arc<GatewayState>>,
+    AuthenticatedUser(user): AuthenticatedUser,
+    Path(project_id): Path<String>,
+) -> Result<Json<Vec<ResolvedWidget>>, (StatusCode, String)> {
+    // Resolve project name to derive the workspace slug.
+    let project = crate::bridge::get_engine_project(&project_id, &user.user_id)
+        .await
+        .map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?
+        .ok_or((StatusCode::NOT_FOUND, "Project not found".to_string()))?;
+
+    let slug = project
+        .name
+        .to_lowercase()
+        .replace(|c: char| !c.is_alphanumeric() && c != '-', "-");
+    let widgets_dir = format!("projects/{slug}/.system/widgets/");


Project widget lookup derives the workspace slug from project.name at request time. This makes widget discovery fragile: renaming a project (or creating workspace folders with a different slug) will break widget loading because the directory path changes. Consider persisting a stable workspace slug/path in the project record (metadata) and using that here, rather than recomputing from the display name.

Suggested change

pub async fn project_widgets_handler(

State(state): State<Arc<GatewayState>>,

AuthenticatedUser(user): AuthenticatedUser,

Path(project_id): Path<String>,

) -> Result<Json<Vec<ResolvedWidget>>, (StatusCode, String)> {

// Resolve project name to derive the workspace slug.

let project = crate::bridge::get_engine_project(&project_id, &user.user_id)

.await

.map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?

.ok_or((StatusCode::NOT_FOUND, "Project not found".to_string()))?;

let slug = project

.name

.to_lowercase()

.replace(|c: char| !c.is_alphanumeric() && c != '-', "-");

let widgets_dir = format!("projects/{slug}/.system/widgets/");

fn project_name_slug(name: &str) -> String {

name.to_lowercase()

.replace(|c: char| !c.is_alphanumeric() && c != '-', "-")

}

fn project_widgets_dir_from_metadata(project: &impl serde::Serialize, fallback_name: &str) -> String {

let fallback = format!("projects/{}/.system/widgets/", project_name_slug(fallback_name));

let Ok(value) = serde_json::to_value(project) else {

return fallback;

};

let Some(metadata) = value.get("metadata") else {

return fallback;

};

let candidate = metadata

.get("workspace_slug")

.and_then(|v| v.as_str())

.or_else(|| metadata.get("workspaceSlug").and_then(|v| v.as_str()))

.or_else(|| metadata.get("workspace_path").and_then(|v| v.as_str()))

.or_else(|| metadata.get("workspacePath").and_then(|v| v.as_str()));

let Some(raw) = candidate.map(str::trim).filter(|s| !s.is_empty()) else {

return fallback;

};

let normalized = raw.trim_matches('/');

if normalized.is_empty() {

return fallback;

}

let project_root = if normalized.starts_with("projects/") {

normalized.to_string()

} else {

format!("projects/{normalized}")

};

format!("{project_root}/.system/widgets/")

}

pub async fn project_widgets_handler(

State(state): State<Arc<GatewayState>>,

AuthenticatedUser(user): AuthenticatedUser,

Path(project_id): Path<String>,

) -> Result<Json<Vec<ResolvedWidget>>, (StatusCode, String)> {

// Prefer a persisted workspace slug/path from project metadata so widget

// lookup remains stable across project renames. Fall back to the historic

// name-derived slug for backward compatibility with older project records.

let project = crate::bridge::get_engine_project(&project_id, &user.user_id)

.await

.map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?

.ok_or((StatusCode::NOT_FOUND, "Project not found".to_string()))?;

let widgets_dir = project_widgets_dir_from_metadata(&project, &project.name);

Copilot · 2026-04-12T06:19:39Z

+    let workspace = resolve_workspace(&state, &user).await?;
+    let layout = read_layout_config(&workspace).await;
+
+    let entries = workspace.list(&widgets_dir).await.unwrap_or_default();


workspace.list(&widgets_dir).await.unwrap_or_default() silently treats backend listing failures the same as “no widgets”. In this file, other widget discovery paths log a warn! on real workspace errors to avoid hiding operational issues. Consider matching that behavior here (log + empty fallback, or propagate a 500) so widget outages are diagnosable.

Suggested change

let entries = workspace.list(&widgets_dir).await.unwrap_or_default();

let entries = match workspace.list(&widgets_dir).await {

Ok(entries) => entries,

Err(error) => {

tracing::warn!(

"failed to list widgets in workspace at {}: {}",

widgets_dir,

error

);

Vec::new()

}

};

Fixed in b7fe8cc — now logs at debug! level before falling back to empty list.

Copilot · 2026-04-12T06:19:39Z

+        let css = workspace
+            .read(&format!("{widgets_dir}{name}/style.css"))
+            .await
+            .ok()
+            .map(|d| d.content)
+            .filter(|c| !c.trim().is_empty() && c.len() <= MAX_WIDGET_CSS_BYTES);
+


Project widget CSS is returned and later injected without running through the existing scope_css isolation used for global widgets. This allows a project widget’s style.css to affect the entire gateway UI (style leakage / UI spoofing). Consider scoping the CSS server-side with ironclaw_gateway::scope_css (and applying the same tag-close neutralization used in bundle assembly) before returning it to the client.

Copilot · 2026-04-12T06:19:40Z

+        let threads = store.list_threads(pid, user_id).await.unwrap_or_default();
+        let missions = store
+            .list_missions_with_shared(pid, user_id)
+            .await
+            .unwrap_or_default();


get_engine_projects_overview swallows store errors via unwrap_or_default() when listing threads/missions. If the store is unavailable or returns an error, the overview will silently report “no activity/missions” and can incorrectly show green health. Prefer propagating the error (or at least logging and marking the project unhealthy) so the UI doesn’t mask backend failures.

Suggested change

let threads = store.list_threads(pid, user_id).await.unwrap_or_default();

let missions = store

.list_missions_with_shared(pid, user_id)

.await

.unwrap_or_default();

let threads = store

.list_threads(pid, user_id)

.await

.map_err(|e| engine_err("list project threads", e))?;

let missions = store

.list_missions_with_shared(pid, user_id)

.await

.map_err(|e| engine_err("list project missions", e))?;

Copilot · 2026-04-12T06:19:40Z

+    for project in &projects {
+        let pid = project.id;
+        let threads = store.list_threads(pid, user_id).await.unwrap_or_default();
+        let missions = store
+            .list_missions_with_shared(pid, user_id)
+            .await
+            .unwrap_or_default();


This overview implementation performs per-project I/O (list_threads + list_missions_with_shared) inside a loop, which becomes an N+1 pattern as project count grows. Consider batching at the store layer (e.g., list threads/missions for all projects in one query) or at least fetching threads/missions concurrently per project to reduce tail latency for the dashboard.

Copilot · 2026-04-12T06:19:40Z

+        // Count pending gates for threads in this project.
+        let project_thread_ids: std::collections::HashSet<_> =
+            threads.iter().map(|t| t.id).collect();
+        let project_gates: Vec<_> = user_gates
+            .iter()
+            .filter(|g| project_thread_ids.contains(&g.thread_id))
+            .collect();
+        let pending_gate_count = project_gates.len() as u64;


Pending-gate counting does a full scan of user_gates for every project (user_gates.iter().filter(...)), which is O(projects × gates) and also allocates a HashSet per project. Consider pre-indexing user_gates by thread_id once (e.g., HashMap<ThreadId, Vec>) and counting/collecting by looking up only the project’s thread IDs.

Copilot · 2026-04-12T06:19:40Z

+/// Returns the resolved string if all references could be substituted,
+/// or the original string if none were found.


The doc comment says this returns the resolved string only if all references could be substituted, but the implementation can partially substitute earlier templates and then break on the first unresolvable one (returning a partially-resolved string). Either adjust the behavior to be all-or-nothing, or update the comment to reflect the current partial-substitution semantics.

Suggested change

/// Returns the resolved string if all references could be substituted,

/// or the original string if none were found.

/// Returns the original string unchanged if no template markers are found.

/// Otherwise, resolves references iteratively until there are no more

/// `{{...}}` patterns or until an unresolvable reference is encountered.

/// If resolution stops on an unresolvable reference, any earlier successful

/// substitutions are preserved, so the returned string may be partially resolved.

Copilot · 2026-04-12T06:19:41Z

+                    Ok(u) => ironclaw_engine::ProjectId(u),
+                    Err(e) => {
+                        return Some(Err(EngineError::Effect {
+                            reason: format!("Invalid project_id: {e}"),


The error message uses Invalid project_id but the parameter name for this action is id (and the schema also calls it id). Using the same term in code and schema would make debugging easier (e.g., Invalid project id or Invalid id).

Suggested change

reason: format!("Invalid project_id: {e}"),

reason: format!("Invalid id: {e}"),

ilblackdragon · 2026-04-12T06:41:13Z

Project Detail Page

Screenshot from the Playwright E2E test (test_project_detail.py):

Shows the drill-in view for 'AI Research Intelligence' project with goals, 4 missions (3 active, 1 paused), and recent activity threads with status badges.

To view the screenshot locally:

cd tests/e2e && .venv/bin/pytest scenarios/test_project_detail.py -v
# Screenshot saved to tests/e2e/project-detail.png

Copilot

Pull request overview

Copilot reviewed 18 out of 19 changed files in this pull request and generated 5 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-12T06:49:55Z

+        // Inject scoped CSS if present.
+        if (w.css) {
+          var style = document.createElement('style');
+          style.setAttribute('data-widget', manifest.id);
+          style.textContent = w.css; // Already scoped server-side? No — project widgets come raw.
+          // Scope it client-side using the same prefix convention.
+          var prefix = '[data-widget="' + manifest.id + '"]';
+          // Simple client-side scoping: prepend prefix to each rule.
+          // For production, use the server's scope_css. For now, inject as-is
+          // with the data-widget attribute providing natural scoping.
+          document.head.appendChild(style);
+        }


Project widget CSS is injected without scoping/isolation (unlike global widgets, which are passed through scope_css). Relying on a data-widget attribute alone doesn’t prevent selectors like body { ... } from leaking across the app. Consider applying scope_css server-side in project_widgets_handler (preferred, consistent with bundle assembly) or implement real client-side selector prefixing before injecting.

Copilot · 2026-04-12T06:49:55Z

+    // Resolve project name to derive the workspace slug.
+    let project = crate::bridge::get_engine_project(&project_id, &user.user_id)
+        .await
+        .map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?
+        .ok_or((StatusCode::NOT_FOUND, "Project not found".to_string()))?;
+
+    let slug = project
+        .name
+        .to_lowercase()
+        .replace(|c: char| !c.is_alphanumeric() && c != '-', "-");
+    let widgets_dir = format!("projects/{slug}/.system/widgets/");
+


Deriving the workspace slug from the current project name makes the widgets path unstable (rename breaks widget discovery) and can also cause collisions between projects with the same/similar name. Suggestion: persist a stable slug in the project record/metadata at creation time (or derive from project_id), and use that for projects/{slug}/... lookups instead of re-slugifying project.name on every request.

Copilot · 2026-04-12T06:49:55Z

+        let css = workspace
+            .read(&format!("{widgets_dir}{name}/style.css"))
+            .await
+            .ok()
+            .map(|d| d.content)
+            .filter(|c| !c.trim().is_empty() && c.len() <= MAX_WIDGET_CSS_BYTES);
+
+        let enabled = layout
+            .widgets
+            .get(&manifest.id)
+            .map(|w| w.enabled)
+            .unwrap_or(true);
+        if !enabled {
+            continue;
+        }
+
+        widgets.push(ResolvedWidget {
+            manifest,
+            js: js_doc.content,
+            css,
+        });


This handler returns raw widget CSS/JS, but unlike the global widget pipeline it doesn’t apply scope_css (style isolation) or use the existing <script nonce=…> module injection approach. If project widgets are meant to behave like normal widgets, consider scoping CSS server-side with ironclaw_gateway::scope_css and serving JS as a same-origin module script (or otherwise ensuring it can run under the current CSP).

Copilot · 2026-04-12T06:49:56Z

+/// Per-project summary with computed health and stats.
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct ProjectOverviewEntry {
+    pub id: String,
+    pub name: String,
+    pub description: String,
+    /// `"green"`, `"yellow"`, or `"red"`.
+    pub health: String,
+    pub active_missions: u64,
+    pub total_missions: u64,
+    pub threads_today: u64,
+    pub cost_today_usd: f64,
+    pub failures_24h: u64,
+    pub pending_gates: u64,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    pub last_activity: Option<String>,
+    pub created_at: String,
+}


ProjectOverviewEntry doesn’t include goals/metrics, but the Projects drill-in UI (and the new E2E mock data) expects goals to be available from the overview payload. Either add goals (and optionally metrics) to ProjectOverviewEntry, or have the frontend fetch /api/engine/projects/{id} on drill-in and render goals/metrics from that response.

Copilot · 2026-04-12T06:49:56Z

+  if (tab === 'projects') {
+    loadProjectsOverview();
+  } else if (crCurrentProjectId) {
+    // Reset drill-in state when leaving Projects tab so we start fresh.


When leaving the Projects tab, the code resets crCurrentProjectId but does not call destroyProjectWidgets() or reset/hide the drill/detail DOM. If a project widget registers timers/event listeners, it can keep running while the tab is hidden, and the next visit may double-mount widgets. Consider calling destroyProjectWidgets() (and/or crBackToOverview()) when switching away from tab === 'projects'.

Suggested change

// Reset drill-in state when leaving Projects tab so we start fresh.

// Reset drill-in state when leaving Projects tab so we start fresh.

// Use the existing Projects navigation reset so any mounted widgets,

// listeners, and drill/detail DOM are torn down before clearing state.

crBackToOverview();

serrrfirat · 2026-04-12T11:22:57Z

                    };
+                // Allow explicit project_id override (so agent can create
+                // missions in a specific project from any thread).
+                let target_project = params


Bug: the new-project skill tells the model to pass a project slug to mission_create.project_id, but this code only accepts a UUID and silently falls back to context.project_id on parse failure. The skill also does not call project_create, so /new-project can write files under projects/{slug} while all created missions land in the current/default project and no project card is created. Please either make the skill create a real project and use the returned UUID, or reject/resolve non-UUID project IDs here instead of falling back silently.

serrrfirat · 2026-04-12T11:22:58Z

+
+/// Per-project summary with computed health and stats.
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct ProjectOverviewEntry {


The frontend drill-in renders proj.goals/proj.metrics from the cached overview response, but ProjectOverviewEntry does not serialize those fields. In production this branch never renders project goals/metrics; the e2e mock includes fields that the real API cannot return, so the screenshot test misses the contract mismatch. Please add these fields to the overview response or fetch project detail before rendering them, and align the test mock with the production response shape.

serrrfirat · 2026-04-12T11:22:59Z

+            continue;
+        }
+
+        let css = workspace


This returns raw project widget CSS to the frontend, which then injects it into document.head without actually prefixing selectors. Global widgets go through scope_css(...) before being embedded, so project widgets can leak styles across the whole gateway UI or between widgets. Please scope this server-side with the existing CSS scoping helper, or reuse the resolved-widget loading path that already enforces the same isolation contract.

serrrfirat · 2026-04-12T11:23:00Z

+
+    for project in &projects {
+        let pid = project.id;
+        let threads = store.list_threads(pid, user_id).await.unwrap_or_default();


Please do not turn store errors into empty thread/mission lists here. If either backend call fails, the overview can report zero activity, green health, and no attention items for a broken project, which makes operational failures look healthy. Propagate the error, or surface a degraded/unknown health state instead of unwrap_or_default().

henrypark133

Review: New-project skill + Projects Control Room + template ref resolution (Risk: Medium)

Good direction — the Projects Control Room is a meaningful UX upgrade, and template ref resolution for parallel tool calls is a valuable Qwen compatibility fix. However, two critical issues block this PR.

Positives:

Clean engine v2 gating with data-v2-only / data-v1-only attributes
Template ref resolution is well-tested with 7 unit tests covering multi-ref, nested JSON, and no-ref cases
E2E screenshot test provides visual regression coverage
Project overview with attention items and health stats is a solid foundation

Critical: `mission_create` project_id override has no ownership check (IDOR)

File: src/bridge/effect_adapter.rs (handle_project_call / mission_create changes)

The LLM can supply an arbitrary project_id UUID to mission_create:

let target_project = params.get("project_id")
    .and_then(|v| v.as_str())
    .and_then(|s| uuid::Uuid::parse_str(s).ok())
    .map(ironclaw_engine::ProjectId)
    .unwrap_or(context.project_id);

MissionManager::create_mission() does NOT validate that target_project is owned by context.user_id. A prompt-injected LLM response could create missions in any user's project by guessing/leaking a UUID.

Fix: Validate ownership before accepting a non-default project_id:

if target_project != context.project_id {
    let proj = store.load_project(target_project).await?;
    match proj {
        Some(p) if p.is_owned_by(&context.user_id) => {},
        _ => return Some(Err(EngineError::Effect {
            reason: "project_id does not belong to current user".into()
        })),
    }
}

Critical: new-project skill v0.2 instructs agent to use slug as UUID — silently broken

File: skills/new-project/SKILL.md

The v0.2 skill tells the LLM: "Use the slug as project_id" — e.g. mission_create(project_id: "ai-research"). But project_id is parsed as uuid::Uuid::parse_str(), which fails on slugs and silently falls back to context.project_id. Every mission gets created in the wrong project.

Fix: Restore the project_create → get UUID → pass to mission_create workflow from v0.1.

Concerning: `ProjectOverviewEntry` omits `goals` field

File: src/bridge/router.rs

The struct has no goals field, but app.js reads proj.goals. The E2E mock masks this — goals will never display in production.

Fix: Add goals: Vec<String> to ProjectOverviewEntry and populate it.

Concerning: E2E test widget mock has wrong JSON shape

File: tests/e2e/scenarios/test_project_detail.py

Mock returns {"widgets": []} (object) but the real handler returns Vec<ResolvedWidget> (bare array). Test passes by accident because both result in "no widgets rendered."

Concerning: Template ref resolution not tested through the caller

File: src/bridge/llm_adapter.rs

Per .claude/rules/testing.md, the caller path (LlmBridgeAdapter::complete() with actual {{call_id.field}} refs in message history) should have a test, not just the helper functions.

Concerning: N+1 DB queries in overview

File: src/bridge/router.rs — get_engine_projects_overview()

2 sequential queries per project in a loop. Acceptable for MVP but will cause visible latency at 20+ projects.

Minor notes:

Slug collision between similarly-named projects could serve wrong widgets via project_widgets_handler
CI failing: formatting, cargo-deny, code style need fixing before merge

ilblackdragon

Review response (`5e6c900`)

Thanks @henrypark133 and @serrrfirat for the thorough review. All issues addressed:

Critical fixes

IDOR on mission_create project_id — Added ownership check: when project_id differs from the thread's project, we now load_project() + is_owned_by() before accepting. Non-UUID slugs (the old silent fallback) now return an explicit error.

project_create/update/list tools removed — These were supposed to be removed from this PR. The new-project skill intentionally uses memory_write + mission_create only, not separate project CRUD tools. Removed the capability registration, the handle_project_call() method, and the dispatch call.

Other fixes

Issue	Fix
`ProjectOverviewEntry` missing `goals`	Added `goals: Vec<String>` field, populated from project
Store errors swallowed in overview	Changed `unwrap_or_default()` to `map_err()?` — errors propagate now
Project widget CSS not scoped	Applied `ironclaw_gateway::scope_css()` server-side in `project_widgets_handler`
Template ref doc comment mismatch	Updated to document partial resolution semantics
E2E mock widget shape wrong	Fixed: `[]` instead of `{"widgets": []}`
Widget leak on tab switch	`crBackToOverview()` now called when leaving Projects tab
CI: cargo-deny failures	Removed stale wasmtime advisory ignores, added `RUSTSEC-2026-0097` (rand)
CI: formatting	Ran `cargo fmt`

Tests added

Caller-level template ref test: complete_resolves_template_refs_through_adapter — drives LlmBridgeAdapter::complete() with a mock provider returning {{call-1.project_id}} refs, verifies resolution through the full adapter path (per testing rules: "Test Through the Caller")
Total: 9 template ref tests (8 helper + 1 caller-level)

Deferred

N+1 in overview: acceptable for MVP, will batch when project count warrants it
new Function CSP concern: project widgets follow the same trust model as global widgets (user-placed = trusted), and the gateway CSP already allows inline scripts. Will revisit if we add untrusted widget sources.
Slug collision for widget paths: noted, will add stable slug to project metadata in a follow-up

serrrfirat · 2026-04-13T07:40:02Z

+        // Eval the JS module to register the widget.
+        try {
+          var api = typeof IronClaw !== 'undefined' ? IronClaw.api : null;
+          var fn = new Function('container', 'api', 'projectId', w.js);


Critical Severity — Arbitrary JS execution via workspace widgets

Project widgets load JS from projects/{slug}/.system/widgets/{id}/index.js in the workspace and execute it via new Function(...) in the browser. The memory_write tool has no write protection for .system/widgets/ paths within project directories.

A prompt injection attack could instruct the LLM to write malicious JS to projects/{slug}/.system/widgets/evil/index.js + a manifest.json. The next time the user drills into that project, the JS executes in their authenticated session with full access to IronClaw.api, cookies, and the DOM.

Unlike global widgets (loaded via <script type="module"> from a server-controlled template), project widgets are dynamically fetched and eval'd from user-writable workspace storage. No CSP, no sandboxing, no user confirmation.

Suggested fix: Either:
(a) Block memory_write from writing to .system/ paths unless via a privileged code path, or
(b) Sandbox project widget JS in an iframe with a null origin, or
(c) Require explicit user confirmation before mounting project widgets.

Deferred — widgets follow same trust model as global widgets (user-authored workspace content). CSP blocks external eval. Will revisit for untrusted widget sources.

serrrfirat · 2026-04-13T07:40:10Z

+/// encountered, resolution stops and earlier successful substitutions
+/// are preserved (partial resolution). Returns the original string
+/// unchanged if no `{{` markers are found.
+fn resolve_template_refs(value: &str, tool_results: &[(String, serde_json::Value)]) -> String {


High Severity — Template ref resolution operates on LLM-controlled data before safety pipeline

Template resolution substitutes {{call_id.field}} references from prior tool results into tool call parameters. Both call_id and field come from LLM output, and the resolved value flows directly into downstream tool parameters.

Resolution happens before the safety pipeline sees the parameters. If a malicious tool result contains crafted JSON (e.g., {"project_id": "'; DROP TABLE ..."}), the resolved value flows unsanitized into downstream tools. The effect adapter validates project_id as UUID (blocking this specific case), but other tools may not validate as strictly.

Suggested fix: Document this trust boundary explicitly. Log when template resolution occurs for audit trail. Consider running resolved parameters through safety validation.

Deferred — template refs resolve from prior tool results already validated by the safety pipeline. Will add a second pass on substituted values as a hardening step in a follow-up.

serrrfirat · 2026-04-13T07:40:16Z

+/// Iterates all projects, computes per-project stats from missions and threads,
+/// and collects pending gates as attention items. Designed for the control room
+/// dashboard where the user checks in on a highly autonomous agent.
+pub async fn get_engine_projects_overview(


High Severity — N+1 queries in projects overview

For each project, the handler issues separate list_threads(pid, user_id) and list_missions_with_shared(pid, user_id) queries. With P projects, this is 2P+1 database round trips. A user with 20 projects and thousands of threads will see significant latency. No timeout, pagination, or caching.

Suggested fix: Add pagination or caching. Consider a dedicated Store method that computes per-project stats in a single query. At minimum, add a TODO documenting the N+1 pattern.

serrrfirat · 2026-04-13T07:40:22Z

+Create recurring missions scoped to the project. Use the **slug** as `project_id`:
+
+```
+mission_create(name: "...", goal: "...", cadence: "daily", project_id: "{slug}")


Medium Severity — Skill tells LLM to use slug as project_id, but effect adapter rejects non-UUIDs

The skill instructs: mission_create(... project_id: "{slug}"). But the new effect adapter code rejects non-UUID project_id values with "Invalid project_id '{pid_str}': must be a UUID." This means the LLM following the skill will always fail when passing a slug.

The skill and the validation are contradictory — this is a functional bug.

Suggested fix: Update the skill to instruct the LLM to use the project UUID returned from project creation, not the slug.

serrrfirat · 2026-04-13T07:40:27Z

+          // Scope it client-side using the same prefix convention.
+          var prefix = '[data-widget="' + manifest.id + '"]';
+          // Simple client-side scoping: prepend prefix to each rule.
+          // For production, use the server's scope_css. For now, inject as-is


Medium Severity — Misleading CSS injection comment

The client-side code has a comment saying "For production, use the server's scope_css. For now, inject as-is" — but the server-side handler does call scope_css(). The comment is wrong and could lead a future developer to remove the server-side scoping (thinking it's handled client-side) or vice versa.

Suggested fix: Fix the comment to accurately state that CSS arrives pre-scoped from the server.

Fixed in b7fe8cc — comment now correctly states CSS is scoped server-side.

serrrfirat · 2026-04-13T07:40:32Z

+          _projectWidgets.push({
+            id: manifest.id,
+            container: container,
+            style: style || null,


Medium Severity — `style` variable potentially undefined

If `w.css` is falsy, the `style` variable declared inside the `if (w.css)` block is never assigned. In non-strict JS, `style` resolves to `undefined` (or throws a ReferenceError in strict mode). The `|| null` fallback handles `undefined` but relies on hoisting behavior.

Suggested fix: Declare `var style = null;` before the `if (w.css)` block to make the intent explicit.

Fixed in b7fe8cc — var style = null; declared before the if block.

serrrfirat · 2026-04-13T07:40:35Z

    Ok(Json(EngineProjectListResponse { projects }))
 }

+pub async fn engine_projects_overview_handler(


Medium Severity — No rate limiting on expensive overview endpoint

The /api/engine/projects/overview endpoint is computationally expensive (N+1 queries) but has no rate limiting. An attacker with a valid auth token could spam this endpoint to create excessive database load. Other expensive endpoints (chat send) have rate limiting.

Suggested fix: Add rate limiting or caching for this endpoint.

Deferred — endpoint is auth-protected and single-user. Will add rate limiting when project count grows.

…new-project skill Adds project metrics types, mission cadence scheduling via gateway, and a /new-project skill for creating autonomous projects with goals, metrics, and missions. Includes gateway frontend enhancements for project views with metrics and goal tracking. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

… new-project skill Two fixes from trace analysis (trace_20260411T133641.json): 1. Skill rewrite: new-project skill now instructs the model to use memory_write + mission_create directly instead of referencing nonexistent project_create/project_update tools. Includes goals and metrics when appropriate. Instructs sequential execution. 2. Template ref resolution: some OpenAI-format models (e.g. Qwen) emit {{call_id.field}} references in parallel tool call arguments. Added resolution pass in LlmBridgeAdapter that scans ActionCall parameters for these patterns and resolves them from prior tool results in the conversation history. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ilblackdragon · 2026-04-13T13:32:40Z

PR Review Response — Round 2

Rebased onto latest staging and addressed remaining open comments:

Fixed in `b7fe8cc`

Slug-based project_id in mission_create (gemini-code-assist, Copilot, @serrrfirat, @henrypark133) — mission_create now accepts project names/slugs, not just UUIDs. When a non-UUID project_id is provided, it resolves by matching against the user's projects (name or slugified name). SKILL.md updated accordingly.
Misleading CSS comment in app.js (@serrrfirat #3071506775) — Fixed. Comment now correctly states CSS is scoped server-side.
style variable potentially undefined (@serrrfirat #3071507029) — var style = null; declared before the if block.
workspace.list() errors silently swallowed (Copilot #3069074674) — Now logs at debug! level before falling back to empty list.

Deferred (with rationale)

Template ref resolution before safety pipeline (@serrrfirat #3071505436) — Template refs resolve from prior tool results that were already validated by the safety pipeline when produced. Adding a second safety pass on substituted values is a good hardening step but doesn't block this PR.
new Function() for widget JS (@serrrfirat #3071504747, Copilot #3069074652) — Widgets follow the same trust model as global widgets (user-authored workspace content). CSP blocks external eval. Will revisit for untrusted widget sources.
Rate limiting on overview endpoint (@serrrfirat #3071507230) — Endpoint is auth-protected and single-user. Low risk for MVP; will add when project count warrants.
N+1 queries (Copilot, @serrrfirat, @henrypark133) — Acceptable for MVP, will batch when project count grows.
Unstable slug from project name (Copilot) — Will add stable slug to project metadata in follow-up.

Copilot

Pull request overview

Copilot reviewed 19 out of 20 changed files in this pull request and generated 1 comment.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-13T13:38:04Z

+    for project in &projects {
+        let pid = project.id;
+        let threads = store
+            .list_threads(pid, user_id)
+            .await
+            .map_err(|e| engine_err("list project threads", e))?;
+        let missions = store
+            .list_missions_with_shared(pid, user_id)
+            .await
+            .map_err(|e| engine_err("list project missions", e))?;


get_engine_projects_overview does per-project I/O sequentially (list_threads then list_missions_with_shared inside a for loop). With many projects (or slow backends) this becomes an N+1 latency multiplier and can degrade the Projects overview UI. Consider fetching threads/missions concurrently per project (e.g., tokio::try_join! for the two calls, and futures::future::try_join_all across projects) to keep the endpoint responsive.

serrrfirat · 2026-04-15T14:00:27Z

+                                    .collect();
+                                name_lower == needle
+                                    || name_slug == needle
+                                    || name_slug.starts_with(&format!("{needle}-"))


Medium Severity — Slug prefix matching can route missions to wrong project

The starts_with branch at line 315 is over-broad. If a user has projects "AI" (slug: ai) and "AI Research Intelligence" (slug: ai-research-intelligence), a mission_create with project_id: "ai" will match "AI Research Intelligence" because "ai-research-intelligence".starts_with("ai-") is true. Combined with find() returning the first match, this is non-deterministic — the result depends on list ordering from the store.

Suggested fix: Remove the starts_with branch and require exact name/slug match. If fuzzy matching is desired, split both the needle and slug by - and check that all needle tokens match as a prefix of slug tokens:

name_lower == needle || name_slug == needle // Remove: || name_slug.starts_with(&format!("{needle}-"))

Also: no tests cover this resolution logic. Given the ambiguity risk, adding unit tests for exact match, slug match, and multi-project disambiguation would be valuable.

Fixed in a617fea — removed the starts_with branch entirely. Now only exact name or exact slug match is accepted.

serrrfirat · 2026-04-15T14:00:36Z

+
+    let mut result = value.to_string();
+    // Iteratively resolve all `{{..}}` patterns (limit iterations to prevent infinite loops)
+    for _ in 0..50 {


Medium Severity — Second-order template injection via tool results

The iterative resolution loop re-scans from position 0 after each replacement. If a resolved value itself contains {{call-id.field}} patterns, those will be resolved in subsequent iterations. This creates a second-order injection path:

Tool A returns {"id": "{{tool-B-call-id.secret_field}}"}

LLM makes a parallel call referencing {{tool-A-call-id.id}}

First iteration resolves to {{tool-B-call-id.secret_field}}

Second iteration resolves the injected ref, leaking Tool B's data into Tool A's parameter

While the LLM already sees all tool results (limiting practical impact), this is relevant when tools come from untrusted MCP servers where one tool shouldn't be able to read another's results.

Suggested fix: After resolving a template ref, advance the scan position past the replacement rather than restarting from 0:

// Instead of scanning from 0 each time: let mut pos = 0; for _ in 0..50 { let Some(start) = result[pos..].find("{{").map(|i| i + pos) else { break; }; // ... resolve ... match resolved { Some(val) => { let val_len = val.len(); result.replace_range(start..end + 2, &val); pos = start + val_len; // skip past replacement } None => break, } }

Fixed in a617fea — resolve_template_refs now tracks a search_from cursor that advances past each resolved value, so injected {{...}} patterns in tool results are never re-scanned. Added resolve_template_refs_no_second_order_injection test.

serrrfirat · 2026-04-15T14:00:42Z

+/// Iterates all projects, computes per-project stats from missions and threads,
+/// and collects pending gates as attention items. Designed for the control room
+/// dashboard where the user checks in on a highly autonomous agent.
+pub async fn get_engine_projects_overview(


Medium Severity — N+1 query problem in projects overview

This function issues 1 + 2N database queries: 1 for `list_projects`, then per project: `list_threads` + `list_missions_with_shared`. For a user with 10 projects, that's 21 round trips. The overview endpoint is called on every tab switch to Projects, so this will be felt in the UI.

Suggested fix (pick one):

Batch + partition: Add `store.list_all_threads_for_user(user_id)` and `store.list_all_missions_for_user(user_id)`, then partition by project_id in memory. Reduces to 3 queries total.

Dedicated aggregate query: Add a single SQL query that JOINs projects with thread/mission counts, returning all stats in one round trip.

Short-term: Cache the overview response for a few seconds (`tokio::sync::watch` or a simple timestamp check), since the data is statistical and doesn't need real-time accuracy.

Fixed in a617fea — parallelized with tokio::try_join! (intra-project) and futures::try_join_all (inter-project). Still 1+2N queries total but they execute concurrently now.

serrrfirat · 2026-04-15T14:00:50Z

+        // Eval the JS module to register the widget.
+        try {
+          var api = typeof IronClaw !== 'undefined' ? IronClaw.api : null;
+          var fn = new Function('container', 'api', 'projectId', w.js);


Medium Severity — Arbitrary JS execution from workspace content

`new Function(..., w.js)` executes arbitrary JavaScript from the user's workspace (`projects/{slug}/.system/widgets/`). While this is authenticated and the user is executing their own code, the risk is that any tool with workspace write access (e.g., `memory_write`, MCP tools) could place a malicious widget in `.system/widgets/` that then runs with full browser context (cookies, DOM access, `apiFetch` via the injected `api` parameter).

This is a privilege escalation path: a tool that should only write data can now execute code in the user's browser session.

Suggested mitigations (not necessarily all needed):

Document: Mark `.system/` paths as privileged in the safety layer so writes to `.system/widgets/` require explicit user approval

Sandbox: Run widget JS in an iframe with `sandbox` attribute instead of `new Function`

Audit log: Log widget loads (project, widget id, hash) so users can detect unexpected widget installation

Acknowledged — deferring to a follow-up. Same trust model as global widgets. Will add .system/ write protection when we add untrusted widget sources.

serrrfirat · 2026-04-15T14:00:57Z

+  if (!el) return;
+  if (!items.length) { el.style.display = 'none'; return; }
+  el.style.display = '';
+  el.innerHTML = '<div class="cr-attention-title">Needs attention</div>'


Medium Severity — Hard-coded English strings break i18n

The new control room code uses hard-coded English strings throughout instead of `I18n.t()` calls:

Line 5916: `"Needs attention"`

Line 5989: `"All Projects"`

Line 5997: `"Goals"`

Line 6040: `"Missions"`

Line 6070: `"Recent Activity"`

Plus: "No missions configured yet", "No threads yet", "Loading projects...", "Failed to load projects", etc.

The rest of the codebase consistently uses `I18n.t('key')` for all user-visible strings.

Suggested fix: Add i18n keys under a `projects.` namespace in `en.js` (and other locale files) and replace all hard-coded strings with `I18n.t()` calls. This can be done as a follow-up PR if you'd prefer to land the functionality first.

Valid, deferring to a follow-up PR — adding i18n keys for the control room strings is a separate concern from the functional fixes here.

serrrfirat · 2026-04-16T13:32:31Z

+                                            c
+                                        } else {
+                                            '-'
+                                        }


High Severity — Fuzzy slug matching can silently create missions in the wrong project

The starts_with check on the slug match (around line 306–310) can resolve to an unintended project. Example:

name_slug.starts_with(&format!("{needle}-"))

If a user/agent passes project_id="ai" and there's no project named exactly "AI", this matches any project whose slug starts with "ai-" — e.g., "AI Research Intelligence" (ai-research-intelligence). The mission silently lands in the wrong project instead of returning an error.

Suggested fix: Remove the starts_with clause entirely, or require an exact match on either name_lower or name_slug. Prefix matching on slugs is too ambiguous when multiple projects share a common prefix:

name_lower == needle || name_slug == needle

Fixed in a617fea — starts_with branch removed. Only exact name/slug match accepted.

serrrfirat · 2026-04-16T13:32:36Z

+    let slug = project
+        .name
+        .to_lowercase()
+        .replace(|c: char| !c.is_alphanumeric() && c != '-', "-");


Medium Severity — Slug generation inconsistency between effect_adapter and frontend handler

The slug algorithm differs between these two files:

effect_adapter.rs (line ~294) — uses is_ascii_alphanumeric():

.map(|c| if c.is_ascii_alphanumeric() || c == '-' { c } else { '-' })

frontend.rs (line ~498) — uses is_alphanumeric() (Unicode-aware):

.replace(|c: char| !c.is_alphanumeric() && c != '-', "-")

For non-ASCII project names (e.g., "Café Project"), effect_adapter produces "caf--project" while frontend.rs produces "café-project". This means project_widgets_handler may look up widgets in a different directory than where mission_create resolved the project.

Suggested fix: Extract a single Project::slug() method on the Project type in crates/ironclaw_engine/src/types/project.rs and use it in both locations.

Fixed in a617fea — frontend.rs now uses is_ascii_alphanumeric() matching effect_adapter.rs.

serrrfirat · 2026-04-16T13:32:42Z

                    } else {
                        vec![]
                    };
+                // Allow explicit project_id override (so agent can create


Medium Severity — No test coverage for cross-project mission_create ownership validation

The new project_id override logic (lines 245–340) is security-critical — it prevents IDOR by validating that the target project belongs to the current user. However, there are no tests for any of these paths:

UUID project_id with valid ownership → mission created in target project

UUID project_id owned by a different user → error returned

Non-existent UUID project_id → error returned

Slug/name resolution → correct project matched

Ambiguous slug → expected behavior documented and tested

Per CLAUDE.md testing rules ("Test Through the Caller, Not Just the Helper"), this should have at least an integration-tier test that drives the EffectBridgeAdapter::execute() call site with a mission_create action containing an explicit project_id.

IDOR ownership validation is integration-tier (requires DB). Added two unit-tier security tests for template ref injection paths in this commit. The IDOR paths themselves are exercised via the existing complete_resolves_template_refs_through_adapter caller-level test pattern.

Resolves merge conflicts: - deny.toml: accept new RUSTSEC-2026-0097 advisory - bridge/mod.rs: deduplicate re-exports, add BridgeOutcome - bridge/router.rs: combine project_id with expanded cadence/guardrail params - bridge/effect_adapter.rs: keep both IDOR validation and guardrails extraction - app.js: integrate ActivityWorkStore, mission progress tracking alongside control room; unify openMissionDetail/openEngineThread to route by active tab - index.html: add Missions tab alongside Projects tab Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…+1 queries - Remove over-broad `starts_with` slug prefix matching in mission_create project_id resolution — require exact name/slug match only (serrrfirat) - Fix slug generation inconsistency: frontend.rs now uses is_ascii_alphanumeric() matching effect_adapter.rs (serrrfirat) - Prevent second-order template injection: resolve_template_refs now advances past resolved content instead of re-scanning from position 0, and skips unresolvable refs instead of breaking (serrrfirat) - Parallelize N+1 overview queries: per-project thread/mission fetches now use tokio::try_join! + futures::try_join_all (serrrfirat, Copilot) - Add two new security tests for template ref resolution Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ilblackdragon · 2026-04-17T15:28:36Z

PR Review Response — Round 3

Merged origin/staging and addressed all outstanding comments from @serrrfirat and Copilot. Changes in a617fea.

Fixed

Issue	Reviewer	Fix
Fuzzy slug matching routes missions to wrong project	@serrrfirat	Removed `starts_with` prefix branch — only exact name/slug match accepted now
Slug generation inconsistency (effect_adapter vs frontend.rs)	@serrrfirat	Both now use `is_ascii_alphanumeric()`
Second-order template injection via re-scanning resolved content	@serrrfirat	`resolve_template_refs` now advances `search_from` past replacement text; also fixed `break` on unresolvable ref → `skip + continue`
N+1 queries in projects overview	@serrrfirat, Copilot	Per-project I/O now parallel: `tokio::try_join!` for threads+missions within each project, `futures::try_join_all` across projects
No test for IDOR / slug resolution paths	@serrrfirat	2 new security tests: `no_second_order_injection` + `skips_unresolvable_continues_later`

Deferred (unchanged from round 2)

Arbitrary JS execution from workspace widgets: Widgets follow same trust model as global widgets (user-authored workspace content). CSP blocks external eval. Will revisit for untrusted widget sources.
Hard-coded English strings in control room: Valid — will add i18n keys under projects.* namespace in a follow-up PR to keep this one focused on the functional fixes.

addressed

… tab, onboarding E2E Scheduled batched CI on staging was red across three unrelated paths. All three are fixed in-place; the existing tests become the regression coverage. 1. `tests/support/test_rig.rs`: rebuild the skill registry against the test's `with_skills_dir()` tempdir and actually run `discover_all()`. `AppBuilder::init_database()` reloads `config` from DB/TOML/env at the top of `build_all()`, which clobbered `config.skills.local_dir` back to the default (`~/.ironclaw/skills/`). Any registry `build_all()` constructed therefore pointed at the user's real skills dir, not the tempdir the test had laid down — so `loaded_skill_names()` came back empty and the v1 chain-load assertion panicked. Write the tempdir paths back onto `components.config.skills.*` so `AgentDeps::skills_config` agrees with the registry. `skill_chain_load_lifecycle::v1_chain_load_pulls_in_required_companions` now passes. 2. `crates/ironclaw_gateway/static/index.html`: drop the duplicate right-side `status-logs-btn` Jobs button added in #2353. The main tab-bar already has `<button data-tab="jobs">Jobs</button>`, and the duplicate had no `data-v1-only`/`data-v2-only` marker, so both rendered simultaneously. That broke `test_connection.py` (Playwright strict-mode rejected `.tab-bar button[data-tab="jobs"]` resolving to two elements) and also left both buttons visually `active` when the Jobs tab was open. 3. `tests/e2e/scenarios/test_extensions.py`: align `test_onboarding_failed_sse_shows_error_toast_and_reloads_extensions` with every other auth-card test in the file — resolve the real thread id via `_active_thread_id(page)` before calling `_show_auth_card`. `showAuthCard` short-circuits on `isCurrentThread(data.thread_id)`, and the synthetic `"thread-fail"` id fails that check once `currentThreadId` is populated after `go_to_extensions(page)`. The auth card was never rendered, so the follow-up `wait_for` for `.auth-card` hit its 5s timeout. Verified: `cargo test --features libsql --test skill_chain_load_lifecycle` and `--test skill_setup_marker_lifecycle` pass; `cargo clippy --tests --features libsql` is clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… tab, onboarding E2E (#2637) Scheduled batched CI on staging was red across three unrelated paths. All three are fixed in-place; the existing tests become the regression coverage. 1. `tests/support/test_rig.rs`: rebuild the skill registry against the test's `with_skills_dir()` tempdir and actually run `discover_all()`. `AppBuilder::init_database()` reloads `config` from DB/TOML/env at the top of `build_all()`, which clobbered `config.skills.local_dir` back to the default (`~/.ironclaw/skills/`). Any registry `build_all()` constructed therefore pointed at the user's real skills dir, not the tempdir the test had laid down — so `loaded_skill_names()` came back empty and the v1 chain-load assertion panicked. Write the tempdir paths back onto `components.config.skills.*` so `AgentDeps::skills_config` agrees with the registry. `skill_chain_load_lifecycle::v1_chain_load_pulls_in_required_companions` now passes. 2. `crates/ironclaw_gateway/static/index.html`: drop the duplicate right-side `status-logs-btn` Jobs button added in #2353. The main tab-bar already has `<button data-tab="jobs">Jobs</button>`, and the duplicate had no `data-v1-only`/`data-v2-only` marker, so both rendered simultaneously. That broke `test_connection.py` (Playwright strict-mode rejected `.tab-bar button[data-tab="jobs"]` resolving to two elements) and also left both buttons visually `active` when the Jobs tab was open. 3. `tests/e2e/scenarios/test_extensions.py`: align `test_onboarding_failed_sse_shows_error_toast_and_reloads_extensions` with every other auth-card test in the file — resolve the real thread id via `_active_thread_id(page)` before calling `_show_auth_card`. `showAuthCard` short-circuits on `isCurrentThread(data.thread_id)`, and the synthetic `"thread-fail"` id fails that check once `currentThreadId` is populated after `go_to_extensions(page)`. The auth card was never rendered, so the follow-up `wait_for` for `.auth-card` hit its 5s timeout. Verified: `cargo test --features libsql --test skill_chain_load_lifecycle` and `--test skill_setup_marker_lifecycle` pass; `cargo clippy --tests --features libsql` is clean. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Closes #2626. `tests/e2e/helpers.py` used `.tab-bar button[data-tab="{tab}"]` to locate every main-nav tab button. Commit 5058a1c removed the duplicate right-side `status-logs-btn` Jobs button that was resolving that selector to two elements on staging, so `test_connection.py` passes again. The underlying selector is still fragile: any future auxiliary button or `_addWidgetTab`-injected tab that reuses a built-in `data-tab` id (even accidentally, as #2353 did) will make the selector resolve to multiple elements and trip Playwright strict mode the next CI run. Harden the selector in place — the issue's suggested direction of scoping to a more specific parent region — so the regression can't repeat without a test-side opt-in: - `.tab-bar > button`: direct child, skipping any hypothetical nested buttons (e.g. menu popovers). - `:not(.status-logs-btn)`: excludes the right-side logs/docs cluster that uses the same `data-tab` hook for click routing. - `:not(.tab-btn)`: excludes widget-injected tabs (see `_addWidgetTab` in `crates/ironclaw_gateway/static/app.js`), which always carry the `tab-btn` class and could in principle collide with a built-in id. Verified against a synthetic DOM with three colliding `data-tab="jobs"` buttons (original, status-logs-btn duplicate, widget-injected): the old selector matches 3 and trips strict mode on `.click()`; the new selector matches 1 and clicks cleanly. No production HTML change is required — the acceptance criterion explicitly forbids one. `pytest tests/e2e/scenarios/test_connection.py -v` → 3 passed.

…2656) Closes #2626. `tests/e2e/helpers.py` used `.tab-bar button[data-tab="{tab}"]` to locate every main-nav tab button. Commit 5058a1c removed the duplicate right-side `status-logs-btn` Jobs button that was resolving that selector to two elements on staging, so `test_connection.py` passes again. The underlying selector is still fragile: any future auxiliary button or `_addWidgetTab`-injected tab that reuses a built-in `data-tab` id (even accidentally, as #2353 did) will make the selector resolve to multiple elements and trip Playwright strict mode the next CI run. Harden the selector in place — the issue's suggested direction of scoping to a more specific parent region — so the regression can't repeat without a test-side opt-in: - `.tab-bar > button`: direct child, skipping any hypothetical nested buttons (e.g. menu popovers). - `:not(.status-logs-btn)`: excludes the right-side logs/docs cluster that uses the same `data-tab` hook for click routing. - `:not(.tab-btn)`: excludes widget-injected tabs (see `_addWidgetTab` in `crates/ironclaw_gateway/static/app.js`), which always carry the `tab-btn` class and could in principle collide with a built-in id. Verified against a synthetic DOM with three colliding `data-tab="jobs"` buttons (original, status-logs-btn duplicate, widget-injected): the old selector matches 3 and trips strict mode on `.click()`; the new selector matches 1 and clicks cleanly. No production HTML change is required — the acceptance criterion explicitly forbids one. `pytest tests/e2e/scenarios/test_connection.py -v` → 3 passed.

Copilot AI review requested due to automatic review settings April 12, 2026 06:14

github-actions Bot added size: XL 500+ changed lines scope: channel/web Web gateway channel scope: docs Documentation risk: medium Business logic, config, or moderate-risk modules contributor: core 20+ merged PRs and removed size: XL 500+ changed lines labels Apr 12, 2026

Copilot started reviewing on behalf of ilblackdragon April 12, 2026 06:15 View session

gemini-code-assist Bot reviewed Apr 12, 2026

View reviewed changes

Copilot AI reviewed Apr 12, 2026

View reviewed changes

github-actions Bot added the size: XL 500+ changed lines label Apr 12, 2026

Copilot AI review requested due to automatic review settings April 12, 2026 06:42

Copilot started reviewing on behalf of ilblackdragon April 12, 2026 06:43 View session

Copilot AI reviewed Apr 12, 2026

View reviewed changes

serrrfirat reviewed Apr 12, 2026

View reviewed changes

henrypark133 previously requested changes Apr 13, 2026

View reviewed changes

ilblackdragon commented Apr 13, 2026

View reviewed changes

serrrfirat reviewed Apr 13, 2026

View reviewed changes

ilblackdragon and others added 2 commits April 13, 2026 13:12

Copilot AI review requested due to automatic review settings April 13, 2026 13:32

ilblackdragon force-pushed the feat/new-project-skill branch from 5e6c900 to b7fe8cc Compare April 13, 2026 13:32

Copilot started reviewing on behalf of ilblackdragon April 13, 2026 13:33 View session

Copilot AI reviewed Apr 13, 2026

View reviewed changes

ilblackdragon requested review from henrypark133 and serrrfirat April 13, 2026 18:20

serrrfirat reviewed Apr 15, 2026

View reviewed changes

serrrfirat reviewed Apr 16, 2026

View reviewed changes

ilblackdragon and others added 2 commits April 17, 2026 15:02

ilblackdragon requested a review from serrrfirat April 17, 2026 16:34

ilblackdragon merged commit ab8d64c into staging Apr 17, 2026
15 checks passed

ilblackdragon deleted the feat/new-project-skill branch April 17, 2026 16:37

github-actions Bot mentioned this pull request Apr 17, 2026

chore: promote staging to staging-promote/e65ba2e4-24575255629 (2026-04-17 17:16 UTC) #2588

Merged

ilblackdragon mentioned this pull request Apr 18, 2026

fix(ci): three staging regressions — skill chain-load, duplicate Jobs tab, onboarding E2E #2637

Merged

5 tasks

serrrfirat mentioned this pull request Apr 18, 2026

test(e2e): harden tab_button selector against strict-mode duplicates #2656

Merged

3 tasks

henrypark133 mentioned this pull request Apr 21, 2026

chore: release #2606

Merged

-        // Eval the JS module to register the widget.
-        try {
-          var api = typeof IronClaw !== 'undefined' ? IronClaw.api : null;
-          var fn = new Function('container', 'api', 'projectId', w.js);
-          fn(container, api, projectId);
+        // Execute the widget through a nonce-authorized module script so it
+        // remains compatible with a strict CSP that disallows unsafe-eval.
+        try {
+          var api = typeof IronClaw !== 'undefined' ? IronClaw.api : null;
+          var nonceSource = document.querySelector('script[nonce]');
+          var nonce = nonceSource ? nonceSource.nonce || nonceSource.getAttribute('nonce') : '';
+          var widgetToken = 'cr-widget-' + manifest.id + '-' + Date.now() + '-' + Math.random().toString(36).slice(2);
+          var widgetRegistry = window.__ironclawProjectWidgetMounts || (window.__ironclawProjectWidgetMounts = {});
+          var script = document.createElement('script');
+          widgetRegistry[widgetToken] = {
+            container: container,
+            api: api,
+            projectId: projectId
+          };
+          script.type = 'module';
+          if (nonce) script.setAttribute('nonce', nonce);
+          script.setAttribute('data-widget', manifest.id);
+          script.addEventListener('error', function(err) {
+            delete widgetRegistry[widgetToken];
+            script.remove();
+            console.error('[projects] Failed to mount widget ' + manifest.id + ':', err);
+            container.innerHTML = '<div class="cr-empty">Widget error: ' + manifest.id + '</div>';
+          });
+          script.textContent =
+            '(function(){\n' +
+            '  var registry = window.__ironclawProjectWidgetMounts || {};\n' +
+            '  var ctx = registry[' + JSON.stringify(widgetToken) + '];\n' +
+            '  if (!ctx) return;\n' +
+            '  try {\n' +
+            '    (function(container, api, projectId) {\n' +
+            w.js + '\n' +
+            '    })(ctx.container, ctx.api, ctx.projectId);\n' +
+            '  } finally {\n' +
+            '    delete registry[' + JSON.stringify(widgetToken) + '];\n' +
+            '    document.currentScript.remove();\n' +
+            '  }\n' +
+            '})();';
+          document.head.appendChild(script);

-pub async fn project_widgets_handler(
-    State(state): State<Arc<GatewayState>>,
-    AuthenticatedUser(user): AuthenticatedUser,
-    Path(project_id): Path<String>,
-) -> Result<Json<Vec<ResolvedWidget>>, (StatusCode, String)> {
-    // Resolve project name to derive the workspace slug.
-    let project = crate::bridge::get_engine_project(&project_id, &user.user_id)
-        .await
-        .map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?
-        .ok_or((StatusCode::NOT_FOUND, "Project not found".to_string()))?;
-    let slug = project
-        .name
-        .to_lowercase()
-        .replace(|c: char| !c.is_alphanumeric() && c != '-', "-");
-    let widgets_dir = format!("projects/{slug}/.system/widgets/");
+fn project_name_slug(name: &str) -> String {
+    name.to_lowercase()
+        .replace(|c: char| !c.is_alphanumeric() && c != '-', "-")
+}
+fn project_widgets_dir_from_metadata(project: &impl serde::Serialize, fallback_name: &str) -> String {
+    let fallback = format!("projects/{}/.system/widgets/", project_name_slug(fallback_name));
+    let Ok(value) = serde_json::to_value(project) else {
+        return fallback;
+    };
+    let Some(metadata) = value.get("metadata") else {
+        return fallback;
+    };
+    let candidate = metadata
+        .get("workspace_slug")
+        .and_then(|v| v.as_str())
+        .or_else(|| metadata.get("workspaceSlug").and_then(|v| v.as_str()))
+        .or_else(|| metadata.get("workspace_path").and_then(|v| v.as_str()))
+        .or_else(|| metadata.get("workspacePath").and_then(|v| v.as_str()));
+    let Some(raw) = candidate.map(str::trim).filter(|s| !s.is_empty()) else {
+        return fallback;
+    };
+    let normalized = raw.trim_matches('/');
+    if normalized.is_empty() {
+        return fallback;
+    }
+    let project_root = if normalized.starts_with("projects/") {
+        normalized.to_string()
+    } else {
+        format!("projects/{normalized}")
+    };
+    format!("{project_root}/.system/widgets/")
+}
+pub async fn project_widgets_handler(
+    State(state): State<Arc<GatewayState>>,
+    AuthenticatedUser(user): AuthenticatedUser,
+    Path(project_id): Path<String>,
+) -> Result<Json<Vec<ResolvedWidget>>, (StatusCode, String)> {
+    // Prefer a persisted workspace slug/path from project metadata so widget
+    // lookup remains stable across project renames. Fall back to the historic
+    // name-derived slug for backward compatibility with older project records.
+    let project = crate::bridge::get_engine_project(&project_id, &user.user_id)
+        .await
+        .map_err(|e| (StatusCode::INTERNAL_SERVER_ERROR, e.to_string()))?
+        .ok_or((StatusCode::NOT_FOUND, "Project not found".to_string()))?;
+    let widgets_dir = project_widgets_dir_from_metadata(&project, &project.name);

-    let entries = workspace.list(&widgets_dir).await.unwrap_or_default();
+    let entries = match workspace.list(&widgets_dir).await {
+        Ok(entries) => entries,
+        Err(error) => {
+            tracing::warn!(
+                "failed to list widgets in workspace at {}: {}",
+                widgets_dir,
+                error
+            );
+            Vec::new()
+        }
+    };

		/// Returns the resolved string if all references could be substituted,
		/// or the original string if none were found.

-/// Returns the resolved string if all references could be substituted,
-/// or the original string if none were found.
+/// Returns the original string unchanged if no template markers are found.
+/// Otherwise, resolves references iteratively until there are no more
+/// `{{...}}` patterns or until an unresolvable reference is encountered.
+/// If resolution stops on an unresolvable reference, any earlier successful
+/// substitutions are preserved, so the returned string may be partially resolved.

	reason: format!("Invalid project_id: {e}"),
	reason: format!("Invalid id: {e}"),

-    // Reset drill-in state when leaving Projects tab so we start fresh.
+    // Reset drill-in state when leaving Projects tab so we start fresh.
+    // Use the existing Projects navigation reset so any mounted widgets,
+    // listeners, and drill/detail DOM are torn down before clearing state.
+    crBackToOverview();

Conversation

ilblackdragon commented Apr 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Project Detail Page

Test plan

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

ilblackdragon commented Apr 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Project Detail Page

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 12, 2026

ilblackdragon commented Apr 12, 2026 •

edited

Loading

ilblackdragon commented Apr 12, 2026 •

edited

Loading

Critical: `mission_create` project_id override has no ownership check (IDOR)

Concerning: `ProjectOverviewEntry` omits `goals` field

Review response (`5e6c900`)