Fix/cli init providers and argo url by hshang315 · Pull Request #199 · als-apg/osprey

hshang315 · 2026-04-03T19:42:04Z

Pull Request: fix(cli): load providers from ProviderRegistry and restore code generators

Branch: fix/cli-init-providers-and-argo-url -> next
Repository: als-apg/osprey
PR Link: https://github.com/als-apg/osprey/pull/new/fix/cli-init-providers-and-argo-url

Summary

Fix osprey interactive init failing at provider and code generator selection steps
get_provider_metadata() now reads from ProviderRegistry (the actual source of truth) instead of the empty config.providers list
get_code_generator_metadata() restored with basic and claude_code entries
Updated Argo provider default_base_url to https://apps.inside.anl.gov/argoapi/v1

Problem

Running osprey (interactive project creation) fails with two errors that block project creation:

! No providers could be loaded from osprey registry

✗ No code generators available
! Osprey could not load any code generators.
Check that osprey is properly installed: uv sync --all-extras

The init wizard aborts at Step 5 (Code Generator), preventing any project from being created.

Root Cause

1. Providers not loading

The next branch moved provider definitions from the registry config to a standalone ProviderRegistry in osprey.models.provider_registry. The CLI function get_provider_metadata() in interactive_menu.py still read from the old location (config.providers), which is now an empty list [].

2. Code generators not loading

The function get_code_generator_metadata() in interactive_menu.py had its implementation replaced with a hardcoded empty dict:

# Code generators were removed from the registry
generators = {}

However, the scaffolding code in scaffolding.py still expects "basic" and "claude_code" as valid generator values.

3. Argo base URL outdated

The Argo provider's default base URL (https://argo-bridge.cels.anl.gov) is no longer the active endpoint.

Changes

`src/osprey/cli/interactive_menu.py`

get_provider_metadata() (~line 326):

Before:

from osprey.registry.builtins import FrameworkRegistryProvider
framework_registry = FrameworkRegistryProvider()
config = framework_registry.get_registry_config()
for provider_reg in config.providers:  # empty list
    module = importlib.import_module(provider_reg.module_path)
    provider_class = getattr(module, provider_reg.class_name)
    ...

After:

from osprey.models.provider_registry import get_provider_registry
pr = get_provider_registry()
for provider_name in pr.list_providers():
    provider_class = pr.get_provider(provider_name)
    ...
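
For readers without the codebase at hand, the registry-driven loop can be pictured with a stand-in registry. This is a minimal sketch, not the OSPREY implementation: `FakeProviderRegistry`, the two provider classes, and the metadata fields (`description`, `default_base_url` as a metadata key) are assumptions for illustration. Only `list_providers()` and `get_provider()` come from the diff above.

```python
from dataclasses import dataclass, field


class ArgoProvider:
    """Hypothetical provider class; attribute names are illustrative."""
    description = "Argo"
    default_base_url = "https://apps.inside.anl.gov/argoapi/v1"


class OllamaProvider:
    """Hypothetical provider class without a base URL attribute."""
    description = "Ollama"


@dataclass
class FakeProviderRegistry:
    """Stand-in exposing only the two methods used in the diff."""
    _providers: dict = field(default_factory=dict)

    def list_providers(self):
        return list(self._providers)

    def get_provider(self, name):
        return self._providers.get(name)


def collect_provider_metadata(registry):
    """Build a name -> metadata dict by iterating the registry."""
    metadata = {}
    for name in registry.list_providers():
        provider_class = registry.get_provider(name)
        if provider_class is None:
            continue  # skip names the registry cannot resolve
        metadata[name] = {
            "description": getattr(provider_class, "description", name),
            "default_base_url": getattr(provider_class, "default_base_url", None),
        }
    return metadata


registry = FakeProviderRegistry({"argo": ArgoProvider, "ollama": OllamaProvider})
providers = collect_provider_metadata(registry)
print(sorted(providers))
```

The point of the sketch is that the CLI no longer depends on a config list that can silently become empty; it asks the registry what exists.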

get_code_generator_metadata() (~line 423):

Replaced generators = {} with entries for the two generators the scaffolding code expects:

basic -- always available (built-in single-pass LLM generator)
claude_code -- available when claude-agent-sdk is installed (checked via importlib)
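
The availability check described above could look roughly like the following. This is a sketch under assumptions: the function name `build_code_generator_metadata`, the metadata keys other than `available`, and the import name `claude_agent_sdk` for the `claude-agent-sdk` package are not taken from the PR.

```python
import importlib.util


def module_available(module_name):
    """Return True if the top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def build_code_generator_metadata(sdk_module="claude_agent_sdk"):
    """Return metadata for the two generators the scaffolding code expects."""
    return {
        "basic": {
            "description": "Built-in single-pass LLM generator",
            "available": True,  # no optional dependency required
        },
        "claude_code": {
            "description": "Generator backed by the optional SDK",
            "available": module_available(sdk_module),
        },
    }


generators = build_code_generator_metadata()
for name, meta in sorted(generators.items()):
    print(f"{name}: available={meta['available']}")
```

Using `find_spec` rather than a bare `import` keeps the wizard from paying the import cost (or hitting import-time errors) just to populate a menu.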

`src/osprey/models/providers/argo.py`

Updated the base URL in two places:

| Location | Old | New |
| --- | --- | --- |
| Line 73 (fallback in `_execute_argo_structured_output()`) | `https://argo-bridge.cels.anl.gov` | `https://apps.inside.anl.gov/argoapi/v1` |
| Line 145 (`default_base_url` class attribute) | `https://argo-bridge.cels.anl.gov` | `https://apps.inside.anl.gov/argoapi/v1` |
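
Because the same literal appears in two places, one way to avoid future drift would be a single constant with a fallback helper. This is only an illustrative sketch: `DEFAULT_ARGO_BASE_URL` and `resolve_argo_base_url()` are hypothetical names, not part of this PR.

```python
DEFAULT_ARGO_BASE_URL = "https://apps.inside.anl.gov/argoapi/v1"


def resolve_argo_base_url(configured=None):
    """Prefer an explicitly configured URL; otherwise fall back to the default."""
    url = (configured or "").strip() or DEFAULT_ARGO_BASE_URL
    return url.rstrip("/")  # normalize so path joins do not produce '//'


print(resolve_argo_base_url())
print(resolve_argo_base_url("https://example.invalid/api/"))
```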

Verification

source /home/oxygen/SHANG/next_osprey/.venv/bin/activate
python3 -c "
from osprey.cli.interactive_menu import get_provider_metadata, get_code_generator_metadata

providers = get_provider_metadata()
print(f'Providers loaded: {len(providers)}')
for name in sorted(providers):
    print(f'  {name}')

generators = get_code_generator_metadata()
print(f'Code generators loaded: {len(generators)}')
for name, meta in sorted(generators.items()):
    print(f'  {name}: available={meta[\"available\"]}')
"

Expected output:

Providers loaded: 11
  als-apg
  amsc
  anthropic
  argo
  asksage
  cborg
  google
  ollama
  openai
  stanford
  vllm
Code generators loaded: 2
  basic: available=True
  claude_code: available=True

Test Plan

Run osprey and verify all 11 providers appear at Step 6 (Provider Selection)
Verify both code generators (basic, claude_code) appear at Step 5 (Code Generator)
Complete a full osprey init project creation flow end-to-end
Verify Argo provider health check uses the new base URL
Run unit tests: pytest tests/ --ignore=tests/e2e -v

When re-creating a project at the same path, Claude Code remembers the previous trust decision from ~/.claude/projects/<path>/. Remove that cached state on --force so the trust prompt appears again on first launch.

The trust decision ("hasTrustDialogAccepted") lives in ~/.claude.json → projects.<path>, not in ~/.claude/projects/. Now --force removes the entry from both locations so the trust prompt reappears on next Claude Code launch.

Users often rm -rf the project before re-running osprey init, so the --force cleanup path is never reached. Move trust/session state cleanup to run unconditionally on every osprey init.
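
The trust-state cleanup described in these commits can be sketched as follows, assuming the file is plain JSON with a top-level `projects` mapping keyed by project path, as the commit message states. The helper name `clear_trust_entry` is hypothetical, and the demonstration writes to a temporary file rather than the real `~/.claude.json`.

```python
import json
import tempfile
from pathlib import Path


def clear_trust_entry(claude_json: Path, project_path: str) -> bool:
    """Remove projects[project_path] from the given file. Returns True if removed."""
    if not claude_json.exists():
        return False
    data = json.loads(claude_json.read_text(encoding="utf-8"))
    projects = data.get("projects", {})
    if project_path not in projects:
        return False
    del projects[project_path]
    claude_json.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return True


# Demonstration against a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    fake = Path(tmp) / "claude.json"
    fake.write_text(
        json.dumps({"projects": {"/work/demo": {"hasTrustDialogAccepted": True}}}),
        encoding="utf-8",
    )
    removed = clear_trust_entry(fake, "/work/demo")
    removed_again = clear_trust_entry(fake, "/work/demo")
    remaining = json.loads(fake.read_text(encoding="utf-8"))["projects"]

print(removed, removed_again, remaining)
```

Returning a boolean instead of raising keeps the cleanup safe to run unconditionally on every init, which is the behavior the last commit moves to.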

…r chips

Replace the single-select sector dropdown with inline toggleable chip controls for both sectors and devices. Selections within a dimension are OR-combined (union), and the two dimensions are then AND-combined (intersected). Includes all/none actions per row and preserves field group collapse state across filter toggles.
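
The chip filter semantics can be illustrated in a few lines. The actual change is in the web frontend; this Python sketch only shows the set logic, and it assumes that an empty selection in a dimension means "no restriction" (the commit message does not say). The record fields `sector` and `device` and the sample names are hypothetical.

```python
def filter_channels(channels, sectors=frozenset(), devices=frozenset()):
    """Union within each dimension, intersection across dimensions."""
    return [
        ch for ch in channels
        if (not sectors or ch["sector"] in sectors)
        and (not devices or ch["device"] in devices)
    ]


channels = [
    {"name": "S01:HCM", "sector": 1, "device": "HCM"},
    {"name": "S01:BPM", "sector": 1, "device": "BPM"},
    {"name": "S02:HCM", "sector": 2, "device": "HCM"},
    {"name": "S03:VCM", "sector": 3, "device": "VCM"},
]

selected = filter_channels(channels, sectors={1, 2}, devices={"HCM"})
print([ch["name"] for ch in selected])
```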

Demote INFO→DEBUG so internal yaml mutation details don't leak into the otherwise clean Rich console output.

thumbnailHtml() was simplified to only show images and bare icons, leaving existing iframe/placeholder/error CSS orphaned. Restore iframe previews for HTML artifacts and notebooks, summary text for data artifacts, styled placeholders, and image error handling.

regenerate_claude_code() used raw yaml.safe_load() without expanding ${VAR:-default} patterns, so timezone.md rendered the literal string instead of the resolved value. Extract resolve_env_vars() as a public function in config.py and apply it after loading config in templates.py.
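
A minimal sketch of `${VAR:-default}` expansion over a loaded config follows. It is not the `resolve_env_vars()` from `config.py`; the name `expand_env_vars`, the regex, and the choice to leave an unset `${VAR}` without a default untouched are assumptions made for illustration.

```python
import os
import re

# Matches ${NAME} and ${NAME:-default}
_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def expand_env_vars(value, env=None):
    """Recursively expand env patterns in strings nested inside dicts and lists."""
    env = os.environ if env is None else env
    if isinstance(value, dict):
        return {key: expand_env_vars(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env_vars(item, env) for item in value]
    if not isinstance(value, str):
        return value

    def _substitute(match):
        name, default = match.group(1), match.group(2)
        current = env.get(name)
        if default is not None:
            # Shell ':-' semantics: unset or empty falls back to the default.
            return current if current else default
        # No default given: keep the literal text when the variable is unset.
        return current if current is not None else match.group(0)

    return _ENV_PATTERN.sub(_substitute, value)


config = {"timezone": "${TZ_NAME:-America/Los_Angeles}", "paths": ["${DATA_DIR}"]}
print(expand_env_vars(config, env={}))
print(expand_env_vars(config, env={"TZ_NAME": "UTC", "DATA_DIR": "/data"}))
```

Applying the expansion right after `yaml.safe_load()` means templates always see resolved values instead of the raw pattern.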

Claude Code's own directory structure already isolates sessions per project, making the .sessions.json whitelist redundant. Removing it lets CLI-created sessions appear immediately and eliminates registration race conditions.

… resolution

All 7 hooks now use hook_input["cwd"] for project directory (replacing inconsistent env var fallback chains). New osprey_hook_log.py provides get_hook_input(), get_project_dir(), and OSPREY_HOOK_DEBUG-gated log_hook().

log_hook() now prints to stderr instead of writing to data/hooks/activity.log — simpler, no file management needed.
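
The hook logging behavior described in these commits can be sketched as below. This is not the contents of `osprey_hook_log.py`: the set of values treated as "enabled", the `[hook]` prefix, and the injectable `env`/`stream` parameters are assumptions added so the example is testable.

```python
import io
import json
import os
import sys


def debug_enabled(env=None):
    """Treat a few common truthy strings as 'debug on' (assumed convention)."""
    env = os.environ if env is None else env
    return env.get("OSPREY_HOOK_DEBUG", "").lower() in {"1", "true", "yes"}


def project_dir(hook_input):
    """Use the cwd reported in the hook payload as the project directory."""
    return hook_input["cwd"]


def log_hook(message, env=None, stream=None):
    """Print to stderr (or the given stream) only when debugging is enabled."""
    if not debug_enabled(env):
        return False
    target = stream if stream is not None else sys.stderr
    print(f"[hook] {message}", file=target)
    return True


# Simulated hook payload; a real hook would read this JSON from stdin.
hook_input = json.loads('{"cwd": "/work/demo", "tool_name": "Bash"}')
buffer = io.StringIO()
logged = log_hook(f"cwd={project_dir(hook_input)}", env={"OSPREY_HOOK_DEBUG": "1"}, stream=buffer)
silent = log_hook("ignored", env={}, stream=buffer)
print(buffer.getvalue(), end="")
```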

The env var was never reaching hooks because (1) OSPREY_CONFIG was not set before lifespan config reads, causing CWD-dependent cache misses, and (2) four layers of silent `except: pass` hid every failure.

- Set OSPREY_CONFIG early in lifespan + reset stale config cache
- Replace silent error swallowing with logged warnings in app.py and operator_session.py
- Add config.yml fallback in osprey_hook_log._is_debug_enabled() so hooks work even if env var propagation breaks
- Default hooks.debug to true in template config
- Add 16 tests covering the full propagation chain

Add agentsview (Go binary) as a new "SESSION ANALYTICS" tab in the web terminal, following the same iframe panel pattern as Artifacts, ARIEL, and Monitoring. Auto-launches on `osprey web` if installed, degrades gracefully if not.

MCP servers called initialize_registry() on startup, loading LangGraph-era components (capabilities, services, approval manager, prompt providers) that no MCP tool uses at runtime. Remove the call and add startup timing instrumentation. Saves ~3.3s per server process (~23s cumulative across 7 servers).

Send theme:set alongside osprey-theme-change so agentsview can switch themes instantly via postMessage instead of iframe reload. Remove session-analytics from CROSS_ORIGIN_PANELS reload list.

Keep Claude processes alive in the background when switching sessions, enabling near-instant reattach for warm sessions. Adds LRU pool semantics to PtyRegistry with configurable max_background_sessions (default 5), and a switch_session WebSocket message so the frontend never closes/reopens the connection on switch.

- PtyRegistry: OrderedDict + attached set, get_or_create_session, attach/detach/rekey_session, LRU eviction
- routes.py: extract _run_output_loop, parameterized _discover_and_notify, switch_session handler, finally detaches instead of terminating
- api.js: setUrl() on createWebSocket for reconnect URL updates
- terminal.js: switchSession() export, session_switched/error handlers
- sessions.js: fast path via switchSession with cold fallback
- 13 new unit tests for pool behavior
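
The pool behavior in this commit (an `OrderedDict` plus an attached set, with LRU eviction of background sessions) can be sketched as follows. `SessionPool` is a hypothetical stand-in for PtyRegistry; its method names and the `evicted` list are illustrative, and a real pool would terminate the evicted process rather than just record its id.

```python
from collections import OrderedDict


class SessionPool:
    """Attached sessions are never evicted; detached ones are capped, oldest first."""

    def __init__(self, max_background_sessions=5):
        self.max_background = max_background_sessions
        self._sessions = OrderedDict()  # session_id -> handle, least recent first
        self._attached = set()
        self.evicted = []

    def get_or_create(self, session_id, factory):
        if session_id not in self._sessions:
            self._sessions[session_id] = factory()
        self._sessions.move_to_end(session_id)  # mark as most recently used
        return self._sessions[session_id]

    def attach(self, session_id):
        if session_id in self._sessions:
            self._attached.add(session_id)
            self._sessions.move_to_end(session_id)

    def detach(self, session_id):
        self._attached.discard(session_id)
        self._evict_if_needed()

    def _evict_if_needed(self):
        background = [sid for sid in self._sessions if sid not in self._attached]
        while len(background) > self.max_background:
            oldest = background.pop(0)
            self._sessions.pop(oldest)
            self.evicted.append(oldest)

    def session_ids(self):
        return list(self._sessions)


pool = SessionPool(max_background_sessions=2)
for sid in ["a", "b", "c"]:
    pool.get_or_create(sid, object)
    pool.attach(sid)
    pool.detach(sid)

pool.get_or_create("b", object)  # touching "b" makes "c" the oldest
pool.get_or_create("d", object)
pool.attach("d")
pool.detach("d")
print(pool.evicted, pool.session_ids())
```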

…ration

Add PROVIDER_API_KEYS canonical dict to provider_registry.py and replace 4 drifted inline lists across CLI modules. Extract register_builtin_connectors() in connectors/factory.py so MCP registry delegates instead of duplicating.

Restructure around safety chain as the central concept. Add hook chain diagram, build/deploy section, expanded layers table (5 → 7 layers), and updated data flow showing PreToolUse hooks in the sequence diagram. Remove the per-tool hook matrix table.

…kflow

Add Claude Code CLI and Node.js prerequisites, PyPI install option, updated templates list, agent launch instructions (direct/managed/web), MCP server overview dropdown, and revised troubleshooting. Reframe container runtime as optional and remove placeholder section.

…ata directory

Replace all references to osprey-workspace with _agent_data across the entire codebase. Config key changed from workspace.base_dir to agent_data.base_dir. resolve_workspace_root() renamed to resolve_agent_data_root() with backward-compatible alias.

Replace the deleted hello_world_weather template with a new hello_world template that introduces Claude Code + MCP with a mock control system. Rewrite the tutorial from scratch, removing 4 stale LangGraph PLACEHOLDERs and documenting the current architecture. Update all cross-references in docs, CLI help text, and CI workflow.

Missed from 9c180b9 — update README quick-start to use current CLI commands, remove stale PLACEHOLDERs and deleted servers (AccelPapers, MATLAB, graph tools) from MCP servers docs, add entry_publish tool.

Remove references to non-existent CLI commands (osprey tasks, osprey claude install) and directories (.ai-tasks/) from both the Sphinx contributing page and CONTRIBUTING.md.

The ARIEL how-to docs described a single ariel_search tool inside the control_system MCP server, but the actual implementation is a dedicated MCP server (osprey.mcp_server.ariel) with 11 specialized tools. Also contained stale LangGraph/LangChain references and wrong threshold values.

- Remove 2 PLACEHOLDER: CONCEPTUAL-MAPPING admonitions
- Rewrite osprey-integration.rst with correct MCP architecture and tool table
- Delete fabricated error classification section (classify_error doesn't exist)
- Fix similarity threshold 0.7 → 0.5 to match DEFAULT_SIMILARITY_THRESHOLD
- Replace LangGraph/LangChain refs with actual implementation (custom async ReAct loop)
- Fix citation parsing description ([#id] regex → entry_id substring matching)
- Add --mode auto CLI documentation to search-modes.rst
- Add all 12 osprey ariel CLI subcommands to index.rst
- Fix logbook_search capability reference in web-interface.rst

…ural sections

- Replace unrealistic "set beam current to 500 mA" scenario with corrector magnet bump workflow using realistic PV names
- Add Channel Finder sub-agent step showing address resolution before read
- Remove Build & Deploy section (operational, not architectural)
- Remove Layers table (stale directory listing)
- Remove Runtime API section (better suited for API reference)

…sions

- Fix str/bytes TypeError in lifecycle timeout handler: Python's subprocess.TimeoutExpired stores raw bytes in stdout/stderr even with text=True, causing a crash when concatenating with fallback str
- Add bundled extension support to duckdb_import: prefer local FTS extension file at data/duckdb_extensions/ before attempting download, with HTTP_PROXY fallback for proxy-restricted environments

- build_cmd: add --skip-lifecycle to skip pre_build, post_build, and validate phases (needed for CI where no container runtime exists)
- channel_finder: register query_channels tool, add duckdb_path config property for DuckDB-backed channel search

Skips venv creation and dependency installation when building in CI where OSPREY and deps are pre-installed in the container image.

POST /api/chat with SSE streaming (default) or buffered JSON response. Creates an ephemeral OperatorSession per request, reusing the existing operator mode infrastructure.

McpServerDef now accepts an optional `url` field for HTTP/SSE MCP servers. Profile YAML can specify either `command` (stdio) or `url` (HTTP) per server — validated as mutually exclusive at parse time. _inject_mcp_servers() emits {"type": "sse", "url": "..."} entries in .mcp.json for URL-based servers.
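
The mutual-exclusion rule and the emitted entry shape can be sketched with a small dataclass. `ServerDef` is a hypothetical stand-in for McpServerDef; the `args` field, the shape of the stdio entry, and the rejection of a definition with neither field are assumptions. Only the `{"type": "sse", "url": ...}` shape comes from the commit message.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ServerDef:
    name: str
    command: Optional[str] = None
    args: list = field(default_factory=list)
    url: Optional[str] = None

    def __post_init__(self):
        # Exactly one transport must be given: stdio (command) or HTTP/SSE (url).
        if bool(self.command) == bool(self.url):
            raise ValueError(
                f"server '{self.name}': specify exactly one of 'command' or 'url'"
            )

    def to_mcp_entry(self):
        """Return the dict that would be written under this server's name."""
        if self.url:
            return {"type": "sse", "url": self.url}
        return {"command": self.command, "args": list(self.args)}


remote = ServerDef("remote", url="http://localhost:9000/sse").to_mcp_entry()
local = ServerDef("local", command="python", args=["-m", "demo_server"]).to_mcp_entry()

try:
    ServerDef("broken", command="python", url="http://localhost:9000/sse")
    both_rejected = False
except ValueError:
    both_rejected = True

print(remote, local, both_rejected)
```

Validating at construction time mirrors the "validated at parse time" behavior: a bad profile fails early instead of producing a malformed `.mcp.json`.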

…n-use check

Karma (claude-code-karma) was a speculative external dependency that was never published to PyPI. Its auto-launch on every `osprey web` start produced a noisy ERROR traceback followed by misleading "launched" and "available" log messages. Remove all references across registry, launcher, web terminal (app, routes, JS), config templates, pyproject.toml, docs, and tests.

Also add a port pre-flight check in `osprey web` foreground mode so stale processes on port 8087 produce an actionable error message instead of a raw uvicorn crash.
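
A port pre-flight check of the kind mentioned here can be done with a plain socket probe. This is a generic sketch, not the check added to `osprey web`; `port_in_use` is a hypothetical helper, and the demonstration uses an ephemeral port rather than 8087.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        return probe.connect_ex((host, port)) == 0


# Demonstration: occupy an ephemeral port, probe it, release it, probe again.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

busy = port_in_use(port)
listener.close()
free_again = not port_in_use(port)
print(busy, free_again)
```

A caller would run the probe before starting the server and print a message naming the port and suggesting how to find the stale process.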

… __init_subclass__

Add automatic writes_enabled pre-check to all connector subclasses through ControlSystemConnector.__init_subclass__ wrapping. This ensures the safety invariant holds even when the PreToolUse hook chain is bypassed (e.g., approved readwrite Python subprocesses reaching the connector directly).

- Add _writes_enabled property to base class (reads from global config, defaults to False/fail-safe)
- Add __init_subclass__ that wraps any subclass write_channel() with a guard returning ChannelWriteResult(success=False) when writes disabled
- Remove MockConnector's redundant _enable_writes logic (now handled by base class)
- Fix MockDynamicConnector to return proper ChannelWriteResult
- Update sibling tests to mock get_config_value instead of config dict
- Add 13 new tests across 4 test classes in test_writes_enabled.py
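
The `__init_subclass__` wrapping technique can be shown in isolation. This is a simplified, synchronous sketch: `BaseConnector`, `WriteResult`, the `CONFIG` dict, and the channel name are hypothetical stand-ins for ControlSystemConnector, ChannelWriteResult, the global config lookup, and a real PV, and the real `write_channel()` may well be async.

```python
import functools
from dataclasses import dataclass

CONFIG = {"writes_enabled": False}  # stand-in for global config; fail-safe default


@dataclass
class WriteResult:
    success: bool
    message: str = ""


class BaseConnector:
    @property
    def _writes_enabled(self):
        return CONFIG.get("writes_enabled", False)

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        original = cls.__dict__.get("write_channel")
        if original is None:
            return  # nothing defined here; an inherited method is already guarded

        @functools.wraps(original)
        def guarded(self, channel, value):
            if not self._writes_enabled:
                return WriteResult(False, f"writes disabled; refused write to {channel}")
            return original(self, channel, value)

        cls.write_channel = guarded


class DemoConnector(BaseConnector):
    def __init__(self):
        self.values = {}

    def write_channel(self, channel, value):
        self.values[channel] = value
        return WriteResult(True)


conn = DemoConnector()
blocked = conn.write_channel("DEMO:CORRECTOR:SP", 1.5)
values_after_block = dict(conn.values)

CONFIG["writes_enabled"] = True
allowed = conn.write_channel("DEMO:CORRECTOR:SP", 1.5)
print(blocked, allowed, conn.values)
```

Because the guard is installed when the subclass is defined, a connector author cannot forget it, which is what makes the invariant hold even when the hook chain is bypassed.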

The web terminal loaded xterm.js, highlight.js, marked.js, and Google Fonts from cdn.jsdelivr.net, which is blocked by restrictive proxies (e.g., ALS squid returns 403). Bundle all vendor JS/CSS/fonts into static/vendor/ so the terminal works without external CDN access.

…ators

The interactive init wizard failed with "No providers could be loaded" and "No code generators available" because:

1. get_provider_metadata() read from config.providers (empty after providers moved to provider_registry.py). Now uses ProviderRegistry directly.
2. get_code_generator_metadata() had generators hardcoded to {} after being removed from the registry. Restored with "basic" (always available) and "claude_code" (available when claude-agent-sdk is installed).

Also updates the Argo provider default_base_url to the new endpoint https://apps.inside.anl.gov/argoapi/v1.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

src/osprey/interfaces/web_terminal/static/js/config-renderers.js

+    }
+
+    // ---- Drag-and-drop wiring across columns ----
+    _wireDragAndDrop(columns, colMap, markDirty, container);


src/osprey/interfaces/web_terminal/static/js/prompts-gallery.js

+    fetchJSON('/api/prompts/create', {
+      method: 'POST',
+      headers: { 'Content-Type': 'application/json' },
+      body: JSON.stringify({ category, name: sanitized }),
+    })


src/osprey/interfaces/ariel/static/js/app.js

 import { initDashboard, loadStatus, startAutoRefresh, stopAutoRefresh } from './dashboard.js';
 import { initAdvancedOptions } from './advanced-options.js';
+import { initDrawers } from './drawer.js';
+import { initSettings, loadConfig } from './settings.js';


src/osprey/interfaces/ariel/static/js/app.js

 import { initDashboard, loadStatus, startAutoRefresh, stopAutoRefresh } from './dashboard.js';
 import { initAdvancedOptions } from './advanced-options.js';
+import { initDrawers } from './drawer.js';
+import { initSettings, loadConfig } from './settings.js';
+import { loadFileList } from './claude-setup.js';


src/osprey/interfaces/artifacts/static/js/gallery.js

+    return a.timestamp && a.timestamp >= _sessionStart;
+  }
+
+  function sendToTerminal(text) {


src/osprey/interfaces/channel_finder/static/js/stats-badges.js

+ */
+
+import { fetchJSON } from './api.js';
+import { state } from './state.js';


src/osprey/interfaces/artifacts/static/js/gallery.js

+      if (!chartResp.ok) throw new Error(`Chart fetch failed: ${chartResp.status}`);
+      const chartData = await chartResp.json();
+      const columns = chartData.columns || [];
+      const summary = artifact.metadata || {};


src/osprey/interfaces/web_terminal/static/js/app.js

@@ -0,0 +1,327 @@
+/* OSPREY Web Terminal — Application Entry Point */
+
+import { initTerminal, fitTerminal, focusTerminal, getTerminalDimensions, stopTerminal, startTerminal, restartTerminal, pasteToTerminal } from './terminal.js';


src/osprey/interfaces/web_terminal/static/js/panel-manager.js

+
+// ---- Empty State ----
+
+function renderEmptyState(message) {


src/osprey/interfaces/web_terminal/static/session.html

+      const aid = agent.agent_id || 'main';
+      const isRoot = !agent.agent_id || agent.agent_id === 'main';
+      const name = isRoot ? 'ROOT SESSION' : (agent.agent_type || aid);
+      const borderColor = isRoot ? 'var(--accent)' : `var(--srv, var(--accent))`;


tests/interfaces/channel_finder/test_feedback_api.py

+        assert resp.status_code == 404
+
+    def test_delete_returns_404(self, client):
+        assert client.delete("/api/feedback/somekey").status_code == 404


tests/interfaces/channel_finder/test_feedback_api.py

+        assert client.delete("/api/feedback/somekey").status_code == 404
+
+    def test_clear_returns_404(self, client):
+        assert client.delete("/api/feedback?confirm=true").status_code == 404


tests/services/channel_finder/feedback/test_pending_store.py

+
+def test_delete_existing_item(store):
+    item_id = store.capture({"query": "magnets", "facility": "ALS"})
+    assert store.delete(item_id) is True


tests/services/channel_finder/feedback/test_pending_store.py

+
+
+def test_delete_missing_returns_false(store):
+    assert store.delete("nonexistent") is False


tests/interfaces/web_terminal/test_app.py

+        assert resp.status_code == 200
+        assert resp.json()["active_panel"] is None
+
+    def test_set_panel_focus_artifacts(self, client):


tests/mcp_server/test_screen_capture_backend_macos.py

+
+    async def mock_exec(*args, **kwargs):
+        if args[0] == "screencapture":
+            open(args[-1], "wb").write(b"PNG_DATA")


tests/mcp_server/test_screen_capture_backend_macos.py

+    async def mock_exec(*args, **kwargs):
+        captured_args.append(args)
+        if args[0] == "screencapture":
+            open(filepath, "wb").write(b"PNG_DATA")


cr-xu · 2026-04-09T19:02:09Z

@hshang315 I think this PR should target the next branch, not main. The main branch is still on the old backend.

thellert added 30 commits March 22, 2026 09:06

feat(init): clear Claude Code project state on --force

460d91b

When re-creating a project at the same path, Claude Code remembers the previous trust decision from ~/.claude/projects/<path>/. Remove that cached state on --force so the trust prompt appears again on first launch.

fix(init): clear trust state from ~/.claude.json on --force

9617c52

The trust decision ("hasTrustDialogAccepted") lives in ~/.claude.json → projects.<path>, not in ~/.claude/projects/. Now --force removes the entry from both locations so the trust prompt reappears on next Claude Code launch.

fix(init): always clear Claude Code trust state, not just on --force

b1cddb5

Users often rm -rf the project before re-running osprey init, so the --force cleanup path is never reached. Move trust/session state cleanup to run unconditionally on every osprey init.

fix(init): suppress noisy config_add_to_list log during init

4209f04

Demote INFO→DEBUG so internal yaml mutation details don't leak into the otherwise clean Rich console output.

feat(config): add resolve_model_id for tier-to-model-id resolution

64b79ba

feat(init): switch to tier names and default to haiku

32a8af7

feat(services): resolve tier names in ariel, channel-namer, and pytho…

3237b80

…n executor

feat(channel-finder): add tree preview, selections paths, feedback st…

d8f1593

…ore, and device info

feat(transcript): add agent lifecycle events and string content handling

13dd565

feat(web-terminal): add session diagnostics panel and safety page

3990cb7

refactor(accelpapers): migrate from SQLite FTS5 to Typesense

ffe17c5

refactor(web-terminal): remove SessionRegistry whitelist

072e783

Claude Code's own directory structure already isolates sessions per project, making the .sessions.json whitelist redundant. Removing it lets CLI-created sessions appear immediately and eliminates registration race conditions.

refactor(hooks): use stderr instead of file for debug logging

ebcb802

log_hook() now prints to stderr instead of writing to data/hooks/activity.log — simpler, no file management needed.

feat(web-terminal): add postMessage theme:set for agentsview iframe

594b6e6

Send theme:set alongside osprey-theme-change so agentsview can switch themes instantly via postMessage instead of iframe reload. Remove session-analytics from CROSS_ORIGIN_PANELS reload list.

chore: remove obsolete tests for deleted CLI commands, registry, and …

4ce1333

…services

fix(tests): repair pre-existing test failures across multiple modules

e01b8ed

fix(ariel-search): catch ImportError in prompt builder loading

d1ceb8c

perf(mcp): add startup_timer instrumentation to all MCP servers

e5e045d

feat(accelpapers): switch to hybrid search with configurable embedding

61e3861

feat(artifacts): add session-scoped filtering and anti-flash Plotly t…

66f183c

…heming

feat(transcript-reader): add session-id lookup and agent timeline

e025d55

zhe-slac and others added 27 commits March 26, 2026 14:56

Merge origin/next: deploy guide fixes

93eea52

Add use CLI chat interface how-to guide

0f6e855

Update CLI reference how-to guide

cd0e304

docs(python-executor): fix inaccuracies and resolve placeholders

84df447

Merge branch 'next' of https://github.com/als-apg/osprey into next

fa4f394

Update build profiles how-to guide

2fccf2d

Merge remote-tracking branch 'origin/next' into next

7a531a4

docs(add-connector): fix inaccuracies and resolve placeholders

5580404

docs: update README and MCP servers page for Claude Code architecture

73450c8

Missed from 9c180b9 — update README quick-start to use current CLI commands, remove stale PLACEHOLDERs and deleted servers (AccelPapers, MATLAB, graph tools) from MCP servers docs, add entry_publish tool.

docs(contributing): remove stale AI workflow sections

a6b23a8

Remove references to non-existent CLI commands (osprey tasks, osprey claude install) and directories (.ai-tasks/) from both the Sphinx contributing page and CONTRIBUTING.md.

feat(build): add --skip-deps flag for CI environments

4c0cd2d

Skips venv creation and dependency installation when building in CI where OSPREY and deps are pre-installed in the container image.

feat(web): add REST /api/chat endpoint for programmatic Claude access

885db5e

POST /api/chat with SSE streaming (default) or buffered JSON response. Creates an ephemeral OperatorSession per request, reusing the existing operator mode infrastructure.

docs: add FIXES.md and INSTALL.md

c214ddf

docs: add CLI reference to README

1816a59

github-code-quality bot found potential problems Apr 3, 2026

hshang315 wants to merge 333 commits into main from fix/cli-init-providers-and-argo-url

5 participants
