feat(services): XOpt Optimizer Service infrastructure #89

Draft
thellert wants to merge 15 commits into main from feature/xopt-optimizer-service

@thellert
Collaborator

Summary

This PR adds the foundational agent infrastructure for an XOpt/Badger-based optimizer service. This is a work-in-progress - the agent scaffolding and workflow nodes are implemented, but the actual optimization routines are not yet connected.

What's implemented:

  • State identification node: ReAct agent that assesses machine readiness using channel access tools
  • Strategy decision node: Selects exploration vs. optimization approach based on current state
  • YAML generation agent: Creates XOpt configurations using ReAct pattern with reference file tools
  • Approval node: Human-in-the-loop integration for reviewing proposed optimizations
  • Execution and analysis nodes: Scaffolding for running and evaluating optimizations
  • Multi-iteration approval support: Custom reducers enable approval fields to be reset between iterations
  • Optimization capability: Routes optimization requests to the XOpt service
  • Default optimization prompts: Prompt builder for XOpt workflows
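The multi-iteration approval support listed above can be illustrated with a minimal sketch. It assumes LangGraph's convention of attaching a reducer to a state field through `typing.Annotated`. The `overwrite` reducer, the `ApprovalState` class, and the `apply_update` helper are hypothetical stand-ins rather than the code in this PR, and `apply_update` only imitates how a graph runtime would fold updates into state.

```python
from typing import Annotated, Any, Optional, TypedDict, get_type_hints


def overwrite(existing: Any, new: Any) -> Any:
    """Reducer that always keeps the latest write, including a reset to None."""
    return new


class ApprovalState(TypedDict, total=False):
    # Field names come from the PR; the reducer wiring is an assumption.
    approval_approved: Annotated[Optional[bool], overwrite]
    approved_payload: Annotated[Optional[dict], overwrite]


def apply_update(state: dict, update: dict) -> dict:
    """Hypothetical helper: fold an update into state using each field's reducer."""
    hints = get_type_hints(ApprovalState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        reducer = hints[key].__metadata__[0]
        merged[key] = reducer(merged.get(key), value)
    return merged


# Iteration 1 approves, a reset clears both fields, iteration 2 rejects.
state = apply_update({}, {"approval_approved": True, "approved_payload": {"run": 1}})
state = apply_update(state, {"approval_approved": None, "approved_payload": None})
state = apply_update(state, {"approval_approved": False})
```

Because the reducer accepts `None` as a valid new value, an explicit reset between iterations is treated like any other write.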

What's NOT yet implemented:

  • Actual XOpt/Badger optimization execution
  • Real control system integration
  • End-to-end optimization workflows

Test plan

  • Unit tests pass: pytest tests/services/xopt_optimizer/ -v
  • Existing tests unaffected: pytest tests/ --ignore=tests/e2e -v
  • Type checking passes: mypy src/

- Add ArtifactType enum (IMAGE, NOTEBOOK, COMMAND, HTML, FILE)
- Create register_artifact() as single source of truth
- Legacy methods delegate to new API with dual-write pattern
- Add populate_legacy_fields_from_artifacts() for backward compatibility
- Python capability uses new API directly with clean accumulation

- ArtifactGallery with keyboard navigation (Ctrl+a, j/k, Enter, o, Esc)
- ArtifactViewer modal with type-specific details and external open
- Native image rendering via textual-image (Sixel/Kitty protocols)
- New/seen tracking with [NEW] badges for current turn artifacts
- Integration with ChatDisplay for artifact state management

- Replace _extract_figures/commands/notebooks_for_cli with single method
- Read from ui_artifacts directly instead of legacy fields
- Support all artifact types: IMAGE, NOTEBOOK, COMMAND, HTML, FILE
- Add type-specific formatting with icons and grouped display
- Remove ~150 lines of redundant legacy extraction code

Enable approval_approved and approved_payload fields to be overwritten
across multiple approval iterations. Without this, LangGraph's default
LastValue channel throws errors when setting approval fields multiple
times in multi-iteration workflows (e.g., XOpt optimization).

Display repeat approval requests in styled Panel matching the first
approval prompt, rather than plain text. Ensures consistent visual
treatment for multi-iteration approval workflows.

Implement XOpt-based autonomous machine parameter optimization service
with complete workflow: state identification, strategy decision, YAML
generation, human approval, and execution.

Key components:
- State identification: ReAct agent assesses machine readiness
- Decision: LLM selects exploration vs optimization strategy
- YAML generation: ReAct agent creates XOpt configurations
- Approval: Human-in-the-loop with structured interrupt
- Configurable modes: react (LLM) or mock (fast tests)

Includes optimization capability, prompt builder, config templates,
approval helpers, and comprehensive test suite.

Document how XOpt/Badger's Environment abstraction maps to Osprey's
ConnectorFactory. The Environment defines variables/observables while
the Interface (OspreyInterface) bridges to Osprey's connector system.

This provides the architectural insight for XOpt integration without
implementing domain-specific logic that requires expert input.
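
A rough sketch of that mapping, under stated assumptions: the `Connector` protocol, the `InMemoryConnector` stand-in, the `get_values`/`set_values` method names, and the channel names are all hypothetical and are not taken from Osprey or Badger. The real connectors may well be asynchronous. The sketch only shows the division of labor described above, with the Environment declaring variables and observables and the interface doing the I/O.

```python
from typing import Any, Protocol


class Connector(Protocol):
    """Hypothetical shape of a control-system connector."""

    def read(self, channel: str) -> Any: ...
    def write(self, channel: str, value: Any) -> None: ...


class InMemoryConnector:
    """Stand-in connector so the sketch runs without a control system."""

    def __init__(self, values: dict) -> None:
        self.values = dict(values)

    def read(self, channel: str) -> Any:
        return self.values[channel]

    def write(self, channel: str, value: Any) -> None:
        self.values[channel] = value


class OspreyInterface:
    """Bridges optimizer reads/writes to a connector (method names are assumed)."""

    def __init__(self, connector: Connector) -> None:
        self.connector = connector

    def get_values(self, channels: list) -> dict:
        return {name: self.connector.read(name) for name in channels}

    def set_values(self, inputs: dict) -> None:
        for name, value in inputs.items():
            self.connector.write(name, value)


class Environment:
    """Declares what may be varied and observed; I/O goes through the interface."""

    variables = {"QUAD:1:K1": (-1.0, 1.0)}  # made-up channel -> allowed range
    observables = ["BPM:1:X"]  # made-up channel

    def __init__(self, interface: OspreyInterface) -> None:
        self.interface = interface


connector = InMemoryConnector({"QUAD:1:K1": 0.0, "BPM:1:X": 0.3})
env = Environment(OspreyInterface(connector))
env.interface.set_values({"QUAD:1:K1": 0.5})
readback = env.interface.get_values(["QUAD:1:K1", "BPM:1:X"])
```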

Also includes ruff formatting fixes from quick_check.

@wenatuhs
Collaborator

wenatuhs commented Feb 1, 2026

@claude please review this PR. Focus on whether the XOpt/Badger optimization ability should be added to Osprey as a service, or whether we should have it as a subagent, since it looks like a ReAct agent itself.

@claude

claude bot commented Feb 1, 2026

Claude finished @wenatuhs's task in 3m 1s


Architectural Review: XOpt Service vs. Subagent Pattern

TL;DR Recommendation

✅ The current service architecture is correct. The XOpt optimizer should remain a service, not be converted to a subagent.


Detailed Analysis

Understanding the Architecture Layers

The Osprey framework has a clear separation of concerns:

  1. Main Graph Layer: The orchestrator manages conversation flow and capability routing
  2. Capability Layer: Thin gateways that classify intent and route to services
  3. Service Layer: Complex, multi-stage workflows with their own LangGraph instances
  4. Component Layer: Individual agents (like ReAct agents) that are nodes within services

Why XOpt Should Be a Service

1. It's a Workflow, Not Just an Agent

The XOpt optimizer is not a single ReAct agent—it's an orchestrated workflow with 6 distinct stages:

State Identification → Decision → YAML Generation → Approval → Execution → Analysis
         ↑                                                                      ↓
         └──────────────────────────── (iteration loop) ─────────────────────┘

File references:

  • Workflow definition: src/osprey/services/xopt_optimizer/service.py:185-244
  • State machine: src/osprey/services/xopt_optimizer/models.py:215-273

Each stage has different concerns:

  • State identification (ReAct agent): Assesses machine readiness using channel access tools
  • Decision: Routes between exploration/optimization/abort strategies
  • YAML generation (ReAct agent): Creates XOpt configs using reference files
  • Approval: Human-in-the-loop gate with interrupt/resume
  • Execution: Runs XOpt/Badger optimization (placeholder)
  • Analysis: Evaluates results and controls iteration loop

Two of these six stages happen to use ReAct agents internally, but that doesn't make the whole system a ReAct agent.
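
To make the workflow shape concrete, here is a plain-Python stand-in for the six-stage loop. It is a sketch only: the real service builds a LangGraph graph, and `run_workflow`, `record`, and the `iteration`/`should_continue`/`trace` fields are hypothetical. It also ignores early exits such as abort, rejected approvals, and YAML retries.

```python
from typing import Callable

State = dict
Stage = Callable[[State], State]

STAGE_ORDER = [
    "state_identification",
    "decision",
    "yaml_generation",
    "approval",
    "execution",
    "analysis",
]


def run_workflow(stages: dict, state: State, max_iterations: int = 3) -> State:
    """Run the six stages in order, looping while analysis asks to continue."""
    for iteration in range(1, max_iterations + 1):
        state = {**state, "iteration": iteration}
        for name in STAGE_ORDER:
            state = stages[name](state)
        if not state.get("should_continue", False):
            break
    return state


def record(name: str) -> Stage:
    """Build a placeholder stage that only appends its own name to a trace."""
    return lambda state: {**state, "trace": state.get("trace", []) + [name]}


stages = {name: record(name) for name in STAGE_ORDER}
# Placeholder analysis: request one more pass after the first iteration only.
stages["analysis"] = lambda s: {
    **s,
    "trace": s.get("trace", []) + ["analysis"],
    "should_continue": s["iteration"] < 2,
}

final = run_workflow(stages, {})
```

Only the two ReAct stages would involve an LLM; the runner itself has no agent behavior, which is the point of the distinction above.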

2. Services Handle Complex Cross-Cutting Concerns

The XOpt service manages capabilities that go beyond what a simple subagent can handle:

Multi-iteration loops with state persistence:

# src/osprey/services/xopt_optimizer/service.py:232-237
workflow.add_conditional_edges(
    "analysis",
    self._loop_router,
    {"continue": "state_identification", "complete": "__end__"},
)
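
For illustration, a router compatible with the edge mapping above might look like the following. This is an assumption about its shape, not the contents of `_loop_router`; the `is_failed`, `iteration_count`, `max_iterations`, and `should_continue` fields are hypothetical.

```python
def loop_router(state: dict) -> str:
    """Return the edge label the conditional edge after analysis should follow."""
    if state.get("is_failed"):
        return "complete"
    if state.get("iteration_count", 0) >= state.get("max_iterations", 1):
        return "complete"
    return "continue" if state.get("should_continue") else "complete"


# Same label-to-node mapping as in the excerpt above.
EDGES = {"continue": "state_identification", "complete": "__end__"}

next_node = EDGES[
    loop_router({"iteration_count": 1, "max_iterations": 3, "should_continue": True})
]
```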

LangGraph-native approval workflows:

  • Interrupt support with checkpointing: service.py:95-109
  • Resumable workflows after human approval
  • Approval state that persists across iterations
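
The pause-and-resume behavior can be sketched with a Python generator. This is a stand-in for the idea only: it is not LangGraph's interrupt API, it persists nothing, and `approval_gate`, `resume`, and the request format are hypothetical.

```python
from typing import Generator


def approval_gate(proposed_config: str) -> Generator[dict, dict, dict]:
    """Pause with an approval request, then resume with the reviewer's decision."""
    decision = yield {"type": "approval_request", "payload": proposed_config}
    approved = bool(decision.get("approved"))
    return {
        "approval_approved": approved,
        "approved_payload": proposed_config if approved else None,
    }


def resume(gate: Generator[dict, dict, dict], decision: dict) -> dict:
    """Feed the decision back in and collect the gate's final state update."""
    try:
        gate.send(decision)
    except StopIteration as done:
        return done.value
    raise RuntimeError("approval gate yielded again instead of finishing")


gate = approval_gate("placeholder-config")
request = next(gate)  # the workflow pauses here; the request goes to the human
update = resume(gate, {"approved": True})
```

In the real service a checkpointer would hold the paused state between the request and the decision, so the two halves can run in different processes or sessions.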

Complex conditional routing:

  • 4 different routers with multi-way branching: service.py:246-300
  • YAML generation retry logic: service.py:261-276
  • Failure mode handling
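
As a rough sketch of the retry branch, a three-way router could be written as below. The edge labels and the `yaml_valid`/`yaml_attempts` fields are hypothetical and are not read from `service.py`.

```python
def yaml_generation_router(state: dict, max_attempts: int = 3) -> str:
    """Pick the next edge after a YAML generation attempt."""
    if state.get("yaml_valid"):
        return "approve"  # a valid config moves on to the approval gate
    if state.get("yaml_attempts", 0) < max_attempts:
        return "retry"  # try generation again, with the error fed back in
    return "fail"  # attempts exhausted: hand off to failure handling


first_try = yaml_generation_router({"yaml_valid": False, "yaml_attempts": 1})
exhausted = yaml_generation_router({"yaml_valid": False, "yaml_attempts": 3})
```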

Error tracking and recovery:

  • Structured error chain: models.py:103-131
  • Attempt numbering for retry strategies
  • Service-level exception handling: service.py:302-317
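
A structured error chain with attempt numbering might look roughly like this. `StageError` and `ErrorChain` are hypothetical names chosen for the sketch; the actual models in `models.py` may differ.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StageError:
    stage: str
    attempt: int
    message: str


@dataclass
class ErrorChain:
    errors: list = field(default_factory=list)

    def record(self, stage: str, message: str) -> StageError:
        """Append an error, numbering attempts per stage starting at 1."""
        attempt = 1 + sum(1 for e in self.errors if e.stage == stage)
        error = StageError(stage, attempt, message)
        self.errors.append(error)
        return error

    def summary(self) -> str:
        """Render the chain so a retry prompt or a log line can include it."""
        return "\n".join(f"[{e.stage} #{e.attempt}] {e.message}" for e in self.errors)


chain = ErrorChain()
chain.record("yaml_generation", "invalid YAML syntax")
second = chain.record("yaml_generation", "missing required field")
```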

These features require a separate LangGraph instance with its own checkpointer—exactly what the service pattern provides.

3. Follows Established Framework Patterns

The XOpt service exactly mirrors the Python executor service:

| Aspect | Python Executor | XOpt Optimizer |
|---|---|---|
| Service class | PythonExecutorService | XOptOptimizerService |
| Capability gateway | PythonCapability | OptimizationCapability |
| Request model | PythonExecutionRequest | XOptExecutionRequest |
| Result model | PythonServiceResult | XOptServiceResult |
| Approval integration | ✅ LangGraph interrupts | ✅ LangGraph interrupts |
| Retry logic | ✅ Multi-stage | ✅ Multi-stage |
| Internal workflow | ✅ LangGraph | ✅ LangGraph |

File references:

  • Python service: src/osprey/services/python_executor/service.py:31-100
  • XOpt service: src/osprey/services/xopt_optimizer/service.py:48-369
  • Python capability: src/osprey/capabilities/python.py:1-100
  • XOpt capability: src/osprey/capabilities/optimization.py:1-150

Breaking this consistency would create architectural confusion.

4. Capability Gateway Pattern is Intentional

The OptimizationCapability is a thin gateway, not redundant code:

# src/osprey/capabilities/optimization.py:144-150
def _create_optimization_context(service_result: XOptServiceResult):
    """Create OptimizationResultContext from service result."""
    return OptimizationResultContext(
        run_artifact=service_result.run_artifact,
        strategy=service_result.strategy.value,
        # ... transforms service result to capability context
    )

The capability layer:

  • Provides a stable interface for the orchestrator
  • Handles context transformation (service results → capability contexts)
  • Manages approval system integration
  • Classifies errors for the orchestrator
  • Generates orchestrator/classifier guidance

If you made XOpt a direct subagent in the main graph, you'd lose:

  • Clean separation between conversation orchestration and optimization workflow
  • The ability to checkpoint/resume complex multi-stage optimization workflows
  • Consistent error handling abstractions
  • The capability's context management features
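
To illustrate the error-classification role of the gateway, here is a minimal sketch. Every name in it (`Severity`, `Classification`, `classify_error`, and both exception classes) is hypothetical; Osprey's actual classification types are not shown in this thread.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    RETRIABLE = "retriable"
    REPLANNING = "replanning"
    CRITICAL = "critical"


class MachineNotReadyError(Exception):
    """Hypothetical: raised when state identification reports 'not ready'."""


class YamlGenerationError(Exception):
    """Hypothetical: raised when config generation exhausts its retries."""


@dataclass
class Classification:
    severity: Severity
    user_message: str


def classify_error(exc: Exception) -> Classification:
    """Translate service-level failures into something the orchestrator can act on."""
    if isinstance(exc, MachineNotReadyError):
        return Classification(Severity.REPLANNING, f"Machine not ready: {exc}")
    if isinstance(exc, YamlGenerationError):
        return Classification(Severity.RETRIABLE, f"Config generation failed: {exc}")
    return Classification(Severity.CRITICAL, f"Unexpected optimizer error: {exc}")


result = classify_error(MachineNotReadyError("beam current below threshold"))
```

The orchestrator only sees the severity and message, so it never needs to import optimizer-specific exception types.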

What About the ReAct Agents?

The two ReAct agents are implementation details of the service:

State Identification Agent

File: src/osprey/services/xopt_optimizer/state_identification/agent.py:106-335

class StateIdentificationAgent:
    """ReAct agent for assessing machine readiness."""

    async def assess_state(self, objective, additional_context):
        # Uses ReAct pattern to:
        # 1. Read reference docs about ready criteria
        # 2. Check channel values via control system
        # 3. Return structured assessment
        ...

This agent is a node in the service workflow, not the whole service.

YAML Generation Agent

File: src/osprey/services/xopt_optimizer/yaml_generation/agent.py:225-488

class YamlGenerationAgent:
    """ReAct agent for generating XOpt YAML configs."""

    async def generate_yaml(self, objective, strategy, additional_context):
        # Uses ReAct pattern to:
        # 1. Optionally read example YAMLs
        # 2. Generate config based on strategy
        # 3. Return validated YAML
        ...

Again, this is one stage in a six-stage workflow.

Neither agent has the complexity to justify being in the main orchestrator graph. They're specialized components that:

  • Don't need direct user interaction
  • Don't manage their own approval workflows
  • Don't handle multi-iteration loops
  • Are tightly coupled to the optimization domain

Hypothetical: What If It Were a Subagent?

If you tried to make XOpt a direct subagent in the main graph:

Problems You'd Face:

  1. No separate checkpoint space: The main graph's checkpointer would be polluted with optimization-specific state (iteration counts, YAML configs, approval states, error chains). Currently, each service has its own checkpointer.

  2. Approval workflow complexity: You'd need to handle LangGraph interrupts at the main graph level, making the orchestrator's approval logic much more complex.

  3. Loss of encapsulation: The orchestrator would need to understand XOpt-specific routing logic (exploration vs. optimization strategies, YAML retry logic, iteration control).

  4. Can't follow the Python executor pattern: You'd have architectural inconsistency—Python uses a service, but optimization uses a subagent? Why?

  5. Harder to test: Services can be tested independently. tests/services/xopt_optimizer/test_xopt_service.py tests the entire workflow in isolation.
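
As an example of the isolation point, a stage can be exercised with a mock component and no LLM or control system. This is a sketch under assumptions: `MockStateIdentifier` and `run_state_identification` are hypothetical, and only the `assess_state` method name is borrowed from the agent excerpt above.

```python
import asyncio
from typing import Optional


class MockStateIdentifier:
    """Hypothetical mock-mode stand-in that skips the LLM and control system."""

    async def assess_state(self, objective: str, additional_context: Optional[dict] = None) -> dict:
        return {"ready": True, "reason": f"mock assessment for: {objective}"}


async def run_state_identification(identifier, objective: str) -> dict:
    """Node-like wrapper: turn the assessment into a state update."""
    assessment = await identifier.assess_state(objective)
    return {"machine_ready": assessment["ready"], "assessment": assessment}


def test_state_identification_in_mock_mode() -> None:
    update = asyncio.run(
        run_state_identification(MockStateIdentifier(), "minimize beam size")
    )
    assert update["machine_ready"] is True
    assert "minimize beam size" in update["assessment"]["reason"]


test_state_identification_in_mock_mode()
```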

What You'd Gain:

  • Marginally simpler capability layer (one fewer hop)
  • Slightly less code

The trade-off is terrible. You'd sacrifice clean architecture for minimal simplicity gains.


CodeQL Findings (Bonus Review)

Since I reviewed the code, here are the legitimate issues to fix:

1. Unused Variables

File: tests/test_tui_artifacts.py:75, tests/test_tui_artifacts.py:365
File: src/osprey/interfaces/tui/widgets/artifacts.py:187

These are legitimate—variables assigned but never used. Fix by either using them or calling the functions for side effects only.

2. Empty Exception Handlers

File: src/osprey/interfaces/tui/widgets/artifacts.py:313, 336

except Exception:
    pass  # CodeQL: empty handler with no comment

Add explanatory comments about why exceptions are intentionally swallowed (non-critical UI updates).
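
One possible shape for the fix, as a sketch: keep the broad handler but document the intent and log at debug level so failures stay discoverable. `refresh_badge` and its callback are hypothetical and are not the code in `artifacts.py`.

```python
import logging

logger = logging.getLogger(__name__)


def refresh_badge(update_badge) -> bool:
    """Run a non-critical UI update and report whether it succeeded."""
    try:
        update_badge()
    except Exception:
        # Intentionally swallowed: a failed badge refresh must not crash the TUI.
        logger.debug("Badge refresh failed; continuing", exc_info=True)
        return False
    return True


def failing_update() -> None:
    raise RuntimeError("widget not mounted")


ok = refresh_badge(lambda: None)
failed = refresh_badge(failing_update)
```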

3. Non-iterable in for loop

File: tests/test_tui_artifacts.py:365

This might be a false positive, but worth investigating.

These are unrelated to the architectural question and should be fixed separately.


Final Recommendation

Keep the service architecture. Here's why:

  1. Matches existing patterns: Consistent with Python executor service
  2. Proper separation of concerns: Orchestrator doesn't need to know about XOpt internals
  3. Handles complexity correctly: Multi-iteration loops, approval gates, error recovery
  4. Testable in isolation: Service can be tested without the full agent graph
  5. Scalable: Easy to add more optimization strategies or workflow stages
  6. Production-oriented: LangGraph checkpointing, interrupt/resume, and structured errors are already in place, even though execution is still a placeholder

The fact that the service uses ReAct agents internally doesn't mean the service should be a ReAct agent in the main graph. That's a category error—services orchestrate workflows; agents are components within those workflows.


Additional Comments

The PR description correctly labels this as infrastructure. It is foundational service scaffolding that provides:

  • A reusable pattern for any optimization backend (XOpt, Badger, custom optimizers)
  • A framework for machine state awareness in optimization workflows
  • A template for approval-gated, multi-iteration scientific workflows

The "work in progress" nature (placeholder execution, mock components) is appropriate—you're building the scaffolding first, which is the right approach. The architecture is sound; you just need to connect the actual XOpt/Badger execution when ready.


Files reviewed:

  • Service layer: src/osprey/services/xopt_optimizer/service.py, models.py
  • ReAct agents: state_identification/agent.py, yaml_generation/agent.py
  • Workflow nodes: decision/node.py, execution/node.py, analysis/node.py, approval/node.py
  • Capability gateway: src/osprey/capabilities/optimization.py
  • Comparison: src/osprey/services/python_executor/service.py, src/osprey/capabilities/python.py
  • Framework: src/osprey/graph/graph_builder.py

# Conflicts:
#	CHANGELOG.md
#	src/osprey/approval/__init__.py
#	src/osprey/capabilities/python.py
#	src/osprey/infrastructure/respond_node.py
#	src/osprey/interfaces/tui/app.py
#	src/osprey/interfaces/tui/styles.tcss
#	src/osprey/interfaces/tui/widgets/__init__.py
#	src/osprey/interfaces/tui/widgets/artifact_viewer.py
#	src/osprey/interfaces/tui/widgets/artifacts.py
#	src/osprey/interfaces/tui/widgets/chat_display.py
#	src/osprey/prompts/defaults/__init__.py
#	src/osprey/prompts/loader.py
#	src/osprey/registry/registry.py
#	src/osprey/state/artifacts.py
#	tests/test_artifacts.py
#	tests/test_tui_artifacts.py
    idx = int(choice) - 1
    if 0 <= idx < len(available):
        return available[idx]
except ValueError:

    idx = int(choice) - 1
    if 0 <= idx < len(environments):
        return environments[idx]
except ValueError:

3 participants