feat(services): XOpt Optimizer Service infrastructure by thellert · Pull Request #89 · als-apg/osprey

thellert · 2026-01-13T23:16:32Z

Summary

This PR adds the foundational agent infrastructure for an XOpt/Badger-based optimizer service. This is a work in progress: the agent scaffolding and workflow nodes are implemented, but the actual optimization routines are not yet connected.

What's implemented:

State identification node: ReAct agent that assesses machine readiness using channel access tools
Strategy decision node: Selects exploration vs. optimization approach based on current state
YAML generation agent: Creates XOpt configurations using ReAct pattern with reference file tools
Approval node: Human-in-the-loop integration for reviewing proposed optimizations
Execution and analysis nodes: Scaffolding for running and evaluating optimizations
Multi-iteration approval support: Custom reducers enable approval fields to be reset between iterations
Optimization capability: Routes optimization requests to the XOpt service
Default optimization prompts: Prompt builder for XOpt workflows

What's NOT yet implemented:

Actual XOpt/Badger optimization execution
Real control system integration
End-to-end optimization workflows

Test plan

Unit tests pass: pytest tests/services/xopt_optimizer/ -v
Existing tests unaffected: pytest tests/ --ignore=tests/e2e -v
Type checking passes: mypy src/

- Add ArtifactType enum (IMAGE, NOTEBOOK, COMMAND, HTML, FILE)
- Create register_artifact() as single source of truth
- Legacy methods delegate to new API with dual-write pattern
- Add populate_legacy_fields_from_artifacts() for backward compatibility
- Python capability uses new API directly with clean accumulation

- ArtifactGallery with keyboard navigation (Ctrl+a, j/k, Enter, o, Esc)
- ArtifactViewer modal with type-specific details and external open
- Native image rendering via textual-image (Sixel/Kitty protocols)
- New/seen tracking with [NEW] badges for current turn artifacts
- Integration with ChatDisplay for artifact state management

- Replace _extract_figures/commands/notebooks_for_cli with single method
- Read from ui_artifacts directly instead of legacy fields
- Support all artifact types: IMAGE, NOTEBOOK, COMMAND, HTML, FILE
- Add type-specific formatting with icons and grouped display
- Remove ~150 lines of redundant legacy extraction code

Enable approval_approved and approved_payload fields to be overwritten across multiple approval iterations. Without this, LangGraph's default LastValue channel throws errors when setting approval fields multiple times in multi-iteration workflows (e.g., XOpt optimization).
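The reducer idea in this commit message can be sketched without LangGraph itself. LangGraph reads a reducer from `Annotated[T, reducer]` metadata and calls it with the current and incoming values; the toy `apply_update` below only imitates that merge step. The names `overwrite`, `ApprovalState`, and `apply_update` are illustrative assumptions, not the PR's actual code.

```python
from typing import Annotated, Any, Callable, Optional, TypedDict, get_type_hints


def overwrite(current: Any, update: Any) -> Any:
    """Reducer that always keeps the newest value (hypothetical name)."""
    return update


class ApprovalState(TypedDict, total=False):
    # The Annotated metadata carries the reducer, mirroring LangGraph's convention.
    approval_approved: Annotated[Optional[bool], overwrite]
    approved_payload: Annotated[Optional[dict], overwrite]


def apply_update(state: dict, update: dict) -> dict:
    """Toy stand-in for one graph step: merge `update` into `state` via reducers."""
    hints = get_type_hints(ApprovalState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        reducer: Callable[[Any, Any], Any] = hints[key].__metadata__[0]
        merged[key] = reducer(merged.get(key), value)
    return merged


# First approval round.
state = apply_update({}, {"approval_approved": True, "approved_payload": {"iteration": 1}})
# Reset between iterations, then record the second decision.
state = apply_update(state, {"approval_approved": None, "approved_payload": None})
state = apply_update(state, {"approval_approved": False})
```

Because the reducer ignores the previous value, writing `None` is enough to clear a field before the next approval round.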

Display repeat approval requests in styled Panel matching the first approval prompt, rather than plain text. Ensures consistent visual treatment for multi-iteration approval workflows.

Implement XOpt-based autonomous machine parameter optimization service with complete workflow: state identification, strategy decision, YAML generation, human approval, and execution.

Key components:
- State identification: ReAct agent assesses machine readiness
- Decision: LLM selects exploration vs optimization strategy
- YAML generation: ReAct agent creates XOpt configurations
- Approval: Human-in-the-loop with structured interrupt
- Configurable modes: react (LLM) or mock (fast tests)

Includes optimization capability, prompt builder, config templates, approval helpers, and comprehensive test suite.

tests/test_tui_artifacts.py

src/osprey/interfaces/tui/widgets/artifacts.py

Document how XOpt/Badger's Environment abstraction maps to Osprey's ConnectorFactory. The Environment defines variables/observables while the Interface (OspreyInterface) bridges to Osprey's connector system. This provides the architectural insight for XOpt integration without implementing domain-specific logic that requires expert input. Also includes ruff formatting fixes from quick_check.
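A minimal sketch of the layering this commit message describes, under stated assumptions: the Environment declares variables and observables, and the Interface forwards channel reads and writes to a connector. Nothing here imports Badger or Osprey. `InMemoryConnector`, `ToyEnvironment`, the synchronous `read`/`write` methods, and the channel names are hypothetical stand-ins; only the name `OspreyInterface` comes from the text above.

```python
from typing import Dict, List, Tuple


class InMemoryConnector:
    """Stand-in for whatever ConnectorFactory would return (assumed sync API)."""

    def __init__(self) -> None:
        self._values: Dict[str, float] = {}

    def read(self, channel: str) -> float:
        return self._values.get(channel, 0.0)

    def write(self, channel: str, value: float) -> None:
        self._values[channel] = value


class OspreyInterface:
    """Channel-level bridge: all I/O is delegated to the connector."""

    def __init__(self, connector: InMemoryConnector) -> None:
        self._connector = connector

    def get_values(self, channels: List[str]) -> Dict[str, float]:
        return {name: self._connector.read(name) for name in channels}

    def set_values(self, inputs: Dict[str, float]) -> None:
        for name, value in inputs.items():
            self._connector.write(name, value)


class ToyEnvironment:
    """Declares what can be tuned and observed; knows nothing about transport."""

    variables: Dict[str, Tuple[float, float]] = {"QUAD:K1": (-1.0, 1.0)}  # name -> bounds
    observables: List[str] = ["BPM:X"]

    def __init__(self, interface: OspreyInterface) -> None:
        self.interface = interface

    def set_variables(self, inputs: Dict[str, float]) -> None:
        # Reject out-of-bounds settings before anything reaches the connector.
        for name, value in inputs.items():
            low, high = self.variables[name]
            if not low <= value <= high:
                raise ValueError(f"{name}={value} outside [{low}, {high}]")
        self.interface.set_values(inputs)

    def get_observables(self, names: List[str]) -> Dict[str, float]:
        return self.interface.get_values(names)


env = ToyEnvironment(OspreyInterface(InMemoryConnector()))
env.set_variables({"QUAD:K1": 0.25})
readback = env.interface.get_values(["QUAD:K1"])
```

Keeping bounds checks in the environment and transport in the interface means a different connector could be swapped in without touching the optimization-facing code.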

wenatuhs · 2026-02-01T00:17:50Z

@claude please review this PR. Focus on whether the Xopt/Badger optimization ability should be added to osprey as a service, or whether we should have it as a subagent, since it looks like a ReAct agent itself.

claude · 2026-02-01T00:18:04Z

Claude finished @wenatuhs's task in 3m 1s

Architectural Review: XOpt Service vs. Subagent Pattern

TL;DR Recommendation

✅ The current service architecture is correct. The XOpt optimizer should remain a service, not be converted to a subagent.

Detailed Analysis

Understanding the Architecture Layers

The Osprey framework has a clear separation of concerns:

Main Graph Layer: The orchestrator manages conversation flow and capability routing
Capability Layer: Thin gateways that classify intent and route to services
Service Layer: Complex, multi-stage workflows with their own LangGraph instances
Component Layer: Individual agents (like ReAct agents) that are nodes within services

Why XOpt Should Be a Service

1. It's a Workflow, Not Just an Agent

The XOpt optimizer is not a single ReAct agent—it's an orchestrated workflow with 6 distinct stages:

State Identification → Decision → YAML Generation → Approval → Execution → Analysis
         ↑                                                                      ↓
         └──────────────────────────── (iteration loop) ─────────────────────┘

File references:

Workflow definition: src/osprey/services/xopt_optimizer/service.py:185-244
State machine: src/osprey/services/xopt_optimizer/models.py:215-273

Each stage has different concerns:

State identification (ReAct agent): Assesses machine readiness using channel access tools
Decision: Routes between exploration/optimization/abort strategies
YAML generation (ReAct agent): Creates XOpt configs using reference files
Approval: Human-in-the-loop gate with interrupt/resume
Execution: Runs XOpt/Badger optimization (placeholder)
Analysis: Evaluates results and controls iteration loop

Two of these six stages happen to use ReAct agents internally, but that doesn't make the whole system a ReAct agent.

2. Services Handle Complex Cross-Cutting Concerns

The XOpt service manages capabilities that go beyond what a simple subagent can handle:

Multi-iteration loops with state persistence:

# src/osprey/services/xopt_optimizer/service.py:232-237
workflow.add_conditional_edges(
    "analysis",
    self._loop_router,
    {"continue": "state_identification", "complete": "__end__"},
)
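As a rough illustration of how such a router feeds the mapping above: the router returns an edge label, and the label is looked up in the dictionary passed to `add_conditional_edges`. The state fields (`iteration`, `max_iterations`, `should_continue`) and the function body are assumptions for this sketch; the actual `_loop_router` is not shown in this review.

```python
from typing import Literal, TypedDict


class LoopState(TypedDict, total=False):
    # Hypothetical fields; the real state is defined in models.py.
    iteration: int
    max_iterations: int
    should_continue: bool


def loop_router(state: LoopState) -> Literal["continue", "complete"]:
    """Return the edge label used as a key in the conditional-edge mapping."""
    under_budget = state.get("iteration", 0) < state.get("max_iterations", 1)
    if state.get("should_continue", False) and under_budget:
        return "continue"
    return "complete"


# Same mapping as in the excerpt above.
EDGES = {"continue": "state_identification", "complete": "__end__"}

next_node = EDGES[loop_router({"iteration": 1, "max_iterations": 3, "should_continue": True})]
last_node = EDGES[loop_router({"iteration": 3, "max_iterations": 3, "should_continue": True})]
```

Defaulting to `"complete"` when fields are missing keeps a malformed state from looping indefinitely.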

LangGraph-native approval workflows:

Interrupt support with checkpointing: service.py:95-109
Resumable workflows after human approval
Approval state that persists across iterations

Complex conditional routing:

4 different routers with multi-way branching: service.py:246-300
YAML generation retry logic: service.py:261-276
Failure mode handling

Error tracking and recovery:

Structured error chain: models.py:103-131
Attempt numbering for retry strategies
Service-level exception handling: service.py:302-317

These features require a separate LangGraph instance with its own checkpointer—exactly what the service pattern provides.
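The error chain with attempt numbering mentioned above could look roughly like the following. This is a sketch under assumptions, not the contents of models.py: `StageError`, `ErrorChain`, and their fields are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class StageError:
    stage: str
    message: str
    attempt: int  # 1-based attempt number within the stage


@dataclass
class ErrorChain:
    errors: List[StageError] = field(default_factory=list)

    def attempts(self, stage: str) -> int:
        """Number of failures recorded so far for `stage`."""
        return sum(1 for e in self.errors if e.stage == stage)

    def record(self, stage: str, message: str) -> StageError:
        """Append a failure, numbering it after earlier failures of the same stage."""
        err = StageError(stage, message, attempt=self.attempts(stage) + 1)
        self.errors.append(err)
        return err

    def should_retry(self, stage: str, max_attempts: int) -> bool:
        return self.attempts(stage) < max_attempts


chain = ErrorChain()
chain.record("yaml_generation", "missing 'vocs' section")
second = chain.record("yaml_generation", "invalid generator name")
```

A retry router can then branch on `chain.should_retry(stage, max_attempts)` instead of tracking counters in several places.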

3. Follows Established Framework Patterns

The XOpt service mirrors the structure of the Python executor service:

Aspect	Python Executor	XOpt Optimizer
Service class	`PythonExecutorService`	`XOptOptimizerService`
Capability gateway	`PythonCapability`	`OptimizationCapability`
Request model	`PythonExecutionRequest`	`XOptExecutionRequest`
Result model	`PythonServiceResult`	`XOptServiceResult`
Approval integration	✅ LangGraph interrupts	✅ LangGraph interrupts
Retry logic	✅ Multi-stage	✅ Multi-stage
Internal workflow	✅ LangGraph	✅ LangGraph

File references:

Python service: src/osprey/services/python_executor/service.py:31-100
XOpt service: src/osprey/services/xopt_optimizer/service.py:48-369
Python capability: src/osprey/capabilities/python.py:1-100
XOpt capability: src/osprey/capabilities/optimization.py:1-150

Breaking this consistency would create architectural confusion.

4. Capability Gateway Pattern is Intentional

The OptimizationCapability is a thin gateway, not redundant code:

# src/osprey/capabilities/optimization.py:144-150
def _create_optimization_context(service_result: XOptServiceResult):
    """Create OptimizationResultContext from service result."""
    return OptimizationResultContext(
        run_artifact=service_result.run_artifact,
        strategy=service_result.strategy.value,
        # ... transforms service result to capability context
    )

The capability layer:

Provides a stable interface for the orchestrator
Handles context transformation (service results → capability contexts)
Manages approval system integration
Classifies errors for the orchestrator
Generates orchestrator/classifier guidance
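To make the "classifies errors for the orchestrator" responsibility concrete, here is a small sketch of how a gateway might translate service exceptions into a coarse severity. Every name in it (`Severity`, `Classification`, the two exception classes, `classify_error`) is hypothetical and defined only for this example; Osprey's real classification API may differ.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    RETRIABLE = "retriable"
    REPLANNING = "replanning"
    CRITICAL = "critical"


@dataclass(frozen=True)
class Classification:
    severity: Severity
    user_message: str


# Hypothetical service-side exceptions, defined here only for the sketch.
class ConfigGenerationError(Exception): ...


class MachineNotReadyError(Exception): ...


def classify_error(exc: Exception) -> Classification:
    """Map a service exception to a severity the orchestrator can act on."""
    if isinstance(exc, ConfigGenerationError):
        return Classification(Severity.RETRIABLE, "Config generation failed; retrying may help.")
    if isinstance(exc, MachineNotReadyError):
        return Classification(Severity.REPLANNING, "Machine is not ready for optimization.")
    return Classification(Severity.CRITICAL, f"Unexpected optimizer failure: {exc}")


result = classify_error(MachineNotReadyError("beam current too low"))
```

The orchestrator only sees the severity and message, so optimizer-specific exception types never leak into the main graph.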

If you made XOpt a direct subagent in the main graph, you'd lose:

Clean separation between conversation orchestration and optimization workflow
The ability to checkpoint/resume complex multi-stage optimization workflows
Consistent error handling abstractions
The capability's context management features

What About the ReAct Agents?

The two ReAct agents are implementation details of the service:

State Identification Agent

File: src/osprey/services/xopt_optimizer/state_identification/agent.py:106-335

class StateIdentificationAgent:
    """ReAct agent for assessing machine readiness."""

    async def assess_state(self, objective, additional_context):
        # Uses ReAct pattern to:
        # 1. Read reference docs about ready criteria
        # 2. Check channel values via control system
        # 3. Return structured assessment
        ...  # body elided in this excerpt

This agent is a node in the service workflow, not the whole service.

YAML Generation Agent

File: src/osprey/services/xopt_optimizer/yaml_generation/agent.py:225-488

class YamlGenerationAgent:
    """ReAct agent for generating XOpt YAML configs."""

    async def generate_yaml(self, objective, strategy, additional_context):
        # Uses ReAct pattern to:
        # 1. Optionally read example YAMLs
        # 2. Generate config based on strategy
        # 3. Return validated YAML
        ...  # body elided in this excerpt

Again, this is one stage in a six-stage workflow.

Neither agent has the complexity to justify being in the main orchestrator graph. They're specialized components that:

Don't need direct user interaction
Don't manage their own approval workflows
Don't handle multi-iteration loops
Are tightly coupled to the optimization domain

Hypothetical: What If It Were a Subagent?

If you tried to make XOpt a direct subagent in the main graph:

Problems You'd Face:

No separate checkpoint space: The main graph's checkpointer would be polluted with optimization-specific state (iteration counts, YAML configs, approval states, error chains). Currently, each service has its own checkpointer.
Approval workflow complexity: You'd need to handle LangGraph interrupts at the main graph level, making the orchestrator's approval logic much more complex.
Loss of encapsulation: The orchestrator would need to understand XOpt-specific routing logic (exploration vs. optimization strategies, YAML retry logic, iteration control).
Can't follow the Python executor pattern: You'd have architectural inconsistency—Python uses a service, but optimization uses a subagent? Why?
Harder to test: Services can be tested independently. tests/services/xopt_optimizer/test_xopt_service.py tests the entire workflow in isolation.

What You'd Gain:

Marginally simpler capability layer (one fewer hop)
Slightly less code

The trade-off is terrible. You'd sacrifice clean architecture for minimal simplicity gains.

CodeQL Findings (Bonus Review)

Since I reviewed the code, here are the legitimate issues to fix:

1. Unused Variables

File: tests/test_tui_artifacts.py:75, tests/test_tui_artifacts.py:365
File: src/osprey/interfaces/tui/widgets/artifacts.py:187

These are legitimate—variables assigned but never used. Fix by either using them or calling the functions for side effects only.
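The two fixes suggested above can be illustrated with a toy function. `mount_gallery` and the `events` list are hypothetical and stand in for the flagged test and widget code, which is not reproduced in this review.

```python
from typing import List


def mount_gallery(events: List[str]) -> dict:
    """Hypothetical helper with a side effect (logging an event) and a return value."""
    events.append("mounted")
    return {"id": "gallery"}


events: List[str] = []

# Flagged form: `gallery = mount_gallery(events)` where `gallery` is never read.

# Option 1: the return value matters, so actually use it.
gallery = mount_gallery(events)
assert gallery["id"] == "gallery"

# Option 2: only the side effect matters, so drop the binding.
mount_gallery(events)
```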

2. Empty Exception Handlers

File: src/osprey/interfaces/tui/widgets/artifacts.py:313, 336

except Exception:
    pass  # CodeQL: empty handler with no comment

Add explanatory comments about why exceptions are intentionally swallowed (non-critical UI updates).
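One possible shape for that fix, sketched with hypothetical names (`refresh_badge`, `_Unmounted`, and the logger name are not from the PR): keep the broad handler, but state why the failure is ignorable and leave a debug-level trace.

```python
import logging

logger = logging.getLogger("tui.artifacts")


def refresh_badge(widget) -> bool:
    """Best-effort UI refresh; a failure here must not break the chat loop."""
    try:
        widget.refresh()
    except Exception:
        # Non-critical UI update: the widget may not be mounted yet.
        logger.debug("badge refresh skipped", exc_info=True)
        return False
    return True


class _Unmounted:
    """Toy widget whose refresh always fails, to exercise the handler."""

    def refresh(self) -> None:
        raise RuntimeError("widget not mounted")


skipped = refresh_badge(_Unmounted())
```

Logging at debug level keeps normal output clean while still making swallowed errors discoverable.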

3. Non-iterable in for loop

File: tests/test_tui_artifacts.py:365

This might be a false positive, but worth investigating.

These are unrelated to the architectural question and should be fixed separately.

Final Recommendation

Keep the service architecture. Here's why:

✅ Matches existing patterns: Consistent with Python executor service
✅ Proper separation of concerns: Orchestrator doesn't need to know about XOpt internals
✅ Handles complexity correctly: Multi-iteration loops, approval gates, error recovery
✅ Testable in isolation: Service can be tested without the full agent graph
✅ Scalable: Easy to add more optimization strategies or workflow stages
✅ Production-oriented: LangGraph checkpointing, interrupt/resume, and structured errors are in place, although execution is still a placeholder

The fact that the service uses ReAct agents internally doesn't mean the service should be a ReAct agent in the main graph. That's a category error—services orchestrate workflows; agents are components within those workflows.

Additional Comments

The PR description correctly labels this as infrastructure. That's exactly right. This is foundational service infrastructure that provides:

A reusable pattern for any optimization backend (XOpt, Badger, custom optimizers)
A framework for machine state awareness in optimization workflows
A template for approval-gated, multi-iteration scientific workflows

The "work in progress" nature (placeholder execution, mock components) is appropriate—you're building the scaffolding first, which is the right approach. The architecture is sound; you just need to connect the actual XOpt/Badger execution when ready.

Files reviewed:

Service layer: src/osprey/services/xopt_optimizer/service.py, models.py
ReAct agents: state_identification/agent.py, yaml_generation/agent.py
Workflow nodes: decision/node.py, execution/node.py, analysis/node.py, approval/node.py
Capability gateway: src/osprey/capabilities/optimization.py
Comparison: src/osprey/services/python_executor/service.py, src/osprey/capabilities/python.py
Framework: src/osprey/graph/graph_builder.py

# Conflicts:
#   CHANGELOG.md
#   src/osprey/approval/__init__.py
#   src/osprey/capabilities/python.py
#   src/osprey/infrastructure/respond_node.py
#   src/osprey/interfaces/tui/app.py
#   src/osprey/interfaces/tui/styles.tcss
#   src/osprey/interfaces/tui/widgets/__init__.py
#   src/osprey/interfaces/tui/widgets/artifact_viewer.py
#   src/osprey/interfaces/tui/widgets/artifacts.py
#   src/osprey/interfaces/tui/widgets/chat_display.py
#   src/osprey/prompts/defaults/__init__.py
#   src/osprey/prompts/loader.py
#   src/osprey/registry/registry.py
#   src/osprey/state/artifacts.py
#   tests/test_artifacts.py
#   tests/test_tui_artifacts.py

src/osprey/services/xopt_optimizer/config_generation/node.py

src/osprey/capabilities/optimization.py

+            idx = int(choice) - 1
+            if 0 <= idx < len(available):
+                return available[idx]
+        except ValueError:


src/osprey/services/xopt_optimizer/config_generation/node.py

+        idx = int(choice) - 1
+        if 0 <= idx < len(environments):
+            return environments[idx]
+    except ValueError:
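The flagged lines show only the middle of a try/except. A self-contained sketch of the same 1-based selection pattern follows; the function name `pick_by_number` and the fallback to matching by name are assumptions, since the rest of the handler is not visible in the diff.

```python
from typing import Optional, Sequence


def pick_by_number(choice: str, options: Sequence[str]) -> Optional[str]:
    """Resolve a user reply that is either a 1-based index or an option name."""
    try:
        idx = int(choice) - 1
        if 0 <= idx < len(options):
            return options[idx]
    except ValueError:
        # Not a number: fall through and try to match by name instead.
        pass
    return choice if choice in options else None


environments = ["als_storage_ring", "test_bench"]
by_index = pick_by_number("2", environments)
by_name = pick_by_number("als_storage_ring", environments)
```

The comment inside the handler documents why the exception is ignored, which is the kind of note an empty-handler warning usually asks for.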


thellert added 6 commits January 10, 2026 15:10

fix(cli): Style multi-iteration approval prompts with Panel

f4133c2

Display repeat approval requests in styled Panel matching the first approval prompt, rather than plain text. Ensures consistent visual treatment for multi-iteration approval workflows.

github-code-quality bot found potential problems Jan 13, 2026


gian21391 added 7 commits February 26, 2026 10:57

fix(xopt): replace str+Enum with StrEnum to resolve ruff UP042 lint e…

d4bd694

…rrors

refactor(xopt): replace yaml_generation with config_generation and ad…

b54cd84

…d tuning API client

feat(xopt): add tuning service templates and update optimization prompts

d94d69c

test(xopt): add API client tests and update tests for config_generati…

23e9f9d

…on refactor

feat(xopt): add question interrupt support for environment selection

3d8c73f

feat(xopt): add API-driven objective resolution and fix iteration int…

8343234

…errupt

github-code-quality bot found potential problems Mar 13, 2026


style: fix import sorting and formatting

50966ef

github-code-quality bot found potential problems Mar 13, 2026


thellert wants to merge 15 commits into main from feature/xopt-optimizer-service


3 participants
