Adaptive Architect: Simulation Testing (Phase 2)

## User Story
**As a** manager  
**I want** to test my agents before deploying them to real interviewees  
**So that** I can verify agent behavior and catch issues early

## Priority: Phase 2
Simulation testing is deferred to Phase 2 so that the core architecture can be built first.

## Acceptance Criteria

### Manual Role-Play Testing
- [ ] Manager can "chat with" any designed agent as if they were an interviewee
- [ ] Conversation runs in real time in a test sandbox
- [ ] Manager can see how the agent asks questions, responds, and extracts data
- [ ] Manager can observe exit conditions being evaluated
- [ ] Manager can test different interviewee behaviors (cooperative, vague, reluctant)

### Automated Simulation Scenarios (Phase 3)
- [ ] System generates synthetic interviewee personas
- [ ] Personas are based on the domain and rubric:
  - **Cooperative Expert**: Full knowledge, eager to share
  - **Vague Responder**: Brief/unclear answers, needs probing
  - **Reluctant Participant**: Hesitant, needs rapport building
  - **Time-Pressed**: Wants to finish quickly
- [ ] Run automated simulations against the agent
- [ ] Compare extracted data against ground truth
- [ ] Calculate coverage metrics
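
One possible way to compare extracted data against ground truth and derive a coverage metric is sketched below. This is only an illustration: `CoverageReport`, `compute_coverage`, the flat field-to-string data shape, and the case-insensitive matching rule are all assumptions, not part of the existing design. The actual evaluation criteria belong in TESTING-EVALUATION-FRAMEWORK.md.

```python
from dataclasses import dataclass


@dataclass
class CoverageReport:
    """Hypothetical result shape for one simulation run."""
    captured: list[str]    # rubric fields extracted with the expected value
    mismatched: list[str]  # rubric fields extracted with a different value
    missed: list[str]      # rubric fields absent from the extraction
    coverage: float        # captured fields / total ground-truth fields


def _normalize(value: str) -> str:
    # Assumed matching rule: ignore case and collapse whitespace.
    return " ".join(value.lower().split())


def compute_coverage(
    ground_truth: dict[str, str], extracted: dict[str, str]
) -> CoverageReport:
    """Classify every ground-truth field as captured, mismatched, or missed."""
    captured, mismatched, missed = [], [], []
    for name, expected in ground_truth.items():
        if name not in extracted:
            missed.append(name)
        elif _normalize(extracted[name]) == _normalize(expected):
            captured.append(name)
        else:
            mismatched.append(name)
    total = len(ground_truth)
    coverage = len(captured) / total if total else 0.0
    return CoverageReport(captured, mismatched, missed, coverage)
```

Keeping `mismatched` separate from `missed` would let the results view distinguish "the agent never asked" from "the agent asked but extracted the wrong value".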

### Simulation Results
- [ ] Show conversation transcript
- [ ] Show extracted entities and rubric field coverage
- [ ] Highlight what was captured vs. missed
- [ ] Flag potential issues (e.g., agent stuck in loop, poor coverage)
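
The issue flags listed above could be computed with simple heuristics, as in the following sketch. The function name `flag_issues`, the flag strings, and both thresholds are hypothetical; repeated-question counting is just one assumed proxy for "agent stuck in loop".

```python
from collections import Counter


def _normalize(text: str) -> str:
    # Treat questions as identical if they differ only in case or spacing.
    return " ".join(text.lower().split())


def flag_issues(
    agent_turns: list[str],
    coverage: float,
    max_repeats: int = 2,       # assumed threshold, not from the spec
    min_coverage: float = 0.7,  # assumed threshold, not from the spec
) -> list[str]:
    """Return hypothetical issue flags for one simulated conversation."""
    flags = []
    counts = Counter(_normalize(turn) for turn in agent_turns)
    # A turn repeated more than max_repeats times suggests a loop.
    if any(n > max_repeats for n in counts.values()):
        flags.append("agent_stuck_in_loop")
    if coverage < min_coverage:
        flags.append("poor_coverage")
    return flags
```

Exact-match repetition will miss paraphrased loops; a real implementation might compare turns by semantic similarity instead.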

### API Requirements
- [ ] `POST /api/v1/design-sessions/{id}/simulate` - Run automated simulation
- [ ] `WebSocket /api/v1/design-sessions/{id}/roleplay/{agent_id}` - Manual role-play
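
As a rough sketch of how a client might address these two endpoints, the helpers below build the URLs from the paths given above. The helper names and the base URL handling are assumptions; request and response payloads are not specified in this issue, so none are shown.

```python
from urllib.parse import quote, urlsplit, urlunsplit

API_PREFIX = "/api/v1/design-sessions"


def simulate_url(base_url: str, session_id: str) -> str:
    """URL for POST /api/v1/design-sessions/{id}/simulate."""
    return f"{base_url.rstrip('/')}{API_PREFIX}/{quote(session_id, safe='')}/simulate"


def roleplay_ws_url(base_url: str, session_id: str, agent_id: str) -> str:
    """WebSocket URL for /api/v1/design-sessions/{id}/roleplay/{agent_id}."""
    parts = urlsplit(base_url.rstrip("/"))
    # Assumption: the WebSocket endpoint is served from the same host,
    # using wss for https origins and ws otherwise.
    scheme = "wss" if parts.scheme == "https" else "ws"
    path = (
        f"{parts.path}{API_PREFIX}/{quote(session_id, safe='')}"
        f"/roleplay/{quote(agent_id, safe='')}"
    )
    return urlunsplit((scheme, parts.netloc, path, "", ""))
```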

## Technical Notes
- Manual role-play uses the same Interview Agent infrastructure
- Automated simulations use an LLM to play the interviewee role
- Ground truth is stored in the simulation scenario
- See TESTING-EVALUATION-FRAMEWORK.md for evaluation criteria
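
To make the notes above concrete, here is a minimal sketch of a simulation scenario that carries its own ground truth, plus a prompt an LLM could receive to play the interviewee. `Persona`, `SimulationScenario`, `build_interviewee_prompt`, and the prompt wording are all hypothetical; the persona descriptions are taken from the acceptance criteria. No LLM call is made here.

```python
from dataclasses import dataclass
from enum import Enum


class Persona(Enum):
    """Synthetic interviewee personas listed in the acceptance criteria."""
    COOPERATIVE_EXPERT = "Full knowledge, eager to share"
    VAGUE_RESPONDER = "Brief/unclear answers, needs probing"
    RELUCTANT_PARTICIPANT = "Hesitant, needs rapport building"
    TIME_PRESSED = "Wants to finish quickly"


@dataclass(frozen=True)
class SimulationScenario:
    """Hypothetical scenario record; ground truth lives with the scenario."""
    persona: Persona
    domain: str
    ground_truth: dict[str, str]  # rubric field -> fact the persona knows


def build_interviewee_prompt(scenario: SimulationScenario) -> str:
    """Assemble instructions for the LLM that plays the interviewee."""
    facts = "\n".join(
        f"- {name}: {value}" for name, value in scenario.ground_truth.items()
    )
    return (
        f"You are an interviewee in the {scenario.domain} domain.\n"
        f"Behavior: {scenario.persona.value}.\n"
        "Only reveal the facts below, and only when asked:\n"
        f"{facts}"
    )
```

Because the facts the simulated interviewee knows are exactly the scenario's ground truth, the same record can later be used to score what the agent extracted.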

## Definition of Done
- [ ] Manual role-play working
- [ ] Simulation results displayed
- [ ] Coverage metrics calculated
- [ ] Code reviewed and merged

---
🤖 Generated with [Claude Code](https://claude.com/claude-code)
