A Python-based agent that integrates research providers with Claude Code through the Model Context Protocol (MCP). It supports OpenAI (Responses API with web search and code interpreter, or Chat Completions API for broad provider compatibility), Gemini Deep Research via the Interactions API, Allen AI's DR-Tulu research agent, and the open-source Open Deep Research stack (based on smolagents).
- Python 3.11+
- uv installed
- One of:
  - OpenAI API access (Responses API models, e.g., o4-mini-deep-research-2025-06-26)
  - Gemini API access with the Interactions API / Deep Research agent enabled
  - DR-Tulu service running locally or remotely (see DR-Tulu setup)
  - Open Deep Research dependencies (installed via uv sync --extra open-deep-research)
- Claude Code, or any other assistant supporting MCP integration
Recommended setup (resolves the latest compatible versions):
# Install runtime dependencies + project in editable mode
uv sync --upgrade
# Development tooling (pytest, black, pylint, mypy, pre-commit)
uv sync --upgrade --extra dev
# Enable the pre-commit hook so black runs automatically before each commit
uv run pre-commit install
# Optional docs tooling
uv sync --upgrade --extra docs
# Optional Open Deep Research provider dependencies
uv sync --upgrade --extra open-deep-research

Compatibility setup (pip-based):
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install -e .

Project layout:
- src/deep_research_mcp/agent.py: orchestration layer; owns clarification, instruction building, callbacks, and delegates provider work to backends
- src/deep_research_mcp/backends/: provider-specific implementations for OpenAI, Gemini, DR-Tulu, and Open Deep Research
- src/deep_research_mcp/mcp_server.py: FastMCP server and tool entrypoints
- src/deep_research_mcp/clarification.py: clarification agents, sessions, and enrichment flow
- src/deep_research_mcp/prompts/: YAML prompt templates used by clarification and instruction building
- cli/deep-research-cli.py: unified CLI for agent mode, MCP client mode, and configuration viewing
- cli/deep-research-tui.py: interactive full-screen terminal UI for clarification, research, status checks, and saving output to disk
- tests/: pytest suite covering configuration, MCP integration, prompts, results, and clarification flows
Create a ~/.deep_research file in your home directory using TOML format.
Library note: ResearchConfig.load() explicitly reads this file and applies environment variable overrides. ResearchConfig.from_env() reads environment variables only.
Common settings:
[research] # Core Deep Research functionality
provider = "openai" # Available options: "openai", "dr-tulu", "gemini", "open-deep-research" -- defaults to "openai"
api_style = "responses" # Only applies to provider="openai"; use "chat_completions" for Perplexity, Groq, Ollama, etc.
model = "o4-mini-deep-research-2025-06-26" # OpenAI: model identifier; Dr Tulu: logical provider id; Gemini: agent id; ODR: LiteLLM model identifier
api_key = "your-api-key" # API key, optional
base_url = "https://api.openai.com/v1" # OpenAI: OpenAI-compatible endpoint; Dr Tulu: service base URL; Gemini: https://generativelanguage.googleapis.com; ODR: LiteLLM-compatible endpoint
# Task behavior
timeout = 1800
poll_interval = 30
# Largely based on https://cookbook.openai.com/examples/deep_research_api/introduction_to_deep_research_api_agents
[clarification] # Optional query clarification component
enable = true
triage_model = "gpt-5-mini"
clarifier_model = "gpt-5-mini"
instruction_builder_model = "gpt-5-mini"
api_key = "YOUR_OPENAI_API_KEY" # Optional, overrides api_key
base_url = "https://api.openai.com/v1" # Optional, overrides base_url
[logging]
level = "INFO"

OpenAI provider example:
[research]
provider = "openai"
model = "o4-mini-deep-research-2025-06-26" # OpenAI model
api_key = "YOUR_OPENAI_API_KEY" # Defaults to OPENAI_API_KEY
base_url = "https://api.openai.com/v1" # OpenAI-compatible endpoint
timeout = 1800
poll_interval = 30

Gemini Deep Research provider example:
[research]
provider = "gemini"
model = "deep-research-pro-preview-12-2025" # Gemini Deep Research agent id
api_key = "YOUR_GEMINI_API_KEY" # Defaults to GEMINI_API_KEY or GOOGLE_API_KEY
base_url = "https://generativelanguage.googleapis.com"
timeout = 1800
poll_interval = 30

Dr Tulu provider example:
[research]
provider = "dr-tulu"
model = "dr-tulu" # Logical provider model id; currently informational
base_url = "http://localhost:8080/" # Dr Tulu service base URL; the backend calls /chat
api_key = "" # Optional; defaults to RESEARCH_API_KEY / DR_TULU_API_KEY if set
timeout = 1800
poll_interval = 30

Running Dr Tulu locally:
- Clone and configure dr-tulu separately.
- In dr-tulu/agent/.env, set at least: SERPER_API_KEY, JINA_API_KEY, and S2_API_KEY (optional but recommended).
- Start the DR-Tulu model server:
cd /path/to/dr-tulu/agent
conda run -n vllm bash -lc '
CUDA_VISIBLE_DEVICES=0 \
vllm serve rl-research/DR-Tulu-8B \
--port 30001 \
--dtype auto \
--max-model-len 16384 \
--gpu-memory-utilization 0.60 \
--enforce-eager
'

- Start the Dr Tulu MCP backend:
cd /path/to/dr-tulu/agent
conda run -n vllm python -m dr_agent.mcp_backend.main --port 8000

- Start the Dr Tulu app service:
cd /path/to/dr-tulu/agent
conda run -n vllm python workflows/auto_search_sft.py serve \
--port 8080 \
--config workflows/auto_search_sft.yaml \
--config-overrides "search_agent_max_tokens=12000,browse_agent_max_tokens=12000"

- Point deep-research-mcp at that service:
[research]
provider = "dr-tulu"
base_url = "http://localhost:8080/"
timeout = 1800

The dr-tulu backend calls POST {base_url}/chat, so if you front Dr Tulu behind a different host, port, or reverse proxy, update base_url accordingly.
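A minimal sketch of how that {base_url}/chat rule tolerates a trailing slash or a reverse-proxy path prefix (illustrative only, not the backend's actual code):

```python
def chat_endpoint(base_url: str) -> str:
    """Derive the Dr Tulu /chat URL from a configured base_url,
    tolerating a trailing slash or a proxy path prefix."""
    return base_url.rstrip("/") + "/chat"
```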
Perplexity (via Sonar Deep Research and Perplexity's OpenAI-compatible endpoint) provider example:
[research]
provider = "openai"
api_style = "chat_completions" # Required for Perplexity (no Responses API)
model = "sonar-deep-research" # Perplexity's Sonar Deep Research
api_key = "ppl-..." # Defaults to OPENAI_API_KEY
base_url = "https://api.perplexity.ai" # Perplexity's OpenAI-compatible endpoint
timeout = 1800

Open Deep Research provider example:
[research]
provider = "open-deep-research"
model = "openai/qwen/qwen3-coder-30b" # LiteLLM-compatible model id
base_url = "http://localhost:1234/v1" # LiteLLM-compatible endpoint (local or remote)
api_key = "" # Optional if endpoint requires it
timeout = 1800

Ollama (local) provider example:
[research]
provider = "openai"
api_style = "chat_completions"
model = "llama3.1" # Any model available in your Ollama instance
base_url = "http://localhost:11434/v1" # Ollama's OpenAI-compatible endpoint
api_key = "" # Not required for local Ollama
timeout = 600

llama-server (local llama.cpp server) provider example:
[research]
provider = "openai"
api_style = "chat_completions"
model = "qwen2.5-0.5b" # Must match the --alias passed to llama-server
base_url = "http://127.0.0.1:8081/v1" # llama-server OpenAI-compatible endpoint
api_key = "test" # Must match the --api-key passed to llama-server
timeout = 600

Generic OpenAI-compatible Chat Completions provider (Groq, Together AI, vLLM, etc.):
[research]
provider = "openai"
api_style = "chat_completions"
model = "your-model-name"
api_key = "your-api-key"
base_url = "https://api.your-provider.com/v1"
timeout = 600

Optional env variables for Open Deep Research tools:
- SERPAPI_API_KEY or SERPER_API_KEY: enable Google-style search
- HF_TOKEN: optional, logs into Hugging Face Hub for gated models
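All of the api_style = "chat_completions" examples above speak the same OpenAI-compatible wire format. A minimal sketch of the request that format implies, using only the standard library (the URL, key, and model values are placeholders):

```python
import json
from urllib import request

def build_chat_request(base_url: str, api_key: str, model: str, query: str) -> request.Request:
    """Construct the POST {base_url}/chat/completions call that any
    OpenAI-compatible Chat Completions endpoint accepts."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": query}],
    }
    return request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request (e.g. with urllib.request.urlopen) returns a JSON body whose choices[0].message.content field holds the model's reply.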
This repository also ships a repo-specific skill guide at
skills/deep-research-mcp/SKILL.md.
Installing that file as a local skill gives Claude Code or Codex a focused
playbook for this repository's CLI, Python API, providers, and MCP server.
It does not install Python dependencies, start the MCP server, or replace
the provider configuration in ~/.deep_research. Treat it as complementary to
the MCP setup below, not a substitute for it.
Claude Code's skills docs use these locations:
- personal skill: ~/.claude/skills/<skill-name>/SKILL.md
- project skill: .claude/skills/<skill-name>/SKILL.md
Personal install:
mkdir -p ~/.claude/skills/deep-research-mcp
cp /path/to/deep-research-mcp/skills/deep-research-mcp/SKILL.md \
~/.claude/skills/deep-research-mcp/SKILL.md

If you prefer to keep the installed skill linked to this checkout:
mkdir -p ~/.claude/skills/deep-research-mcp
ln -sf /path/to/deep-research-mcp/skills/deep-research-mcp/SKILL.md \
~/.claude/skills/deep-research-mcp/SKILL.md

Project-local install from the repository root:
mkdir -p .claude/skills/deep-research-mcp
cp skills/deep-research-mcp/SKILL.md .claude/skills/deep-research-mcp/SKILL.md

After that, restart Claude Code or open a new session if the skill does not
appear immediately. The skill can be invoked directly as
/deep-research-mcp, and Claude can also load it automatically when the task
matches the skill description. For a reusable shared extension, package the
same skill into a Claude Code plugin instead of copying it by hand.
Official docs:
- Claude Code skills: https://docs.claude.com/en/docs/claude-code/skills
- Claude Code plugins: https://docs.claude.com/en/docs/claude-code/plugins
Current Codex docs describe skills as skill folders containing SKILL.md,
stored globally in $HOME/.agents/skills or repo-locally in .agents/skills.
Personal install:
mkdir -p ~/.agents/skills/deep-research-mcp
cp /path/to/deep-research-mcp/skills/deep-research-mcp/SKILL.md \
~/.agents/skills/deep-research-mcp/SKILL.md

Or keep the installed skill linked to this checkout:
mkdir -p ~/.agents/skills/deep-research-mcp
ln -sf /path/to/deep-research-mcp/skills/deep-research-mcp/SKILL.md \
~/.agents/skills/deep-research-mcp/SKILL.md

Project-local install from the repository root:
mkdir -p .agents/skills/deep-research-mcp
cp skills/deep-research-mcp/SKILL.md .agents/skills/deep-research-mcp/SKILL.md

If your Codex setup still follows older CLI conventions that use
~/.codex/skills or .codex/skills, mirror the same directory structure
there instead.
Codex can invoke the skill implicitly when the task matches its description, or
explicitly by mentioning $deep-research-mcp in the prompt. If the skill does
not show up immediately, restart Codex.
Official docs:
- Codex skills: https://developers.openai.com/codex/skills
- Codex customization and skill locations: https://developers.openai.com/codex/concepts/customization#skills
- Configure MCP Server
Choose one of the transports below.
Option A: stdio (recommended when Claude Code should spawn the server itself)
If your provider credentials are already stored in ~/.deep_research, the
minimal setup is:
claude mcp add deep-research -- uv run --directory /path/to/deep-research-mcp deep-research-mcp

If you want Claude Code to pass OPENAI_API_KEY through to the spawned MCP
process explicitly, use:
claude mcp add -e OPENAI_API_KEY="$OPENAI_API_KEY" \
deep-research -- \
uv run --directory /path/to/deep-research-mcp deep-research-mcp

Option B: HTTP (recommended when you want to run the server separately)
Start the server in one terminal:
OPENAI_API_KEY="$OPENAI_API_KEY" \
uv run --directory /path/to/deep-research-mcp \
deep-research-mcp --transport http --host 127.0.0.1 --port 8080

Then add the HTTP MCP server in Claude Code:
claude mcp add --transport http deep-research-http http://127.0.0.1:8080/mcp

Replace /path/to/deep-research-mcp/ with the actual path to your cloned repository.
The verified Streamable HTTP endpoint is http://127.0.0.1:8080/mcp.
For multi-hour research, raise Claude Code's tool timeout before launching the CLI and rely on incremental status polls:
export MCP_TOOL_TIMEOUT=14400000 # 4 hours
claude --mcp-config ./.mcp.json

Kick off work with deep_research or research_with_context, note the returned job ID, and call research_status to stream progress so no single tool call runs long enough to hit the timeout.
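The start-then-poll pattern can be sketched as follows; check_status here is a hypothetical callable standing in for a research_status tool call:

```python
import time

def poll_until_done(check_status, poll_interval: float = 30.0, timeout: float = 14400.0):
    """Poll a job-status callable until it reports a terminal state,
    keeping each individual call short instead of blocking on one
    long-running tool call."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = check_status()
        if status["state"] in ("completed", "failed"):
            return status
        time.sleep(poll_interval)
    raise TimeoutError("research task did not finish within the timeout")
```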
- Use in Claude Code:
- The research tools will appear in Claude Code's tool palette
- Simply ask Claude to "research [your topic]" and it will use the Deep Research agent
- For clarified research, ask Claude to "research [topic] with clarification" to get follow-up questions
- Configure MCP Server
Choose one of the transports below.
Option A: stdio (recommended when Codex should spawn the server itself)
Add the MCP server configuration to your ~/.codex/config.toml file:
[mcp_servers.deep-research]
command = "uv"
args = ["run", "--directory", "/path/to/deep-research-mcp", "deep-research-mcp"]
# If your provider credentials live in shell env vars rather than ~/.deep_research,
# pass them through to the MCP subprocess explicitly:
env_vars = ["OPENAI_API_KEY"]
startup_timeout_ms = 30000 # 30 seconds for server startup
request_timeout_ms = 7200000 # 2 hours for long-running research tasks
# Alternatively, set tool_timeout_sec when using newer Codex clients
# tool_timeout_sec = 14400.0 # 4 hours for deep research runs

Replace /path/to/deep-research-mcp/ with the actual path to your cloned repository.
If your credentials are already configured in ~/.deep_research, env_vars is
optional. It is required when you expect the spawned MCP server to inherit
OPENAI_API_KEY from the parent shell.
Option B: HTTP (recommended when you want to run the server separately)
Start the server in one terminal:
OPENAI_API_KEY="$OPENAI_API_KEY" \
uv run --directory /path/to/deep-research-mcp \
deep-research-mcp --transport http --host 127.0.0.1 --port 8080

Then add this to ~/.codex/config.toml:
[mcp_servers.deep-research-http]
url = "http://127.0.0.1:8080/mcp"
tool_timeout_sec = 14400.0

The verified Streamable HTTP endpoint is http://127.0.0.1:8080/mcp.
Important timeout configuration:
- startup_timeout_ms: time allowed for the MCP server to start (default: 30000 ms / 30 seconds)
- request_timeout_ms: maximum time for research queries to complete (recommended: 7200000 ms / 2 hours for comprehensive research)
- tool_timeout_sec: preferred for newer Codex clients; set this to a large value (e.g., 14400.0 for 4 hours) when you expect long-running research
- Kick off research once to capture the job ID, then poll research_status so each tool call remains short and avoids hitting client timeouts.
Without proper timeout configuration, long-running research queries may fail with "request timed out" errors.
- Use in OpenAI Codex:
- The research tools will be available automatically when you start Codex
- Ask Codex to "research [your topic]" and it will use the Deep Research MCP server
- For clarified research, ask for "research [topic] with clarification"
- Configure MCP Server
Add the MCP server using Gemini CLI's built-in command:
gemini mcp add deep-research -- uv run --directory /path/to/deep-research-mcp deep-research-mcp

Or manually add to your ~/.gemini/settings.json file:
{
"mcpServers": {
"deep-research": {
"command": "uv",
"args": ["run", "--directory", "/path/to/deep-research-mcp", "deep-research-mcp"],
"env": {
"RESEARCH_PROVIDER": "gemini",
"GEMINI_API_KEY": "$GEMINI_API_KEY"
}
}
}
}

Replace /path/to/deep-research-mcp/ with the actual path to your cloned repository.
- Use in Gemini CLI:
  - Start Gemini CLI with gemini
  - The research tools will be available automatically
  - Ask Gemini to "research [your topic]" and it will use the Deep Research MCP server
  - Use the /mcp command to view server status and available tools
HTTP transport: If your Gemini environment supports MCP-over-HTTP, you may run
the server with --transport http and configure Gemini with the server URL.
import asyncio
from deep_research_mcp.agent import DeepResearchAgent
from deep_research_mcp.config import ResearchConfig
async def main():
# Initialize configuration
config = ResearchConfig.load()
# Create agent
agent = DeepResearchAgent(config)
# Perform research
result = await agent.research(
query="What are the latest advances in quantum computing?",
system_prompt="Focus on practical applications and recent breakthroughs"
)
# Print results
print(f"Report: {result.final_report}")
print(f"Citations: {result.citations}")
print(f"Research steps: {result.reasoning_steps}")
print(f"Execution time: {result.execution_time:.2f}s")
# Run the research
asyncio.run(main())

Two transports are supported: stdio (default) and HTTP streaming.
# 1) stdio (default) — for editors/CLIs that spawn a local process
uv run deep-research-mcp
# 2) HTTP streaming — start a local HTTP MCP server
uv run deep-research-mcp --transport http --host 127.0.0.1 --port 8080

Notes:
- HTTP mode uses streaming responses provided by FastMCP. The tools in this server return their full results when a research task completes; streaming is still beneficial for compatible clients and for future incremental outputs.
- The verified Streamable HTTP endpoint is /mcp, so the default local URL is http://127.0.0.1:8080/mcp.
- If you start the server outside the client and rely on environment variables for credentials, export them before launching the server process. If you use stdio and let the client spawn the server, make sure the client passes the required env vars through.
The unified CLI at cli/deep-research-cli.py provides direct access to all
research functionality from the terminal. It supports two modes of operation:
agent mode (default) which runs DeepResearchAgent directly, and MCP
client mode which connects to a running MCP server over HTTP.
Configuration is loaded from ~/.deep_research by default. Every
ResearchConfig parameter can be overridden via CLI flags, which take
precedence over both the TOML file and environment variables.
The repository also ships with a full-screen terminal UI at
cli/deep-research-tui.py. It presents the same core functionality as the
CLI in a dark, keyboard-driven interface for running clarification, deep
research, task status checks, and saving output to disk.
The TUI features a split-panel layout:
- Left panel: Configuration controls for mode selection (Agent/MCP), provider settings, model configuration, query input, and system prompt
- Right panel: Output display showing research results, clarification questions, or status information
The animation above demonstrates the TUI workflow: selecting the Chat Completions API style, entering a research query about nuclear fusion, running the research, viewing results in the output panel, and saving the output to file.
# Start in direct agent mode
uv run python cli/deep-research-tui.py
# Start in MCP client mode
uv run python cli/deep-research-tui.py --mode mcp \
--server-url http://127.0.0.1:8080/mcp
# Start with Gemini selected
uv run python cli/deep-research-tui.py --provider gemini

- Use the left control panel to edit provider settings, query text, system prompt, and save path.
- The TUI starts focused on the Mode selector rather than inside the query editor.
- Use Tab/Shift+Tab to move through all controls.
- Use Up/Down to move between non-editor controls, including single-line inputs.
- Use Left/Right to toggle booleans and cycle through choice fields when a Switch or Select has focus.
- Press Enter to activate buttons, toggle switches, cycle selects, or move forward from a single-line input. TextArea widgets such as Query and System Prompt keep normal cursor-key editing behavior.
- Use c to run clarification, r to run deep research, t to check task status, s to save the current output, and q to quit.
- The right panel shows the latest clarification output, research report, or status response.
In agent mode, the TUI applies provider-aware defaults:
- openai + responses: model o4-mini-deep-research-2025-06-26, base URL https://api.openai.com/v1
- openai + chat_completions: model gpt-5-mini, base URL https://api.openai.com/v1
- dr-tulu: model dr-tulu, base URL http://localhost:8080/
- gemini: model deep-research-pro-preview-12-2025, base URL https://generativelanguage.googleapis.com
- open-deep-research: model openai/qwen/qwen3-coder-30b, base URL http://localhost:1234/v1
Switching provider or OpenAI API style automatically refreshes the model and base URL defaults. You can still override those fields manually afterward.
- In agent mode, Run Clarification calls the clarification flow directly through DeepResearchAgent.
- In mcp mode, Run Clarification calls the MCP deep_research tool with request_clarification=true.
- If clarification questions are returned, the TUI adds answer fields dynamically and uses those answers on the next research run.
- Run Deep Research executes either the direct agent flow or the MCP client flow, depending on the selected mode.
- Save Output writes the current contents of the output panel to the configured path, creating parent directories if needed.
- In mcp mode, set MCP Server URL to a running Streamable HTTP endpoint such as http://127.0.0.1:8080/mcp.
- The TUI reuses the same config-loading behavior as cli/deep-research-cli.py, so ~/.deep_research and any startup overrides still apply.
- Direct-agent JSON rendering is available through the JSON Output toggle; MCP mode saves the textual tool response exactly as returned by the server.
- The current layout is designed for terminals at least 100 columns wide and 28 rows tall.
Gemini Deep Research:
uv run python cli/deep-research-cli.py \
--provider gemini \
--model deep-research-pro-preview-12-2025 \
--base-url https://generativelanguage.googleapis.com \
research "What is the capital of France?"

Expected output (snippet):
============================================================
RESEARCH REPORT
============================================================
Task ID: v1_Chd5YUxjYVpXRUotYWxrZFVQOF9lcTRRVRIX...
Total steps: 1
Execution time: 245.94s
# Comprehensive Analysis of the French Capital: Demographic,
# Economic, and Historical Dimensions of Paris
**Key Points**
* **The capital of France is Paris**, acting as the undisputed
political, economic, and cultural epicenter of the nation.
* Data indicates that the Paris metropolitan area commands a
Gross Domestic Product (GDP) exceeding $1.03 trillion ...
OpenAI o4-mini-deep-research:
uv run python cli/deep-research-cli.py \
--provider openai \
--model o4-mini-deep-research-2025-06-26 \
--base-url https://api.openai.com/v1 \
research "What is the capital of France?"

Expected output:
============================================================
RESEARCH REPORT
============================================================
Task ID: resp_0e55abdae3d3ce390069dca3c7d78c819f91...
Total steps: 22
Search queries: 10
Citations: 1
Execution time: 34.34s
The capital of France is **Paris**
(www.britannica.com/place/Paris)
============================================================
CITATIONS
============================================================
1. Paris | Definition, Map, Population, Facts, & History
https://www.britannica.com/place/Paris
DR-Tulu (requires a running dr-tulu service; see Dr Tulu provider example):
uv run python cli/deep-research-cli.py \
--provider dr-tulu \
--base-url http://localhost:8080/ \
research "What is the capital of France?"

Expected output:
============================================================
RESEARCH REPORT
============================================================
Task ID: 7f3a1b2c-...
Total steps: 4
Citations: 2
Execution time: 18.72s
The capital of France is Paris. Paris has served as the
French capital since the late 10th century ...
============================================================
CITATIONS
============================================================
1. Source 1
https://en.wikipedia.org/wiki/Paris
2. Source 2
https://www.britannica.com/place/Paris
View resolved configuration or all available options:
uv run python cli/deep-research-cli.py config --pretty
uv run python cli/deep-research-cli.py --help

research QUERY -- perform deep research on a query.
# Simple research (agent mode)
uv run python cli/deep-research-cli.py research "Economic impact of AI adoption"
# Override provider and model for a single run
uv run python cli/deep-research-cli.py --provider gemini research "Climate change policies"
# Use a custom system prompt from a file
uv run python cli/deep-research-cli.py research "Healthcare trends" --system-prompt-file prompts/health.txt
# Or pass the system prompt inline
uv run python cli/deep-research-cli.py research "Healthcare trends" --system-prompt "Focus on peer-reviewed sources only"
# Output as JSON (includes metadata, citations, execution time)
uv run python cli/deep-research-cli.py research "AI safety" --json
# Save the report to a file
uv run python cli/deep-research-cli.py research "Renewable energy" --output-file report.md
# Disable code interpreter / data analysis tools
uv run python cli/deep-research-cli.py research "Simple topic" --no-analysis
# Notify a webhook when research completes
uv run python cli/deep-research-cli.py research "Long query" --callback-url https://example.com/webhook

The unified CLI works with local servers that expose an OpenAI-compatible
Chat Completions API. The commands below were tested against local Ollama and
local llama-server (from llama.cpp).
Basic research flow:
uv run python cli/deep-research-cli.py \
--provider openai \
--api-style chat_completions \
--base-url http://localhost:11434/v1 \
--api-key test \
--model qwen3.5:0.8b \
--timeout 180 \
research "Reply with exactly: ok"

Observed output:
HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
============================================================
RESEARCH REPORT
============================================================
Task ID: chatcmpl-215
Total steps: 1
Execution time: 14.33s
ok
Interactive clarification with Ollama needs the clarification models pinned to
the same local endpoint. In testing, qwen3.5:4b worked for clarification,
while qwen3.5:0.8b was too small to reliably satisfy the structured triage
step.
uv run python cli/deep-research-cli.py \
--provider openai \
--api-style chat_completions \
--base-url http://localhost:11434/v1 \
--api-key test \
--model qwen3.5:0.8b \
--clarification-base-url http://localhost:11434/v1 \
--clarification-api-key test \
--triage-model qwen3.5:4b \
--clarifier-model qwen3.5:4b \
--instruction-builder-model qwen3.5:4b \
--timeout 180 \
research "best laptop" --clarify

Observed interaction:
Starting clarification process...
Please answer the following clarifying questions:
1. What is your budget range?
Your answer (or press Enter to skip): Under $1500
2. What will you primarily use the laptop for (gaming, work, students, creative tasks)?
Your answer (or press Enter to skip): Programming and general work
3. Do you have a preferred operating system (macOS, Windows,)?
Your answer (or press Enter to skip): macOS preferred, Windows acceptable
...
Enriched query: What are the best laptops under $1500 for professional programming work, preferably macOS with 16GB RAM, 512GB SSD, 13-15 inch display, and good battery life?
Start a local OpenAI-compatible server and let it download a small GGUF model from Hugging Face automatically:
llama-server \
--host 127.0.0.1 \
--port 8081 \
--api-key test \
--ctx-size 4096 \
--alias qwen2.5-0.5b \
-hf Qwen/Qwen2.5-0.5B-Instruct-GGUF:Q4_K_M

Then point the CLI at the server:
uv run python cli/deep-research-cli.py \
--provider openai \
--api-style chat_completions \
--base-url http://127.0.0.1:8081/v1 \
--api-key test \
--model qwen2.5-0.5b \
--timeout 120 \
research "Reply with exactly: ok"

Observed output:
HTTP Request: POST http://127.0.0.1:8081/v1/chat/completions "HTTP/1.1 200 OK"
============================================================
RESEARCH REPORT
============================================================
Task ID: chatcmpl-uzqSNDzRgYchZRxn1Kq2MNYyc2w6hpc6
Total steps: 1
Execution time: 0.15s
Ok
--clarify is model-sensitive on llama-server. Small tested models
(qwen2.5-0.5b and qwen2.5-3b) completed the basic research command but
did not reliably ask follow-up questions. Example output with
qwen2.5-3b:
Starting clarification process...
Triage assessment: The query is focused on finding the best laptop, which is a common and specific research topic.
Assessment: The query 'best laptop' is clear and specific enough for direct research.
Proceeding with original query
research QUERY --clarify -- interactive clarification before research.
The --clarify flag runs an interactive clarification flow: the agent
analyzes your query, asks follow-up questions to improve specificity, and
then performs research using an enriched query. This works in both agent mode
and MCP client mode.
In agent mode, --clarify automatically enables the clarification pipeline
regardless of the enable_clarification setting in your config file.
# Interactive clarification (agent mode)
uv run python cli/deep-research-cli.py research "Quantum computing applications" --clarify
# Interactive clarification (MCP client mode)
uv run python cli/deep-research-cli.py research "Quantum computing" --clarify \
--server-url http://localhost:8080/mcp

Example session:
Starting clarification process...
Please answer the following clarifying questions:
1. Are you interested in near-term applications or long-term theoretical possibilities?
Your answer (or press Enter to skip): Near-term commercial applications
2. Which industries are you most interested in?
Your answer (or press Enter to skip): Finance and pharmaceuticals
3. Should the report focus on specific hardware platforms?
Your answer (or press Enter to skip):
Enriched query: Quantum computing applications in finance and pharmaceuticals...
Starting research with query: '...'
research QUERY --server-url URL -- use MCP client mode.
Instead of running the agent directly, connect to a running Deep Research MCP server over Streamable HTTP.
# First, start the MCP server in another terminal:
uv run deep-research-mcp --transport http --host 127.0.0.1 --port 8080
# Then run queries against it:
uv run python cli/deep-research-cli.py research "AI trends" \
--server-url http://127.0.0.1:8080/mcp

status TASK_ID -- check the status of a running research task.
# Agent mode
uv run python cli/deep-research-cli.py status abc123-def456-ghi789
# MCP client mode
uv run python cli/deep-research-cli.py status abc123-def456-ghi789 \
--server-url http://127.0.0.1:8080/mcp

config -- display the resolved configuration.
Shows the final configuration after merging the TOML file, environment variables, and any CLI overrides.
# JSON output (default), with secrets masked
uv run python cli/deep-research-cli.py config
# Human-readable output
uv run python cli/deep-research-cli.py config --pretty
# Show full API keys
uv run python cli/deep-research-cli.py config --pretty --show-secrets
# See the effect of CLI overrides
uv run python cli/deep-research-cli.py --provider gemini --timeout 600 config --pretty
# Skip config validation
uv run python cli/deep-research-cli.py config --no-validate

All global flags are placed before the subcommand and override the
corresponding ResearchConfig field:
| Flag | Description |
|---|---|
| --config PATH | Path to TOML config file (default: ~/.deep_research) |
| --provider {openai,dr-tulu,gemini,open-deep-research} | Research provider |
| --model MODEL | Model or agent ID |
| --api-key KEY | Provider API key |
| --base-url URL | Provider API base URL |
| --api-style {responses,chat_completions} | OpenAI API style |
| --timeout SECONDS | Max research timeout |
| --poll-interval SECONDS | Task poll interval |
| --log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL} | Logging level |
| --enable-clarification | Enable the clarification pipeline |
| --enable-reasoning-summaries | Enable reasoning summaries |
| --triage-model MODEL | Model for query triage |
| --clarifier-model MODEL | Model for query enrichment |
| --clarification-base-url URL | Base URL for clarification models |
| --clarification-api-key KEY | API key for clarification models |
| --instruction-builder-model MODEL | Model for instruction building |
Configuration precedence (highest to lowest): CLI flags > environment
variables > TOML config file (~/.deep_research) > built-in defaults.
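That precedence maps directly onto a layered dictionary merge; a sketch using collections.ChainMap (an illustration of the merge order, not the CLI's implementation):

```python
from collections import ChainMap

def resolve_config(cli_flags: dict, env_vars: dict, toml_file: dict, defaults: dict) -> dict:
    """Merge configuration layers; earlier mappings shadow later ones,
    giving CLI flags > env vars > TOML file > built-in defaults."""
    return dict(ChainMap(cli_flags, env_vars, toml_file, defaults))
```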
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | Research or configuration error |
| 2 | MCP tool error |
| 3 | Unexpected error |
```python
# Basic research query
result = await agent.research("Explain the transformer architecture in AI")

# Research with code analysis
result = await agent.research(
    query="Analyze global temperature trends over the last 50 years",
    include_code_interpreter=True
)

# Custom system instructions
result = await agent.research(
    query="Review the safety considerations for AGI development",
    system_prompt="""
    Provide a balanced analysis including:
    - Technical challenges
    - Current safety research
    - Regulatory approaches
    - Industry perspectives
    Include specific examples and data where available.
    """
)
```
```python
# With clarification (requires ENABLE_CLARIFICATION=true)
clarification_result = agent.start_clarification("quantum computing applications")
if clarification_result.get("needs_clarification"):
    # Answer questions programmatically or present to user
    answers = ["Hardware applications", "Last 5 years", "Commercial products"]
    agent.add_clarification_answers(clarification_result["session_id"], answers)
    enriched_query = agent.get_enriched_query(clarification_result["session_id"])
    result = await agent.research(enriched_query)
```

The agent includes an optional clarification system to improve research quality through follow-up questions.
Enable clarification in your `~/.deep_research` file:

```toml
[clarification]
enable_clarification = true
triage_model = "gpt-5-mini"                                   # Optional, defaults to gpt-5-mini
clarifier_model = "gpt-5-mini"                                # Optional, defaults to gpt-5-mini
instruction_builder_model = "gpt-5-mini"                      # Optional, defaults to gpt-5-mini
clarification_api_key = "YOUR_CLARIFICATION_API_KEY"          # Optional custom API key for clarification models
clarification_base_url = "https://custom-api.example.com/v1"  # Optional custom endpoint for clarification models
```

Clarification and instruction building remain OpenAI-compatible chat flows. If your main research provider is `dr-tulu`, `gemini`, or `open-deep-research`, set `clarification_api_key` / `clarification_base_url` explicitly, or provide `OPENAI_API_KEY` / `OPENAI_BASE_URL` in the environment for those helper models.
1. Start clarification:

   ```python
   result = agent.start_clarification("your research query")
   ```

2. Check whether questions are needed:

   ```python
   if result.get("needs_clarification"):
       questions = result["questions"]
       session_id = result["session_id"]
   ```

3. Provide answers:

   ```python
   answers = ["answer1", "answer2", "answer3"]
   agent.add_clarification_answers(session_id, answers)
   ```

4. Get the enriched query:

   ```python
   enriched_query = agent.get_enriched_query(session_id)
   final_result = await agent.research(enriched_query)
   ```
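The four steps above can be combined into a single helper. This is a sketch, not part of the package: `answer_fn` is a hypothetical callback (for example, a prompt loop collecting user input), and the agent methods are the ones shown in the steps above:

```python
import asyncio

async def clarified_research(agent, query, answer_fn):
    """Run clarification (if needed), then research the enriched query.

    answer_fn takes the list of clarifying questions and returns a list
    of answers, e.g. collected interactively from the user.
    """
    result = agent.start_clarification(query)
    if result.get("needs_clarification"):
        session_id = result["session_id"]
        answers = answer_fn(result["questions"])
        agent.add_clarification_answers(session_id, answers)
        query = agent.get_enriched_query(session_id)
    # Fall through with the original query when no clarification is needed
    return await agent.research(query)
```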
When using with AI Assistants via MCP tools:

- Request clarification: use `deep_research()` with `request_clarification=True`
- Answer questions: the AI Assistant will present questions to you
- Deep research: the AI Assistant will automatically use `research_with_context()` with your answers
The main class for performing research operations.

- `research(query, system_prompt=None, include_code_interpreter=True, callback_url=None)`
  - Performs deep research on a query
  - `callback_url`: optional webhook notified when research completes
  - Returns: Dictionary with final report, citations, and metadata
- `get_task_status(task_id)`
  - Check the status of a research task
  - Returns: Task status information
- `start_clarification(query)`
  - Analyze the query and generate clarifying questions if needed
  - Returns: Dictionary with questions and session ID
- `add_clarification_answers(session_id, answers)`
  - Add answers to clarification questions
  - Returns: Session status information
- `get_enriched_query(session_id)`
  - Generate an enriched query from a clarification session
  - Returns: Enhanced query string
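For long-running tasks, `get_task_status` can be polled until completion. A sketch of such a loop; the `status` field values (`completed`, `failed`) are assumptions about the returned dictionary, not a documented contract:

```python
import time

def wait_for_task(agent, task_id, timeout=1800, poll_interval=30):
    """Poll get_task_status until the task finishes or the timeout elapses.

    The terminal status values checked here are assumptions; adjust them
    to whatever your provider actually reports.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = agent.get_task_status(task_id)
        if status.get("status") in ("completed", "failed"):
            return status
        time.sleep(poll_interval)
    raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
```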
Configuration class for the research agent.

- `provider`: Research provider (`openai`, `dr-tulu`, `gemini`, or `open-deep-research`; default: `openai`)
- `api_style`: API style for the `openai` provider (`responses` or `chat_completions`; default: `responses`). Ignored for `dr-tulu`, `gemini`, and `open-deep-research`.
- `model`: Model identifier
  - OpenAI: Responses model (e.g., `gpt-5-mini`)
  - DR-Tulu: logical provider id (default: `dr-tulu`)
  - Gemini: Deep Research agent id (for example `deep-research-pro-preview-12-2025`)
  - Open Deep Research: LiteLLM model id (e.g., `openai/qwen/qwen3-coder-30b`)
- `api_key`: API key for the configured endpoint (optional). Defaults to env `OPENAI_API_KEY` for `openai`, `DR_TULU_API_KEY` for `dr-tulu`, `GEMINI_API_KEY`/`GOOGLE_API_KEY` for `gemini`.
- `base_url`: Provider API base URL (optional). Defaults to `https://api.openai.com/v1` for `openai`, `http://localhost:8080/` for `dr-tulu`, `https://generativelanguage.googleapis.com` for `gemini`, and `http://localhost:1234/v1` for `open-deep-research`.
- `timeout`: Maximum time for research in seconds (default: 1800)
- `poll_interval`: Polling interval in seconds (default: 30)
- `enable_clarification`: Enable clarifying questions (default: False)
- `triage_model`: Model for query analysis (default: `gpt-5-mini`)
- `clarifier_model`: Model for query enrichment (default: `gpt-5-mini`)
- `clarification_api_key`: Custom API key for clarification models (optional; defaults to the main OpenAI credentials when `provider=openai`, otherwise falls back to env `OPENAI_API_KEY` if present)
- `clarification_base_url`: Custom OpenAI-compatible endpoint for clarification models (optional; defaults to the main OpenAI endpoint when `provider=openai`, otherwise falls back to env `OPENAI_BASE_URL` if present)
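Putting several of these fields together, a `~/.deep_research` file might look like the following. The values are illustrative only; the main fields are assumed to be top-level keys, with the `[clarification]` table laid out as in the clarification example above:

```toml
# Illustrative values; field names match ResearchConfig
provider = "openai"
model = "o4-mini-deep-research-2025-06-26"
timeout = 1800
poll_interval = 30

[clarification]
enable_clarification = true
triage_model = "gpt-5-mini"
```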
```shell
# Install dev dependencies
uv sync --extra dev

# Run all tests
uv run pytest -v

# Run with coverage
uv run pytest --cov=deep_research_mcp tests/

# Run specific test file
uv run pytest tests/test_agents.py
```

```shell
uv run black .
uv run pylint src/deep_research_mcp tests
uv run mypy src/deep_research_mcp
```