feat(debug): add Debug Companion — AI-powered debugging for Gemini CLI #22472
SUNDRAM07 wants to merge 12 commits into google-gemini:main
…n and export
- PerformanceCollector: latency P50/P90/P99 percentiles, token efficiency, V8 heap utilization, startup phase analysis, optimization suggestions
- CostEstimator: per-model token cost tracking for Gemini 2.0/2.5/3, cache savings calculation, cheapest-model recommendations
- PerformanceExporter: JSON export (CI pipelines) and Markdown export (human-readable reports) with configurable sections
- 42 tests across 3 test files, all passing
- Exported from telemetry/index.ts

GSoC 2026 Idea google-gemini#5 proof-of-concept
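The P50/P90/P99 latency figures can be illustrated with a nearest-rank percentile. This is a minimal sketch, assuming the collector keeps raw latency samples in milliseconds; the `percentile` helper and the nearest-rank rule are illustrative assumptions, not the PerformanceCollector's actual method.

```typescript
// Hypothetical helper: nearest-rank percentile over raw latency samples (ms).
function percentile(samples: readonly number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("percentile() needs at least one sample");
  }
  if (p <= 0 || p > 100) {
    throw new RangeError("p must be in (0, 100]");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  // Nearest rank: the smallest value with at least p% of samples at or below it.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Made-up sample data, for illustration only.
const latenciesMs = [120, 80, 95, 300, 110, 105, 90, 1000, 100, 115];
const summary = {
  p50: percentile(latenciesMs, 50),
  p90: percentile(latenciesMs, 90),
  p99: percentile(latenciesMs, 99),
};
console.log(summary); // { p50: 105, p90: 300, p99: 1000 }
```

Nearest-rank always returns an observed sample, which keeps tail values such as P99 easy to trace back to a concrete request. An interpolating definition would give slightly different numbers on small sample sets.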
Implements the foundation of the Debug Companion:
- DAPClient: Full DAP wire protocol with TCP transport, message framing, request/response correlation, and event handling (675 lines)
- SourceMapResolver: TypeScript source map resolution for accurate breakpoint placement (308 lines)
- DebugAdapterRegistry: Adapter configuration for Node.js, Python, Go, and Ruby debug adapters (183 lines)
- DebugConfigPresets: Pre-built launch configurations for common debugging scenarios (290 lines)

Part of google-gemini#20674
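DAP messages travel as a `Content-Length` header, a blank line, and a UTF-8 JSON body, so the message framing mentioned above has to cope with TCP chunks that split a message anywhere. The following is a minimal sketch of that framing; `encodeMessage` and `MessageDecoder` are hypothetical names for illustration and are not taken from the PR's DAPClient.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical framing helpers; names are illustrative, not the PR's API.
const HEADER_SEPARATOR = "\r\n\r\n";

function encodeMessage(message: object): Buffer {
  const body = Buffer.from(JSON.stringify(message), "utf8");
  // Content-Length counts bytes, not characters.
  const header = Buffer.from(`Content-Length: ${body.length}${HEADER_SEPARATOR}`, "ascii");
  return Buffer.concat([header, body]);
}

class MessageDecoder {
  private pending: Buffer = Buffer.alloc(0);

  // Feed one raw chunk; returns every message that chunk completes.
  push(chunk: Buffer): unknown[] {
    this.pending = Buffer.concat([this.pending, chunk]);
    const messages: unknown[] = [];
    for (;;) {
      const headerEnd = this.pending.indexOf(HEADER_SEPARATOR);
      if (headerEnd === -1) break; // header not complete yet
      const header = this.pending.subarray(0, headerEnd).toString("ascii");
      const match = /Content-Length: *(\d+)/i.exec(header);
      if (!match) throw new Error("Missing Content-Length header");
      const bodyStart = headerEnd + HEADER_SEPARATOR.length;
      const bodyEnd = bodyStart + Number(match[1]);
      if (this.pending.length < bodyEnd) break; // body not complete yet
      const body = this.pending.subarray(bodyStart, bodyEnd).toString("utf8");
      messages.push(JSON.parse(body));
      this.pending = this.pending.subarray(bodyEnd);
    }
    return messages;
  }
}

// Usage: one request split across two chunks, as TCP may deliver it.
const request = { seq: 1, type: "request", command: "initialize", arguments: { adapterID: "node" } };
const wire = encodeMessage(request);
const decoder = new MessageDecoder();
const firstChunk = decoder.push(wire.subarray(0, 10)); // nothing complete yet
const secondChunk = decoder.push(wire.subarray(10)); // the full request
console.log(firstChunk.length, secondChunk.length);
```

Buffering raw bytes rather than strings matters here: a multi-byte character split across two chunks would be corrupted if each chunk were decoded on arrival.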
Adds 9 LLM-facing debug tools following the Gemini CLI tool architecture:
- debug_launch: Start debug sessions with auto-configured adapters
- debug_attach: Attach to running processes
- debug_set_breakpoint: Set line breakpoints with conditions
- debug_set_function_breakpoint: Set breakpoints on function names
- debug_step: Step in/out/over/next with granular control
- debug_evaluate: Evaluate expressions in debug context
- debug_get_stacktrace: Retrieve enriched stack traces
- debug_get_variables: Inspect variables with scope filtering
- debug_disconnect: Graceful session termination

Includes full tool definitions with JSON schemas for each tool.

Part of google-gemini#20674
Comprehensive breakpoint management with 5 specialized modules:
- BreakpointStore: Persistent storage with file/line indexing (178 lines)
- SmartBreakpointSuggester: 4 strategies for auto-suggesting breakpoints based on error patterns, hot paths, and entry points (253 lines)
- DataBreakpointManager: DAP watchpoints that break on data changes (225 lines)
- ExceptionBreakpointManager: Caught/uncaught exception breakpoints with condition support and exception history tracking (287 lines)
- BreakpointValidator: Pre-validates breakpoint locations before sending to the adapter; checks executability, suggests nearest valid line (411 lines)

Part of google-gemini#20674
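The "suggest nearest valid line" behaviour can be sketched as an outward search from the requested line. How the PR decides executability is not shown here, so the heuristic below (skip blank lines and `//` comments) and both function names are assumptions for illustration only.

```typescript
// Assumption: a line is "likely executable" if it is neither blank nor a
// `//` comment. A real validator would probably ask a parser or the adapter.
function isLikelyExecutable(line: string): boolean {
  const trimmed = line.trim();
  return trimmed.length > 0 && !trimmed.startsWith("//");
}

// Returns the 1-based line closest to `requested` that passes the heuristic,
// preferring the later line on a tie, or undefined if no line qualifies.
function nearestValidLine(source: string, requested: number): number | undefined {
  const lines = source.split("\n");
  const maxDistance = lines.length + Math.abs(requested);
  for (let distance = 0; distance <= maxDistance; distance++) {
    for (const candidate of [requested + distance, requested - distance]) {
      const inRange = candidate >= 1 && candidate <= lines.length;
      if (inRange && isLikelyExecutable(lines[candidate - 1])) {
        return candidate;
      }
    }
  }
  return undefined;
}

const sample = [
  "// entry point", // line 1
  "", // line 2
  "function main() {", // line 3
  "  // TODO: validate", // line 4
  "  return run();", // line 5
  "}", // line 6
].join("\n");

console.log(nearestValidLine(sample, 4)); // 5
console.log(nearestValidLine(sample, 1)); // 3
```

Preferring the later line on a tie is a design choice: a breakpoint that fires just after the requested location usually still sees the state the user cared about.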
Intelligent error analysis with 5 modules:
- StackTraceAnalyzer: Enriches stack frames with source context (365 lines)
- FixSuggestionEngine: 11 error pattern matchers with actionable fixes (529 lines)
- ErrorKnowledgeBase: Curated error patterns with examples (331 lines)
- RootCauseAnalyzer: Generates ranked root cause hypotheses from exceptions, detects infinite recursion, suggests debugging next steps (482 lines)
- DebugErrorClassifier: 17 error patterns across 8 categories with severity, recovery strategies, and retry logic (484 lines)

Part of google-gemini#20674
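One simple way to flag suspected infinite recursion is to count how often each function appears in a captured stack. The sketch below assumes that approach; the `Frame` shape, the `findSuspectedRecursion` name, and the default threshold of 50 are hypothetical and do not describe the RootCauseAnalyzer's real logic.

```typescript
// Hypothetical frame shape, loosely modelled on what a stack trace provides.
interface Frame {
  name: string;
  file: string;
  line: number;
}

interface RecursionSuspect {
  key: string; // "<file>:<function>"
  count: number;
}

// Flags the function that occurs most often in the stack, provided it occurs
// at least `threshold` times. Counting by function (not by line) also catches
// mutual recursion such as a -> b -> a -> b.
function findSuspectedRecursion(
  frames: readonly Frame[],
  threshold = 50,
): RecursionSuspect | undefined {
  const counts = new Map<string, number>();
  let worst: RecursionSuspect | undefined;
  for (const frame of frames) {
    const key = `${frame.file}:${frame.name}`;
    const count = (counts.get(key) ?? 0) + 1;
    counts.set(key, count);
    if (count >= threshold && (worst === undefined || count > worst.count)) {
      worst = { key, count };
    }
  }
  return worst;
}

// Usage: 60 nested calls to fib() sitting on top of main().
const deepStack: Frame[] = [
  ...Array.from({ length: 60 }, () => ({ name: "fib", file: "math.ts", line: 12 })),
  { name: "main", file: "app.ts", line: 3 },
];
console.log(findSuspectedRecursion(deepStack)); // { key: 'math.ts:fib', count: 60 }
```

A frequency threshold cannot tell runaway recursion from legitimately deep recursion, so a result like this is best treated as one ranked hypothesis rather than a verdict.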
Robust session lifecycle management with 4 modules:
- DebugSessionStateMachine: 8-state FSM (Idle, Connecting, Initializing, Stopped, Running, Stepping, Disconnecting, Error) with validated transitions and timing analytics (263 lines)
- DebugSessionHistory: Step-by-step history with debug loop detection to prevent infinite step cycles (235 lines)
- DebugSessionSerializer: Save/load debug sessions for resumption across Gemini CLI restarts (238 lines)
- ConditionalStepRunner: Execute step sequences with conditions (e.g., step until variable changes) (252 lines)

Part of google-gemini#20674
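A validated-transition FSM over the eight states listed above can be sketched as a lookup table plus a guard. The state names come from the commit message; the transition table itself and the `SketchStateMachine` class are assumptions made for illustration, and the PR's actual table may differ.

```typescript
type DebugState =
  | "Idle"
  | "Connecting"
  | "Initializing"
  | "Stopped"
  | "Running"
  | "Stepping"
  | "Disconnecting"
  | "Error";

// Assumed transition table (not taken from the PR).
const ALLOWED: Record<DebugState, readonly DebugState[]> = {
  Idle: ["Connecting"],
  Connecting: ["Initializing", "Error"],
  Initializing: ["Running", "Stopped", "Error"],
  Running: ["Stopped", "Disconnecting", "Error"],
  Stopped: ["Running", "Stepping", "Disconnecting", "Error"],
  Stepping: ["Stopped", "Running", "Disconnecting", "Error"],
  Disconnecting: ["Idle", "Error"],
  Error: ["Idle", "Disconnecting"],
};

class SketchStateMachine {
  private current: DebugState = "Idle";

  get state(): DebugState {
    return this.current;
  }

  // Moves to `next` only if the table allows it; otherwise throws.
  transition(next: DebugState): void {
    if (!ALLOWED[this.current].includes(next)) {
      throw new Error(`Invalid transition: ${this.current} -> ${next}`);
    }
    this.current = next;
  }
}

// Usage: walk a typical session, then attempt an illegal jump.
const fsm = new SketchStateMachine();
const visited: string[] = [fsm.state];
for (const next of ["Connecting", "Initializing", "Stopped", "Stepping", "Stopped"] as const) {
  fsm.transition(next);
  visited.push(fsm.state);
}
let rejected = false;
try {
  fsm.transition("Idle"); // Stopped -> Idle is not in the assumed table
} catch {
  rejected = true;
}
console.log(visited.join(" -> "), rejected);
```

Keeping the table as data makes the legal transitions reviewable in one place and lets `Record<DebugState, ...>` force an entry for every state at compile time.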
LLM-optimized context generation and observability with 6 modules:
- DebugContextBuilder: Priority-ranked, token-budget-aware context builder that feeds optimal debug state to the LLM (375 lines)
- DebugPrompt: System prompt augmentation for debug-aware conversations (121 lines)
- WatchExpressionManager: Persistent watch expressions with evaluation history and markdown reporting (207 lines)
- VariableDiffTracker: Track variable changes between debug stops, detect nullifications and volatile variables (331 lines)
- DebugTelemetryCollector: Usage metrics and session analytics (246 lines)
- PerformanceProfiler: Operation timing and bottleneck detection (215 lines)

Part of google-gemini#20674
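"Priority-ranked, token-budget-aware" context building can be read as a greedy selection: sort sections by priority and keep each one that still fits. The sketch below assumes that reading; the `ContextSection` shape, `buildContext`, and the rough four-characters-per-token estimate are illustrative assumptions, not the DebugContextBuilder's real interface.

```typescript
// Hypothetical section shape; higher priority means more important.
interface ContextSection {
  label: string;
  priority: number;
  text: string;
}

// Rough estimate, assuming about 4 characters per token. A real builder would
// more likely use the model's own tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Greedy fill: highest priority first, skipping any section that would exceed
// the budget so that smaller, lower-priority sections can still be included.
function buildContext(sections: readonly ContextSection[], tokenBudget: number): string {
  const ranked = [...sections].sort((a, b) => b.priority - a.priority);
  const kept: string[] = [];
  let used = 0;
  for (const section of ranked) {
    const block = `## ${section.label}\n${section.text}`;
    const cost = estimateTokens(block);
    if (used + cost > tokenBudget) continue;
    kept.push(block);
    used += cost;
  }
  return kept.join("\n\n");
}

// Usage: the large variable dump does not fit a 20-token budget and is dropped.
const context = buildContext(
  [
    { label: "Variables", priority: 1, text: "x".repeat(400) },
    { label: "Exception", priority: 3, text: "TypeError: x is undefined" },
    { label: "Stack", priority: 2, text: "at main (app.ts:5)" },
  ],
  20,
);
console.log(context);
```

Skipping an oversized section instead of stopping at it keeps the budget well used, at the cost of occasionally including a low-priority section while a mid-priority one is left out.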
Production-grade infrastructure completing the Debug Companion:
- AdapterProcessManager: Spawn, monitor, and manage debug adapter processes for Node.js, Python, Go, Ruby (390 lines)
- DebugInputSanitizer: Validates and sanitizes all debug inputs to prevent injection attacks (335 lines)
- DebugPolicyGuard: Risk classification for debug operations, enforcing safety policies (331 lines)
- DebugTestGenerator: Auto-generates test cases from debug sessions for regression testing (294 lines)
- DebugWorkflowOrchestrator: Coordinates multi-step debug workflows with rollback support (290 lines)
- InlineFixPreview: Shows fix previews before applying changes (226 lines)
- Barrel exports (index.ts) for all 33 debug modules

Part of google-gemini#20674
…UIRING_NARROWING
- Split monolithic debugTools.ts (1,045 lines) into 9 per-tool files under tools/debug/ following the repo's one-file-per-tool convention
- Created shared session-manager.ts for singleton DAP client management
- Added barrel index.ts for clean imports
- Original debugTools.ts now re-exports from debug/ for backward compat
- Added DebugLaunchTool, DebugEvaluateTool, DebugAttachTool to TOOLS_REQUIRING_NARROWING for human-in-the-loop security
- Registered DebugAttachTool and DebugSetFunctionBreakpointTool in config.ts (were previously missing from tool registration)
…sconnect
- New top-level /debugger command for interactive debug companion
- Subcommands: launch <file>, attach <port>, status, disconnect
- Uses submit_prompt to delegate to the LLM agent which invokes the debug tools (debug_launch, debug_attach, etc.)
- Registered in BuiltinCommandLoader alongside other built-in commands
- Named /debugger to avoid conflict with existing nightly /debug subcommand
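The four subcommands can be pictured as a small parser that turns the raw argument string into a prompt for the agent. The sketch below is an assumption about how that might look: `parseDebuggerCommand`, the `ParseResult` shape, the prompt wording, and the port validation rules are all hypothetical and are not the PR's command implementation or Gemini CLI's command API.

```typescript
// Hypothetical result type; the real command returns a Gemini CLI action.
type ParseResult =
  | { kind: "prompt"; text: string }
  | { kind: "error"; message: string };

const USAGE = "Usage: /debugger launch <file> | attach <port> | status | disconnect";

function parseDebuggerCommand(args: string | undefined): ParseResult {
  const trimmed = (args ?? "").trim();
  if (trimmed === "") return { kind: "error", message: USAGE };

  const spaceIndex = trimmed.search(/\s/);
  const sub = spaceIndex === -1 ? trimmed : trimmed.slice(0, spaceIndex);
  const rest = spaceIndex === -1 ? "" : trimmed.slice(spaceIndex).trim();

  switch (sub) {
    case "launch":
      // Keep the whole remainder so that paths containing spaces survive.
      if (rest === "") return { kind: "error", message: "launch needs a file path" };
      return { kind: "prompt", text: `Use debug_launch to start a debug session for "${rest}".` };
    case "attach": {
      // Assumed rule: exactly one numeric port in the valid TCP range.
      if (!/^\d+$/.test(rest)) return { kind: "error", message: "attach needs a numeric port" };
      const port = Number(rest);
      if (port < 1 || port > 65535) return { kind: "error", message: "port out of range" };
      return { kind: "prompt", text: `Use debug_attach to attach to the process on port ${port}.` };
    }
    case "status":
      return { kind: "prompt", text: "Report the status of the current debug session." };
    case "disconnect":
      return { kind: "prompt", text: "Use debug_disconnect to end the current debug session." };
    default:
      return { kind: "error", message: USAGE };
  }
}

console.log(parseDebuggerCommand("launch src/my app/index.ts"));
console.log(parseDebuggerCommand("attach abc"));
```

Validating arguments before building the prompt means malformed input is rejected locally instead of being forwarded to the model.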
a9f85ba to fbbd519
…command
- Add session-manager.test.ts: singleton lifecycle, formatting helpers, error result structure, intelligence layer singletons (32 tests)
- Add debug-tools.test.ts: mock DAPClient tests for all 8 tool wrappers covering: no session errors, DAP timeouts, empty responses, missing response keys, non-Error throws, frame index out of range, session cleanup on disconnect failure, terminateDebuggee variants (34 tests)
- Add debuggerCommand.test.ts: help text, all 4 subcommands, edge cases for undefined args, whitespace-only input, paths with spaces, non-numeric ports, extra trailing args (23 tests)
- Total new tests: 89 | Total project tests: 508
- Lifecycle server simulates realistic DAP adapter with events
- Full 15-step test: connect → initialize → setBreakpoints → launch → configurationDone → stopped(entry) → stackTrace → scopes → variables → evaluate → step(next/stepIn/stepOut) → continue → stopped(breakpoint) → disconnect
- Tests concurrent operations (3 parallel setBreakpoints)
- Tests server crash recovery (adapter killed mid-session)
- Tests post-disconnect operation rejection
- All 4 E2E tests pass in under 350ms
Hi there! Thank you for your interest in contributing to Gemini CLI. To ensure we maintain high code quality and focus on our prioritized roadmap, we have updated our contribution policy (see Discussion #17383). We only guarantee review and consideration of pull requests for issues that are explicitly labeled as 'help wanted'. All other community pull requests are subject to closure after 14 days if they do not align with our current focus areas. For this reason, we strongly recommend that contributors only submit pull requests against issues explicitly labeled as 'help-wanted'. This pull request is being closed as it has been open for 14 days without a 'help wanted' designation. We encourage you to find and contribute to existing 'help wanted' issues in our backlog! Thank you for your understanding and for being part of our community!
Summary
This PR adds the Debug Companion — a production-grade, AI-powered debugging subsystem for Gemini CLI. It implements a proof-of-concept for Idea #7: Debug Companion, providing 9 debug tools, a full DAP (Debug Adapter Protocol) client, and 33 supporting modules spanning 7 architectural layers.
What does this PR do?
It enables Gemini CLI to debug programs by communicating with debug adapters (Node.js, Python, Go, Ruby) through the Debug Adapter Protocol. The LLM can launch debug sessions, set breakpoints, step through code, inspect variables, evaluate expressions, and analyze errors — all through natural language.
9 Debug Tools
- debug_launch
- debug_attach
- debug_set_breakpoint
- debug_set_function_breakpoint
- debug_step
- debug_evaluate
- debug_get_stacktrace
- debug_get_variables
- debug_disconnect

Architecture: 33 Modules, 7 Layers
graph TB
    subgraph Protocol["Protocol Layer"]
        DAP["DAPClient: Wire Protocol + TCP"]
        SRC["SourceMapResolver"]
        REG["DebugAdapterRegistry"]
        CFG["DebugConfigPresets"]
    end
    subgraph Tools["Tool Layer: 9 Tools"]
        T["debug_launch / attach / breakpoint<br/>step / evaluate / stacktrace<br/>variables / disconnect / function_bp"]
    end
    subgraph Breakpoints["Breakpoint Layer"]
        BS["BreakpointStore"]
        SS["SmartSuggester: 4 strategies"]
        DB["DataBreakpointManager"]
        EB["ExceptionBreakpointManager"]
        BV["BreakpointValidator: Pre-validation"]
    end
    subgraph Analysis["Analysis Layer"]
        STA["StackTraceAnalyzer"]
        FIX["FixSuggestionEngine: 11 patterns"]
        KB["ErrorKnowledgeBase"]
        RCA["RootCauseAnalyzer"]
        EC["DebugErrorClassifier: 17 patterns"]
    end
    subgraph State["State Layer"]
        SM["SessionStateMachine: 8-state FSM"]
        SH["SessionHistory: Loop detection"]
        SER["SessionSerializer"]
        CSR["ConditionalStepRunner"]
    end
    subgraph Context["Context Layer"]
        CTX["DebugContextBuilder: Token-aware"]
        DP["DebugPrompt"]
        WM["WatchExpressionManager"]
        VDT["VariableDiffTracker"]
        TEL["TelemetryCollector"]
        PERF["PerformanceProfiler"]
    end
    subgraph Infra["Infrastructure Layer"]
        APM["AdapterProcessManager: 4 languages"]
        SAN["InputSanitizer"]
        PG["PolicyGuard"]
        TG["TestGenerator"]
        WO["WorkflowOrchestrator"]
        IFP["InlineFixPreview"]
    end
    Tools --> Protocol
    Tools --> Breakpoints
    Tools --> Analysis
    Tools --> State
    Context --> Tools
    Infra --> Protocol

Key Design Decisions
Protocol-first architecture: The DAPClient implements the full DAP wire protocol with TCP transport, Content-Length framing, and request/response correlation, the same protocol used by VS Code.

Pre-validation over post-failure: BreakpointValidator checks if a line is executable before sending to the adapter, preventing the common "breakpoint not verified" frustration.

LLM-optimized context: DebugContextBuilder creates priority-ranked, token-budget-aware context for the LLM, ensuring the most relevant debug information is always available regardless of context window limits.

Production resilience: DebugErrorClassifier transforms raw error strings into structured, actionable intelligence with 17 patterns across 8 categories, each with severity, recovery strategies, and retry logic.

Root cause analysis: RootCauseAnalyzer goes beyond "what crashed" to answer "why", generating ranked hypotheses with confidence scores and concrete debugging next steps.

Testing
What's Next (GSoC Timeline)
/debug slash command & interactive debug mode

Stats
Related to #20674