…port to ClickHouse
# Conflicts: # pyproject.toml
website/docs/_blogs/2026-02-08-AG2-OpenTelemetry-Tracing/index.mdx
- A page or section on instrument agent is missing and should be included in the mdx as well. The same goes for SpanKind and SpanType: a section on each is needed in the mdx files.
Everything looks great. A notebook and documentation update is needed, I suppose. It's a good feature.
@claude final review before merge please
Claude finished @marklysze's task in 2m 45s
Final Review Complete ✅
I've conducted a comprehensive review of this OpenTelemetry tracing implementation. This is an excellent, production-ready feature that follows industry best practices. Here's my assessment:
Architecture & Design ⭐⭐⭐⭐⭐
Strengths:
Implementation highlights:
Security & Privacy 🔒
Strengths:
Recommendations:
Error Handling ✅
Strengths:
Backwards Compatibility ✅
Excellent - Zero breaking changes:
Test Coverage 🧪
Strong coverage (89.27% patch coverage):
Coverage gaps (83 lines missing per Codecov):
Recommendation: The coverage is strong overall. The missing lines are mostly in async code paths and opt-in features (A2A), which is acceptable for a first release.
Documentation 📚
Comprehensive and high-quality:
Notable documentation highlights:
Integration Points ✅
Well-designed integrations:
Minor Issues & Suggestions
1. Tool data privacy consideration: Current behavior captures all tool calls by default (
2. Documentation clarification:
3. A2A trace context extraction:
4. Copyright year inconsistency:
Code Quality ✅
High-quality implementation:
CI/CD Status ✅
All checks passing:
Final Assessment
This PR is ready to merge. It represents a significant, well-architected addition to AG2 that:
The minor suggestions above (tool data privacy flag, documentation clarification, A2A error handling) are nice-to-haves that can be addressed in follow-up PRs if desired. They don't block this release.
Recommendation: APPROVE AND MERGE ✅
This is high-quality work that will significantly improve AG2's observability story. Congratulations to @marklysze and @Lancetnik on an excellent implementation! 🎉
Branch: feat/tracing
Why are these changes needed?
This PR introduces OpenTelemetry-based distributed tracing for AG2 multi-agent conversations. It enables observability into agent workflows, LLM calls, tool executions, code execution, and human-in-the-loop interactions.
Installation
pip install "ag2[tracing]"
This installs opentelemetry-api, opentelemetry-sdk, and the OTLP gRPC exporter.
Approach
OpenTelemetry GenAI Semantic Conventions
The implementation follows the OpenTelemetry GenAI Semantic Conventions with AG2-specific extensions. This ensures compatibility with standard observability tools (Grafana, Jaeger, Datadog, Honeycomb, etc.) while capturing AG2-specific context.
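As a rough illustration of what "GenAI conventions plus AG2 extensions" means for a single LLM call, the sketch below builds a span name and an attribute dictionary using only attribute keys listed in this PR description. The helper names (`llm_span_name`, `llm_span_attributes`) and the example provider and model values are hypothetical and are not taken from the AG2 code.

```python
from typing import Any, Dict


def llm_span_name(operation: str, model: str) -> str:
    """Build a span name in the '{operation} {model}' form used by the GenAI conventions."""
    return f"{operation} {model}"


def llm_span_attributes(
    provider: str, model: str, input_tokens: int, output_tokens: int
) -> Dict[str, Any]:
    """Return attributes for one LLM call (hypothetical helper, not AG2's own code)."""
    return {
        # Standard OTEL GenAI attributes
        "gen_ai.operation.name": "chat",
        "gen_ai.provider.name": provider,
        "gen_ai.request.model": model,
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
        # AG2-specific extension used to classify the span
        "ag2.span.type": "llm",
    }


if __name__ == "__main__":
    # Illustrative values only
    print(llm_span_name("chat", "gpt-4o-mini"))
    print(llm_span_attributes("openai", "gpt-4o-mini", 12, 34))
```

A dictionary like this can be passed as the `attributes` argument when starting a span with the OpenTelemetry Python API, which is what lets generic backends interpret the standard keys while still showing the `ag2.*` ones.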
Trace Hierarchy
For group chats with a pattern, the tree includes speaker selection:
Span Types
| ag2.span.type | Operation | Triggered by |
|---|---|---|
| conversation | conversation | run, initiate_chat, a_initiate_chat, resume, run_chat, a_run_chat |
| multi_conversation | initiate_chats | initiate_chats, a_initiate_chats (sequential or parallel) |
| agent | invoke_agent | generate_reply, a_generate_reply, a_generate_remote_reply |
| llm | chat | OpenAIWrapper.create() (every LLM API call) |
| tool | execute_tool | execute_function, a_execute_function |
| code_execution | execute_code | |
| human_input | await_human_input | get_human_input, a_get_human_input |
| speaker_selection | speaker_selection | _auto_select_speaker, a_auto_select_speaker (group chat) |
Central LLM Instrumentation
All LLM providers (OpenAI, Anthropic, Gemini, Bedrock, Mistral, etc.) are instrumented through a single point: OpenAIWrapper.create(). This captures:
Distributed Tracing (A2A)
For remote agents using the A2A protocol, trace context is automatically propagated via W3C Trace Context headers, enabling end-to-end traces across service boundaries.
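W3C Trace Context carries the trace identity in a `traceparent` header of the form `version-traceid-parentid-flags`. The stdlib-only sketch below shows what propagating that header across a service boundary involves. The helper names are hypothetical, and this is not how the PR implements propagation; an OpenTelemetry-based implementation would normally rely on the library's propagators instead of hand-written parsing.

```python
import re
import secrets
from typing import Dict, Optional

# version (2 hex) - trace-id (32 hex) - parent span id (16 hex) - flags (2 hex)
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def make_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    """Format a version-00 traceparent header value."""
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


def parse_traceparent(header: str) -> Optional[Dict[str, object]]:
    """Parse a traceparent value; return None if it is malformed or invalid."""
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # Version ff and all-zero ids are invalid per the specification.
    if version == "ff" or trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return {
        "trace_id": trace_id,
        "parent_span_id": span_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


def child_traceparent(incoming: str) -> Optional[str]:
    """Continue the caller's trace: keep the trace id, mint a new span id."""
    ctx = parse_traceparent(incoming)
    if ctx is None:
        return None
    new_span_id = secrets.token_hex(8)  # 8 random bytes = 16 hex characters
    return make_traceparent(str(ctx["trace_id"]), new_span_id, bool(ctx["sampled"]))


if __name__ == "__main__":
    incoming = make_traceparent("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7")
    print(incoming)
    print(child_traceparent(incoming))
```

Because the remote side reuses the incoming trace id and only changes the span id, spans from both services end up in the same trace, which is what makes the end-to-end view possible.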
Instrumentation API
All functions are exported from autogen.opentelemetry and take a tracer_provider keyword argument:
Standard Attributes (OTEL GenAI)
- gen_ai.operation.name - Operation type
- gen_ai.agent.name - Agent name
- gen_ai.provider.name - LLM provider
- gen_ai.request.model / gen_ai.response.model
- gen_ai.usage.input_tokens / gen_ai.usage.output_tokens
- gen_ai.tool.name, gen_ai.tool.call.id, gen_ai.tool.call.arguments, gen_ai.tool.call.result
- gen_ai.input.messages / gen_ai.output.messages
- gen_ai.response.finish_reasons
- gen_ai.conversation.id / gen_ai.conversation.turns / gen_ai.conversation.max_turns
AG2-Specific Extensions
- ag2.span.type - Span classification
- ag2.speaker_selection.candidates / ag2.speaker_selection.selected
- ag2.human_input.prompt / ag2.human_input.response
- ag2.code_execution.exit_code / ag2.code_execution.output
- ag2.chats.count, ag2.chats.mode, ag2.chats.recipients
- gen_ai.usage.cost - AG2 cost tracking
- gen_ai.agent.remote / server.address - Remote A2A agent attributes
- error.type - Error type on failure
Files
- autogen/opentelemetry/ (setup.py, utils.py, consts.py, instrumentators/)
- test/opentelemetry/
- website/docs/user-guide/tracing/opentelemetry.mdx
- website/docs/user-guide/tracing/remote-agents.mdx
- website/docs/user-guide/tracing/local-setup.mdx
- website/mkdocs/docs/docs/blog/posts/2026-02-08-AG2-OpenTelemetry-Tracing/
- notebook/agentchat_tracing.ipynb
Tracing examples
Related issue number
N/A
Checks