feat(interop): add A2A (Agent-to-Agent) protocol support #4166

Closed
5queezer wants to merge 8 commits into zeroclaw-labs:master from 5queezer:claude/fix-issue-3566-UyTkH

@5queezer
Contributor

@5queezer 5queezer commented Mar 21, 2026

Summary

  • Base branch target: master
  • Problem: ZeroClaw agents cannot discover or communicate with external agents — multi-agent is intra-instance only
  • Why it matters: no standardized discovery or task delegation between instances
  • What changed: native A2A protocol support — outbound client tool (src/tools/a2a.rs), inbound JSON-RPC 2.0 server (src/gateway/a2a.rs), auto-generated agent card, A2aConfig schema, same-host localhost A2A for multi-instance Pi setups
  • What did not change: no SSE streaming, webhook push, mTLS/OAuth, or agent registry. Core agent loop unchanged — inbound tasks route through existing process_message pipeline

Label Snapshot

  • Risk: high
  • Size: M
  • Scope: gateway, tool, config, onboard
  • Module: tool: a2a, gateway: a2a

Change Metadata

  • Type: feature
  • Scope: multi

Linked Issue

Validation Evidence

cargo fmt --all -- --check                        #
cargo clippy --all-targets -- -D warnings         #
cargo test --lib -- gateway::a2a tools::a2a       # ✅ 40 tests pass
  • 5 unit tests (agent card, URL validation, RPC error format, task store, MAX_TASKS)
  • 15 handler integration tests (auth matrix, task lifecycle, capacity limit, message/send)
  • 7 wiremock HTTP tests (discover, send, bearer auth, error codes)
  • 8 tool execute tests (missing params, unknown action, read-only autonomy, SSRF)
  • 1 allow_local test (localhost permitted when flag is set)
  • 4 SSRF helper tests (cloud metadata, IPv4-mapped IPv6, resolution)
  • E2E: five live instances on a Pi Zero 2 W communicating via A2A

Security Impact

  • New permissions/capabilities? Yes
  • New external network calls? Yes
  • Secrets/tokens handling changed? Yes
  • File system access scope changed? No

Mitigations:

  • Outbound: SSRF protection (private IP blocking, IPv4-mapped IPv6, redirect policy, DNS resolution check). allow_local only when public_url points to localhost. Bearer tokens per-call, not logged.
  • Inbound: GET /.well-known/agent-card.json unauthenticated (metadata only). POST /a2a requires bearer token (PairingGuard or a2a.bearer_token, constant-time comparison).
  • TaskStore capped at 10,000 entries to prevent memory exhaustion.
  • Localhost-only by default and fully opt-in (the feature stays off while a2a.enabled = false, which is the default).
  • Startup warnings for missing auth and exposed internal addresses.
  • Known residual: allow_local is a blanket bypass — peer allowlist planned in [Feature][interop]: A2A peer discovery for same-host and LAN #4643.
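
The outbound IP checks listed above could look roughly like the sketch below. It is not the code in src/tools/a2a.rs: the function names are hypothetical, the exact set of blocked ranges is an assumption, and DNS resolution and the redirect policy are left out. It only shows how an IPv4-mapped IPv6 address can be unwrapped before the private-range test, and how allow_local acts as a blanket bypass.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Returns true when an IPv4 address should be refused as an outbound target.
/// The set of ranges is an assumption made for this sketch.
fn is_blocked_v4(ip: Ipv4Addr) -> bool {
    ip.is_private()            // 10/8, 172.16/12, 192.168/16
        || ip.is_loopback()    // 127/8
        || ip.is_link_local()  // 169.254/16, which contains 169.254.169.254 (cloud metadata)
        || ip.is_unspecified() // 0.0.0.0
        || ip.is_broadcast()   // 255.255.255.255
}

/// Returns true when an IPv6 address should be refused as an outbound target.
fn is_blocked_v6(ip: Ipv6Addr) -> bool {
    // Unwrap ::ffff:a.b.c.d so a mapped private address cannot slip through.
    if let Some(v4) = ip.to_ipv4_mapped() {
        return is_blocked_v4(v4);
    }
    let first = ip.segments()[0];
    ip.is_loopback()
        || ip.is_unspecified()
        || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7
        || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10
}

/// Hypothetical entry point. `allow_local` skips every check, which is the
/// "blanket bypass" described as a known residual risk.
pub fn is_blocked_ip(ip: IpAddr, allow_local: bool) -> bool {
    if allow_local {
        return false;
    }
    match ip {
        IpAddr::V4(v4) => is_blocked_v4(v4),
        IpAddr::V6(v6) => is_blocked_v6(v6),
    }
}
```

A check of this kind has to run on the addresses a hostname resolves to, not on the hostname string, and it remains exposed to DNS rebinding between the check and the actual connection.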

Privacy and Data Hygiene

  • Status: pass
  • No PII in agent cards or task responses. Bearer tokens excluded from tool output.

Compatibility / Migration

  • Backward compatible? Yes
  • Config changes? Yes — new optional [a2a] section with #[serde(default)]
  • Migration needed? No

i18n Follow-Through

  • Triggered? Yes
  • Config reference updated for en, vi, zh-CN: Yes

Human Verification

  • Verified: compilation, formatting, clippy, 40 tests
  • Edge cases: missing params, unknown actions, invalid URL schemes, empty bearer tokens, disabled feature → 404, store full → 503
  • E2E: five live Pi Zero 2 W instances (Kerf, Sentinel, Architect, Critic, Researcher) exchanging messages via A2A with gpt-5.1-codex-mini

Side Effects / Blast Radius

  • Affected: gateway (2 routes), tool registry (1 conditional tool), config schema (1 section), onboard wizard (1 field), agent bootstrap prompt (1 tool description)
  • Unintended effects: none — fully opt-in, routes return 404 when disabled

Rollback Plan

  • Rollback: revert commit or set a2a.enabled = false
  • Feature flag: a2a.enabled (default false)
  • Failure symptoms: A2A routes returning errors, tool execution failures in agent logs

Risks and Mitigations

@5queezer 5queezer marked this pull request as draft March 21, 2026 17:29
@5queezer 5queezer force-pushed the claude/fix-issue-3566-UyTkH branch 4 times, most recently from 0d05a27 to f036781 Compare March 22, 2026 09:39
@5queezer 5queezer marked this pull request as ready for review March 22, 2026 11:09
@github-actions github-actions Bot added docs Auto scope: docs/markdown/template files changed. config Auto scope: src/config/** changed. gateway Auto scope: src/gateway/** changed. onboard Auto scope: src/onboard/** changed. tool Auto scope: src/tools/** changed. labels Mar 24, 2026
@5queezer 5queezer force-pushed the claude/fix-issue-3566-UyTkH branch from 602d5aa to c27e48c Compare March 24, 2026 21:30
@github-actions github-actions Bot added the agent Auto scope: src/agent/** changed. label Mar 24, 2026
@github-actions github-actions Bot added the memory Auto scope: src/memory/** changed. label Mar 25, 2026
@5queezer 5queezer force-pushed the claude/fix-issue-3566-UyTkH branch from 6c90740 to 23ef01d Compare March 25, 2026 06:42
@github-actions github-actions Bot removed the memory Auto scope: src/memory/** changed. label Mar 25, 2026
@5queezer 5queezer force-pushed the claude/fix-issue-3566-UyTkH branch from dca06fc to a3a7c5f Compare March 25, 2026 20:11
@theonlyhennygod
Collaborator

Hey @5queezer — this PR currently has failing CI checks. Could you rebase against current master and fix the failing checks so we can review and merge? Run cargo fmt --all -- --check && cargo clippy --all-targets -- -D warnings && cargo test locally before pushing. Thanks!

@5queezer 5queezer force-pushed the claude/fix-issue-3566-UyTkH branch from 6083a16 to 3622fa4 Compare March 26, 2026 21:26
@github-actions github-actions Bot added the channel Auto scope: src/channels/** changed. label Mar 26, 2026
@5queezer
Contributor Author

Rebased onto master and all checks pass locally (fmt, clippy, test). This is intentionally scoped as an MVP. SSE streaming, multi-turn, and cancel are tracked in #3566 for follow-up PRs.

Tested manually via Telegram: single-turn A2A calls work between agents, and each agent prints its reasoning to a shared group chat for verification. Happy to record a demo if helpful.

@5queezer 5queezer force-pushed the claude/fix-issue-3566-UyTkH branch from faf2a66 to 6bded3b Compare March 28, 2026 00:52
@github-actions github-actions Bot added the memory Auto scope: src/memory/** changed. label Mar 28, 2026
Implement native A2A protocol support enabling ZeroClaw agents to
communicate with external A2A-compatible agents across hosts.

Components:
- A2A server (src/gateway/a2a.rs): inbound JSON-RPC 2.0 handlers for
  message/send and tasks/get, agent card at /.well-known/agent-card.json
- A2A client tool (src/tools/a2a.rs): outbound tool with discover, send,
  status, and result actions
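
For illustration, routing the two inbound methods and formatting a JSON-RPC 2.0 error could be sketched as follows. The enum, the function names, and the hand-rolled string formatting are assumptions made for this example (a real handler would more likely use a JSON library); only the method names and the standard -32601 "Method not found" code come from the PR text and the JSON-RPC 2.0 specification.

```rust
/// Standard JSON-RPC 2.0 error code for an unknown method.
const METHOD_NOT_FOUND: i32 = -32601;

/// The two inbound methods named in this commit (hypothetical enum).
#[derive(Debug, PartialEq)]
enum A2aMethod {
    MessageSend,
    TasksGet,
}

/// Maps the JSON-RPC "method" string to a handler variant.
fn parse_method(name: &str) -> Result<A2aMethod, i32> {
    match name {
        "message/send" => Ok(A2aMethod::MessageSend),
        "tasks/get" => Ok(A2aMethod::TasksGet),
        _ => Err(METHOD_NOT_FOUND),
    }
}

/// Minimal escaping for the two characters that would break the JSON string below.
fn escape(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Renders a JSON-RPC 2.0 error response body.
/// `id_json` must already be valid JSON (a number, a quoted string, or null).
fn rpc_error(id_json: &str, code: i32, message: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"error":{{"code":{},"message":"{}"}}}}"#,
        id_json,
        code,
        escape(message)
    )
}
```

Under this sketch, a request for an unsupported method such as tasks/cancel would be answered with the -32601 error object and not routed to the agent.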

Security hardening:
- Constant-time bearer token comparison (timing side-channel prevention)
- SSRF protection: private IP blocking, DNS resolution validation,
  redirect hop validation
- Security policy enforcement: autonomy gating and rate limiting
- Error redaction: generic messages to callers, full details logged
- Config API masking for a2a.bearer_token
- Manual Debug impl redacting bearer_token

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
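
Two of the hardening items above, constant-time token comparison and the redacting Debug impl, can be sketched with the standard library alone. This is not the PR's code: the struct is a hypothetical two-field subset of A2aConfig, the field types are assumptions, and production code often relies on a vetted crate such as subtle for the comparison.

```rust
use std::fmt;

/// Compares two byte strings without exiting early on the first mismatch.
/// The up-front length check still reveals whether the lengths differ.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        // Accumulate differences so every byte is always visited.
        diff |= x ^ y;
    }
    diff == 0
}

/// Hypothetical subset of the config; field types are assumptions.
#[derive(Default)]
struct A2aConfig {
    enabled: bool,
    bearer_token: Option<String>,
}

/// Manual Debug impl so the token never reaches logs via `{:?}`.
impl fmt::Debug for A2aConfig {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("A2aConfig")
            .field("enabled", &self.enabled)
            .field(
                "bearer_token",
                &self.bearer_token.as_ref().map(|_| "[REDACTED]"),
            )
            .finish()
    }
}
```

The Debug output still shows whether a token is set (Some or None), which tends to be useful when diagnosing the "no auth configured" case, while the value itself is never printed.
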
5queezer and others added 6 commits March 28, 2026 01:56
- Add MAX_TASKS capacity limit to prevent memory exhaustion DoS
- Warn at startup when A2A has no auth configured
- Warn at startup when agent card exposes internal bind address
- Stop echoing user-supplied task_id in error messages
- Document DNS rebinding TOCTOU in SSRF validation
- Derive Default for A2aConfig (fixes clippy)
- Add 39 tests covering auth, task lifecycle, capacity, and HTTP actions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
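
A minimal sketch of the MAX_TASKS idea, assuming a HashMap-backed store: the 10,000 cap and the "store full → 503" mapping come from the PR description, while the types, method names, and the choice to let updates to existing tasks through are assumptions for illustration.

```rust
use std::collections::HashMap;

/// Cap quoted in the PR description (10,000 entries).
const MAX_TASKS: usize = 10_000;

#[derive(Debug, PartialEq)]
enum StoreError {
    Full,
}

/// Simplified store: task id -> status text.
#[derive(Default)]
struct TaskStore {
    tasks: HashMap<String, String>,
}

impl TaskStore {
    /// Inserts or updates a task, refusing new ids once the cap is reached.
    fn insert(&mut self, id: String, status: String) -> Result<(), StoreError> {
        // Updating an existing task never grows the map, so only new ids are refused.
        if !self.tasks.contains_key(&id) && self.tasks.len() >= MAX_TASKS {
            return Err(StoreError::Full);
        }
        self.tasks.insert(id, status);
        Ok(())
    }

    fn get(&self, id: &str) -> Option<&str> {
        self.tasks.get(id).map(String::as_str)
    }
}

/// Maps a store outcome to an HTTP status; 503 for a full store as in the PR text.
fn status_for(result: &Result<(), StoreError>) -> u16 {
    match result {
        Ok(()) => 200,
        Err(StoreError::Full) => 503,
    }
}
```

A fixed cap bounds memory but does not free slots on its own; eviction or expiry of finished tasks would be a separate design decision that this sketch does not cover.
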
Document the new A2A protocol configuration keys, security notes,
and defaults across all maintained config reference locales.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…t A2A

The A2A tool was registered in the tool registry but missing from the
bootstrap system prompt tool_descs list, making it invisible to models
that rely on text-based tool instructions (e.g. OpenAI Codex).

Additionally, the SSRF protection unconditionally blocked localhost and
private IPs, preventing same-host multi-instance A2A communication
(multiple ZeroClaw bots on a single Raspberry Pi). The new allow_local
flag on A2aTool, derived from whether a2a.public_url points to a local
address, permits same-host A2A while maintaining SSRF protection for
public deployments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
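
Deriving allow_local from a2a.public_url might look something like the sketch below. Both helpers are hypothetical, the host extraction is deliberately minimal (real code would use a proper URL parser), and treating only "localhost" and loopback IPs as local is an assumption. The PR may define "local address" more broadly.

```rust
use std::net::IpAddr;

/// Extracts the host from an http(s) URL. Hypothetical, simplified helper.
fn host_of(url: &str) -> Option<&str> {
    let rest = url
        .strip_prefix("http://")
        .or_else(|| url.strip_prefix("https://"))?;
    // The authority ends at the first '/', '?', or '#'.
    let authority = rest.split(|c| c == '/' || c == '?' || c == '#').next()?;
    // Drop any userinfo ("user:pass@").
    let host_port = authority.rsplit('@').next()?;
    if let Some(bracketed) = host_port.strip_prefix('[') {
        // IPv6 literal such as "[::1]:8080".
        return bracketed.split(']').next();
    }
    // Strip an optional ":port".
    host_port.split(':').next()
}

/// Returns true when the URL's host is "localhost" or a loopback IP literal.
fn is_local_url(url: &str) -> bool {
    match host_of(url) {
        Some(host) if host.eq_ignore_ascii_case("localhost") => true,
        Some(host) => host
            .parse::<IpAddr>()
            .map(|ip| ip.is_loopback())
            .unwrap_or(false),
        None => false,
    }
}
```

With helpers like these, the tool could be built with something along the lines of allow_local = is_local_url(public_url). Matching the whole host, not a prefix, keeps names such as localhost.evil.example from enabling the bypass.
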
Tests were calling validate_url as a static method after it was changed
to take &self for the allow_local flag. Updated all test callsites to
use instance method and added a test for allow_local=true. Fixed cargo
fmt on A2aTool::new call in mod.rs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a2a.notify_chat_id is set, inbound A2A task results are posted
to the configured Telegram chat (e.g. a group). This makes inter-agent
communication visible to users watching the group — useful for
multi-instance setups where each bot has its own persona.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add `debouncer` field to ChannelRuntimeContext test initializer and
`auth_limiter` field to A2A test AppState to fix compilation after
rebase onto master which introduced these new required struct fields.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@5queezer 5queezer force-pushed the claude/fix-issue-3566-UyTkH branch from 6bded3b to b86c192 Compare March 28, 2026 00:56
@github-actions github-actions Bot removed channel Auto scope: src/channels/** changed. memory Auto scope: src/memory/** changed. labels Mar 28, 2026
Add module-level doc comments to both A2A files listing what's
implemented and what's missing vs the full A2A spec. Links to
issue #3566 for tracking.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@5queezer 5queezer force-pushed the claude/fix-issue-3566-UyTkH branch from b86c192 to c885dd4 Compare March 28, 2026 01:12
JacobGaoZQ pushed a commit to JacobGaoZQ/zeroclaw that referenced this pull request Apr 14, 2026
Compare current A2A branch code with upstream PR zeroclaw-labs#4166,
listing all additions including stream action, new config
fields, frontend test page, and documentation.

🤖 Generated with [Qoder](https://qoder.com)
JacobGaoZQ pushed a commit to JacobGaoZQ/zeroclaw that referenced this pull request Apr 15, 2026
Remove files not part of PR zeroclaw-labs#4166:
- A2A implementation status document
- A2A test report
- A2A test screenshots (12 files)
- Strands A2A server script
- LangChain A2A server script
- A2A server startup script
- A2A requirements files

Keep:
- docs/a2a-comparison.md (comprehensive A2A documentation)
- python/zeroclaw_tools/* (Python companion package, from version bump)
- python/pyproject.toml, README.md, tests/* (from version bump)

🤖 Generated with [Qoder](https://qoder.com)

Labels

  • agent (Auto scope: src/agent/** changed.)
  • config (Auto scope: src/config/** changed.)
  • docs (Auto scope: docs/markdown/template files changed.)
  • gateway (Auto scope: src/gateway/** changed.)
  • onboard (Auto scope: src/onboard/** changed.)
  • tool (Auto scope: src/tools/** changed.)

Development

Successfully merging this pull request may close these issues.

[Feature][interop]: A2A (Agent-to-Agent) Protocol Support

2 participants