Skip to content

fix(deserialization): replace pickle.loads with JSON across all sinks#2709

Open
amabito wants to merge 2 commits intoag2ai:mainfrom
amabito:pr/01-rce-deser-hardening
Open

fix(deserialization): replace pickle.loads with JSON across all sinks#2709
amabito wants to merge 2 commits intoag2ai:mainfrom
amabito:pr/01-rce-deser-hardening

Conversation

@amabito
Copy link
Copy Markdown
Contributor

@amabito amabito commented Apr 21, 2026

Why are these changes needed?

autogen/cache/redis_cache.py:77 is pickle.loads(cache.get(...)). That is the
default production cache path -- anything that can write to your Redis instance
gets RCE on every AG2 process using it. Four sibling sinks had the same shape:
pickle.loads in the Cosmos DB cache, pickle.loads on inbound Redis pub/sub
messages, pickle.load from a caller-supplied path_to_db_dir in teachability
(RCE on the next agent startup if you can drop a file there), and
importlib.import_module on the __type__ field arriving over Redis pub/sub
JSON -- a channel writer can trigger arbitrary import-time side effects with any
importable path. All five are fixed here because splitting the rollout would just
extend the exposure window.

JSON replaces pickle everywhere. Cache stores now write a one-byte version prefix
(\x01 + JSON). Legacy pickle reads still work, but only when
AG2_ALLOW_PICKLE_CACHE_READ=1 is set -- they log a DeprecationWarning on every
hit, noisy by design. Without the flag, the reader raises. The importlib resolver
is gone; a module-level _EVENT_REGISTRY dict takes its place. Custom
BaseEvent subclasses published over Redis pub/sub need @register_event_class;
the rejection test in test_serializer_security.py demonstrates what happens
when an unregistered __type__ arrives.

Breaking changes: existing .pkl cache entries and teachability stores need
migration. A migration helper ships with this PR
(python -m autogen.agentchat.contrib.capabilities.teachability_migrate_pickle_to_json).
Stub docs at docs/security/deserialization.md and docs/cache-migration.md;
website integration is a follow-up.

Related issue number

Addresses: N/A (proactive security audit of deserialization paths)

Checks

  • I've included any doc changes needed for https://docs.ag2.ai/.
    (Stub docs added to docs/security/deserialization.md and
    docs/cache-migration.md; website integration is a follow-up.)
  • I've added tests (if relevant) corresponding to the changes introduced in this PR.
  • I've made sure all auto checks have passed.

Five deserialization sinks consumed untrusted bytes via pickle.loads or
importlib.import_module on attacker-controlled strings. Each path allowed
remote code execution for any actor with write access to the respective
backend (Redis pub/sub, Redis GET, Cosmos DB items, local teachability
store). This change replaces pickle with JSON across all five sites and
replaces the dynamic class resolver with a registry lookup. Pickle read
paths remain accessible behind explicit opt-in environment variables with
deprecation warnings for smooth migration.

Addresses: R1R3-B1, R1R3-B2, R3R2-B1, R3R2-B2, R4R1-B1
@amabito amabito requested a review from Lancetnik as a code owner April 21, 2026 07:31
@github-actions github-actions Bot added documentation Improvements or additions to documentation beta labels Apr 21, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 21, 2026

Codecov Report

❌ Patch coverage is 0% with 25 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
autogen/beta/streams/redis/serializer.py 0.00% 25 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (49be53e) and HEAD (faf020c). Click for more details.

HEAD has 984 uploads less than BASE
Flag BASE (49be53e) HEAD (faf020c)
3.10 93 0
ubuntu-latest 99 0
optional-deps 108 0
commsagent-slack 6 0
browser-use 3 0
3.14 84 0
retrievechat-pgvector 6 0
windows-latest 94 1
retrievechat-mongodb 6 0
3.13 35 2
core-without-llm 15 0
3.11 46 0
graph-rag-falkor-db 6 0
3.12 32 0
macos-latest 97 1
interop-crewai 6 0
commsagent-discord 6 0
interop 6 0
retrievechat 10 0
twilio 6 0
jupyter-executor 6 0
docs 6 0
commsagent-telegram 6 0
interop-langchain 6 0
crawl4ai 4 0
websockets 6 0
interop-pydantic-ai 6 0
google-api 6 0
wikipedia-api 6 0
agent-eval 1 0
gpt-assistant-agent 3 0
teachable 3 0
llama-index-agent 3 0
mcp 5 0
retrievechat-couchbase 2 0
retrievechat-qdrant 6 0
long-context 3 0
websurfer 14 0
lmm 3 0
bedrock 7 0
swarm 13 0
together 13 0
cerebras 12 0
cohere 15 0
mistral 12 0
groq 13 0
anthropic 11 0
gemini 14 0
ollama 14 0
beta 7 2
Files with missing lines Coverage Δ
autogen/beta/streams/redis/serializer.py 0.00% <0.00%> (-61.77%) ⬇️

... and 393 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@amabito
Copy link
Copy Markdown
Contributor Author

amabito commented Apr 21, 2026

CI is red on test/beta/tools/search/test_resolve.py:19 -- empty class body, import right after. Introduced in #2701, unrelated to this PR. Breaks ast check, ruff, and pytest collection. Rebasing pulls the same broken file, so nothing to do from this side until upstream patches it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

beta documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant