
[Feature]: Token-efficient serialization to reduce agent observability overhead and inference costs #1338

@makroumi


💡 Feature Description and Proposed Solution

AgentOps captures and serializes agent events, LLM calls, and tool results as JSON by default.
At scale this creates compounding overhead that affects both observability costs and the agent pipelines being observed.

The core problem:
~44% of tokens in typical agent payloads are pure JSON syntax overhead before any reasoning begins.
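A rough way to sanity-check a figure like this on your own payloads is sketched below. It is not the ULMEN benchmark and it counts characters rather than tokens: everything in the compact JSON encoding that is not a leaf value (braces, brackets, separators, and key names) is treated as overhead. The function names and the sample events are hypothetical.

```python
import json


def leaf_values(node):
    """Yield every scalar value nested inside dicts and lists."""
    if isinstance(node, dict):
        for value in node.values():
            yield from leaf_values(value)
    elif isinstance(node, list):
        for item in node:
            yield from leaf_values(item)
    else:
        yield node


def json_overhead_share(payload) -> float:
    """Share of characters in compact JSON that are keys or punctuation.

    Assumption: character counts stand in for token counts, which is only
    a proxy; a real measurement would use the target model's tokenizer.
    """
    encoded = json.dumps(payload, separators=(",", ":"))
    value_chars = sum(len(json.dumps(value)) for value in leaf_values(payload))
    return 1 - value_chars / len(encoded)


# Hypothetical agent events, for illustration only.
events = [
    {"step": 1, "tool": "search", "status": "ok"},
    {"step": 2, "tool": "fetch", "status": "ok"},
]
print(f"{json_overhead_share(events):.0%} of characters are keys and punctuation")
```

Payloads with many short values and repeated keys will score high on this measure; payloads dominated by long text fields will score low.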

For AgentOps specifically this matters in two ways:

  1. OBSERVABILITY COST
    Every event AgentOps captures is serialized and transmitted as JSON. At high agent volumes this creates significant egress overhead. A wire format that is 4.1x smaller cuts egress volume, and with it egress cost, by roughly the same factor.

  2. PIPELINE COST
    Teams using AgentOps to monitor inference costs are often unaware that JSON serialization itself is a major cost driver. A 44% token reduction works out to about $59K saved per 10M loops.

Proposed solution: a pluggable serialization interface that allows ULMEN to be used as an opt-in replacement for JSON.

ULMEN benchmarks on NVIDIA Tesla T4:

[Image: ULMEN benchmark results; figures not reproduced in this text]

Beyond compression, ULMEN adds a Semantic Firewall that validates agent state transitions:

  • Rejects orphaned tool calls
  • Catches invalid step transitions
  • Validates enum states
  • Raises structured errors vs silent failures

This is particularly valuable for AgentOps because silent failures that pass through JSON undetected are exactly the kind of event AgentOps should be catching but currently cannot see.
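To make the idea concrete, here is a minimal sketch of this kind of check. It is not ULMEN's implementation or API. The event shape (`type`, `id`, `call_id`), the allowed type set, and the `TransitionError` class are all assumptions made for illustration, and "orphaned" is taken here to mean a tool result that references no open tool call.

```python
# Hypothetical set of valid event types (stands in for enum validation).
ALLOWED_TYPES = {"llm_call", "tool_call", "tool_result"}


class TransitionError(Exception):
    """Structured error carrying the offending event index and a reason."""

    def __init__(self, index, reason):
        super().__init__(f"event {index}: {reason}")
        self.index = index
        self.reason = reason


def validate_events(events):
    """Raise TransitionError on the first invalid event; return None if all pass."""
    open_calls = set()  # ids of tool calls still waiting for a result
    for index, event in enumerate(events):
        kind = event.get("type")
        if kind not in ALLOWED_TYPES:
            raise TransitionError(index, f"unknown event type {kind!r}")
        if kind == "tool_call":
            open_calls.add(event["id"])
        elif kind == "tool_result":
            call_id = event.get("call_id")
            if call_id not in open_calls:
                raise TransitionError(index, "tool_result without a matching tool_call")
            open_calls.remove(call_id)


# Usage: the second event refers to a call that was never made.
try:
    validate_events([
        {"type": "llm_call"},
        {"type": "tool_result", "call_id": "t1"},
    ])
except TransitionError as err:
    print(err)
```

Plain JSON encoding would serialize the second event without complaint, which is the silent failure the proposal is concerned with.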

Proposed API:

Current:

    agentops.init(api_key="...")

Proposed:

    agentops.init(
        api_key="...",
        serializer="ulmen"  # opt-in
    )
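One possible shape for the pluggable interface behind that `serializer` argument is sketched below. None of these names exist in AgentOps today: `register_serializer`, `get_serializer`, and `init_sketch` are hypothetical, and no ULMEN call is shown because its encoding API is not described in this issue. JSON stays the default, so existing callers would be unaffected.

```python
import json
from typing import Any, Callable, Dict

# A serializer turns one captured event into bytes for the wire.
Serializer = Callable[[Any], bytes]

_SERIALIZERS: Dict[str, Serializer] = {}


def register_serializer(name: str, fn: Serializer) -> None:
    """Make a serializer selectable by name (hypothetical helper)."""
    _SERIALIZERS[name] = fn


def get_serializer(name: str = "json") -> Serializer:
    """Look up a serializer, failing loudly on an unknown name."""
    try:
        return _SERIALIZERS[name]
    except KeyError:
        known = ", ".join(sorted(_SERIALIZERS))
        raise ValueError(f"unknown serializer {name!r}; registered: {known}") from None


def _encode_json(event: Any) -> bytes:
    return json.dumps(event, separators=(",", ":")).encode("utf-8")


# JSON remains the default; an opt-in "ulmen" entry would be registered
# by a small adapter around whatever encode function that library exposes.
register_serializer("json", _encode_json)


def init_sketch(api_key: str, serializer: str = "json") -> Serializer:
    """Stand-in for agentops.init that only resolves the serializer."""
    return get_serializer(serializer)


encode = init_sketch(api_key="...")
print(encode({"type": "tool_call", "id": "t1"}))
```

Resolving the name at init time means a typo or a missing optional dependency surfaces immediately rather than on the first captured event.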

ULMEN is:

  • Drop-in Python/Rust library
  • No schema compilation required
  • Pure Python fallback if Rust unavailable
  • BSL license, free under $10M revenue
  • 1,364 tests, 100% statement coverage

Reproducible benchmark notebook:
github.com/makroumi/ulmen

Happy to submit a PR implementing the serializer interface once the maintainers confirm their preferred integration approach.

🤔 Related Problem

Teams using AgentOps to monitor and reduce inference costs are often surprised to discover that JSON serialization itself is one of their largest hidden cost drivers.

The irony: the observability tool that helps teams track spending is itself contributing to the overhead through JSON serialization of every captured event.

At 10M agent loops monitored through AgentOps:

  • JSON overhead: ~$59K in wasted token spend
  • AgentOps egress: 4.1x larger than necessary

Pluggable ULMEN serialization would address both problems at once.

🤝 Contribution

  • Yes, I'd be happy to submit a pull request with these changes.
  • I need some guidance on how to contribute.
  • I'd prefer the AgentOps team to handle this update.

Labels: enhancement (New feature or request)