Track, control, and optimize your AI spending.
AI costs are invisible, unpredictable, and uncontrolled. Teams deploy agents across OpenAI, Anthropic, Google, and open-source models with no idea what they're actually spending — or whether cheaper models would work just as well. AgentCost fixes that with a vendored pricing database of 2,610+ models from 40+ providers, automatic cost-tier classification, and intelligent model routing.
pip install agentcostin

from agentcost.sdk import trace
from openai import OpenAI
client = trace(OpenAI(), project="my-app")
response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Hello!"}]
)
# Every call is now tracked. Open the dashboard:
# agentcost dashboard
# → http://localhost:8500

That's it. One line wraps your client, and every LLM call is tracked with model, tokens, cost, latency, and status.
# Seed demo data and launch
curl -X POST http://localhost:8500/api/seed -H "Content-Type: application/json" -d '{"days": 14}'
agentcost dashboard

The dashboard gives you nine intelligence views:
| View | What it shows |
|---|---|
| Overview | Total spend, call volume, error rate, cost-over-time charts |
| Cost Breakdown | Spend by model, project, and provider with trend analysis |
| Forecasting | Predicted costs for next 7/14/30 days, budget exhaustion alerts |
| Optimizer | Model downgrade recommendations with estimated savings |
| Analytics | Token efficiency, top spenders, chargeback reports |
| Estimator | Pre-call cost estimation across 2,610+ models |
| Models | Search/filter all models by provider, tier, cost range, context window |
| Prompts | Version, deploy, and track cost of system prompts per version |
| Feedback | User thumbs up/down on traces, quality per model and prompt version |
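Under the hood, pre-call estimation (the Estimator view) is simple arithmetic over per-token pricing. A minimal sketch of the idea, using illustrative per-million-token prices rather than AgentCost's actual pricing database:

```python
# Illustrative per-million-token prices in USD; the real numbers come from
# AgentCost's vendored pricing database. These values are examples only.
PRICING = {
    "gpt-4o":      {"input": 2.50, "output": 10.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one call: tokens times price-per-token."""
    p = PRICING[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Compare the same workload across models before making any calls
for model in PRICING:
    print(model, round(estimate_cost(model, 1500, 500), 6))
```

The same arithmetic, applied across a large pricing table, is what makes "would a cheaper model do?" answerable before you spend anything.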
AgentCost integrates with the frameworks you already use:
# LangChain
from agentcost.sdk.integrations import langchain_callback
chain.invoke(input, config={"callbacks": [langchain_callback("my-project")]})
# CrewAI
from agentcost.sdk.integrations import crewai_callback
crew = Crew(agents=[...], callbacks=[crewai_callback("my-project")])
# AutoGen
from agentcost.sdk.integrations import autogen_callback
agent = AssistantAgent("assistant", llm_config={..., "callbacks": [autogen_callback("my-project")]})
# LlamaIndex
from agentcost.sdk.integrations import llamaindex_callback
service_context = ServiceContext.from_defaults(callback_manager=llamaindex_callback("my-project"))

Store, version, deploy, and track the cost of your system prompts — with automatic cost analytics per version.
from agentcost.sdk import get_prompt, trace
from openai import OpenAI
# Create and version prompts
from agentcost.prompts import get_prompt_service
svc = get_prompt_service()
svc.create_prompt("support-bot", content="You are a helpful agent for {{product}}.")
svc.create_version("support-bot", content="You are a concise agent for {{product}}. Be brief.")
svc.deploy("support-bot", version=2, environment="production")
# Use in your app — prompt version is tagged on every trace
prompt = get_prompt("support-bot", environment="production", variables={"product": "AgentCost"})
client = trace(OpenAI(), project="support",
prompt_id=prompt["prompt_id"], prompt_version=prompt["version"])
response = client.chat.completions.create(
model=prompt.get("model") or "gpt-4.1",
messages=[{"role": "system", "content": prompt["content"]},
{"role": "user", "content": "How do I set a budget?"}]
)

Every prompt change creates an immutable version. Deploy V2 to staging while V1 runs in production. Compare cost per call between versions to answer "did the new prompt cost more or less?" See the Prompt Management Guide.
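The per-version comparison is conceptually just a grouped average over traces. A toy sketch of the idea (the trace field names here are assumptions for illustration, not AgentCost's actual schema):

```python
from collections import defaultdict

# Illustrative trace records; "prompt_version" and "cost" are assumed field
# names for this sketch, not necessarily the real trace schema.
traces = [
    {"prompt_version": 1, "cost": 0.0042},
    {"prompt_version": 1, "cost": 0.0038},
    {"prompt_version": 2, "cost": 0.0021},
    {"prompt_version": 2, "cost": 0.0019},
]

def avg_cost_by_version(traces):
    """Average cost per call, grouped by prompt version."""
    totals = defaultdict(lambda: [0.0, 0])
    for t in traces:
        totals[t["prompt_version"]][0] += t["cost"]
        totals[t["prompt_version"]][1] += 1
    return {v: s / n for v, (s, n) in totals.items()}

print(avg_cost_by_version(traces))
```

Here the V2 "be brief" prompt shows a lower average cost per call, which is exactly the signal the Prompts view surfaces automatically.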
# Benchmark models on real professional tasks
agentcost benchmark --model gpt-4o --tasks 10
# Compare models head-to-head
agentcost compare --models "gpt-4o,gpt-4o-mini,claude-sonnet-4-6" --tasks 5
# View the leaderboard
agentcost leaderboard
# Check traces and budgets
agentcost traces --project my-app --summary
agentcost budget --project my-app --daily 50 --monthly 1000
# Manage plugins
agentcost plugin list
agentcost plugin install agentcost-slack-alerts

┌─────────────────────────────────────────────┐
│             Your Application                │
└─────┬──────────┬─────────┬───────┬────────┘
      │          │         │       │
┌─────▼──┐ ┌─────▼──┐ ┌────▼───┐ ┌─▼──────┐
│ Python │ │ Node.js│ │ Proxy  │ │  OTel  │
│  SDK   │ │  SDK   │ │Gateway │ │ Ingest │
└───┬────┘ └───┬────┘ └───┬────┘ └───┬────┘
    │          │          │          │
    └──────────┼──────────┼──────────┘
               │          │
┌──────────────▼──────────▼───────┐
│      AgentCost API Server       │
│            (FastAPI)            │
├─────────────────────────────────┤
│  Traces  │ Prompts  │ Feedback  │
│  Budget  │ Forecast │ Optimizer │
├─────────────────────────────────┤
│       SQLite / PostgreSQL       │
└────────────────────┬────────────┘
                     │
      ┌──────────────┼───────────────┐
      │              │               │
┌─────▼─────┐ ┌──────▼──────┐ ┌──────▼──────┐
│ Dashboard │ │   OTel /    │ │ Prometheus  │
│  (React)  │ │   Grafana   │ │  /metrics   │
└───────────┘ └─────────────┘ └─────────────┘
Zero-instrumentation cost tracking. Point your agents at the gateway instead of the provider:
from openai import OpenAI
# Just change the base URL — zero code changes to your agent
client = OpenAI(
base_url="http://localhost:8200/v1",
api_key="ac_myproject_xxx",
)
# Every call is tracked, policy-checked, and cached automatically
response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Hello!"}],
temperature=0, # deterministic calls are cached
)

The gateway automatically caches responses for deterministic requests (temperature ≤ 0.2), with cost-savings tracking visible in the dashboard. Cache stats include hit rate, total cost saved, and per-project and per-model breakdowns.
# Start the gateway
python -m agentcost.gateway --port 8200
# Check cache performance
curl http://localhost:8200/v1/gateway/cache/stats

| Feature | Description |
|---|---|
| Response Caching | Exact-match + semantic caching — similar prompts hit the cache, not just identical ones |
| Cost Savings Tracking | Per-request cost saved, aggregated by project and model |
| Policy Enforcement | Pre-call policy checks before forwarding to providers |
| Provider Failover | Automatic routing across OpenAI, Anthropic, Ollama |
| Rate Limiting | Per-project RPM limits with token-bucket algorithm |
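Exact-match caching for deterministic requests can be pictured as hashing the semantic fields of the request. A simplified sketch of the technique (the key scheme and threshold handling here are illustrative, not the gateway's actual implementation):

```python
import hashlib
import json

DETERMINISTIC_TEMP = 0.2  # matches the documented caching threshold

def cache_key(model, messages, temperature=0.0):
    """Return a stable cache key for low-temperature requests, else None."""
    if temperature > DETERMINISTIC_TEMP:
        return None  # sampling is too random to cache safely
    # sort_keys makes the serialization canonical, so logically identical
    # requests always hash to the same key
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

k1 = cache_key("gpt-4o", [{"role": "user", "content": "Hello!"}], temperature=0)
k2 = cache_key("gpt-4o", [{"role": "user", "content": "Hello!"}], temperature=0)
assert k1 == k2          # identical requests hit the same cache entry
assert cache_key("gpt-4o", [], temperature=0.9) is None  # never cached
```

Semantic caching goes further by embedding the prompt and matching on similarity, so near-duplicate prompts can also hit the cache.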
Send cost data out to your observability stack, or receive spans in from your existing OTel instrumentation:
# Export to OpenTelemetry (Datadog, Jaeger, Grafana Tempo)
from agentcost.otel import install_otel_exporter
install_otel_exporter(endpoint="http://localhost:4317")
# Prometheus (Grafana, AlertManager)
# Enabled automatically at /metrics when the server is running

Already using Traceloop, OpenLLMetry, or OpenInference? Just point your OTel exporter at AgentCost — no re-instrumentation needed:
# Zero code changes — just set the endpoint
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:8100
# AgentCost accepts OTLP/HTTP spans on POST /v1/traces
# LLM spans are auto-detected; cost is auto-calculated from pricing for 2,610+ models
# Non-LLM spans (HTTP, DB, etc.) are silently skipped

AgentCost runs as an MCP server — Claude Desktop, Cursor, VS Code, and any MCP-compatible agent can query your cost data, check budgets, get optimization recommendations, and manage prompts directly.
// Claude Desktop or Cursor config
{
"mcpServers": {
"agentcost": {
"command": "python",
"args": ["-m", "agentcost.mcp"]
}
}
}

14 tools are available: cost summary, cost by model/project, trace search, budget checks and updates, optimization recommendations, cost estimation, feedback, prompt resolution, and more. See the MCP Server Guide.
Extend AgentCost with plugins:
# Install community plugins
agentcost plugin install agentcost-slack-alerts
agentcost plugin install agentcost-s3-archive
# Create your own plugin
agentcost plugin create my-plugin

Plugins can export data, add alerting, create custom views, and more. See the Plugin Development Guide.
npm install @agentcost/sdk

import { AgentCost } from "@agentcost/sdk";
const ac = new AgentCost({
project: "my-app",
apiUrl: "http://localhost:8500",
});
// Trace any LLM call
const traced = await ac.trace({
model: "gpt-4o",
inputTokens: 150,
outputTokens: 80,
cost: 0.0035,
latencyMs: 450,
});

# One command — pulls from Docker Hub, runs with SQLite, dashboard on :8100
docker run -d -p 8100:8100 -v agentcost_data:/data agentcost/agentcost:latest
# Seed demo data
curl -X POST http://localhost:8100/api/seed -H "Content-Type: application/json" -d '{"days": 14}'
# → Open http://localhost:8100

Live Demo: See AgentCost in action at demo.agentcost.in — no install required.
git clone https://github.com/agentcost/agentcost.git
cd agentcost
docker compose -f docker-compose.dev.yml up
# → http://localhost:8100

# Full stack: PostgreSQL + SSO + API
docker compose up -d
# Configure SSO
export AGENTCOST_EDITION=enterprise
export AGENTCOST_AUTH_ENABLED=true
export KEYCLOAK_URL=http://localhost:8180

For teams and organizations that need governance:
| Feature | Description |
|---|---|
| SSO/SAML | Any OIDC/SAML provider (Okta, Auth0, Azure AD, Keycloak) |
| Organizations | Multi-tenant team management with roles |
| Budget Enforcement | Cost centers, allocations, pre-call validation |
| Policy Engine | JSON rules: block models, cap costs, require approval |
| Approval Workflows | Human-in-the-loop for policy exceptions |
| Notifications | Slack, email, webhook, PagerDuty alerts |
| Agent Scorecards | Monthly agent grading (A–F) with recommendations |
| Audit Log | Hash-chained compliance trail |
| Anomaly Detection | ML-based cost/latency spike detection |
| AI Gateway | Transparent LLM proxy with policy enforcement |
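A hash-chained audit trail makes tampering detectable: each entry's hash covers the previous entry's hash, so altering any record invalidates every entry after it. A minimal sketch of the general technique (not AgentCost's actual log format):

```python
import hashlib
import json

def append_entry(log, event):
    """Append an event whose hash chains to the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    body = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    log.append({"event": event, "prev": prev_hash,
                "hash": hashlib.sha256(body.encode()).hexdigest()})
    return log

def verify(log):
    """Recompute every hash in order; any edited entry breaks the chain."""
    prev = "0" * 64
    for e in log:
        body = json.dumps({"event": e["event"], "prev": prev}, sort_keys=True)
        if e["prev"] != prev or hashlib.sha256(body.encode()).hexdigest() != e["hash"]:
            return False
        prev = e["hash"]
    return True

log = []
append_entry(log, "budget.updated")
append_entry(log, "model.blocked")
assert verify(log)
log[0]["event"] = "tampered"
assert not verify(log)  # any edit to history breaks the chain
```

Because each hash depends on all prior entries, an auditor only needs the latest hash to confirm the whole history is intact.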
Enterprise features are source-available under BSL 1.1. See enterprise/LICENSE.
→ Contact us or read the docs
| Variable | Default | Description |
|---|---|---|
| `AGENTCOST_PORT` | `8500` | Server port |
| `AGENTCOST_EDITION` | `auto` | `community`, `enterprise`, or `auto` |
| `AGENTCOST_AUTH_ENABLED` | `false` | Enable SSO (enterprise) |
| `AGENTCOST_DB_URL` | SQLite | PostgreSQL connection string |
| `OIDC_ISSUER_URL` | — | OIDC provider URL (e.g., https://auth.example.com/realms/app) |
| `OIDC_CLIENT_ID` | `agentcost-api` | OIDC client ID |
| `OIDC_CLIENT_SECRET` | — | OIDC client secret |
We welcome contributions! See CONTRIBUTING.md for setup instructions.
git clone https://github.com/agentcost/agentcost.git
cd agentcost
pip install -e ".[dev,server]"
pytest tests/ -v

- Core (agentcost SDK, dashboard, CLI, forecasting, optimizer, analytics, estimator, plugins): MIT
- Enterprise (auth, org, budgets, policies, notifications, anomaly, gateway): BSL 1.1 — converts to Apache 2.0 after 3 years
Documentation · Issues · Discord · Twitter