agentcostin/agentcost

🧮 AgentCost

Track, control, and optimize your AI spending.

PyPI npm Docker Hub CI License Discord

Watch the 2-min demo → | Live demo →


AI costs are invisible, unpredictable, and uncontrolled. Teams deploy agents across OpenAI, Anthropic, Google, and open-source models with no idea what they're actually spending — or whether cheaper models would work just as well. AgentCost fixes that with a vendored pricing database of 2,610+ models from 40+ providers, automatic cost-tier classification, and intelligent model routing.

Quickstart

pip install agentcostin

from agentcost.sdk import trace
from openai import OpenAI

client = trace(OpenAI(), project="my-app")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Every call is now tracked. Open the dashboard:
# agentcost dashboard
# → http://localhost:8500

That's it. One line wraps your client, and every LLM call is tracked with model, tokens, cost, latency, and status.
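
To make the tracked fields concrete, here is a minimal sketch of how a per-call cost can be derived from token counts and per-million-token prices. The `TraceRecord` shape, the `call_cost` helper, and the prices in `PRICES_PER_MILLION` are illustrative assumptions, not AgentCost's internal schema or its vendored pricing data.

```python
from dataclasses import dataclass

# Hypothetical per-million-token prices in USD (assumption for illustration).
PRICES_PER_MILLION = {
    "example-large": {"input": 2.50, "output": 10.00},
    "example-small": {"input": 0.15, "output": 0.60},
}


@dataclass
class TraceRecord:
    """One tracked call: the fields named in the text above."""
    model: str
    input_tokens: int
    output_tokens: int
    latency_ms: float
    status: str
    cost: float


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD: tokens times the per-million price, for input and output."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def make_trace(model, input_tokens, output_tokens, latency_ms, status="success"):
    """Build a trace record with the cost filled in from the price table."""
    cost = call_cost(model, input_tokens, output_tokens)
    return TraceRecord(model, input_tokens, output_tokens, latency_ms, status, cost)


record = make_trace("example-large", input_tokens=150, output_tokens=80, latency_ms=450.0)
print(record)
```

Output tokens are usually priced higher than input tokens, which is why the two counts are kept separate rather than summed.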

Dashboard

# Launch the dashboard (serves http://localhost:8500)
agentcost dashboard

# In a second terminal, seed 14 days of demo data
curl -X POST http://localhost:8500/api/seed -H "Content-Type: application/json" -d '{"days": 14}'

The dashboard gives you nine intelligence views:

| View | What it shows |
|------|---------------|
| Overview | Total spend, call volume, error rate, cost-over-time charts |
| Cost Breakdown | Spend by model, project, and provider with trend analysis |
| Forecasting | Predicted costs for next 7/14/30 days, budget exhaustion alerts |
| Optimizer | Model downgrade recommendations with estimated savings |
| Analytics | Token efficiency, top spenders, chargeback reports |
| Estimator | Pre-call cost estimation across 2,610+ models |
| Models | Search/filter all models by provider, tier, cost range, context window |
| Prompts | Version, deploy, and track cost of system prompts per version |
| Feedback | User thumbs up/down on traces, quality per model and prompt version |
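
As a rough illustration of what the Forecasting view computes, the sketch below projects spend over 7/14/30 days from the mean of recent daily costs and estimates when a budget would run out. This is a deliberately simple average-burn-rate model chosen for illustration; the source does not say which forecasting method AgentCost uses, and the function names are hypothetical.

```python
from statistics import mean


def forecast_spend(daily_costs, horizon_days):
    """Project total spend over the horizon from the mean recent daily cost."""
    if not daily_costs:
        return 0.0
    return mean(daily_costs) * horizon_days


def days_until_budget_exhausted(daily_costs, budget, spent_so_far):
    """Days left at the recent average burn rate; None if nothing is being spent."""
    remaining = budget - spent_so_far
    if remaining <= 0:
        return 0.0
    rate = mean(daily_costs) if daily_costs else 0.0
    if rate <= 0:
        return None
    return remaining / rate


history = [10.0, 12.0, 14.0, 12.0]  # made-up daily costs in USD
for horizon in (7, 14, 30):
    print(horizon, forecast_spend(history, horizon))
print(days_until_budget_exhausted(history, budget=1000.0, spent_so_far=400.0))
```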

Framework Support

AgentCost integrates with the frameworks you already use:

# LangChain
from agentcost.sdk.integrations import langchain_callback
chain.invoke(input, config={"callbacks": [langchain_callback("my-project")]})

# CrewAI
from agentcost.sdk.integrations import crewai_callback
crew = Crew(agents=[...], callbacks=[crewai_callback("my-project")])

# AutoGen
from agentcost.sdk.integrations import autogen_callback
agent = AssistantAgent("assistant", llm_config={..., "callbacks": [autogen_callback("my-project")]})

# LlamaIndex
from agentcost.sdk.integrations import llamaindex_callback
service_context = ServiceContext.from_defaults(callback_manager=llamaindex_callback("my-project"))

Prompt Management

Store, version, deploy, and track the cost of your system prompts — with automatic cost analytics per version.

from agentcost.prompts import get_prompt_service
from agentcost.sdk import get_prompt, trace
from openai import OpenAI

# Create and version prompts
svc = get_prompt_service()
svc.create_prompt("support-bot", content="You are a helpful agent for {{product}}.")
svc.create_version("support-bot", content="You are a concise agent for {{product}}. Be brief.")
svc.deploy("support-bot", version=2, environment="production")

# Use in your app — prompt version is tagged on every trace
prompt = get_prompt("support-bot", environment="production", variables={"product": "AgentCost"})

client = trace(OpenAI(), project="support",
               prompt_id=prompt["prompt_id"], prompt_version=prompt["version"])
response = client.chat.completions.create(
    model=prompt.get("model") or "gpt-4.1",
    messages=[{"role": "system", "content": prompt["content"]},
              {"role": "user", "content": "How do I set a budget?"}]
)

Every prompt change creates an immutable version. Deploy V2 to staging while V1 runs in production. Compare cost per call between versions to answer "did the new prompt cost more or less?" See the Prompt Management Guide.
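
The version comparison boils down to grouping traces by prompt version and averaging their cost. A minimal sketch, assuming each trace is available as a dict with `prompt_version` and `cost` keys (a hypothetical shape, with made-up numbers, not the actual trace schema):

```python
from collections import defaultdict

# Made-up trace records; only the two keys used below are assumed.
traces = [
    {"prompt_id": "support-bot", "prompt_version": 1, "cost": 0.004},
    {"prompt_id": "support-bot", "prompt_version": 1, "cost": 0.006},
    {"prompt_id": "support-bot", "prompt_version": 2, "cost": 0.003},
    {"prompt_id": "support-bot", "prompt_version": 2, "cost": 0.003},
]


def cost_per_call_by_version(records):
    """Return {version: mean cost per call} for the given trace records."""
    totals = defaultdict(lambda: [0.0, 0])  # version -> [total cost, call count]
    for rec in records:
        entry = totals[rec["prompt_version"]]
        entry[0] += rec["cost"]
        entry[1] += 1
    return {version: total / count for version, (total, count) in totals.items()}


averages = cost_per_call_by_version(traces)
for version in sorted(averages):
    print(f"v{version}: ${averages[version]:.4f} per call")
```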

CLI

# Benchmark models on real professional tasks
agentcost benchmark --model gpt-4o --tasks 10

# Compare models head-to-head
agentcost compare --models "gpt-4o,gpt-4o-mini,claude-sonnet-4-6" --tasks 5

# View the leaderboard
agentcost leaderboard

# Check traces and budgets
agentcost traces --project my-app --summary
agentcost budget --project my-app --daily 50 --monthly 1000

# Manage plugins
agentcost plugin list
agentcost plugin install agentcost-slack-alerts
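
The daily and monthly limits passed to `agentcost budget` imply a check of current spend against both ceilings. The sketch below shows one way such a check could look; the `Budget` class, `budget_status` function, and status strings are hypothetical and are not part of the AgentCost API.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    daily_limit: float    # USD per day, e.g. --daily 50
    monthly_limit: float  # USD per month, e.g. --monthly 1000


def budget_status(budget, spent_today, spent_this_month, next_call_cost=0.0):
    """Classify spend, optionally including the estimated cost of the next call."""
    if spent_today + next_call_cost > budget.daily_limit:
        return "daily_exceeded"
    if spent_this_month + next_call_cost > budget.monthly_limit:
        return "monthly_exceeded"
    return "ok"


budget = Budget(daily_limit=50.0, monthly_limit=1000.0)
print(budget_status(budget, spent_today=10.0, spent_this_month=100.0, next_call_cost=0.01))
print(budget_status(budget, spent_today=49.99, spent_this_month=500.0, next_call_cost=0.02))
```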

Architecture

                    ┌─────────────────────────────────────────────┐
                    │              Your Application               │
                    └─────┬──────────┬──────────┬───────┬────────┘
                          │          │          │       │
                    ┌─────▼──┐ ┌─────▼──┐ ┌────▼───┐ ┌─▼──────┐
                    │ Python │ │ Node.js│ │ Proxy  │ │  OTel  │
                    │  SDK   │ │  SDK   │ │Gateway │ │Ingest  │
                    └───┬────┘ └───┬────┘ └───┬────┘ └───┬────┘
                        │          │          │          │
                        └──────────┼──────────┼──────────┘
                                   │          │
                    ┌──────────────▼──────────▼───────┐
                    │     AgentCost API Server         │
                    │         (FastAPI)                │
                    ├──────────────────────────────────┤
                    │  Traces │ Prompts  │ Feedback    │
                    │  Budget │ Forecast │ Optimizer   │
                    ├──────────────────────────────────┤
                    │   SQLite / PostgreSQL             │
                    └──────────────┬───────────────────┘
                                   │
              ┌────────────────────┼────────────────────┐
              │                    │                    │
        ┌─────▼─────┐     ┌──────▼──────┐     ┌──────▼──────┐
        │ Dashboard  │     │   OTel /    │     │  Prometheus │
        │  (React)   │     │   Grafana   │     │   /metrics  │
        └────────────┘     └─────────────┘     └─────────────┘

AI Gateway (with Semantic Caching)

Zero-instrumentation cost tracking. Point your agents at the gateway instead of the provider:

from openai import OpenAI

# Just change the base URL — zero code changes to your agent
client = OpenAI(
    base_url="http://localhost:8200/v1",
    api_key="ac_myproject_xxx",
)

# Every call is tracked, policy-checked, and cached automatically
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0,  # deterministic calls are cached
)

The gateway automatically caches responses to deterministic requests (temperature ≤ 0.2) and tracks the resulting cost savings, which are visible in the dashboard. Cache stats include hit rate, total cost saved, and per-project and per-model breakdowns.
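
To illustrate the exact-match half of this behavior, here is a minimal in-memory sketch: requests at or below the temperature threshold are keyed by a hash of their canonical JSON, and repeated requests are served from the cache. The class and function names are hypothetical, the gateway's real key derivation is not described in this README, and semantic (similarity-based) matching is not covered here.

```python
import hashlib
import json

CACHEABLE_MAX_TEMPERATURE = 0.2  # threshold stated in the text above


def is_cacheable(request):
    """Only near-deterministic requests are cached; a missing temperature counts as 1.0."""
    return request.get("temperature", 1.0) <= CACHEABLE_MAX_TEMPERATURE


def cache_key(request):
    """Hash the fields that determine the response, in a canonical JSON form."""
    canonical = json.dumps(
        {
            "model": request["model"],
            "messages": request["messages"],
            "temperature": request.get("temperature"),
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ExactMatchCache:
    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get_or_call(self, request, call):
        """Return a cached response if present; otherwise call the provider and store it."""
        if not is_cacheable(request):
            return call(request)
        key = cache_key(request)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        response = call(request)
        self._store[key] = response
        return response

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


calls = []


def fake_provider(request):
    """Stand-in for a real provider call; records each invocation."""
    calls.append(request)
    return {"content": "Hello!"}


cache = ExactMatchCache()
req = {"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}], "temperature": 0}
cache.get_or_call(req, fake_provider)
cache.get_or_call(req, fake_provider)  # served from the cache
print(len(calls), cache.hit_rate)
```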

# Start the gateway
python -m agentcost.gateway --port 8200

# Check cache performance
curl http://localhost:8200/v1/gateway/cache/stats

| Feature | Description |
|---------|-------------|
| Response Caching | Exact-match + semantic caching: similar prompts hit the cache, not just identical ones |
| Cost Savings Tracking | Per-request cost saved, aggregated by project and model |
| Policy Enforcement | Pre-call policy checks before forwarding to providers |
| Provider Failover | Automatic routing across OpenAI, Anthropic, Ollama |
| Rate Limiting | Per-project RPM limits with token-bucket algorithm |
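
For readers unfamiliar with the token-bucket algorithm named in the table, the following is a minimal sketch of a per-project requests-per-minute limiter. `TokenBucket`, `allow_request`, and the in-memory `buckets` dict are illustrative names; the gateway's actual implementation and storage are not shown in this README.

```python
import time


class TokenBucket:
    """Bucket holding up to `rpm` tokens, refilled continuously at rpm/60 per second."""

    def __init__(self, rpm, clock=time.monotonic):
        self.capacity = float(rpm)
        self.refill_per_second = rpm / 60.0
        self.tokens = self.capacity
        self._clock = clock
        self._last = clock()

    def allow(self):
        """Consume one token if available; return whether the request may proceed."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


buckets = {}  # project name -> TokenBucket


def allow_request(project, rpm=60):
    """Look up (or create) the project's bucket and try to take a token."""
    bucket = buckets.setdefault(project, TokenBucket(rpm))
    return bucket.allow()


print(allow_request("my-app"))
```

A burst of up to `rpm` requests is admitted immediately, after which requests are admitted at the steady refill rate.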

Exporters & OTel Collector

Send cost data out to your observability stack, or receive spans in from your existing OTel instrumentation:

# Export to OpenTelemetry (Datadog, Jaeger, Grafana Tempo)
from agentcost.otel import install_otel_exporter
install_otel_exporter(endpoint="http://localhost:4317")

# Prometheus (Grafana, AlertManager)
# Enabled automatically at /metrics when server is running

Already using Traceloop, OpenLLMetry, or OpenInference? Just point your OTel exporter at AgentCost — no re-instrumentation needed:

# Zero code changes: set the endpoint to your AgentCost server
# (8100 in the Docker setup below; AGENTCOST_PORT defaults to 8500 otherwise)
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:8100

# AgentCost accepts OTLP/HTTP spans on POST /v1/traces
# LLM spans are auto-detected, cost is auto-calculated from 2,610+ model pricing
# Non-LLM spans (HTTP, DB, etc.) are silently skipped
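
A sketch of the span filtering described above, assuming LLM spans are recognized by the OpenTelemetry GenAI semantic-convention attributes `gen_ai.request.model`, `gen_ai.usage.input_tokens`, and `gen_ai.usage.output_tokens`. Which attribute keys AgentCost actually inspects is an assumption here, and the price table is made up for illustration.

```python
# Hypothetical per-million-token prices in USD (assumption for illustration).
PRICES_PER_MILLION = {"example-large": {"input": 2.50, "output": 10.00}}


def is_llm_span(attributes):
    """Treat a span as an LLM call if it carries a GenAI model attribute."""
    return "gen_ai.request.model" in attributes


def span_cost(attributes):
    """Compute cost from token-usage attributes; None if the model is not priced."""
    price = PRICES_PER_MILLION.get(attributes["gen_ai.request.model"])
    if price is None:
        return None
    input_tokens = attributes.get("gen_ai.usage.input_tokens", 0)
    output_tokens = attributes.get("gen_ai.usage.output_tokens", 0)
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


def ingest(spans):
    """Keep only LLM spans and attach a computed cost; other spans are skipped."""
    results = []
    for span in spans:
        attrs = span["attributes"]
        if not is_llm_span(attrs):
            continue
        results.append({"name": span["name"], "cost": span_cost(attrs)})
    return results


spans = [
    {"name": "chat", "attributes": {
        "gen_ai.request.model": "example-large",
        "gen_ai.usage.input_tokens": 150,
        "gen_ai.usage.output_tokens": 80,
    }},
    {"name": "GET /users", "attributes": {"http.request.method": "GET"}},
]
ingested = ingest(spans)
print(ingested)
```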

MCP Server (Model Context Protocol)

AgentCost runs as an MCP server — Claude Desktop, Cursor, VS Code, and any MCP-compatible agent can query your cost data, check budgets, get optimization recommendations, and manage prompts directly.

Add this to your Claude Desktop or Cursor config (JSON does not allow comments, so keep the file comment-free):

{
  "mcpServers": {
    "agentcost": {
      "command": "python",
      "args": ["-m", "agentcost.mcp"]
    }
  }
}

14 tools available: cost summary, cost by model/project, search traces, check/set budgets, optimization recommendations, cost estimation, feedback, prompt resolution, and more. See the MCP Server Guide.

Plugin System

Extend AgentCost with plugins:

# Install community plugins
agentcost plugin install agentcost-slack-alerts
agentcost plugin install agentcost-s3-archive

# Create your own plugin
agentcost plugin create my-plugin

Plugins can export data, add alerting, create custom views, and more. See the Plugin Development Guide.

TypeScript SDK

npm install @agentcost/sdk

import { AgentCost } from "@agentcost/sdk";

const ac = new AgentCost({
    project: "my-app",
    apiUrl: "http://localhost:8500",
});

// Trace any LLM call
const traced = await ac.trace({
    model: "gpt-4o",
    inputTokens: 150,
    outputTokens: 80,
    cost: 0.0035,
    latencyMs: 450,
});

Self-Hosting

Docker Hub (Quickest)

# One command — pulls from Docker Hub, runs with SQLite, dashboard on :8100
docker run -d -p 8100:8100 -v agentcost_data:/data agentcost/agentcost:latest

# Seed demo data
curl -X POST http://localhost:8100/api/seed -H "Content-Type: application/json" -d '{"days": 14}'

# → Open http://localhost:8100

Live Demo: See AgentCost in action at demo.agentcost.in — no install required.

Community Edition (From Source)

git clone https://github.com/agentcostin/agentcost.git
cd agentcost
docker compose -f docker-compose.dev.yml up
# → http://localhost:8100

Enterprise Edition

# Full stack: PostgreSQL + SSO + API
docker compose up -d

# Configure SSO
export AGENTCOST_EDITION=enterprise
export AGENTCOST_AUTH_ENABLED=true
export KEYCLOAK_URL=http://localhost:8180

Enterprise Features

For teams and organizations that need governance:

| Feature | Description |
|---------|-------------|
| SSO/SAML | Any OIDC/SAML provider (Okta, Auth0, Azure AD, Keycloak) |
| Organizations | Multi-tenant team management with roles |
| Budget Enforcement | Cost centers, allocations, pre-call validation |
| Policy Engine | JSON rules: block models, cap costs, require approval |
| Approval Workflows | Human-in-the-loop for policy exceptions |
| Notifications | Slack, email, webhook, PagerDuty alerts |
| Agent Scorecards | Monthly agent grading (A–F) with recommendations |
| Audit Log | Hash-chained compliance trail |
| Anomaly Detection | ML-based cost/latency spike detection |
| AI Gateway | Transparent LLM proxy with policy enforcement |
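
To show why a hash-chained audit log is tamper-evident, here is a minimal sketch of the general technique: each entry stores the hash of the previous entry, so changing any past entry invalidates every hash after it. The entry layout and function names are illustrative assumptions, not the enterprise edition's actual format.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, payload):
    """Hash the previous entry's hash together with this entry's canonical JSON."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log, payload):
    """Append a payload, linking it to the hash of the last entry."""
    prev = log[-1]["hash"] if log else GENESIS_HASH
    log.append({"payload": payload, "prev_hash": prev, "hash": entry_hash(prev, payload)})


def verify(log):
    """Recompute every hash in order; any edited or reordered entry breaks the chain."""
    prev = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


audit_log = []
append_entry(audit_log, {"actor": "alice", "action": "set_budget", "monthly": 1000})
append_entry(audit_log, {"actor": "bob", "action": "approve_exception"})
print(verify(audit_log))
```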

Enterprise features are source-available under BSL 1.1. See enterprise/LICENSE.

Contact us or read the docs

Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| AGENTCOST_PORT | 8500 | Server port |
| AGENTCOST_EDITION | auto | community, enterprise, or auto |
| AGENTCOST_AUTH_ENABLED | false | Enable SSO (enterprise) |
| AGENTCOST_DB_URL | SQLite | PostgreSQL connection string |
| OIDC_ISSUER_URL | | OIDC provider URL (e.g., https://auth.example.com/realms/app) |
| OIDC_CLIENT_ID | agentcost-api | OIDC client ID |
| OIDC_CLIENT_SECRET | | OIDC client secret |
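
The sketch below shows how these variables and defaults could be read from the environment. The variable names and defaults come from the table; the `Settings` class, the `load_settings` function, and the set of values accepted as "true" are assumptions for illustration and may differ from how the server parses its configuration.

```python
import os
from dataclasses import dataclass
from typing import Optional

VALID_EDITIONS = {"community", "enterprise", "auto"}


@dataclass(frozen=True)
class Settings:
    port: int
    edition: str
    auth_enabled: bool
    db_url: Optional[str]  # None means: fall back to SQLite
    oidc_issuer_url: Optional[str]
    oidc_client_id: str
    oidc_client_secret: Optional[str]


def load_settings(env=None):
    """Read settings from a mapping (os.environ by default), applying the documented defaults."""
    env = os.environ if env is None else env
    edition = env.get("AGENTCOST_EDITION", "auto")
    if edition not in VALID_EDITIONS:
        raise ValueError(f"AGENTCOST_EDITION must be one of {sorted(VALID_EDITIONS)}")
    # Assumption: these strings are treated as true; anything else is false.
    auth_enabled = env.get("AGENTCOST_AUTH_ENABLED", "false").strip().lower() in {"1", "true", "yes"}
    return Settings(
        port=int(env.get("AGENTCOST_PORT", "8500")),
        edition=edition,
        auth_enabled=auth_enabled,
        db_url=env.get("AGENTCOST_DB_URL"),
        oidc_issuer_url=env.get("OIDC_ISSUER_URL"),
        oidc_client_id=env.get("OIDC_CLIENT_ID", "agentcost-api"),
        oidc_client_secret=env.get("OIDC_CLIENT_SECRET"),
    )


print(load_settings({"AGENTCOST_PORT": "8100", "AGENTCOST_EDITION": "enterprise"}))
```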

Contributing

We welcome contributions! See CONTRIBUTING.md for setup instructions.

git clone https://github.com/agentcostin/agentcost.git
cd agentcost
pip install -e ".[dev,server]"
pytest tests/ -v

License

  • Core (agentcost SDK, dashboard, CLI, forecasting, optimizer, analytics, estimator, plugins): MIT
  • Enterprise (auth, org, budgets, policies, notifications, anomaly, gateway): BSL 1.1 — converts to Apache 2.0 after 3 years

Documentation · Issues · Discord · Twitter
