Merged
80 changes: 77 additions & 3 deletions README.md
@@ -368,13 +368,22 @@ Use cascadeflow with LangChain for intelligent model cascading with full LCEL, s

### Installation

**<img src=".github/assets/CF_ts_color.svg" width="18" height="18" alt="TypeScript" style="vertical-align: middle;"/> TypeScript**

```bash
npm install @cascadeflow/langchain @langchain/core @langchain/openai
```

**<img src=".github/assets/CF_python_color.svg" width="18" height="18" alt="Python" style="vertical-align: middle;"/> Python**

```bash
pip install "cascadeflow[langchain]"
```

### Quick Start

Drop-in replacement for any LangChain chat model:
<details open>
<summary><b><img src=".github/assets/CF_ts_color.svg" width="18" height="18" alt="TypeScript" style="vertical-align: middle;"/> TypeScript - Drop-in replacement for any LangChain chat model</b></summary>

```typescript
import { ChatOpenAI } from '@langchain/openai';
@@ -397,8 +406,66 @@ const result = await cascade.invoke('Explain quantum computing');
const chain = prompt.pipe(cascade).pipe(new StringOutputParser());
```

</details>

<details>
<summary><b><img src=".github/assets/CF_python_color.svg" width="18" height="18" alt="Python" style="vertical-align: middle;"/> Python - Drop-in replacement for any LangChain chat model</b></summary>

```python
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
from cascadeflow.integrations.langchain import CascadeFlow

cascade = CascadeFlow(
    drafter=ChatOpenAI(model="gpt-4o-mini"),            # $0.15/$0.60 per 1M tokens
    verifier=ChatAnthropic(model="claude-sonnet-4-5"),  # $3/$15 per 1M tokens
    quality_threshold=0.8,  # ~80% of queries stay with the drafter
)

# Use like any LangChain chat model
result = await cascade.ainvoke("Explain quantum computing")

# Optional: Enable LangSmith tracing (see https://smith.langchain.com)
# Set LANGSMITH_API_KEY, LANGSMITH_PROJECT, LANGSMITH_TRACING=true

# Or with LCEL chains
chain = prompt | cascade | StrOutputParser()
```

</details>
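
The pricing comments in the example above imply savings you can sanity-check with simple arithmetic. A minimal sketch, assuming illustrative per-query token counts (500 in / 500 out) and the ~80/20 routing split from `quality_threshold=0.8` (the drafter runs on every query; the verifier only on escalations):

```python
# Back-of-envelope cascade savings from the example pricing above.
DRAFTER_IN, DRAFTER_OUT = 0.15, 0.60     # $ per 1M tokens (gpt-4o-mini)
VERIFIER_IN, VERIFIER_OUT = 3.00, 15.00  # $ per 1M tokens (claude-sonnet-4-5)

def per_query_cost(in_rate: float, out_rate: float,
                   in_tok: int = 500, out_tok: int = 500) -> float:
    """Cost of one model call at the given $/1M-token rates."""
    return (in_tok * in_rate + out_tok * out_rate) / 1_000_000

drafter = per_query_cost(DRAFTER_IN, DRAFTER_OUT)     # runs on every query
verifier = per_query_cost(VERIFIER_IN, VERIFIER_OUT)  # runs on ~20% of queries
cascade_cost = drafter + 0.2 * verifier
baseline = verifier  # sending every query straight to the verifier model

print(f"cascade ${cascade_cost:.6f}/query vs baseline ${baseline:.6f}/query")
print(f"savings: {1 - cascade_cost / baseline:.0%}")
```

Under these assumptions the cascade spends roughly a quarter of the verifier-only baseline; actual savings depend on your routing rate and token mix.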

<details>
<summary><b>πŸ’‘ Optional: Cost Tracking with Callbacks (Python)</b></summary>

Track costs, tokens, and cascade decisions with LangChain-compatible callbacks:

```python
from cascadeflow.integrations.langchain.langchain_callbacks import get_cascade_callback

# Track costs similar to get_openai_callback()
with get_cascade_callback() as cb:
    response = await cascade.ainvoke("What is Python?")

print(f"Total cost: ${cb.total_cost:.6f}")
print(f"Drafter cost: ${cb.drafter_cost:.6f}")
print(f"Verifier cost: ${cb.verifier_cost:.6f}")
print(f"Total tokens: {cb.total_tokens}")
print(f"Successful requests: {cb.successful_requests}")
```

**Features:**
- 🎯 Compatible with `get_openai_callback()` pattern
- πŸ’° Separate drafter/verifier cost tracking
- πŸ“Š Token usage (including streaming)
- πŸ”„ Works with LangSmith tracing
- ⚑ Near-zero overhead

**Full example:** See [langchain_cost_tracking.py](./examples/langchain_cost_tracking.py)

</details>
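
The callback pattern above mirrors LangChain's `get_openai_callback()`. The accumulation idea can be sketched self-contained; this is an illustrative mock, not the cascadeflow implementation, with field names copied from the example above and the `record` method invented for the sketch:

```python
from contextlib import contextmanager
from dataclasses import dataclass

@dataclass
class _MockCascadeCallback:
    # Mirrors the fields read in the example above.
    total_cost: float = 0.0
    drafter_cost: float = 0.0
    verifier_cost: float = 0.0
    total_tokens: int = 0
    successful_requests: int = 0

    def record(self, role: str, cost: float, tokens: int) -> None:
        # Each model invocation reports its role, cost, and token usage;
        # successful_requests counts invocations, not user queries.
        if role == "drafter":
            self.drafter_cost += cost
        else:
            self.verifier_cost += cost
        self.total_cost += cost
        self.total_tokens += tokens
        self.successful_requests += 1

@contextmanager
def mock_cascade_callback():
    cb = _MockCascadeCallback()
    yield cb  # a real handler would register/unregister with the run manager here

# Simulate two queries the drafter answered plus one escalation:
with mock_cascade_callback() as cb:
    cb.record("drafter", 0.000375, 1000)
    cb.record("drafter", 0.000375, 1000)
    cb.record("verifier", 0.009, 1000)

print(f"Total cost: ${cb.total_cost:.6f}")
```

The real handler does the same bookkeeping from LangChain callback events, which is why drafter and verifier spend can be reported separately.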

<details>
<summary><b>πŸ’‘ Optional: Model Discovery & Analysis Helpers</b></summary>
<summary><b>πŸ’‘ Optional: Model Discovery & Analysis Helpers (TypeScript)</b></summary>

To discover optimal cascade pairs among your existing LangChain models, use the built-in discovery helpers:

@@ -458,9 +525,10 @@ console.log(`Warnings: ${validation.warnings}`);
- βœ… Streaming with pre-routing
- βœ… Tool calling and structured output
- βœ… LangSmith cost tracking metadata
- βœ… Cost tracking callbacks (Python)
- βœ… Works with all LangChain features
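
"Streaming with pre-routing" means the cascade commits to a model before tokens start flowing, since a streamed draft cannot be cheaply re-verified mid-stream. A rough sketch of the idea — the heuristic, markers, and threshold below are invented for illustration and are not cascadeflow's actual routing logic:

```python
def pre_route(query: str, complexity_cutoff: int = 12) -> str:
    """Pick a model tier before streaming begins.

    Illustrative heuristic only: real routers typically combine
    learned complexity scores, domain detection, and history.
    """
    hard_markers = ("prove", "derive", "architect", "trade-off")
    looks_hard = (len(query.split()) > complexity_cutoff
                  or any(m in query.lower() for m in hard_markers))
    return "verifier" if looks_hard else "drafter"

print(pre_route("What is Python?"))                        # short, simple query
print(pre_route("Derive the closed form of this series"))  # keyword match
```

Because routing happens up front, the chosen model's stream can be forwarded to the caller unmodified.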

🦜 **Learn more:** [LangChain Integration Guide](./docs/guides/langchain_integration.md) | [Package README](./packages/langchain-cascadeflow/)
🦜 **Learn more:** [LangChain Integration Guide](./docs/guides/langchain_integration.md) | [TypeScript Package](./packages/langchain-cascadeflow/) | [Python Examples](./examples/)

---

@@ -508,6 +576,12 @@ console.log(`Warnings: ${validation.warnings}`);
| **Cost Forecasting** | Forecast costs and detect anomalies | [View](./examples/cost_forecasting_anomaly_detection.py) |
| **Semantic Quality Detection** | ML-based domain and quality detection | [View](./examples/semantic_quality_domain_detection.py) |
| **Profile Database Integration** | Integrate user profiles with databases | [View](./examples/profile_database_integration.py) |
| **LangChain Basic** | Simple LangChain cascade setup | [View](./examples/langchain_basic_usage.py) |
| **LangChain Streaming** | Stream responses with LangChain | [View](./examples/langchain_streaming.py) |
| **LangChain Model Discovery** | Discover and analyze LangChain models | [View](./examples/langchain_model_discovery.py) |
| **LangChain LangSmith** | Cost tracking with LangSmith integration | [View](./examples/langchain_langsmith.py) |
| **LangChain Cost Tracking** | Track costs with callback handlers | [View](./examples/langchain_cost_tracking.py) |
| **LangChain Benchmark** | Comprehensive cascade benchmarking | [View](./examples/langchain_cascade_benchmark.py) |

</details>

4 changes: 2 additions & 2 deletions cascadeflow/agent.py
@@ -503,7 +503,7 @@ def _get_provider(self, model: ModelConfig):
    async def run(
        self,
        query: str,
        max_tokens: int = 500,
        max_tokens: int = 100,
        temperature: float = 0.7,
        complexity_hint: Optional[str] = None,
        force_direct: bool = False,
@@ -517,7 +517,7 @@ async def run(

        Args:
            query: User query
            max_tokens: Max tokens to generate (default: 500)
            max_tokens: Max tokens to generate
            temperature: Sampling temperature
            complexity_hint: Override complexity detection
            force_direct: Force direct routing
65 changes: 43 additions & 22 deletions cascadeflow/integrations/langchain/__init__.py
@@ -29,17 +29,30 @@
    extract_token_usage,
    MODEL_PRICING,
)
from .models import (
    MODEL_PRICING_REFERENCE,
    analyze_cascade_pair,
    suggest_cascade_pairs,
    discover_cascade_pairs,
    analyze_model,
    compare_models,
    find_best_cascade_pair,
    validate_cascade_pair,
    extract_model_name,
    get_provider,

# Model discovery utilities - optional feature
# TODO: Implement models.py module
# from .models import (
# MODEL_PRICING_REFERENCE,
# analyze_cascade_pair,
# suggest_cascade_pairs,
# discover_cascade_pairs,
# analyze_model,
# compare_models,
# find_best_cascade_pair,
# validate_cascade_pair,
# extract_model_name,
# get_provider,
# )
from .cost_tracking import (
    BudgetTracker,
    CostHistory,
    CostEntry,
    track_costs,
)
from .langchain_callbacks import (
    CascadeFlowCallbackHandler,
    get_cascade_callback,
)

__all__ = [
@@ -58,15 +71,23 @@
"create_cost_metadata",
"extract_token_usage",
"MODEL_PRICING",
# Model discovery
"MODEL_PRICING_REFERENCE",
"analyze_cascade_pair",
"suggest_cascade_pairs",
"discover_cascade_pairs",
"analyze_model",
"compare_models",
"find_best_cascade_pair",
"validate_cascade_pair",
"extract_model_name",
"get_provider",
# Model discovery (TODO: implement models.py module)
# "MODEL_PRICING_REFERENCE",
# "analyze_cascade_pair",
# "suggest_cascade_pairs",
# "discover_cascade_pairs",
# "analyze_model",
# "compare_models",
# "find_best_cascade_pair",
# "validate_cascade_pair",
# "extract_model_name",
# "get_provider",
# Cost tracking (Python-specific features)
"BudgetTracker",
"CostHistory",
"CostEntry",
"track_costs",
# LangChain callback handlers
"CascadeFlowCallbackHandler",
"get_cascade_callback",
]