feat: Support direct API key auth and cheap model routing #20
desamtralized wants to merge 1 commit into nearai:main
Allow using IronClaw with any OpenAI-compatible API provider (e.g. Anthropic Claude) via API key, without requiring NEAR AI session auth.

Changes:
- Skip session authentication in chat_completions mode (API key auth)
- Skip first-run onboard check when NEARAI_API_KEY is configured
- Add `cheap_model` config field (NEARAI_CHEAP_MODEL env var) for a secondary lightweight model used for heartbeat, routing, evaluation
- Add `create_cheap_llm_provider()` factory in llm module
- Add `cheap_llm` to AgentDeps with fallback to main model
- Route heartbeat through cheap model to reduce costs
- Fix wizard compilation for new config field

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This made my day
Reviews PRs nearai#10, nearai#13, nearai#14, nearai#17, nearai#18, nearai#20, nearai#28 covering:
- Critical: hand-rolled NEAR tx serialization and key mgmt (PR nearai#14)
- High: hooks system can bypass safety layer (PR nearai#18)
- High: DM pairing token security needs verification (PR nearai#17)
- Medium: auth bypass when API key mode set without key (PR nearai#20)
- Medium: safety error retry classification in failover (PR nearai#28)
- Low: Okta WASM tool and benchmarking harness

https://claude.ai/code/session_01B75Rq9u593YG9Kc4FG487Z
serrrfirat
left a comment
Review — PR #20
Files: 5 files, 59+/6- | Assessment: APPROVE ✅
Clean, minimal change. Direct API key auth skip and cheap model routing are both well-implemented with proper fallback behavior.
⚠️ Coordination note
This PR overlaps with #28 (LLM failover) — both add a secondary model to NearAiConfig and modify src/llm/mod.rs. This PR adds cheap_model for cost optimization, while #28 adds fallback_model for reliability. These should be merged in coordination to avoid conflicts. Consider wiring the cheap model through FailoverProvider from #28 so it also benefits from failover.
Minor notes (non-blocking)
- Only heartbeat uses `cheap_llm()` currently. The description mentions "routing" and "evaluation" but those aren't wired up yet. Fine for incremental delivery.
- `check_onboard_needed` checks the `NEARAI_API_KEY` env var directly instead of going through config. Slightly inconsistent but pragmatic.
ilblackdragon
left a comment
Review: feat: Support direct API key auth and cheap model routing
Overall
Small, focused PR adding two features: API key auth bypass and cheap model routing for heartbeat. Both are useful but need some refinements.
Issues
1. Medium: Auth skip uses api_mode as a proxy for "has API key"

    if config.llm.nearai.api_mode == ironclaw::config::NearAiApiMode::Responses {
        session.ensure_authenticated().await?;
    }

This skips auth for ALL chat_completions mode users, but chat_completions mode could theoretically be used without an API key. The condition should check for the API key directly:

    if config.llm.nearai.api_key.is_none() {
        session.ensure_authenticated().await?;
    }

2. Medium: check_onboard_needed reads env var directly

    if std::env::var("NEARAI_API_KEY").is_err() {

The config struct already has config.llm.nearai.api_key. Reading the env var directly bypasses config resolution and is inconsistent. However, since check_onboard_needed runs before config is fully loaded, this might be intentional. If so, add a comment explaining why.
3. Low: cheap_llm only works with NEAR AI backend
create_cheap_llm_provider clones config.nearai and swaps the model. If the user runs LLM_BACKEND=openai, the cheap model setting is silently ignored. Consider either:
- Making it backend-agnostic (each backend gets its own cheap model config)
- Documenting that it's NEAR AI only
- Logging a warning when the setting is ignored
4. Low: No tests
No tests for create_cheap_llm_provider or the auth skip logic. At minimum add a test that create_cheap_llm_provider returns None when cheap_model is None.
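Issues 3 and 4 can be illustrated together with a small standalone model. This is only a sketch, not IronClaw's code: `Backend`, `LlmSettings`, `CheapChoice`, and `cheap_model_choice` are hypothetical names, and the real `create_cheap_llm_provider` builds a provider rather than returning a model name. It shows the two behaviors the review asks for: `None` when no cheap model is configured, and an explicit warning instead of silently dropping the setting on another backend.

```rust
/// Hypothetical stand-in for the configured LLM backend.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Backend {
    NearAi,
    OpenAi,
}

/// Hypothetical, trimmed-down view of the LLM settings.
#[derive(Debug, Clone)]
pub struct LlmSettings {
    pub backend: Backend,
    pub model: String,
    pub cheap_model: Option<String>,
}

/// Outcome of resolving the cheap model.
#[derive(Debug, PartialEq)]
pub struct CheapChoice {
    /// Model to use for a dedicated cheap provider, if any.
    pub model: Option<String>,
    /// Warning the caller should log when the setting had to be ignored.
    pub warning: Option<String>,
}

/// Decides whether a dedicated cheap provider should be created.
pub fn cheap_model_choice(settings: &LlmSettings) -> CheapChoice {
    match (&settings.cheap_model, settings.backend) {
        // Nothing configured: the caller falls back to the main model.
        (None, _) => CheapChoice { model: None, warning: None },
        // Supported backend: use the configured cheap model.
        (Some(name), Backend::NearAi) => CheapChoice {
            model: Some(name.clone()),
            warning: None,
        },
        // Any other backend: ignore the setting, but say so.
        (Some(name), other) => CheapChoice {
            model: None,
            warning: Some(format!(
                "cheap_model '{name}' is ignored for backend {other:?}"
            )),
        },
    }
}
```

Returning the warning as data keeps the decision testable without a logging framework; the caller decides how to emit it.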
Good
- `cheap_llm()` fallback to main provider is clean
- Heartbeat routing through cheap model is a sensible cost optimization
- Wizard test config updated properly
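The fallback praised here can be sketched in a few lines. The trait and struct below are simplified, hypothetical stand-ins (the real `LlmProvider` trait and `AgentDeps` carry more than this, and the accessor may live on the agent rather than on the deps struct); only the fallback expression is the point.

```rust
use std::sync::Arc;

/// Hypothetical minimal provider interface; the real trait has more methods.
pub trait LlmProvider {
    fn model_name(&self) -> &str;
}

/// Hypothetical provider that only remembers its model name.
pub struct NamedProvider(pub String);

impl LlmProvider for NamedProvider {
    fn model_name(&self) -> &str {
        &self.0
    }
}

/// Trimmed-down dependencies: a main provider plus an optional cheap one.
pub struct AgentDeps {
    pub llm: Arc<dyn LlmProvider>,
    pub cheap_llm: Option<Arc<dyn LlmProvider>>,
}

impl AgentDeps {
    /// Returns the cheap provider if one is configured, otherwise the main one.
    pub fn cheap_llm(&self) -> &Arc<dyn LlmProvider> {
        self.cheap_llm.as_ref().unwrap_or(&self.llm)
    }
}
```

Because callers such as the heartbeat always go through the accessor, leaving the cheap model unset changes nothing for them.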
@desamtralized could you resolve the conflicts?
Summary

This PR adds support for a dual-LLM architecture with a "cheap/fast" model for lightweight tasks (heartbeats, routing, evaluation) and modifies authentication flow to skip session management when using API key authentication. The implementation is generally sound but has security validation gaps and missing test coverage.

Pros

Concerns

Security Issues

Model Routing Issues

Code Quality Issues

Suggestions

Security (Critical)

    // In main.rs after line 193, add validation:
    if config.llm.nearai.api_mode == ironclaw::config::NearAiApiMode::ChatCompletions {
        // Validate API key format before proceeding
        let api_key = std::env::var("NEARAI_API_KEY").map_err(|_| {
            anyhow::anyhow!("NEARAI_API_KEY required for ChatCompletions mode")
        })?;
        if !api_key.starts_with("near_") || api_key.len() < 20 {
            return Err(anyhow::anyhow!("Invalid NEARAI_API_KEY format"));
        }
    }

Model Routing (High Priority)

    // In src/llm/mod.rs, add validation:
    pub fn create_cheap_llm_provider(
        config: &LlmConfig,
        session: Arc<SessionManager>,
    ) -> Result<Option<Arc<dyn LlmProvider>>, LlmError> {
        let Some(ref cheap_model) = config.nearai.cheap_model else {
            return Ok(None);
        };

        // Add basic validation
        if cheap_model.is_empty() || cheap_model.len() > 200 {
            tracing::warn!("Invalid cheap_model name: {}, using main LLM", cheap_model);
            return Ok(None);
        }

        let mut cheap_config = config.nearai.clone();
        cheap_config.model = cheap_model.clone();
        tracing::info!("Cheap LLM provider: {}", cheap_model);

        match cheap_config.api_mode {
            NearAiApiMode::Responses => Ok(Some(Arc::new(NearAiProvider::new(
                cheap_config,
                session,
            )))),
            NearAiApiMode::ChatCompletions => {
                match NearAiChatProvider::new(cheap_config) {
                    Ok(provider) => Ok(Some(Arc::new(provider))),
                    Err(e) => {
                        tracing::error!("Failed to initialize cheap model '{}': {}, falling back to main LLM", cheap_model, e);
                        Ok(None)
                    }
                }
            }
        }
    }

Documentation (Medium)

    // Fix confusing comment in main.rs line 191:
    // OLD: "Skip session auth when using API key (chat_completions mode)"
    // NEW: "Session-based auth required for Responses mode; ChatCompletions uses API key"

Testing (High Priority)

    // Add tests in src/llm/tests.rs:
    #[cfg(test)]
    mod tests {
        use super::*;

        #[test]
        fn test_cheap_llm_fallback_to_main() {
            let deps = AgentDeps { cheap_llm: None, ... };
            let agent = Agent::new(...);
            assert_eq!(agent.cheap_llm().model_name(), agent.llm().model_name());
        }

        // Async because the session manager is awaited (assumes the tokio test runtime).
        #[tokio::test]
        async fn test_create_cheap_llm_with_invalid_model() {
            let mut config = LlmConfig::default();
            // An empty name is one of the cases the validation above rejects.
            config.nearai.cheap_model = Some(String::new());
            let session = create_session_manager(...).await;
            let result = create_cheap_llm_provider(&config, session);
            assert!(matches!(result, Ok(None)));
        }

        #[test]
        fn test_auth_skip_with_api_key() {
            // Mock NEARAI_API_KEY env var
            // Verify onboarding skipped
        }
    }

Configuration (Low Priority)

    // src/config.rs: Add helpful default for documentation
    impl NearAiConfig {
        fn default_cheap_model() -> Option<String> {
            Some("llama4-maverick-instruct-basic".to_string()) // Document as example
        }
    }

Conclusion

This PR implements a useful cost-optimization feature for dual-LLM architecture. The core logic is sound, but security validation for API keys is missing, which is a critical gap for production use. Model routing has insufficient validation for cheap model existence/compatibility. The code is maintainable but requires test coverage and better documentation.

Recommendation: Request changes to address API key validation (#1) and cheap model validation (#3) before merging. Add tests (#5) and fix comment (#2) as secondary concerns.
fix: address PR #20 review feedback

- Check API key presence (not api_mode) for auth skip (ilblackdragon)
- Add Settings::load() call in check_onboard_needed (ilblackdragon)
- Warn and ignore cheap_model for non-NearAi backends (ilblackdragon)
- Add unit tests for create_cheap_llm_provider (ilblackdragon)
- Minor formatting cleanup in cheap provider match arm

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
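The first item of this commit, keying the auth skip on API key presence rather than on `api_mode`, can be compared with the original condition in a small standalone sketch. `ApiMode`, `AuthInputs`, and both function names are hypothetical and exist only for illustration; the real check calls `session.ensure_authenticated()` on the loaded config.

```rust
/// Hypothetical mirror of the two API modes named in this PR.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ApiMode {
    Responses,
    ChatCompletions,
}

/// Hypothetical subset of the config that the auth decision depends on.
pub struct AuthInputs {
    pub api_mode: ApiMode,
    pub api_key: Option<String>,
}

/// Original condition: session auth only in Responses mode.
pub fn needs_session_by_mode(cfg: &AuthInputs) -> bool {
    cfg.api_mode == ApiMode::Responses
}

/// Revised condition: session auth whenever no API key is configured.
pub fn needs_session_by_key(cfg: &AuthInputs) -> bool {
    cfg.api_key.is_none()
}
```

The two conditions disagree exactly in the case the review flagged: `ChatCompletions` mode with no key. The mode-based check skips authentication there, while the key-based check still requires a session.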
* feat: Support direct API key auth and cheap model routing

  Allow using IronClaw with any OpenAI-compatible API provider (e.g. Anthropic Claude) via API key, without requiring NEAR AI session auth.

  Changes:
  - Skip session authentication in chat_completions mode (API key auth)
  - Skip first-run onboard check when NEARAI_API_KEY is configured
  - Add `cheap_model` config field (NEARAI_CHEAP_MODEL env var) for a secondary lightweight model used for heartbeat, routing, evaluation
  - Add `create_cheap_llm_provider()` factory in llm module
  - Add `cheap_llm` to AgentDeps with fallback to main model
  - Route heartbeat through cheap model to reduce costs
  - Fix wizard compilation for new config field

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR #20 review feedback

  - Check API key presence (not api_mode) for auth skip (ilblackdragon)
  - Add Settings::load() call in check_onboard_needed (ilblackdragon)
  - Warn and ignore cheap_model for non-NearAi backends (ilblackdragon)
  - Add unit tests for create_cheap_llm_provider (ilblackdragon)
  - Minor formatting cleanup in cheap provider match arm

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Samuel Barbosa <sambarbosaa@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Merged in #116. Thanks!
Summary

- Skip session authentication in `chat_completions` mode with `NEARAI_API_KEY`, allowing direct connection to any OpenAI-compatible provider (e.g. Anthropic Claude)
- `NEARAI_CHEAP_MODEL` env var configures a secondary lightweight model (e.g. Claude Haiku) for cost-sensitive tasks like heartbeat, routing, and evaluation

Changes

| File | Change |
|---|---|
| `src/main.rs` | Skip session authentication in `chat_completions` mode; create cheap LLM provider; pass to `AgentDeps`; skip first-run check when API key set |
| `src/config.rs` | Add `cheap_model: Option<String>` to `NearAiConfig`, load from `NEARAI_CHEAP_MODEL` env |
| `src/llm/mod.rs` | Add `create_cheap_llm_provider()` factory function |
| `src/agent/agent_loop.rs` | Add `cheap_llm` field to `AgentDeps`, `cheap_llm()` accessor with fallback, route heartbeat through cheap model |
| `src/setup/wizard.rs` | Add `cheap_model: None` to fix struct initialization |

Configuration

    # .env example: direct Anthropic usage with dual models
    NEARAI_API_KEY=sk-ant-api03-...
    NEARAI_BASE_URL=https://api.anthropic.com
    NEARAI_MODEL=claude-sonnet-4-5-20250929
    NEARAI_CHEAP_MODEL=claude-haiku-4-5-20251001
    NEARAI_API_MODE=chat_completions

When `NEARAI_CHEAP_MODEL` is not set, all tasks use the main model (no behavior change).

Test plan

- `cargo build --release` compiles cleanly
- … (`LLM provider initialized` + `Cheap LLM provider initialized` in logs)
- … (`Responses`) still works as before (no regression)

🤖 Generated with Claude Code
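The configuration rule in this description (an optional `NEARAI_CHEAP_MODEL`, with the main model used when it is absent) can be sketched as follows. This is an illustration under assumptions, not the code in `src/config.rs`: `ModelPair`, `optional_setting`, `load_models`, and `for_heartbeat` are hypothetical names, and treating an empty value as unset is an assumption of this sketch. The lookup is passed in as a closure so the logic can be exercised without touching the process environment.

```rust
/// Hypothetical pair of model names resolved from configuration.
pub struct ModelPair {
    pub main: String,
    pub cheap: Option<String>,
}

/// Reads one setting through `lookup`, treating empty or
/// whitespace-only values as unset (an assumption of this sketch).
pub fn optional_setting<F>(lookup: F, key: &str) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    lookup(key)
        .map(|value| value.trim().to_string())
        .filter(|value| !value.is_empty())
}

/// Resolves both models; returns `None` if the main model is missing.
pub fn load_models<F>(lookup: F) -> Option<ModelPair>
where
    F: Fn(&str) -> Option<String>,
{
    let main = optional_setting(&lookup, "NEARAI_MODEL")?;
    let cheap = optional_setting(&lookup, "NEARAI_CHEAP_MODEL");
    Some(ModelPair { main, cheap })
}

impl ModelPair {
    /// Model for lightweight tasks: the cheap one if set, else the main one.
    pub fn for_heartbeat(&self) -> &str {
        self.cheap.as_deref().unwrap_or(&self.main)
    }
}
```

Against the real environment this would be called as `load_models(|key| std::env::var(key).ok())`.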