feat: Support direct API key auth and cheap model routing #20

Closed
desamtralized wants to merge 1 commit into nearai:main from desamtralized:feat/direct-api-key-and-cheap-model

Conversation

@desamtralized
Contributor

Summary

  • Direct API key auth: Skip NEAR AI session/OAuth when using chat_completions mode with NEARAI_API_KEY, allowing direct connection to any OpenAI-compatible provider (e.g. Anthropic Claude)
  • Cheap model routing: New NEARAI_CHEAP_MODEL env var configures a secondary lightweight model (e.g. Claude Haiku) for cost-sensitive tasks like heartbeat, routing, and evaluation
  • First-run fix: Skip onboard session file check when an API key is already configured

Changes

| File | What changed |
|------|--------------|
| src/main.rs | Skip session auth in chat_completions mode; create cheap LLM provider; pass to AgentDeps; skip first-run check when API key set |
| src/config.rs | Add cheap_model: Option<String> to NearAiConfig, load from NEARAI_CHEAP_MODEL env |
| src/llm/mod.rs | Add create_cheap_llm_provider() factory function |
| src/agent/agent_loop.rs | Add cheap_llm field to AgentDeps, cheap_llm() accessor with fallback, route heartbeat through cheap model |
| src/setup/wizard.rs | Add cheap_model: None to fix struct initialization |

Configuration

# .env example — direct Anthropic usage with dual models
NEARAI_API_KEY=sk-ant-api03-...
NEARAI_BASE_URL=https://api.anthropic.com
NEARAI_MODEL=claude-sonnet-4-5-20250929
NEARAI_CHEAP_MODEL=claude-haiku-4-5-20251001
NEARAI_API_MODE=chat_completions

When NEARAI_CHEAP_MODEL is not set, all tasks use the main model (no behavior change).
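
As a rough illustration of the fallback rule stated above, the following is a minimal sketch, not the code in this PR. `ModelSettings`, `from_lookup`, and `model_for_cheap_tasks` are hypothetical names, and treating a blank `NEARAI_CHEAP_MODEL` as unset is an assumption rather than documented behavior.

```rust
/// Hypothetical stand-in for the two model settings in the .env example.
#[derive(Debug, Clone, PartialEq)]
pub struct ModelSettings {
    pub model: String,
    pub cheap_model: Option<String>,
}

impl ModelSettings {
    /// Build settings from any key lookup, e.g. `|k| std::env::var(k).ok()`.
    /// Assumption: a blank NEARAI_CHEAP_MODEL is treated the same as an unset one.
    pub fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Option<Self> {
        let model = lookup("NEARAI_MODEL")?;
        let cheap_model = lookup("NEARAI_CHEAP_MODEL")
            .map(|v| v.trim().to_string())
            .filter(|v| !v.is_empty());
        Some(Self { model, cheap_model })
    }

    /// Model for lightweight tasks: the cheap model if set, otherwise the main model.
    pub fn model_for_cheap_tasks(&self) -> &str {
        self.cheap_model.as_deref().unwrap_or(&self.model)
    }
}
```

Taking the lookup as a closure keeps the sketch testable without mutating process environment variables; in a real binary it could be called as `ModelSettings::from_lookup(|k| std::env::var(k).ok())`.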

Test plan

  • Build with cargo build --release — compiles cleanly
  • Run with Anthropic API key — agent starts, connects, responds
  • Both models initialize (LLM provider initialized + Cheap LLM provider initialized in logs)
  • Heartbeat uses cheap model
  • Verify NEAR AI session mode (Responses) still works as before (no regression)

🤖 Generated with Claude Code

Allow using IronClaw with any OpenAI-compatible API provider (e.g.
Anthropic Claude) via API key, without requiring NEAR AI session auth.

Changes:
- Skip session authentication in chat_completions mode (API key auth)
- Skip first-run onboard check when NEARAI_API_KEY is configured
- Add `cheap_model` config field (NEARAI_CHEAP_MODEL env var) for a
  secondary lightweight model used for heartbeat, routing, evaluation
- Add `create_cheap_llm_provider()` factory in llm module
- Add `cheap_llm` to AgentDeps with fallback to main model
- Route heartbeat through cheap model to reduce costs
- Fix wizard compilation for new config field

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@nightfullstar
Contributor

This made my day

serrrfirat pushed a commit to serrrfirat/ironclaw that referenced this pull request Feb 11, 2026
Reviews PRs nearai#10, nearai#13, nearai#14, nearai#17, nearai#18, nearai#20, nearai#28 covering:
- Critical: hand-rolled NEAR tx serialization and key mgmt (PR nearai#14)
- High: hooks system can bypass safety layer (PR nearai#18)
- High: DM pairing token security needs verification (PR nearai#17)
- Medium: auth bypass when API key mode set without key (PR nearai#20)
- Medium: safety error retry classification in failover (PR nearai#28)
- Low: Okta WASM tool and benchmarking harness

https://claude.ai/code/session_01B75Rq9u593YG9Kc4FG487Z
serrrfirat previously approved these changes Feb 12, 2026
Collaborator

@serrrfirat serrrfirat left a comment

Review — PR #20

Files: 5 files, +59/-6 | Assessment: APPROVE ✅

Clean, minimal change. Direct API key auth skip and cheap model routing are both well-implemented with proper fallback behavior.

⚠️ Coordination note

This PR overlaps with #28 (LLM failover) — both add a secondary model to NearAiConfig and modify src/llm/mod.rs. This PR adds cheap_model for cost optimization, while #28 adds fallback_model for reliability. These should be merged in coordination to avoid conflicts. Consider wiring the cheap model through FailoverProvider from #28 so it also benefits from failover.

Minor notes (non-blocking)

  • Only heartbeat uses cheap_llm() currently — the description mentions "routing" and "evaluation" but those aren't wired up yet. Fine for incremental delivery.
  • check_onboard_needed checks NEARAI_API_KEY env var directly instead of going through config — slightly inconsistent but pragmatic

Member

@ilblackdragon ilblackdragon left a comment

Review: feat: Support direct API key auth and cheap model routing

Overall

Small, focused PR adding two features: API key auth bypass and cheap model routing for heartbeat. Both are useful but need some refinements.

Issues

1. Medium: Auth skip uses api_mode as a proxy for "has API key"

if config.llm.nearai.api_mode == ironclaw::config::NearAiApiMode::Responses {
    session.ensure_authenticated().await?;
}

This skips auth for ALL chat_completions mode users, but chat_completions mode could theoretically be used without an API key. The condition should check for the API key directly:

if config.llm.nearai.api_key.is_none() {
    session.ensure_authenticated().await?;
}
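
To make the difference between the two conditions concrete, here is a small self-contained sketch. `ApiMode`, `AuthInputs`, and both function names are hypothetical stand-ins for the real config types, which are not shown in this thread.

```rust
/// Hypothetical mirror of the two API modes mentioned in the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApiMode {
    Responses,
    ChatCompletions,
}

/// Hypothetical subset of the config fields that the auth decision depends on.
pub struct AuthInputs {
    pub api_mode: ApiMode,
    pub api_key: Option<String>,
}

/// Condition as submitted: session auth runs only in Responses mode.
pub fn needs_session_by_mode(inputs: &AuthInputs) -> bool {
    inputs.api_mode == ApiMode::Responses
}

/// Condition suggested in this review: session auth runs whenever no API key is set.
pub fn needs_session_by_key(inputs: &AuthInputs) -> bool {
    inputs.api_key.is_none()
}
```

The two predicates disagree for `ChatCompletions` with `api_key: None`: the mode-based check skips session auth even though no key is available, while the key-based check still requires it.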

2. Medium: check_onboard_needed reads env var directly

if std::env::var("NEARAI_API_KEY").is_err() {

The config struct already has config.llm.nearai.api_key. Reading the env var directly bypasses config resolution and is inconsistent. However, since check_onboard_needed runs before config is fully loaded, this might be intentional. If so, add a comment explaining why.

3. Low: cheap_llm only works with NEAR AI backend

create_cheap_llm_provider clones config.nearai and swaps the model. If the user runs LLM_BACKEND=openai, the cheap model setting is silently ignored. Consider either:

  • Making it backend-agnostic (each backend gets its own cheap model config)
  • Documenting that it's NEAR AI only
  • Logging a warning when the setting is ignored
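
The third option could look roughly like the sketch below. `Backend` and `resolve_cheap_model` are hypothetical names, and the warning is returned as a string so the caller can pass it to whatever logger the project uses; this is one possible shape, not the fix that was eventually merged.

```rust
/// Hypothetical subset of the supported LLM backends.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    NearAi,
    OpenAi,
}

/// Returns the cheap model to use, plus an optional warning for the caller to log.
/// Assumption: only the NEAR AI backend honors the cheap model setting.
pub fn resolve_cheap_model(
    backend: Backend,
    cheap_model: Option<&str>,
) -> (Option<String>, Option<String>) {
    match (backend, cheap_model) {
        // Supported backend with a cheap model configured: use it.
        (Backend::NearAi, Some(m)) => (Some(m.to_string()), None),
        // Any other backend: ignore the setting, but say so instead of staying silent.
        (other, Some(m)) => (
            None,
            Some(format!(
                "NEARAI_CHEAP_MODEL={m} is ignored for backend {other:?}; using the main model"
            )),
        ),
        // Nothing configured: nothing to warn about.
        (_, None) => (None, None),
    }
}
```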

4. Low: No tests

No tests for create_cheap_llm_provider or the auth skip logic. At minimum add a test that create_cheap_llm_provider returns None when cheap_model is None.
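
A minimal sketch of what such a test could check, using a simplified stand-in for the clone-and-swap step described in issue 3. `NearAiSettings` and `cheap_settings` are hypothetical; the real `create_cheap_llm_provider` also builds a provider, which is left out here.

```rust
/// Hypothetical, trimmed-down version of the NEAR AI config.
#[derive(Debug, Clone, PartialEq)]
pub struct NearAiSettings {
    pub base_url: String,
    pub model: String,
    pub cheap_model: Option<String>,
}

/// Clone the main settings and swap in the cheap model, or return None if none is set.
pub fn cheap_settings(main: &NearAiSettings) -> Option<NearAiSettings> {
    let cheap_model = main.cheap_model.as_ref()?;
    let mut cheap = main.clone();
    cheap.model = cheap_model.clone();
    Some(cheap)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn returns_none_when_cheap_model_is_unset() {
        let main = NearAiSettings {
            base_url: "https://example.invalid".to_string(),
            model: "main".to_string(),
            cheap_model: None,
        };
        assert!(cheap_settings(&main).is_none());
    }
}
```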

Good

  • cheap_llm() fallback to main provider is clean
  • Heartbeat routing through cheap model is a sensible cost optimization
  • Wizard test config updated properly
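
For readers without the diff at hand, the fallback accessor praised above can be sketched as follows. The trait and struct here are simplified, hypothetical versions of `LlmProvider` and `AgentDeps`; the real types carry more fields and methods, and the accessor may live on a different type.

```rust
use std::sync::Arc;

/// Simplified, hypothetical provider trait; the real one has more methods.
pub trait LlmProvider {
    fn model_name(&self) -> &str;
}

/// Trivial provider used only for illustration.
pub struct StaticProvider(pub String);

impl LlmProvider for StaticProvider {
    fn model_name(&self) -> &str {
        &self.0
    }
}

/// Hypothetical subset of the agent dependencies.
pub struct AgentDeps {
    pub llm: Arc<dyn LlmProvider>,
    pub cheap_llm: Option<Arc<dyn LlmProvider>>,
}

impl AgentDeps {
    /// Return the cheap provider if one was configured, otherwise the main provider.
    pub fn cheap_llm(&self) -> &Arc<dyn LlmProvider> {
        self.cheap_llm.as_ref().unwrap_or(&self.llm)
    }
}
```

Callers such as the heartbeat never need to branch on whether a cheap model exists; they always go through `cheap_llm()` and get a usable provider.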

@serrrfirat
Collaborator

@desamtralized could you resolve the conflicts?

@tribendu

Summary

This PR adds support for a dual-LLM architecture with a "cheap/fast" model for lightweight tasks (heartbeats, routing, evaluation) and modifies authentication flow to skip session management when using API key authentication. The implementation is generally sound but has security validation gaps and missing test coverage.

Pros

  • Cost optimization: Enables separate model selection for lightweight vs heavyweight tasks
  • Graceful fallback: cheap_llm() getter falls back to the main LLM if no cheap provider is available
  • Backward compatible: cheap_model is optional, maintaining existing behavior
  • Clean API: Simple configuration via NEARAI_CHEAP_MODEL environment variable
  • Consistent pattern: Uses existing LlmConfig structure well

Concerns

Security Issues

  1. No API key validation (src/main.rs:190-193, src/setup/wizard.rs:463-468):

    • Code only checks if NEARAI_API_KEY env var is set, not if it's valid or properly formatted
    • If API key is malformed/expired, system proceeds without any error and fails later during actual API calls
    • No authentication verification before initializing services
  2. Confusing comment on the authentication logic (src/main.rs:191-192):

    • Comment says "Skip session auth when using API key" but the condition checks for Responses mode (which uses token-based auth)
    • The logic is technically correct, but it is written as the inverse of what the comment describes, which is confusing for maintainers

Model Routing Issues

  3. Missing cheap_model validation (src/llm/mod.rs:70-72):

    • No check that cheap_model actually exists/works before creating the provider
    • If the cheap model name is mistyped or unavailable, provider creation still succeeds and the failure only surfaces on the first API call
    • Creates a provider with an invalid model configuration
  4. No model type compatibility check:

    • No validation that the cheap model supports the same API mode as the main model
    • If main uses Responses mode and the cheap model only works in ChatCompletions, the error will occur at runtime
Code Quality Issues

  5. Missing test coverage:

    • No tests for create_cheap_llm_provider()
    • No tests for cheap_llm() fallback behavior
    • No tests for authentication skip logic with/without API keys
    • No tests for invalid cheap_model configuration
  6. Silent config mismatches:

    • If cheap_model is configured but fails to initialize, the error is swallowed and None is returned in some cases
    • Hard to debug why the cheap model isn't being used
  7. Inconsistent error handling (src/llm/mod.rs:54):

    • Uses ? for NearAiChatProvider::new() but not for NearAiProvider::new(), which can't fail
    • Returns errors from the chat provider but not from the responses provider
  8. Missing configuration in wizard (src/setup/wizard.rs:456):

    • Hardcoded cheap_model: None in wizard dummy config - no user-facing cheap model configuration

Suggestions

Security (Critical)

// In main.rs after line 193, add validation:
if config.llm.nearai.api_mode == ironclaw::config::NearAiApiMode::ChatCompletions {
    // Require an API key before proceeding
    let api_key = std::env::var("NEARAI_API_KEY").map_err(|_| {
        anyhow::anyhow!("NEARAI_API_KEY required for ChatCompletions mode")
    })?;
    // Only reject blank keys: a provider-specific prefix check (e.g. "near_")
    // would reject keys like the sk-ant-... example in the Configuration section.
    if api_key.trim().is_empty() {
        return Err(anyhow::anyhow!("NEARAI_API_KEY is set but empty"));
    }
}

Model Routing (High Priority)

// In src/llm/mod.rs, add validation:
pub fn create_cheap_llm_provider(
    config: &LlmConfig,
    session: Arc<SessionManager>,
) -> Result<Option<Arc<dyn LlmProvider>>, LlmError> {
    let Some(ref cheap_model) = config.nearai.cheap_model else {
        return Ok(None);
    };

    // Add basic validation
    if cheap_model.is_empty() || cheap_model.len() > 200 {
        tracing::warn!("Invalid cheap_model name: {}, using main LLM", cheap_model);
        return Ok(None);
    }

    let mut cheap_config = config.nearai.clone();
    cheap_config.model = cheap_model.clone();

    tracing::info!("Cheap LLM provider: {}", cheap_model);

    match cheap_config.api_mode {
        NearAiApiMode::Responses => Ok(Some(Arc::new(NearAiProvider::new(
            cheap_config,
            session,
        )))),
        NearAiApiMode::ChatCompletions => {
            match NearAiChatProvider::new(cheap_config) {
                Ok(provider) => Ok(Some(Arc::new(provider))),
                Err(e) => {
                    tracing::error!("Failed to initialize cheap model '{}': {}, falling back to main LLM", cheap_model, e);
                    Ok(None)
                }
            }
        }
    }
}

Documentation (Medium)

// Fix confusing comment in main.rs line 191:
// OLD: "Skip session auth when using API key (chat_completions mode)"
// NEW: "Session-based auth required for Responses mode; ChatCompletions uses API key"

  • Document environment variable requirements in README/AGENTS.md
  • Add a warning if cheap_model is the same as the main model (defeats the optimization)
  • Document fallback behavior in docstrings

Testing (High Priority)

// Add tests in src/llm/tests.rs:
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_cheap_llm_fallback_to_main() {
        // With no cheap provider configured, cheap_llm() should return the main provider.
        let deps = AgentDeps { cheap_llm: None, ... };
        let agent = Agent::new(...);
        assert_eq!(agent.cheap_llm().model_name(), agent.llm().model_name());
    }

    // Async because create_session_manager is awaited (assumes a tokio test runtime).
    #[tokio::test]
    async fn test_create_cheap_llm_with_invalid_model() {
        let mut config = LlmConfig::default();
        // An empty name fails the basic validation suggested above, so Ok(None) is expected.
        config.nearai.cheap_model = Some(String::new());
        let session = create_session_manager(...).await;
        let result = create_cheap_llm_provider(&config, session);
        assert!(matches!(result, Ok(None)));
    }

    #[test]
    fn test_auth_skip_with_api_key() {
        // Mock NEARAI_API_KEY env var
        // Verify onboarding skipped
    }
}

Configuration (Low Priority)

// src/config.rs: Add helpful default for documentation
impl NearAiConfig {
    fn default_cheap_model() -> Option<String> {
        Some("llama4-maverick-instruct-basic".to_string()) // Document as example
    }
}

Conclusion

This PR implements a useful cost-optimization feature for dual-LLM architecture. The core logic is sound, but security validation for API keys is missing, which is a critical gap for production use. Model routing has insufficient validation for cheap model existence/compatibility. The code is maintainable but requires test coverage and better documentation.

Recommendation: Request changes to address API key validation (#1) and cheap model validation (#3) before merging. Add tests (#5) and fix comment (#2) as secondary concerns.

ilblackdragon added a commit that referenced this pull request Feb 17, 2026
- Check API key presence (not api_mode) for auth skip (ilblackdragon)
- Add Settings::load() call in check_onboard_needed (ilblackdragon)
- Warn and ignore cheap_model for non-NearAi backends (ilblackdragon)
- Add unit tests for create_cheap_llm_provider (ilblackdragon)
- Minor formatting cleanup in cheap provider match arm

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ilblackdragon added a commit that referenced this pull request Feb 17, 2026
* feat: Support direct API key auth and cheap model routing

Allow using IronClaw with any OpenAI-compatible API provider (e.g.
Anthropic Claude) via API key, without requiring NEAR AI session auth.

Changes:
- Skip session authentication in chat_completions mode (API key auth)
- Skip first-run onboard check when NEARAI_API_KEY is configured
- Add `cheap_model` config field (NEARAI_CHEAP_MODEL env var) for a
  secondary lightweight model used for heartbeat, routing, evaluation
- Add `create_cheap_llm_provider()` factory in llm module
- Add `cheap_llm` to AgentDeps with fallback to main model
- Route heartbeat through cheap model to reduce costs
- Fix wizard compilation for new config field

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR #20 review feedback

- Check API key presence (not api_mode) for auth skip (ilblackdragon)
- Add Settings::load() call in check_onboard_needed (ilblackdragon)
- Warn and ignore cheap_model for non-NearAi backends (ilblackdragon)
- Add unit tests for create_cheap_llm_provider (ilblackdragon)
- Minor formatting cleanup in cheap provider match arm

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Samuel Barbosa <sambarbosaa@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
@ilblackdragon
Member

Merged in #116.

Thanks!

jaswinder6991 pushed a commit to jaswinder6991/ironclaw that referenced this pull request Feb 26, 2026
bkutasi pushed a commit to bkutasi/ironclaw that referenced this pull request Mar 28, 2026

5 participants