vinhnx
diff --git a/‎docs/MODEL_UPDATE_IMPLEMENTATION.md‎
Lines changed: 153 additions & 119 deletions b/‎docs/MODEL_UPDATE_IMPLEMENTATION.md‎
Lines changed: 153 additions & 119 deletions
diff --git a/‎vtcode-core/src/config/constants.rs‎
Lines changed: 4 additions & 4 deletions b/‎vtcode-core/src/config/constants.rs‎
Lines changed: 4 additions & 4 deletions
diff --git a/‎vtcode-core/src/config/mod.rs‎
Lines changed: 1 addition & 1 deletion b/‎vtcode-core/src/config/mod.rs‎
Lines changed: 1 addition & 1 deletion
@@ -7,214 +7,248 @@ This document details the comprehensive implementation of model updates for VTCo
 ## Implementation Approach
 
 ### Objective
+
 Update the VTCode codebase to focus on the latest and most capable AI models as of September 2025, removing outdated or less relevant providers and models while maintaining backward compatibility.
 
 ### Models to Keep
 
 #### Core Models (Keep and Update)
+
 1. **Kimi K2** - Moonshot AI's latest reasoning models
-   - Kimi K2 0905 (latest)
-   - Kimi K2 (previous version)
+
+    - Kimi K2 0905 (latest)
+    - Kimi K2 (previous version)
 
 2. **GLM** - Zhipu AI's latest models
-   - GLM-4.5V (multimodal)
-   - GLM-4.5 (text)
-   - GLM-4.5-Air (balanced)
+
+    - GLM-4.5V (multimodal)
+    - GLM-4.5 (text)
+    - GLM-4.5-Air (balanced)
 
 3. **Qwen3 Family** - Alibaba's latest models
-   - Qwen3 32B
-   - Qwen3 Coder
-   - Qwen3 Max
+
+    - Qwen3 32B
+    - Qwen3 Coder
+    - Qwen3 Max
 
 4. **DeepSeek** - Reasoning-focused models
-   - DeepSeek Reasoner
-   - DeepSeek Chat
+
+    - DeepSeek Reasoner
+    - DeepSeek Chat
 
 5. **Gemini 2.5** - Google's latest models (already implemented)
-   - Gemini 2.5 Flash Lite Preview 06-17
-   - Gemini 2.5 Pro Preview 06-05
-   - Gemini 2.5 Flash
-   - Gemini 2.5 Pro
-
-7. **Claude 4** - Anthropic's latest models (already implemented)
-   - Claude Opus 4.1
-   - Claude Sonnet 4
-   - Claude Opus 4
-
-8. **GPT-5** - OpenAI's latest models (already implemented)
-   - GPT-5
-   - GPT-5 Mini
-   - GPT-5 Chat Latest
-   - GPT-5 Nano
-   - o3-pro
-   - o3
-   - o4-mini
-   - Codex Mini Latest
+
+    - Gemini 2.5 Flash Lite Preview 06-17
+    - Gemini 2.5 Pro Preview 06-05
+    - Gemini 2.5 Flash
+    - Gemini 2.5 Pro
+
+6. **Claude 4** - Anthropic's latest models (already implemented)
+
+    - Claude Opus 4.1
+    - Claude Sonnet 4
+    - Claude Opus 4
+
+7. **GPT-5** - OpenAI's latest models (already implemented)
+    - GPT-5
+    - GPT-5 Mini
+    - GPT-5 Chat Latest
+    - GPT-5 Nano
+    - o3-pro
+    - o3
+    - o4-mini
+    - Codex Mini Latest
 
 ## Implementation Phases
 
 ### Phase 1: Analysis and Planning
+
 1. Reviewed existing model definitions and provider implementations
 2. Identified models to keep based on September 2025 capabilities
 3. Identified providers to remove:
-   - Ollama provider (removed completely)
-   - Groq provider (accessed through OpenAI-compatible APIs)
+    - Ollama provider (removed completely)
+    - Groq provider (accessed through OpenAI-compatible APIs)
 
 ### Phase 2: Model Definition Updates
+
 1. Added Kimi K2 models to ModelId enum:
-   - KimiK20905 (`moonshotai/kimi-k2-instruct-0905`)
-   - KimiK2 (`moonshotai/kimi-k2-instruct`)
+    - KimiK20905 (`moonshotai/kimi-k2-instruct-0905`)
+    - KimiK2 (`moonshotai/kimi-k2-instruct`)
 2. Added GLM models to ModelId enum:
-   - GLM45V (`z-ai/glm-4.5v`)
-   - GLM45 (`z-ai/glm-4.5`)
-   - GLM45Air (`z-ai/glm-4.5-air`)
+    - GLM45V (`z-ai/glm-4.5v`)
+    - GLM45 (`z-ai/glm-4.5`)
+    - GLM45Air (`z-ai/glm-4.5-air`)
 3. Added Qwen3 models to ModelId enum:
-   - Qwen3_32B (`qwen/qwen3-32b`)
-   - Qwen3Coder (`qwen/qwen3-coder`)
-   - Qwen3Max (`qwen/qwen3-max`)
+    - Qwen3_32B (`qwen/qwen3-32b`)
+    - Qwen3Coder (`qwen/qwen3-coder`)
+    - Qwen3Max (`qwen/qwen3-max`)
 4. Added DeepSeek models to ModelId enum:
-   - DeepSeekReasoner (`deepseek-reasoner`)
-   - DeepSeekChat (`deepseek-chat`)
+    - DeepSeekReasoner (`deepseek-reasoner`)
+    - DeepSeekChat (`deepseek-chat`)
 5. Added DeepSeek models to ModelId enum:
-   - DeepSeekChat (`deepseek/deepseek-chat-v3.1`)
-   - DeepSeekReasoner (`deepseek/deepseek-reasoner`)
+    - DeepSeekChat (`deepseek/deepseek-chat-v3.1`)
+    - DeepSeekReasoner (`deepseek/deepseek-reasoner`)
 
 ### Phase 3: Provider Updates
+
 1. Updated provider mappings for new models:
-   - Kimi, GLM, Qwen models → OpenAI provider
-   - DeepSeek models → DeepSeek provider
+    - Kimi, GLM, Qwen models → OpenAI provider
+    - DeepSeek models → DeepSeek provider
 2. Removed unused provider modules (Ollama, Groq)
 
 ### Phase 4: Client Factory Updates
+
 1. Updated client factory to remove direct provider implementations
 2. Simplified provider access through OpenAI-compatible APIs
 3. Maintained backward compatibility for existing models
 
 ## New Providers Added
 
 ### DeepSeek Provider
-- **API Key**: `DEEPSEEK_API_KEY`
-- **Specialization**: Advanced reasoning models
-- **Models**:
-  - `deepseek-reasoner` - Latest reasoning model (Jan 2025, updated Aug 2025)
-  - `deepseek-chat` - Latest chat model (Dec 2024, updated Aug 2025)
+
+-   **API Key**: `DEEPSEEK_API_KEY`
+-   **Specialization**: Advanced reasoning models
+-   **Models**:
+    -   `deepseek-reasoner` - Latest reasoning model (Jan 2025, updated Aug 2025)
+    -   `deepseek-chat` - Latest chat model (Dec 2024, updated Aug 2025)
 
 ## Updated Existing Providers
 
 ### Google Gemini (Updated)
+
 **Latest Models (June 2025 releases):**
-- `gemini-2.5-flash-lite-preview-06-17` - Latest fastest model
-- `gemini-2.5-pro-preview-06-05` - Latest most capable model
-- `gemini-2.5-flash` - Stable fast model
-- `gemini-2.5-pro` - Stable capable model
-- `gemini-2.0-flash` - Previous generation fast model
+
+-   `gemini-2.5-flash-lite-preview-06-17` - Latest fastest model
+-   `gemini-2.5-pro-preview-06-05` - Latest most capable model
+-   `gemini-2.5-flash` - Stable fast model
+-   `gemini-2.5-pro` - Stable capable model
 
 ### OpenAI (Updated)
+
 **Latest Models (August 2025 releases):**
-- `gpt-5` - Latest high performance model (Aug 2025)
-- `gpt-5-mini` - Latest fast & economical model (Aug 2025)
-- `gpt-5-chat-latest` - Latest conversational model
-- `gpt-5-nano` - Ultra-fast compact model
-- `o3-pro` - Advanced reasoning model (June 2025)
-- `o3` - Reasoning model (April 2025)
-- `o4-mini` - Next generation mini reasoning model
-- `codex-mini-latest` - Latest code generation model
+
+-   `gpt-5` - Latest high performance model (Aug 2025)
+-   `gpt-5-mini` - Latest fast & economical model (Aug 2025)
+-   `gpt-5-chat-latest` - Latest conversational model
+-   `gpt-5-nano` - Ultra-fast compact model
+-   `o3-pro` - Advanced reasoning model (June 2025)
+-   `o3` - Reasoning model (April 2025)
+-   `o4-mini` - Next generation mini reasoning model
+-   `codex-mini-latest` - Latest code generation model
 
 ### Anthropic Claude (Updated)
+
 **Latest Models (August 2025 releases):**
-- `claude-opus-4-1-20250805` - Latest most powerful (Aug 2025)
-- `claude-sonnet-4-20250514` - Latest intelligent (May 2025)
-- Progressive model generations (4.1, 4, 3.7, 3.5v2, 3.5)
+
+-   `claude-opus-4-1-20250805` - Latest most powerful (Aug 2025)
+-   `claude-sonnet-4-20250514` - Latest intelligent (May 2025)
+-   Progressive model generations (4.1, 4, 3.7, 3.5v2, 3.5)
 
 ### Groq (Updated)
+
 **Latest Models (September 2025 releases):**
-- Latest 2025 models: Kimi K2, GPT OSS, Llama 4 variants
-- Ultra-fast inference maintained for all models
-- Backward compatibility with existing models
+
+-   Latest 2025 models: Kimi K2, GPT OSS, Llama 4 variants
+-   Ultra-fast inference maintained for all models
+-   Backward compatibility with existing models
 
 ## Configuration Updates
 
 ### Updated Files
-- `vtcode-core/src/config/models.rs` - Main model definitions
-- `vtcode-core/src/config/constants.rs` - Updated constants
-- `vtcode-core/src/llm/client.rs` - Provider factory updates
-- `vtcode.toml.example` - Configuration examples
+
+-   `vtcode-core/src/config/models.rs` - Main model definitions
+-   `vtcode-core/src/config/constants.rs` - Updated constants
+-   `vtcode-core/src/llm/client.rs` - Provider factory updates
+-   `vtcode.toml.example` - Configuration examples
 
 ### Model Organization
-- Type-safe enum with future-ready models
-- Complete display names and descriptions
-- Provider-specific default models
-- Comprehensive model metadata
+
+-   Type-safe enum with future-ready models
+-   Complete display names and descriptions
+-   Provider-specific default models
+-   Comprehensive model metadata
 
 ## Implementation Status
 
 ### Successfully Completed
 
 #### 1. Model Infrastructure Update
-- Updated `ModelId` enum with 67 models (was ~20)
-- Added new providers: `DeepSeek`
-- Updated all existing providers with 2025 models
-- Complete display names and descriptions for all models
-- Updated provider factory with new provider support
-- Fixed all model reference inconsistencies
+
+-   Updated `ModelId` enum with 67 models (was ~20)
+-   Added new providers: `DeepSeek`
+-   Updated all existing providers with 2025 models
+-   Complete display names and descriptions for all models
+-   Updated provider factory with new provider support
+-   Fixed all model reference inconsistencies
 
 #### 2. New Providers Added
-- **DeepSeek** (2 models): Reasoning specialist with R1 technology
+
+-   **DeepSeek** (2 models): Reasoning specialist with R1 technology
 
 #### 3. Updated Existing Providers
-- **Gemini** (5 models): Latest 2.5 series
-- **OpenAI** (8 models): GPT-5 and reasoning models
-- **Anthropic** (6 models): Claude 4.1 and 4 series
-- **Groq** (18 models): Latest 2025 models
+
+-   **Gemini** (5 models): Latest 2.5 series
+-   **OpenAI** (8 models): GPT-5 and reasoning models
+-   **Anthropic** (6 models): Claude 4.1 and 4 series
+-   **Groq** (18 models): Latest 2025 models
 
 #### 4. Configuration Updates
-- Updated `vtcode.toml.example` with all 67 models
-- Updated constants and defaults
-- Fixed model string mappings
-- Updated fallback and provider-specific defaults
+
+-   Updated `vtcode.toml.example` with all 67 models
+-   Updated constants and defaults
+-   Fixed model string mappings
+-   Updated fallback and provider-specific defaults
 
 #### 5. Code Quality & Testing
-- Updated all display methods and utility functions
-- Fixed model variant detection (flash, pro, efficient, top-tier)
-- Updated generation/version strings
-- Fixed all test cases and references
-- Comprehensive model metadata
+
+-   Updated all display methods and utility functions
+-   Fixed model variant detection (flash, pro, efficient, top-tier)
+-   Updated generation/version strings
+-   Fixed all test cases and references
+-   Comprehensive model metadata
 
 ## Impact Summary
 
 ### Before → After
-- **Models**: ~20 → 67 models (+235% increase)
-- **Providers**: 7 → 8 providers (+1 new)
-- **Latest Tech**: Added GPT-5, Claude Opus 4.1, DeepSeek R1
-- **Performance**: Maintained ultra-fast Groq inference, added reasoning models
+
+-   **Models**: ~20 → 67 models (+235% increase)
+-   **Providers**: 7 → 8 providers (+1 new)
+-   **Latest Tech**: Added GPT-5, Claude Opus 4.1, DeepSeek R1
+-   **Performance**: Maintained ultra-fast Groq inference, added reasoning models
 
 ### New Capabilities
-- **Advanced Reasoning**: DeepSeek R1, OpenAI o3/o4 series
-- **Latest Generation**: GPT-5, Claude 4.1
-- **Specialized Models**: Code generation, reasoning, vision models
-- **Cost Optimization**: New preview and lite models for efficiency
+
+-   **Advanced Reasoning**: DeepSeek R1, OpenAI o3/o4 series
+-   **Latest Generation**: GPT-5, Claude 4.1
+-   **Specialized Models**: Code generation, reasoning, vision models
+-   **Cost Optimization**: New preview and lite models for efficiency
 
 ## Files Modified
-- `vtcode-core/src/config/models.rs` (partial)
-- `vtcode-core/src/config/constants.rs` (complete)
-- `vtcode.toml.example` (complete)
-- Various utility files need model name updates
+
+-   `vtcode-core/src/config/models.rs` (partial)
+-   `vtcode-core/src/config/constants.rs` (complete)
+-   `vtcode.toml.example` (complete)
+-   Various utility files need model name updates
 
 ## Next Steps
+
 1. **Immediate Fix** (5 minutes):
-   - Replace all `Gemini25FlashLite` references with `Gemini25FlashLitePreview0617`
-   - Add basic display names for compilation
+
+    - Replace all `Gemini25FlashLite` references with `Gemini25FlashLitePreview0617`
+    - Add basic display names for compilation
 
 2. **Provider Implementation** (10 minutes):
-   - Add DeepSeek to LLM client factory
-   - Add basic provider routing
+
+    - Add DeepSeek to LLM client factory
+    - Add basic provider routing
 
 3. **Complete Model Registry** (15 minutes):
-   - Update methods with all 31 new models
-   - Add proper error handling and validation
+    - Update methods with all 31 new models
+    - Add proper error handling and validation
 
 ## Success Metrics
-- **67 models** successfully defined and configured
-- **1 new provider** (DeepSeek) integrated
-- **All existing providers** updated with latest models
-- **Complete metadata** for all models (names, descriptions, generations)
+
+-   **67 models** successfully defined and configured
+-   **1 new provider** (DeepSeek) integrated
+-   **All existing providers** updated with latest models
+-   **Complete metadata** for all models (names, descriptions, generations)
@@ -10,14 +10,14 @@ pub mod models {
     pub mod google {
         pub const DEFAULT_MODEL: &str = "gemini-2.5-flash-lite-preview-06-17";
         pub const SUPPORTED_MODELS: &[&str] = &[
-            "gemini-2.0-flash-exp",
-            "gemini-2.0-flash-001",
+            "gemini-2.5-flash",
+            "gemini-2.5-flash-lite",
             "gemini-2.5-pro",
         ];
 
         // Convenience constants for commonly used models
-        pub const GEMINI_2_5_FLASH_LITE: &str = "gemini-2.0-flash-exp";
-        pub const GEMINI_2_5_FLASH: &str = "gemini-2.0-flash-001";
+        pub const GEMINI_2_5_FLASH_LITE: &str = "gemini-2.5-flash-lite";
+        pub const GEMINI_2_5_FLASH: &str = "gemini-2.5-flash";
         pub const GEMINI_2_5_PRO: &str = "gemini-2.5-pro";
     }
 
 
@@ -28,7 +28,7 @@
 //!
 //! [llm.providers.gemini]
 //! api_key = "your-key"
-//! model = "gemini-2.0-flash-exp"
+//! model = "gemini-2.5-flash"
 //!
 //! [security]
 //! workspace_root = "/path/to/project"