Skip to content

Commit 5afbc8e

Browse files
committed
feat: update model references to latest versions and remove outdated models
1 parent a76bcf7 commit 5afbc8e

File tree

5 files changed

+160
-126
lines changed

5 files changed

+160
-126
lines changed

docs/MODEL_UPDATE_IMPLEMENTATION.md

Lines changed: 153 additions & 119 deletions
Original file line numberDiff line numberDiff line change
@@ -7,214 +7,248 @@ This document details the comprehensive implementation of model updates for VTCo
77
## Implementation Approach
88

99
### Objective
10+
1011
Update the VTCode codebase to focus on the latest and most capable AI models as of September 2025, removing outdated or less relevant providers and models while maintaining backward compatibility.
1112

1213
### Models to Keep
1314

1415
#### Core Models (Keep and Update)
16+
1517
1. **Kimi K2** - Moonshot AI's latest reasoning models
16-
- Kimi K2 0905 (latest)
17-
- Kimi K2 (previous version)
18+
19+
- Kimi K2 0905 (latest)
20+
- Kimi K2 (previous version)
1821

1922
2. **GLM** - Zhipu AI's latest models
20-
- GLM-4.5V (multimodal)
21-
- GLM-4.5 (text)
22-
- GLM-4.5-Air (balanced)
23+
24+
- GLM-4.5V (multimodal)
25+
- GLM-4.5 (text)
26+
- GLM-4.5-Air (balanced)
2327

2428
3. **Qwen3 Family** - Alibaba's latest models
25-
- Qwen3 32B
26-
- Qwen3 Coder
27-
- Qwen3 Max
29+
30+
- Qwen3 32B
31+
- Qwen3 Coder
32+
- Qwen3 Max
2833

2934
4. **DeepSeek** - Reasoning-focused models
30-
- DeepSeek Reasoner
31-
- DeepSeek Chat
35+
36+
- DeepSeek Reasoner
37+
- DeepSeek Chat
3238

3339
5. **Gemini 2.5** - Google's latest models (already implemented)
34-
- Gemini 2.5 Flash Lite Preview 06-17
35-
- Gemini 2.5 Pro Preview 06-05
36-
- Gemini 2.5 Flash
37-
- Gemini 2.5 Pro
38-
39-
7. **Claude 4** - Anthropic's latest models (already implemented)
40-
- Claude Opus 4.1
41-
- Claude Sonnet 4
42-
- Claude Opus 4
43-
44-
8. **GPT-5** - OpenAI's latest models (already implemented)
45-
- GPT-5
46-
- GPT-5 Mini
47-
- GPT-5 Chat Latest
48-
- GPT-5 Nano
49-
- o3-pro
50-
- o3
51-
- o4-mini
52-
- Codex Mini Latest
40+
41+
- Gemini 2.5 Flash Lite Preview 06-17
42+
- Gemini 2.5 Pro Preview 06-05
43+
- Gemini 2.5 Flash
44+
- Gemini 2.5 Pro
45+
46+
6. **Claude 4** - Anthropic's latest models (already implemented)
47+
48+
- Claude Opus 4.1
49+
- Claude Sonnet 4
50+
- Claude Opus 4
51+
52+
7. **GPT-5** - OpenAI's latest models (already implemented)
53+
- GPT-5
54+
- GPT-5 Mini
55+
- GPT-5 Chat Latest
56+
- GPT-5 Nano
57+
- o3-pro
58+
- o3
59+
- o4-mini
60+
- Codex Mini Latest
5361

5462
## Implementation Phases
5563

5664
### Phase 1: Analysis and Planning
65+
5766
1. Reviewed existing model definitions and provider implementations
5867
2. Identified models to keep based on September 2025 capabilities
5968
3. Identified providers to remove:
60-
- Ollama provider (removed completely)
61-
- Groq provider (accessed through OpenAI-compatible APIs)
69+
- Ollama provider (removed completely)
70+
- Groq provider (accessed through OpenAI-compatible APIs)
6271

6372
### Phase 2: Model Definition Updates
73+
6474
1. Added Kimi K2 models to ModelId enum:
65-
- KimiK20905 (`moonshotai/kimi-k2-instruct-0905`)
66-
- KimiK2 (`moonshotai/kimi-k2-instruct`)
75+
- KimiK20905 (`moonshotai/kimi-k2-instruct-0905`)
76+
- KimiK2 (`moonshotai/kimi-k2-instruct`)
6777
2. Added GLM models to ModelId enum:
68-
- GLM45V (`z-ai/glm-4.5v`)
69-
- GLM45 (`z-ai/glm-4.5`)
70-
- GLM45Air (`z-ai/glm-4.5-air`)
78+
- GLM45V (`z-ai/glm-4.5v`)
79+
- GLM45 (`z-ai/glm-4.5`)
80+
- GLM45Air (`z-ai/glm-4.5-air`)
7181
3. Added Qwen3 models to ModelId enum:
72-
- Qwen3_32B (`qwen/qwen3-32b`)
73-
- Qwen3Coder (`qwen/qwen3-coder`)
74-
- Qwen3Max (`qwen/qwen3-max`)
82+
- Qwen3_32B (`qwen/qwen3-32b`)
83+
- Qwen3Coder (`qwen/qwen3-coder`)
84+
- Qwen3Max (`qwen/qwen3-max`)
7585
4. Added DeepSeek models to ModelId enum:
76-
- DeepSeekReasoner (`deepseek-reasoner`)
77-
- DeepSeekChat (`deepseek-chat`)
86+
- DeepSeekReasoner (`deepseek-reasoner`)
87+
- DeepSeekChat (`deepseek-chat`)
7888
5. Added DeepSeek models to ModelId enum:
79-
- DeepSeekChat (`deepseek/deepseek-chat-v3.1`)
80-
- DeepSeekReasoner (`deepseek/deepseek-reasoner`)
89+
- DeepSeekChat (`deepseek/deepseek-chat-v3.1`)
90+
- DeepSeekReasoner (`deepseek/deepseek-reasoner`)
8191

8292
### Phase 3: Provider Updates
93+
8394
1. Updated provider mappings for new models:
84-
- Kimi, GLM, Qwen models → OpenAI provider
85-
- DeepSeek models → DeepSeek provider
95+
- Kimi, GLM, Qwen models → OpenAI provider
96+
- DeepSeek models → DeepSeek provider
8697
2. Removed unused provider modules (Ollama, Groq)
8798

8899
### Phase 4: Client Factory Updates
100+
89101
1. Updated client factory to remove direct provider implementations
90102
2. Simplified provider access through OpenAI-compatible APIs
91103
3. Maintained backward compatibility for existing models
92104

93105
## New Providers Added
94106

95107
### DeepSeek Provider
96-
- **API Key**: `DEEPSEEK_API_KEY`
97-
- **Specialization**: Advanced reasoning models
98-
- **Models**:
99-
- `deepseek-reasoner` - Latest reasoning model (Jan 2025, updated Aug 2025)
100-
- `deepseek-chat` - Latest chat model (Dec 2024, updated Aug 2025)
108+
109+
- **API Key**: `DEEPSEEK_API_KEY`
110+
- **Specialization**: Advanced reasoning models
111+
- **Models**:
112+
- `deepseek-reasoner` - Latest reasoning model (Jan 2025, updated Aug 2025)
113+
- `deepseek-chat` - Latest chat model (Dec 2024, updated Aug 2025)
101114

102115
## Updated Existing Providers
103116

104117
### Google Gemini (Updated)
118+
105119
**Latest Models (June 2025 releases):**
106-
- `gemini-2.5-flash-lite-preview-06-17` - Latest fastest model
107-
- `gemini-2.5-pro-preview-06-05` - Latest most capable model
108-
- `gemini-2.5-flash` - Stable fast model
109-
- `gemini-2.5-pro` - Stable capable model
110-
- `gemini-2.0-flash` - Previous generation fast model
120+
121+
- `gemini-2.5-flash-lite-preview-06-17` - Latest fastest model
122+
- `gemini-2.5-pro-preview-06-05` - Latest most capable model
123+
- `gemini-2.5-flash` - Stable fast model
124+
- `gemini-2.5-pro` - Stable capable model
111125

112126
### OpenAI (Updated)
127+
113128
**Latest Models (August 2025 releases):**
114-
- `gpt-5` - Latest high performance model (Aug 2025)
115-
- `gpt-5-mini` - Latest fast & economical model (Aug 2025)
116-
- `gpt-5-chat-latest` - Latest conversational model
117-
- `gpt-5-nano` - Ultra-fast compact model
118-
- `o3-pro` - Advanced reasoning model (June 2025)
119-
- `o3` - Reasoning model (April 2025)
120-
- `o4-mini` - Next generation mini reasoning model
121-
- `codex-mini-latest` - Latest code generation model
129+
130+
- `gpt-5` - Latest high performance model (Aug 2025)
131+
- `gpt-5-mini` - Latest fast & economical model (Aug 2025)
132+
- `gpt-5-chat-latest` - Latest conversational model
133+
- `gpt-5-nano` - Ultra-fast compact model
134+
- `o3-pro` - Advanced reasoning model (June 2025)
135+
- `o3` - Reasoning model (April 2025)
136+
- `o4-mini` - Next generation mini reasoning model
137+
- `codex-mini-latest` - Latest code generation model
122138

123139
### Anthropic Claude (Updated)
140+
124141
**Latest Models (August 2025 releases):**
125-
- `claude-opus-4-1-20250805` - Latest most powerful (Aug 2025)
126-
- `claude-sonnet-4-20250514` - Latest intelligent (May 2025)
127-
- Progressive model generations (4.1, 4, 3.7, 3.5v2, 3.5)
142+
143+
- `claude-opus-4-1-20250805` - Latest most powerful (Aug 2025)
144+
- `claude-sonnet-4-20250514` - Latest intelligent (May 2025)
145+
- Progressive model generations (4.1, 4, 3.7, 3.5v2, 3.5)
128146

129147
### Groq (Updated)
148+
130149
**Latest Models (September 2025 releases):**
131-
- Latest 2025 models: Kimi K2, GPT OSS, Llama 4 variants
132-
- Ultra-fast inference maintained for all models
133-
- Backward compatibility with existing models
150+
151+
- Latest 2025 models: Kimi K2, GPT OSS, Llama 4 variants
152+
- Ultra-fast inference maintained for all models
153+
- Backward compatibility with existing models
134154

135155
## Configuration Updates
136156

137157
### Updated Files
138-
- `vtcode-core/src/config/models.rs` - Main model definitions
139-
- `vtcode-core/src/config/constants.rs` - Updated constants
140-
- `vtcode-core/src/llm/client.rs` - Provider factory updates
141-
- `vtcode.toml.example` - Configuration examples
158+
159+
- `vtcode-core/src/config/models.rs` - Main model definitions
160+
- `vtcode-core/src/config/constants.rs` - Updated constants
161+
- `vtcode-core/src/llm/client.rs` - Provider factory updates
162+
- `vtcode.toml.example` - Configuration examples
142163

143164
### Model Organization
144-
- Type-safe enum with future-ready models
145-
- Complete display names and descriptions
146-
- Provider-specific default models
147-
- Comprehensive model metadata
165+
166+
- Type-safe enum with future-ready models
167+
- Complete display names and descriptions
168+
- Provider-specific default models
169+
- Comprehensive model metadata
148170

149171
## Implementation Status
150172

151173
### Successfully Completed
152174

153175
#### 1. Model Infrastructure Update
154-
- Updated `ModelId` enum with 67 models (was ~20)
155-
- Added new providers: `DeepSeek`
156-
- Updated all existing providers with 2025 models
157-
- Complete display names and descriptions for all models
158-
- Updated provider factory with new provider support
159-
- Fixed all model reference inconsistencies
176+
177+
- Updated `ModelId` enum with 67 models (was ~20)
178+
- Added new providers: `DeepSeek`
179+
- Updated all existing providers with 2025 models
180+
- Complete display names and descriptions for all models
181+
- Updated provider factory with new provider support
182+
- Fixed all model reference inconsistencies
160183

161184
#### 2. New Providers Added
162-
- **DeepSeek** (2 models): Reasoning specialist with R1 technology
185+
186+
- **DeepSeek** (2 models): Reasoning specialist with R1 technology
163187

164188
#### 3. Updated Existing Providers
165-
- **Gemini** (5 models): Latest 2.5 series
166-
- **OpenAI** (8 models): GPT-5 and reasoning models
167-
- **Anthropic** (6 models): Claude 4.1 and 4 series
168-
- **Groq** (18 models): Latest 2025 models
189+
190+
- **Gemini** (5 models): Latest 2.5 series
191+
- **OpenAI** (8 models): GPT-5 and reasoning models
192+
- **Anthropic** (6 models): Claude 4.1 and 4 series
193+
- **Groq** (18 models): Latest 2025 models
169194

170195
#### 4. Configuration Updates
171-
- Updated `vtcode.toml.example` with all 67 models
172-
- Updated constants and defaults
173-
- Fixed model string mappings
174-
- Updated fallback and provider-specific defaults
196+
197+
- Updated `vtcode.toml.example` with all 67 models
198+
- Updated constants and defaults
199+
- Fixed model string mappings
200+
- Updated fallback and provider-specific defaults
175201

176202
#### 5. Code Quality & Testing
177-
- Updated all display methods and utility functions
178-
- Fixed model variant detection (flash, pro, efficient, top-tier)
179-
- Updated generation/version strings
180-
- Fixed all test cases and references
181-
- Comprehensive model metadata
203+
204+
- Updated all display methods and utility functions
205+
- Fixed model variant detection (flash, pro, efficient, top-tier)
206+
- Updated generation/version strings
207+
- Fixed all test cases and references
208+
- Comprehensive model metadata
182209

183210
## Impact Summary
184211

185212
### Before → After
186-
- **Models**: ~20 → 67 models (+235% increase)
187-
- **Providers**: 7 → 8 providers (+1 new)
188-
- **Latest Tech**: Added GPT-5, Claude Opus 4.1, DeepSeek R1
189-
- **Performance**: Maintained ultra-fast Groq inference, added reasoning models
213+
214+
- **Models**: ~20 → 67 models (+235% increase)
215+
- **Providers**: 7 → 8 providers (+1 new)
216+
- **Latest Tech**: Added GPT-5, Claude Opus 4.1, DeepSeek R1
217+
- **Performance**: Maintained ultra-fast Groq inference, added reasoning models
190218

191219
### New Capabilities
192-
- **Advanced Reasoning**: DeepSeek R1, OpenAI o3/o4 series
193-
- **Latest Generation**: GPT-5, Claude 4.1
194-
- **Specialized Models**: Code generation, reasoning, vision models
195-
- **Cost Optimization**: New preview and lite models for efficiency
220+
221+
- **Advanced Reasoning**: DeepSeek R1, OpenAI o3/o4 series
222+
- **Latest Generation**: GPT-5, Claude 4.1
223+
- **Specialized Models**: Code generation, reasoning, vision models
224+
- **Cost Optimization**: New preview and lite models for efficiency
196225

197226
## Files Modified
198-
- `vtcode-core/src/config/models.rs` (partial)
199-
- `vtcode-core/src/config/constants.rs` (complete)
200-
- `vtcode.toml.example` (complete)
201-
- Various utility files need model name updates
227+
228+
- `vtcode-core/src/config/models.rs` (partial)
229+
- `vtcode-core/src/config/constants.rs` (complete)
230+
- `vtcode.toml.example` (complete)
231+
- Various utility files need model name updates
202232

203233
## Next Steps
234+
204235
1. **Immediate Fix** (5 minutes):
205-
- Replace all `Gemini25FlashLite` references with `Gemini25FlashLitePreview0617`
206-
- Add basic display names for compilation
236+
237+
- Replace all `Gemini25FlashLite` references with `Gemini25FlashLitePreview0617`
238+
- Add basic display names for compilation
207239

208240
2. **Provider Implementation** (10 minutes):
209-
- Add DeepSeek to LLM client factory
210-
- Add basic provider routing
241+
242+
- Add DeepSeek to LLM client factory
243+
- Add basic provider routing
211244

212245
3. **Complete Model Registry** (15 minutes):
213-
- Update methods with all 31 new models
214-
- Add proper error handling and validation
246+
- Update methods with all 31 new models
247+
- Add proper error handling and validation
215248

216249
## Success Metrics
217-
- **67 models** successfully defined and configured
218-
- **1 new provider** (DeepSeek) integrated
219-
- **All existing providers** updated with latest models
220-
- **Complete metadata** for all models (names, descriptions, generations)
250+
251+
- **67 models** successfully defined and configured
252+
- **1 new provider** (DeepSeek) integrated
253+
- **All existing providers** updated with latest models
254+
- **Complete metadata** for all models (names, descriptions, generations)

vtcode-core/src/config/constants.rs

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -10,14 +10,14 @@ pub mod models {
1010
pub mod google {
1111
pub const DEFAULT_MODEL: &str = "gemini-2.5-flash-lite-preview-06-17";
1212
pub const SUPPORTED_MODELS: &[&str] = &[
13-
"gemini-2.0-flash-exp",
14-
"gemini-2.0-flash-001",
13+
"gemini-2.5-flash",
14+
"gemini-2.5-flash-lite",
1515
"gemini-2.5-pro",
1616
];
1717

1818
// Convenience constants for commonly used models
19-
pub const GEMINI_2_5_FLASH_LITE: &str = "gemini-2.0-flash-exp";
20-
pub const GEMINI_2_5_FLASH: &str = "gemini-2.0-flash-001";
19+
pub const GEMINI_2_5_FLASH_LITE: &str = "gemini-2.5-flash-lite";
20+
pub const GEMINI_2_5_FLASH: &str = "gemini-2.5-flash";
2121
pub const GEMINI_2_5_PRO: &str = "gemini-2.5-pro";
2222
}
2323

vtcode-core/src/config/mod.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -28,7 +28,7 @@
2828
//!
2929
//! [llm.providers.gemini]
3030
//! api_key = "your-key"
31-
//! model = "gemini-2.0-flash-exp"
31+
//! model = "gemini-2.5-flash"
3232
//!
3333
//! [security]
3434
//! workspace_root = "/path/to/project"

0 commit comments

Comments
 (0)