Skip to content

Model Gateway 2.3 Release tracker #13099

@slin1237

Description

@slin1237

🚀 SGLang Model Gateway - New Release!

✨ Headline Features

⚡ Bucket Mode Routing - 20-30% Performance Boost

Introducing our new bucket-based routing algorithm that dramatically improves performance in prefix-disabled (PD) mode. See up to 20-30% improvements in TTFT (Time To First Token) and overall throughput – making your inference workloads faster and more efficient than ever!

💾 PostgreSQL Support for Chat History Management

Flexibility in data storage! We now support PostgreSQL alongside OracleDB and in-memory storage for chat history management. Choose the database solution that best fits your infrastructure and scale requirements.

🛠️ Enhanced Tool & Structured Output Support

  • MinMax M2 model reasoning and function calling support
  • Structured model output for OpenAI and gRPC router
  • Streaming parsing with Tool Choice in chat completions API
  • Tool_choice support for Responses API
  • OutputItemDone events with output item array storage for better observability

🐛 Stability & Quality Improvements

Multiple bug fixes for model validation, streaming logic, reasoning content indexing, and CI stability enhancements.

🔧 Code Quality Enhancements

Refactored builders for chat and responses, restructured modules for better maintainability, and consolidated error handling.


Features

Bug Fixes

Enhancement

Metadata

Metadata

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions