feat: Improve Mistral and Qwen25 function call parsing by CatherineSue · Pull Request #6597 · sgl-project/sglang

CatherineSue · 2025-05-25T18:28:58Z

Motivation

This PR fixes parallel tool call parsing for MistralDetector and Qwen25Detector.
See "Multiple Tool Call Support for MistralDetector and Qwen25Detector" for more details.

TL;DR:

Qwen25Detector's bot_token and eot_token wrap a single function call, but EBNFComposer was designed to use bot_token and eot_token to wrap the entire function call sequence.
MistralDetector doesn't support parallel tool call parsing yet.

Modifications

refactor EBNF composer API for multiple tool call support

ebnf_composer.py: Refactor build_ebnf() method to use clearer
parameter names:
- Replace bot_token/eot_token with
  sequence_start_token/sequence_end_token for sequence-level wrapping
- Add individual_call_start_token/individual_call_end_token for
  individual call wrapping
- Rename TOOL_CALLS_MAP to TOOL_CALL_MAP and update logic to handle
  both sequence and individual call patterns
qwen25_detector.py: Update to use new individual_call_* parameters
for per-call token wrapping
mistral_detector.py: Update parameter names and improve regex comment
to clarify multiple tool call support
deepseekv3_detector.py: Update to use new sequence_* parameters
for sequence-level token wrapping
pythonic_detector.py: Update to use new sequence_* parameters
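
To illustrate the distinction between sequence-level and per-call wrapping, here is a minimal sketch. It is not the actual `build_ebnf()` implementation: the helper `build_root_rule`, its `separator` parameter, the `function_call` rule name as used here, and the example token strings (including the `<tool_call>` start token) are assumptions made for illustration. Only the four `sequence_*` / `individual_call_*` parameter names come from this PR.

```python
def quote(token: str) -> str:
    """Render a token as a double-quoted EBNF string literal."""
    escaped = token.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def build_root_rule(
    sequence_start_token: str = "",
    sequence_end_token: str = "",
    individual_call_start_token: str = "",
    individual_call_end_token: str = "",
    separator: str = "",  # hypothetical parameter, not taken from the PR
) -> str:
    """Build a root rule that accepts one or more tool calls."""
    # A single call: optional per-call wrappers around the shared rule.
    call_parts = []
    if individual_call_start_token:
        call_parts.append(quote(individual_call_start_token))
    call_parts.append("function_call")
    if individual_call_end_token:
        call_parts.append(quote(individual_call_end_token))
    call = " ".join(call_parts)

    # Every further call is preceded by the optional separator.
    repeat = " ".join(([quote(separator)] if separator else []) + [call])
    body = f"{call} ( {repeat} )*"

    # Optional wrappers around the whole sequence of calls.
    parts = []
    if sequence_start_token:
        parts.append(quote(sequence_start_token))
    parts.append(body)
    if sequence_end_token:
        parts.append(quote(sequence_end_token))
    return "root ::= " + " ".join(parts)


# Per-call wrapping (Qwen2.5-like): each call carries its own start/end token.
qwen_rule = build_root_rule(
    individual_call_start_token="<tool_call>",
    individual_call_end_token="</tool_call>",
    separator="\n",
)

# Sequence-level wrapping (Mistral-like): one wrapper around all calls.
mistral_rule = build_root_rule(
    sequence_start_token="[TOOL_CALLS] [",
    sequence_end_token="]",
    separator=", ",
)
```

With per-call wrapping the start and end tokens repeat for every call, whereas with sequence-level wrapping they appear exactly once around the whole list.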

improve streaming function call parsing and token handling in Qwen25Detector and MistralDetector

base_format_detector.py: Add ends_with_partial_token() method to detect
partial bot tokens during streaming, improving buffer management and preventing
premature buffer clearing
qwen25_detector.py: Implement custom streaming parser with buffering to
handle partial end tokens (</tool_call>) that are streamed character-by-character,
preventing them from appearing in normal text output
mistral_detector.py:
- Refactor JSON parsing to properly handle both single
  objects and arrays, improve error handling with logging, and remove deprecated
  _clean_text() method
- Improve JSON parsing when there are nested brackets in the arguments, such
  as {"name":"make_next_step_decision", "arguments":{"decision":"","content":"\nTOOL: Access a weather API or service\nOBSERVATION: Retrieve the current weather data for the top 5 populated cities in the US\nANSWER: The weather in the top 5 populated cities in the US is as follows: [City Name] - [Weather Conditions] - [Temperature]\n"}
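
The partial-token handling described above can be sketched as follows. This is a standalone illustration, not the code added in this PR: the free function below borrows the name `ends_with_partial_token` from the description, but its signature and integer return value are assumptions, and `EndTokenFilter` is a hypothetical helper.

```python
def ends_with_partial_token(buffer: str, token: str) -> int:
    """Return the length of the longest proper prefix of `token`
    that `buffer` ends with, or 0 if there is none."""
    for length in range(min(len(buffer), len(token) - 1), 0, -1):
        if buffer.endswith(token[:length]):
            return length
    return 0


class EndTokenFilter:
    """Hypothetical helper: strip an end token from streamed text, even
    when the token arrives split across several chunks."""

    def __init__(self, token: str = "</tool_call>") -> None:
        self.token = token
        self.buffer = ""

    def feed(self, chunk: str) -> str:
        """Add a chunk and return the text that is safe to emit now."""
        self.buffer += chunk
        # Drop any complete end tokens.
        self.buffer = self.buffer.replace(self.token, "")
        # Hold back a tail that could still grow into the end token.
        held = ends_with_partial_token(self.buffer, self.token)
        cut = len(self.buffer) - held
        emit, self.buffer = self.buffer[:cut], self.buffer[cut:]
        return emit

    def flush(self) -> str:
        """Return whatever is still held once the stream has ended."""
        emit, self.buffer = self.buffer, ""
        return emit


# Stream the text one character at a time, as a model might.
stream_filter = EndTokenFilter()
streamed = "".join(stream_filter.feed(ch) for ch in "done</tool_call> ok")
streamed += stream_filter.flush()
```

The filter only delays text that might be the start of the end token. A lone `<` that turns out to be ordinary text is released as soon as the next chunk rules out a match.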

Checklist

Format your code according to the Code Formatting with Pre-Commit.
Add unit tests as outlined in the Running Unit Tests.
Update documentation / docstrings / example tutorials as needed, according to Writing Documentation.
Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to Benchmark and Profiling and Accuracy Results.
For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.
Please feel free to join our Slack channel at https://slack.sglang.ai to discuss your PR.

- **ebnf_composer.py**: Refactor `build_ebnf()` method to use clearer parameter names:
  - Replace `bot_token`/`eot_token` with `sequence_start_token`/`sequence_end_token` for sequence-level wrapping
  - Add `individual_call_start_token`/`individual_call_end_token` for individual call wrapping
  - Rename `TOOL_CALLS_MAP` to `TOOL_CALL_MAP` and update logic to handle both sequence and individual call patterns
- **qwen25_detector.py**: Update to use new `individual_call_*` parameters for per-call token wrapping
- **mistral_detector.py**: Update parameter names and improve regex comment to clarify multiple tool call support
- **deepseekv3_detector.py**: Update to use new `sequence_*` parameters for sequence-level token wrapping
- **pythonic_detector.py**: Update to use new `sequence_*` parameters

This refactoring provides a clearer API distinction between tokens that wrap the entire sequence of tool calls versus tokens that wrap individual calls, enabling better support for multiple tool call formats across different model types.

- **base_format_detector.py**: Add `ends_with_partial_token()` method to detect partial bot tokens during streaming, improving buffer management and preventing premature buffer clearing
- **qwen25_detector.py**: Implement custom streaming parser with buffering to handle partial end tokens (`</tool_call>`) that are streamed character-by-character, preventing them from appearing in normal text output
- **mistral_detector.py**: Refactor JSON parsing to properly handle both single objects and arrays, improve error handling with logging, and remove deprecated `_clean_text()` method

These changes enhance the robustness of streaming function call detection across different model formats, particularly addressing issues where partial tokens were incorrectly processed or leaked into normal text output.

CatherineSue · 2025-05-25T18:32:58Z

Manual test for chang/tests/examples/test_tool_choice.py

Mistral
mistralai/Mistral-7B-Instruct-v0.3 seems flaky when making multiple tool calls.

Qwen25

The MistralDetector was failing to parse tool calls when the JSON content contained nested brackets (e.g., "[City Name]" within string values).

**Context**
- Regex pattern `r"\[TOOL_CALLS\] (\[.*?\])"` used non-greedy matching
- Would stop at first ']' encountered, even if inside a JSON string

**Changes**
- Replaced regex-based extraction with bracket counting algorithm
- New `_extract_json_array()` method properly handles:
  - Nested brackets within JSON strings
  - Escaped characters and quotes
  - Proper string boundary detection
- Add UT for MistralDetector
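
A minimal sketch of the bracket counting idea, compared with the non-greedy regex, is shown below. It is an illustration written for this discussion, not the `_extract_json_array()` method from the PR: the function name `extract_json_array`, its signature, and the sample input are assumptions.

```python
import json
import re
from typing import Optional


def extract_json_array(text: str, marker: str = "[TOOL_CALLS] ") -> Optional[str]:
    """Return the JSON array that follows `marker`, or None if it is
    missing or incomplete. Brackets inside JSON strings are ignored."""
    start = text.find(marker)
    if start == -1:
        return None
    begin = start + len(marker)
    if begin >= len(text) or text[begin] != "[":
        return None

    depth = 0          # current nesting level of [ ... ]
    in_string = False  # True while inside a JSON string literal
    escaped = False    # True right after a backslash inside a string
    for pos in range(begin, len(text)):
        ch = text[pos]
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth == 0:
                return text[begin : pos + 1]
    return None  # the array never closed


sample = (
    '[TOOL_CALLS] [{"name": "f", "arguments": '
    '{"content": "[City Name] - \\"quoted\\" ]"}}] trailing text'
)

# The non-greedy regex stops at the first "]", inside the string value.
naive = re.search(r"\[TOOL_CALLS\] (\[.*?\])", sample).group(1)

# Bracket counting skips brackets and escaped quotes inside strings.
robust = extract_json_array(sample)
calls = json.loads(robust)
```

Here `naive` is cut off after `[City Name]` and is not valid JSON, while `robust` parses cleanly with `json.loads`.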


MooMoo-Yang · 2025-07-28T13:56:55Z

Does tool call support image return or multi-modal return?

CatherineSue added 3 commits May 25, 2025 11:18

Add TODO in structure_info()

d7f5f1a

CatherineSue requested review from ByronHsu, Ying1123, hnyls2002, ispobock, merrymercy and zhyncs as code owners May 25, 2025 18:28

zhyncs added the high priority label May 25, 2025

zhyncs assigned zhyncs, JustinTong0323 and ispobock May 25, 2025

CatherineSue added 2 commits May 25, 2025 11:45

Fix lint

0e4f78a

CatherineSue requested a review from zhaochenyang20 as a code owner May 25, 2025 18:49

CatherineSue mentioned this pull request May 25, 2025

[Feature] Tool Call Roadmap #6589

Closed

7 tasks

Fix pythonic function_call_unit in build_ebnf

cb6f6e1

- function_call_unit for pythonic and json should be the same, both are `function_call` - Remove `TOOL_CALL_MAP` as pythonic and json should be the same.

seongjiko mentioned this pull request May 26, 2025

[Bug] incorret tool_calls index when there are multi tool_calls and in stream mode #6310

Closed

5 tasks

JustinTong0323 mentioned this pull request May 26, 2025

[New Model] Devstral support #6547

Merged

6 tasks

zhyncs approved these changes May 26, 2025


Merge branch 'main' into chang/tool-call-mistral-qwen25

9e14465

zhyncs merged commit 16f69b1 into main May 26, 2025
0 of 36 checks passed

zhyncs deleted the chang/tool-call-mistral-qwen25 branch May 26, 2025 06:07

Layssy pushed a commit to Layssy/sglang-iaas that referenced this pull request Jun 9, 2025

feat: Improve Mistral and Qwen25 function call parsing (sgl-project#6597)

07fb14e

xwu-intel pushed a commit to xwu-intel/sglang that referenced this pull request Jun 17, 2025

feat: Improve Mistral and Qwen25 function call parsing (sgl-project#6597)

7dff701


zhyncs merged 7 commits into main from chang/tool-call-mistral-qwen25


5 participants
