feat: Improve Mistral and Qwen25 function call parsing #6597
Merged
- **ebnf_composer.py**: Refactor `build_ebnf()` method to use clearer parameter names:
  - Replace `bot_token`/`eot_token` with `sequence_start_token`/`sequence_end_token` for sequence-level wrapping
  - Add `individual_call_start_token`/`individual_call_end_token` for individual call wrapping
  - Rename `TOOL_CALLS_MAP` to `TOOL_CALL_MAP` and update logic to handle both sequence and individual call patterns
- **qwen25_detector.py**: Update to use new `individual_call_*` parameters for per-call token wrapping
- **mistral_detector.py**: Update parameter names and improve regex comment to clarify multiple tool call support
- **deepseekv3_detector.py**: Update to use new `sequence_*` parameters for sequence-level token wrapping
- **pythonic_detector.py**: Update to use new `sequence_*` parameters

This refactoring provides a clearer API distinction between tokens that wrap the entire sequence of tool calls versus tokens that wrap individual calls, enabling better support for multiple tool call formats across different model types.
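The distinction between sequence-level and per-call tokens can be illustrated with a small standalone sketch. This is not the `build_ebnf()` implementation from this PR: the helper name `build_tool_call_rule`, the `separator` parameter, the `call` rule name, and the `<tool_call>` start token are assumptions made for illustration. Only the four `sequence_*`/`individual_call_*` parameter names come from the commit message.

```python
import json
from typing import Optional


def build_tool_call_rule(
    call_rule: str = "call",
    sequence_start_token: Optional[str] = None,
    sequence_end_token: Optional[str] = None,
    individual_call_start_token: Optional[str] = None,
    individual_call_end_token: Optional[str] = None,
    separator: Optional[str] = None,
) -> str:
    """Hypothetical helper: compose a root EBNF rule for one or more tool calls."""

    def lit(token: str) -> str:
        # json.dumps yields a double-quoted literal with quotes/newlines escaped.
        return json.dumps(token)

    # Per-call wrapping: tokens that surround every individual call.
    unit_parts = []
    if individual_call_start_token:
        unit_parts.append(lit(individual_call_start_token))
    unit_parts.append(call_rule)
    if individual_call_end_token:
        unit_parts.append(lit(individual_call_end_token))
    unit = " ".join(unit_parts)

    # One call, followed by zero or more (optionally separated) further calls.
    sep_part = lit(separator) + " " if separator else ""
    repeated = f"{unit} ({sep_part}{unit})*"

    # Sequence-level wrapping: tokens that surround the whole list of calls.
    seq_parts = []
    if sequence_start_token:
        seq_parts.append(lit(sequence_start_token))
    seq_parts.append(repeated)
    if sequence_end_token:
        seq_parts.append(lit(sequence_end_token))
    return "root ::= " + " ".join(seq_parts)


# Per-call wrapping (Qwen-style; the start token here is an assumption).
qwen_rule = build_tool_call_rule(
    individual_call_start_token="<tool_call>",
    individual_call_end_token="</tool_call>",
)

# Sequence-level wrapping (Mistral-style array after a single marker).
mistral_rule = build_tool_call_rule(
    sequence_start_token="[TOOL_CALLS] [",
    sequence_end_token="]",
    separator=", ",
)
```

With per-call tokens the wrapper is repeated inside the repetition group, whereas sequence tokens appear exactly once around it. That is the difference the old `bot_token`/`eot_token` names could not express.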
- **base_format_detector.py**: Add `ends_with_partial_token()` method to detect partial bot tokens during streaming, improving buffer management and preventing premature buffer clearing
- **qwen25_detector.py**: Implement custom streaming parser with buffering to handle partial end tokens (`</tool_call>`) that are streamed character-by-character, preventing them from appearing in normal text output
- **mistral_detector.py**: Refactor JSON parsing to properly handle both single objects and arrays, improve error handling with logging, and remove deprecated `_clean_text()` method

These changes enhance the robustness of streaming function call detection across different model formats, particularly addressing issues where partial tokens were incorrectly processed or leaked into normal text output.
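The partial-token check described above can be sketched as a standalone function. This is an illustration, not the method added in `base_format_detector.py`: returning the matched prefix length and the `split_safe_text` helper are assumptions chosen for the example.

```python
from typing import Tuple


def ends_with_partial_token(buffer: str, token: str) -> int:
    """Return the length of the longest proper prefix of `token` that
    `buffer` ends with, or 0 if the buffer does not end mid-token."""
    # Try the longest candidate first; a full match is deliberately excluded.
    for length in range(min(len(buffer), len(token) - 1), 0, -1):
        if buffer.endswith(token[:length]):
            return length
    return 0


def split_safe_text(buffer: str, token: str) -> Tuple[str, str]:
    """Hypothetical helper: split a streaming buffer into text that is safe
    to emit now and a tail that must be held until more chunks arrive."""
    held = ends_with_partial_token(buffer, token)
    if held == 0:
        return buffer, ""
    return buffer[:-held], buffer[-held:]


# The tail "</to" could still grow into "</tool_call>", so it is held back.
safe, pending = split_safe_text("done</to", "</tool_call>")
```

Holding back only the ambiguous tail keeps latency low for ordinary text while ensuring a token that arrives one character at a time never leaks into the normal text output.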
Manual test for chang/tests/examples/test_tool_choice.py Mistral
The MistralDetector was failing to parse tool calls when the JSON content contained nested brackets (e.g., "[City Name]" within string values).

**Context**
- Regex pattern `r"\[TOOL_CALLS\] (\[.*?\])"` used non-greedy matching
- Would stop at first ']' encountered, even if inside a JSON string

**Changes**
- Replaced regex-based extraction with bracket counting algorithm
- New `_extract_json_array()` method properly handles:
  - Nested brackets within JSON strings
  - Escaped characters and quotes
  - Proper string boundary detection
- Add UT for MistralDetector
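A minimal sketch of the bracket-counting idea, assuming the array directly follows a `[TOOL_CALLS]` marker as in the regex above. The function name and signature are illustrative and are not the PR's `_extract_json_array()`.

```python
import json
import re
from typing import Optional


def extract_json_array(text: str, marker: str = "[TOOL_CALLS]") -> Optional[str]:
    """Return the first balanced JSON array after `marker`, or None."""
    start = text.find(marker)
    if start == -1:
        return None
    array_start = text.find("[", start + len(marker))
    if array_start == -1:
        return None

    depth = 0          # current nesting level of [ ... ]
    in_string = False  # True while inside a JSON string literal
    escaped = False    # True when the previous char was a backslash in a string
    for i in range(array_start, len(text)):
        ch = text[i]
        if in_string:
            if escaped:
                escaped = False      # this char is escaped; ignore it
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False    # unescaped quote closes the string
            continue                 # brackets inside strings are not counted
        if ch == '"':
            in_string = True
        elif ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth == 0:
                return text[array_start : i + 1]
    return None  # unbalanced or truncated input


sample = (
    '[TOOL_CALLS] [{"name": "f", "arguments": '
    '{"content": "[City Name] - \\"x\\""}}] trailing text'
)
calls = json.loads(extract_json_array(sample))

# For comparison, the old non-greedy regex cuts the match at the first ']'.
truncated = re.search(r"\[TOOL_CALLS\] (\[.*?\])", sample).group(1)
```

Tracking `in_string` and `escaped` is what separates this from plain bracket counting: a `]` or `\"` inside a string value must not change the depth or end the string early.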
- function_call_unit for pythonic and json should be the same, both are `function_call`
- Remove `TOOL_CALL_MAP` as pythonic and json should be the same.
zhyncs approved these changes May 26, 2025
Layssy pushed a commit to Layssy/sglang-iaas that referenced this pull request Jun 9, 2025
xwu-intel pushed a commit to xwu-intel/sglang that referenced this pull request Jun 17, 2025
Does tool call support image return or multi-modal return?


Motivation
This PR resolves parallel tool call parsing for `MistralDetector` and `Qwen25Detector`. See "Multiple Tool Call Support for MistralDetector and Qwen25Detector" for more details.

TL;DR: `bot_token` and `eot_token` describe a single function call, but EBNFComposer was designed to use `bot_token` and `eot_token` for the entire function call sequence.

Modifications
refactor EBNF composer API for multiple tool call support
- **ebnf_composer.py**: Refactor `build_ebnf()` method to use clearer parameter names:
  - Replace `bot_token`/`eot_token` with `sequence_start_token`/`sequence_end_token` for sequence-level wrapping
  - Add `individual_call_start_token`/`individual_call_end_token` for individual call wrapping
  - Rename `TOOL_CALLS_MAP` to `TOOL_CALL_MAP` and update logic to handle both sequence and individual call patterns
- **qwen25_detector.py**: Update to use new `individual_call_*` parameters for per-call token wrapping
- **mistral_detector.py**: Update parameter names and improve regex comment to clarify multiple tool call support
- **deepseekv3_detector.py**: Update to use new `sequence_*` parameters for sequence-level token wrapping
- **pythonic_detector.py**: Update to use new `sequence_*` parameters

improve streaming function call parsing and token handling in Qwen25Detector and MistralDetector
- **base_format_detector.py**: Add `ends_with_partial_token()` method to detect partial bot tokens during streaming, improving buffer management and preventing premature buffer clearing
- **qwen25_detector.py**: Implement custom streaming parser with buffering to handle partial end tokens (`</tool_call>`) that are streamed character-by-character, preventing them from appearing in normal text output
- **mistral_detector.py**: Refactor JSON parsing to properly handle both single objects and arrays, improve error handling with logging, and remove deprecated `_clean_text()` method

Example tool call with brackets inside a string value (`[City Name]`):
`{"name":"make_next_step_decision", "arguments":{"decision":"","content":"\nTOOL: Access a weather API or service\nOBSERVATION: Retrieve the current weather data for the top 5 populated cities in the US\nANSWER: The weather in the top 5 populated cities in the US is as follows: [City Name] - [Weather Conditions] - [Temperature]\n"}`

Checklist