[Tool Call] Streamline function arguments when tool_choice="auto" for deepseekv31_detector #11589

Muqi1029 · 2025-10-14T05:59:03Z

Motivation

The current deepseekv31_detector only detects complete function arguments and immediately returns them while clearing the buffer.

This approach can lead to a poor user experience: when function arguments are very long, there can be a noticeable delay before the arguments are returned.

Modifications

Relax the previous strict regular expression so that the tool call end token is no longer required for a match.

The pattern now matches either the tool call end token or the end of the string, captured as a separate group.

The argument group is matched non-greedily, so it stops at the first end token instead of consuming the text that follows it.
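
As a rough illustration of the relaxed match described above, here is a minimal sketch. It is not the detector's actual code: the `BEGIN`, `SEP`, and `END` strings are placeholder tokens chosen for readability, and `parse_partial` is a hypothetical helper name.

```python
import re

# Placeholder tokens (assumptions for this sketch); the real detector
# uses the model's own special tokens.
BEGIN, SEP, END = "<call_begin>", "<sep>", "<call_end>"

# Strict form: matches only after the end token has been generated.
STRICT = re.compile(
    re.escape(BEGIN) + r"(.*?)" + re.escape(SEP) + r"(.*?)" + re.escape(END),
    re.DOTALL,
)

# Relaxed form: the non-greedy argument group is followed by a separate
# group that is either the end token or the end of the string (\Z).
RELAXED = re.compile(
    re.escape(BEGIN) + r"(.*?)" + re.escape(SEP) + r"(.*?)(" + re.escape(END) + r"|\Z)",
    re.DOTALL,
)


def parse_partial(text):
    """Return (name, args_so_far, is_tool_end), or None if no call has started."""
    m = RELAXED.search(text)
    if m is None:
        return None
    name, args, tail = m.group(1), m.group(2), m.group(3)
    return name, args, tail == END


partial = BEGIN + "get_weather" + SEP + '{"city": "Bos'
print(STRICT.search(partial))   # None: the strict pattern still waits for END
print(parse_partial(partial))   # ('get_weather', '{"city": "Bos', False)
```

The third group is what lets the caller tell a finished call (`is_tool_end` is true) from one that is still being generated.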

Accuracy Tests

Benchmarking and Profiling

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.

JustinTong0323 · 2025-10-16T03:22:42Z

Just curious: in which scenarios do we need to show a partial tool call item (arguments) to the user?

Muqi1029 · 2025-10-16T06:38:39Z

> Just curious: in which scenarios do we need to show a partial tool call item (arguments) to the user?

@JustinTong0323 Thanks for your question!

I think the reason is similar to why we use streaming output in general: it's better for users to see the output arrive incrementally, for example when printing for debugging. More details can be found in the Motivation section.

SGLang already supports streaming output for:

The reasoning part
The normal content part
The tool call part when tool_choice is "required" or names a specific function

So why not the tool call part when tool_choice is "auto" as well?

I've tested this PR and it works as expected, without adding any overhead to the code.

CatherineSue

Thank you for this clever change!

Could you include a screenshot of a test case and the streaming results of before vs. after?

CatherineSue · 2025-10-16T15:42:21Z

python/sglang/srt/function_call/deepseekv31_detector.py

-                            tool_call_end_pattern, current_text, re.DOTALL
-                        )
-                        if match:
+                        if is_tool_end:


This is still inside the if _is_complete_json() check. I imagine this would usually be the case where the function arguments are complete. Is my understanding correct?

Muqi1029 · 2025-10-16

Yes. I didn't change the main logic. I only relaxed the regex match condition so that function arguments can be returned as soon as part of them has been generated.
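
To make "returned as soon as part of them has been generated" concrete, here is a minimal sketch of emitting only the new suffix of the arguments on each parse step. `ArgStreamer` is a hypothetical helper written for this illustration; it is not a class in SGLang.

```python
class ArgStreamer:
    """Hypothetical helper: emits only the not-yet-sent suffix of the arguments."""

    def __init__(self):
        self.sent = ""  # argument text already returned to the client

    def feed(self, args_so_far: str) -> str:
        # Emit nothing if the new snapshot does not extend what was already sent.
        if not args_so_far.startswith(self.sent):
            return ""
        delta = args_so_far[len(self.sent):]
        self.sent = args_so_far
        return delta


streamer = ArgStreamer()
snapshots = ['{"loc', '{"location": "Bos', '{"location": "Boston, MA"}']
deltas = [streamer.feed(s) for s in snapshots]
print(deltas)  # ['{"loc', 'ation": "Bos', 'ton, MA"}']
```

Concatenating the emitted deltas reproduces the final argument string, which is what a streaming client relies on.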

Muqi1029 · 2025-10-16T15:52:22Z

> Thank you for this clever change!
>
> Could you include a screenshot of a test case and the streaming results of before vs. after?

@CatherineSue

Client:

from openai import OpenAI


def test_stream_tool_call(base_url, api_key):
    # tool_search_weather is the tool definition list linked below.
    client = OpenAI(base_url=base_url + "/v1", api_key=api_key)
    model = list(client.models.list())[0].id
    print(f"Using {model=}\n\n")
    response_stream = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "user",
                "content": "What is the weather like in Boston in MA today? Please use Fahrenheit.",
            }
        ],
        tools=tool_search_weather,
        stream=True,
        extra_body={"chat_template_kwargs": {"thinking": True}},
        # tool_choice="required",
        # stream_options={"include_usage": True},
    )
    for chunk in response_stream:
        if not chunk.choices:
            continue  # e.g. a usage-only chunk
        delta = chunk.choices[0].delta
        # reasoning_content is a server-side extension, so read it defensively.
        reasoning = getattr(delta, "reasoning_content", None)
        if reasoning:
            print(reasoning, end="", flush=True)
        if delta.content:
            print(delta.content, end="", flush=True)
        if delta.tool_calls:
            print(delta.tool_calls[0])
    print()

The tool definition is here.

The complete streaming output can be seen in the SGLang doc here.
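
For readers who want to reassemble the streamed fragments on the client side, here is a minimal sketch. It assumes each tool call delta has the OpenAI chat-completions streaming shape (`index`, `function.name`, `function.arguments`); `accumulate_tool_calls` is a hypothetical helper name, and the sample deltas are hand-written stand-ins rather than real server output.

```python
import json
from types import SimpleNamespace


def accumulate_tool_calls(deltas):
    """Merge streamed tool call fragments into one entry per call index."""
    calls = {}
    for d in deltas:
        slot = calls.setdefault(d.index, {"name": "", "arguments": ""})
        fn = getattr(d, "function", None)
        if fn is None:
            continue
        if getattr(fn, "name", None):
            slot["name"] = fn.name  # the name normally arrives once
        if getattr(fn, "arguments", None):
            slot["arguments"] += fn.arguments  # arguments arrive in pieces
    return [calls[i] for i in sorted(calls)]


def make_delta(index, name=None, arguments=None):
    # Stand-in for a streamed tool call delta (illustration only).
    return SimpleNamespace(
        index=index, function=SimpleNamespace(name=name, arguments=arguments)
    )


fragments = [
    make_delta(0, name="get_weather", arguments=""),
    make_delta(0, arguments='{"city": "Bos'),
    make_delta(0, arguments='ton", "unit": "fahrenheit"}'),
]
call = accumulate_tool_calls(fragments)[0]
print(call["name"], json.loads(call["arguments"]))
```

The argument string is only valid JSON once the last fragment has arrived, so parse it after the stream ends.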

--

Before:

After:

d6638219 · 2025-11-06T06:39:17Z

Is there any progress?

jxz542189 · 2025-11-09T01:50:46Z

Is there any progress? Looking forward to this.

Muqi1029 · 2025-11-09T03:06:20Z

@JustinTong0323 @CatherineSue Could you help review this PR again? Thanks.

cynial · 2025-11-13T07:52:23Z

@Muqi1029 It seems the test isn’t working as expected. Could you take a look?

Muqi1029 · 2025-11-13T07:55:34Z

> @Muqi1029 It seems the test isn’t working as expected. Could you take a look?

@cynial Sorry, which test do you mean? The CI? I have looked at the failed checks, and they are not related to this PR.

cynial · 2025-11-14T02:51:02Z

> @cynial Sorry, which test do you mean? The CI? I have looked at the failed checks, and they are not related to this PR.

@Muqi1029 Got it - I will try to deploy this patch to production. Thank you for your contribution and effort.

Muqi1029 added 2 commits October 11, 2025 17:46

Streamline function arguments for deepseekv31_detector

8e48c3e

Fix re

b72c754

Muqi1029 requested review from CatherineSue and JustinTong0323 as code owners October 14, 2025 05:59

Muqi1029 added 2 commits October 16, 2025 09:40

rename

2dc1302

Merge branch 'main' into stream

4e9dffc

CatherineSue reviewed Oct 16, 2025


JustinTong0323 mentioned this pull request Oct 24, 2025

[Bug] GLM-4.6 tool calls don't support streaming output for arguments in SGLang #11888

Open

5 tasks

jxz542189 mentioned this pull request Nov 9, 2025

[Bug] Deepseek-v3.1-Terminus stream parse function call #12898

Open

5 tasks

Kangyan-Zhou added the run-ci label Nov 9, 2025

Merge branch 'main' into stream

d0979b9

github-actions bot added the deepseek label Nov 9, 2025

Merge branch 'main' into stream

a324f72

CatherineSue approved these changes Nov 10, 2025


Fridge003 merged commit fc5da1e into sgl-project:main Nov 14, 2025
117 of 124 checks passed

Muqi1029 deleted the stream branch December 9, 2025 07:33
