
[Bug] Language model still thinks when "thinking" is disabled in TextGenerate #13641

@fappaz

Description


Expected Behavior

If I disable "thinking" in the TextGenerate node, I expect the language models not to think before answering.

Actual Behavior

The language model still thinks even when thinking is disabled.

Steps to Reproduce

  • Import this workflow:
{"id":"5af8db4d-44eb-49f0-b205-76b0aa497cff","revision":0,"last_node_id":3,"last_link_id":2,"nodes":[{"id":1,"type":"TextGenerate","pos":[743.3188073596824,-1303.468489826901],"size":[400,372],"flags":{},"order":1,"mode":0,"inputs":[{"name":"clip","type":"CLIP","link":1},{"name":"image","shape":7,"type":"IMAGE","link":null}],"outputs":[{"name":"generated_text","type":"STRING","links":[2]}],"properties":{"Node name for S&R":"TextGenerate"},"widgets_values":["what's your name?",256,"on",0.7,64,0.95,0.05,1.05,0,0,false,true]},{"id":2,"type":"CLIPLoader","pos":[320,-1300],"size":[360,120],"flags":{},"order":0,"mode":0,"inputs":[],"outputs":[{"name":"CLIP","type":"CLIP","links":[1]}],"properties":{"Node name for S&R":"CLIPLoader"},"widgets_values":["qwen_3_4b_fp8_mixed.safetensors","stable_diffusion","default"]},{"id":3,"type":"PreviewAny","pos":[1220,-1300],"size":[540,600],"flags":{},"order":2,"mode":0,"inputs":[{"name":"source","type":"*","link":2}],"outputs":[{"name":"STRING","type":"STRING","links":null}],"properties":{"Node name for S&R":"PreviewAny"},"widgets_values":[null,null,false]}],"links":[[1,2,0,1,0,"CLIP"],[2,1,0,3,0,"STRING"]],"groups":[],"config":{},"extra":{"ds":{"scale":0.9849732675807851,"offset":[-153.98631927138626,1543.676768024058]},"frontendVersion":"1.42.11"},"version":0.4}
  • In the CLIPLoader node, select a language model that supports thinking mode, e.g. Qwen 3 4B
  • Run the workflow

Notice how the output still shows the model thinking:

<think>
Okay, the user is asking for my name. I need to respond appropriately.

First, I should confirm that my name is Qwen. That's correct.

I should mention that I am developed by Alibaba Cloud and part of the Qwen series. That gives context about my origin.

Also, I should highlight my capabilities, like answering questions, creating content, and helping with tasks. This shows the user what I can do.

I should keep the response friendly and welcoming. Maybe add an emoji to make it more approachable.

Make sure the answer is clear and concise. Avoid technical jargon so it's easy to understand.

Check for any typos or errors. Ensure the information is accurate.

Alright, time to put it all together in a natural way.
</think>

Hello! My name is Qwen. I am developed by Alibaba Cloud and belong to the Qwen series. I can help you with answering questions, creating content, and assisting with various tasks. How can I help you today? 😊

Adding /no_think to the user prompt helps, but the output still contains empty <think></think> tags, so my workaround is to add a RegexReplace node to strip the stubborn prefix:

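For reference, the substitution that RegexReplace performs can be sketched in plain Python. The pattern below is my own reconstruction, not copied from the workflow; it strips a leading <think>…</think> block (empty or multi-line) plus any trailing whitespace:

```python
import re

# Matches a leading <think>...</think> block (possibly empty, possibly
# spanning multiple lines thanks to re.DOTALL) plus surrounding whitespace.
# The exact pattern is an assumption; adjust to taste in the node's widget.
THINK_PREFIX = re.compile(r"^\s*<think>.*?</think>\s*", re.DOTALL)

def strip_think(text: str) -> str:
    """Remove the leading <think>...</think> block, if present."""
    return THINK_PREFIX.sub("", text, count=1)
```

With count=1 only the prefix is removed, so any literal <think> text later in the answer is left alone.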

Debug Logs

got prompt
Found quantization metadata version 1
Using MixedPrecisionOps for text encoder
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
Requested to load ZImageTEModel_
loaded completely; 5688.80 MB usable, 4207.26 MB loaded, full load: True
Generating tokens:  80%|████████  | 205/256 [00:29<00:07,  6.95it/s]
Prompt executed in 35.07 seconds

Other

ComfyUI: v0.20.1
ComfyUI_frontend: v1.42.15
OS: win32
Python Version: 3.12.9 (main, Feb 12 2025, 14:52:31) [MSC v.1942 64 bit (AMD64)]
Embedded Python: false
Pytorch Version: 2.6.0+cu126
Device Name: cuda:0 NVIDIA GeForce RTX 3080 Laptop GPU : cudaMallocAsync


    Labels

    Potential Bug: User is reporting a bug. This should be tested.
