
Add detok in chat completion fn for non stream mode when VLLM_DETOKENIZE_ON_OPENAI_SERVER=true #1768

Merged
michalkuligowski merged 3 commits into habana_main from separk/detok_nonstream on Aug 29, 2025
Conversation

@shepark shepark commented Aug 18, 2025

Adding the part missing from #1741 for chat_completion_full_generator(), for the case when VLLM_DETOKENIZE_ON_OPENAI_SERVER=true.
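
For context, a minimal sketch of the kind of change this PR covers: when the flag is set, the non-streaming (full) chat-completion path detokenizes the generated token IDs on the OpenAI-compatible server before building the response. The flag handling, the CompletionOutput stand-in, and the maybe_detokenize helper below are illustrative assumptions, not the exact vllm-fork code.

```python
import os
from dataclasses import dataclass
from typing import List

# Illustrative flag check; the fork itself reads the env var through vLLM's config machinery.
DETOKENIZE_ON_SERVER = os.environ.get(
    "VLLM_DETOKENIZE_ON_OPENAI_SERVER", "false").lower() in ("1", "true")


@dataclass
class CompletionOutput:
    """Simplified stand-in for an engine output: token IDs plus (possibly empty) text."""
    token_ids: List[int]
    text: str = ""  # empty when the engine skipped detokenization


def maybe_detokenize(tokenizer, output: CompletionOutput) -> CompletionOutput:
    """Fill in output.text from its token IDs when detokenization was deferred
    to the server (hypothetical helper, for illustration only)."""
    if DETOKENIZE_ON_SERVER and not output.text:
        output.text = tokenizer.decode(output.token_ids, skip_special_tokens=True)
    return output
```

In the fork, the analogous detokenization step would sit inside chat_completion_full_generator() in serving_chat.py, mirroring what #1741 already added for the streaming path.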
@xuechendi

There are two files, serving_chat.py and serving_completion.py; does the other one also need the same change?

@shepark shepark (Author) commented Aug 18, 2025

There are two files, serving_chat.py and serving_completion.py; does the other one also need the same change?

Actually, it's needed only for our customer support for now.
But it would be better to have one PR cover all locations.
I will close this PR.

@shepark shepark closed this Aug 18, 2025
@shepark shepark reopened this Aug 25, 2025
@xuechendi

/run-gaudi-tests


@xuechendi xuechendi left a comment


Needed by a customer; OK to have it partially supported per the author's request.

@xuechendi

/run-gaudi-tests

@michalkuligowski michalkuligowski enabled auto-merge (squash) August 28, 2025 17:51
@michalkuligowski michalkuligowski merged commit 06a6d5d into habana_main Aug 29, 2025
47 checks passed
@michalkuligowski michalkuligowski deleted the separk/detok_nonstream branch August 29, 2025 05:55