[model-gateway] Add tokenize/detokenize HTTP endpoints and tokenizer management#15702

Merged
slin1237 merged 1 commit into main from tokenizer-ep on Dec 24, 2025

@slin1237
Collaborator

This adds HTTP endpoints for tokenization operations similar to SGLang Python:

  • POST /v1/tokenize - tokenize text into token IDs
  • POST /v1/detokenize - convert token IDs back to text
  • POST /v1/tokenizers - add a tokenizer (async via job queue)
  • GET /v1/tokenizers - list all registered tokenizers
  • GET /v1/tokenizers/{uuid} - get tokenizer info
  • GET /v1/tokenizers/{uuid}/status - get tokenizer loading status
  • DELETE /v1/tokenizers/{uuid} - remove a tokenizer
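
As a rough illustration of the routes listed above, the following Python sketch maps each operation to its HTTP method and path and fills in the `{uuid}` segment. Only the methods and paths come from this PR; `BASE_URL`, the `ROUTES` keys, and `build_request` are hypothetical names chosen for the example, and request/response bodies are deliberately left out because their schema is not described here.

```python
from uuid import UUID

# Hypothetical gateway address; not taken from the PR.
BASE_URL = "http://localhost:30000"

# Methods and paths as listed in the PR description; the dict keys are
# illustrative labels only.
ROUTES = {
    "tokenize": ("POST", "/v1/tokenize"),
    "detokenize": ("POST", "/v1/detokenize"),
    "add_tokenizer": ("POST", "/v1/tokenizers"),
    "list_tokenizers": ("GET", "/v1/tokenizers"),
    "get_tokenizer": ("GET", "/v1/tokenizers/{uuid}"),
    "get_tokenizer_status": ("GET", "/v1/tokenizers/{uuid}/status"),
    "remove_tokenizer": ("DELETE", "/v1/tokenizers/{uuid}"),
}


def build_request(name, tokenizer_id=None, base_url=BASE_URL):
    """Return (method, url) for a named route, substituting {uuid} if needed."""
    method, template = ROUTES[name]
    if "{uuid}" in template:
        if tokenizer_id is None:
            raise ValueError(f"route {name!r} requires a tokenizer UUID")
        # UUID() rejects malformed IDs and normalizes the textual form.
        template = template.replace("{uuid}", str(UUID(str(tokenizer_id))))
    return method, base_url + template
```

For example, `build_request("get_tokenizer_status", some_uuid)` yields the method and URL a client would poll after registering a tokenizer.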

Key implementation details:

  • Uses the workflow pattern for tokenizer registration to support asynchronous loading from HuggingFace (similar to worker registration)
  • Tokenizer deletion is synchronous, since it is a simple HashMap remove with no external resources to clean up
  • Supports both single-text and batch tokenization/detokenization
  • add_tokenizer returns a job_id immediately; the loading status can then be polled
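
The asynchronous-add / synchronous-remove split described above can be sketched in a few lines of Python. This is not the gateway's implementation: `TokenizerRegistry`, the `loader` callback, the status strings, keying the map by a caller-supplied name, and the `wait` helper are all assumptions made for illustration, and a background thread stands in for the job queue.

```python
import threading
import uuid


class TokenizerRegistry:
    """Illustrative registry: async add via a background job, sync remove."""

    def __init__(self, loader):
        self._loader = loader      # hypothetical callable: source -> tokenizer
        self._tokenizers = {}      # name -> loaded tokenizer
        self._jobs = {}            # job_id -> status string (hypothetical values)
        self._threads = {}         # job_id -> worker thread
        self._lock = threading.Lock()

    def add_tokenizer(self, name, source):
        """Start loading in the background and return a job_id right away."""
        job_id = str(uuid.uuid4())
        worker = threading.Thread(
            target=self._run_job, args=(job_id, name, source), daemon=True
        )
        with self._lock:
            self._jobs[job_id] = "pending"
            self._threads[job_id] = worker
        worker.start()
        return job_id

    def _run_job(self, job_id, name, source):
        try:
            tokenizer = self._loader(source)  # potentially slow download
        except Exception as exc:
            with self._lock:
                self._jobs[job_id] = f"failed: {exc}"
            return
        with self._lock:
            self._tokenizers[name] = tokenizer
            self._jobs[job_id] = "completed"

    def status(self, job_id):
        """Poll the state of a registration job."""
        with self._lock:
            return self._jobs.get(job_id, "unknown")

    def wait(self, job_id):
        """Block until the job's worker thread finishes (test convenience)."""
        with self._lock:
            worker = self._threads.get(job_id)
        if worker is not None:
            worker.join()

    def remove_tokenizer(self, name):
        """Synchronous removal: a plain map delete, nothing else to clean up."""
        with self._lock:
            return self._tokenizers.pop(name, None) is not None
```

Removal needs no job because dropping the map entry is the entire operation, whereas loading may block on a network download and so is better reported through a pollable status.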

slin1237 merged commit 846953d into main on Dec 24, 2025
62 checks passed
slin1237 deleted the tokenizer-ep branch on December 24, 2025, 01:32
jiaming1130 pushed a commit to zhuyijie88/sglang that referenced this pull request Dec 25, 2025
GuoYechang pushed a commit to GuoYechang/sglang that referenced this pull request Jan 13, 2026