Add Jet-Nemotron #12448

Merged
Fridge003 merged 19 commits into sgl-project:main from futrime:jet-nemotron
Nov 9, 2025
@futrime
Contributor

@futrime futrime commented Oct 31, 2025

Motivation

To add support for Jet-Nemotron.

Modifications

  • Added Jet-Nemotron implementation.
  • Registered Jet-Nemotron as hybrid GDN attention model.
  • Added Jet-Nemotron configuration.

Accuracy Tests

Model            GSM8K  MMLU
Jet-Nemotron-2B  0.762  0.622
Jet-Nemotron-4B  0.792  0.675

Benchmarking and Profiling

$ python3 -m sglang.bench_serving --backend sglang --num-prompt 100
============ Serving Benchmark Result ============
Backend:                                 sglang    
Traffic request rate:                    inf       
Max request concurrency:                 not set   
Successful requests:                     100       
Benchmark duration (s):                  36.61     
Total input tokens:                      33839     
Total input text tokens:                 33839     
Total input vision tokens:               0         
Total generated tokens:                  21640     
Total generated tokens (retokenized):    12768     
Request throughput (req/s):              2.73      
Input token throughput (tok/s):          924.20    
Output token throughput (tok/s):         591.03    
Total token throughput (tok/s):          1515.23   
Concurrency:                             39.95     
----------------End-to-End Latency----------------
Mean E2E Latency (ms):                   14626.36  
Median E2E Latency (ms):                 13954.04  
---------------Time to First Token----------------
Mean TTFT (ms):                          1397.39   
Median TTFT (ms):                        1266.72   
P99 TTFT (ms):                           2969.48   
---------------Inter-Token Latency----------------
Mean ITL (ms):                           61.49     
Median ITL (ms):                         65.26     
P95 ITL (ms):                            93.59     
P99 ITL (ms):                            205.48    
Max ITL (ms):                            1660.42   
==================================================
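
As a rough cross-check of the figures above, the throughput and concurrency rows can be recomputed from the token counts, request count, duration, and mean latency. This is a minimal sketch, not the benchmark's own code: the variable names are illustrative, and the concurrency formula (summed end-to-end latency divided by wall-clock duration) is an assumption that happens to reproduce the reported value. Small differences from the log are expected because the duration is printed rounded to two decimals.

```python
# Values copied from the benchmark output above.
duration_s = 36.61
num_requests = 100
input_tokens = 33839
output_tokens = 21640
mean_e2e_latency_ms = 14626.36

# Throughput rows: a count divided by the wall-clock duration.
request_throughput = num_requests / duration_s
input_throughput = input_tokens / duration_s
output_throughput = output_tokens / duration_s
total_throughput = (input_tokens + output_tokens) / duration_s

# Assumption: mean concurrency is the total time spent in flight by all
# requests (mean latency times request count) divided by the duration.
mean_concurrency = num_requests * (mean_e2e_latency_ms / 1000.0) / duration_s

print(f"Request throughput (req/s): {request_throughput:.2f}")
print(f"Input token throughput (tok/s): {input_throughput:.2f}")
print(f"Output token throughput (tok/s): {output_throughput:.2f}")
print(f"Total token throughput (tok/s): {total_throughput:.2f}")
print(f"Concurrency: {mean_concurrency:.2f}")
```

The recomputed values land within a fraction of a token per second of the reported ones, which suggests the reported rows are internally consistent.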

Checklist

@futrime futrime marked this pull request as ready for review November 3, 2025 13:35
@futrime futrime marked this pull request as draft November 3, 2025 13:38
@futrime futrime marked this pull request as ready for review November 4, 2025 07:53
@zhaochenyang20
Collaborator

Nicely done!


  # Nightly tests
- DEFAULT_MODEL_NAME_FOR_NIGHTLY_EVAL_TP1 = "meta-llama/Llama-3.1-8B-Instruct,mistralai/Mistral-7B-Instruct-v0.3,deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct,google/gemma-2-27b-it"
+ DEFAULT_MODEL_NAME_FOR_NIGHTLY_EVAL_TP1 = "meta-llama/Llama-3.1-8B-Instruct,mistralai/Mistral-7B-Instruct-v0.3,deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct,google/gemma-2-27b-it,jet-ai/Jet-Nemotron-2B"
Collaborator

I tend not to change this 😂

@futrime futrime requested a review from Fridge003 as a code owner November 8, 2025 03:22
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Nov 8, 2025
@zhaochenyang20
Collaborator

Only two CI checks left. Let's wait and see.

@Fridge003 Fridge003 merged commit 3633f8b into sgl-project:main Nov 9, 2025
185 of 201 checks passed
@zhaochenyang20
Collaborator

congrats!

Labels

documentation Improvements or additions to documentation run-ci

4 participants