
adjust dynamic vs static outputs comparison in test_lora_update.py #11884

Merged: Fridge003 merged 6 commits into sgl-project:main from glenliu21:test_lora_update_fix on Oct 24, 2025


Conversation

@glenliu21 (Contributor) commented:

Motivation

Currently, test_lora_update.py requires that dynamic outputs be exactly equivalent to static outputs. This check is too strict, and it occasionally fails.

For example, I ran with the following hardware:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.57.08              Driver Version: 575.57.08      CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA H200                    On  |   00000000:26:00.0 Off |                    0 |
| N/A   26C    P0             75W /  700W |       0MiB / 143771MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

and got the error:

AssertionError: 'the [55 chars] It involves the use of algorithms, data struc[86 chars]s to' != 'the [55 chars] It is a broad field that encompasses a wide r[74 chars]orks'
- the theory, design, and implementation of computer systems. It involves the use of algorithms, data structures, and software engineering techniques to develop efficient and effective solutions to
+ the theory, design, and implementation of computer systems. It is a broad field that encompasses a wide range of topics, including algorithms, computer architecture, computer networks
 : Output mismatch at batch 1, prompt 11:
- Dynamic: 'the theory, design, and implementation of computer systems. It involves the use of algorithms, data structures, and software engineering techniques to develop efficient and effective solutions to'
- Static: 'the theory, design, and implementation of computer systems. It is a broad field that encompasses a wide range of topics, including algorithms, computer architecture, computer networks'
E
======================================================================
ERROR: test_dynamic_lora_update_engine (__main__.TestLoRADynamicUpdate)
Test dynamic LoRA updates in engine mode.
----------------------------------------------------------------------
AssertionError: 'the [55 chars] It involves the use of algorithms, data struc[86 chars]s to' != 'the [55 chars] It is a broad field that encompasses a wide r[74 chars]orks'
- the theory, design, and implementation of computer systems. It involves the use of algorithms, data structures, and software engineering techniques to develop efficient and effective solutions to
+ the theory, design, and implementation of computer systems. It is a broad field that encompasses a wide range of topics, including algorithms, computer architecture, computer networks
 : Output mismatch at batch 1, prompt 11:
- Dynamic: 'the theory, design, and implementation of computer systems. It involves the use of algorithms, data structures, and software engineering techniques to develop efficient and effective solutions to'
- Static: 'the theory, design, and implementation of computer systems. It is a broad field that encompasses a wide range of topics, including algorithms, computer architecture, computer networks'

Because the rest of the test already verifies that LoRA adapters are loaded and unloaded correctly, this check should serve as a sanity check rather than a rigorous equivalence test.

Modifications

Instead of comparing outputs for exact equality, the test now requires that the fraction of overlapping words exceed a threshold (currently set to 90%). Though this should reduce the number of false negatives, the test could still be flaky, and in the future we could consider removing this check entirely. A minimal sketch of what such an overlap check could look like is shown below.
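The sketch below is only illustrative: the function name, threshold constant, and sample strings are not the exact code added to test_lora_update.py, and it ignores the empty-output edge case discussed in the review below.

```python
def word_overlap_ratio(dynamic_output: str, static_output: str) -> float:
    """Fraction of the static output's words that also appear in the dynamic output."""
    dynamic_words = set(dynamic_output.split())
    static_words = set(static_output.split())
    return len(dynamic_words & static_words) / len(static_words)


OVERLAP_THRESHOLD = 0.9  # the PR sets the threshold at 90%

ratio = word_overlap_ratio(
    "It involves the use of algorithms, data structures, and software engineering",
    "It involves the use of algorithms, data structures, and software design",
)
print(f"overlap: {ratio:.2f}, passes: {ratio >= OVERLAP_THRESHOLD}")  # overlap: 0.91, passes: True
```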

Checklist

@gemini-code-assist (Contributor) commented:

Summary of Changes

Hello @glenliu21, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses the flakiness of the test_lora_update.py test by relaxing the output comparison criteria. Previously, the test demanded exact matches between dynamic and static outputs, which proved too strict and led to intermittent failures. The updated approach now validates output similarity based on a 90% word overlap threshold, ensuring the test remains effective without being overly restrictive.

Highlights

  • Relaxed Output Comparison: The test test_lora_update.py no longer requires exact equivalence between dynamic and static outputs, addressing flakiness due to strict comparison.
  • Word Overlap Threshold: Output comparison now uses a word overlap threshold of 90% to determine similarity, making the test more robust while still ensuring correctness.

@gemini-code-assist (bot) left a comment:


Code Review

This pull request adjusts the comparison of dynamic and static outputs in test_lora_update.py to be less strict, using a word overlap threshold instead of an exact match. This is a good approach to make the test more robust against minor generation differences. I've identified a potential ZeroDivisionError in the new similarity calculation and suggest a fix. I've also proposed a small improvement to the assertion's error message for better readability.
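A rough illustration of both review points, assuming the similarity is computed as a simple word-overlap ratio as in the sketch above (this is not the reviewer's actual suggested diff): check the denominator before dividing, and put the computed ratio and both outputs in the assertion message.

```python
def assert_outputs_similar(dynamic_output: str, static_output: str, threshold: float = 0.9) -> None:
    dynamic_words = set(dynamic_output.split())
    static_words = set(static_output.split())
    # Guard the denominator so an empty static output cannot raise ZeroDivisionError.
    if not static_words:
        overlap = 1.0 if not dynamic_words else 0.0
    else:
        overlap = len(dynamic_words & static_words) / len(static_words)
    # Readable failure message: show the ratio, the threshold, and both outputs.
    assert overlap >= threshold, (
        f"Word overlap {overlap:.2f} is below threshold {threshold:.2f}\n"
        f"  Dynamic: {dynamic_output!r}\n"
        f"  Static:  {static_output!r}"
    )
```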

@Fridge003 (Collaborator) commented:

Maybe we can use Rouge-L as the metric for comparing output? Like this one:
https://github.com/sgl-project/sglang/blob/main/test/srt/lora/utils.py#L274
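For reference, a ROUGE-L style similarity (longest common subsequence over words) could look roughly like the sketch below; this is a generic illustration, not the actual helper in test/srt/lora/utils.py.

```python
def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L F1 between two strings, computed over whitespace-separated words."""
    cand, ref = candidate.split(), reference.split()
    if not cand or not ref:
        return 1.0 if cand == ref else 0.0
    # Dynamic-programming table for the longest-common-subsequence length.
    dp = [[0] * (len(ref) + 1) for _ in range(len(cand) + 1)]
    for i, cw in enumerate(cand, 1):
        for j, rw in enumerate(ref, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if cw == rw else max(dp[i - 1][j], dp[i][j - 1])
    lcs = dp[len(cand)][len(ref)]
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 0.0 if precision + recall == 0 else 2 * precision * recall / (precision + recall)
```

Unlike a bag-of-words overlap, ROUGE-L rewards long in-order matches, so an output that merely reshuffles the same words scores lower than one that diverges only near the end.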

Fridge003 merged commit fc86b18 into sgl-project:main on Oct 24, 2025
136 of 145 checks passed
glenliu21 deleted the test_lora_update_fix branch on October 25, 2025 at 21:45
