Add nightly performance test for GPT-OSS 4GPU models #12805
Conversation
alisonshao
commented
Nov 7, 2025
- Create test_nightly_gpt_oss_4gpu_perf.py with profiling support
- Test bf16 and mxfp4 quantizations of gpt-oss-120b
- Add to nightly-4-gpu-b200 suite
gemini-code-assist commented

Summary of Changes: This pull request enhances the testing infrastructure by adding a dedicated nightly performance test for GPT-OSS 4GPU models. The new test evaluates the performance of the bf16 and mxfp4 quantizations of gpt-oss-120b.
Code Review
This pull request introduces a new nightly performance test for GPT-OSS 4GPU models and refactors some test suites. The new test file is well-structured, but I've identified a few areas for improvement related to test configuration, filename generation, and code style. Additionally, a change in run_suite.py appears to unintentionally leave a test file untracked, which would likely cause a CI failure. My review provides specific suggestions to address these points.
```diff
@@ -236,7 +235,6 @@ class TestFile:
         TestFile("ep/test_moe_deepep_eval_accuracy_large.py"),
         TestFile("function_call/test_unknown_tool_name.py"),
         TestFile("hicache/test_disaggregation_hicache.py"),
-        TestFile("hicache/test_hicache_page.py"),
```
The test file test_hicache_page.py has been removed from the __not_in_ci__ suite, but the file itself has not been deleted from the repository. The sanity check in run_suite.py will likely fail because this file is now untracked by the test runner. If this file is still needed, it should be added back to a test suite (e.g., __not_in_ci__) to resolve the sanity check failure. Otherwise, it should be deleted.
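To illustrate the failure mode being flagged, here is a minimal sketch of the kind of sanity check a test runner like run_suite.py can perform: every `test_*.py` file on disk must belong to some suite, otherwise it is reported as untracked. The function name and structure here are illustrative assumptions, not the actual SGLang implementation.

```python
# Hypothetical sketch of a "no untracked test files" sanity check.
# find_untracked_tests() is an illustrative name, not SGLang's real API.
from pathlib import Path


def find_untracked_tests(test_root: str, tracked: set[str]) -> list[str]:
    """Return test files found on disk that no suite references."""
    found = {
        str(p.relative_to(test_root))
        for p in Path(test_root).rglob("test_*.py")
    }
    return sorted(found - tracked)
```

Under this model, removing `hicache/test_hicache_page.py` from `__not_in_ci__` without deleting the file makes it appear in `found` but not in `tracked`, which is exactly the mismatch the reviewer predicts will fail CI.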
```python
            ),
        ]
        cls.base_url = DEFAULT_URL_FOR_TEST
        cls.batch_sizes = [1, 1, 8, 16, 64]
```
```python
profile_filename = (
    f"{model_path.replace('/', '_')}_{int(time.time())}"
)
profile_path_prefix = os.path.join(PROFILE_DIR, profile_filename)
json_output_file = f"results_{model_path.replace('/', '_')}_{int(time.time())}.json"
```
The timestamp from int(time.time()) is generated separately for profile_filename and json_output_file. If these calls happen to cross a second boundary, the filenames will have different timestamps, which could be confusing. It's better to generate a single timestamp and reuse it for both filenames to ensure consistency. This also provides an opportunity to reduce code duplication by creating a slug for the model name.
```diff
-profile_filename = (
-    f"{model_path.replace('/', '_')}_{int(time.time())}"
-)
-profile_path_prefix = os.path.join(PROFILE_DIR, profile_filename)
-json_output_file = f"results_{model_path.replace('/', '_')}_{int(time.time())}.json"
+timestamp = int(time.time())
+model_name_slug = model_path.replace('/', '_')
+profile_filename = f"{model_name_slug}_{timestamp}"
+profile_path_prefix = os.path.join(PROFILE_DIR, profile_filename)
+json_output_file = f"results_{model_name_slug}_{timestamp}.json"
```
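The single-timestamp pattern can be demonstrated standalone; with one shared `timestamp` and slug, the two filenames agree by construction. `PROFILE_DIR` and `model_path` below are illustrative values, not the test's actual configuration.

```python
# Standalone demonstration of the shared-timestamp filename pattern.
# PROFILE_DIR and model_path are placeholder values for illustration.
import os
import time

PROFILE_DIR = "/tmp/profiles"
model_path = "openai/gpt-oss-120b"

timestamp = int(time.time())
model_name_slug = model_path.replace("/", "_")
profile_filename = f"{model_name_slug}_{timestamp}"
profile_path_prefix = os.path.join(PROFILE_DIR, profile_filename)
json_output_file = f"results_{model_name_slug}_{timestamp}.json"

# Both artifacts now carry the identical timestamp suffix, so the JSON
# results can always be matched back to the corresponding profile.
assert json_output_file == f"results_{profile_filename}.json"
```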
```python
# Load and deserialize JSON results
if os.path.exists(json_output_file):
    import json
```
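A function-local `import json` like the one above is usually moved to module level. A hedged sketch of the same load step with that style fix, assuming a simple dict-shaped results file (the actual schema used by the test is not shown in this diff):

```python
# Sketch: deserialize the benchmark results with json imported at module
# level. The results schema is an assumption for illustration only.
import json
import os


def load_results(json_output_file: str) -> dict:
    """Return the parsed results JSON, or an empty dict if absent."""
    if not os.path.exists(json_output_file):
        return {}
    with open(json_output_file) as f:
        return json.load(f)
```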
Force-pushed from a9b18d0 to ff2a7b5