adjust dynamic vs static outputs comparison in test_lora_update.py#11884
adjust dynamic vs static outputs comparison in test_lora_update.py#11884Fridge003 merged 6 commits intosgl-project:mainfrom
Conversation
Summary of ChangesHello @glenliu21, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses the flakiness of the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request adjusts the comparison of dynamic and static outputs in test_lora_update.py to be less strict, using a word overlap threshold instead of an exact match. This is a good approach to make the test more robust against minor generation differences. I've identified a potential ZeroDivisionError in the new similarity calculation and suggest a fix. I've also proposed a small improvement to the assertion's error message for better readability.
|
Maybe we can use Rouge-L as the metric for comparing output? Like this one: |
Motivation
Currently,
test_lora_update.pyrequires dynamic outputs be equivalent to static outputs. This is too strict of a test, and it occasionally fails.For example, I ran with the following hardware:
and got the error:
Because the rest of the test guarantees that LoRA adapters are loaded and unloaded correctly, this check should be more of a sanity check than a rigorous test.
Modifications
Instead of comparing outputs exactly, the test now requires that the number of overlapping words is above some threshold (currently set to 90%). Though this should reduce the number of false negatives, this test still could be flaky, and I believe that in the future we could consider just removing this check.
Checklist