
[router] Fix Flaky test_circuit_breaker_opens_and_recovers #13164

Merged
CatherineSue merged 3 commits into sgl-project:main from XinyueZhang369:xinyue/fix-flaky-circuit-breaker-test
Nov 12, 2025

@XinyueZhang369 XinyueZhang369 commented Nov 12, 2025

Motivation

Fix the flaky test test_circuit_breaker_opens_and_recovers.
Sometimes the worker takes a bit longer to start. When the test hits the server during that window, it receives a 503 because no worker is available, not because the circuit breaker is open. The test then expects a 200, but the worker has only just started and returns 500 for its first 3 requests.

Modifications

Modify the test to wait until it receives a 500 first, and only then expect a 503. This ensures the test starts checking whether the circuit breaker is open only after the worker has actually started.
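
The ordering described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code changed in this PR: `wait_for_worker_then_breaker`, `send_request`, and the fake response sequence are hypothetical names standing in for the real test client and router.

```python
import itertools
import time
from typing import Callable, List


def wait_for_worker_then_breaker(send_request: Callable[[], int],
                                 max_attempts: int = 50,
                                 delay: float = 0.0) -> List[int]:
    """Poll until the first 500 arrives, asserting every earlier reply is 503.

    Returns the status codes seen up to and including the first 500.
    """
    seen: List[int] = []
    for _ in range(max_attempts):
        status = send_request()
        seen.append(status)
        if status == 500:
            # The worker is up and failing as configured; circuit breaker
            # checks can safely begin from here.
            return seen
        # Before the worker starts, the only acceptable reply is 503.
        assert status == 503, f"unexpected status before worker start: {status}"
        time.sleep(delay)
    raise AssertionError(f"worker never returned 500 in {max_attempts} attempts")


# Hypothetical stand-in for the router: two 503s while no worker is
# available, three 500s from the worker, then 503 once the breaker opens.
_responses = itertools.chain([503, 503, 500, 500, 500], itertools.repeat(503))


def fake_send_request() -> int:
    return next(_responses)


seen = wait_for_worker_then_breaker(fake_send_request)
print(seen)  # [503, 503, 500]
```

Only after this helper returns would the test go on to assert that further requests yield 503 from the open circuit breaker. In a real test, `delay` would likely be a small positive value so the loop does not spin while the worker starts.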

Ran the test 500 times and did not see any failures.
[Screenshot 2025-11-12 at 10:56:09 AM]


@gemini-code-assist
Contributor

Summary of Changes

Hello @XinyueZhang369, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a persistent flakiness issue in an integration test related to the circuit breaker mechanism. The core problem stemmed from the test sometimes attempting to verify circuit breaker behavior before the underlying worker was fully initialized. The changes introduce a robust waiting mechanism within the test, ensuring that the worker is ready to process requests before the circuit breaker's state is evaluated, thereby enhancing the reliability and accuracy of the test suite.

Highlights

  • Flaky Test Fix: Addressed the flakiness in test_circuit_breaker_opens_and_recovers by ensuring the test waits for the worker to be fully operational before proceeding with circuit breaker checks.
  • Test Logic Enhancement: Modified the test to first explicitly wait for a 500 status code (indicating the worker has started) and assert that any prior responses are 503 (service unavailable), preventing premature circuit breaker checks.
  • Improved Reliability: The updated test logic ensures that the circuit breaker test accurately reflects the intended scenario, eliminating failures caused by worker startup delays.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request effectively addresses a flaky test by introducing a wait mechanism to ensure the worker is in the correct state before the test assertions are made. The logic is sound and makes the test more robust. I have one suggestion to refactor the implementation to be more idiomatic and concise by using a for...else pattern, which improves readability.
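
The concrete suggestion is not shown on this page, so the following is only a generic sketch of how a for...else loop can express "wait for the first 500, otherwise fail". The function name `wait_for_first_500` and its iterable-of-statuses input are hypothetical, chosen to keep the example self-contained.

```python
from typing import Iterable


def wait_for_first_500(statuses: Iterable[int]) -> int:
    """Return how many responses were consumed up to and including the first 500."""
    consumed = 0
    for status in statuses:
        consumed += 1
        if status == 500:
            break  # worker is up; the else clause below is skipped
        assert status == 503, f"unexpected status {status}"
    else:
        # Runs only when the loop ends without hitting `break`,
        # i.e. no 500 was ever observed.
        raise AssertionError("never observed a 500 from the worker")
    return consumed
```

The `else` clause replaces a separate "found" flag: it fires only when the loop is exhausted without a `break`, which keeps the failure path next to the loop it belongs to.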

@CatherineSue
Collaborator

Wait for CI

@CatherineSue CatherineSue merged commit 2cdde3d into sgl-project:main Nov 12, 2025
101 of 110 checks passed
@XinyueZhang369 XinyueZhang369 deleted the xinyue/fix-flaky-circuit-breaker-test branch November 12, 2025 20:19