[Serve] Downstream deployments over-provision when receiving Deployme…#60747

Merged
abrarsheikh merged 1 commit into master from 60624-abrar-auto on Feb 4, 2026

@abrarsheikh
Contributor

fixes #60624

…ntResponse arguments from slow upstream

Signed-off-by: abrar <abrar@anyscale.com>
@abrarsheikh abrarsheikh requested a review from a team as a code owner February 4, 2026 18:55
@abrarsheikh abrarsheikh added the go label (add ONLY when ready to merge, run all tests) Feb 4, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request correctly addresses an over-provisioning issue in downstream deployments by resolving request arguments before they are counted as queued. The logic change in router.py is direct and well-commented, and the new test case in test_autoscaling_policy.py effectively validates the fix. I have one suggestion to make the test even more robust against potential timing issues.
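The ordering issue described in the summary can be sketched with a toy model. This is not the code in router.py and it does not use Ray: `ToyRouter`, `num_queued`, and `resolve_first` are hypothetical names chosen only to show why counting a request as queued before its upstream argument resolves inflates the number an autoscaler would see.

```python
import asyncio


class ToyRouter:
    """Hypothetical stand-in for a router; not Ray Serve's actual class."""

    def __init__(self, resolve_first: bool):
        self.resolve_first = resolve_first
        self.num_queued = 0  # the value a toy autoscaler would read

    async def handle(self, upstream_arg: "asyncio.Future[str]") -> str:
        if self.resolve_first:
            # Fixed ordering: wait for the upstream result first, then
            # count the request as queued on the downstream side.
            value = await upstream_arg
            self.num_queued += 1
        else:
            # Old ordering: the request counts as queued while it is
            # still waiting on the slow upstream.
            self.num_queued += 1
            value = await upstream_arg
        try:
            return value.upper()
        finally:
            self.num_queued -= 1


async def queued_while_upstream_blocked(resolve_first: bool, n: int = 5) -> int:
    router = ToyRouter(resolve_first)
    loop = asyncio.get_running_loop()
    # One unresolved future per request stands in for a slow upstream.
    upstream = [loop.create_future() for _ in range(n)]
    tasks = [asyncio.ensure_future(router.handle(f)) for f in upstream]
    await asyncio.sleep(0)  # let every task run up to its first await
    observed = router.num_queued  # sampled while upstream is still blocked
    for f in upstream:
        f.set_result("ok")
    await asyncio.gather(*tasks)
    return observed


def run(resolve_first: bool) -> int:
    return asyncio.run(queued_while_upstream_blocked(resolve_first))
```

With five requests blocked upstream, `run(False)` reports 5 queued requests and `run(True)` reports 0, which is the difference a queue-based autoscaler would act on in this toy setup.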


# Wait for all 5 requests to be blocked at SlowUpstream (waiting on signal)
wait_for_condition(lambda: ray.get(signal.cur_num_waiters.remote()) == 5)

medium

To make this test more robust against timing-related flakiness, it would be beneficial to add a short time.sleep() after waiting for the requests to be blocked and before asserting the number of replicas. This ensures that the autoscaler has had sufficient time to make a (potentially incorrect) scaling decision. Given upscale_delay_s is 0.2s, a sleep of 0.5s should be adequate.

Suggested change
# Give the autoscaler time to potentially make a wrong decision.
# A sleep duration longer than upscale_delay_s (0.2s) ensures that
# we would have seen an upscale event if the fix was not effective.
time.sleep(0.5)
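
A possible variant of the suggestion above, sketched under assumptions: instead of a single sleep, the test could poll a condition for the whole window and fail as soon as it breaks. `holds_for` is a hypothetical helper, not part of Ray's test utilities, and the replica-count check it would wrap is not shown here.

```python
import time
from typing import Callable


def holds_for(
    predicate: Callable[[], bool],
    duration_s: float,
    interval_s: float = 0.05,
) -> bool:
    """Return True only if predicate() stays true for the whole window."""
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        if not predicate():
            # The condition broke inside the window, e.g. an unwanted upscale.
            return False
        time.sleep(interval_s)
    # Final check at the end of the window.
    return predicate()
```

In the test, the predicate would be whatever check confirms the downstream replica count has not grown, with `duration_s` set above upscale_delay_s (0.2s), for example 0.5.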

@ray-gardener ray-gardener bot added the serve label (Ray Serve Related Issue) Feb 4, 2026
@abrarsheikh abrarsheikh merged commit c40ef35 into master Feb 4, 2026
6 checks passed
@abrarsheikh abrarsheikh deleted the 60624-abrar-auto branch February 4, 2026 23:32
tiennguyentony pushed a commit to tiennguyentony/ray that referenced this pull request Feb 7, 2026
ray-project#60747)

fixes ray-project#60624

Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: tiennguyentony <46289799+tiennguyentony@users.noreply.github.com>
elliot-barn pushed a commit that referenced this pull request Feb 9, 2026
#60747)

fixes #60624

Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
Kunchd pushed a commit to Kunchd/ray that referenced this pull request Feb 17, 2026
ans9868 pushed a commit to ans9868/ray that referenced this pull request Feb 18, 2026
ray-project#60747)

fixes ray-project#60624

Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: Adel Nour <ans9868@nyu.edu>
Aydin-ab pushed a commit to kunling-anyscale/ray that referenced this pull request Feb 20, 2026
peterxcli pushed a commit to peterxcli/ray that referenced this pull request Feb 25, 2026
ray-project#60747)

fixes ray-project#60624

Signed-off-by: abrar <abrar@anyscale.com>
Signed-off-by: peterxcli <peterxcli@gmail.com>

Labels

go (add ONLY when ready to merge, run all tests)
serve (Ray Serve Related Issue)

Development

Successfully merging this pull request may close these issues.

[Serve] Downstream deployments over-provision when receiving DeploymentResponse arguments from slow upstream

2 participants