document performance flags for serve #57845
Signed-off-by: abrar <abrar@anyscale.com>
Code Review
This pull request improves the performance tuning documentation by adding a new section on throughput-optimized flags. The changes clearly document several performance-related environment variables, explaining their purpose and how they can be used to improve throughput and latency. The restructuring of the document to separate request path performance issues from controller performance issues is a good improvement for clarity. I've found one minor grammatical issue in the new documentation.
### Enable throughput-optimized flags

:::{note}
In Ray v2.54.0, the defaults for `RAY_SERVE_RUN_USER_CODE_IN_SEPARATE_THREAD` and `RAY_SERVE_RUN_ROUTER_IN_SEPARATE_LOOP` will change to `0` for improved performance.
but not the logging ones?
Logging can be made default after #57850 is implemented
Since this change is slated for a future release (2.54), should we just include the logging one as well?
I'm inclined to leave it out. Once we implement the time-based logger, we can roll it out immediately without warning developers, since it has no perceivable impact.
dstrodtman left a comment
@akshay-anyscale I believe that this provides sufficient detail without going so far as telling users how they must design their applications.
On the Anyscale side, I'll word this slightly more strongly. While I usually avoid anti-pattern code examples, this might be a case where an explicit example of what "blocking code" looks like could be helpful, especially for novice users, which likely includes some of the data scientists among our users.
Suggested change:
- In Ray v2.54.0, the defaults for `RAY_SERVE_RUN_USER_CODE_IN_SEPARATE_THREAD` and `RAY_SERVE_RUN_ROUTER_IN_SEPARATE_LOOP` will change to `0` for improved performance.
+ A breaking change to this functionality will go live with Ray version 2.54.0. The defaults for `RAY_SERVE_RUN_USER_CODE_IN_SEPARATE_THREAD` and `RAY_SERVE_RUN_ROUTER_IN_SEPARATE_LOOP` will change to `0`, disabling existing default functionality to improve serving throughput.
+ You should update your code to explicitly set these properties to `1` if your workloads require legacy behavior.
Typically, I avoid mentioning future state. Since this is a known planned migration, we should announce it.
We should also draft customer comms and work with @tg-anyscale to address Anyscale customers. (I understand this is technically opt-in for the breaking change because it's a new Ray version, but still nice to encourage customers to start testing now.)
> A breaking change to this functionality will go live with

This is not a breaking change from the customer's point of view. They don't need to take any action to opt into these optimizations.
Sorry, I'm confused: won't running the user code in the same loop as the serve code break customer workloads with blocking logic once the default changes (assuming upgrade to Ray 2.54.0)?
Or will users not experience a performance degradation relative to now, they just won't see an improvement?
> will users not experience a performance degradation relative to now, they just won't see an improvement?
This ^
In Ray v2.54.0, the defaults for `RAY_SERVE_RUN_USER_CODE_IN_SEPARATE_THREAD` and `RAY_SERVE_RUN_ROUTER_IN_SEPARATE_LOOP` will change to `0` for improved performance.
:::

Ray Serve offers performance flags that improve throughput and latency. You can enable all optimizations at once with `RAY_SERVE_THROUGHPUT_OPTIMIZED=1`, or configure individual flags:
Suggested change:
- Ray Serve offers performance flags that improve throughput and latency. You can enable all optimizations at once with `RAY_SERVE_THROUGHPUT_OPTIMIZED=1`, or configure individual flags:
+ This section details how to enable Ray Serve options focused on improving throughput and reducing latency. These configurations focus on the following:
+ - Reducing overhead associated with frequent logging.
+ - Disabling behavior that allowed Serve applications to include blocking operations.
+ If your Ray Serve code includes blocking operations, you must refactor your code to enable enhanced throughput.
+ To configure all options to the recommended settings, set the environment variable `RAY_SERVE_THROUGHPUT_OPTIMIZED=1`.
+ You can also configure each option individually. The following table details the recommended configurations and their impact:
Just confirming: if I've set RAY_SERVE_THROUGHPUT_OPTIMIZED=1, can I still override individual configs below? Or should I manually configure all 4 if I need a higher/lower buffer size for example?
> can I still override individual configs below?

No. You need to unset `RAY_SERVE_THROUGHPUT_OPTIMIZED` and set each option individually.
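To make the answer above concrete, here is a minimal sketch of building the environment for a Serve process when individual values are needed. The helper name `serve_perf_env` is hypothetical and not part of Ray. Only the two flags named in this thread are shown; the logging-related options are omitted because their names don't appear in this conversation. It assumes, as with most environment-driven settings, that the variables must be in place before the Serve processes start.

```python
import os


def serve_perf_env(base: dict[str, str]) -> dict[str, str]:
    """Return a copy of `base` with per-flag settings (hypothetical helper, not a Ray API)."""
    env = dict(base)
    # Per the thread: individual values can't override the umbrella flag,
    # so make sure it is unset before configuring flags one by one.
    env.pop("RAY_SERVE_THROUGHPUT_OPTIMIZED", None)
    # The two flags named in this PR; "0" is the throughput-oriented value.
    env["RAY_SERVE_RUN_USER_CODE_IN_SEPARATE_THREAD"] = "0"
    env["RAY_SERVE_RUN_ROUTER_IN_SEPARATE_LOOP"] = "0"
    return env


# Example: start from the current environment, pretending the umbrella flag was set.
env = serve_perf_env({**os.environ, "RAY_SERVE_THROUGHPUT_OPTIMIZED": "1"})
print({k: v for k, v in env.items() if k.startswith("RAY_SERVE_")})
```

The resulting mapping could then be passed as the `env` argument of `subprocess.run` when launching the serving process, or exported in the shell before startup.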
I decided to add a code example showing blocking and non-blocking operation, let me know what you think.
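For readers of this thread, here is a minimal illustration of what "blocking code" means in this context. This is not the example added to the PR. It is a plain `asyncio` sketch with hypothetical handler names, assuming that user code sharing an event loop with other work behaves like these coroutines do.

```python
import asyncio
import time

DELAY_S = 0.1  # stand-in for a slow operation such as I/O or a model call


async def blocking_handler() -> str:
    # Anti-pattern: time.sleep() holds the event loop, so no other
    # request on the same loop can make progress while it runs.
    time.sleep(DELAY_S)
    return "done"


async def non_blocking_handler() -> str:
    # Awaiting yields control back to the loop while waiting.
    await asyncio.sleep(DELAY_S)
    return "done"


async def offloaded_handler() -> str:
    # One way to keep a blocking call: run it in a worker thread.
    await asyncio.to_thread(time.sleep, DELAY_S)
    return "done"


async def timed(handler, n: int = 5) -> float:
    """Run n concurrent calls of `handler` and return the elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(handler() for _ in range(n)))
    return time.perf_counter() - start


blocking_s = asyncio.run(timed(blocking_handler))
non_blocking_s = asyncio.run(timed(non_blocking_handler))
offloaded_s = asyncio.run(timed(offloaded_handler))
print(f"blocking: {blocking_s:.2f}s  non-blocking: {non_blocking_s:.2f}s  offloaded: {offloaded_s:.2f}s")
```

The five blocking calls run one after another, so their total time grows with the number of requests, while the awaited and offloaded versions overlap and finish in roughly the time of a single call.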
Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: xgui <xgui@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Signed-off-by: abrar <abrar@anyscale.com> Signed-off-by: Future-Outlier <eric901201@gmail.com>