I'm using Valkey on Google Cloud Platform Memorystore.
I tried shifting traffic from the existing Valkey instance running 8.0 to a new one running 9.0, but ended up rolling it back due to P99 read latency degradation.
Here's the Datadog APM of valkey.command operation.
And here's the timeline.
- April 15th, 18:00 (JST): Shifted traffic to a 9.0 instance
- April 16th, 08:00 (JST): Rolled traffic back to the original 8.0 instance
During the time,
- At peak time (20:00-22:00), P99 read latency almost always stayed at 35ms timeout threshold
- At low traffic time (02:00-06:00): P99 read latency spikes occurred once every 1 hour, which matches the cluster topology refresh interval.
#3451 is a similar issue, but it compares Valkey 9.0 against Redis 8.6, not Valkey 8.0.
May I know what changes introduced to 9.0 caused this issue, and any plan to resolve it? Thank you in advance.
I'm using Valkey on Google Cloud Platform Memorystore.
I tried shifting traffic from the existing Valkey instance running 8.0 to a new one running 9.0, but ended up rolling it back due to P99 read latency degradation.
Here's the Datadog APM of
valkey.commandoperation.And here's the timeline.
During the time,
#3451 is a similar issue, but it compares Valkey 9.0 against Redis 8.6, not Valkey 8.0.
May I know what changes introduced to 9.0 caused this issue, and any plan to resolve it? Thank you in advance.