Update factory processing performance in queueing #410

Merged
slawlor merged 6 commits into main from factory_perf
Mar 6, 2026

slawlor (Owner) commented Mar 6, 2026

When handling extremely large worker pools, we spend an excessive amount of time iterating over the map of workers. This causes intermittent slowdowns, and it can be optimized at a small cost in memory utilization in the factory agent.

The key change is replacing O(n) linear scans of the worker pool in choose_target_worker for QueuerRouting and StickyQueuerRouting with O(1) dispatch using a VecDeque of available workers. The improvement should be most visible with large worker pools.
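A minimal sketch of the idea, not the actual ractor code: `Pool`, `Worker`, `choose_by_scan`, `choose_from_deque`, and `release` are hypothetical names used only for illustration. It contrasts a linear scan over a worker map with popping from a `VecDeque` of available worker ids, assuming a worker is "available" when it is not busy.

```rust
use std::collections::{HashMap, VecDeque};

// Hypothetical worker id and state; the real factory tracks much more.
type WorkerId = usize;

struct Worker {
    busy: bool,
}

struct Pool {
    workers: HashMap<WorkerId, Worker>,
    // Extra bookkeeping: ids of workers that are currently free.
    available: VecDeque<WorkerId>,
}

impl Pool {
    fn new(n: usize) -> Self {
        Pool {
            workers: (0..n).map(|id| (id, Worker { busy: false })).collect(),
            available: (0..n).collect(),
        }
    }

    // O(n): walk the map until a free worker is found.
    fn choose_by_scan(&self) -> Option<WorkerId> {
        self.workers
            .iter()
            .find(|(_, w)| !w.busy)
            .map(|(id, _)| *id)
    }

    // O(1): take the next free worker from the front of the deque.
    fn choose_from_deque(&mut self) -> Option<WorkerId> {
        let id = self.available.pop_front()?;
        let worker = self.workers.get_mut(&id)?;
        worker.busy = true;
        Some(id)
    }

    // O(1): a worker that finished its job goes to the back of the deque.
    fn release(&mut self, id: WorkerId) {
        if let Some(worker) = self.workers.get_mut(&id) {
            if worker.busy {
                worker.busy = false;
                self.available.push_back(id);
            }
        }
    }
}
```

The trade-off is the one described above: the deque costs one extra id per worker in memory, and in exchange dispatch no longer depends on the pool size.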

slawlor added 3 commits March 5, 2026 21:38
When handling extremely large worker pools, we're spending excessive amounts of time iterating the map of workers. This is causing slowdowns sometimes, and can be optimized at a small cost of memory util in the factory agent.
We need to pin deeply nested futures, and this was showing in our cluster tests on the new Rust runtime.
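The commit message does not say how the futures are pinned. One common approach in Rust is `Box::pin`, which moves a nested future's state to the heap instead of inlining it into the parent future. The sketch below assumes that approach; `leaf`, `middle`, `outer`, and the tiny `block_on` helper are hypothetical and exist only to keep the example self-contained.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

async fn leaf(x: u64) -> u64 {
    x + 1
}

async fn middle(x: u64) -> u64 {
    // Box::pin heap-allocates the inner future, so its state is not
    // embedded in the state machine of `middle`.
    Box::pin(leaf(x)).await * 2
}

async fn outer(x: u64) -> u64 {
    Box::pin(middle(x)).await + 3
}

// Waker that does nothing; enough for futures that never wait on I/O.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Minimal executor for illustration only: polls the future in a loop.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

Calling `block_on(outer(1))` drives the nested chain to completion; a real application would use its async runtime instead of this helper.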
codecov bot commented Mar 6, 2026

Codecov Report

❌ Patch coverage is 78.82353% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.56%. Comparing base (b3b589a) to head (b94075b).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ractor/src/factory/routing.rs | 80.00% | 13 Missing ⚠️ |
| ractor/src/factory/factoryimpl.rs | 68.75% | 5 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #410      +/-   ##
==========================================
- Coverage   85.60%   85.56%   -0.05%     
==========================================
  Files          73       73              
  Lines       13256    13329      +73     
==========================================
+ Hits        11348    11405      +57     
- Misses       1908     1924      +16     


slawlor (Owner, Author) commented Mar 6, 2026

Benchmark: Factory Queuer Dispatch Throughput

Measures raw steady-state dispatch speed: the factory and its entire worker pool
are spawned once and reused across all criterion iterations, so only message
routing + processing is timed (50,000 messages per iteration).

| Workers | main (O(n) scan) | PR #410 (O(1) deque) | Speedup |
|---|---|---|---|
| 100 | 47.1 ms · 1.06 M elem/s | 48.6 ms · 1.03 M elem/s | ~1× |
| 1,000 | 113.7 ms · 440 K elem/s | 52.3 ms · 956 K elem/s | 2.2× |
| 5,000 | 398.2 ms · 126 K elem/s | 56.6 ms · 883 K elem/s | 7.0× |
| 10,000 | 732.6 ms · 68 K elem/s | 61.7 ms · 811 K elem/s | 11.9× |

Analysis

  • Throughput stays nearly flat with this PR — scaling from 100 → 10,000 workers
    only drops from 1.03 M to 811 K elem/s (~21%, from actor management overhead).
    On main the same range collapses from 1.06 M to 68 K elem/s (~94%), showing the
    O(n) HashMap::iter().find() scan becomes the dominant bottleneck.

  • At 100 workers the two branches are at parity — the VecDeque bookkeeping cost
    is negligible and the linear scan is short enough not to matter.

  • At 10,000 workers this PR is ~12× faster.
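
The throughput, speedup, and percentage figures above can be re-derived from the reported times, assuming throughput is simply 50,000 messages divided by the iteration time. The helper names below are illustrative and are not part of the benchmark code.

```rust
// Messages processed per criterion iteration, as stated in the benchmark.
const MESSAGES_PER_ITER: f64 = 50_000.0;

// Elements per second for an iteration that took `ms` milliseconds.
fn throughput(ms: f64) -> f64 {
    MESSAGES_PER_ITER / (ms / 1000.0)
}

// How many times faster the PR is than main for the same pool size.
fn speedup(main_ms: f64, pr_ms: f64) -> f64 {
    main_ms / pr_ms
}

// Relative throughput loss, in percent, when going from `from` to `to`.
fn drop_pct(from: f64, to: f64) -> f64 {
    (from - to) / from * 100.0
}
```

For example, `speedup(732.6, 61.7)` is about 11.9 and `drop_pct(1.06e6, 68e3)` is about 94, in line with the table. Recomputing throughput from the displayed times can differ in the last digit, presumably because the times are rounded.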

slawlor force-pushed the factory_perf branch 5 times, most recently from e8801af to 7e43753, March 6, 2026 17:38
slawlor marked this pull request as ready for review March 6, 2026 18:02
slawlor merged commit e6174c8 into main Mar 6, 2026
23 of 27 checks passed
slawlor deleted the factory_perf branch March 6, 2026 18:19