Skip to content

Optimize MaxMap sparse range iteration with a lazy realized-key cache#94

Open
peter941221 wants to merge 3 commits intosigp:mainfrom
peter941221:perf/maxmap-sparse-range-cache
Open

Optimize MaxMap sparse range iteration with a lazy realized-key cache#94
peter941221 wants to merge 3 commits intosigp:mainfrom
peter941221:perf/maxmap-sparse-range-cache

Conversation

@peter941221
Copy link
Copy Markdown

Summary

This fixes the sparse-update hot path behind repeated for_each_range(...) calls on MaxMap<VecMap<_>>.

The core issue was that range iteration over sparse updates still paid for dense scans across large logical intervals during apply-time tree rebuilding. This patch makes MaxMap lazily cache realized update keys and answer subsequent range scans via binary search over that sorted key set.

What Changed

  • add a lazy realized-key cache to MaxMap
  • build the cache on the first for_each_range(...)
  • answer later range queries by searching the cached sorted keys
  • invalidate the cache on mutation
  • preserve get_cow_with(...) lazy materialization semantics
  • keep max_index() behavior compatible with the existing tests/proptests

Why This Shape

  • it fixes the hot path at the update-map layer instead of adding a Lighthouse-local workaround
  • it keeps the public UpdateMap API unchanged
  • it benefits both Tree::with_updated_leaves(...) and PackedLeaf::update(...)
  • it respects the current Cow model instead of forcing eager materialization

Validation

  • cargo check
  • cargo check --all-targets
  • cargo check --benches
  • targeted regression tests for get_mut_with, get_cow_with, and sparse apply paths
  • cargo test --release
  • Criterion benchmark for apply_updates

Benchmark Snapshot

Case Before After
sparse_two 3.72-4.08 ms 2.93-3.07 ms
sparse_eight 4.34-4.58 ms 3.03-3.23 ms
dense_window_512 14.70-17.89 ms 1.98-2.29 ms

Additional Context

Algorithmically, this changes the apply path from repeated dense interval scans to sparse ordered range reporting over realized update keys. The implementation is intentionally simple and local, but the framing is close to classic ordered-set / range-reporting work:

  • Mihai Patrascu, Mikkel Thorup, Time-space trade-offs for predecessor search (STOC 2006), DOI: 10.1145/1132516.1132551
  • Stephen Alstrup, Gerth Stolting Brodal, Theis Rauhe, Optimal static range reporting in one dimension (STOC 2001), DOI: 10.1145/380752.380842

This patch is not attempting to implement those structures directly; it is applying the same high-level move in a minimal crate-local form.

@peter941221
Copy link
Copy Markdown
Author

Pushed a small follow-up after a deeper review pass.

It adds an early return for empty out-of-range for_each_range(...) probes before the sparse-key cache is materialized, plus a regression test covering that case.

The main optimization is unchanged; this just removes the most obvious one-shot cold-start objection without changing the public API or the lazy materialization model.

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Mar 21, 2026

Codecov Report

❌ Patch coverage is 85.71429% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.81%. Comparing base (3fcb03c) to head (0a2dfce).

Files with missing lines Patch % Lines
src/update_map.rs 85.71% 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #94      +/-   ##
==========================================
+ Coverage   70.15%   70.81%   +0.65%     
==========================================
  Files          22       22              
  Lines        1280     1326      +46     
==========================================
+ Hits          898      939      +41     
- Misses        382      387       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@michaelsproul
Copy link
Copy Markdown
Member

This looks pretty good, albeit maybe on the slightly complicated side.

I'll do a bit of independent investigation and see what I can come up with. If I reach the same conclusion I'll be sure to merge your PR!

Btw, guessing you used AI quite heavily here? Not an issue, just we are trying to track AI contribs to LH.

@peter941221
Copy link
Copy Markdown
Author

Thanks — that sounds good, and I appreciate the independent pass.

And yes, this was AI-assisted. I used AI mostly to speed up codebase exploration, benchmark/test iteration, and to sanity-check implementation ideas, but the root-cause investigation, design choice, final code changes, and local validation were mine.

I also tracked the Ubuntu CI failure down to the DHAT helper in tests/list_u64.rs: the new MaxMap layout appears to have nudged allocator-retained bytes above the structural MemoryTracker count on that runner/toolchain (45840 vs 45088, so 752 bytes of slack) without changing behavior. I pushed a small follow-up in 0a2dfce that relaxes the helper from exact byte equality to a bounded 1 KiB allocator-slack check, which keeps the memory test useful while avoiding allocator size-class brittleness across platforms.

If it helps your LH tracking, I'm happy to add a short AI-assistance note to the PR description as well.

@peter941221
Copy link
Copy Markdown
Author

Quick follow-up now that sigp/lighthouse#9003 was closed via #9017 and the deeper exploration moved to sigp/lighthouse#9033.

I am not trying to force this into the just-ship-the-release path. The current head here is still 0a2dfce, and it is green across cargo-fmt, clippy, test-ubuntu-latest, test-macos-latest, test-windows-latest, and cargo-tarpaulin.

If the generic VecMap / milhouse optimization still feels useful for the post-release follow-up, I am happy to keep iterating here and can add a small benchmark matrix for the remaining VecMap users (for example balances) against the validators-only fix.

If, after #9017, this feels like too much complexity for the remaining upside, no worries — feel free to close this and I can always respin a smaller experiment later around #9033.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants