Optimize MaxMap sparse range iteration with a lazy realized-key cache by peter941221 · Pull Request #94 · sigp/milhouse

peter941221 · 2026-03-21T05:40:57Z

Summary

This fixes the sparse-update hot path behind repeated for_each_range(...) calls on MaxMap<VecMap<_>>.

The core issue was that range iteration over sparse updates still paid for dense scans across large logical intervals during apply-time tree rebuilding. This patch makes MaxMap lazily cache realized update keys and answer subsequent range scans via binary search over that sorted key set.

What Changed

add a lazy realized-key cache to MaxMap
build the cache on the first for_each_range(...)
answer later range queries by searching the cached sorted keys
invalidate the cache on mutation
preserve get_cow_with(...) lazy materialization semantics
keep max_index() behavior compatible with the existing tests/proptests

Why This Shape

it fixes the hot path at the update-map layer instead of adding a Lighthouse-local workaround
it keeps the public UpdateMap API unchanged
it benefits both Tree::with_updated_leaves(...) and PackedLeaf::update(...)
it respects the current Cow model instead of forcing eager materialization

Validation

cargo check
cargo check --all-targets
cargo check --benches
targeted regression tests for get_mut_with, get_cow_with, and sparse apply paths
cargo test --release
Criterion benchmark for apply_updates

Benchmark Snapshot

Case	Before	After
`sparse_two`	`3.72-4.08 ms`	`2.93-3.07 ms`
`sparse_eight`	`4.34-4.58 ms`	`3.03-3.23 ms`
`dense_window_512`	`14.70-17.89 ms`	`1.98-2.29 ms`

Additional Context

Algorithmically, this changes the apply path from repeated dense interval scans to sparse ordered range reporting over realized update keys. The implementation is intentionally simple and local, but the framing is close to classic ordered-set / range-reporting work:

Mihai Patrascu, Mikkel Thorup, Time-space trade-offs for predecessor search (STOC 2006), DOI: 10.1145/1132516.1132551
Stephen Alstrup, Gerth Stolting Brodal, Theis Rauhe, Optimal static range reporting in one dimension (STOC 2001), DOI: 10.1145/380752.380842

This patch is not attempting to implement those structures directly; it is applying the same high-level move in a minimal crate-local form.

peter941221 · 2026-03-21T06:00:03Z

Pushed a small follow-up after a deeper review pass.

It adds an early return for empty out-of-range for_each_range(...) probes before the sparse-key cache is materialized, plus a regression test covering that case.

The main optimization is unchanged; this just removes the most obvious one-shot cold-start objection without changing the public API or the lazy materialization model.

codecov-commenter · 2026-03-21T11:45:51Z

Codecov Report

❌ Patch coverage is 85.71429% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.81%. Comparing base (3fcb03c) to head (0a2dfce).

Files with missing lines	Patch %	Lines
src/update_map.rs	85.71%	7 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #94      +/-   ##
==========================================
+ Coverage   70.15%   70.81%   +0.65%     
==========================================
  Files          22       22              
  Lines        1280     1326      +46     
==========================================
+ Hits          898      939      +41     
- Misses        382      387       +5

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

michaelsproul · 2026-03-21T11:48:09Z

This looks pretty good, albeit maybe on the slightly complicated side.

I'll do a bit of independent investigation and see what I can come up with. If I reach the same conclusion I'll be sure to merge your PR!

Btw, guessing you used AI quite heavily here? Not an issue, just we are trying to track AI contribs to LH.

peter941221 · 2026-03-21T12:45:25Z

Thanks — that sounds good, and I appreciate the independent pass.

And yes, this was AI-assisted. I used AI mostly to speed up codebase exploration, benchmark/test iteration, and to sanity-check implementation ideas, but the root-cause investigation, design choice, final code changes, and local validation were mine.

I also tracked the Ubuntu CI failure down to the DHAT helper in tests/list_u64.rs: the new MaxMap layout appears to have nudged allocator-retained bytes above the structural MemoryTracker count on that runner/toolchain (45840 vs 45088, so 752 bytes of slack) without changing behavior. I pushed a small follow-up in 0a2dfce that relaxes the helper from exact byte equality to a bounded 1 KiB allocator-slack check, which keeps the memory test useful while avoiding allocator size-class brittleness across platforms.

If it helps your LH tracking, I'm happy to add a short AI-assistance note to the PR description as well.

peter941221 · 2026-03-31T13:23:37Z

Quick follow-up now that sigp/lighthouse#9003 was closed via #9017 and the deeper exploration moved to sigp/lighthouse#9033.

I am not trying to force this into the just-ship-the-release path. The current head here is still 0a2dfce, and it is green across cargo-fmt, clippy, test-ubuntu-latest, test-macos-latest, test-windows-latest, and cargo-tarpaulin.

If the generic VecMap / milhouse optimization still feels useful for the post-release follow-up, I am happy to keep iterating here and can add a small benchmark matrix for the remaining VecMap users (for example balances) against the validators-only fix.

If, after #9017, this feels like too much complexity for the remaining upside, no worries — feel free to close this and I can always respin a smaller experiment later around #9033.

Optimize MaxMap sparse range iteration with lazy realized-key cache

975b581

peter941221 mentioned this pull request Mar 21, 2026

BeaconState::apply_pending_mutations is slow with changes to validators sigp/lighthouse#9003

Closed

Skip cache materialization for empty out-of-range probes

ebc96fa

Relax DHAT equality for allocator slack

0a2dfce

michaelsproul mentioned this pull request Mar 26, 2026

Explore more Milhouse optimisations sigp/lighthouse#9033

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize MaxMap sparse range iteration with a lazy realized-key cache#94

Optimize MaxMap sparse range iteration with a lazy realized-key cache#94
peter941221 wants to merge 3 commits intosigp:mainfrom
peter941221:perf/maxmap-sparse-range-cache

peter941221 commented Mar 21, 2026

Uh oh!

peter941221 commented Mar 21, 2026

Uh oh!

codecov-commenter commented Mar 21, 2026 •

edited

Loading

Uh oh!

michaelsproul commented Mar 21, 2026

Uh oh!

peter941221 commented Mar 21, 2026

Uh oh!

peter941221 commented Mar 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

peter941221 commented Mar 21, 2026

Summary

What Changed

Why This Shape

Validation

Benchmark Snapshot

Additional Context

Uh oh!

peter941221 commented Mar 21, 2026

Uh oh!

codecov-commenter commented Mar 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

michaelsproul commented Mar 21, 2026

Uh oh!

peter941221 commented Mar 21, 2026

Uh oh!

peter941221 commented Mar 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codecov-commenter commented Mar 21, 2026 •

edited

Loading