Add LRU to improve the memory usage of the indexer #2050
Lucccyo wants to merge 6 commits into ocaml:main
Conversation
(I did not review the tricky pointer arithmetic of the dllist yet) |
Force-pushed from 5bf10b0 to 4f75584
src/index-format/granular_marshal.ml (Outdated)
| In_memory v -> write_child lnk schema v size ~placeholders ~restore
| On_disk _ ->
    write_child lnk schema (fetch lnk) size ~placeholders ~restore)
| In_memory v | In_memory_c (v, _) ->
I'm curious whether your tests ran into the In_memory_c case here. The current code is correct, but if this case does happen in practice then In_memory_c would benefit from being translated into a pointed-index On_disk_ptr directly (since an In_memory_c is an in-memory value that was read from an existing file).
My test doesn't hit the In_memory_c case here.
src/index-format/granular_marshal.ml (Outdated)
in
List.iter
  (fun (Cached (link, loc, store, schema)) ->
    Cache.remove store.cache loc;
This remove doesn't look entirely correct, since we are losing sharing: we never reintroduce lnk into the cache later, so if that offset turns out to still be useful we'll allocate new links for the same value. Do your benchmarks show that this remove is necessary? :)
Yes, I agree.
On the benchmarks, removing the Cache.remove increases the max memory usage by a factor of about 1.1.
The index links are stored in a Hashtable, which grows without bound as new entries are added. This uses a lot of memory.
This PR replaces the use of a Hashtable with an LRU cache to bound memory usage, keeping only the most recently used indices and evicting the least recently used entry when the capacity is reached.
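To illustrate the idea, here is a minimal LRU cache sketch in OCaml. The module and function names are hypothetical and this is not the PR's implementation: the PR tracks recency with a doubly linked list (the "dllist" mentioned above), while this sketch uses a monotonic use counter, trading O(capacity) eviction for simplicity.

```ocaml
(* Hypothetical minimal LRU cache: keeps at most [capacity] bindings.
   A lookup or insertion marks the key as most recently used; inserting
   a new key into a full cache evicts the least recently used binding. *)
module Lru = struct
  type ('k, 'v) t = {
    table : ('k, 'v * int ref) Hashtbl.t; (* value + last-use tick *)
    capacity : int;
    mutable tick : int;                   (* monotonic use counter *)
  }

  let create capacity =
    { table = Hashtbl.create capacity; capacity; tick = 0 }

  (* Bump the global counter and stamp this binding as freshest. *)
  let touch t stamp =
    t.tick <- t.tick + 1;
    stamp := t.tick

  let find_opt t key =
    match Hashtbl.find_opt t.table key with
    | None -> None
    | Some (v, stamp) -> touch t stamp; Some v

  (* Remove the binding with the smallest (oldest) tick. *)
  let evict_oldest t =
    let oldest =
      Hashtbl.fold
        (fun k (_, stamp) acc ->
          match acc with
          | Some (_, s) when s <= !stamp -> acc
          | _ -> Some (k, !stamp))
        t.table None
    in
    match oldest with
    | Some (k, _) -> Hashtbl.remove t.table k
    | None -> ()

  let add t key value =
    if (not (Hashtbl.mem t.table key))
       && Hashtbl.length t.table >= t.capacity
    then evict_oldest t;
    let stamp = ref 0 in
    touch t stamp;
    Hashtbl.replace t.table key (value, stamp)
end

let () =
  let c = Lru.create 2 in
  Lru.add c "a" 1;
  Lru.add c "b" 2;
  ignore (Lru.find_opt c "a"); (* "a" becomes most recently used *)
  Lru.add c "c" 3              (* evicts "b", the least recently used *)
```

The dllist-based version in the PR makes both lookup and eviction O(1) by splicing the touched node to the front of the list and evicting from the back, at the cost of the trickier pointer manipulation the reviewer mentions.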