memstore: speed up LoadNextMsg when using wildcard filter #7840

Merged
neilalexander merged 1 commit into main from memstore-wildcard-matching-opt
Feb 18, 2026
Conversation

@sciascid
Contributor

This commit speeds up wildcard-based filtering:

  • Avoid expanding the bounds for matching fss entries that are past our search
  • Avoid unnecessary creation of a list of matching subjects
  • Introduce MatchUntil, which allows an early stop if we find a match with first <= start

Signed-off-by: Daniele Sciascia daniele@nats.io
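The MatchUntil API is internal to the server's subject tree, but the early-stop callback pattern it introduces can be sketched as follows. This is a simplified stand-in: the ssEntry type, the matchUntil function, and the flat slice of entries are all hypothetical; the real implementation walks a SubjectTree rather than a slice.

```go
package main

import "fmt"

// ssEntry loosely mirrors a per-subject state entry holding the first
// and last sequence recorded for that subject. Hypothetical type.
type ssEntry struct {
	subject     string
	first, last uint64
}

// matchUntil walks the entries that satisfy match and invokes cb for
// each one, stopping early as soon as cb returns false. It reports
// whether the walk ran to completion. The real implementation walks a
// subject tree rather than a flat slice.
func matchUntil(entries []ssEntry, match func(string) bool, cb func(ssEntry) bool) bool {
	for _, e := range entries {
		if !match(e.subject) {
			continue
		}
		if !cb(e) {
			return false // callback requested to stop matching
		}
	}
	return true
}

func main() {
	entries := []ssEntry{
		{"foo.bar.1", 10, 20},
		{"foo.bar.2", 5, 8},
		{"foo.bar.3", 30, 40},
	}
	start := uint64(7)
	var best ssEntry
	done := matchUntil(entries,
		func(s string) bool { return len(s) > 4 && s[:4] == "foo." },
		func(e ssEntry) bool {
			if best.subject == "" || e.first < best.first {
				best = e
			}
			// Early stop: a match with first <= start cannot be beaten.
			return e.first > start
		})
	fmt.Println(best.subject, done) // foo.bar.2 false
}
```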

@sciascid sciascid requested a review from a team as a code owner February 17, 2026 13:48
@sciascid
Contributor Author

Results for the benchmarks in this PR:

goos: linux
goarch: amd64
pkg: github.com/nats-io/nats-server/v2/server
cpu: AMD Ryzen 7 PRO 7840U w/ Radeon 780M Graphics
                                                      │      old.txt       │               new.txt                │
                                                      │       sec/op       │   sec/op     vs base                 │
_MemStoreLoadNextMsgFiltered/wildcard_linear_scan-16            2.263 ± 0%    2.265 ± 0%         ~ (p=0.393 n=10)
_MemStoreLoadNextMsgFiltered/wildcard_bounded_scan-16   2237858263.0n ± 1%   109.7n ± 3%  -100.00% (p=0.000 n=10)
_MemStoreLoadNextMsgFiltered/literal_bounded_scan-16           843.5n ± 4%   879.4n ± 7%         ~ (p=0.143 n=10)
geomean                                                        16.23m        60.22µ        -99.63%

This PR improves the "wildcard_bounded_scan" case. This case populates a memstore with 10 million messages and then repeatedly calls LoadNextMsg until EOF. In this case, the filter given to LoadNextMsg matches only a subset of 100 messages out of the 10 million.
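The drain-until-EOF loop described above can be sketched like this. It is a toy stand-in: the toyStore type and loadNextMsg method are hypothetical, and the real benchmark drives the server's memstore with 10 million messages rather than a slice of 100 sequences.

```go
package main

import (
	"errors"
	"fmt"
)

var errEOF = errors.New("eof")

// toyStore stands in for the memstore: seqs holds the sequence numbers
// of the messages matching the filter, in ascending order.
type toyStore struct{ seqs []uint64 }

// loadNextMsg returns the first matching sequence >= start, or errEOF.
// The real LoadNextMsg also returns the message itself.
func (s *toyStore) loadNextMsg(start uint64) (uint64, error) {
	for _, seq := range s.seqs {
		if seq >= start {
			return seq, nil
		}
	}
	return 0, errEOF
}

func main() {
	// 100 matching messages scattered through a much larger stream,
	// one every 10,000 sequences, echoing the benchmark setup.
	store := &toyStore{}
	for i := uint64(1); i <= 100; i++ {
		store.seqs = append(store.seqs, i*10_000)
	}
	start, count := uint64(1), 0
	for {
		seq, err := store.loadNextMsg(start)
		if err != nil {
			break // EOF: no more matches
		}
		count++
		start = seq + 1 // advance past the message just loaded
	}
	fmt.Println(count) // 100
}
```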

The other cases show no regressions and no improvements, which is the expectation.

Member

@neilalexander neilalexander left a comment

Overall looks great, couple minor things.

// Internal function which can be called recursively to match all leaf nodes to a given filter subject which
// once here has been decomposed to parts. These parts only care about wildcards, both pwc and fwc.
func (t *SubjectTree[T]) match(n node, parts [][]byte, pre []byte, cb func(subject []byte, val *T)) {
// Returns false if the callback requested to stop matching.
Member

Does anything use this return value outside of tests? Trying to decide if it's a useful value.

Contributor Author

Currently no use outside of tests. I don't have a strong opinion on this. Let me know if I should remove it.

if start > ss.Last {
return 0, 0, false
}
if ss.First > start {
Member

Might be able to simplify this into a max() case on the return line for brevity? e.g. return max(start, ss.First), ...
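The suggested simplification can be illustrated as follows. This is a sketch: the function names are hypothetical, the snippet under review is reduced to two return values here, and the max builtin requires Go 1.21+.

```go
package main

import "fmt"

// Before: explicit branch, the shape of the code under review
// (simplified here to two return values; names are hypothetical).
func firstSeqBranch(start, first, last uint64) (uint64, bool) {
	if start > last {
		return 0, false
	}
	if first > start {
		start = first
	}
	return start, true
}

// After: the reviewer's suggested max() form, same behavior.
func firstSeqMax(start, first, last uint64) (uint64, bool) {
	if start > last {
		return 0, false
	}
	return max(start, first), true
}

func main() {
	fmt.Println(firstSeqBranch(5, 10, 20)) // 10 true
	fmt.Println(firstSeqMax(5, 10, 20))    // 10 true
}
```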

Contributor Author

done

wallyqs pushed a commit to wallyqs/nats-server that referenced this pull request Feb 17, 2026
Detailed review of the MatchUntil + early-termination optimization for
wildcard-filtered LoadNextMsg. Identifies a benchmark bug (start/count
not reset between b.Loop() iterations) and suggests reconsidering the
linearScanMaxFSS=256 threshold which limits the optimization to small
subject trees.

https://claude.ai/code/session_0198NdQiRC8VME4WzZyQYEt3
wallyqs pushed a commit to wallyqs/nats-server that referenced this pull request Feb 17, 2026
Ran benchmarks on both base and PR commits with the start/count reset
fix applied for accurate measurements. Results confirm ~12,400x speedup
for wildcard bounded scan (4s -> 323us) with no regressions in linear
scan or literal scan paths.

https://claude.ai/code/session_0198NdQiRC8VME4WzZyQYEt3
name: "wildcard_linear_scan",
msgs: 10_000_000,
matchingMsgEvery: 10_000,
filter: "foo.baz.*",
Member

This does not match the subject from the stream config ([]string{"foo.*"}), but is it OK for the test?

Contributor Author

Changed to foo.>

Contributor Author

even though it made no difference

This commit speeds up wildcard based filtering:

- Avoid expanding the bounds for matching fss entries
  that are past our search
- Avoid unnecessary creation of a list with matching
  subjects
- Introduce MatchUntil, this allows early stop if we
  find a match with first <= start

Signed-off-by: Daniele Sciascia <daniele@nats.io>
@sciascid sciascid force-pushed the memstore-wildcard-matching-opt branch from 50e15a5 to 0fd9865 on February 18, 2026 08:39
@sciascid
Contributor Author

There was a bug in the benchmark accounting. The variables start and count need to be reset at the beginning of the for b.Loop() {} loop (reported by claude; codex did not spot it!).
Attaching a new set of results:

goos: linux
goarch: amd64
pkg: github.com/nats-io/nats-server/v2/server
cpu: AMD Ryzen 7 PRO 7840U w/ Radeon 780M Graphics
                                                           │     old.txt     │               new.txt               │
                                                           │     sec/op      │   sec/op     vs base                │
_MemStoreLoadNextMsgFilteredFixed/wildcard_linear_scan-16         2.274 ± 0%    2.272 ± 1%        ~ (p=0.912 n=10)
_MemStoreLoadNextMsgFilteredFixed/wildcard_bounded_scan-16   2248966.6µ ± 1%   205.1µ ± 2%  -99.99% (p=0.000 n=10)
_MemStoreLoadNextMsgFilteredFixed/literal_bounded_scan-16        886.3m ± 1%   884.3m ± 1%        ~ (p=0.739 n=10)
geomean                                                           1.655        74.42m       -95.50%

What changed? The bug somewhat favored the optimization. We now get 205.1µs for the wildcard_bounded_scan case, whereas prior to fixing the bug the benchmark reported 109.7ns. That value changed quite a bit, but overall the results are still very good compared to the baseline of 2248966.6µs.
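The accounting bug can be illustrated with a toy version of the loop: when start is not reset, every iteration after the first finds nothing left to load, which makes the benchmark look faster than it is. The drain helper below is a hypothetical stand-in for the repeated LoadNextMsg calls inside for b.Loop() {}.

```go
package main

import "fmt"

// drain counts the matching sequences >= start and returns the count
// and the next start position, standing in for the LoadNextMsg loop.
func drain(seqs []uint64, start uint64) (count int, next uint64) {
	for _, seq := range seqs {
		if seq >= start {
			count++
			start = seq + 1
		}
	}
	return count, start
}

func main() {
	seqs := []uint64{10, 20, 30}

	// Buggy shape: start carried over between benchmark iterations,
	// so iterations after the first found nothing left to load.
	start, total := uint64(1), 0
	for iter := 0; iter < 3; iter++ {
		n, next := drain(seqs, start)
		total += n
		start = next
	}
	fmt.Println(total) // 3

	// Fixed shape: reset the per-iteration state at the top of each
	// iteration, as the PR now does inside its for b.Loop() {} loop.
	total = 0
	for iter := 0; iter < 3; iter++ {
		n, _ := drain(seqs, 1)
		total += n
	}
	fmt.Println(total) // 9
}
```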

Member

@neilalexander neilalexander left a comment

LGTM!

@neilalexander neilalexander merged commit 02c7003 into main Feb 18, 2026
89 of 92 checks passed
@neilalexander neilalexander deleted the memstore-wildcard-matching-opt branch February 18, 2026 09:56
neilalexander added a commit that referenced this pull request Feb 20, 2026
Includes the following:

- #7839
- #7843
- #7824
- #7826
- #7845
- #7844
- #7840
- #7827
- #7846
- #7848
- #7849
- #7855
- #7850
- #7857
- #7856

Signed-off-by: Neil Twigg <neil@nats.io>
3 participants