Use strings.IndexByte in tokenizeSubjectIntoSlice#7858
Draft
01d3793 to
5cb98dd
Compare
wallyqs
pushed a commit
to wallyqs/nats-server
that referenced
this pull request
Feb 19, 2026
For subjects shorter than 32 bytes, the manual byte-by-byte scan avoids function call overhead and matches or beats strings.IndexByte. For longer subjects, strings.IndexByte leverages SIMD instructions for up to 3.4x speedup. This threshold was validated across x86, arm64, and Graviton architectures per nats-io#7858. https://claude.ai/code/session_01TdeiSyKRBD7Z9xgt8kfHuS
wallyqs
pushed a commit
to wallyqs/nats-server
that referenced
this pull request
Feb 19, 2026
Use strings.IndexByte instead of byte-by-byte iteration for subjects >= 32 bytes in match(), hasInterest(), and tokenizeSubjectIntoSlice(). strings.IndexByte leverages SIMD optimizations that provide significant speedups on longer strings across different architectures. This mirrors the optimization applied to sublist.go in PR nats-io#7858. https://claude.ai/code/session_01Sc5YGRWoPsi6jMKDZ8n2zi
5cb98dd to
fb31976
Compare
wallyqs
pushed a commit
to wallyqs/nats-server
that referenced
this pull request
Feb 20, 2026
Use strings.IndexByte instead of a manual byte-by-byte loop. IndexByte leverages SIMD instructions on amd64 for scanning. Benchmarks with realistic NATS subjects (9 tokens, 105 bytes based on observed token length distribution) show tokenization is 13% faster, and 43% faster for short 3-token subjects:

TokenizeSubjectIntoSlice/3_tokens_19B    23.1ns → 13.1ns   -43%
TokenizeSubjectIntoSlice/9_tokens_105B   89.3ns → 78.0ns   -13%

Follows the same approach as nats-io#7858
wallyqs
pushed a commit
to wallyqs/nats-server
that referenced
this pull request
Feb 20, 2026
Remove the threshold-gated branching and use strings.IndexByte unconditionally in match() and hasInterest(), consistent with the tokenizeSubjectIntoSlice change.

Match/3_tokens_19B           93.5ns → 87.3ns   -6.6%
HasInterest/3_tokens_19B     56.3ns → 54.3ns   -3.5%
HasInterest/9_tokens_105B    162ns  → 161ns    ~ (p=0.699)

Follows the same approach as nats-io#7858
Use strings.IndexByte instead of byte-by-byte scanning in tokenizeSubjectIntoSlice for faster subject tokenization. Signed-off-by: Waldemar Quevedo <wally@nats.io>
fb31976 to
dce4984
Compare
Use strings.IndexByte instead of byte-by-byte scanning in tokenizeSubjectIntoSlice for more efficient tokenization on longer subjects with more tokens.

Signed-off-by: Waldemar Quevedo <wally@nats.io>