Use strings.IndexByte in tokenizeSubjectIntoSlice #7858

Draft
wallyqs wants to merge 1 commit into main from wq/tokenize-string-indexbyte

wallyqs (Member) commented Feb 19, 2026

Use strings.IndexByte instead of byte-by-byte scanning in tokenizeSubjectIntoSlice for more efficient tokenization on longer subjects with more tokens.

Signed-off-by: Waldemar Quevedo <wally@nats.io>
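
For readers without the diff at hand, the change described above can be sketched roughly as follows. This is a minimal illustration, not the actual nats-server code: the function names `tokenizeLoop` and `tokenizeIndexByte`, the `tsep` constant, and the buffer size are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// tsep is the token separator assumed for NATS-style subjects (hypothetical name).
const tsep = '.'

// tokenizeLoop splits subject on tsep by inspecting one byte at a time.
func tokenizeLoop(tts []string, subject string) []string {
	start := 0
	for i := 0; i < len(subject); i++ {
		if subject[i] == tsep {
			tts = append(tts, subject[start:i])
			start = i + 1
		}
	}
	return append(tts, subject[start:])
}

// tokenizeIndexByte produces the same tokens, but lets strings.IndexByte
// locate each separator instead of looping over individual bytes.
func tokenizeIndexByte(tts []string, subject string) []string {
	for {
		i := strings.IndexByte(subject, tsep)
		if i < 0 {
			break
		}
		tts = append(tts, subject[:i])
		subject = subject[i+1:]
	}
	return append(tts, subject)
}

func main() {
	// Caller-provided backing array, so short subjects need no heap allocation.
	var buf [8]string
	fmt.Println(tokenizeLoop(buf[:0], "foo.bar.baz"))
	fmt.Println(tokenizeIndexByte(buf[:0], "foo.bar.baz"))
}
```

Both functions are meant to return identical token slices for any input, including empty tokens such as those in `a..b`; only the way the separator is located differs.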

wallyqs requested a review from a team as a code owner February 19, 2026 22:17
wallyqs force-pushed the wq/tokenize-string-indexbyte branch from 01d3793 to 5cb98dd February 19, 2026 22:18
wallyqs pushed a commit to wallyqs/nats-server that referenced this pull request Feb 19, 2026
For subjects shorter than 32 bytes, the manual byte-by-byte scan avoids
function call overhead and matches or beats strings.IndexByte. For longer
subjects, strings.IndexByte leverages SIMD instructions for up to 3.4x
speedup. This threshold was validated across x86, arm64, and Graviton
architectures per nats-io#7858.

https://claude.ai/code/session_01TdeiSyKRBD7Z9xgt8kfHuS
wallyqs pushed a commit to wallyqs/nats-server that referenced this pull request Feb 19, 2026
Use strings.IndexByte instead of byte-by-byte iteration for subjects
>= 32 bytes in match(), hasInterest(), and tokenizeSubjectIntoSlice().
strings.IndexByte leverages SIMD optimizations that provide significant
speedups on longer strings across different architectures.

This mirrors the same optimization applied to sublist.go in PR nats-io#7858.

https://claude.ai/code/session_01Sc5YGRWoPsi6jMKDZ8n2zi
wallyqs force-pushed the wq/tokenize-string-indexbyte branch from 5cb98dd to fb31976 February 19, 2026 22:32
wallyqs changed the title from "Use strings.Index.Byte in tokenizeSubjectIntoSlice" to "Use strings.IndexByte in tokenizeSubjectIntoSlice" Feb 19, 2026
wallyqs marked this pull request as draft February 19, 2026 23:04
wallyqs pushed a commit to wallyqs/nats-server that referenced this pull request Feb 20, 2026
Use strings.IndexByte instead of a manual byte-by-byte loop.
IndexByte leverages SIMD instructions on amd64 for scanning.

Benchmarks with realistic NATS subjects (9 tokens, 105 bytes based
on observed token length distribution) show tokenization is 13%
faster, and 43% faster for short 3-token subjects:

  TokenizeSubjectIntoSlice/3_tokens_19B    23.1ns → 13.1ns  -43%
  TokenizeSubjectIntoSlice/9_tokens_105B   89.3ns → 78.0ns  -13%

Follows the same approach as nats-io#7858
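
A comparison like the one reported above could be reproduced with a small harness along these lines. This is only a sketch: the subjects are made up for illustration and are not the ones used for the numbers in the commit message, the helper names are hypothetical, and the results will vary by machine and Go version.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// tokenizeLoop is the byte-by-byte baseline (hypothetical name).
func tokenizeLoop(tts []string, subject string) []string {
	start := 0
	for i := 0; i < len(subject); i++ {
		if subject[i] == '.' {
			tts = append(tts, subject[start:i])
			start = i + 1
		}
	}
	return append(tts, subject[start:])
}

// tokenizeIndexByte is the strings.IndexByte variant (hypothetical name).
func tokenizeIndexByte(tts []string, subject string) []string {
	for {
		i := strings.IndexByte(subject, '.')
		if i < 0 {
			break
		}
		tts = append(tts, subject[:i])
		subject = subject[i+1:]
	}
	return append(tts, subject)
}

// benchTokenize times one tokenizer on one subject, reusing a fixed buffer
// so that the measurement is dominated by the separator scan.
func benchTokenize(fn func([]string, string) []string, subject string) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		var buf [32]string
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			_ = fn(buf[:0], subject)
		}
	})
}

func main() {
	// Illustrative subjects only: one short, one long with many tokens.
	subjects := []string{
		"orders.eu.created",
		strings.Repeat("segment-name.", 8) + "tail",
	}
	for _, s := range subjects {
		loop := benchTokenize(tokenizeLoop, s)
		idx := benchTokenize(tokenizeIndexByte, s)
		fmt.Printf("%3dB  loop=%dns/op  IndexByte=%dns/op\n",
			len(s), loop.NsPerOp(), idx.NsPerOp())
	}
}
```

For statistically meaningful comparisons (such as the p-value quoted elsewhere in this conversation), ordinary `go test -bench` benchmarks compared with a tool like benchstat are the usual route; the harness above is just a quick way to see the trend.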
wallyqs pushed a commit to wallyqs/nats-server that referenced this pull request Feb 20, 2026
Remove the threshold-gated branching and use strings.IndexByte
unconditionally in match() and hasInterest(), consistent with the
tokenizeSubjectIntoSlice change.

  Match/3_tokens_19B          93.5ns → 87.3ns   -6.6%
  HasInterest/3_tokens_19B    56.3ns → 54.3ns   -3.5%
  HasInterest/9_tokens_105B   162ns  → 161ns     ~ (p=0.699)

Follows the same approach as nats-io#7858
Use strings.IndexByte instead of byte-by-byte scanning in
tokenizeSubjectIntoSlice for faster subject tokenization.

Signed-off-by: Waldemar Quevedo <wally@nats.io>
wallyqs force-pushed the wq/tokenize-string-indexbyte branch from fb31976 to dce4984 February 20, 2026 06:44