Use strings.IndexByte in tokenizeSubjectIntoSlice by wallyqs · Pull Request #7858 · nats-io/nats-server

wallyqs · 2026-02-19T22:17:34Z

Use strings.IndexByte instead of byte-by-byte scanning in tokenizeSubjectIntoSlice for more efficient tokenization on longer subjects with more tokens.

Signed-off-by: Waldemar Quevedo <wally@nats.io>

For subjects shorter than 32 bytes, the manual byte-by-byte scan avoids function call overhead and matches or beats strings.IndexByte. For longer subjects, strings.IndexByte leverages SIMD instructions for up to 3.4x speedup. This threshold was validated across x86, arm64, and Graviton architectures per nats-io#7858. https://claude.ai/code/session_01TdeiSyKRBD7Z9xgt8kfHuS

Use strings.IndexByte instead of byte-by-byte iteration for subjects >= 32 bytes in match(), hasInterest(), and tokenizeSubjectIntoSlice(). strings.IndexByte leverages SIMD optimizations that provide significant speedups on longer strings across different architectures. This mirrors the same optimization applied to sublist.go in PR nats-io#7858. https://claude.ai/code/session_01Sc5YGRWoPsi6jMKDZ8n2zi

Use strings.IndexByte instead of a manual byte-by-byte loop. IndexByte leverages SIMD instructions on amd64 for scanning. Benchmarks with realistic NATS subjects (9 tokens, 105 bytes, based on the observed token length distribution) show tokenization is 13% faster, and 43% faster for short 3-token subjects:

    TokenizeSubjectIntoSlice/3_tokens_19B     23.1ns → 13.1ns   -43%
    TokenizeSubjectIntoSlice/9_tokens_105B    89.3ns → 78.0ns   -13%

Follows the same approach as nats-io#7858.

Remove the threshold-gated branching and use strings.IndexByte unconditionally in match() and hasInterest(), consistent with the tokenizeSubjectIntoSlice change.

    Match/3_tokens_19B            93.5ns → 87.3ns   -6.6%
    HasInterest/3_tokens_19B      56.3ns → 54.3ns   -3.5%
    HasInterest/9_tokens_105B     162ns → 161ns     ~ (p=0.699)

Follows the same approach as nats-io#7858.

wallyqs requested a review from a team as a code owner February 19, 2026 22:17

wallyqs force-pushed the wq/tokenize-string-indexbyte branch from 01d3793 to 5cb98dd Compare February 19, 2026 22:18

wallyqs force-pushed the wq/tokenize-string-indexbyte branch from 5cb98dd to fb31976 Compare February 19, 2026 22:32

wallyqs changed the title ~~Use strings.Index.Byte in tokenizeSubjectIntoSlice~~ Use strings.IndexByte in tokenizeSubjectIntoSlice Feb 19, 2026

wallyqs marked this pull request as draft February 19, 2026 23:04

Use strings.IndexByte in tokenizeSubjectIntoSlice

dce4984

Use strings.IndexByte instead of byte-by-byte scanning in tokenizeSubjectIntoSlice for faster subject tokenization. Signed-off-by: Waldemar Quevedo <wally@nats.io>

wallyqs force-pushed the wq/tokenize-string-indexbyte branch from fb31976 to dce4984 Compare February 20, 2026 06:44

wallyqs wants to merge 1 commit into main from wq/tokenize-string-indexbyte
