Skip to content

perf: swap order of escape regexes to avoid lookahead#546

Merged
zbjornson merged 3 commits intosiimon:masterfrom
ngavalas:master
Mar 8, 2023
Merged

perf: swap order of escape regexes to avoid lookahead#546
zbjornson merged 3 commits intosiimon:masterfrom
ngavalas:master

Conversation

@ngavalas
Copy link
Copy Markdown
Contributor

@ngavalas ngavalas commented Mar 8, 2023

When looking at some CPU profiles of metric -> string serialization, I noticed that escaping all of the strings and labels was showing up in the flamegraph more than I would expect. This isn't a massive win by any means, but it's pretty consistently ~20% faster on a microbenchmark: https://jsbench.me/qklez5ombn/1. I'll take 20%.

Some notes:

  1. The benchmarks literally don't test this case, as far as I can tell. None of the registry benchmarks have quotes or newlines or anything in the labels? An A/A test (master vs. 14.2.0) was producing inconsistent results, so I'm not quite sure I trust that benchmark anyway.
  2. Correctness-wise, this should work. Replacing the \ before replacing the \n fixes the case that made you need the .replace(/\\(?!n)/g, '\\\\') regex in the first place, so now we can just do \\ -> \\\\ and \n -> \\n and " -> \\" in that order without lookaheads.

Relevant to #543.

@zbjornson
Copy link
Copy Markdown
Collaborator

zbjornson commented Mar 8, 2023

Nice. prom-client still support Node.js 10 though, and String.prototype.replaceAll wasn't added until Node.js 15.0.0. I think the other maintainers are fine with bumping the minimum supported version, but we should maybe wait until Apr 30 when Node.js v14.x is EOL.

Did you benchmark just reversing the two regexps so that a lookbehindahead isn't needed? I'm curious if that would give a similar speedup.

@ngavalas
Copy link
Copy Markdown
Contributor Author

ngavalas commented Mar 8, 2023

Totally forgot that we used to live in a world without replaceAll. Dark times.

From the microbenchmark, it seems like the swapped regex is as fast, if not faster (!!!). Somewhat surprising given that replaceAll is like the simplest case of regex matching, so I'd expect it to be as fast as the equivalent regex, but...

https://jsbench.me/qklez5ombn/1

I'm happy to just swap the regex order and call it a day given that it seems to be the winner anyway. The thing that really matters is that these operations (original regex vs. swapped w/o lookahead) are equivalent; am I missing anything?

FWIW I did make sure that on my test string (const test = 'askjdskajldbhflkashfdkja\nsakljdalsjdaslkdjalsdjlkasd""""""sadjklasd\adqweojkioe23mac\n\n\nasdlkjqodq\qasjkdadasdaskdjqlk;j'), removing the lookahead but retaining the original regex order leads to the wrong results (expected) and swapped them without the lookahead gives the same results as the original code (which is what we wanted).

@ngavalas ngavalas changed the title perf: replace escape regexes with simpler string.replaceAll perf: swap order of escape regexes to avoid lookahead Mar 8, 2023
@ngavalas
Copy link
Copy Markdown
Contributor Author

ngavalas commented Mar 8, 2023

I'll update the changelog after all the checks come back green to avoid an extra roundtrip if something's still wrong. My bad. :)

@zbjornson
Copy link
Copy Markdown
Collaborator

Awesome, easy win.

Yes, the behavior from just swapping the regexps looks correct.

Copy link
Copy Markdown
Collaborator

@zbjornson zbjornson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@zbjornson zbjornson merged commit 199b7d1 into siimon:master Mar 8, 2023
SimenB pushed a commit that referenced this pull request Mar 9, 2023
* swap regex order to avoid lookahead

* changelog

* changelog formatting apparently
SimenB pushed a commit that referenced this pull request Mar 9, 2023
* swap regex order to avoid lookahead

* changelog

* changelog formatting apparently
SimenB pushed a commit that referenced this pull request Mar 9, 2023
* swap regex order to avoid lookahead

* changelog

* changelog formatting apparently
SimenB pushed a commit that referenced this pull request Mar 9, 2023
* swap regex order to avoid lookahead

* changelog

* changelog formatting apparently
SimenB pushed a commit that referenced this pull request Mar 9, 2023
* swap regex order to avoid lookahead

* changelog

* changelog formatting apparently
SimenB pushed a commit that referenced this pull request Mar 9, 2023
* swap regex order to avoid lookahead

* changelog

* changelog formatting apparently
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants