Hoist all bounds checks out of loops#31
Hoist all bounds checks out of loops#31agnivade merged 4 commits intoagnivade:masterfrom jub0bs:master
Conversation
Comparison to dgryski's implementationBeforeAfterComparison to arbovm's implementationBeforeAfter |
agnivade
left a comment
There was a problem hiding this comment.
This is very clever, thank you! Hoisting the bounds checks from the main loops looks good to me.
I wonder how much performance are we getting in hoisting out the checks in the trimLongestCommonSuffix and trimLongestCommonPrefix functions. Because it looks like we have to pay quite a large cost to readability.
Could you re-run the benchmarks by reverting the logic of the two functions?
I guess the performance gain largely depends on the two input strings: if they have a long common prefix and/or a long common suffix, avoiding unnecessary bounds checks in those loops is likely to have a non-negligible benefit on performance.
I don't think readability suffers that much, esp. since I've abstracted the two loops to functions with clarifying names.
Here are some results after reverting the commit in question: The commit is worth keeping, in my opinion. What do you think? |
|
I've changed the names of the sub-benchmarks in order for the results to be easier to analyse with benchstat. |
Yes. Could you add the additional benchmarks you added for the common prefix and suffixes? I don't see them in the PR. |
I haven't added any new cases yet, but I can. Edit: Actually, aren't the cases named "Long lead" and "Long trail" already testing exactly this? Are new benchmark cases needed? |
Note: functions trimLongestCommonPrefix and trimLongestCommonSuffix are implemented in such a way as to make them eligible for inlining.
|
My bad there. Somehow I forgot that the long lead and long trail cases already exist. Thanks! |
|
Great! Thanks for taking the time to review and accept my changes. |
|
Thanks for spending the time to improve the library! Hope it's useful to you. |
Some benchmark results (no changes in terms of allocations):