[DO NOT MERGE, ONLY FOR REVIEW] Feature/quantile implementation#3
Draft
rileydes-improving wants to merge 14 commits intofeature/vector-reducer-functionsfrom
Draft
[DO NOT MERGE, ONLY FOR REVIEW] Feature/quantile implementation#3rileydes-improving wants to merge 14 commits intofeature/vector-reducer-functionsfrom
rileydes-improving wants to merge 14 commits intofeature/vector-reducer-functionsfrom
Conversation
5a2adda to
5e863aa
Compare
2819935 to
0117d89
Compare
AlexFilipImproving
pushed a commit
that referenced
this pull request
Mar 11, 2026
Resolves: valkey-io#703 Encountered the following series of crashes during engine startup. 1. SIGSEGV in `string2ll_resolver` — IFUNC resolver called before TSAN runtime initializes ``` Program received signal SIGSEGV, Segmentation fault. 0x00007ffff6de0fb0 in ?? () #0 0x00007ffff6de0fb0 in ?? () #1 0x00000000004b89a7 in string2ll_resolver.lto_priv () #2 0x00007ffff7de4d94 in _dl_relocate_object () from /lib64/ld-linux-x86-64.so.2 #3 0x00007ffff7ddceaa in dl_main () from /lib64/ld-linux-x86-64.so.2 #4 0x00007ffff7df0b5f in _dl_sysdep_start () from /lib64/ld-linux-x86-64.so.2 #5 0x00007ffff7ddad2e in _dl_start () from /lib64/ld-linux-x86-64.so.2 #6 0x00007ffff7dd9ef8 in _start () from /lib64/ld-linux-x86-64.so.2 valkey-io#7 0x0000000000000003 in ?? () valkey-io#8 0x00007fffffffe04c in ?? () valkey-io#9 0x00007fffffffe0b6 in ?? () valkey-io#10 0x00007fffffffe0bd in ?? () valkey-io#11 0x0000000000000000 in ?? () ``` 2. TSAN abort on `RTLD_DEEPBIND` — module loader uses `RTLD_DEEPBIND` which is incompatible with TSAN ``` cat /workplace/baswanth/tsan/valkey-search/testing/integration/.build-release-tsan/output/6379_stdout.txt 2>/dev/null | tail -20 45443:M 25 Feb 2026 01:57:58.222 # WARNING Memory overcommit must be enabled! Without it, a background save or replication may fail under low memory condition. Being disabled, it can also cause failures without low memory condition, see jemalloc/jemalloc#1328. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect. 45443:M 25 Feb 2026 01:57:58.222 * oO0OoO0OoO0Oo Valkey is starting oO0OoO0OoO0Oo 45443:M 25 Feb 2026 01:57:58.222 * Valkey version=9.0.1, bits=64, commit=ab3c953b, modified=1, pid=45443, just started 45443:M 25 Feb 2026 01:57:58.222 * Configuration loaded 45443:M 25 Feb 2026 01:57:58.224 * monotonic clock: POSIX clock_gettime 45443:M 25 Feb 2026 01:57:58.226 * Running mode=cluster, port=6379. 45443:M 25 Feb 2026 01:57:58.226 * No cluster configuration found, I'm a13c403e8a798f9ad051c654ff540ce1b3833b9d ==45443==You are trying to dlopen a /workplace/baswanth/tsan/valkey-search/.build-release-tsan/valkey-json/build/src/libjson.so shared library with RTLD_DEEPBIND flag which is incompatible with sanitizer runtime (see google/sanitizers#611 for details). If you want to run /workplace/baswanth/tsan/valkey-search/.build-release-tsan/valkey-json/build/src/libjson.so library under sanitizers please remove RTLD_DEEPBIND from dlopen flags. ``` 3. `CHECK failed: sanitizer_deadlock_detector.h:67` — TSAN deadlock detector overflows during module load ``` 7826:M 25 Feb 2026 02:39:45.284 * Valkey version=9.0.1, bits=64, commit=ab3c953b, modified=1, pid=17826, just started 17826:M 25 Feb 2026 02:39:45.284 * Configuration loaded 17826:M 25 Feb 2026 02:39:45.286 * monotonic clock: POSIX clock_gettime 17826:M 25 Feb 2026 02:39:45.288 * Running mode=cluster, port=6379. 17826:M 25 Feb 2026 02:39:45.289 * No cluster configuration found, I'm 134edf06efe1a2e084ce9c94a43dab67e8376dfc 17826:M 25 Feb 2026 02:39:45.298 . <module> Setting shards to 524287 ThreadSanitizer: CHECK failed: sanitizer_deadlock_detector.h:67 "((n_all_locks_)) < (((sizeof(all_locks_with_contexts_)/sizeof((all_locks_with_contexts_)[0]))))" (0x40, 0x40) (tid=17826) ``` core changes that needs to be added. ``` cd /workplace/baswanth/tsan/valkey-search/.build-release-tsan/valkey-server && git diff src/util.c src/module.c src/config.h diff --git a/src/config.h b/src/config.h index 0c6b111a9..f80fa8cbc 100644 --- a/src/config.h +++ b/src/config.h @@ -172,6 +172,10 @@ #define VALKEY_NO_SANITIZE(sanitizer) #endif +#if defined(__SANITIZE_THREAD__) +#define VALKEY_THREAD_SANITIZER 1 +#endif + #if defined(__SANITIZE_ADDRESS__) /* GCC */ #define VALKEY_ADDRESS_SANITIZER 1 diff --git a/src/module.c b/src/module.c index 8d1bad4fe..0435499a8 100644 --- a/src/module.c +++ b/src/module.c @@ -12545,7 +12545,7 @@ int moduleLoad(const char *path, void **module_argv, int module_argc, int is_loa } int dlopen_flags = RTLD_NOW | RTLD_LOCAL; -#if (defined(__GLIBC__) || defined(__FreeBSD__)) && !defined(VALKEY_ADDRESS_SANITIZER) && __has_include(<dlfcn.h>) +#if (defined(__GLIBC__) || defined(__FreeBSD__)) && !defined(VALKEY_ADDRESS_SANITIZER) && !defined(VALKEY_THREAD_SANITIZER) && __has_include(<dlfcn.h>) /* RTLD_DEEPBIND, which is required for loading modules that contains the * same symbols, does not work with ASAN. Therefore, we exclude * RTLD_DEEPBIND when doing test builds with ASAN. diff --git a/src/util.c b/src/util.c index 0631736c8..cf5a073b1 100644 --- a/src/util.c +++ b/src/util.c @@ -622,7 +622,7 @@ static int string2llScalar(const char *s, size_t slen, long long *value) { } #if HAVE_IFUNC && HAVE_X86_SIMD -__attribute__((no_sanitize_address, used)) static int (*string2ll_resolver(void))(const char *, size_t, long long *) { +__attribute__((no_sanitize_address, no_sanitize("thread"), used)) static int (*string2ll_resolver(void))(const char *, size_t, long long *) { /* Ifunc resolvers run before ASan initialization and before CPU detection * is initialized, so disable ASan and init CPU detection here. */ __builtin_cpu_init(); ``` --------- Signed-off-by: Baswanth Vegunta <baswanth@amazon.com> Co-authored-by: Baswanth Vegunta <baswanth@amazon.com>
Changed the ProcessRecords interface to accept all records at once instead of processing them one at a time. This enables more efficient implementations, particularly for COUNT which now simply returns the input size rather than incrementing a counter for each record. Changes: - Updated ReducerInstance::ProcessRecords to accept const std::vector<absl::InlinedVector<expr::Value, 4>>& all_values - Modified GroupBy::Execute to collect all values per group before calling ProcessRecords once - Updated all reducer implementations (Count, Min, Max, Sum, Avg, Stddev, CountDistinct) to process batched records Benefits: - Count reducer is now O(1) instead of O(n) function calls - Better separation between data collection and processing phases - Enables future optimizations for batch-aware reducers Signed-off-by: Alexandru Filip <alexandru.filip@improving.com>
Signed-off-by: Alexandru Filip <alexandru.filip@improving.com>
Signed-off-by: Alexandru Filip <alexandru.filip@improving.com>
Signed-off-by: Alexandru Filip <alexandru.filip@improving.com>
Signed-off-by: Allen Samuels <allenss@amazon.com>
Signed-off-by: Allen Samuels <allenss@amazon.com>
Signed-off-by: Alexandru Filip <alexandru.filip@improving.com>
5e863aa to
a194333
Compare
…imation - Add QUANTILE reducer function using Greenwald-Khanna streaming algorithm - Support quantile values from 0.0 to 1.0 with 1% error bound for memory efficiency - Update ft.aggregate.md documentation with QUANTILE reducer specification - Add QUANTILE to ft.aggregate.json command schema with 2 required arguments - Implement QUANTILE in ft_aggregate_exec.cc with streaming quantile calculation - Update ft_aggregate_parser.cc and ft_aggregate_parser.h to parse QUANTILE syntax - Add comprehensive test coverage in ft_aggregate_parser_test.cc and ft_aggregate_exec_test.cc - Add compatibility tests for basic usage, multiple quantiles, edge cases, and error conditions - Add integration tests validating QUANTILE with various quantile values and groupby operations Signed-off-by: Riley Des <riley.desserre@improving.com>
a194333 to
09f81ba
Compare
…oordinator / iscancelled cases (valkey-io#901) The PR does the following: Update to oom flags on search commands. We are marking search commands with the denyoom flags following what was done since 1.0. Updated Error Tracking / Propagation to ensure we do not always return the "timeout" message even we cancel a fanout operation. Updated the normal code path of search commands (commands.cc) to first attempt replying with an error message based on the status. If no error, then check if canceled and return the standard timeout error. Note: The logic for early cancellation when partial results are disabled is still retained in this PR. If there is a timeout when partial results are disabled, we cancel without waiting and return the timed out error message. The parts which are new are the following: We have added handling for the case (with partial results disabled) and one of the fanouts fails with OOM. We now have correct with error propagation for it It also adds handling for the case (with partial results enabled) where we have not timed out, and all nodes have failed. Since it tracks the first error message, we are able to reply with the error now --------- Signed-off-by: KarthikSubbarao <karthikrs2021@gmail.com>
Signed-off-by: Alexandru Filip <alexandru.filip@improving.com>
…ey-io#904) Signed-off-by: Elias Tamraz <tamrazelias@gmail.com>
Removing some scenarios due to benchamrk workflow taking too long. Removed 12b , 12c, 12e -> PositionMap extreme test already covered in 12a. 12b, 12c, 12e test the PositionMap differently but not needed now. Removed 14a -> failing on CME Removed 14c, 14e -> Removing Baseline case with SUFFIXTRIE enabled and STOPWORDS 0 since same performance as the baseline case, so not adding any value. Changed 13b to have 500 clients from 1000. --------- Signed-off-by: Aksha Thakkar <thaakb@amazon.com> Signed-off-by: AkshaThakkar1812 <akshathakkar@gmail.com> Co-authored-by: Aksha Thakkar <thaakb@amazon.com>
…y-io#896) kDefaultMinPrefixLength is annotated as “configurable”, but it is currently hardcoded and cannot be changed at runtime. This PR adds a module config (search.tag-min-prefix-length) to control the minimum prefix length for TAG wildcard queries (e.g., @field:{prefix*}). issue : valkey-io#895 Signed-off-by: Su Ko <rhtn1128@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.