Conversation
VectorDB Benchmark — Dense — PassedTriggered by @omnish-endee · Commit
|
|
/correctness_benchmarking dense |
3 similar comments
|
/correctness_benchmarking dense |
|
/correctness_benchmarking dense |
|
/correctness_benchmarking dense |
VectorDB Benchmark - Ready To Run
Post one of the command below. Only members with write access can trigger runs. Available Modes
Infrastructure
Both servers start on demand and are always terminated after the run — pass or fail. How Correctness Benchmarking Works
|
|
/correctness_benchmarking dense |
VectorDB Benchmark — Dense — FailedTriggered by @hemant-endee · Commit
|
| struct RebuildResult { | ||
| bool success; | ||
| int http_code; | ||
| std::string message; | ||
| }; | ||
|
|
There was a problem hiding this comment.
use unsigned char and std::string. declare a generic return struct which will be used everywhere in the codebase eventually.
for each function returning this struct, at the comments above the function definition will define what each return code means. This can be later conglomerated as ENUMS.
| int code = (message.find("already in progress") != std::string::npos) ? 409 : 400; | ||
| return json_error(code, message); | ||
| if (!result.success) { | ||
| return json_error(result.http_code, result.message); |
There was a problem hiding this comment.
do not couple http codes to the core source. decide the http_code to be returned based on the return code of the function,.
There was a problem hiding this comment.
dont use atomics in status variables
There was a problem hiding this comment.
remove jthread from here
There was a problem hiding this comment.
dont use string comparisons for this.
| } catch (const std::exception&) { | ||
| // Silently ignore cleanup errors on startup | ||
| } catch (const std::exception& e) { | ||
| LOG_WARN(2053, "rebuild", "Failed to cleanup temp files on startup: " << e.what()); |
There was a problem hiding this comment.
use 1800 series for rebuild logs. add it to docs/logs.md
* Rebuild index with new config * fix 1 * index name in get stattu api correction * docs changes * Rebuild Status Persistence
…me as addVectors)
117a81f to
6f4083e
Compare
…y + IndexManager to executeJob instead of unpacking fields
| void Rebuild::cleanupTempFiles(const std::string& data_dir) { | ||
| if (!std::filesystem::exists(data_dir)) { | ||
| return; | ||
| } | ||
| try { | ||
| std::string temp_filename = std::string(settings::DEFAULT_SUBINDEX) + ".idx.temp"; | ||
| std::string ts_prefix = std::string(settings::DEFAULT_SUBINDEX) + ".idx."; | ||
| for (const auto& entry : std::filesystem::recursive_directory_iterator(data_dir)) { | ||
| if (!entry.is_regular_file()) continue; | ||
| const std::string fname = entry.path().filename().string(); | ||
| bool is_temp = (fname == temp_filename); | ||
| bool is_ts = fname.size() > ts_prefix.size() | ||
| && fname.substr(0, ts_prefix.size()) == ts_prefix | ||
| && std::all_of(fname.begin() + ts_prefix.size(), fname.end(), ::isdigit); | ||
| if (is_temp || is_ts) { | ||
| std::filesystem::remove(entry.path()); | ||
| } | ||
| } | ||
| } catch (const std::exception& e) { | ||
| LOG_WARN(1803, "rebuild", "Failed to cleanup temp files on startup: " << e.what()); | ||
| } | ||
| } |
There was a problem hiding this comment.
should not throw error if there are not cleanup files left in the directory.
| std::filesystem::copy_file(p.timestamped_path, p.index_path, | ||
| std::filesystem::copy_options::overwrite_existing); |
There was a problem hiding this comment.
why is renaming the index file not enough here ?
| std::filesystem::copy_file(p.timestamped_path, p.index_path, | ||
| std::filesystem::copy_options::overwrite_existing); | ||
|
|
||
| // Cannot call reloadIndex() here — we hold operation_mutex and reloadIndex acquires |
There was a problem hiding this comment.
st.stop_requested() is not checked in phase 3. makes the shutdown time unbounded for large indexes.
| new_alg->saveIndex(p.timestamped_path); | ||
| std::filesystem::copy_file(p.timestamped_path, p.index_path, | ||
| std::filesystem::copy_options::overwrite_existing); | ||
|
|
||
| // Cannot call reloadIndex() here — we hold operation_mutex and reloadIndex acquires | ||
| // indices_mutex_, while deleteIndex holds indices_mutex_ then acquires operation_mutex. | ||
| // Calling reloadIndex here would deadlock with a concurrent delete on the same index. | ||
| auto fresh_alg = std::make_unique<hnswlib::HierarchicalNSW<float>>(p.index_path, 0); | ||
| IndexManager::wireVectorFetchers(fresh_alg.get(), entry->vector_storage); |
There was a problem hiding this comment.
new_alg already has all the data and was just persisted.
can use entry->alg = std::move(new_alg) directly.
doesnt seem useful to re load again
| @@ -0,0 +1,124 @@ | |||
| # Index Rebuild | |||
There was a problem hiding this comment.
Things that are missing in the doc. make it more comprehensive
Disk space: rebuild needs 2× the index size on disk (timestamped + canonical co-exist briefly).
Memory: 2× the graph in RAM during Phase 2 (old + new).
Expected duration: rough order of magnitude per million vectors helps capacity planning.
What "manually restart" means after a crash — operationally how does an operator know a rebuild was incomplete? (There's no persisted state.)
| std::string Rebuild::formatTime(std::chrono::system_clock::time_point tp) { | ||
| return timeToISO8601(tp); | ||
| } |
There was a problem hiding this comment.
no one is using this function
| rebuild_.setActiveRebuild(username, index_id, current_count); | ||
|
|
||
| // Spawn thread — lambda calls rebuild_.executeJob directly (execution lives in Rebuild) | ||
| std::jthread t([this, params = std::move(params)](std::stop_token st) mutable { |
There was a problem hiding this comment.
here mutable is not required. executeJob takes const RebuildJobParams&, so the lambda doesn't actually mutate
| case RebuildStatus::IN_PROGRESS: return "in_progress"; | ||
| case RebuildStatus::COMPLETED: return "completed"; | ||
| case RebuildStatus::FAILED: return "failed"; | ||
| default: return "unknown"; |
There was a problem hiding this comment.
can never reach this line since RebuildStatus has only three defined values
No description provided.