generic cpu compilation and fallback#227

Open
shaleenji wants to merge 2 commits into master from generic_cpu_compilation

@shaleenji
Collaborator

Pull Request

Summary

This PR relaxes the rigid way code is currently compiled for different CPU capabilities (AVX2, AVX-512, etc.). Instead, the code can be compiled generically, and higher capabilities are enabled at runtime based on the CPU it is actually running on.
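The idea of a generic build that enables faster paths at runtime can be illustrated with a minimal sketch. It is not taken from this PR's diff: the names `dot_scalar`, `dot_avx2`, `select_dot`, and `dot` are hypothetical, and it assumes GCC or Clang on x86, where `__builtin_cpu_supports` and the `target` function attribute are available. On other architectures it simply falls back to the scalar path.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define SKETCH_X86 1
#endif

// Scalar fallback: runs on any CPU the generic binary can start on.
float dot_scalar(const float* a, const float* b, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) sum += a[i] * b[i];
    return sum;
}

#ifdef SKETCH_X86
// Only this function is compiled with AVX2 enabled; the rest of the
// translation unit keeps the generic baseline flags.
__attribute__((target("avx2")))
float dot_avx2(const float* a, const float* b, std::size_t n) {
    __m256 acc = _mm256_setzero_ps();
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        acc = _mm256_add_ps(acc, _mm256_mul_ps(_mm256_loadu_ps(a + i),
                                               _mm256_loadu_ps(b + i)));
    }
    float lanes[8];
    _mm256_storeu_ps(lanes, acc);
    float sum = 0.0f;
    for (float v : lanes) sum += v;
    for (; i < n; ++i) sum += a[i] * b[i];  // leftover elements
    return sum;
}
#endif

using DotFn = float (*)(const float*, const float*, std::size_t);

// Pick the best implementation the current CPU supports.
DotFn select_dot() {
#ifdef SKETCH_X86
    if (__builtin_cpu_supports("avx2")) return dot_avx2;
#endif
    return dot_scalar;
}

// Public entry point: the CPU check runs once, on first use.
float dot(const float* a, const float* b, std::size_t n) {
    static const DotFn fn = select_dot();
    assert(fn != nullptr);
    return fn(a, b, n);
}
```

Callers only ever use `dot(a, b, n)`, so the same binary runs on CPUs with and without AVX2. The two paths may differ in the last bits of the result because they add terms in a different order.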

Type of Change

  • New feature

@github-actions

github-actions Bot commented Apr 23, 2026

VectorDB Benchmark - Ready To Run

CI Passed ([lint + unit tests](https://github.com/endee-io/endee/actions/runs/24833016032)) - benchmark options unlocked.

Post one of the commands below. Only members with write access can trigger runs.


Available Modes

| Mode | Command | What runs |
| --- | --- | --- |
| Dense | /correctness_benchmarking dense | HNSW insert throughput · query P50/P95/P99 · recall@10 · concurrent QPS |
| Hybrid | /correctness_benchmarking hybrid | Dense + sparse BM25 fusion · same suite + fusion latency overhead |

Infrastructure

| Server | Role | Instance |
| --- | --- | --- |
| Endee Server | Endee VectorDB — code from this branch | t2.large |
| Benchmark Server | Benchmark runner | t3a.large |

Both servers start on demand and are always terminated after the run — pass or fail.


How Correctness Benchmarking Works

1. Post /correctness_benchmarking <mode>
2. Endee Server is created  →  this branch's code is deployed  →  Endee starts in the chosen mode
3. Benchmark Server is created  →  benchmark suite is transferred
4. Benchmark Server runs correctness benchmarking against Endee Server
5. Results posted back here  →  pass/fail + full metrics table
6. Both servers terminated   →  always, even on failure

After a new push, CI must pass again before this menu reappears.

- Skip the post-build ndd symlink when the binary is already named
  'ndd' to prevent a self-referential symlink on generic CPU builds
- Add AVX2 SIMD paths for fp16↔fp32 vector conversion and scaled
  quantization, filling the gap between AVX-512 and scalar fallback
- Refactor AVX-512 quantization block to use scoped variables and
  support runtime dispatch via NDD_RUNTIME_X86_DISPATCH
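To illustrate an AVX2 tier for scaled quantization sitting above a scalar fallback, here is a minimal sketch. It is not the PR's code: the names `quantize_scalar`, `quantize_avx2`, and `select_quantize`, the int8 target type, and the clamp range of [-127, 127] are all assumptions, and the sketch does not use the `NDD_RUNTIME_X86_DISPATCH` macro named above. It assumes GCC or Clang on x86 and finite, non-NaN inputs.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define SKETCH_X86 1
#endif

// Scalar fallback: q[i] = clamp(round(x[i] * inv_scale), -127, 127).
void quantize_scalar(const float* x, std::int8_t* q, std::size_t n, float inv_scale) {
    assert(inv_scale > 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        // nearbyint rounds half to even in the default rounding mode,
        // which matches what _mm256_cvtps_epi32 does below.
        float r = std::nearbyint(x[i] * inv_scale);
        if (r > 127.0f) r = 127.0f;
        if (r < -127.0f) r = -127.0f;
        q[i] = static_cast<std::int8_t>(r);
    }
}

#ifdef SKETCH_X86
// Middle tier: 8 floats per iteration, compiled with AVX2 enabled
// for this function only.
__attribute__((target("avx2")))
void quantize_avx2(const float* x, std::int8_t* q, std::size_t n, float inv_scale) {
    const __m256 vscale = _mm256_set1_ps(inv_scale);
    const __m256 vmax = _mm256_set1_ps(127.0f);
    const __m256 vmin = _mm256_set1_ps(-127.0f);
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 v = _mm256_mul_ps(_mm256_loadu_ps(x + i), vscale);
        // Clamp in the float domain so the int conversion cannot overflow.
        v = _mm256_max_ps(_mm256_min_ps(v, vmax), vmin);
        const __m256i r = _mm256_cvtps_epi32(v);
        std::int32_t lanes[8];
        _mm256_storeu_si256(reinterpret_cast<__m256i*>(lanes), r);
        // Simple narrowing; a tuned version would likely use pack instructions.
        for (int k = 0; k < 8; ++k) q[i + k] = static_cast<std::int8_t>(lanes[k]);
    }
    quantize_scalar(x + i, q + i, n - i, inv_scale);  // leftover elements
}
#endif

using QuantFn = void (*)(const float*, std::int8_t*, std::size_t, float);

// An AVX-512 tier would be checked first in the same way; omitted here.
QuantFn select_quantize() {
#ifdef SKETCH_X86
    if (__builtin_cpu_supports("avx2")) return quantize_avx2;
#endif
    return quantize_scalar;
}
```

Clamping before rounding gives the same result as rounding before clamping for finite inputs, so both tiers should agree element for element.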

3 participants