@eculver eculver commented Jul 10, 2025

Description of changes

Similar to what we did in #4845, this adds the jemalloc pprof server to the query-service server so that we can do memory profiling for the service using Polar Signals.

Note: I also updated a place in the garbage collector's implementation where the port was hard-coded to 6060.

  • New functionality
    • pprof server started when config given

Test plan

The build should succeed. To verify the change, update chroma_config.yaml in a local Tilt environment and confirm that the pprof endpoint is up.

  • Tests pass locally with pytest for python, yarn test for js, cargo test for rust
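
A minimal sketch of the "pprof endpoint is up" check from the test plan, written in Rust with only the standard library. The helper name `endpoint_is_up` is hypothetical and not part of this PR, and the check only confirms that something accepts TCP connections on the port; it does not verify that the listener actually serves pprof data.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Hypothetical helper (not from the PR): returns true when a TCP
/// connection to `addr` succeeds within `timeout`.
///
/// This is a liveness probe only. It says nothing about which HTTP
/// routes the server exposes. `timeout` must be non-zero.
pub fn endpoint_is_up(addr: SocketAddr, timeout: Duration) -> bool {
    TcpStream::connect_timeout(&addr, timeout).is_ok()
}
```

Against a local environment with the port set to 6060, this could be called as `endpoint_is_up("127.0.0.1:6060".parse().unwrap(), Duration::from_secs(1))`, assuming the port is forwarded to localhost.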

@eculver eculver requested a review from codetheweb July 10, 2025 20:31
@github-actions

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of an unexpectedly high quality (Readability, Modularity, Intuitiveness)?

github-actions bot commented Jul 10, 2025

✅ The Helm chart's version was changed. Your changes to the chart will be published upon merge to main.

propel-code-bot bot commented Jul 10, 2025

Add jemalloc pprof Server to Query Service for Memory Profiling

This PR introduces support for starting a jemalloc-based pprof HTTP profiling server in the Chroma query-service, enabling memory profiling via Polar Signals or similar tooling. The addition is controlled via new configuration (jemalloc_pprof_server_port), and chart/k8s manifests are updated to expose the port and allow adventurous users to activate jemalloc profiling as part of deployment. The garbage collector also now accepts a dynamic port for its own pprof server, fixing previous hard-coding.

Key Changes

• Added chroma-jemalloc-pprof-server as a dependency and integrated spawn_pprof_server into the query service.
• Extended QueryServiceConfig and worker logic to optionally start/stop the pprof server based on new config port.
• Kubernetes Helm charts and templates updated to expose and configure the pprof port (6060/TCP) and jemalloc environment variable.
• Upgraded chart version and values.dev.yaml for easier profiling in dev environments.
• Replaced hardcoded port in the garbage collector's pprof logic with dynamic configuration.
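
The "start only when a port is configured" behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: `QueryServiceConfig` here is a trimmed-down stand-in that keeps only the `jemalloc_pprof_server_port` field, `PprofServer` and `maybe_start_pprof_server` are hypothetical names, and a plain `TcpListener` on loopback stands in for the real `spawn_pprof_server`, whose signature and bind address are not shown in this conversation.

```rust
use std::io;
use std::net::{SocketAddr, TcpListener};

/// Stand-in for the service config (assumption: the real struct has
/// many more fields; only the port option matters for this sketch).
#[derive(Debug, Default)]
pub struct QueryServiceConfig {
    /// `None` leaves the profiling server disabled.
    pub jemalloc_pprof_server_port: Option<u16>,
}

/// Hypothetical handle for a running profiling server.
pub struct PprofServer {
    listener: TcpListener,
}

impl PprofServer {
    /// Address the listener is actually bound to.
    pub fn local_addr(&self) -> io::Result<SocketAddr> {
        self.listener.local_addr()
    }
}

/// Binds a listener only when the config supplies a port, instead of
/// relying on a hard-coded value such as 6060.
pub fn maybe_start_pprof_server(
    config: &QueryServiceConfig,
) -> io::Result<Option<PprofServer>> {
    match config.jemalloc_pprof_server_port {
        Some(port) => {
            // Loopback is an assumption made for the sketch.
            let listener = TcpListener::bind(("127.0.0.1", port))?;
            Ok(Some(PprofServer { listener }))
        }
        None => Ok(None),
    }
}
```

Modeling the port as an `Option<u16>` keeps the feature opt-in: a deployment that omits the setting starts no extra listener, and the same code path serves any service that wants a configurable port.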

Affected Areas

• rust/worker/src/server.rs (query service logic and initialization)
• rust/worker/src/bin/query_service.rs (allocator integration)
• rust/worker/src/config.rs (configuration struct)
• rust/garbage_collector/src/lib.rs (fix port usage)
• k8s/distributed-chroma/ (Helm templates, charts, dev values)
• Cargo.toml, Cargo.lock (dependency changes)

This summary was automatically generated by @propel-code-bot

@eculver eculver merged commit 798f0ee into main Jul 10, 2025
57 checks passed
@eculver eculver deleted the eculver/rust-query-service-pprof branch July 10, 2025 22:16
eculver added a commit that referenced this pull request Jul 11, 2025
## Description of changes

Similar to what we did in #4845 and #5072, this adds the
jemalloc pprof server to the compaction-service server so that we can do
memory profiling for the service using Polar Signals.

- New functionality
  - pprof server started when config given

## Test plan

The build should succeed. To verify the change, update chroma_config.yaml in a
local Tilt environment and confirm that the pprof endpoint is up.
chroma-droid pushed a commit that referenced this pull request Jul 11, 2025
Inventrohyder pushed a commit to Inventrohyder/chroma that referenced this pull request Aug 5, 2025
Inventrohyder pushed a commit to Inventrohyder/chroma that referenced this pull request Aug 5, 2025