Skip to content

Conversation

@c-gamble
Copy link
Contributor

@c-gamble c-gamble commented Jul 10, 2025

Description of changes

Currently, the OSS code is responsible for initializing tracing/otel for all crates. A global filter is applied in which for all crates that initialize tracing, every filter is applied for that crate. This PR slightly refactors this pattern to define a set of "global default crate filters" and allow initializers to pass in additional crate filters (and even overrides) at their call sites.

Test plan

This PR is the first in a stack with https://github.com/chroma-core/hosted-chroma/pull/3345. Together, these refactor the tracing and otel initialization to move private service names into our private code. Forward/backward compatibility would require a dance in which we doubly-initialize filters for the same crates, which feels like more complexity than it's worth given the scope of these changes. Therefore, these two PRs should be rolled out together.

  • Tests pass locally with pytest for python, yarn test for js, cargo test for rust

Documentation Changes

N/A

@github-actions
Copy link

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of a unexpectedly high quality (Readability, Modularity, Intuitiveness)

@c-gamble c-gamble requested a review from HammadB July 10, 2025 22:34
@propel-code-bot
Copy link
Contributor

propel-code-bot bot commented Jul 10, 2025

Refactor Tracing/Otel Initialization with Configurable Crate Filters and Hidden Private Services

This PR refactors tracing and OpenTelemetry (Otel) initialization logic across the Rust codebase to support hiding private service names, modularize crate-level filter configuration, and provide per-service or per-crate filter overrides at runtime. The core change introduces a typed filter configuration (OtelFilter, OtelFilterLevel) allowing each service to specify a set of crate+filter rules, which are combined with global defaults, thereby separating OSS configuration concerns from private code and enabling per-deployment visibility policies. This PR is the first step in a two-part rollout and requires coordinated deployment with a corresponding change in hosted-chroma to fully ensure private service names are hidden from OSS.

Key Changes

• Introduces OtelFilter and OtelFilterLevel types to specify tracing filter rules per crate.
• Refactors init_global_filter_layer and init_otel_tracing APIs to accept custom crate filters as input.
• Removes hardcoded crate name list and global filter string construction in favor of extensible, typed config.
• Propagates otel_filters fields/config through all relevant service configs for worker, compactor, garbage_collector, log-service, and frontend.
• Makes related updates in test code, initialization flows, and config serialization/deserialization to use the new OtelFilter types.
• Updates service entrypoints and integration points to call the new tracing initialization signatures.

Affected Areas

• rust/tracing (init_tracer.rs, lib.rs)
• rust/worker (config.rs, lib.rs)
• rust/garbage_collector (config.rs, lib.rs, garbage_collector_component.rs)
• rust/log-service (lib.rs)
• rust/frontend (config.rs, lib.rs)
• Configuration YAML/Serde serialization for otel_filters

This summary was automatically generated by @propel-code-bot

@blacksmith-sh blacksmith-sh bot deleted a comment from c-gamble Jul 11, 2025
let config = GarbageCollectorConfig {
service_name: "gc".to_string(),
otel_endpoint: "none".to_string(),
otel_filters: vec![],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clean nit - should this be the default? You can do ...default

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GC config currently doesn't implement Default. IMO that's a separate concern from this PR. Alternative options are to leave as an empty array, hard-code the same default filters as are hard-coded in the default fn, or make the default fn public. Of these, I think hard-coding the same value makes sense since it's only one element.

#[serde(default = "CompactionServiceConfig::default_otel_endpoint")]
pub otel_endpoint: String,
#[serde(default = "CompactionServiceConfig::default_otel_filters")]
pub otel_filters: Vec<OtelFilter>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this look in yaml?

Copy link
Contributor Author

@c-gamble c-gamble Jul 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this:

otel_filters:
  - crate_name: some_crate
    filter_level: debug
  - crate_name: other_crate
    filter_level: trace

"query_service",
"wal3",
"worker",
"continuous_verification",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we will have to update this in hosted/ with this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

@HammadB HammadB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, did you test forwards /backwards compatibility of config parsing to avoid rollout issues? Can you add reasoning / test plan followed for why its safe?

@blacksmith-sh blacksmith-sh bot deleted a comment from c-gamble Jul 15, 2025
@c-gamble c-gamble merged commit 49e9e3f into main Jul 15, 2025
111 of 113 checks passed
@c-gamble c-gamble deleted the cooper/chore-hide-private-services branch July 15, 2025 15:55
c-gamble added a commit that referenced this pull request Jul 15, 2025
## Description of changes

Adds the new tracing init pattern introduced in #5075 to sample YAML
files to demonstrate how these values should be supplied in config.

## Test plan

- [x] Tests pass locally with `pytest` for python, `yarn test` for js,
`cargo test` for rust

## Migration plan

N/A

## Observability plan

Will monitor honeycomb in staging to make sure we are still getting
traces for all services.

## Documentation Changes

N/A
Inventrohyder pushed a commit to Inventrohyder/chroma that referenced this pull request Aug 5, 2025
…#5075)

## Description of changes

Currently, the OSS code is responsible for initializing tracing/otel for
all crates. A global filter is applied in which for all crates that
initialize tracing, every filter is applied for that crate. This PR
slightly refactors this pattern to define a set of "global default crate
filters" and allow initializers to pass in additional crate filters (and
even overrides) at their call sites.

## Test plan

This PR is the first in a stack with
https://github.com/chroma-core/hosted-chroma/pull/3345. Together, these
refactor the tracing and otel initialization to move private service
names into our private code. Forward/backward compatibility would
require a dance in which we doubly-initialize filters for the same
crates, which feels like more complexity than it's worth given the scope
of these changes. Therefore, these two PRs should be rolled out
together.

- [x] Tests pass locally with `pytest` for python, `yarn test` for js,
`cargo test` for rust

## Documentation Changes

N/A
Inventrohyder pushed a commit to Inventrohyder/chroma that referenced this pull request Aug 5, 2025
## Description of changes

Adds the new tracing init pattern introduced in chroma-core#5075 to sample YAML
files to demonstrate how these values should be supplied in config.

## Test plan

- [x] Tests pass locally with `pytest` for python, `yarn test` for js,
`cargo test` for rust

## Migration plan

N/A

## Observability plan

Will monitor honeycomb in staging to make sure we are still getting
traces for all services.

## Documentation Changes

N/A
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants