
Expose wrapped scorer in ProfileScorer via getWrappedScorer method#20549

Merged
jainankitk merged 2 commits into opensearch-project:main from martin-gaievski:profile-scorer-get-wrapped-scorer
Feb 6, 2026

Conversation

@martin-gaievski
Member

Description

This change exposes the wrapped scorer in ProfileScorer via a new getWrappedScorer() method.

Background: When profiling is enabled, OpenSearch wraps scorers with ProfileScorer to collect timing metrics. However, plugin queries (like neural-search's HybridQuery) may extend the standard Lucene Scorer API with custom methods. Currently, plugins cannot access these custom methods when profiling is enabled because the wrapped scorer is stored in a private field with no accessor.

Solution: Add a getWrappedScorer() method that returns the wrapped scorer, following the same pattern already established in the codebase:

  • ProfileCollector.getDelegate() (same profiling package)
  • ProfilingAggregator.unwrapAggregator() (aggregation profiling)

Use case: neural-search's HybridQuery needs to call docIdStream() on its HybridBulkScorer, which is not part of the standard Lucene Scorer API.
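To illustrate the pattern, here is a minimal, self-contained sketch of the accessor and the unwrap-then-downcast call site. Scorer, ProfileScorer, and HybridScorer below are simplified stand-ins for this example, not the real Lucene/OpenSearch classes:

```java
// A minimal, self-contained sketch of the unwrap pattern described above.
// Scorer, ProfileScorer, and HybridScorer are simplified stand-ins here,
// not the real Lucene/OpenSearch classes.
class Scorer {
    float score() { return 0f; }
}

class HybridScorer extends Scorer {
    // plugin-specific method that is not part of the base Scorer API
    String docIdStream() { return "doc-id-stream"; }
}

class ProfileScorer extends Scorer {
    private final Scorer wrapped;

    ProfileScorer(Scorer wrapped) { this.wrapped = wrapped; }

    @Override
    float score() {
        // a real ProfileScorer records timing here before delegating
        return wrapped.score();
    }

    // the accessor added by this PR: exposes the delegate so callers
    // can reach plugin-specific methods when profiling is enabled
    Scorer getWrappedScorer() { return wrapped; }
}

public class UnwrapDemo {
    public static void main(String[] args) {
        Scorer scorer = new ProfileScorer(new HybridScorer());

        // unwrap the profiling decorator before downcasting
        if (scorer instanceof ProfileScorer) {
            scorer = ((ProfileScorer) scorer).getWrappedScorer();
        }
        if (scorer instanceof HybridScorer) {
            System.out.println(((HybridScorer) scorer).docIdStream()); // prints "doc-id-stream"
        }
    }
}
```

Without the accessor, the instanceof check on the second branch would fail whenever profiling is enabled, because the plugin only ever sees the ProfileScorer wrapper.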

Related Issues

#20548

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@coderabbitai
Contributor

coderabbitai bot commented Feb 4, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

This PR adds a public accessor method getWrappedScorer() to the ProfileScorer class, exposing the underlying wrapped scorer to enable plugin compatibility. The change includes comprehensive Javadoc documentation and a unit test covering the new functionality.

Changes

  • New Public Accessor: server/src/main/java/org/opensearch/search/profile/query/ProfileScorer.java
    Added public method getWrappedScorer() with Javadoc explaining its purpose for plugin queries and clarifying that mutating calls bypass profiling while read-only access is safe.
  • Test Coverage: server/src/test/java/org/opensearch/search/profile/query/ProfileScorerTests.java
    Added unit test testGetWrappedScorer() verifying that the method correctly returns the original wrapped scorer from a complete scorer pipeline.
  • Documentation: CHANGELOG.md
    Updated changelog to document the new getWrappedScorer() method addition.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 20.00%, below the required threshold of 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)
  • Title check ✅ Passed: The title accurately and clearly describes the main change: exposing the wrapped scorer via a new getWrappedScorer method in ProfileScorer.
  • Description check ✅ Passed: The description comprehensively covers the change, provides context about why it's needed, cites relevant patterns in the codebase, explains the use case, includes testing confirmation, and properly acknowledges the Apache 2.0 license.



@github-actions
Contributor

github-actions bot commented Feb 4, 2026

❌ Gradle check result for 91ad9df: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@martin-gaievski
Member Author

The failing test is unrelated to this change; using the OpenSearch Gradle Check Metrics Dashboard, I found that it is a usual suspect for flaky tests.

REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests 'org.opensearch.cluster.routing.allocation.decider.ShardsLimitAllocationDeciderIT' -Dtests.method='testCombinedClusterAndIndexSpecificShardLimits {p0={"opensearch.experimental.feature.writable_warm_index.enabled":"true"}}' -Dtests.seed=9CABF380A17E9DDB -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=sd-Deva-IN -Dtests.timezone=Asia/Bishkek -Druntime.java=25

ShardsLimitAllocationDeciderIT > testCombinedClusterAndIndexSpecificShardLimits {p0={"opensearch.experimental.feature.writable_warm_index.enabled":"true"}} FAILED
    java.lang.AssertionError: Total assigned shards should be 17 expected:<17> but was:<16>
        at __randomizedtesting.SeedInfo.seed([9CABF380A17E9DDB:1EA359D23A23D919]:0)
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.failNotEquals(Assert.java:835)
        at org.junit.Assert.assertEquals(Assert.java:647)
        at org.opensearch.cluster.routing.allocation.decider.ShardsLimitAllocationDeciderIT.lambda$testCombinedClusterAndIndexSpecificShardLimits$0(ShardsLimitAllocationDeciderIT.java:303)
        at org.opensearch.test.OpenSearchTestCase.assertBusy(OpenSearchTestCase.java:1193)
        at org.opensearch.test.OpenSearchTestCase.assertBusy(OpenSearchTestCase.java:1166)
        at org.opensearch.cluster.routing.allocation.decider.ShardsLimitAllocationDeciderIT.testCombinedClusterAndIndexSpecificShardLimits(ShardsLimitAllocationDeciderIT.java:263)

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@CHANGELOG.md`:
- Line 11: The changelog entry for exposing the wrapped scorer should be moved
from the "Changed" section to the "Added" section because it introduces a new
public API; locate the line mentioning "Expose wrapped scorer in ProfileScorer
via getWrappedScorer" (reference symbol: getWrappedScorer and ProfileScorer) in
CHANGELOG.md and cut/paste that bullet into the "Added" section, keeping the
copy and issue reference intact and removing it from the "Changed" section.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7d831e3 and 91ad9df.

📒 Files selected for processing (3)
  • CHANGELOG.md
  • server/src/main/java/org/opensearch/search/profile/query/ProfileScorer.java
  • server/src/test/java/org/opensearch/search/profile/query/ProfileScorerTests.java
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-01-13T17:40:27.167Z
Learnt from: reta
Repo: opensearch-project/OpenSearch PR: 20411
File: server/src/main/java/org/opensearch/index/codec/CodecService.java:112-133
Timestamp: 2026-01-13T17:40:27.167Z
Learning: Avoid capturing or evaluating a supplier (e.g., this::defaultCodec) upfront when passing it to a registry during object construction. If registries may replace defaults during iteration (as in EnginePlugin.getAdditionalCodecs), pass the supplier itself and only resolve it at use time. This ensures dynamic behavior is preserved during initialization and prevents premature binding of defaults in codecs/registry setup. This pattern should apply to similar initialization paths in Java server code where registries may mutate defaults during construction.

Applied to files:

  • server/src/main/java/org/opensearch/search/profile/query/ProfileScorer.java
🔇 Additional comments (2)
server/src/test/java/org/opensearch/search/profile/query/ProfileScorerTests.java (1)

105-114: Good coverage for the new accessor.

server/src/main/java/org/opensearch/search/profile/query/ProfileScorer.java (1)

68-86: No changes needed—the design is correct.

ProfileScorer is intentionally package-private and marked @opensearch.internal. External plugins access the getWrappedScorer() method polymorphically through the public Scorer interface, not by directly referencing the ProfileScorer class. This is a standard pattern in Java instrumentation and requires no utility class or visibility modification. The implementation properly supports the documented use case for plugins like neural-search.


@github-actions
Contributor

github-actions bot commented Feb 4, 2026

❌ Gradle check result for ba22d8f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?


Contributor

@jainankitk jainankitk left a comment

While the solution works for addressing the inconsistency across collectors/scorers, I am wondering if there is a better way since HybridBulkScorer will not be profiled even when the profiling option is set to true, right?

@martin-gaievski
Member Author

While the solution works for addressing the inconsistency across collectors/scorers, I am wondering if there is a better way since HybridBulkScorer will not be profiled even when the profiling option is set to true, right?

Thanks for the thoughtful review. You raise a valid technical point about BulkScorer-level profiling.

With this change, hybrid queries are now fully profiled at the query/weight level. Here is a summary of the profiler output, based on the POC I built using the changes from this PR:

HybridQuery: time_in_nanos: 12,444,791
├── breakdown.create_weight: 4,008,459
├── breakdown.build_scorer: 8,321,958
├── breakdown.next_doc: 11,249
├── children[0]: BooleanQuery (match)  time_in_nanos: 12,413,459
│   ├── children: TermQuery "name:cookies"  time_in_nanos: 11,457,584
│   ├── children: TermQuery "name:kitchen"  time_in_nanos: 8,749,625
│   └── children: TermQuery "name:aroma"    time_in_nanos: 8,671,626
├── children[1]: IndexOrDocValuesQuery (range)  time_in_nanos: 8,576,000
└── children[2]: LuceneEngineKnnVectorQuery (knn)  time_in_nanos: 8,562,458

The detailed response is in neural-search PR opensearch-project/neural-search#1255 (comment). This provides:

  • total HybridQuery execution time
  • per-sub-query breakdown (users can identify which sub-query is the bottleneck)
  • standard profiling metrics: create_weight, build_scorer, next_doc, score, etc.
  • collector timing via HybridCollectorManager

Looking at OpenSearch core, there is no ProfileBulkScorer equivalent to ProfileScorer. The current profiling infrastructure wraps:

  • Weight → ProfileWeight
  • Scorer → ProfileScorer
  • Collector → ProfileCollector

But BulkScorer is not in that list; I think that's because:

  1. BulkScorer is an optimization path, not the primary scoring interface
  2. Most queries don't use custom BulkScorers (they use the default implementation)
  3. Timing captured at the Weight level already includes BulkScorer execution

HybridBulkScorer's internal operations (window scoring, doc ID iteration) are indirectly captured in the parent HybridQuery's time_in_nanos. The only "missing" granularity is a breakdown like:

  • score_window: X ns
  • iterate_docs: Y ns

However, this level of detail is rarely actionable for users: what they need to know is "which sub-query is slow?", which the current profiling fully answers.

Adding BulkScorer profiling (in the way I see it) would require:

  1. new ProfileBulkScorer class in OpenSearch core (significant scope expansion)
  2. changes to IndexSearcher.search() to wrap BulkScorers
  3. new QueryTimingType values for BulkScorer operations

This could be a separate enhancement PR, but it is beyond the scope of fixing the original issue opensearch-project/neural-search#1255, which was specifically about getting profiling to work at all for hybrid queries.
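For concreteness, a wrapper along the lines of point 1 above could look roughly like this. This is purely a hypothetical sketch with simplified stand-in types (BulkScorer, Timer); neither ProfileBulkScorer nor these types exist in OpenSearch core today:

```java
// Hypothetical sketch of what a ProfileBulkScorer could look like.
// BulkScorer and Timer are simplified stand-ins; neither ProfileBulkScorer
// nor these types exist in OpenSearch core today.
class BulkScorer {
    void score(int min, int max) {
        // score all documents in the window [min, max)
    }
}

class Timer {
    private long totalNanos;
    long start() { return System.nanoTime(); }
    void stop(long startNanos) { totalNanos += System.nanoTime() - startNanos; }
    long total() { return totalNanos; }
}

class ProfileBulkScorer extends BulkScorer {
    private final BulkScorer delegate;
    private final Timer scoreWindowTimer = new Timer();

    ProfileBulkScorer(BulkScorer delegate) { this.delegate = delegate; }

    @Override
    void score(int min, int max) {
        long t = scoreWindowTimer.start();
        try {
            delegate.score(min, max); // delegate the actual window scoring
        } finally {
            scoreWindowTimer.stop(t); // would feed a "score_window" breakdown entry
        }
    }

    long scoreWindowNanos() { return scoreWindowTimer.total(); }
}

public class ProfileBulkScorerSketch {
    public static void main(String[] args) {
        ProfileBulkScorer scorer = new ProfileBulkScorer(new BulkScorer());
        scorer.score(0, 1024);
        System.out.println("score_window nanos: " + scorer.scoreWindowNanos());
    }
}
```

Even with such a wrapper, IndexSearcher.search() would still need to be changed to apply it, and new QueryTimingType values would be needed to surface the breakdown, which is why this stays out of scope here.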

@martin-gaievski
Member Author

This time there is a bunch of FailedToCommitClusterStateException: publication failed errors in the 2.19.5 tests:

ERROR][o.o.g.G.RemotePersistedState] [v2.19.5-remote-0] Latest manifest is not present in remote store for cluster UUID: pG7Prmj0S7KRGT8wSz6Sxw
» WARN ][o.o.t.SniffConnectionStrategy] [v2.19.5-remote-0] fetching nodes from external cluster [cluster] failed
»  org.opensearch.transport.ConnectTransportException: [][127.0.0.1:9300] connect_exception
»  	at org.opensearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:1078)
»  	at org.opensearch.core.action.ActionListener.lambda$toBiConsumer$2(ActionListener.java:217)
»  	at org.opensearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:57)
»  	at java.****/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
»  	at java.****/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
»  	at java.****/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
»  	at java.****/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2194)
»  	at org.opensearch.common.concurrent.CompletableContext.completeExceptionally(CompletableContext.java:72)
»  	at org.opensearch.transport.netty4.Netty4TcpChannel.lambda$addListener$0(Netty4TcpChannel.java:83)
»  	at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:603)
»  	at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:596)
»  	at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:572)
»  	at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:505)
»  	at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:649)
»  	at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:642)
»  	at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:131)
»  	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:326)
»  	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:342)
»  	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:784)
»  	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:697)
»  	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:660)
»  	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
»  	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:998)
»  	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
»  	at java.****/java.lang.Thread.run(Thread.java:1583)
»  Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /127.0.0.1:9300
»  Caused by: java.net.ConnectException: Connection refused
»  	at java.****/sun.nio.ch.Net.pollConnect(Native Method)
»  	at java.****/sun.nio.ch.Net.pollConnectNow(Net.java:694)
»  	at java.****/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:973)
»  	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:336)
»  	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:339)
»  	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:784)
»  	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:697)
»  	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:660)
»  	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
»  	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:998)
»  	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
»  	at java.****/java.lang.Thread.run(Thread.java:1583)
» WARN ][stderr                   ] [v2.19.5-remote-0] Feb 05, 2026 6:19:09 AM org.opensearch.javaagent.bootstrap.AgentPolicy setPolicy
» WARN ][stderr                   ] [v2.19.5-remote-0] INFO: Policy attached successfully: 

Signed-off-by: Martin Gaievski <gaievski@amazon.com>
Signed-off-by: Martin Gaievski <gaievski@amazon.com>
@martin-gaievski martin-gaievski force-pushed the profile-scorer-get-wrapped-scorer branch from ba22d8f to 07ae015 Compare February 5, 2026 23:14
@github-actions
Contributor

github-actions bot commented Feb 6, 2026

✅ Gradle check result for 07ae015: SUCCESS

@codecov

codecov bot commented Feb 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.28%. Comparing base (3ba2f37) to head (07ae015).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #20549      +/-   ##
============================================
+ Coverage     73.25%   73.28%   +0.02%     
- Complexity    72103    72126      +23     
============================================
  Files          5798     5798              
  Lines        329732   329733       +1     
  Branches      47519    47519              
============================================
+ Hits         241554   241641      +87     
+ Misses        68805    68720      -85     
+ Partials      19373    19372       -1     

☔ View full report in Codecov by Sentry.

@jainankitk jainankitk merged commit 8c2fb2e into opensearch-project:main Feb 6, 2026
35 checks passed
tanyabti pushed a commit to tanyabti/OpenSearch that referenced this pull request Feb 24, 2026