Skip to content

Enable all system generated search processor factories by default.#19402

Closed
bzhangam wants to merge 2 commits intoopensearch-project:mainfrom
bzhangam:main
Closed

Enable all system generated search processor factories by default.#19402
bzhangam wants to merge 2 commits intoopensearch-project:mainfrom
bzhangam:main

Conversation

@bzhangam
Copy link
Copy Markdown
Contributor

@bzhangam bzhangam commented Sep 24, 2025

Description

Enable all system generated search processor factories by default.

When we first introduced system-generated search pipelines, we disabled all processor factories by default and required users to explicitly enable the ones they needed. However, after using this feature to support native MMR and semantic highlighting, we’ve found that enabling all factories by default provides a better user experience.

Here’s why:

  • Alignment with the Feature’s Purpose
    The primary goal of system-generated pipelines is to improve ease of use. As plugin developers, when we add a new system-generated processor factory (PR), we typically want it enabled by default. Requiring users to manually enable it works against this goal and introduces unnecessary friction.

  • Negligible Performance Impact
    While enabling all factories adds a small amount of processing to the search path, the shouldGenerate function is designed to be lightweight. As long as it avoids expensive logic, the latency impact should remain negligible (benchmark).

  • Backward Compatibility & Safety Valve
    For features like semantic highlighting, enabling the factory by default ensures backward compatibility without requiring complex logic to auto-enable it when the neural plugin is loaded. Relying on plugins to automatically enable factories could unintentionally override user preferences.

    Since plugins can already programmatically enable factories, disabling them by default doesn’t guarantee that all factories will actually stay disabled. Instead, this setting should be seen as a way for users to disable specific factories in edge cases (for example, if a particular processor causes issues in their workload).

Summary:
We propose enabling all system-generated processor factories by default. This simplifies the user experience, maintains compatibility for key features, and still allows users to explicitly disable any factory if needed.

Related Issues

N/A

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Bo Zhang <bzhangam@amazon.com>
@bzhangam
Copy link
Copy Markdown
Contributor Author

@owaiskazi19 @atris @msfroh Want to see your opinion on this.

@owaiskazi19
Copy link
Copy Markdown
Member

owaiskazi19 commented Sep 24, 2025

Want to see your opinion on this.

Copying from my discussion offline with @bzhangam . My take would be to have it disable from core and can auto enable it from plugin side. The only reason to disable it by default since it's on the hot path search request and to avoid any conflicts caused by system generated search pipeline. Will let other maintainers discuss this as well. Adding @reta @cwperks too

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 0f2e4e2: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@bzhangam bzhangam closed this Sep 24, 2025
@bzhangam bzhangam reopened this Sep 24, 2025
@junqiu-lei
Copy link
Copy Markdown
Member

Thanks Bo for raising this PR. This way can make the default processor behavior clear while compared to let plugin side manually enable each processor, which can cause conflicts as well.

For case of batch semantic highlighting with system generated pipeline, we'll need enable it by default to avoid UX change since OpenSearch 3.0. Besides, I don't see there has easy way to enable it from plugin side, while I tried override additionalSettings but it could ignore other processors value.

@bzhangam
Copy link
Copy Markdown
Contributor Author

adding @saratvemulapalli @andrross

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 0f2e4e2: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 0f2e4e2: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@reta
Copy link
Copy Markdown
Contributor

reta commented Sep 25, 2025

Copying from my discussion offline with @bzhangam . My take would be to have it disable from core and can auto enable it from plugin side. The only reason to disable it by default since it's on the hot path search request and to avoid any conflicts caused by system generated search pipeline. Will let other maintainers discuss this as well. Adding @reta @cwperks too

+1 to @owaiskazi19 suggestion, I think we discussed this subject before, what is not necessary for most users have to be opt-in only (off by default).

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 7c77a3e: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@saratvemulapalli
Copy link
Copy Markdown
Member

+1 to other maintainers. I would recommend to leave system generated processors off by default without altering the behavior. For features which will need this setting, lets document it and convey steps to enable it.
Operators can choose to overwrite this setting for managed offerings.

@bzhangam bzhangam closed this Sep 26, 2025
@bzhangam
Copy link
Copy Markdown
Contributor Author

bzhangam commented Sep 26, 2025

Since multiple maintainers think it's better to disable them by default I cancelled this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants