Fix bug "synonym_graph filter fails with word_delimiter_graph when using whitespace or classic tokenizer in synonym_analyzer" #19248
❌ Gradle check result for 5c8fbe6: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
This fails only in CI; I cannot reproduce it locally with the same seeds. |
❌ Gradle check result for 267f48e: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
gaobinlong
left a comment
The DCO check failed. Please amend your commit with '-s' to include the sign-off, and add a changelog entry.
...lysis-common/src/main/java/org/opensearch/analysis/common/MultiplexerTokenFilterFactory.java
Outdated
server/src/main/java/org/opensearch/index/analysis/AnalysisRegistry.java
Outdated
Persistent review updated to latest commit 1e2895a |
❌ Gradle check result for 1e2895a: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
1e2895a to 739a656
Persistent review updated to latest commit 739a656 |
Persistent review updated to latest commit e73f993 |
cebf2d0 to 3fcb384
Persistent review updated to latest commit 3fcb384 |
❌ Gradle check result for 3fcb384: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 3fcb384: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
…whitespace or classic tokenizer in synonym_analyzer" bug opensearch-project#18037 add 'analyzersBuiltSoFar' to getChainAwareTokenFilterFactory to build custom analyzers depending on other (already built) analyzers The analyzers are built following the order of precedence specified in the settings Signed-off-by: Lamine Idjeraoui <lidjeraoui@apple.com>
…rgs) Signed-off-by: Lamine Idjeraoui <lidjeraoui@apple.com>
Signed-off-by: Lamine Idjeraoui <lidjeraoui@apple.com>
Signed-off-by: Lamine Idjeraoui <lidjeraoui@apple.com>
…g kahn's algorithm topological sort Signed-off-by: Lamine Idjeraoui <lidjeraoui@apple.com>
Signed-off-by: Lamine Idjeraoui <lidjeraoui@apple.com>
Signed-off-by: Lamine Idjeraoui <lidjeraoui@apple.com>
Signed-off-by: Andrew Ross <andrross@amazon.com>
3fcb384 to 03518e0
Persistent review updated to latest commit 03518e0 |
Fix "synonym_graph filter fails with word_delimiter_graph when using whitespace or classic tokenizer in synonym_analyzer" bug.
Use automatic dependency detection using Kahn's algorithm topological sort.
Signed-off-by: Lamine Idjeraoui <lidjeraoui@apple.com>
Signed-off-by: Andrew Ross <andrross@amazon.com>
Co-authored-by: Lamine Idjeraoui <lidjeraoui@apple.com>
Co-authored-by: Andrew Ross <andrross@amazon.com>
Signed-off-by: kkewwei <kkewwei@163.com>
This PR fixes the "synonym_graph filter fails with word_delimiter_graph when using whitespace or classic tokenizer in synonym_analyzer" bug.
I investigated the issue, and it looks like there are two causes:
analysisRegistry.getAnalyzer(synonymAnalyzerName); This call is not enough because it only looks up the built-in and pre-built analyzers. Analyzers defined in the index settings are not found there.
The solution is two-fold:
Fail safe instead of fail fast when building the analyzers
Currently, if a single analyzer fails to build for any reason, the whole build process fails with an exception.
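Build the analyzers that others depend on first:
A custom analyzer that uses a synonym filter may depend on another analyzer, which has to be built first.
The PR adds logic to detect these dependencies automatically and to order the build with a topological sort (Kahn's algorithm).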
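The two parts of the solution can be sketched together as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the class AnalyzerBuildSketch, the methods buildOrder and buildAll, and the shape of the dependencies map (analyzer name mapped to the names it needs built first) are all hypothetical. It orders analyzer names with Kahn's algorithm, then builds them in that order while recording failures instead of aborting.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch; none of these names come from OpenSearch.
public class AnalyzerBuildSketch {

    /** Kahn's algorithm: returns names so that each one follows its dependencies. */
    public static List<String> buildOrder(Map<String, Set<String>> dependencies) {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        Map<String, List<String>> dependents = new HashMap<>();
        for (String name : dependencies.keySet()) {
            inDegree.put(name, 0);
        }
        for (Map.Entry<String, Set<String>> entry : dependencies.entrySet()) {
            for (String dep : entry.getValue()) {
                if (!dependencies.containsKey(dep)) {
                    continue; // assumed to be built-in or pre-built: nothing to order
                }
                inDegree.merge(entry.getKey(), 1, Integer::sum);
                dependents.computeIfAbsent(dep, k -> new ArrayList<>()).add(entry.getKey());
            }
        }
        // Start with analyzers that have no unbuilt dependencies.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : inDegree.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String name = ready.remove();
            order.add(name);
            for (String dependent : dependents.getOrDefault(name, List.of())) {
                if (inDegree.merge(dependent, -1, Integer::sum) == 0) {
                    ready.add(dependent);
                }
            }
        }
        if (order.size() != inDegree.size()) {
            throw new IllegalArgumentException("Circular analyzer dependency detected");
        }
        return order;
    }

    /** Builds in the given order; a failing analyzer is recorded, not rethrown. */
    public static <A> Map<String, A> buildAll(
        List<String> order,
        Function<String, A> builder,
        Map<String, Exception> failures
    ) {
        Map<String, A> built = new LinkedHashMap<>();
        for (String name : order) {
            try {
                built.put(name, builder.apply(name));
            } catch (RuntimeException e) {
                failures.put(name, e); // fail safe: keep building the rest
            }
        }
        return built;
    }
}
```

In this sketch a circular dependency is still reported with an exception, because no valid build order exists in that case. Whether the PR treats cycles the same way is not stated here.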
Resolves #18037
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.