Fix: avoid ZSTD codec from overriding service codec factory. #7037
- addresses opensearch-project#7012 Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com>
dblock left a comment
This works, but feels awfully specific to the fact that we have these two custom codecs in the project. A few ideas before I hit approve/merge.
- Can we check whether the codec is not custom (aka well known codecs) instead of whether it's a known custom codec?
- Does it make sense to make names such as "CUSTOM:ZSTD" and then look for "CUSTOM:" instead or is it a silly idea?
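The two ideas above can be compared with a small sketch. Everything here is illustrative: the class and method names are hypothetical, the "CUSTOM:" prefix is only the convention floated in this comment (not an existing OpenSearch convention), and the well-known names are the ones listed in this PR's description.

```java
import java.util.Set;

/** Hypothetical helper contrasting the two checks suggested in the review. */
final class CodecNameChecks {

    // Built-in codec names, taken from this PR's description.
    private static final Set<String> WELL_KNOWN =
        Set.of("default", "best_compression", "lucene_default");

    // Hypothetical naming convention from the review comment.
    private static final String CUSTOM_PREFIX = "CUSTOM:";

    /** Idea 1: anything that is not a well-known codec is treated as custom. */
    static boolean isCustomByExclusion(String codecName) {
        return !WELL_KNOWN.contains(codecName);
    }

    /** Idea 2: only names carrying the prefix are treated as custom. */
    static boolean isCustomByPrefix(String codecName) {
        return codecName.startsWith(CUSTOM_PREFIX);
    }

    public static void main(String[] args) {
        System.out.println(isCustomByExclusion("ZSTD"));
        System.out.println(isCustomByPrefix("CUSTOM:ZSTD"));
    }
}
```

The exclusion check needs no changes to user-facing names but also matches misspelled values; the prefix check is explicit but would require users to change how they set index.codec.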
Gradle Check (Jenkins) Run Completed with:
@Override
public Optional<CodecServiceFactory> getCustomCodecServiceFactory(final IndexSettings indexSettings) {
    return Optional.of(new CustomCodecServiceFactory());
    String codec = indexSettings.getValue(EngineConfig.INDEX_CODEC_SETTING);
Please add a unit test for this class.
@dblock Do you mean requiring the user to specify all custom codecs as, for example,
One comment: if an index sets this value to use the custom codec and another plugin also supplies a codec (say, for a k-NN index), which codec will be picked up? Or will it lead to failures? Example (might result in failure; can we check this case): I have created this issue: #7032, which tries to explore a possible solution. One suggestion: can we add Javadoc on top of this plugin, and also on the EnginePlugin class that provides this interface, stating that index creation can fail if an index tries to use two codecs?
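The concern in this comment can be sketched as follows. This is a simplified model, not OpenSearch code: the Plugin interface, the factory names, and the rule that more than one factory per index is an error are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

final class CodecFactoryConflict {

    /** Hypothetical stand-in for a plugin that may supply a codec service factory. */
    interface Plugin {
        Optional<String> customCodecServiceFactory(String indexCodec);
    }

    /** Assumed rule: at most one plugin may supply a factory for a given index. */
    static Optional<String> resolve(List<Plugin> plugins, String indexCodec) {
        List<String> factories = plugins.stream()
            .map(plugin -> plugin.customCodecServiceFactory(indexCodec))
            .flatMap(Optional::stream)
            .collect(Collectors.toList());
        if (factories.size() > 1) {
            throw new IllegalStateException("Multiple codec service factories for one index: " + factories);
        }
        return factories.stream().findFirst();
    }

    public static void main(String[] args) {
        // Always supplies a factory, regardless of the index.codec value.
        Plugin always = codec -> Optional.of("zstd-factory");
        // Supplies a factory only when index.codec names a ZSTD codec.
        Plugin conditional = codec ->
            codec.startsWith("ZSTD") ? Optional.of("zstd-factory") : Optional.empty();
        // Hypothetical second plugin that also supplies a factory.
        Plugin knnLike = codec -> Optional.of("knn-factory");

        System.out.println(resolve(List.of(conditional, knnLike), "default"));
        try {
            resolve(List.of(always, knnLike), "default");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under these assumptions, a plugin that always returns a factory collides with any other plugin that does the same, while a plugin that returns one only for its own codec names collides only when an index actually asks for both.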
- Removed custom classes for CodecService and CodecServiceFactory. - Also removed PerFieldMappingPostingFormatCodec -- not required. - Added documentation. Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com>
I reviewed the code again to see if I could do it without overriding the existing codec service factory, and I can. So, I have removed the CodecService classes altogether. Note that the custom compression codecs are registered by calling the org.apache.lucene.codecs.Codec constructor here. The
@dblock The names for custom compression codecs,
@navneet1v @martin-gaievski Can you check if this also addresses #7032?
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #7037 +/- ##
============================================
- Coverage 70.78% 70.72% -0.07%
- Complexity 59269 59278 +9
============================================
Files 4823 4820 -3
Lines 283985 283962 -23
Branches 40953 40952 -1
============================================
- Hits 201026 200820 -206
- Misses 66403 66693 +290
+ Partials 16556 16449 -107
☔ View full report in Codecov by Sentry.
Now that you have removed all the CodecServiceFactory classes, I want to know how the codec is now used or attached to a particular index. Is it that if someone specifies index.codec: ZSTD, the ZstdCodec is picked up if it is already present?
@navneet1v Yes. That happens here, which calls Lucene's registered codecs here. Lucene's Flamegraphs for the run below, with |
This is awesome, @mulugetam. Basically, the standard service loader mechanism is sufficient here, right?
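A rough sketch of the name-based lookup discussed here, for readers unfamiliar with it. Lucene itself exposes Codec.forName(String) and discovers implementations through Java's service-provider mechanism; the registry below is a hand-rolled stand-in so the example runs without Lucene on the classpath, and all of its names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified stand-in for a name-keyed codec registry; not Lucene's implementation. */
final class NamedRegistrySketch {

    /** Hypothetical minimal codec abstraction: only the registered name matters here. */
    interface NamedCodec {
        String name();
    }

    private final Map<String, NamedCodec> byName = new HashMap<>();

    /** Registers a codec under the name it reports. */
    void register(NamedCodec codec) {
        byName.put(codec.name(), codec);
    }

    /** Resolves a codec by name, failing loudly for unknown names. */
    NamedCodec forName(String name) {
        NamedCodec codec = byName.get(name);
        if (codec == null) {
            throw new IllegalArgumentException("No codec registered under name: " + name);
        }
        return codec;
    }

    public static void main(String[] args) {
        NamedRegistrySketch registry = new NamedRegistrySketch();
        registry.register(() -> "ZSTD");
        registry.register(() -> "ZSTDNODICT");
        // The index.codec value is used directly as the lookup key.
        System.out.println(registry.forName("ZSTD").name());
    }
}
```

The point of the design is that once a codec is discoverable by name, the index.codec value alone selects it, so no plugin-specific factory has to sit in the lookup path.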
- Zstandard version 1.5.5 contains a bug fix for a rare corruption error
described here: https://github.com/facebook/zstd/releases/tag/v1.5.5. The
zstd-jni version we use here, 1.5.5-1, uses Zstandard v1.5.5.
Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com>
@reta I have also upgraded the zstd-jni version from 1.5.4-1 to 1.5.5-1. Version 1.5.5-1 is based on ZSTD version 1.5.5, which addresses the rare corruption bug described here: https://github.com/facebook/zstd/releases/tag/v1.5.5
* Fix: enable ZSTD codec only if index.codec is set to ZSTD. - addresses #7012 Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com> * Removed custom CodecService and CodecServiceFactory classes. - Removed custom classes for CodecService and CodecServiceFactory. - Also removed PerFieldMappingPostingFormatCodec -- not required. - Added documentation. Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com> * Bump zstd-jni version from 1.5.4-1 to 1.5.5-1. - Zstandard version 1.5.5 contains a bug fix for a rare corruption error described here: https://github.com/facebook/zstd/releases/tag/v1.5.5. The zstd-jni version we use here, 1.5.5-1, uses Zstandard v1.5.5. Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com> --------- Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com> (cherry picked from commit 569e90c) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…7149) * Fix: enable ZSTD codec only if index.codec is set to ZSTD. - addresses #7012 * Removed custom CodecService and CodecServiceFactory classes. - Removed custom classes for CodecService and CodecServiceFactory. - Also removed PerFieldMappingPostingFormatCodec -- not required. - Added documentation. * Bump zstd-jni version from 1.5.4-1 to 1.5.5-1. - Zstandard version 1.5.5 contains a bug fix for a rare corruption error described here: https://github.com/facebook/zstd/releases/tag/v1.5.5. The zstd-jni version we use here, 1.5.5-1, uses Zstandard v1.5.5. --------- (cherry picked from commit 569e90c) Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…rch-project#7037) * Fix: enable ZSTD codec only if index.codec is set to ZSTD. - addresses opensearch-project#7012 Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com> * Removed custom CodecService and CodecServiceFactory classes. - Removed custom classes for CodecService and CodecServiceFactory. - Also removed PerFieldMappingPostingFormatCodec -- not required. - Added documentation. Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com> * Bump zstd-jni version from 1.5.4-1 to 1.5.5-1. - Zstandard version 1.5.5 contains a bug fix for a rare corruption error described here: https://github.com/facebook/zstd/releases/tag/v1.5.5. The zstd-jni version we use here, 1.5.5-1, uses Zstandard v1.5.5. Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com> --------- Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com>
The new ZSTD compression codec adds ZSTD to the existing compression codecs (default, best_compression, and lucene_default). This PR allows the compression codec to give a custom-codec service factory only when index.codec is set to ZSTD or ZSTDNODICT.
Description
Fixes issue #7012 by explicitly avoiding the creation of a custom codec service unless the index.codec value is either ZSTD or ZSTDNODICT.
Issues Resolved
Resolves #7012
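A minimal sketch of the conditional behavior this description refers to (the first commit's approach, which the later commits in this PR replaced by removing the custom factory classes). The class name, the string stand-in for the factory, and the exact-match comparison are assumptions for illustration only.

```java
import java.util.Optional;
import java.util.Set;

final class ConditionalCodecFactory {

    // Codec names from the PR description that should trigger the custom factory.
    private static final Set<String> ZSTD_CODECS = Set.of("ZSTD", "ZSTDNODICT");

    /**
     * Returns a factory (modeled as a string here) only for the ZSTD codec names.
     * Exact, case-sensitive matching is an assumption of this sketch.
     */
    static Optional<String> customFactoryFor(String indexCodec) {
        return ZSTD_CODECS.contains(indexCodec)
            ? Optional.of("CustomCodecServiceFactory")
            : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(customFactoryFor("ZSTD").isPresent());
        System.out.println(customFactoryFor("best_compression").isPresent());
    }
}
```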
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.