Adding protos for search classes#13178
Adding protos for search classes#13178VachaShah wants to merge 3 commits intoopensearch-project:mainfrom
Conversation
Compatibility status:Checks if related components are compatible with change b6f1d25 Incompatible componentsSkipped componentsCompatible componentsCompatible components: [https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/flow-framework.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/sql.git] |
|
❌ Gradle check result for 855f7ce: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❌ Gradle check result for b6f1d25: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
dblock
left a comment
There was a problem hiding this comment.
This works to quickly add protobuf, but I think we must refactor the classes to separate transport-specific parts in this one.
| /** | ||
| * Gates the functionality of integrating protobuf within search API and node-to-node communication. | ||
| */ | ||
| public static final String PROTOBUF = "opensearch.experimental.feature.search_with_protobuf.enabled"; |
There was a problem hiding this comment.
Should we call this something more generic since we may be adding more than search and gate all of protobuf transport behind it? What do you think about "opensearch.experimental.feature.transport.protobuf.enabled"?
There was a problem hiding this comment.
Currently, only search related transport will be gated behind this since when we add the transport specific node to node protobuf structure (in the next PR), it will consider only search related classes for now. We can keep adding more classes once we expand the use case. That message would be https://github.com/opensearch-project/OpenSearch/pull/11910/files#diff-3b6cef54e769860b6ff96d7052208f6fe2ef8a86ea04aec232044984ba52e5db.
|
|
||
| public static final Setting<Boolean> PLUGGABLE_CACHE_SETTING = Setting.boolSetting(PLUGGABLE_CACHE, false, Property.NodeScope); | ||
|
|
||
| public static final Setting<Boolean> PROTOBUF_SETTING = Setting.boolSetting(PROTOBUF, false, Property.NodeScope, Property.Dynamic); |
There was a problem hiding this comment.
The feature/effort is not just transport specific though, protobuf is at the transport level and also at the API request response level. WDYT?
There was a problem hiding this comment.
I think this variable and the name of the setting should match our intent. Won't we be rolling out the search API with protobuf support, then another API, etc.? What's the setting that will live the longest?
Either way this variable and the setting string should match I think.
The most generic one:
PROTOBUF = opensearch.experimental.feature.protobuf.enabled
PROTOBUF_SETTING
Search specific.
PROTOBUF_SEARCH = opensearch.experimental.feature.protobuf.search.enabled
PROTOBUF_SEEARCH_SETTING
I'm good with either, but I prefer (1) because I don't think we'll ever want to have 2 APIs with protobuf support enabled/disabled separately. I could totally be wrong, too.
| hits = new SearchHits(in); | ||
| } | ||
|
|
||
| public FetchSearchResult(InputStream in) throws IOException { |
There was a problem hiding this comment.
This is a code smell. We have org.opensearch.core.common.io.stream.StreamInput and java.io.InputStream, and two constructors that do completely different things depending on whether we pass StreamInput or InputStream. We're lucky they are called different class names :)
This needs to be disambiguated, likely by implementing a base class and separate implementations for one vs. the other stream.
There was a problem hiding this comment.
Both the constructors come from its base class implementing Writeable and BytesWriteable (this is not part of this PR). Let me see if I can make this a little more disambiguated.
There was a problem hiding this comment.
We need to do the same things as with DocumentField - the class should not deal with all possible (de)serealizations
| public FetchSearchResult(ShardSearchContextId id, SearchShardTarget shardTarget) { | ||
| this.contextId = id; | ||
| setSearchShardTarget(shardTarget); | ||
| this.fetchSearchResultProto = FetchSearchResultProto.FetchSearchResult.newBuilder() |
There was a problem hiding this comment.
Transport-specific implementation needs to live in its own namespaces/classes.
There was a problem hiding this comment.
This is not a transport specific implementation though. This is the response/request specific serialization and deserialization. Transport specific protobuf implementation will be separate which is sent over the wire of which this message is a part. Transport specific protobuf will be a part of the next PR.
There was a problem hiding this comment.
The way it's written the request/response can serialize its way two ways, however the serialization primitives are all over the class. I think there are a few options (and possibly more).
- Multiple
serializemethods which take different stream types, and no locals anywhere else in the class. - Different
serializemethods in the style oftoString(), soserializeToNative(..)orserializeToProtobuf(..). - A base class
FetchResultand child classesNativeFetchResultandProtobufFetchResultthat both have aserializemethod that comes fromNativeSerializableandProtobufSerializable. - A helper serializer class
NativeFetchResultSerializerandProtobufFetchResultSerializer.
Maybe @reta has better ideas?
There was a problem hiding this comment.
Thanks @dblock definitively +1 (https://github.com/opensearch-project/OpenSearch/pull/13178/files#r1566328274)
There was a problem hiding this comment.
Doing something similar to DocumentField might not work since the response classes implement Writeable for native protocol (and will do BytesWriteable as well). But I think option 3 specified by @dblock might be a way to go where the specific protocol class can implement the related constructor for Writeable or BytesWriteable.
There was a problem hiding this comment.
Thanks for the hint, we probably need to align the that between transport and protocol, I will look into it shortly, thanks!
There was a problem hiding this comment.
@VachaShah OK, I think we won't be able to move on with this change before we solve the protocol handling by outbound communication (we sort of addressed the inbound only, but not outbound, TransportResponseHandler & are heavily relying on "native" serialization). In any case, I think the serde should be moved away from the response classes, and handled by the TransportResponseHandler, depending on the protocol of the inbound message).
There was a problem hiding this comment.
I'm with this. I think the "right" way is to first PR splitting out the native serialization and deserialization out of FetchSearchResult. It's a hefty piece of work, but who dares wins.
There was a problem hiding this comment.
Makes sense! I will look into this. For abstracting the outbound side of transport and adding support for multiple protocols on that side of transport, I have created #13293. Next, I am looking into decoupling Writeable for native serialization from TransportResponseHandler.
I think I missed adding some explanation here. Sorry for the confusion! This is not adding any transport specific implementations. In this PR, we are adding proto structures and serialization logic for API level response classes and model classes for search. The transport specific implementation for node to node message in protobuf structure will come in the next PR where these response classes will be used as part of it. |
Understood. I think the fact that If you create a fetch result with a specific constructor that knows nothing about protobuf, and that will never be serialized to/from protobuf, you still have something related to protobuf. Can we do better? |
...src/main/java/org/opensearch/common/document/serializer/DocumentFieldProtobufSerializer.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/common/document/serializer/DocumentFieldSerializer.java
Outdated
Show resolved
Hide resolved
b6f1d25 to
288f99c
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #13178 +/- ##
============================================
+ Coverage 71.42% 71.48% +0.06%
- Complexity 59978 61155 +1177
============================================
Files 4985 5064 +79
Lines 282275 287916 +5641
Branches 40946 41731 +785
============================================
+ Hits 201603 205815 +4212
- Misses 63999 65102 +1103
- Partials 16673 16999 +326 ☔ View full report in Codecov by Sentry. |
|
|
||
| public static final Setting<Boolean> PLUGGABLE_CACHE_SETTING = Setting.boolSetting(PLUGGABLE_CACHE, false, Property.NodeScope); | ||
|
|
||
| public static final Setting<Boolean> PROTOBUF_SETTING = Setting.boolSetting(PROTOBUF, false, Property.NodeScope, Property.Dynamic); |
There was a problem hiding this comment.
I think this variable and the name of the setting should match our intent. Won't we be rolling out the search API with protobuf support, then another API, etc.? What's the setting that will live the longest?
Either way this variable and the setting string should match I think.
The most generic one:
PROTOBUF = opensearch.experimental.feature.protobuf.enabled
PROTOBUF_SETTING
Search specific.
PROTOBUF_SEARCH = opensearch.experimental.feature.protobuf.search.enabled
PROTOBUF_SEEARCH_SETTING
I'm good with either, but I prefer (1) because I don't think we'll ever want to have 2 APIs with protobuf support enabled/disabled separately. I could totally be wrong, too.
| public FetchSearchResult(ShardSearchContextId id, SearchShardTarget shardTarget) { | ||
| this.contextId = id; | ||
| setSearchShardTarget(shardTarget); | ||
| this.fetchSearchResultProto = FetchSearchResultProto.FetchSearchResult.newBuilder() |
There was a problem hiding this comment.
I'm with this. I think the "right" way is to first PR splitting out the native serialization and deserialization out of FetchSearchResult. It's a hefty piece of work, but who dares wins.
I think it depends how we would implement it, you might be very right and we may need to keep response classes implementing the default (native) serialization, but protobuf should be 100% out , I am sure |
Signed-off-by: Vacha Shah <vachshah@amazon.com>
Signed-off-by: Vacha Shah <vachshah@amazon.com>
Signed-off-by: Vacha Shah <vachshah@amazon.com>
288f99c to
9c07269
Compare
|
Update: Merged the latest changes from #13293. Working on architecture for keeping serde separate from classes for protobuf. |
dblock
left a comment
There was a problem hiding this comment.
I commented on the next refactor in #13697 (comment).
This PR can become mergeable when FetchSearchResult doesn't have both native stream and protobuf mixed together.
|
This PR is stalled because it has been open for 30 days with no activity. |
|
This PR is stalled because it has been open for 30 days with no activity. |
|
Closing this PR since won't be working on this actively. If this change needs to be picked up, please feel free to use my branch https://github.com/VachaShah/OpenSearch/tree/add-protos-for-search. |
Description
Coming from #11910, extracting some classes for introducing proto structures in this PR for easier reviews. This PR mainly contains:
Note:
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Part of #10684
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.