[RFC] Alternate Stream Transport in OpenSearch #18425
Description
Is your feature request related to a problem? Please describe
The existing transport framework in OpenSearch isn't well suited to generating a stream of responses. For example,
TransportChannel isn't currently designed to send a stream of TransportResponses for a given TransportRequest. It also closes the channel once it is done sending a response.
void sendResponse(TransportResponse response) throws IOException;
void sendResponse(Exception exception) throws IOException;
Similarly, ChannelActionListener is designed along the same lines to work with TransportChannel:
@Override
public void onResponse(Response response) {
try {
channel.sendResponse(response);
} catch (Exception e) {
onFailure(e);
}
}
@Override
public void onFailure(Exception e) {
TransportChannel.sendErrorResponse(channel, actionName, request, e);
}
Describe the solution you'd like
Below is the proposal we (@bowenlan-amzn @harshavamsi) have been brainstorming:
- Enhance the OpenSearch transport interfaces to support a stream of responses.
- Provide a way in OpenSearch to install a stream transport and transport service using NetworkPlugin, similar to the native transport and transport service. The stream transport will be capable of handling any TransportRequest, but the request handlers can send a stream of TransportResponses to the TransportChannel.
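To make the first bullet concrete, here is a minimal, self-contained sketch of what a streaming channel contract could look like. Everything in it is an assumption for illustration: `StreamChannel`, `sendResponseBatch`, `completeStream`, `Batch`, and `RecordingChannel` are hypothetical names, not the actual OpenSearch interfaces or the API proposed in the linked PRs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are the real OpenSearch classes.
final class StreamChannelSketch {

    /** Stand-in for a TransportResponse; here just one batch of hit ids. */
    static final class Batch {
        final List<Integer> hits;
        Batch(List<Integer> hits) { this.hits = hits; }
    }

    /** Hypothetical streaming counterpart of TransportChannel. */
    interface StreamChannel {
        void sendResponseBatch(Batch batch);  // may be called many times per request
        void completeStream();                // signals the normal end of the stream
        void sendResponse(Exception failure); // terminates the stream with an error
    }

    /** In-memory channel that records what a handler sends. */
    static final class RecordingChannel implements StreamChannel {
        final List<Batch> received = new ArrayList<>();
        boolean completed;
        Exception failure;

        @Override
        public void sendResponseBatch(Batch batch) {
            if (completed || failure != null) {
                throw new IllegalStateException("stream already closed");
            }
            received.add(batch);
        }

        @Override
        public void completeStream() { completed = true; }

        @Override
        public void sendResponse(Exception e) { failure = e; }
    }

    /** Request handler that streams `total` hits in batches of `batchSize` (must be > 0). */
    static void handle(int total, int batchSize, StreamChannel channel) {
        try {
            for (int start = 0; start < total; start += batchSize) {
                List<Integer> hits = new ArrayList<>();
                for (int i = start; i < Math.min(start + batchSize, total); i++) {
                    hits.add(i);
                }
                channel.sendResponseBatch(new Batch(hits));
            }
            channel.completeStream();
        } catch (Exception e) {
            channel.sendResponse(e);
        }
    }

    public static void main(String[] args) {
        RecordingChannel channel = new RecordingChannel();
        handle(5, 2, channel);
        // prints "3 batches, completed=true"
        System.out.println(channel.received.size() + " batches, completed=" + channel.completed);
    }
}
```

The point of the sketch is the shape of the contract: one request, many response batches, and an explicit end-of-stream signal, in contrast to the single `sendResponse` call that closes the channel today.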
Why is stream transport needed in addition to regular transport?
We would like the native transport and its Netty-based implementation to continue to work as is for all the APIs and cluster management. However, if a new API needs to stream responses, it can switch to the new stream transport and transport service. Everything remains the same, except that it can send a stream of TransportResponses to the TransportChannel.
Why can't we modify the existing transport to handle both?
It can be modified and made to work. However, if we want to use a different underlying transport altogether, such as one based on Arrow Flight or gRPC, it's not a good idea to tie them together. Keeping them separate keeps things simple.
Will the stream transport have the features that NodeConnectionsService offers to ensure the connection pool is up to date?
Yes. Just as with the native transport, NodeConnectionsService for the new transport will ensure that the connections and their handshake mechanisms work for health checks.
If a plugin needs to write a new API to make use of stream transport for streaming responses, how can it be done?
Just as it works today with ActionPlugin, a plugin can register new Actions and define the TransportAction by injecting the stream transport service instead of the regular transport service. Everything else works as is.
Will stream transport be enabled by default?
It will be behind a feature flag, and its implementation will be provided by a NetworkPlugin that has to be installed. Maybe after a few releases, we can convert it into a module and make it a first-class citizen in OpenSearch.
Will it support both node-node transport and HTTP requests?
We will start with node-node transport. Once it is stable and secure, we can think about exposing it to HTTP clients.
Will existing features of native transport be supported on new transport?
Yes, features like thread management and delegation for request and response handling, tracing, header and thread context propagation, and task management will work out of the box by reusing the existing inbound and outbound handlers.
Why not use the Auxiliary transport?
Auxiliary transport doesn't support the above features, and connection and channel management have to be handled explicitly. This means that if we want to reuse any of the features mentioned above, they have to be rebuilt or integrated with additional effort. It is also not well suited to node-node communication, where TransportRequest and TransportResponse are serialized and deserialized using StreamInput and StreamOutput. Also, writing new TransportActions using Aux transport isn't supported. If we need all these features, it is better to use the regular transport framework.
How are serialization and deserialization of requests and responses handled?
The underlying transport can provide its own implementations of StreamInput and StreamOutput to handle serialization and deserialization. This can help achieve zero copy between the OpenSearch transport layer and the underlying transport, and it also decouples the serialization protocol.
For example, in the Flight-based stream transport, for request serialization/deserialization, we can piggyback on the native transport, since nothing changes in how TransportRequest is handled. For responses, we have created Arrow-based implementations of StreamInput and StreamOutput. They can be found in #18393, which @harshavamsi is working on. This approach can potentially avoid issues like #17580, so there is no need to bring Arrow dependencies into the server.
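The decoupling described above can be illustrated with a small, self-contained sketch. It is not the code from #18393: `Out`, `BytesOut`, `ColumnarOut`, and `HitResponse` are hypothetical stand-ins, and plain Java lists take the place of Arrow vectors so the example has no Arrow dependency. The idea it tries to show is that a response only talks to an abstract output, and each transport supplies its own implementation.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: these are not the real StreamInput/StreamOutput classes.
final class PluggableSerializationSketch {

    /** Stand-in for StreamOutput: the only API a response sees. */
    interface Out {
        void writeInt(int value) throws IOException;
        void writeString(String value) throws IOException;
    }

    /** Byte-oriented sink, in the spirit of the native transport. */
    static final class BytesOut implements Out {
        private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        private final DataOutputStream data = new DataOutputStream(bytes);

        @Override
        public void writeInt(int v) throws IOException { data.writeInt(v); }

        @Override
        public void writeString(String v) throws IOException { data.writeUTF(v); }

        byte[] toBytes() { return bytes.toByteArray(); }
    }

    /** Column-oriented sink; the lists stand in for columnar vectors. */
    static final class ColumnarOut implements Out {
        final List<Integer> ints = new ArrayList<>();
        final List<String> strings = new ArrayList<>();

        @Override
        public void writeInt(int v) { ints.add(v); }

        @Override
        public void writeString(String v) { strings.add(v); }
    }

    /** Stand-in for a TransportResponse; it never knows which sink it writes to. */
    static final class HitResponse {
        final int docId;
        final String index;

        HitResponse(int docId, String index) {
            this.docId = docId;
            this.index = index;
        }

        void writeTo(Out out) throws IOException {
            out.writeInt(docId);
            out.writeString(index);
        }
    }

    /** Serializes with the byte-oriented sink. */
    static byte[] toBytes(HitResponse response) {
        try {
            BytesOut out = new BytesOut();
            response.writeTo(out);
            return out.toBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Serializes the same response with the column-oriented sink. */
    static ColumnarOut toColumns(HitResponse response) {
        try {
            ColumnarOut out = new ColumnarOut();
            response.writeTo(out);
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HitResponse response = new HitResponse(7, "logs");
        System.out.println("bytes: " + toBytes(response).length);
        System.out.println("columns: " + toColumns(response).ints + " " + toColumns(response).strings);
    }
}
```

Because `HitResponse.writeTo` is unchanged across both sinks, swapping the wire format becomes a decision of the transport rather than of each response class, which is the property the proposal relies on.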
Will security features like TLS and FGAC from security plugin work out of the box?
When defining a new transport using NetworkPlugin, the plugin gets access to SecureTransportSettingsProvider, which can be used to configure SSL the same way it is done for the default transport. For FGAC, security headers are set during inbound and outbound handling of requests and responses using ThreadContext, more details here. As long as the headers are correctly set and propagated, the changes needed to support a new type of transport look minor.
#18424 is an attempt at the changes related to stream transport and its implementation using a Flight-based transport. It also migrates the search API to this transport, with only the QUERY and FETCH actions migrated under the new StreamSearchAction.
Looking for feedback!
Related component
Describe alternatives you've considered
No response
Additional context
No response