[RFC] Protobuf in OpenSearch #6844

@saratvemulapalli

Inspiration

Plugins are tightly coupled to the OpenSearch version (#1707), and relaxing this coupling so that they work across patch versions is still in progress.

While working on extensions (#2447), we wanted to support multiple versions of OpenSearch (major, minor, and patch) with a single OpenSearch SDK [1].

Proposal

While exploring open source solutions, we looked at Protobuf [2], which is built by Google and is widely adopted for serialization/deserialization and as a message format for RPC. It was designed to support forward and backward compatibility.

In an initial experiment integrating Protobuf into OpenSearch/Extensions (opensearch-project/opensearch-sdk-java#414 (comment)), we saw:

  • An extension working with multiple versions of OpenSearch (backward and forward compatibility).
  • Simple, human-readable message contracts from .proto definitions.
  • Generated classes for readers and writers in any language of choice, an important factor for offering the OpenSearch SDK in different languages.
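To make the compatibility point concrete, here is a minimal sketch of how a tagged wire format lets old and new readers coexist. It hand-rolls two Protobuf wire types (varint and length-delimited) in Python rather than using generated classes. The field names, field numbers, and the `OLD_SCHEMA`/`NEW_SCHEMA` dictionaries are hypothetical and are not taken from the experiment.

```python
def encode_varint(value):
    """Encode a non-negative int as a base-128 varint (Protobuf wire format)."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        value >>= 7
    out.append(value)
    return bytes(out)


def decode_varint(buf, pos):
    """Decode a varint starting at buf[pos]; return (value, next_pos)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte < 0x80:  # no continuation bit: last byte of the varint
            return result, pos
        shift += 7


def encode_field(number, value):
    """Encode one field: ints as wire type 0 (varint), strings as wire type 2."""
    if isinstance(value, int):
        return encode_varint(number << 3) + encode_varint(value)
    data = value.encode("utf-8")
    return encode_varint((number << 3) | 2) + encode_varint(len(data)) + data


def decode_message(buf, known_fields):
    """Decode buf, keeping only the field numbers listed in known_fields."""
    message, pos = {}, 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:
            value, pos = decode_varint(buf, pos)
        elif wire_type == 2:
            length, pos = decode_varint(buf, pos)
            value = buf[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if number in known_fields:  # unknown field numbers are simply skipped
            message[known_fields[number]] = value
    return message


# Hypothetical schemas: the newer one adds field 3.
OLD_SCHEMA = {1: "node_id", 2: "port"}
NEW_SCHEMA = {1: "node_id", 2: "port", 3: "roles"}

new_payload = encode_field(1, "node-1") + encode_field(2, 9300) + encode_field(3, "data")
old_payload = encode_field(1, "node-1") + encode_field(2, 9300)

# An old reader ignores the field it does not know about (forward compatibility).
seen_by_old_reader = decode_message(new_payload, OLD_SCHEMA)
# A new reader sees the added field as absent and can apply a default (backward compatibility).
seen_by_new_reader = decode_message(old_payload, NEW_SCHEMA)
```

Because every value carries its field number, neither side needs to know which version produced the payload. Real Protobuf runtimes handle more wire types than this sketch does.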

For extensions, Protobuf solves a lot of problems, but it adds a small serialization/deserialization overhead compared with OpenSearch's existing StreamInput/StreamOutput.

Next Steps

Based on what we have learned from the SDK/Extensions work, we see more potential for Protobuf integration in OpenSearch and would like to propose offering Protobuf as a new type:

  • Transport Layer: Implement StreamInput and StreamOutput with Protobuf serializers/deserializers. This would offer another type within the transport ecosystem, similar to ByteBufferStreamInput [3] etc.
    This would plug seamlessly into the Writeable [4] interface, which is used across the repo for transporting custom messages.

Adding Protobuf to the transport layer would bring significant performance benefits and seamless version compatibility to communication between OpenSearch nodes.
@nknize has already started making changes to enable this by restructuring XContent (#6470).
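For context on the versioning point, a positional stream in the style of StreamInput/StreamOutput carries no field tags, so a field added in a later version typically has to be gated by an explicit version check on both the write and the read side. The sketch below is a toy Python model of that pattern, not OpenSearch code: the `PositionalOutput`, `PositionalInput`, and `NodeInfo` classes, the `V1`/`V2` constants, and the fixed-width integer encoding are all assumptions made for illustration.

```python
import io
import struct

# Hypothetical wire versions, standing in for real version constants.
V1 = 1
V2 = 2


class PositionalOutput:
    """Toy stand-in for a stream output: values are written in order, untagged."""

    def __init__(self, version):
        self.version = version  # version spoken by the node on the other side
        self.buf = io.BytesIO()

    def write_int(self, value):
        self.buf.write(struct.pack(">i", value))  # 4-byte big-endian int

    def write_string(self, value):
        data = value.encode("utf-8")
        self.write_int(len(data))
        self.buf.write(data)


class PositionalInput:
    """Toy stand-in for a stream input: values must be read in the same order."""

    def __init__(self, data, version):
        self.version = version
        self.buf = io.BytesIO(data)

    def read_int(self):
        return struct.unpack(">i", self.buf.read(4))[0]

    def read_string(self):
        length = self.read_int()
        return self.buf.read(length).decode("utf-8")


class NodeInfo:
    """Hypothetical message; 'roles' is a field that was added in V2."""

    def __init__(self, name, port, roles=""):
        self.name, self.port, self.roles = name, port, roles

    def write_to(self, out):
        out.write_string(self.name)
        out.write_int(self.port)
        if out.version >= V2:  # new field must be gated by hand
            out.write_string(self.roles)

    @classmethod
    def read_from(cls, inp):
        name = inp.read_string()
        port = inp.read_int()
        roles = inp.read_string() if inp.version >= V2 else ""
        return cls(name, port, roles)


def round_trip(message, version):
    """Serialize and deserialize a message at the given wire version."""
    out = PositionalOutput(version)
    message.write_to(out)
    return NodeInfo.read_from(PositionalInput(out.buf.getvalue(), version))
```

Calling `round_trip(NodeInfo("node-1", 9300, "data"), V1)` drops `roles`, while the same call with `V2` keeps it. With a tagged format such as Protobuf, a reader skips field numbers it does not recognize, so this kind of manual gating is not needed for added fields.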

  • REST Layer: Implement a new XContent.Type to add Protobuf as an option. Historically, converting JSON <-> Protobuf has had performance implications, but on the REST layer, clients, OpenSearch Dashboards, and ingestion tools might benefit from talking over a binary format. (Yet to experiment.)

Additionally, having Protobuf at the REST layer would unblock gRPC support in OpenSearch (if we choose this path).
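As a rough illustration of what a binary format could mean for payloads on the REST layer, the sketch below encodes the same small record as JSON and as Protobuf-style tagged fields, then compares the byte counts. It hand-rolls the encoding in Python, handles only non-negative integers and strings, and uses a made-up record and field numbering. It measures payload size only, not CPU cost, so it says nothing about the JSON <-> Protobuf conversion overhead mentioned above.

```python
import json


def varint(value):
    """Encode a non-negative int as a base-128 varint."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def field(number, value):
    """Encode one field: ints as wire type 0, strings as wire type 2."""
    if isinstance(value, int):
        return varint(number << 3) + varint(value)
    data = value.encode("utf-8")
    return varint((number << 3) | 2) + varint(len(data)) + data


# Hypothetical record and field numbering, for illustration only.
record = {"name": "node-1", "port": 9300, "heap_percent": 42}
FIELD_NUMBERS = {"name": 1, "port": 2, "heap_percent": 3}

json_bytes = json.dumps(record).encode("utf-8")
proto_bytes = b"".join(field(FIELD_NUMBERS[key], value) for key, value in record.items())

print(len(json_bytes), len(proto_bytes))
```

The binary form is smaller mainly because field names are replaced by one-byte tags and numbers are not written as text. Whether that translates into an end-to-end benefit for clients is what the experiment would need to show.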

FAQ

Q. Is Protobuf more performant?
A. We moved two APIs to Protobuf: (a) Cat Nodes and (b) _search. With Protobuf, both APIs performed at least 20% better than with the native protocol, and we see the improvement grow linearly as the cluster size increases.

Q. What are the benchmark numbers for search?
A. See the OpenSearch benchmark results for querying with Protobuf: #10684 (comment)

Q. What are the benchmark numbers for Cat Nodes (operational APIs)?
A. See the benchmarking results: #6844 (comment)

Q. Is Protobuf in OpenSearch necessary to support gRPC?
A. Protobuf is a serialization format, which this proposal applies at OpenSearch's transport layer, while gRPC is a layer 7 RPC protocol. gRPC uses Protobuf as its default message format, which makes Protobuf a dependency. We presume there will be significant performance benefits with gRPC, as data would be transmitted in binary instead of JSON.

cc: @VachaShah @prudhvigodithi

[1] https://github.com/opensearch-project/opensearch-sdk-java
[2] https://protobuf.dev/overview/
[3] https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/common/io/stream/ByteBufferStreamInput.java
[4] https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/common/io/stream/Writeable.java
