[Feature Proposal] Writable Remote Index #7804
Description
This feature proposal is a work in progress. We will continue to add details to sections that are marked with ToDo.
Goal
As an extension to the remote store feature, searchable remote index introduces data tier support in OpenSearch. A hot index has data on local disk as well as in the remote store, whereas a warm index has data only in the remote store. The next step is a writable warm index. This RFC covers the requirements for writable warm, the different approaches to supporting writes, the pros and cons of each approach, and a recommended approach.
Background
This doc assumes the following index structure with data tiers. The example is provided just to highlight a sample pattern and can be changed per the user's requirements:
- `orders` - Live index; normal writes go to this index.
- `order-history-<DATE>` - The `orders` index is rotated on a daily basis and the rotated index is suffixed with the date.
- `orders-alias` - Points to indexes containing the last 30 days of data. `orders` is added to this alias with `is_write_index=true`. That means, if we use the alias to write data, it will always write to the `orders` index.
- The last 7 days of data are kept in the hot tier. That means indexes between `order-history-2023-02-22` and `order-history-2023-02-16` are hot indexes and can be written to in the same way we write data to an index today.
- Data that is 7 to 30 days old is removed from local nodes; the index metadata is still part of the cluster state. This becomes the warm tier. In this example, indexes between `order-history-2023-02-15` and `order-history-2023-01-16` are warm indexes.
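The tiering rule above can be sketched as a small classifier. This is an illustration only, not OpenSearch code; the 7-day hot window and the `tier_for_index` name are assumptions taken from the example:

```python
from datetime import date

HOT_DAYS = 7  # assumption from the example: last 7 days of data stay hot


def tier_for_index(index_name: str, today: date) -> str:
    """Classify a rotated order-history-<DATE> index as hot or warm."""
    if index_name == "orders":
        return "hot"  # the live write index is always hot
    # index names look like order-history-2023-02-15
    day = date.fromisoformat(index_name.removeprefix("order-history-"))
    return "hot" if (today - day).days < HOT_DAYS else "warm"


today = date(2023, 2, 22)
print(tier_for_index("order-history-2023-02-16", today))  # hot
print(tier_for_index("order-history-2023-02-15", today))  # warm
```

With `today` set to 2023-02-22, this reproduces the boundary in the example: `order-history-2023-02-16` is the oldest hot index and `order-history-2023-02-15` is the newest warm index.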
Requirements
Functional
- Support updates to existing documents without any changes on the client side
- Support appending data to a warm index
- Optimised append-only writes based on auto-generated IDs/data streams
- Refresh data post-writes after a configurable period or based on explicitly defined policies
Non-Functional
- Shouldn’t interfere with read performance
- Impact on write latency should be predictable and/or configurable
- Time required to make new changes visible should be configurable
- Minimal storage overhead for appends/updates
Non-Requirements
- Using the same index name (or alias) to write to a hot/warm index.
  - In phase 1, the user needs to provide the exact index to write data to. For example, writing to warm index `order-history-2023-02-22` would need that exact index name to be provided. Writing to the alias will only write to the live hot index.
  - In the next phase, we can support writing to a single index (the `orders` alias as per the example above). Based on a configured field (like `timestamp`), OpenSearch decides which index to write the data to. Even though this is a valid requirement, it can be built incrementally.
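The phase-2 idea of routing by a configured field can be sketched as follows. This is a hypothetical illustration; `backing_index_for` and the timestamp-based rule are assumptions built from the example index layout, not a proposed API:

```python
from datetime import date


def backing_index_for(doc: dict, today: date) -> str:
    """Pick the backing index for a write based on the document's
    configured timestamp field (hypothetical phase-2 routing sketch)."""
    ts = date.fromisoformat(doc["timestamp"][:10])
    if ts == today:
        return "orders"  # today's data goes to the live write index
    return f"order-history-{ts.isoformat()}"  # older data: rotated hot/warm index


print(backing_index_for({"timestamp": "2023-02-22T10:00:00"}, date(2023, 2, 22)))
# orders
print(backing_index_for({"timestamp": "2023-02-10T08:30:00"}, date(2023, 2, 22)))
# order-history-2023-02-10
```

In this sketch the client always writes to one logical target and the routing layer chooses the concrete index, which is why the doc can defer this to a later phase without changing the write path itself.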
Use Cases
Write New Data
Add new documents to an existing warm index. This use case is mostly driven by back-filling data that was not ingested earlier for some reason. It assumes that the user knows which index to write the new data to.
Update Existing Data
To update existing data, we need to fetch the existing document first. To improve latency, we perform block-level fetches. Once the document is fetched and the new changes are applied to it, the next step is the same as in Write New Data.
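The update flow described above can be sketched with in-memory stand-ins. `RemoteStore`, `RemoteTranslog`, and `update_document` are illustrative names, not OpenSearch classes; the point is the shape of the flow: block fetch, merge changes, then reuse the new-write append path:

```python
class RemoteStore:
    """In-memory stand-in for a block-addressable remote segment store."""

    def __init__(self, docs):
        self.blocks = dict(docs)

    def fetch_block(self, doc_id):
        # a real implementation would download only the byte range
        # (block) holding this document, not a whole segment file
        return dict(self.blocks[doc_id])


class RemoteTranslog:
    """In-memory stand-in for the remote translog (durable write log)."""

    def __init__(self):
        self.ops = []

    def append(self, op):
        self.ops.append(op)  # the write is acked only after this append


def update_document(store, translog, doc_id, changes):
    existing = store.fetch_block(doc_id)  # block-level fetch
    updated = {**existing, **changes}     # apply the partial update
    translog.append({"id": doc_id, "doc": updated})  # same path as Write New Data
    return updated


store = RemoteStore({"order-1": {"status": "placed", "qty": 2}})
log = RemoteTranslog()
print(update_document(store, log, "order-1", {"status": "shipped"}))
# {'status': 'shipped', 'qty': 2}
```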
Potential Approaches
These approaches address the Write New Data use case only, as the Update Existing Data use case internally depends on writing new data.
Recommended Approach
When a write request hits the warm index, we open the engine in read-write mode with the metadata from local disk. Alternatively, the warm index could keep its engine open in read-write mode from the start to support writes.
For non-append-only cases, we do a block fetch of the document that needs to be updated, then perform the update by writing to the remote translog before we ack back.
For append-only use cases, we can skip the block fetch altogether, since we know it is a new document, and write directly to the remote translog. After a configurable delay, we refresh the segments and move the newly created segments and updated bitsets to the remote segment store. More details of this approach will be covered in the design review.
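The append-only path above can be sketched as a minimal writer. This is not OpenSearch code; `WarmShardWriter` and its fields are assumptions chosen to show the two key properties: the write is acked only after the remote-translog append, and segments are created on a configurable delay rather than per write:

```python
import time


class WarmShardWriter:
    """Illustrative sketch of the recommended append-only write path:
    ack after the remote-translog append, refresh segments only after a
    configurable delay so small writes are batched into fewer segments."""

    def __init__(self, refresh_delay_s=60.0):
        self.translog = []          # stand-in for the remote translog
        self.remote_segments = []   # stand-in for the remote segment store
        self.pending = []           # docs written since the last refresh
        self.refresh_delay_s = refresh_delay_s
        self.last_refresh = time.monotonic()

    def index(self, doc):
        self.translog.append(doc)   # durable write first, then ack
        self.pending.append(doc)
        self._maybe_refresh()
        return "acked"

    def _maybe_refresh(self):
        if time.monotonic() - self.last_refresh >= self.refresh_delay_s:
            # one new segment per refresh: batching keeps segment count low
            self.remote_segments.append(list(self.pending))
            self.pending.clear()
            self.last_refresh = time.monotonic()


writer = WarmShardWriter(refresh_delay_s=0.0)  # refresh on every write, for demo
print(writer.index({"id": "order-9", "qty": 1}))  # acked
print(len(writer.remote_segments))                # 1
```

The `refresh_delay_s` knob corresponds to the "configurable delay" in the text: a larger value trades document visibility for fewer, larger segments uploaded to the remote store.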
Alternative Approaches
Download All Data
In this approach, we make the index hot by downloading all data from the remote store to local disk. Once the data is downloaded, new data is ingested into it. As this is a warm index, we can't keep the data on local disk forever. We wait for X minutes after the last write to avoid frequent re-downloads of the data, then flush and delete the data (and metadata, based on the data tier type) from local disk.
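The download-then-evict lifecycle can be sketched as a small state machine. `DownloadAllLifecycle` and its methods are illustrative names; the sketch shows the cost the text calls out: every cold write pays a full download, and eviction only happens after an idle window:

```python
import time


class DownloadAllLifecycle:
    """Sketch of the alternative approach: pull the whole index to local
    disk on first write, serve writes as hot, then flush and evict after
    an idle window (the 'X minutes' in the text)."""

    def __init__(self, idle_window_s):
        self.idle_window_s = idle_window_s
        self.local = False
        self.local_docs = []
        self.last_write = None

    def on_write(self, remote_docs, doc):
        if not self.local:
            # first write to a cold index downloads ALL remote data
            self.local_docs = list(remote_docs)
            self.local = True
        self.local_docs.append(doc)
        self.last_write = time.monotonic()

    def maybe_evict(self):
        """Flush to remote and drop the local copy after the idle window."""
        if self.local and time.monotonic() - self.last_write >= self.idle_window_s:
            flushed = self.local_docs
            self.local_docs = []
            self.local = False
            return flushed  # flushed back to the remote store
        return None
```

A longer idle window reduces repeated downloads under bursty back-fill traffic but holds warm data on local disk longer, which is the core trade-off of this approach.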
Comparison
Potential Issues
- Both of the above approaches can result in too many small segments, which impacts query performance. Even with concurrent segment search, a high segment count hurts overall performance. We need a way to limit the number of segments with the help of a background segment merger.
- The time to make documents visible will increase (it will not be the same as an index's refresh_interval).
Next Steps
- POC to check the feasibility of using `RemoteDirectory` instead of `FSDirectory` in `IndexShard.Store`
- Once concurrent segment search is introduced, we need to understand the impact of 1/5/10/100 segments on search and overall node performance (CPU, JVM, etc.)