Skip to content

Add indexer interface for shard interaction with underlying engines#20675

Merged
Bukhtawar merged 8 commits intoopensearch-project:mainfrom
mgodwan:os-abstraction
Mar 18, 2026
Merged

Add indexer interface for shard interaction with underlying engines#20675
Bukhtawar merged 8 commits intoopensearch-project:mainfrom
mgodwan:os-abstraction

Conversation

@mgodwan
Copy link
Copy Markdown
Member

@mgodwan mgodwan commented Feb 19, 2026

Summary

This PR introduces an Indexer interface that abstracts the interaction between IndexShard and the underlying storage engine. Today, IndexShard holds a direct reference to Engine and calls engine methods throughout its lifecycle — tightly coupling shard-level coordination logic to a single engine implementation. This change decouples that relationship by introducing Indexer as the primary contract through which IndexShard performs document operations, manages sequence numbers, controls lifecycle events (flush, refresh, force merge), and queries statistics. A concrete implementation, EngineBackedIndexer, wraps the existing Engine to preserve full backward compatibility. This is a foundational step toward enabling pluggable and composite engine architectures in OpenSearch, as described in RFC #20644.


Motivation

OpenSearch's engine layer is responsible for indexing, searching, and managing the lifecycle of data on a shard. As the project evolves toward supporting composite engines (e.g., a primary Lucene engine paired with a secondary vector or columnar engine), the current tight coupling between IndexShard and Engine becomes a bottleneck. There is no clean abstraction boundary that allows a shard to interact with multiple engine implementations or swap engine behavior without modifying IndexShard directly.

This PR establishes that abstraction boundary.


Architecture: Before vs. After

Before (Tightly Coupled)

┌─────────────────────────────────────────────┐
│                  IndexShard                 │
│                                             │
│  index()  delete()  flush()  refresh()  ... │
│       │        │       │        │           │
│       └────────┴───────┴────────┘           │
│                    │                        │
│             Direct Engine ref               │
└────────────────────┼────────────────────────┘
                     │
                     ▼
           ┌─────────────────┐
           │     Engine      │  (InternalEngine / NRTReplicationEngine / etc.)
           └─────────────────┘

After (Interface-Abstracted)

┌─────────────────────────────────────────────┐
│                  IndexShard                 │
│                                             │
│  index()  delete()  flush()  refresh()  ... │
│       │        │       │        │           │
│       └────────┴───────┴────────┘           │
│                    │                        │
│             Indexer interface ref           │
└────────────────────┼────────────────────────┘
                     │
                     ▼
           ┌─────────────────────┐
           │      Indexer        │  <<interface>>
           │  (LifecycleAware,   │
           │   Closeable)        │
           └──────────┬──────────┘
                      │
          ┌───────────┴────────────┐
          │                        │
          ▼                        ▼
┌──────────────────────┐   ┌──────────────────────────┐
│  EngineBackedIndexer │   │  (Future: Composite-      │
│  (wraps Engine)      │   │   IndexerImpl, etc.)      │
└──────────┬───────────┘   └──────────────────────────┘
           │
           ▼
  ┌─────────────────┐
  │     Engine      │
  └─────────────────┘

New Class Structure

org.opensearch.index.engine
├── EngineBackedIndexer.java          ← Concrete Indexer wrapping Engine
├── Engine.java                       ← Minor cleanup (unintended changes removed)
│
└── exec/
    ├── Indexer.java                  ← Core interface (@ExperimentalApi)
    ├── Segment.java                  ← Segment metadata model
    ├── MergeResult.java              ← Result of a merge operation
    ├── WriterFileSet.java            ← File set produced by a writer
    └── DataFormat.java               ← Data format identifier abstraction

org.opensearch.index.shard
├── IndexShard.java                   ← Migrated to use Indexer interface
└── RemoteStoreRefreshListener.java   ← Migrated to use Indexer interface

Indexer Interface Responsibilities

┌──────────────────────────────────────────────────────┐
│                    Indexer Interface                 │
├──────────────────────────────────────────────────────┤
│  Document Operations                                 │
│    index(Engine.Index)   →  Engine.IndexResult       │
│    delete(Engine.Delete) →  Engine.DeleteResult      │
│    noOp(Engine.NoOp)     →  Engine.NoOpResult        │
│    prepareIndex(...)     →  Engine.Index             │
│    prepareDelete(...)    →  Engine.Delete            │
├──────────────────────────────────────────────────────┤
│  Sequence Number & Checkpoint Management             │
│    getLocalCheckpoint()                              │
│    getMaxSeqNo()                                     │
│    advanceMaxSeqNoOfUpdatesOrDeletes(long)           │
│    currentOngoingRefreshCheckpoint()                 │
├──────────────────────────────────────────────────────┤
│  Lifecycle Operations                                │
│    flush(FlushRequest)                               │
│    refresh(String)                                   │
│    forceMerge(ForceMergeRequest)                     │
│    close()                                           │
├──────────────────────────────────────────────────────┤
│  Statistics & Monitoring                             │
│    docStats()                                        │
│    segmentStats(...)                                 │
│    mergeStats()                                      │
│    getTranslogStats()                                │
└──────────────────────────────────────────────────────┘

EngineBackedIndexer: Delegation Model

IndexShard.getIndexer()
        │
        ▼
EngineBackedIndexer
        │
        │  delegates all calls to:
        ▼
    Engine (volatile ref)
        │
        ├── InternalEngine
        └── NRTReplicationEngine

The engine reference inside EngineBackedIndexer is volatile and mirrors the existing engine swap semantics in IndexShard. The getEngine() accessor is intentionally marked for future removal once all callers migrate away from direct engine access.


Migration Path

Phase 1 (This PR)
─────────────────
IndexShard  ──uses──►  Indexer (via EngineBackedIndexer)
RemoteStoreRefreshListener  ──uses──►  Indexer

Phase 2 (Future)
────────────────
Deprecate & remove getEngine() from EngineBackedIndexer
Introduce SearcherInterface (tracked via TODO comments)
Deprecate direct acquireSearcher / acquireLastIndexCommit paths

Phase 3 (Future)
────────────────
Introduce CompositeIndexer for multi-engine shards
Register DataFormats via DataFormatRegistry at plugin bootstrap

Related Issues & PRs


Check List

  • Functionality includes testing
  • API changes companion pull request created (pending interface stabilization)
  • Public documentation issue/PR created (pending)
  • Changelog entry added

Notes for Reviewers

  • The Indexer interface is annotated @ExperimentalApi — the contract is expected to evolve as composite engine support matures.
  • EngineBackedIndexer.getEngine() is a temporary escape hatch for callers not yet migrated; it is documented for removal.
  • Several IndexShard methods are marked @Deprecated with TODO comments pending a SearcherInterface companion PR.
  • The applyOnEngine utility method currently throws IllegalStateException for non-EngineBackedIndexer instances; this will be addressed as the interface matures.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Feb 19, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Tip

Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 3, 2026

PR Reviewer Guide 🔍

(Review updated until commit e403aa9)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
📝 TODO sections

🔀 Multiple PR themes

Sub-PR theme: Introduce Indexer interface and EngineBackedIndexer implementation

Relevant files:

  • server/src/main/java/org/opensearch/index/engine/exec/Indexer.java
  • server/src/main/java/org/opensearch/index/engine/exec/IndexerEngineOperations.java
  • server/src/main/java/org/opensearch/index/engine/exec/IndexerLifecycleOperations.java
  • server/src/main/java/org/opensearch/index/engine/exec/IndexerStateManager.java
  • server/src/main/java/org/opensearch/index/engine/exec/IndexerStatistics.java
  • server/src/main/java/org/opensearch/index/engine/exec/CatalogSnapshot.java
  • server/src/main/java/org/opensearch/index/engine/exec/WriterFileSet.java
  • server/src/main/java/org/opensearch/index/engine/exec/Segment.java
  • server/src/main/java/org/opensearch/index/engine/exec/package-info.java
  • server/src/main/java/org/opensearch/index/engine/EngineBackedIndexer.java

Sub-PR theme: Migrate IndexShard to use Indexer interface instead of Engine

Relevant files:

  • server/src/main/java/org/opensearch/index/shard/IndexShard.java
  • server/src/test/java/org/opensearch/index/shard/IndexShardTests.java
  • server/src/test/java/org/opensearch/index/shard/IndexShardRetentionLeaseTests.java

Sub-PR theme: Introduce DataFormat plugin and execution engine abstractions

Relevant files:

  • server/src/main/java/org/opensearch/index/engine/dataformat/DataFormat.java
  • server/src/main/java/org/opensearch/index/engine/dataformat/DataFormatPlugin.java
  • server/src/main/java/org/opensearch/index/engine/dataformat/IndexingExecutionEngine.java
  • server/src/main/java/org/opensearch/index/engine/dataformat/Writer.java
  • server/src/main/java/org/opensearch/index/engine/dataformat/Merger.java
  • server/src/main/java/org/opensearch/index/engine/dataformat/MergeInput.java
  • server/src/main/java/org/opensearch/index/engine/dataformat/MergeResult.java
  • server/src/main/java/org/opensearch/index/engine/dataformat/RefreshInput.java
  • server/src/main/java/org/opensearch/index/engine/dataformat/RefreshResult.java
  • server/src/main/java/org/opensearch/index/engine/dataformat/WriteResult.java
  • server/src/main/java/org/opensearch/index/engine/dataformat/DocumentInput.java
  • server/src/main/java/org/opensearch/index/engine/dataformat/FileInfos.java
  • server/src/main/java/org/opensearch/index/engine/dataformat/FieldTypeCapabilities.java
  • server/src/main/java/org/opensearch/index/engine/dataformat/RowIdMapping.java
  • server/src/main/java/org/opensearch/index/engine/dataformat/package-info.java
  • server/src/test/java/org/opensearch/index/engine/dataformat/DataFormatPluginTests.java

⚡ Recommended focus areas for review

Leaky Abstraction

The applyOnEngine helper method casts Indexer back to EngineBackedIndexer and then calls getEngine() to apply a function directly on the underlying Engine. This pattern is used in many places (e.g., get, minimumCompatibleVersion, segments, acquireLastIndexCommit, acquireSearcherSupplier, acquireSearcher, snapshotStoreMetadata, resetEngineToGlobalCheckpoint). This defeats the purpose of the abstraction — any non-EngineBackedIndexer implementation will throw IllegalStateException, making the interface effectively non-pluggable for those operations.

public static <T> T applyOnEngine(Indexer indexer, Function<Engine, T> applier) {
    if (indexer instanceof EngineBackedIndexer engine) {
        return applier.apply(engine.getEngine());
    } else {
        throw new IllegalStateException("Cannot apply function on indexer " + indexer.getClass() + " directly on IndexShard");
    }
}
Inconsistent Abstraction

Several methods still break through the Indexer abstraction by checking instanceof EngineBackedIndexer and calling getEngine() directly (e.g., getReplicationEngine, getSegmentInfosSnapshot, updateShardIngestionState, getIngestionState, testRecordsForceMerges). This means the abstraction is incomplete and any future non-EngineBackedIndexer implementation would fail at runtime for these code paths.

if (getIndexer() instanceof EngineBackedIndexer indexer && indexer.getEngine() instanceof NRTReplicationEngine nrtEngine) {
    return Optional.of(nrtEngine);
} else {
    return Optional.empty();
Duplicate getIndexer() calls

In applyIndexOperation, when the early-return path is taken (line 1207), getIndexer() is called again instead of reusing the indexer parameter already passed to the method. This introduces an unnecessary second lookup and a subtle inconsistency — the indexer used for prepareIndex and the one used for index could theoretically differ if the engine is swapped concurrently.

    UNASSIGNED_SEQ_NO,
    0
);
return getIndexer().index(index);
Deprecated Without Replacement

acquireLastIndexCommit, acquireLastIndexCommitAndRefresh, acquireSearcherSupplier, acquireSearcher, wrapSearcher, and getSegmentInfosSnapshot are marked @Deprecated with TODO comments, but no replacement API or migration path is provided in this PR. Deprecating public/package-visible methods without a clear successor may confuse consumers and leave the codebase in an inconsistent state.

@Deprecated
public GatedCloseable<IndexCommit> acquireLastIndexCommit(boolean flushFirst) throws EngineException {
    final IndexShardState state = this.state; // one time volatile read
    // we allow snapshot on closed index shard, since we want to do one after we close the shard and before we close the engine
    if (state == IndexShardState.STARTED || state == IndexShardState.CLOSED) {
        return applyOnEngine(getIndexer(), engine -> engine.acquireLastIndexCommit(flushFirst));
    } else {
        throw new IllegalIndexShardStateException(shardId, state, "snapshot is not allowed");
    }
}

@Deprecated
public GatedCloseable<IndexCommit> acquireLastIndexCommitAndRefresh(boolean flushFirst) throws EngineException {
    GatedCloseable<IndexCommit> indexCommit = acquireLastIndexCommit(flushFirst);
    getIndexer().refresh("Snapshot for Remote Store based Shard");
Test Coupling

Tests such as testRecordsForceMerges and testReadOnlyReplicaEngineConfig use double-casting like ((InternalEngine) ((EngineBackedIndexer) shard.getIndexer()).getEngine()), which tightly couples tests to the concrete EngineBackedIndexer and InternalEngine implementations. This undermines the abstraction being introduced and will break if the engine type changes.

final String initialForceMergeUUID = ((InternalEngine) ((EngineBackedIndexer) shard.getIndexer()).getEngine()).getForceMergeUUID();
assertThat(initialForceMergeUUID, nullValue());
final ForceMergeRequest firstForceMergeRequest = new ForceMergeRequest().maxNumSegments(1);
shard.forceMerge(firstForceMergeRequest);
final String secondForceMergeUUID = ((InternalEngine) ((EngineBackedIndexer) shard.getIndexer()).getEngine()).getForceMergeUUID();
assertThat(secondForceMergeUUID, notNullValue());
assertThat(secondForceMergeUUID, equalTo(firstForceMergeRequest.forceMergeUUID()));
final ForceMergeRequest secondForceMergeRequest = new ForceMergeRequest().maxNumSegments(1);
shard.forceMerge(secondForceMergeRequest);
final String thirdForceMergeUUID = ((InternalEngine) ((EngineBackedIndexer) shard.getIndexer()).getEngine()).getForceMergeUUID();

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 3, 2026

PR Code Suggestions ✨

Latest suggestions up to e403aa9

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
General
Fix misleading assertion error message

The second assertion error message incorrectly references engine.getClass() instead
of translogManager.getClass(). This makes the error message misleading when the
assertion fires, as it will print the engine class rather than the actual translog
manager class.

test/framework/src/main/java/org/opensearch/index/shard/IndexShardTestCase.java [1575-1576]

 public static Translog getTranslog(IndexShard shard) {
     Indexer indexer = getIndexer(shard);
     Engine engine = getEngine(shard);
     assert engine instanceof InternalEngine || engine instanceof NRTReplicationEngine
         : "only InternalEngines or NRTReplicationEngines have translogs, got: " + engine.getClass();
     indexer.ensureOpen();
     TranslogManager translogManager = indexer.translogManager();
     assert translogManager instanceof InternalTranslogManager : "only InternalTranslogManager have translogs, got: "
-        + engine.getClass();
+        + translogManager.getClass();
Suggestion importance[1-10]: 7

__

Why: The assertion error message uses engine.getClass() instead of translogManager.getClass(), which would print the wrong class when the assertion fires. This is a clear bug in the error message that would make debugging harder.

Medium
Reduce visibility of internal engine accessor method

The applyOnEngine method is public static but is only intended for internal use
within IndexShard. Making it public exposes the internal Engine abstraction to
external callers, undermining the purpose of the Indexer interface. It should be
private static or at most package-private to enforce encapsulation.

server/src/main/java/org/opensearch/index/shard/IndexShard.java [5984-5990]

-public static <T> T applyOnEngine(Indexer indexer, Function<Engine, T> applier) {
+static <T> T applyOnEngine(Indexer indexer, Function<Engine, T> applier) {
     if (indexer instanceof EngineBackedIndexer engine) {
         return applier.apply(engine.getEngine());
     } else {
         throw new IllegalStateException("Cannot apply function on indexer " + indexer.getClass() + " directly on IndexShard");
     }
 }
Suggestion importance[1-10]: 5

__

Why: The applyOnEngine method is public static but exposes the internal Engine abstraction to external callers, undermining the Indexer interface encapsulation. Reducing visibility to package-private would be a valid improvement, though this is a style/design concern rather than a critical bug.

Low
Avoid silently swallowing IO errors in size calculation

Silently swallowing IOException in getTotalSize() by returning 0 can lead to
incorrect size calculations without any indication of failure. This could cause
subtle bugs in merge decisions or storage management that depend on accurate file
sizes.

server/src/main/java/org/opensearch/index/engine/exec/WriterFileSet.java [29-37]

 public long getTotalSize() {
     return files.stream().mapToLong(file -> {
         try {
             return java.nio.file.Files.size(Path.of(directory, file));
         } catch (IOException e) {
-            return 0;
+            throw new UncheckedIOException("Failed to get size of file: " + file + " in directory: " + directory, e);
         }
     }).sum();
 }
Suggestion importance[1-10]: 5

__

Why: Silently returning 0 on IOException in getTotalSize() can lead to incorrect size calculations that affect merge and storage decisions. Throwing an UncheckedIOException would make failures visible, though this is a new experimental API so the impact is moderate.

Low
Fix misleading variable name after API change

The variable is named engineOrNull but getEngine() (which casts to
EngineBackedIndexer) will throw a ClassCastException rather than returning null if
the indexer is not engine-backed. The variable name is misleading and the null check
is no longer meaningful given the new implementation.

server/src/test/java/org/opensearch/index/shard/SegmentReplicationIndexShardTests.java [1007-1009]

-final Engine engineOrNull = getEngine(replicaShard);
-assertNotNull(engineOrNull);
-assertTrue(engineOrNull instanceof ReadOnlyEngine);
+final Engine engine = getEngine(replicaShard);
+assertNotNull(engine);
+assertTrue(engine instanceof ReadOnlyEngine);
Suggestion importance[1-10]: 3

__

Why: The variable name engineOrNull is misleading since getEngine() no longer returns null but throws on failure. Renaming improves readability, but this is a minor cosmetic issue in test code with low functional impact.

Low
Clarify engine replacement and close sequencing

When currentEngineReference.getAndSet(new EngineBackedIndexer(readOnlyEngine)) is
called, the previously held Indexer is returned and passed to IOUtils.close.
However, if EngineBackedIndexer construction fails (unlikely here but possible in
future), the old engine would not be closed. More critically, the old Indexer being
closed here is itself an EngineBackedIndexer wrapping an engine — ensure that
closing the EngineBackedIndexer properly delegates close() to the underlying engine,
which it does, but this pattern should be verified to avoid double-close if the same
engine is wrapped multiple times.

server/src/main/java/org/opensearch/index/shard/IndexShard.java [5288]

-IOUtils.close(currentEngineReference.getAndSet(new EngineBackedIndexer(readOnlyEngine)));
+final Indexer oldIndexer = currentEngineReference.getAndSet(new EngineBackedIndexer(readOnlyEngine));
+IOUtils.close(oldIndexer);
Suggestion importance[1-10]: 2

__

Why: The suggestion to extract the old indexer into a variable before closing is a minor readability improvement. The existing code IOUtils.close(currentEngineReference.getAndSet(...)) is functionally equivalent and the concern about double-close is not a real issue in this context.

Low
Possible issue
Avoid calling test infrastructure methods from static nested class constructors

createTempDir() is called inside the constructor of a static inner class. In
OpenSearch test infrastructure, createTempDir() is an instance method of
OpenSearchTestCase and may not be accessible or may behave unexpectedly when called
from a static nested class constructor. This could cause test failures or resource
leaks. The directory should be passed as a constructor parameter instead.

server/src/test/java/org/opensearch/index/engine/dataformat/DataFormatPluginTests.java [353-356]

-MockIndexingExecutionEngine(MockDataFormat dataFormat) {
+MockIndexingExecutionEngine(MockDataFormat dataFormat, Path directory) {
     this.dataFormat = dataFormat;
-    this.directory = createTempDir();
+    this.directory = directory;
 }
Suggestion importance[1-10]: 6

__

Why: Calling createTempDir() inside a static nested class constructor is problematic since it's an instance method of OpenSearchTestCase. This could cause NullPointerException or unexpected behavior. Passing the directory as a constructor parameter is the correct approach.

Low
Add safe cast check before casting indexer

This cast to EngineBackedIndexer will throw a ClassCastException if the indexer is
not an EngineBackedIndexer. Unlike other places in the same file that use a safe
instanceof check, this performs an unchecked cast without validation, which could
cause unexpected failures for non-engine-backed indexers.

test/framework/src/main/java/org/opensearch/index/shard/IndexShardTestCase.java [1361]

-return EngineTestCase.getDocIds(((EngineBackedIndexer) shard.getIndexer()).getEngine(), true);
+Indexer indexer = shard.getIndexer();
+if (!(indexer instanceof EngineBackedIndexer)) {
+    throw new IllegalStateException("getDocIdAndSeqNos requires an EngineBackedIndexer, got: " + indexer.getClass());
+}
+return EngineTestCase.getDocIds(((EngineBackedIndexer) indexer).getEngine(), true);
Suggestion importance[1-10]: 4

__

Why: While the suggestion is valid in pointing out the unchecked cast, this is a test utility method and the existing pattern in the codebase uses direct casts in similar contexts (e.g., getEngine). The improvement adds defensive coding but is low priority for test code.

Low

Previous suggestions

Suggestions up to commit 4d70feb
CategorySuggestion                                                                                                                                    Impact
Possible issue
Replace hardcoded paths with portable temp directories

The test uses hardcoded absolute paths /tmp/uuid/0 for ShardPath, which may not
exist or be writable on all test environments (e.g., Windows or sandboxed CI
environments). Use createTempDir() (already available in OpenSearchTestCase) to
create proper temporary directories for the shard path.

server/src/test/java/org/opensearch/index/engine/dataformat/DataFormatPluginTests.java [65-69]

+Path shardDir = createTempDir();
 IndexingExecutionEngine<DataFormat, MockDocumentInput> engine = plugin.indexingEngine(
     mock(MapperService.class),
-    new ShardPath(false, Path.of("/tmp/uuid/0"), Path.of("/tmp/uuid/0"), new ShardId("index", "uuid", 0)),
+    new ShardPath(false, shardDir, shardDir, new ShardId("index", "uuid", 0)),
     new IndexSettings(IndexMetadata.builder("index").settings(settings).build(), settings)
 );
Suggestion importance[1-10]: 7

__

Why: Using hardcoded /tmp/uuid/0 paths in tests is a portability issue that can cause failures on Windows or sandboxed CI environments. Using createTempDir() from OpenSearchTestCase is the correct approach and is already used elsewhere in the same test class.

Medium
Make unsupported engine operations fail explicitly

The Indexer.super.lastRefreshedCheckpoint() and
Indexer.super.currentOngoingRefreshCheckpoint() default implementations throw
UnsupportedOperationException, so wrapping a non-InternalEngine will silently fall
through to an exception at runtime. This is a hidden failure mode that should be
made explicit. Consider throwing a descriptive UnsupportedOperationException with
the actual engine type, or defining a proper interface/abstract method on Engine for
these operations.

server/src/main/java/org/opensearch/index/engine/EngineBackedIndexer.java [358-371]

 @Override
 public long lastRefreshedCheckpoint() {
     if (engine instanceof InternalEngine) {
         return ((InternalEngine) engine).lastRefreshedCheckpoint();
     }
-    return Indexer.super.lastRefreshedCheckpoint();
+    throw new UnsupportedOperationException(
+        "lastRefreshedCheckpoint is not supported for engine type: " + engine.getClass().getSimpleName()
+    );
 }
 
 @Override
 public long currentOngoingRefreshCheckpoint() {
     if (engine instanceof InternalEngine) {
         return ((InternalEngine) engine).currentOngoingRefreshCheckpoint();
     }
-    return Indexer.super.currentOngoingRefreshCheckpoint();
+    throw new UnsupportedOperationException(
+        "currentOngoingRefreshCheckpoint is not supported for engine type: " + engine.getClass().getSimpleName()
+    );
 }
Suggestion importance[1-10]: 6

__

Why: The fallback to Indexer.super.lastRefreshedCheckpoint() and Indexer.super.currentOngoingRefreshCheckpoint() silently delegates to a default that throws UnsupportedOperationException without context. The improved code makes this failure explicit with a descriptive message including the engine type, which is a meaningful improvement for debuggability.

Low
Avoid silently swallowing IO errors in size calculation

Silently swallowing IOException in getTotalSize() will return incorrect totals
without any indication of failure. This can lead to silent data inconsistencies when
files are missing or inaccessible. Consider logging the error or propagating it as
an unchecked exception.

server/src/main/java/org/opensearch/index/engine/exec/WriterFileSet.java [30-37]

 public long getTotalSize() {
     return files.stream().mapToLong(file -> {
         try {
             return java.nio.file.Files.size(Path.of(directory, file));
         } catch (IOException e) {
-            return 0;
+            throw new UncheckedIOException("Failed to get size of file: " + file, e);
         }
     }).sum();
 }
Suggestion importance[1-10]: 5

__

Why: Silently catching IOException and returning 0 can produce incorrect size totals without any indication of failure. However, since getTotalSize() is used in a stream context and the interface doesn't declare IOException, the fix requires careful consideration of the API contract.

Low
Add null and type safety check before casting indexer

This performs an unchecked cast of shard.getIndexer() to EngineBackedIndexer without
any null check or type check, which will throw a NullPointerException or
ClassCastException if the indexer is null or not an EngineBackedIndexer. The other
similar call sites in this file use a safer pattern with instanceof checks.

test/framework/src/main/java/org/opensearch/index/shard/IndexShardTestCase.java [1361]

-return EngineTestCase.getDocIds(((EngineBackedIndexer) shard.getIndexer()).getEngine(), true);
+Indexer indexer = shard.getIndexer();
+if (!(indexer instanceof EngineBackedIndexer)) {
+    throw new IllegalStateException("Expected EngineBackedIndexer but got: " + (indexer == null ? "null" : indexer.getClass()));
+}
+return EngineTestCase.getDocIds(((EngineBackedIndexer) indexer).getEngine(), true);
Suggestion importance[1-10]: 4

__

Why: The unchecked cast to EngineBackedIndexer could throw a ClassCastException at runtime. However, this is test code and the pattern is consistent with how getEngine() works in the same file, so the risk is lower in practice.

Low
General
Fix misleading assertion error message for translog manager

The second assert error message incorrectly references engine.getClass() instead of
translogManager.getClass(), which will produce a misleading error message when the
assertion fails. This makes debugging harder when a non-InternalTranslogManager is
encountered.

test/framework/src/main/java/org/opensearch/index/shard/IndexShardTestCase.java [1568-1579]

 public static Translog getTranslog(IndexShard shard) {
     Indexer indexer = getIndexer(shard);
     Engine engine = getEngine(shard);
     assert engine instanceof InternalEngine || engine instanceof NRTReplicationEngine
         : "only InternalEngines or NRTReplicationEngines have translogs, got: " + engine.getClass();
     indexer.ensureOpen();
     TranslogManager translogManager = indexer.translogManager();
     assert translogManager instanceof InternalTranslogManager : "only InternalTranslogManager have translogs, got: "
-        + engine.getClass();
+        + translogManager.getClass();
     InternalTranslogManager internalTranslogManager = (InternalTranslogManager) translogManager;
     return internalTranslogManager.getTranslog();
 }
Suggestion importance[1-10]: 6

__

Why: The assertion error message incorrectly uses engine.getClass() instead of translogManager.getClass(), which would produce a misleading diagnostic when the assertion fails. This is a clear copy-paste bug that reduces debuggability.

Low
Restrict visibility of engine-unwrapping utility method

The applyOnEngine method is public static but it exposes internal implementation
details (the EngineBackedIndexer wrapping) to external callers, breaking
encapsulation. Since this method throws for any non-EngineBackedIndexer
implementation, it will cause runtime failures when new Indexer implementations are
introduced, defeating the purpose of the abstraction. Consider making it private or
package-private to limit its scope, or better yet, add the needed methods directly
to the Indexer interface.

server/src/main/java/org/opensearch/index/shard/IndexShard.java [5984-5990]

-public static <T> T applyOnEngine(Indexer indexer, Function<Engine, T> applier) {
+static <T> T applyOnEngine(Indexer indexer, Function<Engine, T> applier) {
     if (indexer instanceof EngineBackedIndexer engine) {
         return applier.apply(engine.getEngine());
     } else {
         throw new IllegalStateException("Cannot apply function on indexer " + indexer.getClass() + " directly on IndexShard");
     }
 }
Suggestion importance[1-10]: 5

__

Why: The applyOnEngine method being public static exposes internal EngineBackedIndexer details to external callers, breaking the abstraction. However, changing to package-private (static) is a minor encapsulation improvement and doesn't fix the fundamental issue of runtime failures for non-EngineBackedIndexer implementations.

Low
Handle null indexer and improve error messaging

The getSegmentInfosSnapshot() method calls getIndexer() which throws
AlreadyClosedException if the engine is null, but the method's contract (as a
snapshot method) should handle the closed state gracefully. Additionally, the
IllegalStateException thrown for non-EngineBackedIndexer types gives no context
about the actual indexer type, making debugging harder.

server/src/main/java/org/opensearch/index/shard/IndexShard.java [5785-5790]

 @Deprecated
 public GatedCloseable<SegmentInfos> getSegmentInfosSnapshot() {
-    if (getIndexer() instanceof EngineBackedIndexer indexer) {
-        return indexer.getEngine().getSegmentInfosSnapshot();
+    final Indexer indexer = getIndexerOrNull();
+    if (indexer == null) {
+        throw new AlreadyClosedException("engine is closed");
     }
-    throw new IllegalStateException("Cannot request SegmentInfos directly on IndexShard");
+    if (indexer instanceof EngineBackedIndexer engineBackedIndexer) {
+        return engineBackedIndexer.getEngine().getSegmentInfosSnapshot();
+    }
+    throw new IllegalStateException(
+        "Cannot request SegmentInfos directly on IndexShard for indexer type: " + indexer.getClass().getSimpleName()
+    );
 }
Suggestion importance[1-10]: 4

__

Why: The suggestion to use getIndexerOrNull() instead of getIndexer() is a minor defensive improvement, but getIndexer() already throws AlreadyClosedException with the same message, so the behavior difference is minimal. The improved error message for non-EngineBackedIndexer types is a small readability improvement.

Low
Suggestions up to commit b7133e3
CategorySuggestion                                                                                                                                    Impact
General
Fix misleading assertion error message class reference

The second assertion error message incorrectly references engine.getClass() instead
of translogManager.getClass(). This makes the error message misleading when the
assertion fires, as it will print the engine class rather than the actual translog
manager class that violated the assertion.

test/framework/src/main/java/org/opensearch/index/shard/IndexShardTestCase.java [1575-1578]

 public static Translog getTranslog(IndexShard shard) {
     Indexer indexer = getIndexer(shard);
     Engine engine = getEngine(shard);
     assert engine instanceof InternalEngine || engine instanceof NRTReplicationEngine
         : "only InternalEngines or NRTReplicationEngines have translogs, got: " + engine.getClass();
     indexer.ensureOpen();
     TranslogManager translogManager = indexer.translogManager();
     assert translogManager instanceof InternalTranslogManager : "only InternalTranslogManager have translogs, got: "
-        + engine.getClass();
+        + translogManager.getClass();
     InternalTranslogManager internalTranslogManager = (InternalTranslogManager) translogManager;
     return internalTranslogManager.getTranslog();
 }
Suggestion importance[1-10]: 7

__

Why: The assertion error message uses engine.getClass() instead of translogManager.getClass(), which would print the wrong class when the assertion fires. This is a genuine bug in the error message that would make debugging harder.

Medium
Reduce visibility of internal engine accessor method

The applyOnEngine method is public static but is only intended for internal use
within IndexShard. Making it public exposes the internal Engine abstraction to
external callers, undermining the purpose of the Indexer interface. It should be
private static or at most package-private to prevent external callers from bypassing
the Indexer abstraction.

server/src/main/java/org/opensearch/index/shard/IndexShard.java [5984-5990]

-public static <T> T applyOnEngine(Indexer indexer, Function<Engine, T> applier) {
+static <T> T applyOnEngine(Indexer indexer, Function<Engine, T> applier) {
     if (indexer instanceof EngineBackedIndexer engine) {
         return applier.apply(engine.getEngine());
     } else {
         throw new IllegalStateException("Cannot apply function on indexer " + indexer.getClass() + " directly on IndexShard");
     }
 }
Suggestion importance[1-10]: 5

__

Why: The applyOnEngine method is public static but exposes the internal Engine abstraction to external callers, undermining the Indexer abstraction. Reducing visibility to package-private would be a reasonable improvement, though this is a design/style concern rather than a critical bug.

Low
Replace hardcoded temp paths with test-managed directories

Using hardcoded /tmp/uuid/0 paths in tests is fragile and may fail on systems where
/tmp is not writable or has different permissions. The test base class
OpenSearchTestCase provides createTempDir() for creating temporary directories that
are properly cleaned up after the test.

server/src/test/java/org/opensearch/index/engine/dataformat/DataFormatPluginTests.java [67]

-new ShardPath(false, Path.of("/tmp/uuid/0"), Path.of("/tmp/uuid/0"), new ShardId("index", "uuid", 0)),
+Path tempDir = createTempDir();
+new ShardPath(false, tempDir, tempDir, new ShardId("index", "uuid", 0)),
Suggestion importance[1-10]: 5

__

Why: Using hardcoded /tmp/uuid/0 paths is fragile and may fail on systems where /tmp is not writable or has different permissions. Using createTempDir() from the test base class is the correct approach for test portability and cleanup.

Low
Log silently swallowed IO exceptions in size calculation

Silently swallowing IOException in getTotalSize() by returning 0 can mask real
errors such as missing files or permission issues, leading to incorrect size
calculations without any indication of failure. Consider logging the exception or
propagating it so callers are aware of the problem.

server/src/main/java/org/opensearch/index/engine/exec/WriterFileSet.java [29-37]

 public long getTotalSize() {
     return files.stream().mapToLong(file -> {
         try {
             return java.nio.file.Files.size(Path.of(directory, file));
         } catch (IOException e) {
+            LogManager.getLogger(WriterFileSet.class).warn("Failed to get size of file: " + file, e);
             return 0;
         }
     }).sum();
 }
Suggestion importance[1-10]: 4

__

Why: Silently swallowing IOException in getTotalSize() can mask real errors, but this is a minor improvement suggestion for a new utility class. Adding logging would improve observability without changing behavior.

Low
Provide explicit error for unsupported engine types

The default implementations of lastRefreshedCheckpoint() and
currentOngoingRefreshCheckpoint() in the Indexer interface throw
UnsupportedOperationException. When the wrapped engine is not an InternalEngine,
these methods will throw, which could cause unexpected runtime failures. The
fallback should either return a safe default value (e.g.,
SequenceNumbers.NO_OPS_PERFORMED) or document clearly that non-InternalEngine
engines are unsupported.

server/src/main/java/org/opensearch/index/engine/EngineBackedIndexer.java [358-371]

 @Override
 public long lastRefreshedCheckpoint() {
     if (engine instanceof InternalEngine) {
         return ((InternalEngine) engine).lastRefreshedCheckpoint();
     }
-    return Indexer.super.lastRefreshedCheckpoint();
+    throw new UnsupportedOperationException(
+        "lastRefreshedCheckpoint is not supported for engine type: " + engine.getClass().getSimpleName()
+    );
 }
 
 @Override
 public long currentOngoingRefreshCheckpoint() {
     if (engine instanceof InternalEngine) {
         return ((InternalEngine) engine).currentOngoingRefreshCheckpoint();
     }
-    return Indexer.super.currentOngoingRefreshCheckpoint();
+    throw new UnsupportedOperationException(
+        "currentOngoingRefreshCheckpoint is not supported for engine type: " + engine.getClass().getSimpleName()
+    );
 }
Suggestion importance[1-10]: 3

__

Why: The improved_code is functionally equivalent to calling Indexer.super.lastRefreshedCheckpoint() which already throws UnsupportedOperationException. The suggestion adds a more descriptive message, but the behavior is essentially the same, making this a minor improvement.

Low
Remove redundant null check after non-nullable accessor

The variable was previously obtained via getEngineOrNull() (which can return null)
but is now obtained via getEngine() (which throws if no engine exists). The
assertNotNull check is now redundant and misleading, since getEngine() would throw
before returning null. Consider removing the null assertion or reverting to a
nullable accessor if null is a valid state being tested.

server/src/test/java/org/opensearch/index/shard/SegmentReplicationIndexShardTests.java [1007-1010]

 final Engine engineOrNull = getEngine(replicaShard);
-assertNotNull(engineOrNull);
 assertTrue(engineOrNull instanceof ReadOnlyEngine);
Suggestion importance[1-10]: 3

__

Why: The assertNotNull is indeed redundant since getEngine() would throw before returning null, but this is a minor test code quality issue with low impact on correctness.

Low
Possible issue
Fix TOCTOU race condition in null-check and usage

There is a TOCTOU (time-of-check-time-of-use) race condition: getIndexerOrNull() is
checked for null, but then getIndexer() is called separately, which could throw if
the indexer becomes null between the two calls. Capture the result of
getIndexerOrNull() in a local variable and use it directly.

server/src/main/java/org/opensearch/index/shard/RemoteStoreRefreshListener.java [578-580]

-if (indexShard.getIndexerOrNull() != null) {
-    sb.append(" engineType=").append(indexShard.getIndexer().getClass().getSimpleName());
+Indexer indexer = indexShard.getIndexerOrNull();
+if (indexer != null) {
+    sb.append(" engineType=").append(indexer.getClass().getSimpleName());
 }
Suggestion importance[1-10]: 6

__

Why: There is a genuine TOCTOU race condition where getIndexerOrNull() is checked for null but then getIndexer() is called separately. Capturing the result in a local variable is the correct fix and prevents a potential NPE or exception in concurrent scenarios.

Low
Ensure old indexer is explicitly closed on replacement

When currentEngineReference.getAndSet(new EngineBackedIndexer(readOnlyEngine)) is
called, the previously held Indexer (which wraps the old engine) is returned and
passed to IOUtils.close. However, closing the EngineBackedIndexer wrapper will close
the underlying engine. This is correct, but it should be verified that the
EngineBackedIndexer.close() properly delegates to engine.close() — which it does.
However, the old engine reference that was set earlier via
IOUtils.close(currentEngineReference.getAndSet(null)) pattern is now changed, and
the returned old Indexer from getAndSet is not explicitly closed here, potentially
leaking resources.

server/src/main/java/org/opensearch/index/shard/IndexShard.java [5288]

-IOUtils.close(currentEngineReference.getAndSet(new EngineBackedIndexer(readOnlyEngine)));
+final Indexer oldIndexer = currentEngineReference.getAndSet(new EngineBackedIndexer(readOnlyEngine));
+IOUtils.close(oldIndexer);
Suggestion importance[1-10]: 3

__

Why: The existing code IOUtils.close(currentEngineReference.getAndSet(new EngineBackedIndexer(readOnlyEngine))) already passes the returned old Indexer to IOUtils.close, so the old indexer is already being closed. The suggestion's improved_code is functionally equivalent to the existing code.

Low
Suggestions up to commit 88b1af9
CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix unsafe cast in equals method

The equals method does not check if the input object is null or of a different type
before casting, which will throw a NullPointerException or ClassCastException for
non-WriterFileSet arguments. Add a null check and an instanceof check before
casting.

server/src/main/java/org/opensearch/index/engine/exec/WriterFileSet.java [71-76]

 @Override
 public boolean equals(Object o) {
+    if (this == o) return true;
+    if (!(o instanceof WriterFileSet)) return false;
     WriterFileSet other = (WriterFileSet) o;
     return this.directory.equals(other.directory)
         && this.files.equals(other.files)
         && this.getWriterGeneration() == other.getWriterGeneration();
 }
Suggestion importance[1-10]: 8

__

Why: The equals method lacks null and type checks before casting, which will throw NullPointerException or ClassCastException for non-WriterFileSet arguments. This is a real bug in the new code introduced by this PR.

Medium
Prevent inconsistent state on close-and-swap failure

When IOUtils.close is called on the old engine reference and then
currentEngineReference is set to a new EngineBackedIndexer, if the IOUtils.close
call throws an exception, the currentEngineReference will already have been updated
to the new engine, potentially leaving the shard in an inconsistent state. The close
and set should be done atomically or the old reference should be captured before the
set.

server/src/main/java/org/opensearch/index/shard/IndexShard.java [5288]

-IOUtils.close(currentEngineReference.getAndSet(new EngineBackedIndexer(readOnlyEngine)));
+final Indexer oldEngine = currentEngineReference.getAndSet(new EngineBackedIndexer(readOnlyEngine));
+IOUtils.close(oldEngine);
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies that getAndSet atomically replaces the reference before IOUtils.close is called, so if close throws, the old engine is already replaced. Capturing the old reference first and then closing it is safer and avoids potential resource leaks or inconsistent state.

Low
Preserve null-safe engine access semantics

The original code used getEngineOrNull() which safely returns null when no engine is
present, but the replacement getEngine() (via getEngine(replicaShard)) throws an
exception if the engine is not available. This changes the semantics and could cause
the test to fail with an exception rather than an assertion error. Consider using a
null-safe approach to preserve the original behavior.

server/src/test/java/org/opensearch/index/shard/SegmentReplicationIndexShardTests.java [1007-1009]

-final Engine engineOrNull = getEngine(replicaShard);
+final Engine engineOrNull = ((EngineBackedIndexer) replicaShard.getIndexerOrNull()) != null
+    ? ((EngineBackedIndexer) replicaShard.getIndexerOrNull()).getEngine()
+    : null;
 assertNotNull(engineOrNull);
 assertTrue(engineOrNull instanceof ReadOnlyEngine);
Suggestion importance[1-10]: 3

__

Why: While the concern about semantics is valid, the test immediately asserts assertNotNull(engineOrNull), so the behavior difference between throwing an exception vs. returning null is minimal in practice. The improved_code also calls getIndexerOrNull() twice which is not ideal, making this suggestion only marginally useful.

Low
General
Restrict internal engine-unwrapping helper visibility

The applyOnEngine method is public static but it exposes internal implementation
details (the EngineBackedIndexer type) and throws an IllegalStateException for any
non-EngineBackedIndexer implementation. This will break any future Indexer
implementation that is not backed by an Engine, making the abstraction leaky. The
method should be private or package-private since it is only used internally within
IndexShard, and callers should be migrated to use Indexer methods directly.

server/src/main/java/org/opensearch/index/shard/IndexShard.java [5984-5990]

-public static <T> T applyOnEngine(Indexer indexer, Function<Engine, T> applier) {
+static <T> T applyOnEngine(Indexer indexer, Function<Engine, T> applier) {
     if (indexer instanceof EngineBackedIndexer engine) {
         return applier.apply(engine.getEngine());
     } else {
         throw new IllegalStateException("Cannot apply function on indexer " + indexer.getClass() + " directly on IndexShard");
     }
 }
Suggestion importance[1-10]: 5

__

Why: The applyOnEngine method is public static but only used internally within IndexShard. Making it package-private (static) is a reasonable encapsulation improvement, though the change from public to package-private is minor and the improved code is nearly identical to the existing code.

Low
Fix misleading assertion error message

The error message in the second assertion incorrectly references engine.getClass()
instead of translogManager.getClass(), which would be misleading when debugging
failures. Update the message to show the actual translogManager type.

test/framework/src/main/java/org/opensearch/index/shard/IndexShardTestCase.java [1568-1579]

 public static Translog getTranslog(IndexShard shard) {
     Indexer indexer = getIndexer(shard);
     Engine engine = getEngine(shard);
     assert engine instanceof InternalEngine || engine instanceof NRTReplicationEngine
         : "only InternalEngines or NRTReplicationEngines have translogs, got: " + engine.getClass();
     indexer.ensureOpen();
     TranslogManager translogManager = indexer.translogManager();
     assert translogManager instanceof InternalTranslogManager : "only InternalTranslogManager have translogs, got: "
-        + engine.getClass();
+        + translogManager.getClass();
     InternalTranslogManager internalTranslogManager = (InternalTranslogManager) translogManager;
     return internalTranslogManager.getTranslog();
 }
Suggestion importance[1-10]: 5

__

Why: The assertion error message uses engine.getClass() instead of translogManager.getClass(), which would be misleading when debugging failures. This is a minor but valid correctness issue in the new code.

Low
Improve exception message with actual indexer type

The getSegmentInfosSnapshot() method throws IllegalStateException silently for
non-EngineBackedIndexer implementations, but callers have no way to anticipate this
at compile time. The error message should include the actual indexer type to aid
debugging, similar to how other IllegalStateException messages in this file are
constructed.

server/src/main/java/org/opensearch/index/shard/IndexShard.java [5785-5790]

 @Deprecated
 public GatedCloseable<SegmentInfos> getSegmentInfosSnapshot() {
     if (getIndexer() instanceof EngineBackedIndexer indexer) {
         return indexer.getEngine().getSegmentInfosSnapshot();
     }
-    throw new IllegalStateException("Cannot request SegmentInfos directly on IndexShard");
+    throw new IllegalStateException(
+        "Cannot request SegmentInfos directly on IndexShard for indexer type: " + getIndexer().getClass().getSimpleName()
+    );
 }
Suggestion importance[1-10]: 4

__

Why: Including the actual indexer type in the exception message improves debuggability when a non-EngineBackedIndexer is encountered. This is a minor but useful improvement for diagnosing runtime failures.

Low
Document unsupported interface method clearly for callers

Throwing UnsupportedOperationException from acquireSnapshot() means any caller of
this method on an EngineBackedIndexer will get a runtime exception with no
compile-time warning. Since this is a core interface method, callers have no way to
know it is unsupported without reading the implementation. Consider returning an
empty/no-op GatedCloseable or documenting clearly in the Indexer interface that this
method may throw, so callers can guard against it.

server/src/main/java/org/opensearch/index/engine/EngineBackedIndexer.java [383-388]

 @Override
 public GatedCloseable<CatalogSnapshot> acquireSnapshot() {
-    // TODO: Replace with a SegmentInfosCatalogSnapshot
-    // For now we throw an exception as this is not yet implemented
-    throw new UnsupportedOperationException("acquireSnapshot is not supported in EngineBackedIndexer");
+    // TODO: Replace with a SegmentInfosCatalogSnapshot once implemented
+    throw new UnsupportedOperationException(
+        "acquireSnapshot is not yet supported in EngineBackedIndexer; use Engine.getSegmentInfosSnapshot() instead"
+    );
 }
Suggestion importance[1-10]: 2

__

Why: The improved code is essentially the same as the existing code with only a slightly different exception message. This is a very minor documentation/message improvement with negligible impact.

Low
Suggestions up to commit fbe21d8
CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix engine reference comparison after Indexer wrapping

When resetEngineToGlobalCheckpoint replaces the current engine with a ReadOnlyEngine
wrapped in EngineBackedIndexer, the previous currentEngineReference is closed via
IOUtils.close. However, the anonymous ReadOnlyEngine subclass overrides
acquireLastIndexCommit, acquireSafeIndexCommit, and getSegmentInfosSnapshot to
delegate to newEngineReference.get(). When these methods are called on the
EngineBackedIndexer wrapping the ReadOnlyEngine, they go through applyOnEngine,
which correctly unwraps to the ReadOnlyEngine. But the close() method of the
anonymous ReadOnlyEngine subclass checks newEngine == currentEngineReference.get(),
which now compares an Engine to an Indexer, so the check newEngine ==
currentEngineReference.get() will always be false, potentially causing the new
engine to be closed prematurely. The close() method should compare against the
EngineBackedIndexer wrapper, not the raw engine.

server/src/main/java/org/opensearch/index/shard/IndexShard.java [5288]

-IOUtils.close(currentEngineReference.getAndSet(new EngineBackedIndexer(readOnlyEngine)));
+@Override
+public void close() throws IOException {
+    assert Thread.holdsLock(engineMutex);
 
+    Indexer newIndexer = newEngineReference.get();
+    if (newIndexer == currentEngineReference.get()) {
+        // we successfully installed the new engine so do not close it.
+        newIndexer = null;
+    }
+    IOUtils.close(super::close, newIndexer);
+}
+
Suggestion importance[1-10]: 8

__

Why: This is a real correctness bug: after wrapping engines in EngineBackedIndexer, the anonymous ReadOnlyEngine subclass's close() method compares a raw Engine (newEngineReference.get()) against currentEngineReference.get() which now holds an Indexer. The comparison will always be false, potentially causing the new engine to be closed prematurely during resetEngineToGlobalCheckpoint.

Medium
Fix unsafe cast in equals method

The equals method does not check if o is null or if it is an instance of
WriterFileSet before casting, which will throw a NullPointerException or
ClassCastException for non-matching objects. Add a null check and an instanceof
check before casting.

server/src/main/java/org/opensearch/index/engine/exec/WriterFileSet.java [71-76]

 @Override
 public boolean equals(Object o) {
+    if (this == o) return true;
+    if (!(o instanceof WriterFileSet)) return false;
     WriterFileSet other = (WriterFileSet) o;
     return this.directory.equals(other.directory)
         && this.files.equals(other.files)
         && this.getWriterGeneration() == other.getWriterGeneration();
 }
Suggestion importance[1-10]: 8

__

Why: The equals method lacks null and instanceof checks before casting, which will throw NullPointerException or ClassCastException when compared with null or non-WriterFileSet objects. This is a real bug in the new code.

Medium
General
Reduce visibility of engine-specific helper method

The applyOnEngine method is public static but it exposes internal implementation
details (the EngineBackedIndexer type) and throws an IllegalStateException for any
non-EngineBackedIndexer implementation. This will break any future Indexer
implementations that are not backed by an Engine, making the abstraction leaky. The
method should be private or protected to limit its scope, since it is only used
internally within IndexShard.

server/src/main/java/org/opensearch/index/shard/IndexShard.java [5984-5990]

-public static <T> T applyOnEngine(Indexer indexer, Function<Engine, T> applier) {
+static <T> T applyOnEngine(Indexer indexer, Function<Engine, T> applier) {
     if (indexer instanceof EngineBackedIndexer engine) {
         return applier.apply(engine.getEngine());
     } else {
         throw new IllegalStateException("Cannot apply function on indexer " + indexer.getClass() + " directly on IndexShard");
     }
 }
Suggestion importance[1-10]: 5

__

Why: The applyOnEngine method is public static but exposes internal EngineBackedIndexer details. Making it package-private or static (without public) would be a reasonable encapsulation improvement, though the change from public static to static is minor and the impact is limited.

Low
Log swallowed IOException in size calculation

Silently swallowing IOException in getTotalSize by returning 0 can produce incorrect
total size values without any indication of failure. This could lead to silent data
inconsistencies in replication or snapshot operations that rely on accurate file
sizes. At minimum, the exception should be logged.

server/src/main/java/org/opensearch/index/engine/exec/WriterFileSet.java [51-59]

 public long getTotalSize() {
     return files.stream().mapToLong(file -> {
         try {
             return java.nio.file.Files.size(Path.of(directory, file));
         } catch (IOException e) {
+            // Log the error; returning 0 for this file
+            org.apache.logging.log4j.LogManager.getLogger(WriterFileSet.class)
+                .warn("Failed to get size of file [{}] in directory [{}]", file, directory, e);
             return 0;
         }
     }).sum();
 }
Suggestion importance[1-10]: 4

__

Why: Silently returning 0 on IOException in getTotalSize can produce incorrect results without any indication of failure, but logging the error is a minor improvement that doesn't fix the underlying issue of returning incorrect data.

Low
Add assertion before unsafe cast

This cast will throw a ClassCastException if the indexer is not an
EngineBackedIndexer. Unlike other similar locations in the same file that use a safe
pattern with instanceof, this method performs an unchecked cast without any guard,
which could cause confusing test failures. Add an instanceof check or a descriptive
assertion before casting.

test/framework/src/main/java/org/opensearch/index/shard/IndexShardTestCase.java [1361]

 public static List<DocIdSeqNoAndSource> getDocIdAndSeqNos(final IndexShard shard) throws IOException {
-    return EngineTestCase.getDocIds(((EngineBackedIndexer) shard.getIndexer()).getEngine(), true);
+    Indexer indexer = shard.getIndexer();
+    assert indexer instanceof EngineBackedIndexer : "Expected EngineBackedIndexer but got: " + indexer.getClass();
+    return EngineTestCase.getDocIds(((EngineBackedIndexer) indexer).getEngine(), true);
 }
Suggestion importance[1-10]: 4

__

Why: The unchecked cast to EngineBackedIndexer could cause confusing ClassCastException failures in tests; adding an assertion would improve debuggability, but this is a test utility method and the impact is limited to test code.

Low
Improve error messages for unsupported engine operations

The default implementations of lastRefreshedCheckpoint() and
currentOngoingRefreshCheckpoint() in the Indexer interface throw
UnsupportedOperationException. When the wrapped engine is not an InternalEngine,
these methods will throw, which can cause unexpected runtime failures. For
non-InternalEngine engines that don't support these operations, a more descriptive
error or a safe fallback value (e.g., returning SequenceNumbers.NO_OPS_PERFORMED)
should be provided.

server/src/main/java/org/opensearch/index/engine/EngineBackedIndexer.java [358-371]

 @Override
 public long lastRefreshedCheckpoint() {
     if (engine instanceof InternalEngine) {
         return ((InternalEngine) engine).lastRefreshedCheckpoint();
     }
-    return Indexer.super.lastRefreshedCheckpoint();
+    throw new UnsupportedOperationException(
+        "lastRefreshedCheckpoint is not supported for engine type: " + engine.getClass().getSimpleName()
+    );
 }
 
 @Override
 public long currentOngoingRefreshCheckpoint() {
     if (engine instanceof InternalEngine) {
         return ((InternalEngine) engine).currentOngoingRefreshCheckpoint();
     }
-    return Indexer.super.currentOngoingRefreshCheckpoint();
+    throw new UnsupportedOperationException(
+        "currentOngoingRefreshCheckpoint is not supported for engine type: " + engine.getClass().getSimpleName()
+    );
 }
Suggestion importance[1-10]: 3

__

Why: The suggestion replaces a delegation to Indexer.super.lastRefreshedCheckpoint() (which already throws UnsupportedOperationException) with an explicit throw that includes the engine class name. The behavior is functionally equivalent but with a slightly better error message, making this a minor improvement.

Low
Improve error message for unsupported indexer type

The getSegmentInfosSnapshot() method is marked @Deprecated but still throws
IllegalStateException for non-EngineBackedIndexer implementations, which will cause
hard failures in callers that haven't yet migrated. The deprecation annotation alone
is insufficient; callers should be warned at compile time via the deprecation, but
the runtime behavior for unsupported indexer types should include a more informative
error message indicating the indexer type.

server/src/main/java/org/opensearch/index/shard/IndexShard.java [5785-5790]

+@Deprecated
 public GatedCloseable<SegmentInfos> getSegmentInfosSnapshot() {
     if (getIndexer() instanceof EngineBackedIndexer indexer) {
         return indexer.getEngine().getSegmentInfosSnapshot();
     }
-    throw new IllegalStateException("Cannot request SegmentInfos directly on IndexShard");
+    throw new IllegalStateException(
+        "Cannot request SegmentInfos directly on IndexShard for indexer type: " + getIndexer().getClass().getSimpleName()
+    );
 }
Suggestion importance[1-10]: 2

__

Why: The suggestion only adds the indexer class name to the error message, which is a very minor readability improvement with negligible functional impact. The existing_code and improved_code are nearly identical in behavior.

Low
Suggestions up to commit a817aa5
CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix unsafe cast in equals method

The equals method does not check if the passed object is null or of the correct type
before casting, which will throw a NullPointerException or ClassCastException for
non-WriterFileSet arguments. Add a null check and an instanceof check before
casting.

server/src/main/java/org/opensearch/index/engine/exec/WriterFileSet.java [71-76]

 @Override
 public boolean equals(Object o) {
+    if (thi...

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 3, 2026

Persistent review updated to latest commit d4a8622

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 3, 2026

❌ Gradle check result for d4a8622: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 6, 2026

Persistent review updated to latest commit 9566666

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 6, 2026

❌ Gradle check result for 9566666: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 6, 2026

Persistent review updated to latest commit 3926bc0

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 6, 2026

❌ Gradle check result for 3926bc0: TIMEOUT

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 6, 2026

Persistent review updated to latest commit eae9a8a

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 6, 2026

❌ Gradle check result for eae9a8a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 6, 2026

Persistent review updated to latest commit 11cadef

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 6, 2026

❌ Gradle check result for 11cadef: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 9, 2026

Persistent review updated to latest commit 217f00f

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 9, 2026

Persistent review updated to latest commit 91d376d

@github-actions
Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit a817aa5

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for a817aa5: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit fbe21d8

@andrross
Copy link
Copy Markdown
Member

@mgodwan I just rebased the latest version increment that is needed to get the bwc tests to pass

@github-actions
Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 88b1af9

@github-actions
Copy link
Copy Markdown
Contributor

✅ Gradle check result for 88b1af9: SUCCESS

@github-actions
Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit b7133e3

@github-actions
Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 4d70feb

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 4d70feb: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

mgodwan added 8 commits March 18, 2026 10:37
Signed-off-by: Mohit Godwani <mgodwan@amazon.com>

Signed-off-by: Mohit Godwani <mgodwan@amazon.com>
Signed-off-by: Mohit Godwani <mgodwan@amazon.com>
Signed-off-by: Mohit Godwani <mgodwan@amazon.com>
Signed-off-by: Mohit Godwani <mgodwan@amazon.com>
Signed-off-by: Mohit Godwani <mgodwan@amazon.com>
Signed-off-by: Mohit Godwani <mgodwan@amazon.com>
Signed-off-by: Mohit Godwani <mgodwan@amazon.com>
Convert some classes to records
Add immutability to collections held in record classes

Signed-off-by: Mohit Godwani <mgodwan@amazon.com>
@github-actions
Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit e403aa9

@github-actions
Copy link
Copy Markdown
Contributor

✅ Gradle check result for e403aa9: SUCCESS

Copy link
Copy Markdown
Contributor

@Bukhtawar Bukhtawar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes LGTM

@Bukhtawar Bukhtawar merged commit 6562711 into opensearch-project:main Mar 18, 2026
35 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants