
Address review nits on NONE and LAST_TOKEN pooling PRs #4744

Open
aneesh-db wants to merge 2 commits into opensearch-project:main from aneesh-db:fix/pooling-review-nits

Conversation

@aneesh-db
Contributor

Description

Follow-up to address review comments from #4710 and #4711:

  1. Early return for NONE pooling (comment): HuggingfaceTextEmbeddingTranslator now returns immediately when pooling is none, skipping unnecessary attention mask creation and switch statement.

  2. Extract shared lastTokenPool (comment): Moved duplicated lastTokenPool method from both translators into a shared TextEmbeddingPoolingUtils class. Also normalizes to use toLongArray() consistently, avoiding float-to-long precision concerns.
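The extracted helper described in item 2 can be sketched with plain arrays in place of DJL's NDArray so the example is self-contained; the class name LastTokenPoolSketch and the array-based signature are stand-ins for illustration, not the actual shared utility:

```java
// Hypothetical stand-in for the shared lastTokenPool utility: plain arrays
// replace DJL NDArray so the sketch runs without the DJL dependency.
public class LastTokenPoolSketch {

    // attentionMask: 1 for real tokens, 0 for padding (1-D, single sequence).
    // Summing as long mirrors the toLongArray() normalization in the PR.
    static float[] lastTokenPool(float[][] embeddings, long[] attentionMask) {
        long tokenCount = 0;
        for (long m : attentionMask) {
            tokenCount += m;
        }
        int lastTokenIdx = (int) Math.max(tokenCount - 1, 0);
        return embeddings[lastTokenIdx];
    }

    public static void main(String[] args) {
        float[][] embeddings = {
            {0.1f, 0.2f}, {0.3f, 0.4f}, {0.5f, 0.6f}, {0.0f, 0.0f}
        };
        long[] mask = {1, 1, 1, 0}; // 3 real tokens, so last real index is 2
        float[] pooled = lastTokenPool(embeddings, mask);
        System.out.println(pooled[0] + ", " + pooled[1]); // row at index 2
    }
}
```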

Related Issues

Follow-up to #4710 and #4711

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions

Failed to generate code suggestions for PR

@aneesh-db aneesh-db had a problem deploying to ml-commons-cicd-env-require-approval March 27, 2026 10:26 — with GitHub Actions Error
@aneesh-db aneesh-db had a problem deploying to ml-commons-cicd-env-require-approval March 27, 2026 10:26 — with GitHub Actions Failure
@aneesh-db aneesh-db force-pushed the fix/pooling-review-nits branch from ee5558e to a00348e Compare March 27, 2026 10:28
@github-actions

Failed to generate code suggestions for PR

@aneesh-db aneesh-db requested a deployment to ml-commons-cicd-env-require-approval March 27, 2026 10:30 — with GitHub Actions Waiting
@pyek-bot
Collaborator

@aneesh-db Thanks for the follow-up PRs. Could you also remove the feature details from the older release notes? This was added in your previous PRs.

- Early return for NONE pooling in HuggingfaceTextEmbeddingTranslator
  to skip unnecessary attention mask creation and switch statement
- Extract lastTokenPool to shared TextEmbeddingPoolingUtils class,
  eliminating duplication between ONNX and Huggingface translators
- Use toLongArray() consistently for attention mask sum, avoiding
  float-to-long precision concerns

Signed-off-by: Aneesh Nema <aneesh.nema@databricks.com>
@github-actions

github-actions bot commented Mar 30, 2026

PR Reviewer Guide 🔍

(Review updated until commit 74bc0d2)

Here are some key observations to aid the review process:

🧪 No relevant tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Extract shared lastTokenPool utility and update translators

Relevant files:

  • ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/text_embedding/TextEmbeddingPoolingUtils.java
  • ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/text_embedding/HuggingfaceTextEmbeddingTranslator.java
  • ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/text_embedding/ONNXSentenceTransformerTextEmbeddingTranslator.java

Sub-PR theme: Early return for NONE pooling in HuggingfaceTextEmbeddingTranslator

Relevant files:

  • ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/text_embedding/HuggingfaceTextEmbeddingTranslator.java

⚡ Recommended focus areas for review

Missing Normalize

In the refactored code, after the early return for NONE pooling, the non-NONE pooling path no longer applies normalization. The original code had a shared normalization step after the if/else block. Verify that normalization is still applied for pooling modes like MEAN, CLS, LASTTOKEN, etc., and that it hasn't been accidentally dropped.

// For other pooling modes, use last_hidden_state or first output
embeddings = list.get("last_hidden_state");
if (embeddings == null) {
    embeddings = list.get(0);
}
Encoding encoding = (Encoding) ctx.getAttachment("encoding");
long[] attentionMask = encoding.getAttentionMask();
NDManager manager = ctx.getNDManager();
NDArray inputAttentionMask = manager.create(attentionMask).toType(DataType.FLOAT32, true);
switch (pooling) {
    case "mean":
        embeddings = meanPool(embeddings, inputAttentionMask, false);
        break;
    case "mean_sqrt_len":
        embeddings = meanPool(embeddings, inputAttentionMask, true);
        break;
    case "max":
        embeddings = maxPool(embeddings, inputAttentionMask);
        break;
    case "weightedmean":
        embeddings = weightedMeanPool(embeddings, inputAttentionMask);
        break;
    case "cls":
        embeddings = embeddings.get(0);
        break;
    case "lasttoken":
        embeddings = TextEmbeddingPoolingUtils.lastTokenPool(embeddings, inputAttentionMask);
        break;
    default:
        throw new AssertionError("Unexpected pooling model: " + pooling);
}
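The "Missing Normalize" concern above is about structure: with the early return for "none", the remaining pooling modes must still pass through the shared normalization step after the switch. A minimal sketch of that shape, using plain arrays instead of DJL's NDList/NDArray (the class name and simplified signature are illustrative assumptions, not the translator's actual API):

```java
import java.util.List;

// Sketch of the control flow under review: "none" returns early, every
// other pooling mode falls through to the shared normalization step.
public class NonePoolingEarlyReturn {

    static float[] process(List<float[]> outputs, String pooling, boolean normalize) {
        float[] embeddings = outputs.get(0);
        if ("none".equals(pooling)) {
            // Early return: skip attention-mask setup and the switch entirely,
            // but still honor the normalize flag.
            return normalize ? l2Normalize(embeddings) : embeddings;
        }
        // Non-"none" modes would build the attention mask and run the
        // pooling switch here (elided in this sketch) ...
        // ... and normalization must still be applied afterwards:
        return normalize ? l2Normalize(embeddings) : embeddings;
    }

    static float[] l2Normalize(float[] v) {
        double norm = 0;
        for (float x : v) norm += x * x;
        norm = Math.sqrt(norm);
        float[] out = new float[v.length];
        for (int i = 0; i < v.length; i++) out[i] = (float) (v[i] / norm);
        return out;
    }
}
```

The point of the sketch is that the normalize check appears on both exit paths; dropping it from the post-switch path would silently change results for mean, cls, lasttoken, and the other modes.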
Incorrect Pooling

The lastTokenPool method sums the entire attention mask to find the last token index. For batched inputs (batch_size > 1), this would sum across all sequences, producing an incorrect index. Confirm that this method is only ever called with a single-sequence attention mask, or add a guard/reshape to handle the correct dimension.

static NDArray lastTokenPool(NDArray embeddings, NDArray attentionMask) {
    long tokenCount = attentionMask.sum().toLongArray()[0];
    long lastTokenIdx = Math.max(tokenCount - 1, 0);
    return embeddings.get(lastTokenIdx);
}
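The batching pitfall the reviewer flags can be demonstrated with plain arrays (a self-contained stand-in for NDArray; the class and method names here are illustrative): summing a 2-D mask collapses all rows into one count, yielding an index past the end of any single sequence, whereas summing per row gives the correct last-token index for each sequence.

```java
// Demonstrates why summing a batched (2-D) attention mask breaks
// last-token indexing, while a per-row sum stays correct.
public class BatchedMaskPitfall {

    // Sums every element across all rows, as NDArray.sum() would on a 2-D mask.
    static long sumAll(long[][] mask) {
        long total = 0;
        for (long[] row : mask) {
            for (long m : row) total += m;
        }
        return total;
    }

    // Correct per-sequence variant: sum only one row's mask.
    static long lastIdxPerRow(long[] row) {
        long count = 0;
        for (long m : row) count += m;
        return Math.max(count - 1, 0);
    }

    public static void main(String[] args) {
        long[][] batched = {{1, 1, 1, 0}, {1, 1, 0, 0}}; // batch of 2 sequences
        System.out.println(sumAll(batched) - 1);        // 4: past the end of a 4-token row
        System.out.println(lastIdxPerRow(batched[0]));  // 2: correct for row 0
        System.out.println(lastIdxPerRow(batched[1]));  // 1: correct for row 1
    }
}
```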
Missing Release Notes

The release notes entries for LAST_TOKEN pooling support and NONE pooling mode bug fix were removed from the release notes file. These features were introduced in PRs #4710 and #4711 and should still be documented. Confirm whether these entries should be preserved or moved to a different section.

 * OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions
 * and limitations under the License.
 */
package org.opensearch.ml.engine.algorithms.text_embedding;

@rithin-pullela-aws rithin-pullela-aws had a problem deploying to ml-commons-cicd-env-require-approval March 30, 2026 23:08 — with GitHub Actions Failure
@rithin-pullela-aws rithin-pullela-aws had a problem deploying to ml-commons-cicd-env-require-approval March 30, 2026 23:08 — with GitHub Actions Error
@github-actions

github-actions bot commented Mar 30, 2026

PR Code Suggestions ✨

Latest suggestions up to 74bc0d2
Explore these optional code suggestions:

Category | Suggestion | Impact
Possible issue
Add null guard before using embeddings

If list is completely empty, list.get(0) throws an IndexOutOfBoundsException rather than returning null, so embeddings can only end up null if list.get(0) itself returns null. A null check or size check before calling embeddings.normalize and embeddings.toFloatArray() would prevent a potential NullPointerException in that case.

ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/text_embedding/HuggingfaceTextEmbeddingTranslator.java [93-99]

 if (embeddings == null) {
     embeddings = list.get(0);
+}
+if (embeddings == null) {
+    throw new IllegalStateException("No embeddings found in model output for NONE pooling mode");
 }
 if (normalize) {
     embeddings = embeddings.normalize(2, 0);
 }
 return embeddings.toFloatArray();
Suggestion importance[1-10]: 5


Why: The suggestion correctly identifies that if list is empty, list.get(0) could throw an IndexOutOfBoundsException, and adding an explicit null/state check with a meaningful error message improves robustness and debuggability. However, this is a defensive edge case that may rarely occur in practice.

Low
Ensure attention mask summed over correct dimension

The attentionMask passed here may be a multi-dimensional array (e.g., batch x
sequence_length), so calling .sum() on it would sum all elements across all
dimensions, not just the sequence dimension. This could produce an incorrect
tokenCount if the mask is not already 1-D. Consider flattening or explicitly summing
along the correct axis before calling toLongArray()[0].

ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/text_embedding/TextEmbeddingPoolingUtils.java [25-29]

 static NDArray lastTokenPool(NDArray embeddings, NDArray attentionMask) {
-    long tokenCount = attentionMask.sum().toLongArray()[0];
+    // Sum along the sequence dimension (axis 0 for 1-D, or last axis for batched)
+    long tokenCount = attentionMask.flatten().sum().toLongArray()[0];
     long lastTokenIdx = Math.max(tokenCount - 1, 0);
     return embeddings.get(lastTokenIdx);
 }
Suggestion importance[1-10]: 4


Why: The concern about multi-dimensional masks is valid in theory, but looking at the existing code in both translators, the attentionMask passed to lastTokenPool is already a 1-D array derived from encoding.getAttentionMask(). The suggestion to use .flatten() is a minor defensive improvement but not critical given the actual usage context.

Low

Previous suggestions

Suggestions up to commit 3214126
Category | Suggestion | Impact
Possible issue
Add null check before using embeddings

The early return for the "none" pooling path skips the normalization logic that exists at the end of the method for other pooling modes; the normalize check was added inline here, which is correct. However, if list.get(0) returns null (e.g., from an empty list), calling embeddings.normalize(2, 0) will throw a NullPointerException. A null check should be added before using embeddings.

ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/text_embedding/HuggingfaceTextEmbeddingTranslator.java [93-99]

 if (embeddings == null) {
     embeddings = list.get(0);
+}
+if (embeddings == null) {
+    throw new IllegalArgumentException("No embeddings found in model output for NONE pooling mode");
 }
 if (normalize) {
     embeddings = embeddings.normalize(2, 0);
 }
 return embeddings.toFloatArray();
Suggestion importance[1-10]: 6


Why: The suggestion correctly identifies that list.get(0) could theoretically return null, leading to a NullPointerException when calling embeddings.normalize() or toFloatArray(). This is a valid defensive programming concern, though in practice an empty NDList would be unusual.

Low
General
Validate attention mask dimensions before pooling

The attentionMask is expected to be a 1D array (sequence_length), but the callers in
both translators pass inputAttentionMask which may have been reshaped or broadcast
to a higher-dimensional tensor before this call. If attentionMask is
multi-dimensional, sum() will return the total sum across all dimensions, giving an
incorrect tokenCount. A comment or a shape assertion should be added to document and
enforce the expected shape.

ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/text_embedding/TextEmbeddingPoolingUtils.java [25-29]

 static NDArray lastTokenPool(NDArray embeddings, NDArray attentionMask) {
+    // attentionMask must be 1-D (sequence_length): 1 for real tokens, 0 for padding
+    if (attentionMask.getShape().dimension() != 1) {
+        throw new IllegalArgumentException("attentionMask must be 1-D, got shape: " + attentionMask.getShape());
+    }
     long tokenCount = attentionMask.sum().toLongArray()[0];
     long lastTokenIdx = Math.max(tokenCount - 1, 0);
     return embeddings.get(lastTokenIdx);
 }
Suggestion importance[1-10]: 3


Why: The suggestion adds a dimension check for attentionMask, but looking at the callers, inputAttentionMask is derived from encoding.getAttentionMask() which returns a 1D array, so this validation is unlikely to catch real issues. The suggestion is more of a defensive measure than addressing an actual bug.

Low

Release notes are auto-generated, removing entries added in opensearch-project#4710 and opensearch-project#4711.

Signed-off-by: Aneesh Nema <aneesh.nema@databricks.com>
@github-actions

Persistent review updated to latest commit 74bc0d2

@aneesh-db
Contributor Author

@pyek-bot done, removed the feature details from the older release notes

@aneesh-db aneesh-db requested a deployment to ml-commons-cicd-env-require-approval March 31, 2026 05:53 — with GitHub Actions Waiting


3 participants