Skip to content

Relax field-level meta validation constraints#20578

Open
urmichm wants to merge 1 commit intoopensearch-project:mainfrom
urmichm:meta-19884
Open

Relax field-level meta validation constraints#20578
urmichm wants to merge 1 commit intoopensearch-project:mainfrom
urmichm:meta-19884

Conversation

@urmichm
Copy link
Copy Markdown
Contributor

@urmichm urmichm commented Feb 9, 2026

Description

Restrictions on the field-level meta have been relaxed:

  • meta accepts any number of entries as long as the following restrictions are met
  • meta's values must be Strings of any length
  • meta's values can not be nulls
  • meta's keys must be Strings of any length

Related Issues

Resolves #19884

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added enhancement Enhancement or improvement to existing feature or request Indexing Indexing, Bulk Indexing and anything related to indexing labels Feb 9, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Feb 9, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

These changes relax validation constraints for field-level _meta metadata in the mapper. The maximum entry count, key length, and value length restrictions are removed from TypeParsers.parseMeta, while null-value and non-string type checks remain enforced. The corresponding test is updated to reflect these relaxed validation rules.

Changes

Cohort / File(s) Summary
Validation Logic Relaxation
server/src/main/java/org/opensearch/index/mapper/TypeParsers.java
Removed max entry count (5), per-key length (20 chars), and per-value length (50 chars) constraints. Retained null-value prohibition and added explicit error messaging for non-String values.
Test Updates
server/src/test/java/org/opensearch/index/mapper/TypeParsersTests.java
Replaced validation tests that expected MapperParsingException for excessive key length and max entries. New test constructs meta map with multiple long keys and asserts successful parsing with matching size.
Changelog
CHANGELOG.md
Added changelog entry documenting the relaxation of field-level meta validation.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: relaxing validation constraints for field-level meta in OpenSearch mappings.
Description check ✅ Passed The description clearly explains the relaxed restrictions, references the resolved issue #19884, confirms testing is included, and follows the required template format.
Linked Issues check ✅ Passed The PR successfully implements the feature request from issue #19884 by allowing field-level _meta blocks with relaxed validation on entry count and string length constraints.
Out of Scope Changes check ✅ Passed All changes are directly related to relaxing field-level meta validation: modifications to TypeParsers.parseMeta, updated tests, and CHANGELOG entry align with issue #19884 requirements.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Feb 9, 2026

❗ AI-powered Code-Diff-Analyzer found issues on commit 5f656fd.

PathLineSeverityDescription
server/src/main/java/org/opensearch/index/mapper/TypeParsers.java89mediumIntentional removal of input validation limits on metadata fields (size limit of 5 entries, key length limit of 20 chars, value length limit of 50 chars). While documented as a feature change in CHANGELOG, this could enable resource exhaustion or DoS attacks through unbounded metadata. Without context from issue #19884, cannot definitively determine if this weakening of security controls is justified by legitimate requirements.

The table above displays the top 10 most important findings.

Total: 1 | Critical: 0 | High: 0 | Medium: 1 | Low: 0


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Feb 9, 2026

❌ Gradle check result for 5f656fd: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 5f656fd: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for e2080f7: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

❗ AI-powered Code-Diff-Analyzer found issues on commit 6ee1908.

PathLineSeverityDescription
server/src/main/java/org/opensearch/index/mapper/TypeParsers.java89mediumIntentional removal of security validation limits: meta field entry count (was max 5), key length (was max 20 chars), and value length (was max 50 chars) protections removed. While documented in CHANGELOG, this enables potential DoS via resource exhaustion, unbounded data storage in meta fields, and possible covert channels. Change appears legitimate but creates exploitable attack surface.

The table above displays the top 10 most important findings.

Total: 1 | Critical: 0 | High: 0 | Medium: 1 | Low: 0


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@github-actions
Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 71347c6

@urmichm
Copy link
Copy Markdown
Contributor Author

urmichm commented Mar 23, 2026

Hello @ShawnQiang1
Would you mind to have a look? :)

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 71347c6: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@opensearch-trigger-bot opensearch-trigger-bot bot removed the stalled Issues that have stalled label Mar 24, 2026
@sandeshkr419
Copy link
Copy Markdown
Member

[Search Triage] @msfroh Can you help with the review here when you get the chance.

@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for d66e0c4: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Copy Markdown
Contributor

@msfroh msfroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@urmichm -- I've checked the commit message that added these restrictions, and I'm inclined to agree with removing them or at least relaxing them. As @epugh called out, adding more verbose values would be really helpful for agents to understand and interpret the schema of an index.

Instead of wholesale removing the restrictions, may I suggest that we relax them? Going fully unbounded still opens the opportunity for users to shoot themselves in the foot. I would still prefer that a field not have (say) 1MB of metadata.

Maybe we could relax to the following:

  1. Max number of keys raised to 100 (from 5).
  2. Increase the max key length to 255 (from 20).
  3. Increase the max length of any string value to 10000 (from 50).

I would also potentially be open to allowing nested objects, arrays, and numeric fields in field metadata, but I think that's a more complicated change.

What do you think?

Edit: Incidentally, it looks like you also need to fix some Spotless errors with ./gradlew spotlessApply.

@urmichm
Copy link
Copy Markdown
Contributor Author

urmichm commented Mar 25, 2026

@msfroh thank you for your feedback
I agree that nested objects and arrays would be useful, but we would need to think how to calculate the size of such non-string values.

Naïve approach would be to convert the values into the string and check it's length. However, this approach has lots of limitations.

I can update the current restrictions to the suggested values and open a new issue to allow non-string types with proper validation

@urmichm
Copy link
Copy Markdown
Contributor Author

urmichm commented Mar 26, 2026

Given that the values could be 10,000 chars long, naïve approach is not as bad as it might seem.

Alternatively we can restrict the total size of meta, rather then number of keys.
Meta is sort of a json object, we can assume:

  • any string (key or value) is measured by length
  • numbers - 4 "bytes"
  • boolean - 1 "byte"
  • complex types (arrays / objects) - sum of all internal elements

Examples:

  • {"a": 45} = 5 "bytes"
    • 1 from a
    • 4 from 45
  • {"a": true} = 2 "bytes"
    • 1 from a
    • 1 from true
  • {"abc": 123, "obj":{"a":12, "b":"str"}} = 19 "bytes"
    • 3 from abc
    • 4 from 123
    • 3 from 'obj'
    • 9 from {"a":12, "b":"str"} (1+4+1+3)

We can split numbers into int and float.

cc: @msfroh

@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for e06acde: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 8a74070: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 3ae0718: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Mikhail Urmich <urmich.m@gmail.com>
@github-actions
Copy link
Copy Markdown
Contributor

Failed to generate code suggestions for PR

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 42a4950: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement Enhancement or improvement to existing feature or request Indexing Indexing, Bulk Indexing and anything related to indexing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature Request] Relax meta block on field level

3 participants