Improve AI release notes generation for OpenSearch by andrross · Pull Request #6055 · opensearch-project/opensearch-build

andrross · 2026-03-27T23:28:46Z

Fetch PR body from GitHub API and include it (truncated to 2000 chars) in the commit data sent to the LLM for better categorization and filtering decisions.
Add OpenSearch-specific prompt that filters non-user-facing changes (test additions, CI action bumps, test fixture deps, internal refactoring, release machinery) before categorization. Removes Infrastructure/Documentation/Refactoring categories that contradicted the exclusion policy. Only Breaking Changes, Features, Enhancements, Bug Fixes, and Maintenance categories are used.
Add borderline calls section: the LLM appends an HTML comment documenting debatable inclusion/exclusion/categorization decisions. This is extracted from the LLM output and written to a separate -borderline.md file.
Update Jenkinsfile to include borderline calls in the PR description when creating or updating release notes PRs. Also fix the existing PR path to update the PR body via gh pr edit.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

github-actions · 2026-03-27T23:30:06Z

Failed to generate code suggestions for PR

github-actions · 2026-03-27T23:52:59Z

Failed to generate code suggestions for PR

codecov · 2026-03-28T00:00:12Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.61%. Comparing base (2ec4925) to head (22600f1).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #6055      +/-   ##
==========================================
+ Coverage   96.58%   96.61%   +0.02%     
==========================================
  Files         405      405              
  Lines       18758    18851      +93     
==========================================
+ Hits        18118    18212      +94     
+ Misses        640      639       -1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

github-actions · 2026-03-28T00:16:48Z

Failed to generate code suggestions for PR

github-actions · 2026-03-28T01:30:12Z

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 0d5720e.

Path	Line	Severity	Description
src/release_notes_workflow/release_notes.py	155	medium	PR body content fetched from external GitHub API (arbitrary user-controlled text) is passed directly into LLM prompts without sanitization. A malicious PR author could craft a PR description containing prompt injection payloads to manipulate LLM output, potentially causing the release notes generator to emit unexpected or misleading content into official release documentation.
jenkins/release-workflows/release-notes-generate.jenkinsfile	163	low	LLM-generated borderline calls content is read from a local file via `$(cat "$BORDERLINE_FILE")` and embedded directly into the GitHub PR body without sanitization. While the content is produced by an internal LLM, if the LLM were manipulated (e.g., via prompt injection from PR bodies), the resulting file content could include markdown or HTML that affects PR rendering in unexpected ways. Low severity as the content is passed as a string argument to `gh pr`, not executed as shell.

The table above displays the top 10 most important findings.

Total: 2 | Critical: 0 | High: 0 | Medium: 1 | Low: 1

Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.

⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

github-actions · 2026-03-28T01:30:52Z

Failed to generate code suggestions for PR

gaiksaya

Looks good overall with minor comments. If this works for opensearch maybe we can extend to other components as well.
Adding @rishabh6788 for another review.

gaiksaya · 2026-03-30T17:05:49Z

+     * "breaking change" or "breaking" → Breaking Changes
+     * "feature" or "feat" → Features
+     * "enhancement" or "improve" → Enhancements
+     * "bug" or "fix" or "bugfix" → Bug Fixes
+     * "maintenance" or "version" or "support" → Maintenance


The categories also include infrastructure and documentation changes as per guidelines.
https://github.com/opensearch-project/opensearch-plugins/blob/main/RELEASE_NOTES.md#categories
I agree they do no provide any value. @peterzhuamazon maybe we also need to exclude them from consolidated ones.

gaiksaya · 2026-03-30T17:09:14Z

+   - Example: `([#456](https://github.com/opensearch-project/OpenSearch/pull/456))`
+
+6. **Important Notes:**
+   - If you cannot determine the appropriate category from labels OR content analysis, place the entry in "Maintenance"


In the consolidation workflow where we combine all non-core components' release notes into one, we place the uncategorized data in non-compliant section? Would doing the same here avoid confusing between actual maintenance?

That's probably a good idea.

The reason I hesitated is that anytime the LLM creates an "unknown" or "non-compliant" section it means manual work will be required. Probably not the end of the world if it truly has no idea how to categorize.

github-actions · 2026-03-30T19:05:40Z

Failed to generate code suggestions for PR

github-actions · 2026-03-30T19:51:28Z

Failed to generate code suggestions for PR

gaiksaya · 2026-03-30T21:08:44Z

+     * Maintenance: Routine upkeep such as dependency updates that ship in the distribution.
+   - Do not use "Infrastructure", "Documentation", or "Refactoring" as categories. Changes that would
+     belong to those categories should have been excluded by the filtering step. If a PR survived filtering
+     but does not fit any of the above categories, place it in "Maintenance".


Suggested change

but does not fit any of the above categories, place it in "Maintenance".

but does not fit any of the above categories, place it in "Unknown".

Actually I think I should strike the last sentence. I find it confusing given the instruction at the end.

- Fetch PR body from GitHub API and include it (truncated to 2000 chars) in the commit data sent to the LLM for better categorization and filtering decisions. - Add OpenSearch-specific prompt that filters non-user-facing changes (test additions, CI action bumps, test fixture deps, internal refactoring, release machinery) before categorization. Removes Infrastructure/Documentation/Refactoring categories that contradicted the exclusion policy. Only Breaking Changes, Features, Enhancements, Bug Fixes, and Maintenance categories are used. - Add borderline calls section: the LLM appends an HTML comment documenting debatable inclusion/exclusion/categorization decisions. This is extracted from the LLM output and written to a separate -borderline.md file. - Update Jenkinsfile to include borderline calls in the PR description when creating or updating release notes PRs. Also fix the existing PR path to update the PR body via gh pr edit. Signed-off-by: Andrew Ross <andrross@amazon.com>

github-actions · 2026-03-30T21:19:33Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes Sub-PR theme: Fetch and expose PR body in commit data Relevant files: src/git/git_commit_processor.py tests/tests_git/test_git_commit_processor.py Sub-PR theme: OpenSearch-specific prompt and borderline calls extraction Relevant files: src/llms/prompts.py src/release_notes_workflow/release_notes.py tests/tests_release_notes_workflow/test_release_notes.py Sub-PR theme: Jenkinsfile update to include borderline calls in PR description Relevant files: jenkins/release-workflows/release-notes-generate.jenkinsfile tests/jenkins/jenkinsjob-regression-files/release-workflows/release-notes-generate.jenkinsfile.txt
⚡ Recommended focus areas for review Hardcoded Component Name The OpenSearch-specific prompt is selected by checking `component.name == 'OpenSearch'` (exact string match). This is fragile — if the component name differs in casing or has a suffix, the wrong prompt will be used silently. Consider making this configurable or using a case-insensitive comparison. commit_prompt = AI_RELEASE_NOTES_PROMPT_COMMIT_OPENSEARCH if component.name == 'OpenSearch' else AI_RELEASE_NOTES_PROMPT_COMMIT Borderline File Naming The borderline filename is derived by replacing `.md` with `-borderline.md` in the original filename. If the filename contains `.md` in a path component or multiple `.md` occurrences, `str.replace` will replace the first occurrence, potentially producing an unexpected path. Using `removesuffix` or `os.path.splitext` would be safer. borderline_filename = filename.replace('.md', '-borderline.md') with open(os.path.join(os.getcwd(), 'release-notes', borderline_filename), 'w') as f: PR Body Injection Risk The `PR_BODY` variable is constructed by appending the raw content of the borderline file (via `$(cat "$BORDERLINE_FILE")`), which is LLM-generated text. This content is then passed directly to `gh pr edit --body "$PR_BODY"` and `gh pr create --body "$PR_BODY"`. If the LLM output contains shell metacharacters or special sequences, this could cause unexpected behavior. Consider writing the body to a temp file and using `--body-file` instead. BORDERLINE_FILE="$WORKSPACE/release-notes/${filename.replace('.md', '-borderline.md')}" PR_BODY="Add release notes for ${version}" if [ -f "\$BORDERLINE_FILE" ]; then PR_BODY="\${PR_BODY} ## Borderline Calls \$(cat "\$BORDERLINE_FILE")" fi EXISTING_PR=\$(gh pr list --head release-chores/release-notes-${version} --state open --json number --jq '.[0].number') if [ -n "\$EXISTING_PR" ]; then echo "Updating existing PR #\${EXISTING_PR} for component \${REPO_NAME} with new release notes." gh pr edit "\$EXISTING_PR" --body "\$PR_BODY" else gh pr create --title '[AUTO] Add release notes for ${version}' --body "\$PR_BODY" -H release-chores/release-notes-${version} -B ${ref} Inconsistent Filtering Rule The new OpenSearch prompt (step 1) says "When uncertain whether a change is user-facing, include it", but step 2 says "Do not use Infrastructure, Documentation, or Refactoring as categories. Changes that would belong to those categories should have been excluded by the filtering step." This creates a contradiction: uncertain changes are included but then have no valid category to land in, potentially forcing them into "Unknown". The instructions should clarify what category to use for borderline infrastructure/docs/refactoring changes that are included. - Do not use "Infrastructure", "Documentation", or "Refactoring" as categories. Changes that would belong to those categories should have been excluded by the filtering step.

github-actions · 2026-03-30T21:20:08Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	Use case-insensitive component name comparison The component name comparison `component.name == 'OpenSearch'` is case-sensitive and fragile. If the component name is stored as `opensearch`, `OPENSEARCH`, or any other casing variant, the wrong prompt will be used silently. Use a case-insensitive comparison to be safe. src/release_notes_workflow/release_notes.py [159] -commit_prompt = AI_RELEASE_NOTES_PROMPT_COMMIT_OPENSEARCH if component.name == 'OpenSearch' else AI_RELEASE_NOTES_PROMPT_COMMIT +commit_prompt = AI_RELEASE_NOTES_PROMPT_COMMIT_OPENSEARCH if component.name.lower() == 'opensearch' else AI_RELEASE_NOTES_PROMPT_COMMIT Suggestion importance[1-10]: 5 __ Why: The case-sensitive comparison `component.name == 'OpenSearch'` could silently use the wrong prompt if the component name casing differs. A `.lower()` comparison is a simple and effective fix for this potential issue.	Low
General	Make borderline calls regex more robust The regex pattern requires a newline immediately after `BORDERLINE_CALLS`, but the LLM might output the comment with or without trailing whitespace before the newline. More critically, the closing `-->` could appear with leading whitespace or on the same line as content. Consider making the pattern more robust to handle variations in whitespace around the closing delimiter. src/release_notes_workflow/release_notes.py [37] -match = re.search(r'<!-- BORDERLINE_CALLS\s\n(.?)-->', raw, re.DOTALL) +match = re.search(r'<!--\sBORDERLINE_CALLS\s\n(.?)\s-->', raw, re.DOTALL) Suggestion importance[1-10]: 4 __ Why: The suggestion improves robustness by allowing optional whitespace before `BORDERLINE_CALLS`, but the current pattern already handles trailing whitespace after it with `\s*\n`. The improvement is minor since the LLM output format is controlled by the prompt.	Low
	Ensure test exercises the correct code path The test `test_generate_writes_borderline_file` relies on `mock_isfile.return_value =` `True` to trigger the changelog path, but the `generate` method also uses `get_commits_with_labels` for the commit path. Without mocking the commits processor and its return value, the test may not exercise the code path that writes the borderline file, causing the assertion `len(write_calls) == 2` to fail. Ensure the mock setup covers the actual code path being tested. tests/tests_release_notes_workflow/test_release_notes.py [361-362] +# The test should mock the commits processor to ensure the commit-based path is exercised, +# or set mock_isfile.return_value = False and mock the commits processor with borderline content. +# Alternatively, verify the test actually reaches the release_notes_raw assignment by asserting +# mock_ai_generator.generate_release_notes was called before checking write_calls. +mock_ai_generator.generate_release_notes.assert_called_once() write_calls = [c for c in mock_file_open.call_args_list if len(c[0]) > 1 and c[0][1] == 'w'] self.assertEqual(len(write_calls), 2) Suggestion importance[1-10]: 3 __ Why: The suggestion raises a valid concern about test setup, but the `improved_code` doesn't actually fix the mock setup issue — it just adds an assertion before the existing check. The core concern about whether the code path is properly exercised requires more substantial test changes.	Low
	Safely construct borderline filename with splitext If `filename` does not end with `.md` (e.g., it has no extension or a different extension), `filename.replace('.md', '-borderline.md')` will silently produce an incorrect filename or fail to replace anything, resulting in a borderline file with the same name as the release notes file. Use a more reliable method to construct the borderline filename. src/release_notes_workflow/release_notes.py [174-177] if borderline_calls: - borderline_filename = filename.replace('.md', '-borderline.md') + base, ext = os.path.splitext(filename) + borderline_filename = f"{base}-borderline{ext or '.md'}" with open(os.path.join(os.getcwd(), 'release-notes', borderline_filename), 'w') as f: f.write(borderline_calls) Suggestion importance[1-10]: 3 __ Why: While using `os.path.splitext` is more robust, in practice the `filename` variable is always expected to end with `.md` based on the codebase conventions, making this an edge case with minimal practical impact.	Low

andrross requested review from Divyaasm, gaiksaya, peterzhuamazon, prudhvigodithi, rishabh6788, tianleh and zelinh as code owners March 27, 2026 23:28

github-project-automation Bot added this to Engineering Effectiveness Board Mar 27, 2026

github-actions Bot added the repeat-contributor label Mar 27, 2026

andrross force-pushed the release-notes-update branch from 14ebe61 to 8257f38 Compare March 27, 2026 23:51

andrross force-pushed the release-notes-update branch from 8257f38 to 0409941 Compare March 28, 2026 00:15

andrross force-pushed the release-notes-update branch from 0409941 to eb682cd Compare March 28, 2026 01:29

gaiksaya reviewed Mar 30, 2026

View reviewed changes

andrross force-pushed the release-notes-update branch from eb682cd to 2bc6cd2 Compare March 30, 2026 19:03

andrross force-pushed the release-notes-update branch from 2bc6cd2 to 0d5720e Compare March 30, 2026 19:49

andrross mentioned this pull request Mar 30, 2026

Replace CHANGELOG.md with release-notes/AGENTS.md opensearch-project/OpenSearch#20974

Closed

1 task

gaiksaya reviewed Mar 30, 2026

View reviewed changes

andrross force-pushed the release-notes-update branch from 0d5720e to 22600f1 Compare March 30, 2026 21:17

gaiksaya approved these changes Mar 30, 2026

View reviewed changes

github-project-automation Bot moved this to 👀 In Review in Engineering Effectiveness Board Mar 30, 2026

rishabh6788 approved these changes Mar 30, 2026

View reviewed changes

gaiksaya merged commit 5b70446 into opensearch-project:main Mar 30, 2026
17 checks passed

github-project-automation Bot moved this from 👀 In Review to ✅ Done in Engineering Effectiveness Board Mar 30, 2026

andrross deleted the release-notes-update branch March 30, 2026 21:57

	but does not fit any of the above categories, place it in "Maintenance".
	but does not fit any of the above categories, place it in "Unknown".

Conversation

andrross commented Mar 27, 2026

Uh oh!

github-actions Bot commented Mar 27, 2026

Uh oh!

github-actions Bot commented Mar 27, 2026

Uh oh!

codecov Bot commented Mar 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

github-actions Bot commented Mar 28, 2026

Uh oh!

github-actions Bot commented Mar 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Analyzer ❗

Uh oh!

github-actions Bot commented Mar 28, 2026

Uh oh!

gaiksaya left a comment

Choose a reason for hiding this comment

Uh oh!

gaiksaya Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

gaiksaya Mar 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

andrross Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Mar 30, 2026

Uh oh!

github-actions Bot commented Mar 30, 2026

Uh oh!

gaiksaya Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

andrross Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Mar 30, 2026

PR Reviewer Guide 🔍

Uh oh!

github-actions Bot commented Mar 30, 2026

PR Code Suggestions ✨

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codecov Bot commented Mar 28, 2026 •

edited

Loading

github-actions Bot commented Mar 28, 2026 •

edited

Loading

gaiksaya Mar 30, 2026 •

edited

Loading