Skip to content

fix: sanitize changes diagram input#2212

Merged
naorpeled merged 6 commits intoqodo-ai:mainfrom
nagai9999:fix/mermaid-diagram
Mar 24, 2026
Merged

fix: sanitize changes diagram input#2212
naorpeled merged 6 commits intoqodo-ai:mainfrom
nagai9999:fix/mermaid-diagram

Conversation

@nagai9999
Copy link
Copy Markdown
Contributor

@nagai9999 nagai9999 commented Feb 12, 2026

User description

Issue: #2211

PR Type

Enhancement, Tests


Description

  • Extract diagram sanitization logic into dedicated function

  • Remove backticks from node labels in mermaid diagrams

  • Add comprehensive unit tests for diagram sanitization

  • Handle missing closing fence and invalid diagram formats


Diagram Walkthrough

flowchart LR
  input["Raw diagram input"] -- "sanitize_diagram()" --> validation["Validate fence markers"]
  validation -- "Fix missing closing" --> cleanup["Remove backticks from labels"]
  cleanup -- "Return sanitized" --> output["Sanitized diagram string"]
Loading

File Walkthrough

Relevant files
Enhancement
pr_description.py
Extract and enhance diagram sanitization logic                     

pr_agent/tools/pr_description.py

  • Refactored diagram sanitization logic into new sanitize_diagram()
    function
  • Function validates fence markers, fixes missing closing backticks, and
    removes backticks from node labels
  • Updated _prepare_data() to use new sanitization function instead of
    inline logic
  • Returns empty string for invalid diagrams that don't start with fence
    markers
+19/-5   
Tests
test_pr_description.py
Add unit tests for diagram sanitization                                   

tests/unittest/test_pr_description.py

  • Added new test file with comprehensive unit tests for diagram
    sanitization
  • Tests cover invalid diagrams without fence markers, missing closing
    fences, and backtick removal
  • Tests verify backticks are only removed from node labels, not from
    edge labels
  • Tests confirm normal diagrams are processed correctly with newline
    prefix
+69/-0   


PR Type

Enhancement, Tests


Description

  • Extract diagram sanitization logic into dedicated sanitize_diagram() function

  • Fix missing closing fence markers and remove backticks from node labels

  • Add comprehensive unit tests covering edge cases and validation

  • Validate diagram input format and handle invalid diagrams gracefully


Diagram Walkthrough

flowchart LR
  raw["Raw diagram input"]
  validate["Validate fence markers"]
  fix["Fix missing closing fence"]
  clean["Remove backticks from labels"]
  output["Sanitized diagram string"]
  raw -- "sanitize_diagram()" --> validate
  validate --> fix
  fix --> clean
  clean --> output
Loading

File Walkthrough

Relevant files
Enhancement
pr_description.py
Extract and enhance diagram sanitization logic                     

pr_agent/tools/pr_description.py

  • Extracted diagram sanitization logic into new sanitize_diagram()
    function
  • Function validates fence markers, fixes missing closing backticks, and
    removes backticks from node labels using regex
  • Updated _prepare_data() to call new sanitization function instead of
    inline logic
  • Returns empty string for invalid diagrams that don't start with fence
    markers
+21/-5   
Tests
test_pr_description.py
Add unit tests for diagram sanitization                                   

tests/unittest/test_pr_description.py

  • Added new test file with comprehensive unit tests for
    sanitize_diagram() function
  • Tests cover invalid diagrams without fence markers, missing closing
    fences, and backtick removal from labels
  • Tests verify backticks are only removed from node labels, not from
    edge labels
  • Tests confirm normal diagrams are processed correctly with newline
    prefix and handle non-string inputs
+75/-0   

@nagai9999 nagai9999 marked this pull request as ready for review February 12, 2026 08:05
@qodo-free-for-open-source-projects
Copy link
Copy Markdown
Contributor

qodo-free-for-open-source-projects bot commented Feb 12, 2026

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
Markdown injection

Description: sanitize_diagram() only checks that input starts with a fence and ensures it ends with a
fence, but does not prevent additional embedded sequences inside the diagram body, which
can prematurely close the code block and allow markdown/HTML injection (e.g., a diagram
containing
\n\n<script>...</script>\n could escape the fenced block in rendered PR descriptions depending
on downstream rendering).
pr_description.py [772-785]

Referred Code
def sanitize_diagram(diagram_raw: str) -> str:
    """Sanitize a diagram string: fix missing closing fence and remove backticks."""
    diagram = diagram_raw.strip()
    if not diagram.startswith('```'):
        return ''

    # fallback missing closing
    if not diagram.endswith('```'):
        diagram += '\n```'  

    # remove backticks inside node labels: ["`label`"] -> ["label"]
    lines = diagram.split('\n')
    result = [re.sub(r'\["([^"]*?)"\]', lambda m: '["' + m.group(1).replace('`', '') + '"]', line) for line in lines]
    return '\n' + '\n'.join(result)
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Consistent Naming Conventions

Objective: All new variables, functions, and classes must follow the project's established naming
standards

Status: Passed

Single Responsibility for Functions

Objective: Each function should have a single, well-defined responsibility

Status: Passed

When relevant, utilize early return

Objective: In a code snippet containing multiple logic conditions (such as 'if-else'), prefer an
early return on edge cases than deep nesting

Status: Passed

🔴
No Dead or Commented-Out Code

Objective: Keep the codebase clean by ensuring all submitted code is active and necessary

Status:
Unused import: The newly added pytest import is unused in the test module, introducing unnecessary dead
code.

Referred Code
import pytest
import yaml
Robust Error Handling

Objective: Ensure potential errors and edge cases are anticipated and handled gracefully throughout
the code

Status:
Input type safety: sanitize_diagram() calls .strip() on diagram_raw without guarding against non-string/None
inputs, which could raise an exception depending on upstream data types.

Referred Code
def sanitize_diagram(diagram_raw: str) -> str:
    """Sanitize a diagram string: fix missing closing fence and remove backticks."""
    diagram = diagram_raw.strip()
    if not diagram.startswith('```'):
        return ''
  • Update
Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-free-for-open-source-projects
Copy link
Copy Markdown
Contributor

qodo-free-for-open-source-projects bot commented Feb 12, 2026

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix incorrect regex for sanitizing

Update the regex for sanitizing Mermaid node labels to correctly handle the
id["..."] format, as the current expression r'["([^"]*?)"]' fails to match
them.

pr_agent/tools/pr_description.py [782-785]

-# remove backticks inside node labels: ["`label`"] -> ["label"]
+# remove backticks inside node labels: A["`label`"] -> A["label"]
 lines = diagram.split('\n')
-result = [re.sub(r'\["([^"]*?)"\]', lambda m: '["' + m.group(1).replace('`', '') + '"]', line) for line in lines]
+result = [re.sub(r'(\w+\[")([^"]*)("\])', lambda m: m.group(1) + m.group(2).replace('`', '') + m.group(3), line) for line in lines]
 return '\n' + '\n'.join(result)
  • Apply / Chat
Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies a bug in the new sanitize_diagram function where the regex for node labels is wrong, and it provides a correct implementation that fixes the bug, which is critical for the feature to work as intended.

High
Learned
best practice
Add safe guards for missing values
Suggestion Impact:Added an isinstance check for diagram_raw and returned an empty string when it is not a str, preventing AttributeError on .strip().

code diff:

@@ -771,6 +771,8 @@
 
 def sanitize_diagram(diagram_raw: str) -> str:
     """Sanitize a diagram string: fix missing closing fence and remove backticks."""
+    if not isinstance(diagram_raw, str):
+        return ''
     diagram = diagram_raw.strip()

Guard against diagram_raw being None or non-string before calling .strip() to
avoid AttributeError when the YAML value is missing or not a string.

pr_agent/tools/pr_description.py [772-785]

 def sanitize_diagram(diagram_raw: str) -> str:
     """Sanitize a diagram string: fix missing closing fence and remove backticks."""
+    if not isinstance(diagram_raw, str):
+        return ''
     diagram = diagram_raw.strip()
     if not diagram.startswith('```'):
         return ''
-    
+
     # fallback missing closing
     if not diagram.endswith('```'):
-        diagram += '\n```'  
-    
+        diagram += '\n```'
+
     # remove backticks inside node labels: ["`label`"] -> ["label"]
     lines = diagram.split('\n')
     result = [re.sub(r'\["([^"]*?)"\]', lambda m: '["' + m.group(1).replace('`', '') + '"]', line) for line in lines]
     return '\n' + '\n'.join(result)

[Suggestion processed]

Suggestion importance[1-10]: 6

__

Why:
Relevant best practice - When accessing attributes on values that may be missing/None, use safe access patterns (e.g., guard clauses) to prevent AttributeError/KeyError.

Low
  • Update
  • Author self-review: I have reviewed the PR code suggestions, and addressed the relevant ones.

@naorpeled
Copy link
Copy Markdown
Collaborator

/agentic_review

@qodo-free-for-open-source-projects
Copy link
Copy Markdown
Contributor

qodo-free-for-open-source-projects bot commented Mar 14, 2026

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (1) 📎 Requirement gaps (0) 📐 Spec deviations (0)

Grey Divider


Action required

1. Trailing whitespace in sanitize_diagram()📘 Rule violation ⚙ Maintainability
Description
The added line diagram += '\n`'   contains trailing whitespace, which can fail the repo's
pre-commit whitespace hooks. This reduces code quality and may block CI/pre-commit checks.
Code

pr_agent/tools/pr_description.py[R779-783]

+    
+    # fallback missing closing
+    if not diagram.endswith('```'):
+        diagram += '\n```'  
+    
Evidence
PR Compliance ID 15 requires modified files to satisfy pre-commit checks including whitespace
cleanup. In sanitize_diagram(), the new closing-fence append line includes trailing spaces (and a
whitespace-only blank line), violating whitespace cleanup expectations.

AGENTS.md
pr_agent/tools/pr_description.py[779-783]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`sanitize_diagram()` introduces trailing whitespace (and a whitespace-only blank line), which can fail pre-commit whitespace hooks.
## Issue Context
The repository requires whitespace cleanup via pre-commit for modified files.
## Fix Focus Areas
- pr_agent/tools/pr_description.py[779-783]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Ruff blank-line violation🐞 Bug ⚙ Maintainability
Description
sanitize_diagram() is introduced as a top-level function immediately after the PRDescription class
with only one separating blank line, which violates Ruff’s enforced pycodestyle blank-line rule and
will fail linting. This will block CI for the PR.
Code

pr_agent/tools/pr_description.py[R770-773]

     return pr_body
+def sanitize_diagram(diagram_raw: str) -> str:
+    """Sanitize a diagram string: fix missing closing fence and remove backticks."""
Evidence
The file shows only a single blank line between the end of the class method (and effectively the
class) and the new module-level function definition. Ruff is configured to enforce E- and F-code
checks, which include the two-blank-lines requirement for top-level definitions, so this will be
reported as an error during linting.

pr_agent/tools/pr_description.py[770-773]
pyproject.toml[47-56]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`sanitize_diagram()` is defined at module scope with only one blank line separating it from the preceding class block, which violates Ruff/pycodestyle blank-line rules and fails CI.
### Issue Context
Ruff is enabled for `E` codes (pycodestyle) in `pyproject.toml`, which includes the two-blank-lines requirement between top-level definitions.
### Fix Focus Areas
- pr_agent/tools/pr_description.py[770-773]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Unused pytest import🐞 Bug ⚙ Maintainability
Description
tests/unittest/test_pr_description.py imports pytest but never uses it, which will be flagged as an
unused import by Ruff (F401) and fail CI. This blocks merging even though the tests themselves may
pass.
Code

tests/unittest/test_pr_description.py[R1-4]

+from unittest.mock import MagicMock, patch
+
+import pytest
+import yaml
Evidence
The new test file imports pytest but uses only unittest.mock and plain asserts. Ruff is configured
to check F (Pyflakes) rules, which include F401 unused-import violations.

tests/unittest/test_pr_description.py[1-8]
pyproject.toml[47-56]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`pytest` is imported but not used, triggering Ruff F401.
### Issue Context
Ruff is configured to enforce `F` rules (Pyflakes) in `pyproject.toml`.
### Fix Focus Areas
- tests/unittest/test_pr_description.py[1-5]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


View more (3)
4. Test imports not isorted📘 Rule violation ⚙ Maintainability
Description
The new test file places the stdlib import (unittest.mock) after third-party imports (pytest,
yaml), which violates isort/Ruff import ordering. This can cause lint failures in CI.
Code

tests/unittest/test_pr_description.py[R1-4]

+import pytest
+import yaml
+from unittest.mock import MagicMock, patch
+from pr_agent.tools.pr_description import PRDescription, sanitize_diagram
Evidence
PR Compliance ID 12 requires isort-compatible import grouping/ordering. In the added test file,
from unittest.mock ... (stdlib) comes after import pytest and import yaml (third-party), which
is not isort-compliant.

AGENTS.md
tests/unittest/test_pr_description.py[1-4]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The new test file has non-isort import ordering (stdlib imports should come before third-party imports).
## Issue Context
Ruff/isort checks commonly fail CI when imports are not grouped/ordered correctly.
## Fix Focus Areas
- tests/unittest/test_pr_description.py[1-4]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


5. sanitize_diagram() accepts non-mermaid fence📎 Requirement gap ✓ Correctness
Description
sanitize_diagram() only checks for a generic code fence ( ` ) instead of requiring a Mermaid
fence ( `mermaid ), so non-Mermaid fenced blocks can be published unchanged. This violates the
requirement to validate Mermaid fences and remove/avoid invalid fenced blocks that break rendering.
Code

pr_agent/tools/pr_description.py[R776-778]

+    diagram = diagram_raw.strip()
+    if not diagram.startswith('```'):
+        return ''
Evidence
Compliance requires sanitization to validate the diagram begins with a Mermaid code fence and handle
invalid fences; the new implementation only validates diagram.startswith('`'), which allows
non-Mermaid fenced blocks to pass.

Validate Mermaid diagram begins with a Mermaid code fence; remove fence if invalid
pr_agent/tools/pr_description.py[772-778]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`sanitize_diagram()` currently treats any fenced block beginning with ` ``` ` as valid, instead of requiring a Mermaid fence (` ```mermaid `). This can allow invalid fenced blocks to be published, causing GitHub/GitLab rendering failures.
## Issue Context
Compliance requires validating that Mermaid diagrams start with a Mermaid code fence, and removing the fence if invalid.
## Fix Focus Areas
- pr_agent/tools/pr_description.py[772-788]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


6. Overlong regex list-comprehension📘 Rule violation ✓ Correctness
Description
The new result = [...] line exceeds the 120-character limit and is likely to fail Ruff formatting
checks. This breaks the repository style compliance requirement.
Code

pr_agent/tools/pr_description.py[786]

+    result = [re.sub(r'\["([^"]*?)"\]', lambda m: '["' + m.group(1).replace('`', '') + '"]', line) for line in lines]
Evidence
The checklist requires adhering to Ruff configuration including 120-char lines; the added
re.sub(...) list comprehension is a single, very long line that exceeds this limit.

AGENTS.md
pr_agent/tools/pr_description.py[786-786]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
A newly added list comprehension performing the regex substitution is longer than 120 characters, violating Ruff style requirements.
## Issue Context
Repository style compliance requires Python code to conform to Ruff configuration, including `line-length = 120`.
## Fix Focus Areas
- pr_agent/tools/pr_description.py[784-787]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

7. sanitize_diagram() uses single quotes 📘 Rule violation ⚙ Maintainability
Description
New code introduces pervasive single-quoted strings, which likely violates the repo’s Ruff-enforced
preference for double quotes and may add avoidable lint noise. This can cause CI/lint failures or
inconsistent style across the module.
Code

pr_agent/tools/pr_description.py[R772-793]

+def sanitize_diagram(diagram_raw: str) -> str:
+    """Sanitize a diagram string: fix missing closing fence and remove backticks."""
+    if not isinstance(diagram_raw, str):
+        return ''
+    diagram = diagram_raw.strip()
+    if not diagram.startswith('```mermaid'):
+        return ''
+    
+    # fallback missing closing
+    if not diagram.endswith('```'):
+        diagram += '\n```'  
+    
+    # remove backticks inside node labels: ["`label`"] -> ["label"]
+    result = []
+    for line in diagram.split('\n'):
+        line = re.sub(
+            r'\["([^"]*?)"\]',
+            lambda m: '["' + m.group(1).replace('`', '') + '"]',
+            line,
+        )
+        result.append(line)
+    return '\n' + '\n'.join(result)
Evidence
PR Compliance ID 12 requires adhering to Ruff styling including double quotes. The newly added
sanitize_diagram() uses many single-quoted strings (e.g., return '', split('\n'),
startswith('`mermaid')).

AGENTS.md
pr_agent/tools/pr_description.py[772-793]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The new `sanitize_diagram()` function uses single-quoted strings throughout, which conflicts with the repository’s Ruff style expectation to prefer double quotes.
## Issue Context
This may introduce Ruff lint failures or inconsistent style compared to surrounding code.
## Fix Focus Areas
- pr_agent/tools/pr_description.py[772-793]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

ⓘ The new review experience is currently in Beta. Learn more

Grey Divider

Qodo Logo

@naorpeled
Copy link
Copy Markdown
Collaborator

Hey @nagai9999,
thanks for this PR 🔥

Can you please address Qodo's comments?

@qodo-free-for-open-source-projects
Copy link
Copy Markdown
Contributor

qodo-free-for-open-source-projects bot commented Mar 19, 2026

Code Review by Qodo

Grey Divider

New Review Started

This review has been superseded by a new analysis

Grey Divider

ⓘ The new review experience is currently in Beta. Learn more

Grey Divider

Qodo Logo

@qodo-free-for-open-source-projects
Copy link
Copy Markdown
Contributor

qodo-free-for-open-source-projects bot commented Mar 19, 2026

Persistent review updated to latest commit 9251f62

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@qodo-free-for-open-source-projects
Copy link
Copy Markdown
Contributor

qodo-free-for-open-source-projects bot commented Mar 19, 2026

Persistent review updated to latest commit 8733f30

@qodo-free-for-open-source-projects
Copy link
Copy Markdown
Contributor

qodo-free-for-open-source-projects bot commented Mar 19, 2026

Persistent review updated to latest commit 61faae5

@nagai9999
Copy link
Copy Markdown
Contributor Author

Hi @naorpeled, all Qodo comments have been addressed.

@naorpeled naorpeled merged commit d5712ac into qodo-ai:main Mar 24, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants