fix(heredoc): handle unbalanced single quotes and backticks in heredoc bodies inside `$(…)` (#420) by StudioLE · Pull Request #1067 · reubeno/brush

StudioLE · 2026-03-17T12:01:44Z

Fixes #420

Test results without fix

---
source: brush-parser/src/word.rs
expression: "test_parse(\"\\\"$(cat <<'EOF'\\na ` b\\nEOF\\n)\\\"\")?"
---
ParseTestResults(
  input: "\"$(cat <<\'EOF\'\na ` b\nEOF\n)\"",
  result: [
    WordPieceWithSource(
      piece: DoubleQuotedSequence([
        WordPieceWithSource(
          piece: Text("$"),
          start_index: 1,
          end_index: 2,
        ),
        WordPieceWithSource(
          piece: Text("(cat <<\'EOF\'\na ` b\nEOF\n)"),
          start_index: 2,
          end_index: 26,
        ),
      ]),
      start_index: 0,
      end_index: 27,
    ),
  ],
)

---
source: brush-parser/src/word.rs
expression: "test_parse(\"\\\"$(cat <<'EOF'\\nit's here\\nEOF\\n)\\\"\")?"
---
ParseTestResults(
  input: "\"$(cat <<\'EOF\'\nit\'s here\nEOF\n)\"",
  result: [
    WordPieceWithSource(
      piece: DoubleQuotedSequence([
        WordPieceWithSource(
          piece: Text("$"),
          start_index: 1,
          end_index: 2,
        ),
        WordPieceWithSource(
          piece: Text("(cat <<\'EOF\'\nit\'s here\nEOF\n)"),
          start_index: 2,
          end_index: 30,
        ),
      ]),
      start_index: 0,
      end_index: 31,
    ),
  ],
)

* Test case: [Here doc with unpaired single quote in command substitution]... 
    Oracle comparison:
      status matches (exit status: 0) ✔️
      stdout DIFFERS:
          ------ Oracle <> Test: stdout ---------------------------------
          + $(cat <<'EOF'
            it's here
          + EOF
          + )
            
          ---------------------------------------------------------------
      stderr matches ✔️
      temp dir matches ✔️
    FAILED.
* Test case: [Here doc with unpaired backtick in command substitution]... 
    Oracle comparison:
      status matches (exit status: 0) ✔️
      stdout DIFFERS:
          ------ Oracle <> Test: stdout ---------------------------------
          + $(cat <<'EOF'
            a ` b
          + EOF
          + )
            
          ---------------------------------------------------------------
      stderr matches ✔️
      temp dir matches ✔️
    FAILED.
* Test case: [Here doc with three single quotes in command substitution]... 
    Oracle comparison:
      status matches (exit status: 0) ✔️
      stdout DIFFERS:
          ------ Oracle <> Test: stdout ---------------------------------
          + $(cat <<'EOF'
            it's Bob's it's
          + EOF
          + )
            
          ---------------------------------------------------------------
      stderr matches ✔️
      temp dir matches ✔️
    FAILED.
* Test case: [Here doc with unpaired single quote in command substitution (#420)]... 
    Oracle comparison:
      status matches (exit status: 0) ✔️
      stdout DIFFERS:
          ------ Oracle <> Test: stdout ---------------------------------
          - that's "cool"
          + 
            
          ---------------------------------------------------------------
      stderr DIFFERS:
          ------ Oracle <> Test: stderr ---------------------------------
          + error: failed to parse word '$(
          +   sed 's/nice/cool/' - <<-EOF
          +   that's
          +   "nice"
          + EOF
          + )'
          + 
          ---------------------------------------------------------------
      temp dir matches ✔️
    FAILED.

Context

The tokenizer flattens heredoc bodies into $(...) token text. When the body contains unbalanced single quotes or backticks, the word parser's quote-matching rules fail to consume them, causing the entire command substitution parse to fail.

If you compare the snapshots of parse_command_sub_with_unbalanced_single_quote and parse_command_sub_with_unbalanced_backtick prior to the fix with the snapshot of brush_parser__word__tests__parse_command_sub_with_balanced_backticks you can clearly see the parser is failing to match the CommandSubstitution.

Solution

The solution is to add a fallback in command_piece so orphan ' and ` are consumed as literal characters when word_piece cannot match them.

Caution

I believe this is safe because of PEG's ordered choice semantics: properly paired quotes are consumed by word_piece first; the fallback only fires for characters no other rule can handle.

However, it would be worth a maintainer validating this

Note

Unbalanced " inside heredoc bodies within "$(…)" is a separate issue tracked in #1066. The " is consumed by the outer double_quoted_sequence rule before command_piece is reached, so this fix does not address it.

RE: Contributor Policy

Please be transparent about your use of AI.

Claude Code was used to diagnose an issue I was facing.
It was then used to explore possible solutions
Unable to parse unbalanced double quote in heredoc in command substitution #1066 was identified as a separate case
After a LOT of back and forth we narrowed it down to a simple one line fix

It also remains your responsibility to review, test, understand, and vouch for all code and content that you submit. Accordingly, you must fully understand all code and other content in your contributions and be able to explain their behavior and purpose.

I have reviewed, tested and understand WHY the fix works.

Caution

However, the fix is inside peg::parser! macro and I definitely do not have knowledge of how the macro works. Therefore I'm heavily relying on existing tests to identify if this causes regressions.

reubeno#420) The tokenizer flattens heredoc bodies into `$(...)` token text. When the body contains unbalanced single quotes or backticks, the word parser's quote-matching rules fail to consume them, causing the entire command substitution parse to fail. Add a fallback in `command_piece` so orphan `'` and `` ` `` are consumed as literal characters when `word_piece` cannot match them.

github-actions · 2026-03-17T12:19:55Z

Performance Benchmark Report

Benchmark name	Baseline (μs)	Test/PR (μs)	Delta (μs)	Delta %
`clone_shell_object`	`17.01 μs`	`17.17 μs`	`0.16 μs`	`🟠 +0.93%`
`eval_arithmetic`	`0.15 μs`	`0.15 μs`	`-0.00 μs`	`⚪ Unchanged`
`expand_one_string`	`1.51 μs`	`1.52 μs`	`0.01 μs`	`⚪ Unchanged`
`for_loop`	`27.05 μs`	`26.77 μs`	`-0.28 μs`	`🟢 -1.05%`
`full_peg_complex`	`57.85 μs`	`57.68 μs`	`-0.17 μs`	`⚪ Unchanged`
`full_peg_for_loop`	`6.25 μs`	`6.20 μs`	`-0.05 μs`	`🟢 -0.85%`
`full_peg_nested_expansions`	`16.57 μs`	`16.57 μs`	`-0.01 μs`	`⚪ Unchanged`
`full_peg_pipeline`	`4.26 μs`	`4.21 μs`	`-0.05 μs`	`🟢 -1.13%`
`full_peg_simple`	`1.82 μs`	`1.79 μs`	`-0.03 μs`	`🟢 -1.37%`
`function_call`	`2.94 μs`	`2.97 μs`	`0.02 μs`	`⚪ Unchanged`
`instantiate_shell`	`56.13 μs`	`55.63 μs`	`-0.50 μs`	`⚪ Unchanged`
`instantiate_shell_with_init_scripts`	`26429.60 μs`	`27457.51 μs`	`1027.91 μs`	`🟠 +3.89%`
`parse_peg_bash_completion`	`2086.11 μs`	`2075.02 μs`	`-11.10 μs`	`⚪ Unchanged`
`parse_peg_complex`	`19.35 μs`	`19.94 μs`	`0.59 μs`	`🟠 +3.04%`
`parse_peg_for_loop`	`2.04 μs`	`2.00 μs`	`-0.05 μs`	`🟢 -2.35%`
`parse_peg_pipeline`	`2.16 μs`	`2.15 μs`	`-0.01 μs`	`⚪ Unchanged`
`parse_peg_simple`	`1.12 μs`	`1.10 μs`	`-0.01 μs`	`🟢 -1.25%`
`run_echo_builtin_command`	`18.17 μs`	`18.54 μs`	`0.37 μs`	`⚪ Unchanged`
`tokenize_sample_script`	`3.43 μs`	`3.36 μs`	`-0.07 μs`	`🟢 -2.07%`

Code Coverage Report: Only Changed Files listed

Package	Base Coverage	New Coverage	Difference
brush-core/src/results.rs	🟢 83.33%	🟢 80.16%	🔴 -3.17%
brush-parser/src/word.rs	🟢 93.7%	🟢 94%	🟢 0.3%
Overall Coverage	🟢 75.39%	🟢 75.4%	🟢 0.01%

Minimum allowed coverage is 70%, this run produced 75.4%

Test Summary: bash-completion test suite

Outcome	Count	Percentage
✅ Pass	1583	75.06
❗️ Error	18	0.85
❌ Fail	154	7.30
⏩ Skip	339	16.07
❎ Expected Fail	13	0.62
✔️ Unexpected Pass	2	0.09
📊 Total	2109	100.00

reubeno · 2026-03-28T00:47:28Z

@StudioLE Thanks for another contribution, and for the detailed summary 😄

Thanks also for your patience -- I've been extremely busy and hadn't yet had time to give the change the review it deserves. I aim to get to it within the next couple of days.

reubeno

The logic behind the fix seems sound. I think there are more cases that will be problematic for us in this area (e.g., perhaps unmatched parentheses chars inside the embedded here doc?) -- but let's at least get in this fix.

…c bodies inside `$(…)` (reubeno#1067) Adds a fallback in `command_piece` so orphan `'` and `` ` `` are consumed as literal characters when `word_piece` cannot match them.

reubeno added the state: needs review PR or issue author is waiting for a review label Mar 27, 2026

reubeno approved these changes Mar 28, 2026

View reviewed changes

reubeno merged commit dd57e07 into reubeno:main Mar 28, 2026
43 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(heredoc): handle unbalanced single quotes and backticks in heredoc bodies inside `$(…)` (#420)#1067

fix(heredoc): handle unbalanced single quotes and backticks in heredoc bodies inside `$(…)` (#420)#1067
reubeno merged 1 commit intoreubeno:mainfrom
StudioLE:#420-fix-unbalanced-single-quote-backtick-in-heredoc-command-sub

StudioLE commented Mar 17, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Mar 17, 2026

Uh oh!

reubeno commented Mar 28, 2026

Uh oh!

reubeno left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

StudioLE commented Mar 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Test results without fix

Context

Solution

Note

Uh oh!

github-actions bot commented Mar 17, 2026

Performance Benchmark Report

Code Coverage Report: Only Changed Files listed

Test Summary: bash-completion test suite

Uh oh!

reubeno commented Mar 28, 2026

Uh oh!

reubeno left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

StudioLE commented Mar 17, 2026 •

edited

Loading