Skip to content

Bypass SyntaxNode in JuliaLowering; fix kw bug#60162

Merged
topolarity merged 3 commits intoJuliaLang:masterfrom
mlechu:use-kw-in-syntaxtree
Dec 2, 2025
Merged

Bypass SyntaxNode in JuliaLowering; fix kw bug#60162
topolarity merged 3 commits intoJuliaLang:masterfrom
mlechu:use-kw-in-syntaxtree

Conversation

@mlechu
Copy link
Copy Markdown
Member

@mlechu mlechu commented Nov 18, 2025

I set out to fix a stdlib compilation failure JuliaLang/JuliaLowering.jl#98, but ran
into a limitation in the JuliaSyntax AST. Instead of hacking in a fix, this PR
attempts to address JuliaLang/JuliaLowering.jl#77 too.

kw

The kw form isn't produced in SyntaxNode/SyntaxTree like it is in Expr,
which simplifies the parser, but causes problems once we start start caring
about the semantics of these trees. Parsing "f(a=1)" will always produce an
Expr(:call, :f, Expr(:kw, :a, 1)), but the equivalent SyntaxNode
representation (call f (= a 1)) makes Expr(:call, :f, Expr(:(=), :a, 1))
unrepresentable, even though that's valid syntax a macro could produce.

To fix this, we need to convert = to kw within certain forms before
macro expansion.

One problem

There is currently no good place to put this conversion. Here's the path
from source to lowering:

    (source text)
         |
       1 |
         v
[JS] RawGreenNode
         |      \ 5a
      2* |       \
         v        ----> Expr
[JS] SyntaxNode  /      /
         |      / 5b   /
      3* |     /      /
         v    /      /
[JL] SyntaxTree0 <--- 6
         |
       4 |
         v
[JL] SyntaxTree1
         |
         |
         v
 (desugaring and beyond)

*(2) and (3) are pretty trivial, and data structure names are used as
shorthand for what actually matters to this issue (the tree represented by the
data structure).

  1. is parsing, which is the most complex of any of these steps.
  2. JS._to_SyntaxNode, which just deletes trivia
  3. JL._convert_nodes, which is more-or-less a one-to-one conversion, and I
    think is an artifact of JuliaSyntax and JuliaLowering being in separate
    repositories. @c42f has talked about wanting to replace SyntaxNode with
    SyntaxTree eventually.
  4. Random fixups in JuliaLowering's macro expansion step to make desugaring
    easier
  5. JS._node_to_expr, which is used for femtolisp lowering (5a) and for
    JuliaLowering's compatibility with existing macros (5b)
  6. JL.expr_to_syntaxtree, which is also used for Expr-macro compatibility

Expr gets the chance to swap out = within (call (parameters ...)) in
_node_to_expr, but to change SyntaxTree we can only change parsing. (The
inverse is also true: if someone wants to change parsing to RawGreenNode,
JuliaLowering will have to eat it).

This PR implements the following instead, where _green_to_ast happens at (2)
and may fix up the AST before lowering. Ideally this means more room for parser
cleanup and eventually eliminating our "pre-desugaring" mixed in with macro
expansion.

(source text)
     |
   1 |
     v        3
RawGreenNode ---> Expr
     |
   2 |   4 -----> Expr
     v    /       /
SyntaxTree0 <----- 5
     |
     |      <- TODO delete
     v
SyntaxTree1
     |
     |
     v
(desugaring and beyond)

The catch is that if we want SyntaxTree to be different from RawGreenNode, we do
need to convert them to Expr differently. There are a few things we could do:

  • Separate the RawGreenNode->Expr and SyntaxTree->Expr transformations
    (pictured, and implemented).
  • Make RawGreenNode->SyntaxTree a necessary prerequisite for creating an Expr.
    This would add a step to our current femtolisp-lowered parsing (why)
  • Give up on making SyntaxTree structurally different from Expr (I wouldn't
    mind this) and try to feed them through the same generic code (not so sure
    about that).

For now I've done the lazy thing by using a boolean parameter to implement the
first option.

I also took this opportunity to put the green tree into the syntax graph. This
doesn't really change anything given how little we look at the green tree, but
wasn't difficult, and it's a neat representation.

kw notes

I've made SyntaxTree and Expr the same here, though some inconsistencies could
be ironed out in the future with syntax evolution. Ideally, we use kw
wherever a textual = behaves more like a =>, or if "=" is a parse error,
wherever it would be more consistent to treat it like =>.

Kind Source Expr and this PR Ideal
call* x(a=1,;b=1) (kw a 1) (kw b 1) (kw a 1) (kw b 1)
dotcall x.(a=1,;b=1) (kw a 1) (kw b 1) (kw a 1) (kw b 1)
ref x[a=1,;b=1] (kw a 1) (= b 1) (kw a 1) (kw b 1)
curly x{a=1,;b=1} (= a 1) (= b 1) (kw a 1) (kw b 1)
tuple (a=1,;b=1) (= a 1) (kw b 1) (kw a 1) (kw b 1)
vect [a=1,;b=1] (= a 1) (= b 1) (= a 1) (kw b 1)
braces {a=1,;b=1} (= a 1) (= b 1) (= a 1) (kw b 1)
macrocall @x(a=1,;b=1) (= a 1) (kw b 1) (= a 1) (kw b 1)

@mlechu mlechu requested review from Keno and c42f November 18, 2025 00:04
@mlechu mlechu added parser Language parsing and surface syntax compiler:lowering Syntax lowering (compiler front end, 2nd stage) labels Nov 19, 2025
mlechu added a commit to mlechu/julia that referenced this pull request Nov 21, 2025
Fix JuliaLang#60152, which impacted both lowering implementations.
      `remove-argument-side-effects` assumes all `kw` arguments from a
     `parameters` block had already been dumped into the argument list. In addition:

- JuliaLowering hit a MethodError in the dumped-`kw` case regardless.  There are
     other issues with `kw` which I'm ignoring in this PR: see
     JuliaLang#60162

- Delete some ancient history: `&` [used to be a valid argument](JuliaLang@a378b75#diff-5d79463faae0f7f19454c7f9888498d9f876082e258ab3efdca36a0ee64b0c87L72) head sometime in
     2012 apparently!
mlechu added a commit to mlechu/julia that referenced this pull request Nov 21, 2025
Fix JuliaLang#60152, which impacted both lowering implementations.
      `remove-argument-side-effects` assumed all `kw` arguments from a
     `parameters` block had already been dumped into the argument list, which is
     not true in some cases.  In addition:

- JuliaLowering hit a MethodError in the dumped-`kw` case regardless.  There are
     other issues with `kw` which I'm ignoring in this PR (see
     JuliaLang#60162)

- Delete some ancient history: `&` [used to be a valid argument](JuliaLang@a378b75#diff-5d79463faae0f7f19454c7f9888498d9f876082e258ab3efdca36a0ee64b0c87L72) head sometime in
     2012 apparently!
Keno pushed a commit that referenced this pull request Nov 22, 2025
Fix #60152, which impacted both lowering implementations.
      `remove-argument-side-effects` assumed all `kw` arguments from a
`parameters` block had already been dumped into the argument list, which
is
     not true in some cases.  In addition:

- JuliaLowering hit a MethodError in the dumped-`kw` case regardless.
There are other issues with `kw` which I'm ignoring in this PR (see
#60162)

- Delete some ancient history: `&` [used to be a valid
argument](a378b75#diff-5d79463faae0f7f19454c7f9888498d9f876082e258ab3efdca36a0ee64b0c87L72)
head sometime in 2012 apparently!
Copy link
Copy Markdown
Member

@topolarity topolarity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not familiar enough yet to review the details of the transform, so it would be good to get review from @Keno or @c42f w.r.t. the more intimate JuliaSyntax / JuliaLowering bits. Nonetheless I like the direction of the changes and will merge soon (today) barring feedback.

This might be too adventurous, but in a world where Expr / SyntaxTree are 1:1, it would be nice to converge to a pipeline like:

    (source text)
         |
         |
         v
[JS] RawGreenNode
         |
         v
[JS] SyntaxTree0 ----> Expr
         |  ^            |
		 |  |            |
	     |  \--(lossy)---/
         |
         v
[JL] SyntaxTree1
         |
         |
         v
 (desugaring and beyond)

which has one less "path" to Expr (so hopefully less room for unique bugs)


#-------------------------------------------------------------------------------
# Conversion from the raw parsed tree
# TODO: move to JuliaSyntax. Replace SyntaxNode?
Copy link
Copy Markdown
Member

@topolarity topolarity Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it viable to replace node_to_expr with this new pipeline + SyntaxTree -> Expr conversion?
(maybe this is already what you meant by "feed them through the same generic code"?)

Would be nice to be able to reduce some of the special conversion pathways here.

Copy link
Copy Markdown
Member Author

@mlechu mlechu Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if SyntaxTree and Expr have the same structure, then it would make sense to have node_to_expr handle both (RawGreenNode->x) if possible. They aren't the same now, though, and I think duplicating conversion code is the only way to bring them closer incrementally rather than all at once.

@topolarity topolarity force-pushed the use-kw-in-syntaxtree branch from a813d18 to 5611a24 Compare December 2, 2025 19:09
@mlechu
Copy link
Copy Markdown
Member Author

mlechu commented Dec 2, 2025

This might be too adventurous, but in a world where Expr / SyntaxTree are 1:1, it would be nice to converge to a pipeline like:

Looks about right!

@topolarity topolarity merged commit 296cca2 into JuliaLang:master Dec 2, 2025
6 of 8 checks passed
@topolarity
Copy link
Copy Markdown
Member

Nice! Looks like this fixes JuliaSyntaxHighlighting, Markdown, StyledStrings, and Profile

@mlechu Can you add those to the tests when you rebase #60170 ?

topolarity pushed a commit that referenced this pull request Dec 4, 2025
…nsion (#60170)

On top of #60162. This
simplification lets macro expansion handle the macro name as an
identifier rather than the nonterminal `K"macro_name"` node. That and
the old `MacroName` node (as well as `CmdMacroName`, `StrMacroName`) are
mainly the result of restrictions on raw parser output: we can't parse
`@Base.x` to a non-contiguous `@x`, even though that's what we want in
the AST in almost every case.
KristofferC pushed a commit that referenced this pull request Dec 18, 2025
Fix #60152, which impacted both lowering implementations.
      `remove-argument-side-effects` assumed all `kw` arguments from a
`parameters` block had already been dumped into the argument list, which
is
     not true in some cases.  In addition:

- JuliaLowering hit a MethodError in the dumped-`kw` case regardless.
There are other issues with `kw` which I'm ignoring in this PR (see
#60162)

- Delete some ancient history: `&` [used to be a valid
argument](a378b75#diff-5d79463faae0f7f19454c7f9888498d9f876082e258ab3efdca36a0ee64b0c87L72)
head sometime in 2012 apparently!

(cherry picked from commit 2be8847)
KristofferC pushed a commit that referenced this pull request Feb 4, 2026
Fix #60152, which impacted both lowering implementations.
      `remove-argument-side-effects` assumed all `kw` arguments from a
`parameters` block had already been dumped into the argument list, which
is
     not true in some cases.  In addition:

- JuliaLowering hit a MethodError in the dumped-`kw` case regardless.
There are other issues with `kw` which I'm ignoring in this PR (see
#60162)

- Delete some ancient history: `&` [used to be a valid
argument](a378b75#diff-5d79463faae0f7f19454c7f9888498d9f876082e258ab3efdca36a0ee64b0c87L72)
head sometime in 2012 apparently!

(cherry picked from commit 2be8847)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

compiler:lowering Syntax lowering (compiler front end, 2nd stage) parser Language parsing and surface syntax

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants