Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change AST for iterations to use iteration kind #433

Merged
merged 1 commit into from
Jul 26, 2024
Merged

Conversation

c42f
Copy link
Member

@c42f c42f commented May 16, 2024

I got inspired to implement #432 as suggested there... This should be a complete implementation of that proposed solution, but may still need experimentation/thought to decide whether Cartesian iteration should be nested or flat.


The = node which has traditionally been used for iteration specifications in for loops and generators doesn't have normal assignment semantics. Let's consider

for x in xs
    body
end

which has been parsed as (for (= x xs) (block body)).

Problems:

  • The iteration does create a binding for x, but not to the expression on the right hand side of the =.
  • The user may use in or in the source code rather than =. The parser still uses a = node for consistency but this only emphasizes that there's something a bit weird going on.

So this use of = is not assignment; merely assignment-like.

In this change, we use a new iteration kind instead of = so the for loop parses as (for (iteration x xs) (block body)) instead.

We also reuse iteration to replace the cartesian_iteration head - cartesian iteration is just an iteration with an even number of children greater than two. Being less specific here with the naming (omitting the "cartesian") seems appropriate in trying to represent the surface syntax; cartesian semantics come later in lowering and a macro may decide to do something else with the iteration spec.

These changes are also used for generators.

After the changes we have tree structures such as

julia> parsestmt(SyntaxNode, "for i in is body end")
line:col│ tree                                   │ file_name
   1:1  │[for]
   1:4  │  [iteration]
   1:5  │    i
   1:10 │    is
   1:12 │  [block]
   1:13 │    body

julia> parsestmt(SyntaxNode, "for i in is, j in js body end")
line:col│ tree                                   │ file_name
   1:1  │[for]
   1:4  │  [iteration]
   1:5  │    i
   1:10 │    is
   1:14 │    j
   1:19 │    js
   1:21 │  [block]
   1:22 │    body

julia> parsestmt(SyntaxNode, "[a for i = is, j = js if z]")
line:col│ tree                                   │ file_name
   1:1  │[comprehension]
   1:2  │  [generator]
   1:2  │    a
   1:7  │    [filter]
   1:7  │      [iteration]
   1:8  │        i
   1:12 │        is
   1:16 │        j
   1:20 │        js
   1:26 │      z

julia> parsestmt(SyntaxNode, "[a for i = is for j = js if z]")
line:col│ tree                                   │ file_name
   1:1  │[comprehension]
   1:2  │  [generator]
   1:2  │    a
   1:7  │    [iteration]
   1:8  │      i
   1:12 │      is
   1:18 │    [filter]
   1:18 │      [iteration]
   1:19 │        j
   1:23 │        js
   1:29 │      z

@c42f
Copy link
Member Author

c42f commented May 22, 2024

I managed to chat to @JeffBezanson about this and he thinks it makes sense, thanks Jeff :)

Will merge shortly I guess. Need to assess where we are at with breaking changes, the release cycle has gotten a bit blocked eek

@LilithHafner
Copy link
Member

Would this be breaking to Julia or just JuliaSyntax?

@JeffBezanson
Copy link
Member

I think just JuliaSyntax since there is a compatibility mapping to Expr.

@c42f
Copy link
Member Author

c42f commented May 23, 2024

Oh indeed.

As long as this package supports Julia 1.x we can never change parsing to Expr - except to fix bugs and any "minor changes" agreed as practically non-breaking. That one API has the same compat restrictions as Base independent of JuliaSyntax's version number.

@c42f
Copy link
Member Author

c42f commented May 23, 2024

Long term, I want some highly refined and improved version of JuliaSyntax.SyntaxNode to eventually be the standard syntax tree representation.

That's ultimately what changes like this are about: it's a long term project to refine our AST data model to make it more precise and easier to work with.

@c42f
Copy link
Member Author

c42f commented Jul 3, 2024

Writing the lowering of for, I think I can now say that this should be nested, somehow: For expression provenance purposes, I really do really want the individual clauses of a cartesian iteration to have their own nodes in the tree.

I wonder whether it'd be ok to nest these within a block. Or maybe a tuple. Neither of these are great for the purposes of semantic AST, but the surface syntax doesn't really carry strong semantics, due to it being, in some sense, an IR for macros. I vaguely feel that there are other cases where where block captures a sequence which isn't semantically a sequence of statements. Need to find these cases again and consider how they compare.

@c42f c42f mentioned this pull request Jul 23, 2024
@c42f
Copy link
Member Author

c42f commented Jul 23, 2024

Managed to chat to @JeffBezanson again about this. We agreed that K"in" could be the best option. So the new tree would be something like (mock up!):

julia> parsestmt(SyntaxNode, "for i in is, j in js body end")
line:col│ tree                                   │ file_name
   1:1  │[for]
   1:4  │  [iteration]
             [in]
   1:5  │      i
   1:10 │      is
             [in]
   1:14 │      j
   1:19 │      js
   1:21 │  [block]
   1:22 │    body

The `=` node which has traditionally been used for iteration
specifications in `for` loops and generators doesn't have normal
assignment semantics. Let's consider

    for x in xs
        body
    end

which has been parsed as `(for (= x xs) (block body))`.

Problems:
* The iteration does create a binding for `x`, but not to the expression
  on the right hand side of the `=`.
* The user may use `in` or `∈` in the source code rather than `=`.
  The parser still uses a `=` node for consistency but this only
  emphasizes that there's something a bit weird going on.

So this use of `=` is not assignment; merely assignment-like.

In this change, we use `in` instead of `=` and wrap this in an
`iteration` node so that all iteration (including over multiple
iterators) has the same structure.

Thus the `for` loop above parses as `(for (iteration (in x xs)) (block body))`
instead.

The `cartesian_iteration` head naturally becomes `iteration` instead -
being less specific here with the naming seems appropriate in trying to
represent the surface syntax; cartesian semantics come later in lowering
and a macro may decide to do something else with the iteration spec.

These changes are also used for generators.

After the changes we have tree structures such as

    julia> parsestmt(SyntaxNode, "for i in is body end")
    line:col│ tree                                   │ file_name
       1:1  │[for]
       1:4  │  [iteration]
       1:4  │    [in]
       1:5  │      i
       1:10 │      is
       1:12 │  [block]
       1:13 │    body

    julia> parsestmt(SyntaxNode, "for i in is, j in js body end")
    line:col│ tree                                   │ file_name
       1:1  │[for]
       1:4  │  [iteration]
       1:4  │    [in]
       1:5  │      i
       1:10 │      is
       1:13 │    [in]
       1:14 │      j
       1:19 │      js
       1:21 │  [block]
       1:22 │    body

    julia> parsestmt(SyntaxNode, "[a for i = is, j = js if z]")
    line:col│ tree                                   │ file_name
       1:1  │[comprehension]
       1:2  │  [generator]
       1:2  │    a
       1:7  │    [filter]
       1:7  │      [iteration]
       1:7  │        [in]
       1:8  │          i
       1:12 │          is
       1:15 │        [in]
       1:16 │          j
       1:20 │          js
       1:26 │      z

    julia> parsestmt(SyntaxNode, "[a for i = is for j = js if z]")
    line:col│ tree                                   │ file_name
       1:1  │[comprehension]
       1:2  │  [generator]
       1:2  │    a
       1:7  │    [iteration]
       1:7  │      [in]
       1:8  │        i
       1:12 │        is
       1:18 │    [filter]
       1:18 │      [iteration]
       1:18 │        [in]
       1:19 │          j
       1:23 │          js
       1:29 │      z
@c42f c42f force-pushed the caf/iteration-spec branch from ced4ffb to ddfb944 Compare July 24, 2024 21:59
Copy link

codecov bot commented Jul 24, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.51%. Comparing base (835a527) to head (ddfb944).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #433      +/-   ##
==========================================
- Coverage   96.75%   96.51%   -0.25%     
==========================================
  Files          14       14              
  Lines        4259     4249      -10     
==========================================
- Hits         4121     4101      -20     
- Misses        138      148      +10     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@c42f c42f merged commit da801cc into main Jul 26, 2024
38 checks passed
@c42f c42f deleted the caf/iteration-spec branch July 26, 2024 01:33
c42f added a commit to c42f/JuliaLowering.jl that referenced this pull request Aug 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants