Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow macrocall in function def syntax #456

Merged
merged 4 commits into from
Aug 2, 2024
Merged

Allow macrocall in function def syntax #456

merged 4 commits into from
Aug 2, 2024

Conversation

Keno
Copy link
Member

@Keno Keno commented Jul 8, 2024

@c42f
Copy link
Member

c42f commented Jul 17, 2024

So apparently this already became allowed since 1.10 by defaulting to JuliaSyntax and we just didn't know that this case was different in the flisp parser?

In that case I think we should lean into it and drop the min_supported_version to 1.10.

Copy link
Member

@c42f c42f left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the upstream discussion this seems fine (including the min_supported_version) but it needs tests in tests/parser.jl.

Should cover any weird cases we want to work (or not) as discussed upstream.

@Keno
Copy link
Member Author

Keno commented Jul 25, 2024

Tests added

@Keno Keno force-pushed the kf/macrocallsig branch from 31e8ef8 to 8543f48 Compare July 26, 2024 00:47
@Keno
Copy link
Member Author

Keno commented Jul 26, 2024

Updated to match the revised logic upstream, though I had to do a tiny bit of refactoring in parse_atom also to treat operators in atom position as identifiers to match what upstream does. Could be replaced by an explicit check for operators instead if desired.

Copy link
Member

@c42f c42f left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change to bump operators in atom position as K"Identifier" is good. I've been wanting to do this for a while!

It's a breaking change at the SyntaxNode AST level, but that's fine. The fact that you didn't need to change the tests shows that we should have a lot more tests for token kinds...

Comment on lines +3501 to +3505
if is_dotted(peek_token(ps))
bump_dotsplit(ps, emit_dot_node=true)
else
bump(ps, remap_kind=K"Identifier")
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bump_dotsplit has a remap_kind argument

Suggested change
if is_dotted(peek_token(ps))
bump_dotsplit(ps, emit_dot_node=true)
else
bump(ps, remap_kind=K"Identifier")
end
bump_dotsplit(ps, emit_dot_node=true, remap_kind=K"Identifier")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't work, because that also remaps the dotted version, which is undesirable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it works for SyntaxNode but then there's an error in Expr conversion. So we can leave that for now, I'll open a separate issue for it. But we should be consistent.

Why did you need to have bump(ps, remap_kind=K"Identifier") here anyway?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you need to have bump(ps, remap_kind=K"Identifier") here anyway?

Just to avoid having an explit is_operator check on the sig_kind in KSet"Identifier var $". It also seems odd that e.g. isa gets turned into Identifiers, but + does not. I'm happy with either way though.

src/parser.jl Outdated Show resolved Hide resolved
Co-authored-by: Claire Foster <[email protected]>
@Keno
Copy link
Member Author

Keno commented Jul 28, 2024

@c42f are we good to go here? What else is needed? (just trying to cut down on my excessive in-flight PR load)

@c42f
Copy link
Member

c42f commented Jul 30, 2024

I pushed one extra test case.

Disambiguating all the function argument cases is one of the most subtle parts of the parser. I guess we want macro call and normal call to act exactly the same, syntactically?

In that case, the following divergence is unfortunate and maybe should be fixed (see logic in was_eventually_call()):

Function signature inside parentheses:

julia> parsestmt(SyntaxNode, "function (f(x)) body end")
line:col│ tree                                   │ file_name
   1:1  │[function]
   1:11 │  [call]
   1:11 │    f
   1:13 │    x
   1:16 │  [block]
   1:17 │    body

Vs macro call around the first (only) argument of an anonymous function:

julia> parsestmt(SyntaxNode, "function (@f(x)) body end")
line:col│ tree                                   │ file_name
   1:1  │[function]
   1:10 │  [tuple-p]
   1:11 │    [macrocall-p]
   1:12@f
   1:14 │      x
   1:17 │  [block]
   1:18 │    body

Alas both of these already parsed in 1.10 so changing this obscure case would technically be breaking.

@Keno
Copy link
Member Author

Keno commented Jul 30, 2024

I agree these are unfortunate

parsestmt(SyntaxNode, "function (f(x)) body end")

I think this is a strict-mode candidate for disallowance.

function (@f(x)) body end

This just needs to remain an anonymous function because @nospecialize is of this form, so it's commonly used.

@c42f
Copy link
Member

c42f commented Jul 31, 2024

I think this is a strict-mode candidate for disallowance.

Yes I've started adding comments "Syntax Edition TODO: ..." to the JuliaSyntax source for each dubious syntax which we should disallow.

For syntax I'd like to really push toward the idea of an ongoing sequence of Syntax Editions rather than a single "strict mode".

This just needs to remain an anonymous function because @nospecialize is of this form

Let's leave this as-is, then. I agree this is the better solution if we can eventually delete the form which currently "works" for normal call nodes.

@Keno
Copy link
Member Author

Keno commented Jul 31, 2024

parsestmt(SyntaxNode, "function (@f(x)) body end")

Turns out this one is a syntax error with the flisp parser (but not JuliaSyntax). I think we should call that a JuliaSyntax bug and keep it a syntax error both before and after this PR.

julia> function (@foo(a)) end
ERROR: syntax: ambiguous signature in function definition. Try adding a comma if this is a 1-argument anonymous function.
Stacktrace:
 [1] top-level scope
   @ REPL:1

@Keno
Copy link
Member Author

Keno commented Jul 31, 2024

I adjusted upstream to not accidentally allow this, but @c42f could I trouble you to implement the syntax error for that here? I don't think I know JuliaSyntax well enough to make sure I get this right.

@c42f
Copy link
Member

c42f commented Aug 1, 2024

I'm happy to adjust things!

But the old "ambiguous signature" error just feels very ad hoc to me.

The old logic is in valid-func-sig? here

https://github.com/JuliaLang/julia/blob/19165bec4a3e2ed0de4f2f43bccc3fcac2d76662/src/julia-parser.scm#L1330

And this predicate is used in exactly one place here

https://github.com/JuliaLang/julia/blob/19165bec4a3e2ed0de4f2f43bccc3fcac2d76662/src/julia-parser.scm#L1493

This leads to parsing inconsistencies such as:

julia> JuliaSyntax.fl_parse("function f(x[]) body end")
:(function f(x[])
      #= none:1 =#
      #= none:1 =#
      body
  end)

julia> JuliaSyntax.fl_parse("function (x[]) body end")
ERROR: ParseError("ambiguous signature in function definition. Try adding a comma if this is a 1-argument anonymous function.")

Whereas JuliaSyntax currently just parses these the same: they're both one-argument functions.

What if I add the ambiguity error back, but only when there's a macro call?

@c42f
Copy link
Member

c42f commented Aug 1, 2024

I believe the "correct" / strict / consistent version of this ambiguity warning would be to require that all one-argument anonymous functions have a trailing comma. But that's breaking :-(

Copy link

codecov bot commented Aug 2, 2024

Codecov Report

Attention: Patch coverage is 90.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 96.67%. Comparing base (ae7d6ac) to head (7613256).
Report is 21 commits behind head on main.

Files Patch % Lines
src/parser.jl 90.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #456      +/-   ##
==========================================
+ Coverage   96.43%   96.67%   +0.24%     
==========================================
  Files          14       14              
  Lines        4208     4275      +67     
==========================================
+ Hits         4058     4133      +75     
+ Misses        150      142       -8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@c42f c42f force-pushed the kf/macrocallsig branch 2 times, most recently from cc303f5 to 398ddf2 Compare August 2, 2024 04:33
@c42f
Copy link
Member

c42f commented Aug 2, 2024

I've put in a targeted ambiguity warning. As explained in the commit message, I've made two very specific forms

function (@f(x))
    body
end

function ($f)
    body
end

into an ambiguity error. I think these are the only truly ambiguous cases as the parser doesn't know how the interpolated syntax will expand. It could be either of these:

function (@f(x),)
    body
end

function @f(x)
    body
end

As part of this, I noticed that the parsing of the unusual form function (f(x),) end was buggy and didn't emit a 1-arg tuple, so I also fixed that. Disambiguating anonymous vs named functions is a gift that keeps giving 😢 😆

This case is ambiguous as it might be either one of the following;
require the user to explicitly disambiguate between them

```
function (@f(x),)
    body
end

function @f(x)
    body
end
```

For the same reasons, `function ($f) body end` is also ambiguous.

Also fix parsing of `function (f(x),) end` to correctly emit a tuple.
@c42f c42f force-pushed the kf/macrocallsig branch from 398ddf2 to 6f62399 Compare August 2, 2024 04:37
@Keno
Copy link
Member Author

Keno commented Aug 2, 2024

Alright, sounds good to me.

Copy link
Member

@c42f c42f left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good to me, merge when ready.

@Keno Keno merged commit 0cd5642 into main Aug 2, 2024
36 checks passed
@Keno Keno deleted the kf/macrocallsig branch August 2, 2024 15:22
@Keno
Copy link
Member Author

Keno commented Aug 2, 2024

@c42f What's the procedure for bumping in base? tag here and then bump as a dep?

@c42f
Copy link
Member

c42f commented Aug 4, 2024

@Keno ok it's tricky, because JuliaSyntax is, in a sense versioning two APIs with one version number.

We need to respect JuliaSyntax's semver for this so next release will be 1.0 (see #472) but there's further breaking changes we need to make there. In particular, #474 is a logical consequence of the breaking changes to the AST which you made in this PR, and we need to follow through with that before 1.0.


Ok I've made an issue to talk through this whole mess :-/

#481

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants