-
-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial Perl support and some tests #1301
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow you've been busy! Solid progress. I left a bunch of comments; happy to discuss at meet-up if helpful
In addition to the inline comments, might be worth adding tests for the other string types
packages/cursorless-vscode-e2e/suite/fixtures/recorded/languages/perl/changeArg.yml
Outdated
Show resolved
Hide resolved
packages/cursorless-vscode-e2e/suite/fixtures/recorded/languages/perl/changeClassName.yml
Outdated
Show resolved
Hide resolved
packages/cursorless-vscode-e2e/suite/fixtures/recorded/languages/perl/changeCondition.yml
Outdated
Show resolved
Hide resolved
scopeType: {type: ifStatement} | ||
usePrePhraseSnapshot: true | ||
initialState: | ||
documentContents: if ( $this_is_a_test ) { } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably worth testing "state" on this one as well once that's implemented
packages/cursorless-vscode-e2e/suite/fixtures/recorded/languages/perl/changeItem.yml
Outdated
Show resolved
Hide resolved
) { | ||
// heredoc_content does not seem to supported by tree-sitter-perl, | ||
// leaving it anyway since it won't hurt | ||
if (node.type === "string_content" || node.type === "heredoc_content") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it a valid tree-sitter node type? If not I'd drop it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically again copied from Ruby, though I'm indeed not sure if it will be needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a test like
"aaa (bbb) ccc"
'aaa (bbb) ccc'
q(aaa (bbb) ccc)
qq(aaa (bbb) ccc)
- use multiple cursors, with one cursor in middle of each
bbb
- say "clear round"
should end up with
"aaa ccc"
'aaa ccc'
q(aaa ccc)
qq(aaa ccc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added the test (had to use a failure as it wasn't working at all, and manually updated that to a hopefully passing test once the behaviour is fixed). Not yet looked whether I can fix the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed last night, string extractor reverted to default behaviour now, and test reduced to the top two ("basic") strings. Though removed for now, most of them do work, here's take round
with multi cursor on the middle b
:
But until string parsing itself is handled correctly in all cases for Perl, I don't want to deal too much with those including any failing tests for unusual string representations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd add tests for the ones that work. For the one that doesn't, are you certain that behaviour is wrong? Does perl take into account the second (
when determining where the string ends, or does it just blindly look for the next )
that it sees?
for more information, see https://pre-commit.ci
Now supports more regular expression syntaxes, and it captures all of the regular expressionas as long as there are no whitespace comments in these regular expressions (not yet supported by the parser).
for more information, see https://pre-commit.ci
Was not yet working after the tree-sitter-perl updates, now fixed to match the behaviour of the python implementation.
The implemented string parser wasn't working well enough; back to the default parser which doesn't handle all Perl strings well either, but it can at least support `round`. Future improvement is likely to happen.
Not perfect yet, some complicated constructs do not work well yet and I'm not sure yet if it is me or the parser. Here is some exotic, syntactically valid Perl: ```perl foo( $bar )->bar(); foo( $bar )->bar( $baz ); ```
For the file or a file scoped package declaration, takes the entire file. For a lexically scoped package, takes just that package. Should ideally take all of the file but not any packages declared within to be "correct", but that is not yet possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coming along nicely! Left a few comments below
packages/cursorless-vscode-e2e/src/suite/fixtures/recorded/languages/perl/changeArg4.yml
Outdated
Show resolved
Hide resolved
packages/cursorless-vscode-e2e/src/suite/fixtures/recorded/languages/perl/changeArg.yml
Show resolved
Hide resolved
packages/cursorless-vscode-e2e/src/suite/fixtures/recorded/languages/perl/changeCall4.yml
Show resolved
Hide resolved
packages/cursorless-vscode-e2e/src/suite/fixtures/recorded/languages/perl/changeClassName.yml
Show resolved
Hide resolved
@@ -0,0 +1,46 @@ | |||
languageId: perl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth adding a test from within the body of both classes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the top body it doesn't work yet (which is technically correct; the nodes following the package and name nodes are siblings, not children, and thus can't be found). I've added a test for the second one, but the first one will need matcher improvement in some way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be able to just do something like source_file.package_statement.package_name!
, similar to what you're already doing for "class" to handle this situation
packages/cursorless-vscode-e2e/src/suite/fixtures/recorded/languages/perl/changeCondition.yml
Show resolved
Hide resolved
@@ -0,0 +1,22 @@ | |||
languageId: perl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth adding a "state" test for this one as well
packages/cursorless-vscode-e2e/src/suite/fixtures/recorded/languages/perl/changeList.yml
Show resolved
Hide resolved
packages/cursorless-vscode-e2e/src/suite/fixtures/recorded/languages/perl/takeClass5.yml
Show resolved
Hide resolved
Not perfect yet, I've not been able to yet find a way to get the first argument when on the opening paren, last argument on the closing paren, or the arg left of the cursor. But behaviour much improved otherwise (respects comma's much better now, which was broken before).
Does not yet manage to find the condition for a C-style loop like ``` for ( my $i = 0; $i < 10; $i += 1 ) {} ``` when the cursor is at the keyword at the start of the line. Also, the tree-sitter-perl fails processing for loops that assign to the default variable, so can't add all the tests for that yet. Example: ``` for ( 1 .. 10 ) { say "$_" } ``` Taken a note to try and fix that or raise issues for.
Now works from the `for` keyword
tree-sitter-perl doesn't yet differentiate between LHS and RHS of assignments (lvalue and rvalue) so support for taking the name of a variable is not implemented yet.
Hash values only; rvalue not yet available in the parse tree, and have yet to work out how to return just the value node from a return value.
Can't work out how to get the full branch node selected for ternary operators, so not implemented this just yet. if/elsif/else behaviour is like other languages that support it though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely getting there! Not a ton left here; I believe my comments capture everything remaining to do. Good work!
- anchor: {line: 0, character: 16} | ||
active: {line: 0, character: 16} | ||
marks: {} | ||
thrownError: {name: NoContainingScopeError} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have mixed feelings about adding a test for behaviour that we don't actually want to preserve. any thoughts @AndreasArvidsson? some schools of thought support it; I believe it's called "characterization testing" in the literature
@@ -0,0 +1,34 @@ | |||
languageId: perl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
worth adding support for "funk name" (via functionName
scope type), and then just reusing this test with "change funk name"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both "value" tests are good, but worth adding a few more versions of value:
- return values
- right-hand side of assignments
- possibly other things; see other languages, and sometimes can be handy to see if there are any types / fields called "value" in the tree-sitter spec for your language
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Top two are still on my todo list: the 1st one I forgot to ask about in the meetings; the 2nd isn't possible with the parser atm. If Ganesan implements things like I suggested in the issue that I raised then that would become possible.
Other languages, well I don't find the tests particularly easy to read nor to even find (without extensive clicking around in the tree). Can certainly have a look at the grammar for value though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Top two are still on my todo list: the 1st one I forgot to ask about in the meetings; the 2nd isn't possible with the parser atm. If Ganesan implements things like I suggested in the issue that I raised then that would become possible.
👍. Worth linking issue here so we can track it
Other languages, well I don't find the tests particularly easy to read nor to even find (without extensive clicking around in the tree). Can certainly have a look at the grammar for value though.
That's fair. The other one that comes to mind is default value for a function parameter, if such a thing is supported. For example:
function foo(bar = 0) {}
Saying "value"
from anywhere in bar = 0
should target 0
. Also, in that case, "chuck value"
should clean up the =
I also just had a look at the typescript definitions to see if I was missing anything, and the one thing I missed is yield
statements in generators. Not sure if Perl supports such a thing. So to sum up:
- return statements,
- key-value pairs,
- default values for function parameters (if they exist)
- clean up leading
=
- clean up leading
- RHS of assignments
- clean up leading
=
(kinda extra credit, but can be useful for converting an assignment to a declaration, if such a thing makes sense 🤷♂️
- clean up leading
- yield statements, if they exist
- other things that feel like they should be values in Perl?
@@ -0,0 +1,22 @@ | |||
languageId: perl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
> = { | ||
map: "hash", | ||
list: "array", | ||
condition: matcher( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing ternaries. prob not a showstopper to ship without it; depending how common ternaries are in Perl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
worth adding some "round" tests using the q- / qq- style strings
"source_file", | ||
], | ||
className: "package_statement.package_name!", | ||
name: ["function_definition[name]", "*[key]"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- should include the matchers from
className
here as well -
"name"
should match variable name in assignment statement
className: "package_statement.package_name!", | ||
name: ["function_definition[name]", "*[key]"], | ||
value: ["*[value]"], | ||
branch: cascadingMatcher( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing ternaries. prob not a showstopper to ship without it; depending how common ternaries are in Perl
import { patternFinder } from "../util/nodeFinders"; | ||
import { branchMatcher } from "./branchMatcher"; | ||
|
||
const nodeMatchers: Partial< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing "state" (statement
)
Btw you might consider using checkboxes in the issue description as your todo list, as that way it's easy for us to stay on the same page about what's left to do |
Just wanted to check in to see where we were with this one; are we still waiting on stuff to be changed in tree-sitter-perl? Fwiw the new query-based approach to defining languages in Cursorless has now merged, so might make it easier to define some complex tree patterns if you want to leverage it. See the updated https://www.cursorless.org/docs/contributing/adding-a-new-language/#2-define-parse-tree-patterns-in-cursorless |
Checklist