Better error reporting. #467
Comments
I'm experiencing the same issue with error messages. The message always seems to report the top level rule, even if part of the rule is matched correctly and the actual error is at some other place. For example, I was using the following grammar for testing:
And then tried the following input:
The problem with that input is that the grammar is expecting
But instead I got:
This error is very misleading: the first few tokens of the rule are actually correct, and the failure comes when the parser finds
Follow-up:
So, it looks like the errors are always
Hi! Some examples: I'm leaving a link to our simplified SQL grammar, though it is probably not needed to understand the examples.
It seems to me that the main problem with errors is that we don't track the tokens our grammar expects to see at the farthest error position, and that library users can't interact with the sequence of rules that were applied to parse the input. My proposal is to tinker with the logic in the
Later we can pass this information into
Please see the proposed changes in the MR.
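To illustrate the idea of tracking what the grammar expected at the farthest error position, here is a minimal standalone sketch. It does not use pest or reflect its internals: `FailureTracker`, `parse_query`, and the toy token-level grammar (`query = select_stmt | delete_stmt`) are hypothetical names made up for this example.

```rust
/// Remembers the farthest token index at which any rule failed,
/// together with everything that was expected at that index.
#[derive(Debug, Default)]
struct FailureTracker {
    farthest: usize,
    expected: Vec<&'static str>,
}

impl FailureTracker {
    fn record(&mut self, pos: usize, what: &'static str) {
        if pos > self.farthest {
            // A failure further into the input supersedes earlier ones.
            self.farthest = pos;
            self.expected.clear();
        }
        if pos == self.farthest && !self.expected.contains(&what) {
            self.expected.push(what);
        }
    }
}

const KEYWORDS: [&str; 3] = ["select", "delete", "from"];

/// Matches one exact keyword token; records a failure otherwise.
fn keyword(tokens: &[&str], pos: usize, kw: &'static str, t: &mut FailureTracker) -> Option<usize> {
    if tokens.get(pos).copied() == Some(kw) {
        Some(pos + 1)
    } else {
        t.record(pos, kw);
        None
    }
}

/// Matches one alphabetic token that is not a keyword.
fn ident(tokens: &[&str], pos: usize, t: &mut FailureTracker) -> Option<usize> {
    match tokens.get(pos) {
        Some(tok)
            if !tok.is_empty()
                && tok.chars().all(|c| c.is_ascii_alphabetic())
                && !KEYWORDS.contains(tok) =>
        {
            Some(pos + 1)
        }
        _ => {
            t.record(pos, "identifier");
            None
        }
    }
}

// select_stmt = "select" ident "from" ident
fn select_stmt(tokens: &[&str], t: &mut FailureTracker) -> Option<usize> {
    let p = keyword(tokens, 0, "select", t)?;
    let p = ident(tokens, p, t)?;
    let p = keyword(tokens, p, "from", t)?;
    ident(tokens, p, t)
}

// delete_stmt = "delete" "from" ident
fn delete_stmt(tokens: &[&str], t: &mut FailureTracker) -> Option<usize> {
    let p = keyword(tokens, 0, "delete", t)?;
    let p = keyword(tokens, p, "from", t)?;
    ident(tokens, p, t)
}

// query = select_stmt | delete_stmt (ordered choice, whitespace-separated tokens)
fn parse_query(input: &str) -> Result<(), String> {
    let tokens: Vec<&str> = input.split_whitespace().collect();
    let mut t = FailureTracker::default();
    let end = select_stmt(&tokens, &mut t).or_else(|| delete_stmt(&tokens, &mut t));
    match end {
        Some(p) if p == tokens.len() => Ok(()),
        Some(p) => Err(format!("token {}: expected end of input", p)),
        // Report the deepest failure instead of "expected query".
        None => Err(format!("token {}: expected {}", t.farthest, t.expected.join(" or "))),
    }
}
```

With this sketch, `parse_query("select a frm b")` reports the failure at token 2 (expected `from`) rather than at the start of the top-level rule, because the failed `delete_stmt` alternative at token 0 does not overwrite the deeper failure.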
I believe this issue could be solved by the introduction of a "cut" or "trap" feature in Pest, as described here: #934
In a PEG parser, a failure in a rule isn't necessarily fatal: it just means that the parser will try a different alternative, popping up a level if need be. If no matching alternative can be found, the whole parse fails. Unfortunately, this means that a failure in a deeply nested rule won't be considered a failure of that specific rule, but rather a failure of the grammar as a whole. You therefore get an error at the top level, which isn't much help to the user.
The solution is a way to tell the parser to "commit" to the current alternative. If you see a pattern like "if (", you know it's an if-statement, and there's no need to check alternate rules. Even if the input tokens afterwards are garbage, like "if (%!__", we know this is a bad if-statement, not some other kind of statement.
In Prolog, this operation is called "cut". It's a token placed within the grammar that means "once you see this token, commit to this branch and don't try any other branches. If you subsequently see an error, then the error is here, not in some parent rule."
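As a rough illustration of the "cut" idea, the sketch below hand-writes a tiny ordered-choice parser in plain Rust. It is not pest syntax or pest's API: the `Failure` enum, the `cut` helper, and the toy rules (`statement = if_stmt | expr_stmt`) are assumptions made up for this example. A soft failure lets the caller try the next alternative; a failure after the cut point is reported where it happened.

```rust
#[derive(Debug, PartialEq)]
enum Failure {
    /// Soft failure: the caller may try the next alternative.
    Backtrack,
    /// Hard failure after a cut: stop and report this message.
    Committed(String),
}

/// On success, a rule returns the remaining input.
type ParseResult<'a> = Result<&'a str, Failure>;

/// Matches a literal after skipping leading whitespace.
fn literal<'a>(input: &'a str, lit: &str) -> ParseResult<'a> {
    input.trim_start().strip_prefix(lit).ok_or(Failure::Backtrack)
}

/// Matches one or more ASCII alphanumeric characters.
fn condition(input: &str) -> ParseResult<'_> {
    let s = input.trim_start();
    // Only ASCII characters are counted, so the char count equals the byte offset.
    let n = s.chars().take_while(|c| c.is_ascii_alphanumeric()).count();
    if n == 0 { Err(Failure::Backtrack) } else { Ok(&s[n..]) }
}

/// The "cut": turns a soft failure into a committed one that names the rule.
fn cut<'a>(r: ParseResult<'a>, rule: &str, expected: &str) -> ParseResult<'a> {
    r.map_err(|e| match e {
        Failure::Backtrack => Failure::Committed(format!("in {}: expected {}", rule, expected)),
        committed => committed,
    })
}

// if_stmt = "if" "(" CUT condition ")"
fn if_stmt(input: &str) -> ParseResult<'_> {
    let rest = literal(input, "if")?;
    let rest = literal(rest, "(")?;
    // After "if (" we are committed: failures below are no longer soft.
    let rest = cut(condition(rest), "if_stmt", "condition")?;
    cut(literal(rest, ")"), "if_stmt", "\")\"")
}

// expr_stmt = condition ";"
fn expr_stmt(input: &str) -> ParseResult<'_> {
    let rest = condition(input)?;
    literal(rest, ";")
}

// statement = if_stmt | expr_stmt
fn statement(input: &str) -> Result<(), String> {
    let result = match if_stmt(input) {
        // Only a soft failure allows trying the next alternative.
        Err(Failure::Backtrack) => expr_stmt(input),
        other => other,
    };
    match result {
        Ok(_) => Ok(()),
        Err(Failure::Committed(msg)) => Err(msg),
        Err(Failure::Backtrack) => Err("expected statement".to_string()),
    }
}
```

Here `statement("if (%!__")` fails with an error attributed to `if_stmt`, while input that never reaches the cut point, such as `"%"`, still falls back to the generic top-level message.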
I played with the online editor and wasn't very happy with the error messages.
It seems to report only the name of the root rule:
expected bar
, while I would expect to see a rather more descriptive message: