You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Before 9902c7d, Ruby, Java and JavaScript would return the same error multiple times in a single parse invocation. This wasn't in line with the Go implementation: In Go, if there was an error with, e..g., a tag line, that error would be reported exactly once.
The cause
The token matcher returns a boolean value indicating whether the token is of the requested type (match_*). If the token matcher determines a token is of a specific type, it needs to return true. At the same time, while the type matches, there may be an issue parsing the token content. This happens where the line is a tagline with invalid tags or language selection tokens with non-existing languages. In order for the parser to be able to report the error after the parse, the token matcher adds any errors to the parser context before returning.
There lies the problem: a token may be given to the token matcher multiple times before it is finally accepted by the parser and added to the AST (notably due to lookahead functionality). Each time the problematic token is parsed, the error is added to the parser context by the token matcher, regardless of whether the token will end up in the AST (meaning that it's unclear if the error actually matters).
The proposed solution
In PR #152, the Perl code changes the interface contract of the token matcher: it will return 2 values. First the acceptance boolean and second a parser error if it occurred. Subsequently, the lookahead code is designed to look only at the acceptance boolean and the AST building functions match_token_at_* receive the error, adding it to the parser's error list when the token is accepted into the AST.
The design is better this way because exceptions should be used to indicate exceptional (unexpected) situations and return values should be used for "regular business". Parse errors - although being errors - are really in the "regular business" of a parser, meaning that the error should be returned from the token matcher to the parser which can deal with it as part of its regular processing.
This issue can be closed when
The community decides this is the way to go forward
The mainstream languages have been coded to follow the new pattern
Java
JavaScript/TypeScript
Ruby
(Others?)
This item was created after discussion during the community zoom meeting on 17 Aug 2023.
The text was updated successfully, but these errors were encountered:
The problem
Before 9902c7d, Ruby, Java and JavaScript would return the same error multiple times in a single
parse
invocation. This wasn't in line with the Go implementation: In Go, if there was an error with, e..g., a tag line, that error would be reported exactly once.The cause
The token matcher returns a boolean value indicating whether the token is of the requested type (
match_*
). If the token matcher determines a token is of a specific type, it needs to returntrue
. At the same time, while the type matches, there may be an issue parsing the token content. This happens where the line is a tagline with invalid tags or language selection tokens with non-existing languages. In order for the parser to be able to report the error after the parse, the token matcher adds any errors to the parser context before returning.There lies the problem: a token may be given to the token matcher multiple times before it is finally accepted by the parser and added to the AST (notably due to lookahead functionality). Each time the problematic token is parsed, the error is added to the parser context by the token matcher, regardless of whether the token will end up in the AST (meaning that it's unclear if the error actually matters).
The proposed solution
In PR #152, the Perl code changes the interface contract of the token matcher: it will return 2 values. First the acceptance boolean and second a parser error if it occurred. Subsequently, the lookahead code is designed to look only at the acceptance boolean and the AST building functions
match_token_at_*
receive the error, adding it to the parser's error list when the token is accepted into the AST.The design is better this way because exceptions should be used to indicate exceptional (unexpected) situations and return values should be used for "regular business". Parse errors - although being errors - are really in the "regular business" of a parser, meaning that the error should be returned from the token matcher to the parser which can deal with it as part of its regular processing.
This issue can be closed when
This item was created after discussion during the community zoom meeting on 17 Aug 2023.
The text was updated successfully, but these errors were encountered: