Robustify - Improved/Extended Lexing Behavior & Error Handling
Pre-release
This patch was authored and released by @tdotclare.
Major Changes
- Converts `LeafError` from Enum to source-reporting Struct
- Adjusts `Lexer` state change table for human clarity
- Adds `Lexer` robustness for parameters with no whitespace boundaries
- Adds `Lexer` parameter handling for bin/oct/hex `constantInt`s and `hexDouble`s
- Adds `Lexer` internal methods for additional checking peek/pop sequences
- Makes `Parameter` depth a state variable in `Lexer` instead of an enum property
- Adjusts `Character` exts for additional granularity on token types
- Adjusts `Character` exts for start/body validity of token types
- Adds `Character` exts for bin/oct/hex numerics
- Adjusts `Parser` behavior to allow decaying `tagBodyIndicator`s to `raw` when tag is known to have no body
- Adjusts `Parser` to allow replacing `Lexed` tokens when necessary for above
- Re-enables a number of `TestCase` functions from Leaf3 and adjusts for Leaf4 syntax
Problems Solved
Better/clearer error handling properties
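As a rough illustration of what moving from an enum to a source-reporting struct can look like, here is a minimal sketch. Everything in it (`SketchError`, `Reason`, the captured `file`/`function`/`line` fields) is hypothetical and is not `LeafError`'s actual definition; it only shows the general pattern of wrapping the failure reason in a struct that also records where it was raised.

```swift
// Hypothetical sketch; names are illustrative and are not Leaf's actual API.
struct SketchError: Error, CustomStringConvertible {
    // The failure categories an enum-only error would have carried.
    enum Reason: Equatable {
        case unknownToken(String)
        case unterminatedParameters
    }

    let reason: Reason
    // Where the error was raised, captured automatically at the call site.
    let file: String
    let function: String
    let line: UInt

    init(_ reason: Reason,
         file: String = #file,
         function: String = #function,
         line: UInt = #line) {
        self.reason = reason
        self.file = file
        self.function = function
        self.line = line
    }

    var description: String {
        // Keep only the last path component so messages stay short.
        let name = file.split(separator: "/").last.map(String.init) ?? file
        return "\(name).\(function):\(line) - \(reason)"
    }
}
```

A call site would then just write `throw SketchError(.unterminatedParameters)` and the source location comes along for free.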
Much easier to follow state changes during lexing
In most cases `Parameter` processing is still complicated, but other cases are clearly handled
Better `Parameter` handling with whitespace
E.g., the varying inputs below now all lex to the same, correct interpretation. Previously, the first three lexed incorrectly, and only the fourth correctly lexed the parameters to:
operator(not) variable(one) operator(||) operator(not) variable(two)
"#if(!one||!two)"
"#if(!one || !two)"
"#if(! one||! two)"
"#if(! one || ! two)"
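The idea of not relying on whitespace as a token boundary can be sketched with a toy lexer. This is not Leaf's `Lexer`: `SketchToken` and `sketchLex` are hypothetical names, the function only takes the text between the parentheses, and it only understands `!`, `||`, and identifiers.

```swift
// Hypothetical token type mirroring the notation used above.
enum SketchToken: Equatable, CustomStringConvertible {
    case op(String)
    case variable(String)

    var description: String {
        switch self {
        case .op(let symbol): return "operator(\(symbol))"
        case .variable(let name): return "variable(\(name))"
        }
    }
}

// Lexes a tiny subset of parameter syntax: `!`, `||`, and identifiers.
// Whitespace is skipped wherever it appears, so it is never a required boundary.
func sketchLex(_ parameters: String) -> [SketchToken] {
    var tokens: [SketchToken] = []
    let chars = Array(parameters)
    var i = 0
    while i < chars.count {
        let c = chars[i]
        if c == " " {
            i += 1
        } else if c == "!" {
            tokens.append(.op("not"))
            i += 1
        } else if c == "|", i + 1 < chars.count, chars[i + 1] == "|" {
            tokens.append(.op("||"))
            i += 2
        } else if c.isLetter || c == "_" {
            // Consume the identifier up to the first character that cannot continue it.
            var name = ""
            while i < chars.count, chars[i].isLetter || chars[i].isNumber || chars[i] == "_" {
                name.append(chars[i])
                i += 1
            }
            tokens.append(.variable(name))
        } else {
            // Anything else is outside the scope of this sketch; skip it.
            i += 1
        }
    }
    return tokens
}

// All four spellings produce the same token sequence.
for input in ["!one||!two", "!one || !two", "! one||! two", "! one || ! two"] {
    print(sketchLex(input).map { $0.description }.joined(separator: " "))
}
```

Because an operator ends as soon as its characters are consumed and an identifier ends at the first character that cannot continue it, spaces become optional rather than structural.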
Better handling of `tagBodyIndicator`
Previously, syntax like `#(index):#(value)` would error because the colon was universally assumed to indicate the start of a body. Now, in cases where it's impossible for a tag to take a body (e.g., anonymous functions, for now), the `tagBodyIndicator` is mutated back to a raw colon. The next step in improving this is to add observers on tags and built-in control structures so that parsing can inquire about the expected state (e.g., a function may take two parameters and no body, or one parameter and a body, and both are acceptable).
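The decay step described above might be pictured roughly as follows. This is a sketch under stated assumptions, not the actual `Parser` code: `SketchLexeme`, `canTakeBody(tagNamed:)`, and `decayBodyIndicators(_:)` are hypothetical, and an empty tag name is used here to stand for an anonymous tag.

```swift
// Hypothetical token model; Leaf's real token types differ.
enum SketchLexeme: Equatable {
    case tag(name: String)      // `#name`; an empty name stands for an anonymous tag `#(...)`
    case parameters(String)     // raw text between the parentheses
    case tagBodyIndicator       // a `:` following a tag
    case raw(String)            // literal template text
}

// Assumption for this sketch: only anonymous tags are known to never take a body.
func canTakeBody(tagNamed name: String) -> Bool {
    return !name.isEmpty
}

// Replaces a body indicator with a raw ":" when the tag before it cannot take a body.
func decayBodyIndicators(_ tokens: [SketchLexeme]) -> [SketchLexeme] {
    var result: [SketchLexeme] = []
    var currentTag: String? = nil
    for token in tokens {
        switch token {
        case .tag(let name):
            currentTag = name
            result.append(token)
        case .parameters:
            // Parameters belong to the current tag, so keep tracking it.
            result.append(token)
        case .tagBodyIndicator:
            if let name = currentTag, !canTakeBody(tagNamed: name) {
                result.append(.raw(":"))   // decay: the colon is just text
            } else {
                result.append(token)
            }
            currentTag = nil
        case .raw:
            currentTag = nil
            result.append(token)
        }
    }
    return result
}
```

With this shape, the tokens for `#(index):#(value)` keep both anonymous tags and turn the colon between them into `.raw(":")`, while a named tag such as a loop keeps its body indicator.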
Improved syntax options for constant numerics
You can now specify bin/oct/hex `Int` and hex `Double` constants using Swift-style literals, e.g.:
0b1111 // Constant Int
1_000_000 // Constant Int
0x0.50 // Constant Double
Why? Why not?
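For readers curious how prefixed integer literals like the ones above can be interpreted, here is a minimal sketch. `sketchParseInt` is a hypothetical helper, not the `Lexer`'s implementation; it covers only `Int` literals (not hex `Double`s) and leans on the standard library's `Int(_:radix:)` initializer.

```swift
// Hypothetical helper: parses an integer literal with an optional 0b/0o/0x prefix
// and `_` digit separators. Returns nil when the text is not a valid literal.
func sketchParseInt(_ literal: String) -> Int? {
    var digits = Substring(literal)
    var radix = 10
    // Pick the radix from the prefix, if there is one, and drop the prefix.
    for (prefix, base) in [("0b", 2), ("0o", 8), ("0x", 16)] where literal.hasPrefix(prefix) {
        digits = literal.dropFirst(2)
        radix = base
    }
    // A separator may not come first, and there must be at least one digit.
    guard let first = digits.first, first != "_" else { return nil }
    // Strip the separators before handing the digits to the standard library.
    var cleaned = ""
    for character in digits where character != "_" {
        cleaned.append(character)
    }
    return Int(cleaned, radix: radix)
}

print(sketchParseInt("0b1111") as Any)     // binary literal
print(sketchParseInt("1_000_000") as Any)  // decimal literal with separators
```

Invalid digits for the chosen radix (for example `0b2`) simply yield `nil`, since `Int(_:radix:)` rejects them.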