Robustify - Improved/Extended Lexing Behavior & Error Handling

Pre-release
@tanner0101 tanner0101 released this 10 May 22:48
0ab66c9
This patch was authored and released by @tdotclare.

Major Changes

  • Converts LeafError from Enum to source-reporting Struct
  • Adjusts Lexer state change table for human clarity
  • Adds Lexer robustness for parameters with no whitespace boundaries
  • Adds Lexer parameter handling for bin/oct/hex constant Ints and hex Doubles
  • Adds Lexer internal methods for additional checking peek/pop sequences
  • Makes Parameter depth a state variable in Lexer instead of an enum property
  • Adjusts Character exts for additional granularity on token types
  • Adjusts Character exts for start/body validity of token types
  • Adds Character exts for bin/oct/hex numerics
  • Adjusts Parser behavior to allow decaying tagBodyIndicators to raw when tag is known to have no body
  • Adjusts Parser to allow replacing Lexed tokens when necessary for above
  • Re-enables a number of TestCase functions from Leaf3 and adjusts for Leaf4 syntax

Problems Solved

Better/clearer error handling properties
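
As a rough illustration of what a "source-reporting" error struct can look like, here is a minimal sketch. The type name `SketchLeafError`, its `Reason` cases, and its fields are hypothetical and chosen for illustration; they are not taken from Leaf's actual `LeafError` definition.

```swift
// Hypothetical sketch: a struct-based error that records where it was raised.
// None of these names are Leaf's real API.
struct SketchLeafError: Error, CustomStringConvertible {
    // The former enum cases can live on as a nested "reason" (assumed cases).
    enum Reason {
        case unknownError(String)
        case unterminatedParameters
    }

    let reason: Reason
    let file: String
    let function: String
    let line: UInt

    // Default arguments capture the call site automatically.
    init(_ reason: Reason,
         file: String = #file,
         function: String = #function,
         line: UInt = #line) {
        self.reason = reason
        self.file = file
        self.function = function
        self.line = line
    }

    var description: String {
        "\(file):\(line) in \(function): \(reason)"
    }
}
```

With a shape like this, a throw site only states the reason, and the file, function, and line come along without extra work.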

Much easier to follow state changes during lexing in most cases

Parameter processing is still complicated, but the other cases are clearly handled

Better Parameter handling with whitespace

For example, the varying inputs below now all lex to the same, correct interpretation. Previously, the first three lexed incorrectly, and only the fourth correctly lexed its parameters to
operator(not) variable(one) operator(||) operator(not) variable(two)

"#if(!one||!two)"
"#if(!one || !two)"
"#if(! one||! two)"
"#if(! one || ! two)"

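The whitespace independence described above can be sketched with a tiny tokenizer. This is a simplified illustration, not Leaf's lexer: `ParamSketchToken` and `sketchLexParameters` are hypothetical names, the input is assumed to be only the text between the parentheses, and only `!`, `||`, and identifiers are recognized.

```swift
// Hypothetical sketch: tokens are found by character class, so whitespace
// between operators and operands is optional.
enum ParamSketchToken: Equatable {
    case op(String)
    case variable(String)
}

func sketchLexParameters(_ input: String) -> [ParamSketchToken] {
    var tokens: [ParamSketchToken] = []
    let chars = Array(input)
    var i = 0
    while i < chars.count {
        let c = chars[i]
        if c.isLetter || c == "_" {
            // Consume a whole identifier: a start character, then body characters.
            var j = i
            while j < chars.count, chars[j].isLetter || chars[j].isNumber || chars[j] == "_" {
                j += 1
            }
            tokens.append(.variable(String(chars[i..<j])))
            i = j
        } else if c == "|", i + 1 < chars.count, chars[i + 1] == "|" {
            // Peek one character ahead to recognize the two-character operator.
            tokens.append(.op("||"))
            i += 2
        } else if c == "!" {
            tokens.append(.op("not"))
            i += 1
        } else {
            // Whitespace and anything this sketch does not model is skipped.
            i += 1
        }
    }
    return tokens
}
```

Under these assumptions, `"!one||!two"` and `"! one || ! two"` produce the same five tokens.
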
Better handling of tagBodyIndicator

Previously, syntax like #(index):#(value) would error because the colon was universally assumed to indicate the start of a body. Now, in cases where it is impossible for a tag to take a body (e.g., anonymous functions, for now), the tagBodyIndicator is mutated back to a raw colon. The next step in improving this is to add observers on tags and built-in control structures so that parsing can inquire about the expected state (e.g., a function may take two parameters and no body, or one parameter and a body, and both are acceptable).
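
The decay step can be sketched roughly as follows. The token enum, the function name `decayBodyIndicators`, and the `canHaveBody` callback are all hypothetical stand-ins, and an anonymous tag is modeled here as a tag with an empty name; Leaf's real token and parser types differ.

```swift
// Hypothetical sketch: replace a body indicator with a raw colon when the
// preceding tag is known not to take a body.
enum BodySketchToken: Equatable {
    case tag(name: String)      // an empty name stands for an anonymous tag
    case parameters(String)
    case tagBodyIndicator
    case raw(String)
}

func decayBodyIndicators(_ tokens: [BodySketchToken],
                         canHaveBody: (String) -> Bool) -> [BodySketchToken] {
    var result = tokens
    var currentTag: String? = nil
    for (index, token) in tokens.enumerated() {
        switch token {
        case .tag(let name):
            currentTag = name
        case .parameters:
            break                       // parameters belong to the current tag
        case .tagBodyIndicator:
            if let name = currentTag, !canHaveBody(name) {
                result[index] = .raw(":")   // decay to a plain colon
            }
            currentTag = nil
        case .raw:
            currentTag = nil
        }
    }
    return result
}
```

Passing the body rule in as a closure keeps the sketch independent of how tags are registered, which is the part the observers mentioned above would eventually answer.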

Improved syntax options for constant numerics

You can now specify bin/oct/hex Int and hex Double constants using Swift-style literals, e.g.

0b1111 // Constant Int
1_000_000 // Constant Int
0x0.50 // Constant Double
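
For reference, literals of this shape can be interpreted roughly as in the sketch below. The function names are hypothetical and the parsing rules are an assumption about the intended values (underscores are ignored, hex fraction digits are successive negative powers of 16); this is not Leaf's lexer code.

```swift
// Hypothetical sketch: interpret prefixed Int literals and hex Double literals.
func sketchParseInt(_ literal: String) -> Int? {
    let digits = literal.filter { $0 != "_" }   // drop digit separators
    if digits.hasPrefix("0b") { return Int(digits.dropFirst(2), radix: 2) }
    if digits.hasPrefix("0o") { return Int(digits.dropFirst(2), radix: 8) }
    if digits.hasPrefix("0x") { return Int(digits.dropFirst(2), radix: 16) }
    return Int(digits)
}

func sketchParseHexDouble(_ literal: String) -> Double? {
    let digits = literal.filter { $0 != "_" }
    guard digits.hasPrefix("0x") else { return nil }
    let parts = digits.dropFirst(2).split(separator: ".", omittingEmptySubsequences: false)
    guard parts.count == 2, let whole = Int(parts[0], radix: 16) else { return nil }
    var value = Double(whole)
    var scale = 1.0 / 16.0
    for ch in parts[1] {
        // Each fraction digit is worth one further power of 1/16.
        guard let digit = ch.hexDigitValue else { return nil }
        value += Double(digit) * scale
        scale /= 16.0
    }
    return value
}
```

Under these rules, `0b1111` reads as 15, `1_000_000` as 1000000, and `0x0.50` as 5/16 = 0.3125.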

Why? Why not?