Avoid redunant backtrack state updates #43

osa1 · 2022-02-07T13:47:26Z

In the Lua lexer I see code like

'>' => {
    self.0.set_accepting_state(Lexer_ACTION_13);          // 2
    match self.0.next() {
        None => {
            self.0.__done = true;
            match self.0.backtrack() {                    // 6
                ...
            }
        }
        Some(char) => match char {
            '>' => {
                self.0.reset_accepting_state();           // 12
                match Lexer_ACTION_31(self) {
                    ...
                }
            }
            '=' => {
                self.0.reset_accepting_state();           // 18
                match Lexer_ACTION_11(self) {
                    ...
                }
            }
            _ => match self.0.backtrack() {               // 23
                ...
            },
        },
    }
}

Here we don't need to set accepting state in line 2. Lines 6 and 23 call the semantic action set in line 2, and could call the function directly instead of going through backtrack. Lines 12 and 18 ignore accepting state entirely and call other semantic action functions.

I think if we move set_accepting_state calls to the branches it may become easier in the code generator to handle these cases. In that case the code above would look like:

'>' => {
    match self.0.next() {
        None => {
            self.0.set_accepting_state(Lexer_ACTION_13);       
            self.0.__done = true;
            match self.0.backtrack() {                
                ...
            }
        }
        Some(char) => match char {
            '>' => {
                self.0.set_accepting_state(Lexer_ACTION_13);       
                self.0.reset_accepting_state();      
                match Lexer_ACTION_31(self) {
                    ...
                }
            }
            '=' => {
                self.0.set_accepting_state(Lexer_ACTION_13);       
                self.0.reset_accepting_state();     
                match Lexer_ACTION_11(self) {
                    ...
                }
            }
            _ => {
                self.0.set_accepting_state(Lexer_ACTION_13);       
                match self.0.backtrack() {        
                    ...
                }
            }
        },
    }
}

Now in generate_state_arm, we can avoid generating set_accepting_state if we backtrack or call another semantic action function. We also need to use peek instead of next.

The text was updated successfully, but these errors were encountered:

osa1 · 2022-02-07T18:15:46Z

With new_from_iter it's more important to avoid clones in set_accepting_state as not all iterators will be cheap to clone. For example, ropey's char iterator is quite a bit larger than std's Chars: https://github.com/cessen/ropey/blob/36b21712ce9cdc125fa4580e573966a8bd3fd08c/src/iter.rs#L288-L296

osa1 added the perf Related to runtime performance label Feb 7, 2022

osa1 mentioned this issue Feb 7, 2022

Allow initializing lexers with a char iterator #42

Merged

osa1 changed the title ~~Avoid redunant __last_match updates~~ Avoid redunant backtrack state updates Feb 7, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Avoid redunant backtrack state updates #43

Avoid redunant backtrack state updates #43

osa1 commented Feb 7, 2022 •

edited

Loading

osa1 commented Feb 7, 2022

Avoid redunant backtrack state updates #43

Avoid redunant backtrack state updates #43

Comments

osa1 commented Feb 7, 2022 • edited Loading

osa1 commented Feb 7, 2022

osa1 commented Feb 7, 2022 •

edited

Loading