Parsing into a flat AST / Side effects in actions #386

Open
DrSloth opened this issue Oct 16, 2024 · 3 comments


DrSloth commented Oct 16, 2024

Hello, I am investigating peg to parse the grammar of a programming language I am working on.

Is there any way to store the generated AST in a flat Vec instead of building it up with boxes?
The documentation says that Rust actions must be deterministic and must not have side effects. Is a flat Vec impossible because of this? Are there any "workarounds"?

Thanks in advance and have a nice day :)

@kevinmehall
Owner

There are two places where this comes up:

PEG involves backtracking: if a rule fails, the parser throws away that rule's result, backs up, and tries the next alternative of a / (ordered choice) or similar. So you may see actions called that don't end up contributing to the final result. You can avoid this by structuring your grammar so that no side-effecting function is called on a path that later contains a fallible expression, but that won't be statically checked. Alternatively, it's also fine if the object you return from the action undoes the side effect on drop (like Box freeing its memory), or if you accept unneeded items in your Vec.
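To make the backtracking point concrete, here is a minimal hand-written sketch of an ordered choice that pushes into an arena. It is not rust-peg's generated code; `neg_bang`, `neg_plain`, `choice`, and the `rollback` flag are hypothetical names used only for illustration. The first alternative pushes nodes before its last fallible step, so when it fails the nodes stay behind unless the arena is truncated back to a checkpoint.

```rust
/// Flat AST node; children are referred to by index into the arena.
#[derive(Debug, PartialEq)]
enum Node {
    Num(i64),
    Neg(usize),
}

/// Hypothetical alternative 1: matches "-<digit>!".
/// It pushes nodes before its final fallible check.
fn neg_bang(input: &str, arena: &mut Vec<Node>) -> Option<usize> {
    let rest = input.strip_prefix('-')?;
    let digit = rest.chars().next()?.to_digit(10)?;
    arena.push(Node::Num(digit as i64));
    let inner = arena.len() - 1;
    arena.push(Node::Neg(inner));
    // Fallible step after the side effects: a trailing "!" is required.
    if &rest[1..] == "!" { Some(arena.len() - 1) } else { None }
}

/// Hypothetical alternative 2: matches exactly "-<digit>".
/// All fallible checks happen before anything is pushed.
fn neg_plain(input: &str, arena: &mut Vec<Node>) -> Option<usize> {
    let rest = input.strip_prefix('-')?;
    let mut chars = rest.chars();
    let digit = chars.next()?.to_digit(10)?;
    if chars.next().is_some() {
        return None;
    }
    arena.push(Node::Num(digit as i64));
    let inner = arena.len() - 1;
    arena.push(Node::Neg(inner));
    Some(arena.len() - 1)
}

/// Ordered choice `neg_bang / neg_plain`.
/// With `rollback` set, nodes pushed by the failed alternative are discarded.
fn choice(input: &str, arena: &mut Vec<Node>, rollback: bool) -> Option<usize> {
    let checkpoint = arena.len();
    if let Some(id) = neg_bang(input, arena) {
        return Some(id);
    }
    if rollback {
        arena.truncate(checkpoint);
    }
    neg_plain(input, arena)
}
```

For the input "-5", the version without rollback leaves two orphaned nodes in the arena; the returned index still points at a valid subtree, so the only cost is wasted space.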

Secondly, when the overall parse fails, rust-peg re-runs the parse in a slower mode that traces intermediate failures in order to build the "expected" set for the error position. This runs all the actions a second time. It assumes the action code is deterministic, expects the second parse to take the same path as the first, and may panic if it does not. That only matters for control flow, e.g. the result of a {? } block; it doesn't care about the exact return values of a regular {} block, which are discarded when the Err is returned instead.
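One practical consequence for an arena is that a failed parse may leave it holding nodes from both passes. The sketch below simulates that with a hand-written stand-in; `digits` and `parse` are hypothetical functions, and the second call only imitates the re-run described above rather than reproducing rust-peg's tracing logic.

```rust
/// Hypothetical rule: one or more ASCII digits terminated by ';'.
/// Each digit is pushed to the arena as a side effect.
/// Returns the number of digits on success, or the failing byte offset.
fn digits(input: &str, arena: &mut Vec<u32>) -> Result<usize, usize> {
    let mut count = 0;
    for (pos, ch) in input.char_indices() {
        match ch.to_digit(10) {
            Some(d) => {
                arena.push(d);
                count += 1;
            }
            None if ch == ';' && count > 0 && pos + 1 == input.len() => return Ok(count),
            None => return Err(pos),
        }
    }
    Err(input.len())
}

/// Stand-in for the two-pass behaviour: on failure the same rule is run again
/// (where a real implementation would collect the expected set) before the
/// error is reported, so every side effect happens twice.
fn parse(input: &str, arena: &mut Vec<u32>) -> Result<usize, usize> {
    match digits(input, arena) {
        Ok(n) => Ok(n),
        Err(_) => {
            let second = digits(input, arena);
            // Deterministic actions make the second pass fail at the same place.
            Err(second.expect_err("second pass must fail the same way"))
        }
    }
}
```

Under these assumptions, clearing or discarding the arena after an Err is the simplest way to keep it consistent before reusing it.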

You could pass in a Vec or other type of arena as a grammar argument:

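peg::parser!{
    grammar test(arena: &mut Vec<Node>) for str {
        // Push the node into the arena and return its index instead of a Box.
        pub rule expression() -> usize = { let id = arena.len(); arena.push(...); id }
    }
}

let mut vec = Vec::new();
// Grammar arguments are passed after the input string.
let result = test::expression("input str", &mut vec);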
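As a rough sketch of what the flat layout could look like on the consumer side, the following std-only example defines a `Node` type whose children are arena indices and walks it with an evaluator. The `Node` variants, `push`, `eval`, and `build_example` are assumptions made for illustration; the thread does not specify what `Node` contains.

```rust
/// Hypothetical flat AST node: children are indices into the arena,
/// so no Box is needed.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Node {
    Num(i64),
    Add(usize, usize),
    Mul(usize, usize),
}

/// Appends a node and returns its index, mirroring what a parser action would do.
fn push(arena: &mut Vec<Node>, node: Node) -> usize {
    let id = arena.len();
    arena.push(node);
    id
}

/// Recursively evaluates the subtree rooted at `id`.
fn eval(arena: &[Node], id: usize) -> i64 {
    match arena[id] {
        Node::Num(n) => n,
        Node::Add(l, r) => eval(arena, l) + eval(arena, r),
        Node::Mul(l, r) => eval(arena, l) * eval(arena, r),
    }
}

/// Builds `1 + 2 * 3` by hand, in the order actions would push the nodes:
/// children first, parents afterwards.
fn build_example(arena: &mut Vec<Node>) -> usize {
    let one = push(arena, Node::Num(1));
    let two = push(arena, Node::Num(2));
    let three = push(arena, Node::Num(3));
    let product = push(arena, Node::Mul(two, three));
    push(arena, Node::Add(one, product))
}
```

Because children are always pushed before their parent, the root ends up as the last index returned, and orphaned nodes from abandoned alternatives are simply never reached by `eval`.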

@DrSloth
Author

DrSloth commented Oct 17, 2024

Thank you!
Nice, that is exactly what I am doing, and it's working right now. Does &mut work? I think I had to pass something in a RefCell because rules expect Fn closures, which can't take &mut (??). Maybe I just misread that error.

The only problem here would be that the AST takes more space than necessary, but that is an evil I am willing to accept for now.
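For reference, the RefCell workaround mentioned above can be sketched without peg at all. In this example `run_action` is a hypothetical stand-in for any API that accepts only `Fn` closures (whether rust-peg actually imposes that bound is not confirmed in this thread), and `Node::Ident` and `parse_idents` are illustrative names.

```rust
use std::cell::RefCell;

#[derive(Debug, PartialEq)]
enum Node {
    Ident(String),
}

/// Stand-in for an API that only accepts `Fn` closures (an assumption).
fn run_action<F: Fn(&str) -> usize>(action: F, text: &str) -> usize {
    action(text)
}

/// Pushes one node per word through an `Fn` closure and returns the arena
/// together with the index assigned to each word.
fn parse_idents(words: &[&str]) -> (Vec<Node>, Vec<usize>) {
    let arena = RefCell::new(Vec::new());
    // The closure captures `arena` by shared reference, so it is `Fn`;
    // mutation goes through the runtime-checked `borrow_mut`.
    let action = |text: &str| {
        let mut nodes = arena.borrow_mut();
        nodes.push(Node::Ident(text.to_string()));
        nodes.len() - 1
    };
    let ids = words.iter().map(|w| run_action(&action, w)).collect();
    (arena.into_inner(), ids)
}
```

The trade-off is that borrow errors move from compile time to run time: holding one `borrow_mut` guard while another is requested would panic.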

@kevinmehall
Owner

&mut works in simple cases, but might break with things like precedence!{} blocks that do more complicated argument passing. Grammar arguments are semi-undocumented because I was never happy with the limitations like that. But sometimes you need them.
