Capture slice and value #283

bruceiv · 2022-01-17T00:17:11Z

I'd like a built-in expression that captures both the result of a parsing expression and the input slice that spans it: like $(e), but producing a tuple of (slice spanning e, return value of e). (This is basically the consumed combinator in nom; $$(e) might be a good syntax.)

My motivation is that I'm working on a programming-language parser where each AST node has a location field that's basically a three-tuple: (start index, length, identifier for source file). I can't get the file identifier from the position!() expression, but I've set up the ParseSlice implementation for my file type to insert it. However, since there's no direct access to the input object within the available expressions in peg, the slice operator seems to be the only way to get this value, and if I use slice I don't get the return value of the expression. I think I can work around it for the use-cases I have by pulling the location fields from some of the subexpressions, but the only fully-general solution I can think of would be something like this, which parses the expression twice (I haven't tried it, and I'm not sure about the return value of &e):

rule cap_ret<T>(f: rule<T>) -> (Span, T)
= v:&f() p:$(f()) { (p, v) }
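
For illustration, the requested behavior can be sketched outside of peg as a plain function combinator. This is only a sketch under assumptions: `ParseResult`, `capture_slice_and_value`, and `number` are hypothetical names, and the parser shape (input plus position in, value plus end position out) is a simplification rather than peg's or nom's actual API. The point it shows is that the inner parser runs once and the slice is taken from the start and end positions.

```rust
/// Result of a parser: the value plus the position just past the match,
/// or `None` on failure. (Hypothetical shape, not peg's or nom's real API.)
type ParseResult<T> = Option<(T, usize)>;

/// Run `parser` at `pos` and return both the slice it consumed and its value.
fn capture_slice_and_value<'a, T>(
    input: &'a str,
    pos: usize,
    parser: impl Fn(&'a str, usize) -> ParseResult<T>,
) -> ParseResult<(&'a str, T)> {
    let (value, end) = parser(input, pos)?;
    // The slice spans exactly what `parser` consumed; `parser` ran only once.
    Some(((&input[pos..end], value), end))
}

/// Toy parser: one or more ASCII digits, returned as a number.
fn number(input: &str, pos: usize) -> ParseResult<u32> {
    let rest = input.get(pos..)?;
    let len = rest.bytes().take_while(|b| b.is_ascii_digit()).count();
    if len == 0 {
        return None;
    }
    let value = rest[..len].parse::<u32>().ok()?;
    Some((value, pos + len))
}
```

Calling `capture_slice_and_value("ab123cd", 2, number)` yields the slice `"123"` together with the value `123`, without parsing the digits a second time.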

kevinmehall · 2022-01-17T20:14:32Z

This would be pretty easy to add and would be generally useful. I'm trying to avoid adding more symbols to make grammars easier to read for new users, but $$ is close enough to $ that the difference could be easily documented.

For your use case though, there's an existing feature that might be better than repurposing ParseSlice: the undocumented ##method() expression calls input.method(pos). Here's an example of it defined and used in rust-peg's own meta-grammar. This is undocumented because I plan to replace it with a more flexible expression accepting a block of Rust code in the grammar with access to input and position (#284) once I can come up with a good syntax for it.

You could use this to make a customized replacement for position!() that returns the additional info you need. Then wrap it in a generic rule:

rule spanned<T>(inner: rule<T>) = start:##custom_position() v:inner() end:##custom_position() { ... }
...
spanned(<foo>)
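
As a rough sketch of the Rust side of this suggestion, the method that `##custom_position()` would invoke as `input.custom_position(pos)` might look like the following. Everything here is an assumption for illustration: `SourceFile`, `FilePos`, `Location`, and `location_between` are hypothetical names, and the sketch leaves out the peg input traits that a custom input type would also need to implement before it could be used in a grammar.

```rust
/// Location attached to each AST node: (start index, length, source file id),
/// as described in the original request. (Hypothetical type.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Location {
    start: usize,
    len: usize,
    file_id: u32,
}

/// A position that also carries the file identifier. (Hypothetical type.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FilePos {
    offset: usize,
    file_id: u32,
}

/// Hypothetical input wrapper that knows which file it came from.
/// The peg input trait implementations are omitted from this sketch.
struct SourceFile<'a> {
    file_id: u32,
    text: &'a str,
}

impl<'a> SourceFile<'a> {
    /// What `##custom_position()` would call as `input.custom_position(pos)`.
    fn custom_position(&self, pos: usize) -> FilePos {
        debug_assert!(pos <= self.text.len());
        FilePos { offset: pos, file_id: self.file_id }
    }
}

/// Combine the start and end positions captured around a rule, as the
/// action block of a rule like `spanned` might do.
fn location_between(start: FilePos, end: FilePos) -> Location {
    Location {
        start: start.offset,
        len: end.offset - start.offset,
        file_id: start.file_id,
    }
}
```

With this shape, the action block of `spanned` could build the location from `start` and `end` and return it next to `v`, so the inner rule is matched only once.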

emk · 2023-10-11T12:54:14Z

For the case where someone needs a value T plus the input slice, I tried doing this, as @bruceiv suggested:

        /// Return both the value and slice matched by the rule.
        rule with_slice<T>(r: rule<T>) -> (T, &'input str)
            = value:&r() input:$(r()) { (value, input) }

This passed all my tests in a moderately complex grammar.

kevinmehall added the "feature" label (Something not supported that perhaps should be) on Jan 17, 2022

kevinmehall mentioned this issue May 13, 2022

Values parsed inside span #297

Closed

kevinmehall mentioned this issue Jun 10, 2024

Capture both details and full input #377

Closed
