-
1200 rules is... a LOT! Wow. OK, I don't think we had that problem, and the design is not really suited to that use case. The only thing I can think of depends on how your grammar is structured. You could, at some point, write a custom rule and detach the "inner grammar", if that is possible. Put the method that calls the inner parser (and everything that belongs to it) in its own compilation unit, separate from the "outer grammar". It is like parsing a large JSON file and knowing that one specific, nested attribute contains XML: the JSON grammar is parsed in compilation unit 1 and, when matching the specific field, calls an XML parser implemented in compilation unit 2. Does that help? (If it is even possible in your case.)
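A minimal sketch of the compilation-unit boundary described above, written in plain standard C++ so that it does not depend on any particular PEGTL version. The file names and the functions `parse_outer` and `parse_inner` are hypothetical, and the toy "grammars" (angle brackets around a run of digits) only stand in for the real ones. The point is that the outer side sees nothing but a function declaration, so the templates of the inner grammar would only be instantiated in the second compilation unit.

```cpp
// Single-file sketch; in a real project the three parts live in separate files.
#include <cstddef>
#include <string_view>

// ---- inner_parser.hpp (hypothetical): all the outer grammar ever sees ----
// Returns the number of characters consumed, or 0 on failure.
std::size_t parse_inner(std::string_view input);

// ---- outer_parser.cpp (hypothetical): compilation unit 1 ----
// Matches  '<' payload '>'  and delegates the payload to parse_inner().
bool parse_outer(std::string_view input)
{
    if (input.size() < 2 || input.front() != '<' || input.back() != '>')
        return false;
    const std::string_view payload = input.substr(1, input.size() - 2);
    return !payload.empty() && parse_inner(payload) == payload.size();
}

// ---- inner_parser.cpp (hypothetical): compilation unit 2 ----
// Stand-in for the expensive inner grammar: accepts a run of digits.
std::size_t parse_inner(std::string_view input)
{
    std::size_t n = 0;
    while (n < input.size() && input[n] >= '0' && input[n] <= '9')
        ++n;
    return n;
}
```

In a PEGTL project, the body of `parse_inner` would presumably be where the inner grammar's header is included and its parse call is made, and the custom rule in the outer grammar would call `parse_inner` from its `match` function. How the input and position are handed across the boundary is left open here.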
-
The grammar in question is SQL-99, which you can find readily enough on the web. It's pretty interconnected, so that approach, which I had considered, doesn't really work out. I haven't got right down into the gory details of how the template library works, but I wonder if it is possible to short-circuit some of the template matching with explicit instantiation. I don't know if this would work but the idea goes like this: Imagine you have a sub-component I'm not quite sure how this solution would look in the PEGTL, or even if it's possible, but perhaps having an interface that uses inheritance and virtual functions to implement the boundary between grammar parts. Obviously, if you used it in a fine-grained way, it would be really slow compared to the templated version, but it would achieve the objective of isolating components of a large grammar.
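The inheritance-and-virtual-functions idea could be sketched roughly as follows, again in plain standard C++ and not as anything the PEGTL provides. The names `sub_parser`, `identifier_parser`, `make_identifier_parser` and `match_select` are all hypothetical, and the hand-written matchers stand in for templated grammar components.

```cpp
#include <cstddef>
#include <memory>
#include <string_view>

// Hypothetical boundary type: one virtual call per grammar component.
struct sub_parser
{
    virtual ~sub_parser() = default;
    // Returns the number of characters matched at the start of `input`, or 0.
    virtual std::size_t match(std::string_view input) const = 0;
};

// Stand-in for one component; in a real build it would live in its own .cpp
// file, together with whatever templated grammar it wraps.
struct identifier_parser final : sub_parser
{
    std::size_t match(std::string_view input) const override
    {
        std::size_t n = 0;
        while (n < input.size() &&
               ((input[n] >= 'a' && input[n] <= 'z') || input[n] == '_'))
            ++n;
        return n;
    }
};

// Factory: only this declaration would need to be visible to other units.
std::unique_ptr<sub_parser> make_identifier_parser()
{
    return std::make_unique<identifier_parser>();
}

// Outer component: matches "SELECT " followed by an identifier, reaching the
// identifier component only through the virtual interface.
bool match_select(std::string_view input, const sub_parser& ident)
{
    constexpr std::string_view keyword = "SELECT ";
    if (input.substr(0, keyword.size()) != keyword)
        return false;
    const std::string_view rest = input.substr(keyword.size());
    return !rest.empty() && ident.match(rest) == rest.size();
}
```

Each virtual call blocks inlining across the boundary, which is why this only makes sense for a handful of coarse-grained components and not for every production.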
-
Can you show us your grammar?
-
It's via a meta-parser for the grammar as presented on this page: https://ronsavage.github.io/SQL/sql-99.bnf.html
-
The book it is based on was published in 2001, whereas PEGs are based on a paper from 2004, so the grammar described by the book's BNF is almost certainly a CFG, not a PEG. It is highly unlikely that this will work. You could do a basic check by using our grammar analysis, but that might not catch all problems, only the most glaring ones (aka left recursion).
-
Absolutely! Unfortunately the grammar analysis tool doesn't work with the utf8/icu rules, and I haven't yet invested the time to understand enough details to fix it. The problem you point out is in fact the motivation for this question. :) As I am writing unit tests for sub-components of the grammar, I'm rewriting bits of the source grammar. Eliminating left recursion is one issue. Another is adding/removing intermediate productions to make it easier to produce a useful parse tree. However, this is rather tedious with the long compile time, hence the desire to find ways to speed things up. T.
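To illustrate the left-recursion rewrite mentioned above, here is a small sketch on a toy rule and not on the actual SQL-99 grammar. The rule names `expr` and `term` and the functions `parse_term` and `parse_expr` are made up for the example, and the hand-written recursive descent only mimics what a PEG parser does.

```cpp
#include <cstddef>
#include <string_view>

// CFG form (left-recursive; a PEG or recursive-descent parser never returns):
//     expr <- expr '+' term / term
// PEG-friendly form (iteration instead of left recursion):
//     expr <- term ( '+' term )*
//     term <- [0-9]+

// term <- [0-9]+ ; advances pos and accumulates the value on success.
bool parse_term(std::string_view in, std::size_t& pos, long& value)
{
    const std::size_t start = pos;
    value = 0;
    while (pos < in.size() && in[pos] >= '0' && in[pos] <= '9')
        value = value * 10 + (in[pos++] - '0');
    return pos > start;
}

// expr <- term ( '+' term )* ; folding inside the loop keeps the
// left-associative result that the left-recursive rule described.
bool parse_expr(std::string_view in, std::size_t& pos, long& value)
{
    if (!parse_term(in, pos, value))
        return false;
    while (pos < in.size() && in[pos] == '+') {
        std::size_t next = pos + 1;
        long rhs = 0;
        if (!parse_term(in, next, rhs))
            break;  // like a PEG star: a failed iteration consumes nothing
        value += rhs;
        pos = next;
    }
    return true;
}
```

In PEGTL terms, the rewritten rule would correspond to something like `seq< term, star< one< '+' >, term > >`. The parse-tree shape changes from nested to flat, which is one reason the intermediate productions may need adjusting afterwards.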
-
Hello @d-frey, I'm trying to do something similar, but with an SQL-like language. I've not been able to figure out from the examples how to properly switch between statements. How can I handle statements like the ones below?
CREATE COLLECTION test1;
SELECT * FROM test1 JOIN test2 ON test1.id = test2.id;
The grammar includes
How do I have the action move from one state to a different state, or signal to the action that:
-
@drtconway I'll look into adding grammar analysis integration for the ICU rules.
-
@kelvinhammond It sounds like what we internally call "switching style" might be useful for you, i.e. changing the States and/or Actions for different parts of the grammar, see https://github.com/taocpp/PEGTL/blob/master/doc/Actions-and-States.md#changing-actions and below.
-
@drtconway The grammar analysis now also supports the ICU rules.
-
@drtconway Since Colin fixed the grammar analysis for the ICU rules, I am closing this issue for now. As mentioned earlier, the PEGTL does not directly support splitting the grammar into different compilation units, as it is heavily based on (variadic) templates. If you feel there is more to discuss, or there are specific things that could be improved to allow multiple TUs, feel free to add more comments or even re-open this issue.
-
Hi,
I have a large grammar with about 1200 productions.
It takes about 5 minutes and 2.5 GB of RAM to compile on my pretty current PC.
Are there techniques I can use to partition the compilation, or organise things to reduce the resource envelope? I couldn't see any in the documentation.
Tom.