The new parser (Parrot) parses Groovy source code and constructs the corresponding AST, which is almost identical to the one generated by the old parser (the exception being corrected node positions, e.g. the line and column of a node). All existing Groovy features are currently supported. In addition, the following new features have been added:
- do-while loops
- enhanced classic for loops (now supporting commas), e.g. `for(int i = 0, j = 10; i < j; i++, j--) {..}`
- lambda expressions, e.g. `stream.map(e -> e + 1)`
- method references and constructor references
- try-with-resources, AKA ARM
- switch-expression
- sealed type
- record type
- code blocks, i.e. `{..}`
- Java style array initializers, e.g. `new int[] {1, 2, 3}`
- default methods within interfaces
- additional places for type annotations
- new operators: identity operators (`===`, `!==`), elvis assignment (`?=`), `!in`, `!instanceof`
- safe index, e.g. `nullableVar?[1, 2]`
- non-static inner class instantiation, e.g. `outer.new Inner()`
- runtime groovydoc, i.e. groovydoc starting with `/**@`; groovydoc attached to AST node as metadata
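
Several of the additions listed above are Java constructs. As a small illustration, the sketch below is written in plain Java and uses a lambda expression, a method reference, the comma-enabled classic `for` loop, a do-while loop, try-with-resources, and a Java style array initializer. Because Parrot supports each of these, the same source is expected to parse as Groovy as well. The class and method names are made up for this example and are not part of Parrot or Groovy.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class: every member exercises one construct from the list above.
class ParrotJavaSyntaxDemo {

    // Lambda expression followed by a method reference in a stream pipeline.
    static List<String> incrementAll(List<Integer> values) {
        return values.stream()
                .map(e -> e + 1)
                .map(String::valueOf)
                .collect(Collectors.toList());
    }

    // Classic for loop with comma-separated initializers and updates.
    static int stepsUntilMeet(int low, int high) {
        int steps = 0;
        for (int i = low, j = high; i < j; i++, j--) {
            steps++;
        }
        return steps;
    }

    // do-while loop: the body always runs at least once.
    static int doublingsToReach(int start, int limit) {
        int value = start;
        int count = 0;
        do {
            value *= 2;
            count++;
        } while (value < limit);
        return count;
    }

    // try-with-resources (ARM): the reader is closed automatically.
    static String firstLine(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Java style array initializer.
    static int sumOfInitializer() {
        int[] numbers = new int[] {1, 2, 3};
        return Arrays.stream(numbers).sum();
    }

    public static void main(String[] args) {
        System.out.println(incrementAll(Arrays.asList(1, 2, 3)));
        System.out.println(stepsUntilMeet(0, 10));
        System.out.println(doublingsToReach(1, 10));
        System.out.println(firstLine("first\nsecond"));
        System.out.println(sumOfInitializer());
    }
}
```

The Groovy-only additions (`===`, `?=`, `!in`, `!instanceof`, safe index) have no Java equivalent and are therefore left out of this sketch.
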
JVM system properties to control parsing:
Option | Description | Default | Example |
---|---|---|---|
`groovy.antlr4.cache.threshold` | antlr4 relies heavily on its DFA cache for performance and does not clear it on its own, so an OutOfMemoryError may eventually occur. Groovy trades off parsing performance against memory usage: when the number of Groovy source files parsed reaches the cache threshold, the DFA cache is cleared. `0` means the DFA cache is managed automatically; `-1` means the DFA cache is never cleared, which requires a larger JVM heap; a greater value, e.g. `200`, clears the DFA cache when that threshold is reached. **Note:** the threshold is a count of Groovy source files | `0` | `-Dgroovy.antlr4.cache.threshold=200` |
`groovy.antlr4.sll.threshold` | Parrot tries SLL mode first and falls back to LL mode if SLL fails. The more tokens there are to parse, the more likely SLL is to fail, so once the SLL threshold is reached, SLL is skipped. Setting the threshold to `0` means SLL mode is never tried, which is not recommended in most cases because SLL is the fastest mode, although it is less powerful than LL. **Note:** the threshold is a token count | `-1` (disabled by default) | `-Dgroovy.antlr4.sll.threshold=1000` |
`groovy.antlr4.clear.lexer.dfa.cache` | Clears the DFA cache for the lexer. The lexer's DFA cache is always small and is important for parsing performance, so it is strongly recommended to leave this setting alone unless an OutOfMemoryError actually occurs | `false` | `-Dgroovy.antlr4.clear.lexer.dfa.cache=true` |
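
To make the defaults and the SLL threshold rule in the table concrete, here is a minimal Java sketch that reads the three properties and decides whether SLL mode would be attempted for a given token count. It is not Groovy's actual implementation: the class `ParrotParserSettings` and its methods are hypothetical, and treating "the threshold hits" as "token count is greater than or equal to the threshold" is an assumption. Only the property names and default values are taken from the table.

```java
// Hypothetical holder for the three JVM system properties described in the table.
final class ParrotParserSettings {
    final int cacheThreshold;          // count of Groovy source files
    final int sllThreshold;            // token count; a negative value means disabled
    final boolean clearLexerDfaCache;

    ParrotParserSettings(int cacheThreshold, int sllThreshold, boolean clearLexerDfaCache) {
        this.cacheThreshold = cacheThreshold;
        this.sllThreshold = sllThreshold;
        this.clearLexerDfaCache = clearLexerDfaCache;
    }

    // Reads the properties, falling back to the defaults from the table (0, -1, false).
    static ParrotParserSettings fromSystemProperties() {
        return new ParrotParserSettings(
                Integer.getInteger("groovy.antlr4.cache.threshold", 0),
                Integer.getInteger("groovy.antlr4.sll.threshold", -1),
                Boolean.getBoolean("groovy.antlr4.clear.lexer.dfa.cache"));
    }

    // Assumption: SLL is skipped once tokenCount >= threshold.
    // A threshold of 0 therefore never tries SLL; a negative threshold always tries it.
    boolean shouldTrySll(int tokenCount) {
        if (sllThreshold < 0) {
            return true;
        }
        return tokenCount < sllThreshold;
    }

    public static void main(String[] args) {
        ParrotParserSettings settings = fromSystemProperties();
        System.out.println("cache threshold: " + settings.cacheThreshold);
        System.out.println("SLL threshold: " + settings.sllThreshold);
        System.out.println("clear lexer DFA cache: " + settings.clearLexerDfaCache);
        System.out.println("try SLL for 500 tokens: " + settings.shouldTrySll(500));
    }
}
```

Running the sketch with `-Dgroovy.antlr4.sll.threshold=1000` makes `shouldTrySll` return `true` for inputs below 1000 tokens and `false` from 1000 tokens upward, under the assumption stated above.
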
Parrot is based on the highly optimized version of antlr4 (`com.tunnelvisionlabs:antlr4`), which is licensed under BSD. On 2016-11-03 Parrot was contributed to Apache Groovy, but the project will be maintained as a lab for experimenting with new features for Groovy. You can find it at apache/groovy.
Can someone explain to me the importance of the Parrot compiler? Basically explain like I am 5?
the syntax of Groovy hasn’t evolved in a long time
the current / old parser is a bit complicated to evolve
and is using a very old version of the parsing library
so any change we’d want to make to the language (a new operator, for example) becomes very complicated
So we’ve been wanting to upgrade the underlying parser library for a while, but since the library evolved a lot, that also required a rewrite of the grammar of the language
But there’s another thing to consider
Groovy’s always been adopted by Java developers easily because of how close to the Java syntax it’s always been
so most Java programs are also valid Groovy programs
it’s been important to Groovy’s success to have this source compatibility
Java 8's been out for a while already
and we’ve been asked countless times if we’d support this or that particular syntax enhancement from Java 8
for “copy’n paste compatibility”, if you will
We decided to upgrade to a newer version of our parsing library (from v2 to v4 of Antlr)
to allow Groovy’s syntax to continue to evolve
to also support new operators and things like that
but to also support Java 8 constructs, for continued compatibility
And that’s about it