Releases: parsingdata/metal
Metal 10.0.0
This release includes the following major improvements:
- #360: There are additional `def` shorthands with the same parameters as the `until` shorthands, with the difference that the terminator is not included in the ParseGraph. This allows for writing `def`s that run until the beginning of the next token, without consuming it.
- #375: There is a new expression `Join` and a matching shorthand `join`. With this expression you can join an arbitrary set of expressions, which evaluates to a single flattened list.
- #392: The expression `ref` accepts multiple names and finds values that match any of these names. The values are returned in parsing order, not in the order in which the names are specified.
- #399: Introducing `ParseValueCache`. This cache keeps track of any value by name, which significantly improves the performance of `ref` by name, since it no longer needs to process the entire ParseGraph for each `ref`.
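The ordering rule of #392 can be illustrated with a small standalone sketch. It does not use Metal: `RefOrderSketch`, `Named` and `refAny` are hypothetical names invented here, and the parse result is reduced to a plain list of named integers kept in parsing order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Standalone illustration; these types are hypothetical and not part of Metal.
final class RefOrderSketch {

    // A parsed value, reduced to a name and an integer payload.
    static final class Named {
        final String name;
        final int value;

        Named(final String name, final int value) {
            this.name = name;
            this.value = value;
        }
    }

    // Returns the values whose name matches any of the given names,
    // in the order they were parsed rather than the order of `names`.
    static List<Integer> refAny(final List<Named> parsedInOrder, final String... names) {
        final Set<String> wanted = new HashSet<>(Arrays.asList(names));
        final List<Integer> result = new ArrayList<>();
        for (final Named item : parsedInOrder) {
            if (wanted.contains(item.name)) {
                result.add(item.value);
            }
        }
        return result;
    }

    public static void main(final String[] args) {
        final List<Named> parsed = Arrays.asList(
            new Named("a", 1), new Named("b", 2), new Named("c", 3), new Named("a", 4));
        // Asking for "b" before "a" does not change the result order.
        System.out.println(refAny(parsed, "b", "a")); // prints [1, 2, 4]
    }
}
```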
Additionally, some smaller changes have been made. Most significantly:
- #367 Bugfix: The name of an `until` is now included in the scope name of the inner `def`.
- #405: There is a huge performance gain when using ConcatenatedValueSource for concatenating large chunks of data.
- #402: Hashes of all immutable objects within metal are cached, which improves performance when using these objects in a HashMap.
- #362: Builds are now run automatically by GitHub Actions.
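Why caching hashes (#402) helps with `HashMap` lookups can be shown with a generic sketch. This is not Metal's code: `CachedHashSketch` and `ImmutableBytes` are hypothetical, and the sketch computes the hash eagerly at construction, which is only one of several possible caching strategies.

```java
import java.util.Arrays;

// Standalone illustration of hash caching for an immutable object; not Metal's own code.
final class CachedHashSketch {

    static final class ImmutableBytes {
        private final byte[] data;
        // Computed once. Safe to cache because `data` never changes afterwards.
        private final int hash;

        ImmutableBytes(final byte[] data) {
            this.data = data.clone(); // defensive copy keeps the object immutable
            this.hash = Arrays.hashCode(this.data);
        }

        @Override
        public boolean equals(final Object obj) {
            return obj instanceof ImmutableBytes
                && Arrays.equals(data, ((ImmutableBytes) obj).data);
        }

        @Override
        public int hashCode() {
            // No pass over the data: every HashMap lookup reuses the stored hash.
            return hash;
        }
    }
}
```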
Note: The next Metal version will require at least a JRE of version 17 instead of 11.
Full Changelog: v9.0.0...metal-10.0.0
For a more detailed list of changes, please see the 10.0.0 Milestone.
Metal 9.0.0
This release includes a major improvement and a big change:
- Cycle detection has been improved by implementing a cache. This means that using `Sub` `Token`s is now considerably faster, especially in large-scale and/or nested situations.
- Metal now requires at least a JRE of version 11 instead of 8.
Additionally, some smaller changes have been made. Most significantly:
- An exception is no longer thrown when a `Def` is parsed with a size greater than `Integer.MAX_VALUE`.
- Many `ValueExpression`s now have `Shorthand` versions where arguments and return value are of type `SingleValueExpression`. This improves usability by no longer requiring wrapping everything in `last(...)`-expressions when no lists are involved.
- A minimal set of Javadoc has been added to the `Shorthand`, to make Metal easier to use, especially when using an IDE.
For a more detailed description, please see the 9.0.0 Milestone.
Metal 8.0.0
This release adds two major enhancements:
- The `Scope` `ValueExpression` allows referencing values inside a dynamically constructed scope, useful for referencing values inside a specific part of the structure. This is especially useful for data structures that have many repeating structures: `Scope` makes it much easier to reference the current, local version of a value.
- The `CurrentIteration` `ValueExpression` allows referencing the count of a running iteration (e.g., when inside an iterating `Token` such as `Rep`, `RepN` or `While`). Passing a value allows referencing multiple levels of iteration (0=current, 1=parent, 2=parent-of-parent, etc.).
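The level numbering (0=current, 1=parent, and so on) can be pictured as a stack of iteration counters. The following is a minimal standalone sketch of that idea, not Metal's implementation: `IterationLevelsSketch` and all of its methods are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.Optional;

// Standalone illustration; not part of Metal.
final class IterationLevelsSketch {

    // The innermost running iteration sits at the head of the deque.
    private final Deque<Integer> counters = new ArrayDeque<>();

    void enterLoop() {
        counters.push(0);
    }

    void nextIteration() {
        counters.push(counters.pop() + 1);
    }

    void exitLoop() {
        counters.pop();
    }

    // level 0 = current loop, 1 = parent, 2 = parent-of-parent, ...
    // Returns empty when there is no enclosing loop at that level.
    Optional<Integer> currentIteration(final int level) {
        if (level < 0 || level >= counters.size()) {
            return Optional.empty();
        }
        final Iterator<Integer> iterator = counters.iterator();
        for (int i = 0; i < level; i++) {
            iterator.next();
        }
        return Optional.of(iterator.next());
    }
}
```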
Additionally, the code has been cleaned up and improved in many ways. Most significantly:
- Evaluating a `ValueExpression` now returns an `ImmutableList<Value>` instead of an `ImmutableList<Optional<Value>>`. To signify the absence of a value, there is now a `NOT_A_VALUE` constant that replaces `empty()` instances inside the list of results.
- There are `ValueExpression` implementations that always return a single value (e.g., `Last` or any of the `Fold*`). To enhance the typing of these `ValueExpression`s, there is now a `SingleValueExpression` that derives from `ValueExpression` and provides an `evalSingle(...)` method that directly returns an `Optional<Value>`.
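The sentinel approach can be sketched independently of Metal. In the example below only the name `NOT_A_VALUE` is taken from the release note. The `Value` class, `divide` and `lastAsOptional` are hypothetical, and mapping the sentinel to an empty `Optional` is an assumption of this sketch, not a statement about how `evalSingle(...)` behaves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Standalone illustration; the types here are hypothetical and not Metal's.
final class NotAValueSketch {

    static final class Value {
        final long number;

        Value(final long number) {
            this.number = number;
        }
    }

    // Sentinel that marks "no value" inside a result list. Compared by identity.
    static final Value NOT_A_VALUE = new Value(0);

    // Element-wise division. Positions that cannot be computed hold the
    // sentinel, so the list element type stays Value instead of Optional<Value>.
    static List<Value> divide(final List<Value> left, final List<Value> right) {
        final List<Value> result = new ArrayList<>();
        final int size = Math.min(left.size(), right.size());
        for (int i = 0; i < size; i++) {
            final Value l = left.get(i);
            final Value r = right.get(i);
            if (l == NOT_A_VALUE || r == NOT_A_VALUE || r.number == 0) {
                result.add(NOT_A_VALUE);
            } else {
                result.add(new Value(l.number / r.number));
            }
        }
        return result;
    }

    // Single-value view of a result list (assumption: sentinel maps to empty).
    static Optional<Value> lastAsOptional(final List<Value> results) {
        if (results.isEmpty() || results.get(results.size() - 1) == NOT_A_VALUE) {
            return Optional.empty();
        }
        return Optional.of(results.get(results.size() - 1));
    }
}
```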
Apart from these changes, many small fixes have been made after applying several new build and analysis tools to our code base, including CI on Windows.
Note: From Metal 9.x.x onward, the required version of Java will change from at least 8 to at least 11 (or higher, depending on when it will be released).
For a more detailed description, please see the 8.0.0 Milestone.
Metal 7.1.0
This release provides some small changes/additions:
- A new implementation of `ConcatenatedValueSource`, in order to resolve some runtime performance issues that were introduced in the previous (7.0.0) release.
- Related to the previous change is an additional variant of the `Cat` `ValueExpression` that takes a single operand and then folds all the resulting values after evaluation.
- The results of `ValueExpression`s used as a source by a `Tie` token are now cached, in order to prevent complex operations (e.g., decompressing data) from being performed repeatedly.
- The behaviour of the `Pre` token before the 6.0.0 release is still often used in Metal descriptions, so this has been added to the `Shorthand` with the more precise name `when`.
For a more detailed description, please see the 7.1.0 Milestone.
Metal 7.0.0
This release introduces lazy input reading to reduce heap usage and improve scalability and performance. Lazy IO involves the following changes:
- Data read by `Def` and `Until` tokens is no longer cached, so it is only read when referenced (e.g., by a comparison expression or by requesting the data through `getValue()` after a parse).
- The `Len` (length of a value), `Cat` (concatenation of values) and `Bytes` (splitting values) expressions are also lazy, so using these does not automatically cause a read.
- The `Nod` token has been removed since large pieces of data that are unused can now be safely described using `Def`. The shorthand remains but maps to `Def`.
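The effect of lazy reading can be illustrated with a standalone sketch in which a value only records where its data lives and reads it on request. This is not Metal's implementation: `LazySliceSketch`, and the `Source` and `Slice` classes inside it, are hypothetical stand-ins written for this example.

```java
import java.util.Arrays;

// Standalone illustration of lazy reading; these classes are hypothetical.
final class LazySliceSketch {

    // Wraps the input and counts how often data is actually read from it.
    static final class Source {
        private final byte[] data;
        int readCount = 0;

        Source(final byte[] data) {
            this.data = data;
        }

        byte[] read(final int offset, final int length) {
            readCount++;
            return Arrays.copyOfRange(data, offset, offset + length);
        }
    }

    // Describes a region of the source without holding its bytes.
    static final class Slice {
        private final Source source;
        private final int offset;
        private final int length;

        Slice(final Source source, final int offset, final int length) {
            this.source = source;
            this.offset = offset;
            this.length = length;
        }

        // The length is known without touching the data.
        int length() {
            return length;
        }

        // The bytes are read only when requested, and are not cached.
        byte[] getValue() {
            return source.read(offset, length);
        }
    }
}
```

In this sketch, asking for `length()` never triggers a read, which mirrors the note above that length-style expressions do not automatically cause one.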
There has also been a cleanup of the API:
- All remaining uses of integers describing offsets and lengths have been converted to `BigInteger`. The remaining limitation is that once a value is read it is stored in a Java byte array, which means its size is limited to 2GB.
- The `Environment` has been renamed to `ParseState` and `Callbacks` has been moved out of it. A new `Environment` class has been introduced that aggregates all inputs to a parse (`Scope`, `ParseState`, `Callbacks` and `Encoding`).
- `ParseState` is now passed in to `ValueExpression.eval()` instead of only the `ParseGraph`. As a result `CURRENT_OFFSET` now works correctly again, also around `Sub` and `Tie` tokens.
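The gap between `BigInteger` offsets and the byte array limit can be made concrete with a short sketch. It is independent of Metal: `OffsetSketch` and its methods are hypothetical, and `Integer.MAX_VALUE` is used as the upper bound for an array length (the practical JVM limit can be slightly lower).

```java
import java.math.BigInteger;

// Standalone illustration; not part of Metal.
final class OffsetSketch {

    private static final BigInteger MAX_ARRAY_LENGTH = BigInteger.valueOf(Integer.MAX_VALUE);

    // Offsets and lengths expressed as BigInteger cannot overflow when added.
    static BigInteger endOffset(final BigInteger offset, final BigInteger length) {
        return offset.add(length);
    }

    // A value that is materialized in a byte[] must have a length that fits
    // in an int, which is where the limit of roughly 2GB comes from.
    static boolean fitsInByteArray(final BigInteger length) {
        return length.signum() >= 0 && length.compareTo(MAX_ARRAY_LENGTH) <= 0;
    }
}
```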
Some fixes/changes:
- Metal no longer attempts to handle negative offsets. They either cause a fail (during parse) or throw an exception (when passed in directly).
- JDK9 is now supported, but Metal itself is still Java 8 compatible.
- FindBugs no longer incorrectly reports a problem with `equals()`.
For a more detailed description, please see the 7.0.0 Milestone.
Metal 6.0.0
Many changes in this release. Functional:
- A new `Token` called `Until` has been added, which greatly simplifies parsing data that has a terminator instead of some explicit size.
- A new `ValueExpression` called `Bytes` has been added, which turns a list of values into a list of single-byte values.
- Support for deep, value-based equality checks on all `Token`s.
- The `Def` `Token` has been generalized by removing its `predicate` field. Instead there is now a general `Post` `Token` to express post-conditions.
- The `Pre` `Token` is now more strict: if its condition is not `true`, it fails.
- Two new `ComparisonExpression`s have been added: `GtEqNum` and `LtEqNum` for greater-than-or-equals and less-than-or-equals respectively.
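What "turns a list of values into a list of single-byte values" means can be shown on plain byte arrays. This is a standalone sketch and not the `Bytes` expression itself: `BytesSketch.bytes` is a hypothetical helper, and values are modelled as `byte[]`.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone illustration; not Metal's Bytes implementation.
final class BytesSketch {

    // Splits every value into one single-byte value per byte, keeping order.
    static List<byte[]> bytes(final List<byte[]> values) {
        final List<byte[]> result = new ArrayList<>();
        for (final byte[] value : values) {
            for (final byte b : value) {
                result.add(new byte[] { b });
            }
        }
        return result;
    }
}
```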
Non-functional:
- Migrated to Java 8. Many parts of the API have been improved, e.g., `ParseResult` is now `Optional<Environment>`, `Reducer` is now `BinaryOperator<ValueExpression>`, the old `OptionalValue` class is now a proper `Optional<Value>`, etc.
- The implementation is still recursive and (almost entirely) immutable, but because all recursive code has been augmented with trampolines, this will no longer cause stack overflows.
- `ValueExpression`s of the form `last(ref(x))` have been optimized to immediately return a single value, greatly improving runtime performance.
- To evaluate a `ValueExpression`, only a `ParseGraph` is now needed instead of an entire `Environment`.
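For readers unfamiliar with trampolines, the general technique looks roughly as follows. This is a generic sketch and not Metal's trampoline code: `TrampolineSketch`, `Step`, `sum` and `run` are hypothetical names. Each recursive call is returned as a deferred step, and a loop executes the steps so the call stack stays flat.

```java
import java.util.function.Supplier;

// Standalone illustration of the trampoline technique; not Metal's code.
final class TrampolineSketch {

    // A step is either a finished result or a deferred continuation.
    static final class Step {
        final long result;
        final Supplier<Step> next; // null when the computation is done

        private Step(final long result, final Supplier<Step> next) {
            this.result = result;
            this.next = next;
        }

        static Step done(final long result) {
            return new Step(result, null);
        }

        static Step more(final Supplier<Step> next) {
            return new Step(0, next);
        }
    }

    // Recursive definition of 1 + 2 + ... + n. The recursive call is wrapped
    // in a Supplier instead of being invoked directly.
    static Step sum(final long n, final long acc) {
        return n == 0 ? Step.done(acc) : Step.more(() -> sum(n - 1, acc + n));
    }

    // Runs the steps in a loop, so deep recursion does not grow the stack.
    static long run(Step step) {
        while (step.next != null) {
            step = step.next.get();
        }
        return step.result;
    }

    public static void main(final String[] args) {
        // One million nested calls would risk a StackOverflowError if made directly.
        System.out.println(run(sum(1_000_000L, 0L))); // prints 500000500000
    }
}
```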
For a more detailed description and all other changes, see the milestones associated with this release: Migrate to Java 8 and 6.0.0.
Metal 5.0.0
Most important changes:
- A new type of `Token` has been added: `Tie`, which is short for Token-in-(Value)Expression. It allows a `Token` to use the output value of a `ValueExpression` as input for another `Token`. Simply put, this allows nested parsing.
- Since `Tie` makes it possible to parse other data than the directly underlying input stream, the mechanism to relate a `ParseValue` back to its underlying data has been redesigned. Each `ParseValue` now has a `Slice` of a `Source`, which refers to some `ValueExpression`, the underlying `ByteStream` or to some constant value.
- Another new type of `Token` has been added: `TokenRef`, which allows referencing a previously parsed `Token` by name. This allows constructing recursive tokens such as a linked list without having to rely on workarounds such as anonymous classes to put the structure together.
- The `Str` `Token` has been removed! Instead we have a scheme to dynamically attach `Callback` instances by adding them to the `Environment`.
- There is now an `Nth` `ValueExpression` that allows indexing of a result of some other `ValueExpression`.
Additionally, several bugs have been fixed, code coverage has increased again (100% now!), documentation has been written and more tests have been added (including automated mutation testing). For a full overview of the changes, please have a look at the accompanying milestone: https://github.com/parsingdata/metal/milestone/3?closed=1.
Finally, note that the parent pom has been renamed from `metal-parent` to just `metal`. To use it, please use the following Maven dependency:
<dependency>
<groupId>io.parsingdata</groupId>
<artifactId>metal</artifactId>
<version>5.0.0</version>
</dependency>
Metal 4.0.1
Bugfix in this version:
- `ParseValue`s created by `Def` tokens with an explicit `Encoding` had an incorrect name.
Metal 4.0.0
Changes in this version (1 and 2 are major changes):
1. `ValueExpression` now returns a list of results instead of a single value. What this means is that `ref("label")` now returns all items with the name 'label' instead of just the most recent match. Expressions such as `last()` and `first()` have been adjusted to operate on lists, so what used to be `ref("label")` is equivalent to `last(ref("label"))` from this version on. All `ValueExpression`s now operate only on lists. For example, `add(x, y)` now adds the values with the same index in lists 'x' and 'y' (e.g. `add([ 1, 2, 3 ], [ 4, 5, 6 ])` => `[ 5, 7, 9 ]`).
2. `Token` implementations now all have a name. So instead of having to use a `str()` to add a name to a scope, each `Token` has a constructor with an optional first argument.
3. A new `ValueExpression` has been added: `count()`, which returns the size of a list. Useful for counting the number of times some value has been parsed with `count(ref("label"))`.
4. A lot of tests have been added, along with some small optimizations and a cleanup of `toString()` implementations. Test code coverage is now at 99%!
5. All `Token` implementations now expose their fields as public members. This is possible since they are all immutable (except for the collections of `seq` and `cho`, which return a copy of their collection field).
6. Fixed a bug in `nod`: it now checks whether the resulting size is negative.
7. Signedness is now encoded as an enum (`Sign`) instead of a boolean.
8. `ref` now accepts a `Token` in addition to a `String` for matching.
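The list semantics described in item 1 can be mimicked on plain Java lists. This is a standalone sketch, not Metal code: `ListSemanticsSketch` and its `add`, `last` and `count` helpers are hypothetical stand-ins that operate on `List<Integer>`. How lists of unequal length are treated is not covered by the notes above, so this sketch simply rejects them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Standalone illustration of list-based evaluation; not part of Metal.
final class ListSemanticsSketch {

    // Adds the values that share an index in x and y.
    static List<Integer> add(final List<Integer> x, final List<Integer> y) {
        if (x.size() != y.size()) {
            // Assumption of this sketch only: unequal lengths are not handled.
            throw new IllegalArgumentException("sketch only handles equal-length lists");
        }
        final List<Integer> result = new ArrayList<>();
        for (int i = 0; i < x.size(); i++) {
            result.add(x.get(i) + y.get(i));
        }
        return result;
    }

    // The most recent match, which is what a single-value lookup used to return.
    static Optional<Integer> last(final List<Integer> values) {
        return values.isEmpty() ? Optional.empty() : Optional.of(values.get(values.size() - 1));
    }

    // The number of matches.
    static int count(final List<Integer> values) {
        return values.size();
    }

    public static void main(final String[] args) {
        System.out.println(add(Arrays.asList(1, 2, 3), Arrays.asList(4, 5, 6))); // prints [5, 7, 9]
    }
}
```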
To use the Metal 4.0.0 core library, add the following dependency in Maven:
<dependency>
<groupId>io.parsingdata</groupId>
<artifactId>metal-core</artifactId>
<version>4.0.0</version>
</dependency>
Metal 3.1.0
The project has been changed into a Maven multi-module project, which now has two artifacts instead of just metal: metal-core and metal-formats. The former replaces the original artifact and the latter contains format descriptions that can be used by client applications or as examples. To continue using the core library, update the dependency to:
<dependency>
<groupId>io.parsingdata</groupId>
<artifactId>metal-core</artifactId>
<version>3.1.0</version>
</dependency>
This release also brings three additions to the core API (1-3) and a small internal change (4):
1. Added `ByToken.getAll()` to retrieve all `ParseItems` resulting from a provided `Token` (#22 by @ccreeten)
2. Added `Elvis` operator as `ValueExpression` as an alternative to a standard ternary (#34 by @rdvdijk)
3. Added `Len` `ValueExpression` to return the length in bytes of a provided `ValueExpression` (#37 by @gertjanal)
4. Removed some unnecessary dependencies on the Java standard library (#17 by @rdvdijk and @jvdb)
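The idea behind an Elvis operator (use the left operand if it has a value, otherwise fall back to the right one) can be sketched with `Optional`. This is a generic illustration and not the `Elvis` `ValueExpression` itself: `ElvisSketch.elvis` is a hypothetical helper that works on single optional values.

```java
import java.util.Optional;

// Standalone illustration of Elvis semantics; not Metal's Elvis expression.
final class ElvisSketch {

    // Returns `left` when it holds a value, otherwise falls back to `right`.
    static <T> Optional<T> elvis(final Optional<T> left, final Optional<T> right) {
        return left.isPresent() ? left : right;
    }
}
```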