-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Current iteration #242
Current iteration #242
Conversation
… iterable and counting siblings a long the way.
… it represents. Cleaned up test a bit.
…nd count the iterations of the iterable afterwards.
…de an iterable scope.
Codecov Report
@@ Coverage Diff @@
## master #242 +/- ##
============================================
- Coverage 100% 99.85% -0.15%
- Complexity 1016 1025 +9
============================================
Files 90 92 +2
Lines 1381 1387 +6
Branches 141 141
============================================
+ Hits 1381 1385 +4
- Misses 0 1 +1
- Partials 0 1 +1
Continue to review full report at Codecov.
|
} | ||
|
||
private Trampoline<ParseGraph> findCurrentIterable(final ParseItem item, final ParseGraph iterableCandidate) { | ||
if (!item.isGraph()) { return complete(() -> iterableCandidate); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would happen if I would do something like rep(def("val", CURRENT_ITERATION))
or rep(tie(any("val"), CURRENT_ITERATION))
? Because when I eval the expression, there is no value to stop at? Seems like it would blow up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you mean infinite loop, no, but there will be a null pointer exception, because the head becomes null. Nice catch! Will add item == null
to this if statement and add tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see your point now. Because a def of size 0 is not added to the parseGraph. However, then the iterable is in fact empty, which will return an empty value and the parsing will fail, see line 50. So it will not blow up.
It is a nice discussion though what you would expect to be "an iteration". Should a parsing attempt, where eventually 0 bytes are parsed, be considered an iteration? Also, would we expect rep(def("val", add(CURRENT_ITERATION, con(1))))
to work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, if I read the code as is, I think your first comment does happen (I indeed meant something like a NPE)? Because there is no value yet when you eval CURRENT_ITERATION, and the only way the recursion stops is if it encounters a value? Could you add these samples as a test?
As to the discussion, I would probably expect your example to work, because we are in fact in an iteration? I also think conceptually a 0 bytes parsed should still be considered an iteration, as I feel it is the iteration that happens, not what happens inside the iteration (wow, confusing sentence 😄). What would be the downsides of this, or what problems would it pose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented CURRENT_INDEX differently. Instead of extracting it from ParseGraph, it is kept in the ParseState. I added tests for examples similar to rep(def("val", add(CURRENT_ITERATION, con(1))))
.
I will need to improve the code a little before I commit.
|
||
/** | ||
* A {@link ValueExpression} that represents the current iteration in an | ||
* iterable {@link io.parsingdata.metal.token.Token} (e.g. when inside a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps import the class names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -0,0 +1,84 @@ | |||
/* | |||
* Copyright 2013-2016 Netherlands Forensic Institute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2018?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
return ImmutableList.create(Optional.of(createFromNumeric(currentIteration, new Encoding()))); | ||
} | ||
|
||
private Trampoline<ParseGraph> findCurrentIterable(final ParseItem item, final ParseGraph iterableCandidate) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would really like some inline comments or javadoc stating what these things do 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not applicable anymore. ;) These methods are removed, because of the new implementation.
|
||
@Override | ||
public ImmutableList<Optional<Value>> eval(final ParseState parseState, final Encoding encoding) { | ||
final ParseGraph currentIterable = findCurrentIterable(parseState.order, ParseGraph.EMPTY).computeResult(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool way to calculate the iteration by the way 😄
…within ParseState, instead of extracting it out of the ParseGraph. The iteration is now independent on the parse result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good to me! Please see my comments.
Also: I think you need to cover one more branch somewhere to get coverage back to 100%.
.gitignore
Outdated
@@ -1,5 +1,6 @@ | |||
# Maven | |||
target/ | |||
out/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, where did this come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good question! Apparently, IntelliJ has created this directory during build. Unclear why though. Wasn't planning to commit this. Will remove it.
if (parseState.iterations.head == null) { | ||
return ImmutableList.create(Optional.empty()); | ||
} | ||
return ImmutableList.create(Optional.of(createFromNumeric(parseState.iterations.head, new Encoding()))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default new Encoding()
is probably good enough here. For a minute I thought we needed to pass the encoding
on, but I think this is correct.
Side note: Don't we have a DEFAULT_ENCODING
available somewhere? I wonder how often we do new Encoding()
like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, there is not a default encoding as constant available. We do this 6 times in core; 4 in Shorthand, 1 in CurrentOffset, and 1 in CurrentIteration. In test scope we call it 7 times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point: #245
return Arrays.asList(new Object[][] { | ||
{ "[0, 1, 2, 3, 255] rep(CURRENT_ITERATION), def(255)", seq(rep(VALUE_EQ_ITERATION), VALUE_EQ_255), stream(0, 1, 2, 3, 255), enc(), true }, | ||
{ "[0, 1, 2, 3] repn=4(CURRENT_ITERATION)", repn(VALUE_EQ_ITERATION, con(4)), stream(0, 1, 2, 3), enc(), true }, | ||
{ "[255, 0, 1, 2, 3, 255] def(255), while<3(CURRENT_ITERATION), def (255)", seq(VALUE_EQ_255, whl(VALUE_EQ_ITERATION, not(eq(con(3)))), VALUE_EQ_255), stream(255, 0, 1, 2, 3, 255), enc(), true }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: the description of the test should actually be while!=3
instead of while<3
right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, correct. I will adjust it.
{ "[0, 1] seq(!CURRENT_ITERATION, ...)", seq(VALUE_NOT_EQ_ITERATION, VALUE_NOT_EQ_ITERATION), stream(0, 1), enc(), true }, | ||
{ "[0] !CURRENT_ITERATION", VALUE_NOT_EQ_ITERATION, stream(0), enc(), true }, | ||
{ "[0 | 0, 1 | 0, 0, 2 | 0, 0, 0, 3] rep(CURRENT_ITERATION)", rep(def("value", add(CURRENT_ITERATION, con(1)), eqNum(CURRENT_ITERATION))), stream(0, 0, 1, 0, 0, 2, 0, 0, 0, 3), enc(), true }, | ||
{ "[1, 1, 0, 1 | 0 | 2 | 3] repn=4(size), rep(def(any, sizeRef(CURRENT_INDEX)))", seq(repn(def("size", 1), con(4)), rep(def("value", nth(ref("size"), CURRENT_ITERATION)))), stream(1, 1, 0, 1, 0, 2, 3), enc(), true }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to read this last case over a few times until I understood what is going on here. Would it be possible to add a separate unit test to make absolutely sure that this works as intended? The expression nth(ref("size"), CURRENT_ITERATION)
will result in three times a 1
and one time a 0
. But did that happen in the expected order, for example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, yes, it happed in the expected order, which is why I omitted the 1
, but I do understand why it is confusing. I wanted to add a test that resulted in a 0-size def, which does not result into an item within the parseGraph, but does count as a iteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please let me know when you've added that test 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've improved the above test to verify this behaviour. It now does the following:
- Iterate over a list of sizes (
repn
=4, read in the values (1, 1, 0, 1)). - Iterate again for 4 times (
repn
=4), this time only parsing a value ifnot(eq(nth(ref("size"), CURRENT_ITERATION)
, so if theCURRENT_ITERATION
th value in the list is non-zero (this happens in iterations 0, 1 and 3. - Then parse a value of 1 byte that must be equal to
CURRENT_ITERATION
.
The fact that the parsed and value-checked values are 0, 1 and 3 verifies that CURRENT_ITERATION
continues counting even if an iteration does not produce a value (when
is implemented with a zero-size def
inside a cho
).
I agree with the original comment that the test is a little complex, maybe now even more so. We should probably simplify it, but for now I think verifying behaviour is most important.
core/src/main/java/io/parsingdata/metal/token/IterableToken.java
Outdated
Show resolved
Hide resolved
@@ -36,43 +37,55 @@ | |||
public final ParseGraph order; | |||
public final BigInteger offset; | |||
public final Source source; | |||
public final ImmutableList<BigInteger> iterations; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something that now could happen is that closeBranch(Token)
is called with a Token
that was not the one used for addBranch(Token)
. This would of course be a bug, but since these methods are public
, we might need to guard for it. We could fix that by keeping track which token is used for the iteration. E.g. something like:
public final ImmutableList<Pair<Token, BigInteger>> iterations;
Whenever we call closeBranch(Token)
, we can verify that the Token
of the current iteration is indeed the one being passed. We could even pass Token
to iter()
and be sure that we are iterating the expected Token
(but that might be overkill).
(Note: We have a Pair
class in Selection
, but it is not generic.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion! I'm usually not a fan of using a Pair class, but I cannot think of an alternative, other then creating a new class, which would most likely be implemented as the Pair class. ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it in this case be an identity comparison instead of a (semantic) equality?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh that's a good point... If we want to really make sure we are using the same object, instead of just an equal one.
} | ||
|
||
public static ParseState createFromByteStream(final ByteStream input) { | ||
return createFromByteStream(input, ZERO); | ||
} | ||
|
||
public ParseState addBranch(final Token token) { | ||
return new ParseState(order.addBranch(token), source, offset); | ||
if (token.isIterable()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be a oneliner... Does @jvdb approve of this heresy? 🤣
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it could, but should it? ;) The code now clearly shows the difference between the two states, which will not be as clear if it is a oneliner. @jvdb will approve this, I'm sure! :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be entirely honest, seeing multiple identical addBranch()
-invocations does hurt a little... But I'm not religious about this :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so you both would prefer:
return new ParseState(order.addBranch(token), source, offset, token.isIterable() ?
iterations.add(ZERO) : iterations);
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer that yes 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was a little worried when looking at the top of the diff and saw the addition of a list to the ParseState
. But it does make the implementation of CurrentIteration
so much simpler (and more correct in corner cases) that it's absolutely worth the added data (which is going to be negligible in terms of memory usage/speed anyway).
I think @rdvdijk's point about checking that the correct Token
is closed is a valid point though, so please let's have a look at doing that.
Furthermore, a very nice optimization of the iterable tokens by using a base class. The amount of red in the diff of the affected classes says enough!
I do hope however that we can fix the BetterCodeHub score though, so let's look at that as well.
In short: excellent work!!
this.order = checkNotNull(order, "order"); | ||
this.source = checkNotNull(source, "source"); | ||
this.offset = checkNotNegative(offset, "offset"); | ||
this.iterations = checkNotNull(iterations, "iteration"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor, missing s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
return new ParseState(order.add(parseReference), source, offset, iterations); | ||
} | ||
|
||
public ParseState iter() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iterate/addIteration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, iterate it is.
core/src/main/java/io/parsingdata/metal/expression/value/reference/CurrentIteration.java
Outdated
Show resolved
Hide resolved
/** | ||
* A {@link ValueExpression} that represents the 0-based current iteration in an | ||
* iterable {@link Token} (e.g. when inside a {@link Rep} or {@link RepN}). | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps also mention it can be checked if a Token is iterable, using a method reference link?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
|
||
public final Token token; | ||
|
||
IterableToken(String name, final Token token, Encoding encoding) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing final.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
this.token = checkNotNull(token, "token"); | ||
} | ||
|
||
protected final Optional<ParseState> parse(final Environment environment, final Function<Environment, Boolean> stopCondition, final Function<Environment, Optional<ParseState>> ifIterationFails) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function<Env, Boolean>
= Predicate<Env>
:) (unless you would want it to return a null)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tnx!
.map(nextParseState -> intermediate(() -> iterate(environment.withParseState(nextParseState), count - 1))) | ||
.orElseGet(() -> complete(Util::failure)); | ||
final BigInteger count = counts.head.get().asNumeric(); | ||
return parse(environment, env -> env.parseState.iterations.head.compareTo(count) >= 0, env -> failure()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps add a method like ParseState#currentIteration = iterations.head
?
return complete(() -> success(environment.parseState.closeBranch(this))); | ||
} | ||
return token | ||
.parse(environment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My head is a bit foggy, but isn't this an unsafe recursive call? If you would just nest iterable tokens which do not stop at the first iteration?
core/src/main/java/io/parsingdata/metal/expression/value/reference/CurrentIteration.java
Show resolved
Hide resolved
The updated current iteration implementation is very nicely done by the way 😄 |
core/src/main/java/io/parsingdata/metal/expression/value/reference/CurrentIteration.java
Outdated
Show resolved
Hide resolved
{ "[0, 1] seq(!CURRENT_ITERATION, ...)", seq(VALUE_NOT_EQ_ITERATION, VALUE_NOT_EQ_ITERATION), stream(0, 1), enc(), true }, | ||
{ "[0] !CURRENT_ITERATION", VALUE_NOT_EQ_ITERATION, stream(0), enc(), true }, | ||
{ "[0 | 0, 1 | 0, 0, 2 | 0, 0, 0, 3] rep(CURRENT_ITERATION)", rep(def("value", add(CURRENT_ITERATION, con(1)), eqNum(CURRENT_ITERATION))), stream(0, 0, 1, 0, 0, 2, 0, 0, 0, 3), enc(), true }, | ||
{ "[1, 1, 0, 1 | 0 | 2 | 3] repn=4(size), rep(def(any, sizeRef(CURRENT_INDEX)))", seq(repn(def("size", 1), con(4)), rep(def("value", nth(ref("size"), CURRENT_ITERATION)))), stream(1, 1, 0, 1, 0, 2, 3), enc(), true }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please let me know when you've added that test 👍
…and CURRENT_ITERATION behaviour inside an IterableToken.
…mmutablePair class.
Implementation of current iteration for Rep, Repn and While loops. The loops are 0-based, meaning that the first iteration of a loop returns 0 when CURRENT_ITERATION is used.
This issue is scoped to only implement referencing to the current iteration. Current iteration of nested loops are supported, but the implementation to reference to a parent loop from within nested loops is handled in a different issue (#240).
TODO based on review comments: