-
Notifications
You must be signed in to change notification settings - Fork 308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature/loop without var decl #474
base: main
Are you sure you want to change the base?
feature/loop without var decl #474
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool! It looks like there are some more ways to use for-of (and for-in) loops: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/for...of#examples for example exotic cases like for (let [a,b,c] of foo)
or for (a.b.c of bar)
. I wonder if we should at least already plan for supporting those as well.
Given that the LHS cannot be an arbitrary expression but must be one of a few cases (variable definition, variable, desctructuring pattern, property access, ...), probably the best/right/only way to do this is by supporting different types of these loops, with different inputs and possibly additional data (e.g. the property name as string or the shape of the desctructuring pattern). So then I think there are three high-level options:
-
Have a single BeginForOfLoop with some sort of "type" enum that indicates whether we declare a new variable, use a destructuring pattern, etc.
-
Have different BeginForOfLoop ops, e.g. BeginPlainForOfLoop, BeginForOfLoopWithReassignment, BeginForOfLoopWithPropertyAccess etc. That might be most consistent with other operations, and indeed I just realized we already have a
BeginForOfLoopWithDestruct
operation -
Design it like plain for loops, with a
BeginForOfLoopHeader
andBeginForOfLoopBody
. I think this will be difficult though because in the header, we would need exactly one additional operation of a specific kind, which is hard to implement in FuzzIL (typically we don't want such constraints so that we can just copy instructions between programs or minimize them away, etc.)
I'm currently leaning towards (2), mostly because it's consistent with what we already have and because it keeps things a bit simpler in the sense that a given operation always has the same number of inputs and outputs. WDYT?
I wanted to implement case 3 with commit 89d2d77 I noticed that there are still some persisting compilation errors. See this before and after: before
after
I assume this error stems from the JavascriptLifter in the assign method because there we create the line "let v3 = c;" This will be fixed soon |
This is ready for the final review now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks really good, just 2 questions.
loopVar = existingVar | ||
} else { | ||
loopVar = emit(LoadNamedVariable(identifier.name)).output | ||
map(identifier.name, to: loopVar) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to do this here but not in the case above (line 436)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the case starting in line 438 is used for global variables and the case in line 436 for non-global variables. In both cases, the compiler just has an identfier as input. For global variables, lookupIdentifer will return nil because the identifier is not associted with a FuzzIL variable. Either because the global Variable has not been set but also when it has actually been set before, i.e. with StoreNamedVariable (see example "a" and "b")
Illustration
for (a of ["new a"]) {} // line 469. lookupIdentifier didn't return FuzzIL Variable because a is used for the first time here
output("value of a: " + a);
b = "old b";
for (b of ["new b"]) {} // line 469. lookupIdentifier didn't return FuzzIL Variable because the b identifier was not mapped to a FuzzIL variable
output("value of b: " + b);
var c = "old c";
for (c of ["new c"]) {} // line 467 because lookupIdentifier returned the FuzzIL Variable that the previous line mapped the identifier 'c' to
output("value of c: " + c);
for (var d of ["new d"]) {} // line 457. lhs of for loop header is not an identfier...
output("value of d: " + d);
loopVar = existingVar | ||
} else { | ||
loopVar = emit(LoadNamedVariable(identifier.name)).output | ||
map(identifier.name, to: loopVar) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
@@ -886,7 +886,7 @@ public class JavaScriptLifter: Lifter { | |||
w.assign(expr, to: instr.output) | |||
|
|||
case .loadNamedVariable(let op): | |||
w.assign(Identifier.new(op.variableName), to: instr.output) | |||
w.assign(Identifier.new(op.variableName), to: instr.output, forceInlining: true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this needed here? To support reasignment to global variables? Why don't we inline there without this, i.e. what kind of code would we generate?
I feel like this is maybe not the right way to approach it. In our IL, I guess we would currently generate code like this:
v2 <- LoadNamedVariable 'foo'
ForInLoopWithReassign v1, v2
...
EndForInLoop
So intuitively, I'd expect that to look more like this in JS
let v2 = foo;
for (v2 in v1) {
...
}
I don't really have a great alternative though. I guess with the current way variables work, we could add an explicit StoreNamedVariable:
v2 <- LoadNamedVariable 'foo'
ForInLoopWithReassign v1, v2
StoreNamedVariable v2, 'foo'
...
EndForInLoop
Probably this isn't super important at the moment (?). Maybe in the future we'll want to redesign the way our variables work (because that's also causing a number of other issues in the compiler), and then we can also solve this properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will try to justify this with an example. Let's pretend the newly added forceInlining parameter doesn't exist. Then we would have an issue in the part where we lift FuzzIL->JS
JS before compiling to FuzzIL
for (a of ["new a"]) {}
b = "old b";
for (b of ["new b"]) {}
JS after lifting FuzzIL
const v1 = ["new a"];
let v2 = a;
for (v2 of v1) {
}
b = "old b";
const v5 = ["new b"];
let v6 = b;
for (v6 of v5) {
}
As we can see here, the lifting was not successful because the iterator in the loops became a new js variable even though we want to use the global variable. The reason why inlining can not happen here is a bit specific: The For*LoopWithReassignment loops we defined in JSOperations.swift have the property reassigning defined in Semantics.swift. This causes shouldTryInlining to return false. This is why I introduced forceInlining.
Without loss of generality, we will only talk about for-of loops in the following, even though everything applies to for-in loops as well. The problem (Expected variable declaration as init part of a for-of loop, found Identifier) addressed with this PR was the most common reason why compilation of JS files in test/mjsunit failed, excluding the yet to be supported different parameter types and currently pending PRs.
Description of the problem
Previously we could compile this
case 1
but not this
case 2
or even this
case 3
Solution in a nutshell
This PR only addresses case 2 so far. To support this new feature, in a nutshell, we first verify if the identifier points to a declared variable (so we know we are not in case 3) and if so, we give the variable corresponding to the identifier as input into BeginForInLoop and do not create a new inner output for the iterator. During lifting to FuzzIL and JS we check the number of inputs to know if the iterator is the second input (as in case 2) or the inner output (as in case 1). Because iterating over an object with the iterator inherently changes the iterators value, we need to mark this FuzzIL instruction as having the "reassigning" property. Otherwise inlining would happen because the lifter thinks that the iterator still has the previous value, e.g. 'foo' in case 2
Some questions and next steps
The current design choice is to give BeginForInLoop a parameter called usesPredeclaredIterator to distinguish case 1 from case 2. However, there is also the option of heaving unique IL instructions for each case. I like the current solution more but if you like the other option better I can change it.
case 3 is not supported. I want to change that by utilizing LoadNamedVariable and StoreNamedVariable
Writing tests. Do you have suggestions where to implement it`? Otherwise I would write the obligatory CompilerTest and now, since we also changed FuzzIL itself, also tests in FuzzilliTests