eof: new contract creation #15512
base: develop
Thank you for your contribution to the Solidity compiler! A team member will follow up shortly. If you haven't read our contributing guidelines and our review checklist before, please do it now, this makes the reviewing process and accepting your contribution smoother. If you have any questions or need our help, feel free to post them in the PR or talk to us directly on the #solidity-dev channel on Matrix.
f8a9ae7
to
947c5dd
Compare
I only skimmed through the PR for now. Here's some initial feedback.
My main question would be whether it wouldn't make more sense to deal with loadimmutable/setimmutable in a separate PR first. It's related to dataloadn, and it looks like creation depends on immutables to some extent.
libevmasm/Assembly.cpp
Outdated
case EofCreate:
{
    ret.bytecode.push_back(static_cast<uint8_t>(Instruction::EOFCREATE));
    ret.bytecode.push_back(static_cast<uint8_t>(item.data()));
    break;
}
case ReturnContract:
{
    ret.bytecode.push_back(static_cast<uint8_t>(Instruction::RETURNCONTRACT));
    ret.bytecode.push_back(static_cast<uint8_t>(item.data()));
    break;
}
This comment was marked as resolved.
Also probably worth considering whether we should maybe create a general assembly item for operations with immediates like I mentioned in #15456 (comment). Though I guess we'd need more than one to accommodate different numbers of immediates and their sizes?
I'm not sure about this, and it's also something we can refactor later, but I just wanted to put the general idea up for discussion.
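To make the idea of a general item for operations with immediates concrete, here is a minimal sketch. The OperationWithImmediates struct and the emit() function are hypothetical and are not part of libevmasm; the opcode values are the ones assigned by EIP-7620 and are an assumption here rather than something taken from this PR. Immediates are stored as raw bytes, which sidesteps the question of different immediate sizes by leaving the encoding to whoever builds the item.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an assembly item that carries an opcode plus
// its immediate bytes. This is not the actual libevmasm AssemblyItem.
struct OperationWithImmediates
{
    uint8_t opcode;
    std::vector<uint8_t> immediates; // already encoded, one entry per byte
};

// Opcode values as assigned by EIP-7620 (assumed, not taken from the PR).
constexpr uint8_t EOFCREATE = 0xec;
constexpr uint8_t RETURNCONTRACT = 0xee;

// Appends the opcode followed by all of its immediate bytes to the bytecode.
void emit(std::vector<uint8_t>& _bytecode, OperationWithImmediates const& _item)
{
    _bytecode.push_back(_item.opcode);
    for (uint8_t immediate: _item.immediates)
        _bytecode.push_back(immediate);
}
```

With such an item, both cases from the diff above would collapse into one code path: the EofCreate and ReturnContract branches differ only in the opcode they push.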
6567616
to
82930c3
Compare
feeb58b
to
75122e0
Compare
This comment was marked as resolved.
7bde2af
to
e1840a3
Compare
a200d15
to
107b86e
Compare
107b86e
to
2bb05a2
Compare
def54bb
to
46af949
Compare
792dd39
to
d256b13
Compare
There are still a few things to fix here, but we're very close to being done with it. The most important one is the proper validation and resolution of subobject names in builtins. The others are trivial to fix.
There's also a bunch of small things that could be cleaned up. I didn't want to drop a ton of comments so I just fixed them myself and pushed it as a fixup into a copy of your eof-contract-creation branch. Please take a look and cherry-pick it into the PR.
e213e0e
to
57bcb13
Compare
Approving, since only a few cosmetic things are left now.
@@ -0,0 +1,13 @@
object "a" {
    code {
        let addr := eofcreate("b", 0, 0, 0, 0)
You might want to also cover the case where the name refers to data rather than an object.
Yes, but this should generate an error. Should this be fixed in #15536?
We would have to somehow distinguish between data names and object names. One idea is to create a dedicated struct which holds the names and implements a proper find, depending on what we are looking for.
I implemented this in a separate commit, d0e7890.
Let me know if it's OK, apart from the naming, which still has to be fixed.
The test needs to be here, because in #15536 you don't have eofcreate() and returncontract(), and those are the only things that you can test it with.
And we don't have to distinguish objects from data just for this test. It should already work on EOF, because only object names are supposed to be accepted. The test is only meant to confirm that it works correctly. But yeah, being able to distinguish them would not hurt, because it would let us generate a more specific error message rather than just saying "this object does not exist".
The general idea with the struct and separating data from objects is good.
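A rough sketch of what such a struct could look like follows. All names here (SubKind, SubNames, checkEofCreateTarget) and the error message wording are hypothetical; the actual implementation in the PR may be structured quite differently. The point is only that a lookup which reports the kind of the name makes the more specific error message possible.

```cpp
#include <optional>
#include <set>
#include <string>

// Hypothetical: what kind of entity a name in an object refers to.
enum class SubKind { Object, Data };

// Hypothetical container that keeps object names and data names apart.
struct SubNames
{
    std::set<std::string> objects;
    std::set<std::string> data;

    // Returns the kind of the named entity, or nullopt if the name is unknown.
    std::optional<SubKind> find(std::string const& _name) const
    {
        if (objects.count(_name))
            return SubKind::Object;
        if (data.count(_name))
            return SubKind::Data;
        return std::nullopt;
    }
};

// Returns an empty string if the name is a valid target for eofcreate() or
// returncontract(), and an illustrative error message otherwise.
std::string checkEofCreateTarget(SubNames const& _names, std::string const& _name)
{
    std::optional<SubKind> kind = _names.find(_name);
    if (!kind)
        return "Unknown object \"" + _name + "\".";
    if (*kind == SubKind::Data)
        return "\"" + _name + "\" refers to data, but an object name is required.";
    return "";
}
```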
Added.
// dup1
// dup1
// dup1
// eofcreate(0)
This actually looks a bit ambiguous, as if EOFCREATE was getting only one argument.
How about we change the notation for immediates in asm output to {} (in AssemblyItem::toAssemblyText)? Then, instead of
// eofcreate(0)
we'd get this:
// eofcreate{0}
I'd do that for RETURNCONTRACT and AUXDATALOADN as well. And I'd also return true for RETURNCONTRACT from AssemblyItem::canBeFunctional(). Is there actually a good reason why it's not like that already?
I will change to brackets in AssemblyItem::toAssemblyText. For DATALOADN I would do this in a separate PR, #15545.
Regarding AssemblyItem::canBeFunctional(): it depends on how you define a functional context. I assumed that if there is no return value, it's not functional.
It depends how you define functional context. I assumed that if there is no return value it's not functional.
canBeFunctional() determines whether the parameters of the item should be printed in parentheses after it (in functional style) or as separate instructions before it.
At least that's how I understand it from how Functionalizer uses it:
solidity/libevmasm/Assembly.cpp
Lines 324 to 345 in 879d8e6
if (!(
    _item.canBeFunctional() &&
    _item.returnValues() <= 1 &&
    _item.arguments() <= m_pending.size()
))
{
    flush();
    m_out << m_prefix << (_item.type() == Tag ? "" : " ") << expression << std::endl;
    return;
}
if (_item.arguments() > 0)
{
    expression += "(";
    for (size_t i = 0; i < _item.arguments(); ++i)
    {
        expression += m_pending.back();
        m_pending.pop_back();
        if (i + 1 < _item.arguments())
            expression += ", ";
    }
    expression += ")";
}
Here m_pending is a list of previous expressions, which are assumed to be the arguments.
I think that we want true for almost everything. The only things for which we return false seem to be PUSH and DUP instructions and Tag, AssignImmutable, VerbatimBytecode items. Not sure about the reasons for all of them but e.g. DUP is defined as taking the whole top of the stack as arguments (so e.g. DUP16 takes 17 arguments).
The closest thing we have to RETURNCONTRACT is probably RETURN and for that we return true.
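The effect of canBeFunctional() on the printed assembly can be illustrated with a simplified, self-contained model of the logic quoted above. The Item struct and this cut-down Functionalizer are hypothetical stand-ins, not the real libevmasm classes, and the handling after the argument loop (pushing the expression and flushing when the item does not return exactly one value) is an assumption about the part of the function that the excerpt does not show.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for an assembly item.
struct Item
{
    std::string name;      // printed text, including any {immediate}
    size_t arguments;      // number of stack arguments
    size_t returnValues;   // number of stack return values
    bool canBeFunctional;  // whether arguments may be printed in parentheses
};

// Simplified model: functional items consume pending expressions as their
// arguments; any other item forces the pending expressions out as-is.
class Functionalizer
{
public:
    void feed(Item const& _item)
    {
        if (!(_item.canBeFunctional && _item.returnValues <= 1 && _item.arguments <= m_pending.size()))
        {
            flush();
            m_lines.push_back(_item.name);
            return;
        }
        std::string expression = _item.name;
        if (_item.arguments > 0)
        {
            expression += "(";
            for (size_t i = 0; i < _item.arguments; ++i)
            {
                expression += m_pending.back();
                m_pending.pop_back();
                if (i + 1 < _item.arguments)
                    expression += ", ";
            }
            expression += ")";
        }
        m_pending.push_back(expression);
        // Assumption: an item without exactly one result ends the expression.
        if (_item.returnValues != 1)
            flush();
    }

    // Moves all pending expressions to the output, oldest first.
    void flush()
    {
        for (std::string const& expression: m_pending)
            m_lines.push_back(expression);
        m_pending.clear();
    }

    std::vector<std::string> const& lines() const { return m_lines; }

private:
    std::vector<std::string> m_pending;
    std::vector<std::string> m_lines;
};
```

In this model, feeding two one-value items followed by a two-argument returncontract{0} item yields a single line with the arguments in parentheses when the flag is true, and three separate lines when it is false. No return value is needed for the functional form, which matches how RETURN is treated.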
BTW, I wonder if it wouldn't be better to show sub names like we do for dataOffset/dataSize instead of the numeric container IDs.
So for example in the assembly we would print eofcreate{sub_0} rather than eofcreate{0}.
Though I guess with EOF the ID approach is actually viable because we can't have nesting so maybe the current approach is still good. It's something to consider though.
The first is fixed. I will check the latter.
57bcb13
to
bc58b12
Compare
0c242cd
to
5e691b4
Compare
libevmasm/AssemblyItem.cpp
Outdated
else if (type() == EOFCreate)
    return 4;
else if (type() == ReturnContract)
    return 2;
Looking at #15534 I realized that here we should have actually used the instruction info, since for the new assembly item types we actually do have it:
if (
    type() == Operation ||
    type() == AuxDataLoadN ||
    type() == EOFCreate ||
    type() == ReturnContract
)
    // The latest EVMVersion is used here, since the InstructionInfo is assumed to be
    // the same across all EVM versions except for the instruction name.
    return static_cast<size_t>(instructionInfo(instruction(), EVMVersion()).args);
Same in returnValues(), canBeFunctional(), nameAndData() and operator<<(), at least for EOFCREATE and RETURNCONTRACT, because they have their own item types only due to the immediate argument and otherwise would just be an Operation.
EDIT: I guess the instruction() field is not set in these items. But it really should be. It would simplify things in SemanticInformation too.
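As a sketch of the suggestion, the following models item types that carry an instruction and derive their argument and return counts from a shared instruction table instead of hardcoding them per item type. Everything here is a simplified, hypothetical model: the real instructionInfo() also takes an EVMVersion, and the real AssemblyItem has many more types. The counts for EOFCREATE (4 arguments) and RETURNCONTRACT (2 arguments) come from the diff above; the opcode values follow EIP-7620 and are an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Simplified model of a few instructions (opcode values assumed per EIP-7620).
enum class Instruction: uint8_t { EOFCREATE = 0xec, RETURNCONTRACT = 0xee, RETURN = 0xf3 };

// Number of stack arguments and return values of an instruction.
struct InstructionInfo
{
    size_t args;
    size_t ret;
};

// Hypothetical single-source-of-truth table for instruction properties.
InstructionInfo instructionInfo(Instruction _instruction)
{
    static std::map<Instruction, InstructionInfo> const table = {
        {Instruction::EOFCREATE, {4, 1}},
        {Instruction::RETURNCONTRACT, {2, 0}},
        {Instruction::RETURN, {2, 0}},
    };
    return table.at(_instruction);
}

enum class ItemType { Operation, EOFCreate, ReturnContract };

// Hypothetical item in which the instruction is set for all three types.
struct Item
{
    ItemType type;
    Instruction instruction;

    // No per-type constants: both counts come from the instruction table.
    size_t arguments() const { return instructionInfo(instruction).args; }
    size_t returnValues() const { return instructionInfo(instruction).ret; }
};
```

The design point is that an item type which exists only because of an immediate argument does not need its own copy of the stack behaviour, as long as its instruction is set.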
The first is fixed. The second requires adding a new AssemblyItem constructor which accepts an AssemblyItemType and an Instruction.
It would help a lot if we had an easy way to distinguish the AssemblyItemType values that have m_instruction properly set from those that do not. Even with this member properly set, we still have to manually filter by item type, e.g. here:
bool SemanticInformation::altersControlFlow(AssemblyItem const& _item)
{
    // Only Operation and ReturnContract items carry an instruction that can alter control flow.
    if (
        _item.type() != evmasm::Operation &&
        _item.type() != evmasm::ReturnContract
    )
        return false;
    switch (_item.instruction())
    {
    // note that CALL, CALLCODE and CREATE do not really alter the control flow, because we
    // continue on the next instruction
    case Instruction::JUMP:
    case Instruction::JUMPI:
    case Instruction::RETURN:
    case Instruction::SELFDESTRUCT:
    case Instruction::STOP:
    case Instruction::INVALID:
    case Instruction::REVERT:
    case Instruction::RETURNCONTRACT:
        return true;
    default:
        return false;
    }
}
So my idea is to add a static helper isSingleInstruction to AssemblyItem. I will prepare a separate PR with this change.
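One possible shape for such a helper is sketched below. Only the name isSingleInstruction comes from the comment above; the reduced enums, the struct layout, and the choice of which item types count as single instructions are assumptions made for illustration.

```cpp
#include <cstdint>

// Reduced model of the instruction set (RETURNCONTRACT value assumed per EIP-7620).
enum class Instruction: uint8_t
{
    STOP = 0x00,
    ADD = 0x01,
    JUMP = 0x56,
    RETURNCONTRACT = 0xee,
    RETURN = 0xf3
};

enum class AssemblyItemType { Operation, Push, Tag, EOFCreate, ReturnContract };

// Hypothetical, simplified assembly item.
struct AssemblyItem
{
    AssemblyItemType type;
    Instruction instruction; // only meaningful if isSingleInstruction(type)

    // True for item types that map to exactly one instruction and therefore
    // have the instruction member properly set.
    static bool isSingleInstruction(AssemblyItemType _type)
    {
        switch (_type)
        {
        case AssemblyItemType::Operation:
        case AssemblyItemType::EOFCreate:
        case AssemblyItemType::ReturnContract:
            return true;
        default:
            return false;
        }
    }
};

// The type filter is now a single call instead of a hand-written list.
bool altersControlFlow(AssemblyItem const& _item)
{
    if (!AssemblyItem::isSingleInstruction(_item.type))
        return false;
    switch (_item.instruction)
    {
    case Instruction::JUMP:
    case Instruction::RETURN:
    case Instruction::STOP:
    case Instruction::RETURNCONTRACT:
        return true;
    default:
        return false;
    }
}
```

Callers such as the SemanticInformation functions would then not need to be updated every time a new item type with an instruction is introduced.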
I added a new constructor and adjusted the instruction() function accordingly.
5e691b4
to
d1d6eb3
Compare
Depends on #15456 (merged).
Depends on #15521 (dropped).
Depends on #15529 (merged).
Depends on #15535 (merged).
Depends on #15536.