eof: new contract creation #15512

Open
wants to merge 2 commits into
base: develop
from

Conversation

rodiazet
Contributor

@rodiazet rodiazet commented Oct 14, 2024

Depends on #15456 (merged)
Depends on #15521 (dropped)
Depends on #15529 (merged)
Depends on #15535 (merged)
Depends on #15536


Thank you for your contribution to the Solidity compiler! A team member will follow up shortly.

If you haven't read our contributing guidelines and our review checklist yet, please do so now; it makes the review process and the acceptance of your contribution smoother.

If you have any questions or need our help, feel free to post them in the PR or talk to us directly on the #solidity-dev channel on Matrix.

@rodiazet rodiazet force-pushed the eof-contract-creation branch 4 times, most recently from f8a9ae7 to 947c5dd Compare October 14, 2024 13:36
@cameel cameel added EOF has dependencies The PR depends on other PRs that must be merged first labels Oct 14, 2024
@cameel cameel changed the title Eof contract creation EOF contract creation Oct 14, 2024
Member

@cameel cameel left a comment

I only skimmed through the PR for now. Here's some initial feedback.

My main question is whether it would make more sense to deal with loadimmutable/setimmutable in a separate PR first. It's related to dataloadn, and it looks like creation depends on immutables to some extent.

Comment on lines 1427 to 1441
case EofCreate:
{
	ret.bytecode.push_back(static_cast<uint8_t>(Instruction::EOFCREATE));
	ret.bytecode.push_back(static_cast<uint8_t>(item.data()));
	break;
}
case ReturnContract:
{
	ret.bytecode.push_back(static_cast<uint8_t>(Instruction::RETURNCONTRACT));
	ret.bytecode.push_back(static_cast<uint8_t>(item.data()));
	break;
}

This comment was marked as resolved.

Member

Also probably worth considering whether we should maybe create a general assembly item for operations with immediates like I mentioned in #15456 (comment). Though I guess we'd need more than one to accommodate different numbers of immediates and their sizes?

I'm not sure about this and it's also something we can refactor later, but I just wanted to put the general idea up for discussion.
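
To make the idea a bit more concrete, here is a rough sketch of what a single item type carrying a variable number of fixed-width immediates could look like. `Immediate`, `ImmediateItem` and `appendTo` are hypothetical names that do not exist in libevmasm, and big-endian encoding of multi-byte immediates is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical generic item: one opcode followed by fixed-width immediates.
// None of these names exist in libevmasm; this only illustrates the idea.
struct Immediate
{
	uint64_t value;
	uint8_t sizeInBytes; // width of the immediate in the bytecode, e.g. 1 or 2
};

struct ImmediateItem
{
	uint8_t opcode;
	std::vector<Immediate> immediates;
};

// Appends the opcode and then each immediate, most significant byte first.
inline void appendTo(std::vector<uint8_t>& _bytecode, ImmediateItem const& _item)
{
	_bytecode.push_back(_item.opcode);
	for (Immediate const& immediate: _item.immediates)
	{
		// Sketch-level sanity check: the value is stored in 64 bits.
		assert(immediate.sizeInBytes >= 1 && immediate.sizeInBytes <= 8);
		for (int shift = 8 * (immediate.sizeInBytes - 1); shift >= 0; shift -= 8)
			_bytecode.push_back(static_cast<uint8_t>(immediate.value >> shift));
	}
}
```

With something like this, EOFCREATE and RETURNCONTRACT would each be an item with one one-byte immediate, and an instruction with a two-byte immediate would only differ in `sizeInBytes`, so the two `case` blocks above would collapse into one.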

libevmasm/AssemblyItem.h (outdated, resolved)
libsolidity/codegen/ir/IRGenerator.cpp (outdated, resolved)
@rodiazet rodiazet force-pushed the eof-contract-creation branch 2 times, most recently from 6567616 to 82930c3 Compare October 15, 2024 13:06
@cameel cameel removed the has dependencies The PR depends on other PRs that must be merged first label Oct 16, 2024
@rodiazet rodiazet force-pushed the eof-contract-creation branch 2 times, most recently from feeb58b to 75122e0 Compare October 18, 2024 13:23
@rodiazet

This comment was marked as resolved.

@rodiazet rodiazet force-pushed the eof-contract-creation branch 2 times, most recently from 7bde2af to e1840a3 Compare October 18, 2024 13:29
@rodiazet rodiazet changed the title EOF contract creation eof: new contract creation Oct 18, 2024
@rodiazet rodiazet force-pushed the eof-contract-creation branch 3 times, most recently from a200d15 to 107b86e Compare October 18, 2024 14:11
@cameel cameel added the has dependencies The PR depends on other PRs that must be merged first label Oct 18, 2024
libevmasm/Assembly.cpp (outdated, resolved)
libevmasm/SemanticInformation.cpp (resolved)
libevmasm/SemanticInformation.cpp (resolved)
test/libyul/objectCompiler/eof/auxdata_load_store.yul (outdated, resolved)
@rodiazet rodiazet force-pushed the eof-contract-creation branch 5 times, most recently from def54bb to 46af949 Compare October 21, 2024 11:05
libsolidity/codegen/ir/IRGenerationContext.cpp (outdated, resolved)
libsolidity/codegen/ir/IRGenerator.cpp (outdated, resolved)
libyul/backends/evm/EVMDialect.cpp (outdated, resolved)
test/tools/yulInterpreter/EVMInstructionInterpreter.cpp (outdated, resolved)
libsolidity/codegen/ir/IRGeneratorForStatements.cpp (outdated, resolved)
libsolidity/codegen/ir/IRGeneratorForStatements.cpp (outdated, resolved)
libsolidity/codegen/ir/IRGenerationContext.cpp (outdated, resolved)
libsolidity/codegen/ir/IRGenerator.cpp (outdated, resolved)
libsolidity/codegen/ir/IRGenerator.cpp (outdated, resolved)
@rodiazet rodiazet force-pushed the eof-contract-creation branch 7 times, most recently from 792dd39 to d256b13 Compare October 23, 2024 14:32
Member

@cameel cameel left a comment

There are still a few things to fix here, but we're very close to being done with it. The most important one is the proper validation and resolution of subobject names in builtins. The others are trivial to fix.

There's also a bunch of small things that could be cleaned up. I didn't want to drop a ton of comments so I just fixed them myself and pushed it as a fixup into a copy of your eof-contract-creation branch. Please take a look and cherry-pick it into the PR.

libsolidity/codegen/ir/IRGenerationContext.cpp (outdated, resolved)
libyul/AsmAnalysis.cpp (outdated, resolved)
libyul/backends/evm/EVMDialect.cpp (outdated, resolved)
libsolidity/codegen/ir/IRGenerator.cpp (outdated, resolved)
@rodiazet rodiazet force-pushed the eof-contract-creation branch 6 times, most recently from e213e0e to 57bcb13 Compare October 24, 2024 15:29
cameel previously approved these changes Oct 25, 2024
Member

@cameel cameel left a comment

Approving, since only a few cosmetic things are left now.

libsolidity/codegen/ir/IRGenerator.cpp (outdated, resolved)
@@ -0,0 +1,13 @@
object "a" {
    code {
        let addr := eofcreate("b", 0, 0, 0, 0)
Member

You might want to also cover the case where the name refers to data rather than an object.

Contributor Author

@rodiazet rodiazet Oct 25, 2024

Yes, but this should generate an error. Do you want to fix this in #15536?
We would have to somehow distinguish between data names and object names. One idea is to create a dedicated struct that holds the names and implements a proper find, depending on what we are looking for.

Contributor Author

I implemented this in a separate commit, d0e7890.
Let me know if it's OK, apart from the naming, which still has to be fixed.

Member

The test needs to be here, because in #15536 you don't have eofcreate() and returncontract() and those are the only things that you can test it with.

And we don't have to distinguish objects from data just for this test; it should already work on EOF because only object names are supposed to be accepted. The test is only meant to confirm that it works correctly. But yeah, being able to distinguish them would not hurt because it would let us generate a more specific error message rather than just saying "this object does not exist".

The general idea with the struct and separating data from objects is good.
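
Just to illustrate what I mean, a minimal sketch of such a struct could look like the following. `SubNames`, `SubKind` and `Lookup` are made-up names for this example and are not the actual libyul types; the point is only that the lookup can tell "no such name" apart from "the name exists but refers to the other kind".

```cpp
#include <map>
#include <string>

// Hypothetical registry of sub-names that remembers whether each name
// refers to an object or to a data section. Not the actual libyul structure.
enum class SubKind { Object, Data };
enum class Lookup { Found, WrongKind, NotFound };

struct SubNames
{
	std::map<std::string, SubKind> names;

	// Distinguishes "no such name" from "name exists but has the other kind",
	// which is what allows a more specific error message.
	Lookup find(std::string const& _name, SubKind _expected) const
	{
		auto it = names.find(_name);
		if (it == names.end())
			return Lookup::NotFound;
		return it->second == _expected ? Lookup::Found : Lookup::WrongKind;
	}
};
```

The analyzer could then report something like "expected an object name but got a data name" for `Lookup::WrongKind` and keep the generic message for `Lookup::NotFound`.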

Contributor Author

Added.

// dup1
// dup1
// dup1
// eofcreate(0)
Member

This actually looks a bit ambiguous, as if EOFCREATE was getting only one argument.

How about we change the notation for immediates in the asm output to {} (in AssemblyItem::toAssemblyText)? Then we'd get this:

Suggested change
- // eofcreate(0)
+ // eofcreate{0}

I'd do that for RETURNCONTRACT and AUXDATALOADN as well. And I'd also return true for RETURNCONTRACT from AssemblyItem::canBeFunctional(). Is there actually a good reason why it's not like that already?

Contributor Author

@rodiazet rodiazet Oct 25, 2024

I will change to brackets in AssemblyItem::toAssemblyText. For DATALOADN I would do this in a separate PR, #15545.

Regarding AssemblyItem::canBeFunctional():
It depends how you define functional context. I assumed that if there is no return value it's not functional.

Member

> It depends how you define functional context. I assumed that if there is no return value it's not functional.

canBeFunctional() determines whether the parameters of the item should be printed in parentheses after it (in functional style) or as separate instructions before it.

At least that's how I understand it from how Functionalizer uses it:

if (!(
	_item.canBeFunctional() &&
	_item.returnValues() <= 1 &&
	_item.arguments() <= m_pending.size()
))
{
	flush();
	m_out << m_prefix << (_item.type() == Tag ? "" : " ") << expression << std::endl;
	return;
}
if (_item.arguments() > 0)
{
	expression += "(";
	for (size_t i = 0; i < _item.arguments(); ++i)
	{
		expression += m_pending.back();
		m_pending.pop_back();
		if (i + 1 < _item.arguments())
			expression += ", ";
	}
	expression += ")";
}

Here m_pending is a list of previous expressions, which are assumed to be the arguments.

I think that we want true for almost everything. The only things for which we return false seem to be PUSH and DUP instructions and Tag, AssignImmutable, VerbatimBytecode items. Not sure about the reasons for all of them but e.g. DUP is defined as taking the whole top of the stack as arguments (so e.g. DUP16 takes 17 arguments).

The closest thing we have to RETURNCONTRACT is probably RETURN and for that we return true.
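
As a rough illustration of the functional-style printing described above, here is a stripped-down, standalone model of the argument loop. `functionalize` is a hypothetical free function rather than the real Functionalizer, and it simply returns the bare expression when there are not enough pending expressions instead of flushing.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for the argument-printing part of the Functionalizer.
// `functionalize` is a hypothetical name. It consumes the most recent pending
// expressions as arguments, taking the last pushed one first.
inline std::string functionalize(
	std::string _expression,
	size_t _arguments,
	std::vector<std::string>& _pending
)
{
	// Not enough pending expressions (or no arguments): leave the item bare.
	if (_arguments == 0 || _arguments > _pending.size())
		return _expression;
	_expression += "(";
	for (size_t i = 0; i < _arguments; ++i)
	{
		_expression += _pending.back();
		_pending.pop_back();
		if (i + 1 < _arguments)
			_expression += ", ";
	}
	_expression += ")";
	return _expression;
}
```

With two pending pushes, a two-argument `returncontract{0}` would come out with both of them in parentheses after the braces, the same way `return` is printed today.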

Member

BTW, I wonder if it wouldn't be better to show sub names like we do for dataOffset/dataSize instead of the numeric container IDs.

So for example in the assembly we would print eofcreate{sub_0} rather than eofcreate{0}.

Though I guess with EOF the ID approach is actually viable because we can't have nesting so maybe the current approach is still good. It's something to consider though.

Contributor Author

The first one is fixed. I will check the latter.

libyul/AsmAnalysis.cpp (outdated, resolved)
Comment on lines 195 to 198
else if (type() == EOFCreate)
	return 4;
else if (type() == ReturnContract)
	return 2;
Member

@cameel cameel Oct 25, 2024

Looking at #15534 I realized that here we should have actually used the instruction info, since for the new assembly item types we actually do have it:

	if (
		type() == Operation ||
		type() == AuxDataLoadN ||
		type() == EOFCreate ||
		type() == ReturnContract
	)
		// The latest EVMVersion is used here, since the InstructionInfo is assumed to be
		// the same across all EVM versions except for the instruction name.
		return static_cast<size_t>(instructionInfo(instruction(), EVMVersion()).args);

Same in returnValues(), canBeFunctional(), nameAndData() and operator<<(), at least for EOFCREATE and RETURNCONTRACT, because they have their own item types only due to the immediate argument and otherwise would just be an Operation.

EDIT: I guess the instruction() field is not set in these items. But it really should be. It would simplify things in SemanticInformation too.
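
To show the shape of what I have in mind, here is a toy model in which every item type that wraps exactly one instruction reads its stack arguments and return values from a shared table. All names (`Item`, `Info`, `info`, `hasInstruction`) are invented for the sketch and the table only covers four instructions; the real code would go through instructionInfo().

```cpp
#include <cstddef>
#include <map>

// Toy model, not libevmasm: item types that wrap exactly one instruction
// read their stack arguments from a shared table instead of hard-coding them.
enum class Instruction { ADD, RETURN, EOFCREATE, RETURNCONTRACT };
enum class ItemType { Operation, EOFCreate, ReturnContract, Tag };

struct Info
{
	size_t args; // stack items consumed
	size_t ret;  // stack items produced
};

inline Info const& info(Instruction _instruction)
{
	static std::map<Instruction, Info> const table = {
		{Instruction::ADD, {2, 1}},
		{Instruction::RETURN, {2, 0}},
		{Instruction::EOFCREATE, {4, 1}},
		{Instruction::RETURNCONTRACT, {2, 0}},
	};
	return table.at(_instruction);
}

struct Item
{
	ItemType type;
	Instruction instruction; // only meaningful when hasInstruction() is true

	// In this toy model every type except Tag carries an instruction.
	bool hasInstruction() const { return type != ItemType::Tag; }
	size_t arguments() const { return hasInstruction() ? info(instruction).args : 0; }
	size_t returnValues() const { return hasInstruction() ? info(instruction).ret : 0; }
};
```

The per-type `return 4;` and `return 2;` branches then disappear, because the numbers live in one place next to the instruction definitions.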

Contributor Author

@rodiazet rodiazet Oct 28, 2024

The first is fixed, but the second requires adding a new AssemblyItem constructor which accepts AssemblyItemType and Instruction.

Contributor Author

@rodiazet rodiazet Oct 28, 2024

It would help a lot if we had an easy way to tell which AssemblyItemType values have m_instruction properly set. Even with this member properly set, we still have to filter manually by item type, e.g.:

bool SemanticInformation::altersControlFlow(AssemblyItem const& _item)
{
	if (_item.type() != evmasm::Operation &&
	    _item.type() != evmasm::ReturnContract)
		return false;

	switch (_item.instruction())
	{
	// note that CALL, CALLCODE and CREATE do not really alter the control flow, because we
	// continue on the next instruction
	case Instruction::JUMP:
	case Instruction::JUMPI:
	case Instruction::RETURN:
	case Instruction::SELFDESTRUCT:
	case Instruction::STOP:
	case Instruction::INVALID:
	case Instruction::REVERT:
	case Instruction::RETURNCONTRACT:
		return true;
	default:
		return false;
	}
}

So my idea is to add a static helper isSingleInstruction to AssemblyItem. I will prepare a separate PR with this change.
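
A minimal sketch of how the helper could be used, written as a standalone toy model rather than actual libevmasm code. The set of item types treated as "single instruction" and the reduced `Instruction` enum are assumptions made for illustration, and the helper is a free function here instead of a static member.

```cpp
// Toy model, not libevmasm code. Which item types count as "single
// instruction" is an assumption made for illustration.
enum class ItemType { Operation, AuxDataLoadN, EOFCreate, ReturnContract, Push, Tag };
enum class Instruction { ADD, JUMP, JUMPI, RETURN, STOP, REVERT, EOFCREATE, RETURNCONTRACT };

struct Item
{
	ItemType type;
	Instruction instruction; // assumed valid only if isSingleInstruction(type)
};

// True for item types that correspond to exactly one EVM instruction
// and therefore have a meaningful `instruction` field.
inline bool isSingleInstruction(ItemType _type)
{
	switch (_type)
	{
	case ItemType::Operation:
	case ItemType::AuxDataLoadN:
	case ItemType::EOFCreate:
	case ItemType::ReturnContract:
		return true;
	default:
		return false;
	}
}

inline bool altersControlFlow(Item const& _item)
{
	// One check replaces the hand-written list of item types.
	if (!isSingleInstruction(_item.type))
		return false;

	switch (_item.instruction)
	{
	case Instruction::JUMP:
	case Instruction::JUMPI:
	case Instruction::RETURN:
	case Instruction::STOP:
	case Instruction::REVERT:
	case Instruction::RETURNCONTRACT:
		return true;
	default:
		return false;
	}
}
```

The same guard could be reused by the other SemanticInformation functions, so adding a new item type with an instruction would mean touching only the helper.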

Contributor Author

I added a new constructor and adjusted the instruction() function accordingly.

Labels
EOF external contribution ⭐ has dependencies The PR depends on other PRs that must be merged first
3 participants