This is ByteSpector, a tool to analyse EVM bytecode. A quick preview of what ByteSpector can do is available at https://bytespector.org.
This project provides an EVM bytecode disassembler and Control Flow Graph (CFG) generator. ByteSpector can verify the CFGs by generating a Dafny file that encodes the semantics of the EVM bytecode. The Dafny file can be verified with Dafny. If a CFG is successfully verified, we obtain the following guarantees on the CFG and the bytecode:
- the CFG simulates the bytecode concrete executions ; as a result the CFG can be used to over-approximate the behaviour of the bytecode; = the bytecode does not generate stack-underflow exceptions. (stack-overflows can be checked too).
Warning
The disassembler supports the latest opcodes like PUSH0
, EIP-3855, and is ready for RJump
s EIP-4200 but they are not fully implemented yet.
Warning
Using the tool with the Java implementation requires Java version >= 23.
The disassembler takes as an input some binary representation, EVM bytecode, and produces a readable version of it.
For instance the following binary string, prog
:
60015f5b600a81106014575060405260206040f35b905f5b600a8110602c57506001600a91019190506003565b9060020290601756
is disassembled into the more readable (but arguably still opaque!) EVM assembly code, left-hand side below, and the CFG is depicted on the right-hand side.
The CFG generator outputs a DOT representation of the graph. Hovering some items (segments) reveals some information about the instructions, their gas cost, and for the segment about their stack effects.
Tip
The generator uses a combination of abstract interpretation, loop folding (using weakest pre-conditions) and automata minimisation. It can re-construct CFGs with nested loops, function calls.
Examples of CFGs in DOT format and SVG format are available in the test folder.
A front end is provided at https://bytespector.org or alternatively,
you can use the Graphviz-Online tool to visualise the dot
files.
The input to the disassembler and CFG generator is the deployed (bin-runtime
) part of the compiled code if you compile Solidity with solc
.
For the examples in the repo I have used Yul and solc --yul
to get a text representation of the Yul code that includes the binary representation hexadecimal string.
The CFGs can be formally verified using a Dafny proof object.
The project is written in Dafny but Dafny's backends can be used to generate some target code in several languages. To begin with we have generated artefacts in Python, Java and C# (Dotnet) code.
So you don't need to install Dafny to use the disassembler, you can run the Python or java versions provided in the build/libs
.
To disassemble the bytecode in a given file file.evm
use the following command:
evm-dis git:(main) ✗ ./disassemble.sh file.evm
To generate the CFG in DOT format for the bytecode in a file file.evm
use the following command:
evm-dis git:(main) ✗ ./makeCFG.sh src/dafny/tests/src/simple/simpleCall.bin
Processing file: src/dafny/tests/src/simple/simpleCall.bin
Max depth size: 100
Shortname: simpleCall.bin
CFG (DOT) format generared in build/dot/simpleCall.bin/simpleCall.bin.dot
To generate the Dafny verifier for the bytecode in a file file.evm
use the following command:
evm-dis git:(main) ✗ ./makeProofObj.sh 'src/dafny/tests/src/simple/simpleCall.bin'
Processing file: src/dafny/tests/src/simple/simpleCall.bin
Shortname: simpleCall // resulting file: build/proofs/simpleCall/simpleCall-cfg-verifier.dfy
seg size: 20
If you have Dafny installed (recommended Dafny 4.8.1), you can verify the generated Dafny code using the following command:
evm-dis git:(main) ✗ dafny --version
4.8.1+d15eef77080d3262d783bbed92b285bf148cce6b
evm-dis git:(main) ✗ dafny verify build/proofs/simpleCall/simpleCall-cfg-verifier.dfy
Dafny program verifier finished with 3 verified, 0 errors
You can view the options of the disassembler using:
evm-dis git:(main) ✗ ./run-evm-dis.sh --help
Not enough arguments
usage: <this program> [--help] [--dis] [--proof] arg0 [--refine] arg0 [--segment] [--lib] arg0 [--cfg] arg0 [--raw] [--size] arg0 [--notable] [--title] arg0 [--info] <string>
options
--help [-h] Display help and exit
--dis [-d] Disassemble <string>
--proof [-p] Generate proof object to verify/use the CFG for <string>
--refine [-e] Generate proof object with distinct segments <string>
--segment [-s] Print segment of <string>
--lib [-l] The path to the Dafny-EVM source code. Used to add includes files in the proof object.
--cfg [-c] Max depth. Control flow graph in DOT format
--raw [-r] Display non-minimised and minimised CFGs
--size [-z] The max size of segments. Default is upto terminal instructions or JUMPDEST.
--notable [-n] Don't use tables to pretty-print DOT file. Reduces size of the DOT file.
--title [-t] The name of the program.
--info [-i] The stats of the program (size, segments).
ByteSpector can re-construct CFGs and verify them for a large number of deployed contracts.
For example, we have used it on the list less_than_3000_opcodes.txt
from EVMLisa.
Note
Out of the 1704 contracts (address) in less_than_3000_opcodes.txt
, 979 contracts were successfully disassembled and CFGs generated. The remaining 725 contracts were not disassembled due to the presence of RJUMP
s in the bytecode which is not fully implemented yet.
The bytecode for the 979 benchmarks contracts is in the ./etherscan folder.
The corresponding CFGs are in the ./build/dot/etherscan folder.
The verification files (.dfy
) and the verification results (*-stats.csv
) are in the ./build/proofs/etherscan folder.