This repository contains the source code of Gematria, a framework for machine learning on machine code. It includes implementations of the GRANITE model and the Ithemal hierarchical LSTM model for learning the inverse throughput of basic blocks.
Our models are built on top of TensorFlow 2.x (using the TensorFlow 1.x compatibility layer) in a mix of C++ and Python. Most of the training code is written in Python; we use C++ for the more demanding parts of the code like graph construction. We use pybind11 to make C++ APIs available in Python.
Basic requirements that need to be installed before starting:
- Bazel 6.0 or newer.
- A C++ compiler supported by Bazel that compiles C++17. Recent versions of GCC and Clang on Linux both fit the bill.
- Python 3.10 or newer.
- Git.
- PIP.
Additional dependencies, including TensorFlow, Protocol Buffers, and various Python libraries, are installed through PIP and through Bazel's WORKSPACE file.
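The requirements above can be checked before starting with a short script. This is only a sketch: `check_tool` and `python_is_recent` are hypothetical helpers written for this example and are not part of Gematria. The script does not verify the Bazel version or the C++ compiler; `bazel --version` and your compiler's `--version` output still need a manual look.

```shell
#!/bin/sh
# Hypothetical helper (not part of Gematria): reports whether a tool is on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found:   $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Hypothetical helper: succeeds when the given interpreter is Python 3.10 or newer.
python_is_recent() {
  "$1" -c 'import sys; sys.exit(0 if sys.version_info >= (3, 10) else 1)'
}

# Count how many of the basic requirements are not installed.
missing=0
for tool in bazel python3 git pip; do
  check_tool "$tool" || missing=$((missing + 1))
done
echo "missing tools: $missing"

# Only check the Python version when an interpreter is available.
if command -v python3 >/dev/null 2>&1; then
  python_is_recent python3 || echo "python3 is older than 3.10"
fi
```

The tool names in the loop are assumptions about how the programs are called on your system; for example, some distributions install `pip3` instead of `pip`.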
We strongly recommend using virtualenv to install Python packages to avoid dependency version conflicts with other libraries.
# Get the source code.
$ git clone https://github.com/google/gematria.git
$ cd gematria
# Set up virtualenv.
$ pip install virtualenv
$ virtualenv env
$ . env/bin/activate
# Install Python dependencies.
$ pip install -r requirements.in
# On OS X only. The dependencies of tensorflow-ranking are not set up correctly
# and it needs to be installed manually.
$ pip install --no-deps tensorflow-ranking
# Build the project, run tests, ...
$ bazel build ...
$ bazel test ...
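The Python dependency steps above can also be wrapped in a small script that refuses to run outside a virtualenv and applies the tensorflow-ranking step only on OS X. This is a minimal sketch, not a script shipped with Gematria: the function names are hypothetical, and it assumes that an active virtualenv exports `VIRTUAL_ENV` and that `uname -s` prints `Darwin` on OS X.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Gematria.

# Succeeds when a virtualenv is active; its activate script exports VIRTUAL_ENV.
in_virtualenv() {
  [ -n "${VIRTUAL_ENV:-}" ]
}

# Succeeds on OS X. The optional argument overrides the detected kernel name.
is_macos() {
  [ "${1:-$(uname -s)}" = "Darwin" ]
}

# Installs the Python dependencies; run it from the repository root.
install_python_deps() {
  if ! in_virtualenv; then
    echo "error: activate a virtualenv first" >&2
    return 1
  fi
  pip install -r requirements.in || return 1
  if is_macos; then
    # The dependencies of tensorflow-ranking are not set up correctly on OS X.
    pip install --no-deps tensorflow-ranking || return 1
  fi
}
```

The script only defines functions. After sourcing it, call `install_python_deps` from the `gematria` directory with the virtualenv activated.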
We develop and test our code on Linux and x86-64, and we also test it on Mac OS X and ARM. We have not tested other configurations, but we expect the code to work with minimal changes on other architectures and platforms that run TensorFlow.
See the training and inference guides.
See the separate document.
- Issue tracker: https://github.com/google/gematria/issues
We welcome patches -- see CONTRIBUTING for more information on how to submit a patch.
@inproceedings{granite:iiswc:2022,
author = {O. Sykora and P. Phothilimthana and C. Mendis and A. Yazdanbakhsh},
booktitle = {2022 IEEE International Symposium on Workload Characterization (IISWC)},
title = {{GRANITE: A Graph Neural Network Model for Basic Block Throughput Estimation}},
year = {2022},
}