Get Started!

Zhengrong Wang edited this page Dec 13, 2020 · 10 revisions

Prerequisites

We tested this workflow on Ubuntu 20.10; it should work on other platforms as well. The following packages are required to compile the whole framework. Alternatively, you can use the Docker setup described in the Docker section below.

$ sudo apt install cmake build-essential autoconf libtool zlib1g zlib1g-dev python3 python-is-python3 python3-pip
$ sudo pip install six
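
Before starting the build, it can help to confirm that the tools are actually on your PATH. The following is a minimal sketch, not part of GemForge: the helper name missing_tools is hypothetical, and the list of command names is an assumption based on the executables these packages normally provide on Ubuntu (for example, the libtool package ships libtoolize).

```shell
#!/usr/bin/env bash
# missing_tools NAME...: print each command name that is not found on PATH.
# (Hypothetical helper for illustration; not shipped with GemForge.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed command names for the packages listed above; adjust for your platform.
missing=$(missing_tools cmake gcc g++ make autoconf libtoolize python python3 pip3)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All build tools found."
fi
```

If anything is reported missing, re-run the apt install command above or install the equivalent package for your distribution.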

Set Up

These are the commands used to set up GemForge.

$ git clone --recurse-submodules [this repo]
$ cd [this repo]
# Set up some environment variables.
$ source envs.sh
# Build llvm.
$ cd llvm && bash setup.sh
# Build Gem Forge transformation and gem5, etc.
$ bash setup.sh

Matrix-Vector Multiply Example

There are some microbenchmarks in transform/benchmark. Here we use a matrix-vector multiply example to demonstrate the whole workflow. It is vectorized with AVX-512 and parallelized with OpenMP, and is located at transform/benchmark/GemForgeMicroSuite/omp_dense_mv_blk. We also provide a Makefile in example that you can easily follow.

Gem Forge takes a single LLVM bitcode file as its input. Use the command below to compile the program. Because Gem Forge was first designed to process LLVM IR traces, we call the bitcode traced.bc. Do not be confused by the name: this example does not require any trace information and is based purely on static information.

$ cd example && make traced.bc

Next we build two binaries: one for vanilla x86 and one with Stream ISA enabled:

$ make valid.exe
$ make stream.exe

Then we simulate 4 configurations (64 cores with 8x8 mesh):

  • valid.o8.sim: vanilla OOO8 core without any prefetching support.
  • valid.o8-pf.sim: OOO8 core with a Bingo prefetcher at L1$ and a stride prefetcher at L2$.
  • stream.o8.sim: OOO8 core with Stream Engine at the core.
  • stream.o8-float.sim: OOO8 core with Stream Engine both at the core and in the cache (a.k.a. Stream Floating).

You can simulate them all by:

$ make sim-all

That's it! You can verify the results in valid/o8, etc., and check that Stream Floating indeed improves the performance and reduces NoC traffic.
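
To run the four configurations one at a time instead of all at once, a small loop may be convenient. This is only a sketch under one loud assumption: that the Makefile in example exposes each configuration listed above as a make target of the same name. Check the Makefile before relying on it. The helper name run_configs is hypothetical.

```shell
#!/usr/bin/env bash
# Configuration names copied from the list above. Treating them as make
# targets is an ASSUMPTION; verify against the Makefile in example.
CONFIGS="valid.o8.sim valid.o8-pf.sim stream.o8.sim stream.o8-float.sim"

# run_configs CMD [ARGS...]: invoke CMD once per configuration, appending the
# configuration name as the last argument, and stop at the first failure.
# (Hypothetical helper for illustration; not shipped with GemForge.)
run_configs() {
    for cfg in $CONFIGS; do
        echo "==> $cfg"
        "$@" "$cfg" || return 1
    done
}

# Usage, from the example directory:
#   run_configs make       # build/simulate each configuration in turn
#   run_configs make -n    # dry run: only print what make would execute
```

The dry-run form is a cheap way to confirm that the assumed target names exist before starting a long simulation.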

Docker

You can also try this Dockerfile:

FROM ubuntu:20.10

ARG USER_ID
ARG GROUP_ID

RUN groupadd -g ${GROUP_ID} gf &&\
    useradd -l -u ${USER_ID} -g gf gf &&\
    usermod -aG sudo gf &&\
    install -d -m 0755 -o gf -g gf /home/gf &&\
    apt update &&\
    apt install -y \
    cmake \
    build-essential \
    autoconf \
    libtool \
    zlib1g \
    zlib1g-dev \
    python3 \
    python3-pip \
    python-is-python3 \
    scons \
    git \
    curl \
    wget \
    sudo \
    vim \
    zsh &&\
    pip3 install six

RUN echo 'gf:gf' | chpasswd

USER gf

Save this into Dockerfile. Then build the image and start the container:

$ docker image build --build-arg USER_ID=$(id -u $USER) --build-arg GROUP_ID=$(id -g $USER) -t gemforge:0.1 .
$ docker run --network host --detach -i --mount type=bind,src=[this repo],dst=[mounted path] --name gemforge gemforge:0.1
$ docker exec -it gemforge zsh

Then you can follow the commands above to build and run!