Refresh README documentation #533

Open
wants to merge 1 commit into
base: master
144 changes: 62 additions & 82 deletions Readme.md
@@ -1,155 +1,135 @@
# WIP libgccjit codegen backend for rust
# GCC codegen backend for Rust (WIP)

[![Chat on IRC](https://img.shields.io/badge/irc.libera.chat-%23rustc__codegen__gcc-blue.svg)](https://web.libera.chat/#rustc_codegen_gcc)
[![Chat on Matrix](https://img.shields.io/badge/matrix.org-%23rustc__codegen__gcc-blue.svg)](https://matrix.to/#/#rustc_codegen_gcc:matrix.org)

This is a GCC codegen for rustc, which means it can be loaded by the existing rustc frontend, but benefits from GCC: more architectures are supported and GCC's optimizations are used.

**Despite its name, libgccjit can be used for ahead-of-time compilation, as is used here.**

## Motivation

The primary goal of this project is to be able to compile Rust code on platforms unsupported by LLVM.
A secondary goal is to check if using the gcc backend will provide any run-time speed improvement for the programs compiled using rustc.

### Dependencies

**rustup:** Follow the instructions on the official [website](https://www.rust-lang.org/tools/install)

**DejaGnu:** Consider to install DejaGnu which is necessary for running the libgccjit test suite. [website](https://www.gnu.org/software/dejagnu/#downloading)

This GCC codegen enables the official rustc compiler frontend to leverage GCC as a compiler backend. This is accomplished through interfacing with GCC's [libgccjit](https://gcc.gnu.org/wiki/JIT) library. While libgccjit is primarily intended to be used as an interface to GCC for just-in-time code generation, it can also be used for ahead-of-time compilation, as we do here.

### Goals
* Enable the compilation of Rust code for target platforms supported by GCC, but not LLVM
* Leverage potential GCC optimizations for run-time speed improvement for Rust applications
* Reduce dependence of the Rust ecosystem on LLVM

## Building
### Dependencies
* **[rustup](https://www.rust-lang.org/tools/install)**: The rustup tool is required to build this project using the following instructions; do not rely on a Rust toolchain that may have been provided by your operating system's package manager.
* **[DejaGnu](https://www.gnu.org/software/dejagnu/#downloading)** (optional): Install the DejaGnu testing framework in order to run the libgccjit test suite.
Member

I think https://rustup.rs/ is better for installing rustup.


darcagn marked this conversation as resolved.
**This requires a patched libgccjit in order to work.
You need to use my [fork of gcc](https://github.com/antoyo/gcc) which already includes these patches.**

### Building GCC with libgccjit.so
#### On Linux x86-64
When using an x86-64 Linux host to target x86-64 Linux, building `libgccjit.so` is unnecessary -- in that case, a precompiled version may be downloaded automatically. Simply copy the provided `config.example.toml` file to `config.toml` to enable the automatic downloading of `libgccjit.so`.
```bash
Member

Suggested change
When using an x86-64 Linux host to target x86-64 Linux, building `libgccjit.so` is unnecessary -- in that case, a precompiled version may be downloaded automatically. Simply copy the provided `config.example.toml` file to `config.toml` to enable the automatic downloading of `libgccjit.so`.
When using an x86-64 Linux host to target x86-64 Linux, building `libgccjit.so` is unnecessary (unless you patched it) -- in that case, a precompiled version may be downloaded automatically. Simply copy the provided `config.example.toml` file to `config.toml` to enable the automatic downloading of `libgccjit.so`.

$ cp config.example.toml config.toml
```
Now you may skip ahead to [Building rustc_codegen_gcc](#building-rustc_codegen_gcc).

If don't need to test GCC patches you wrote in our GCC fork, then the default configuration should
be all you need. You can update the `rustc_codegen_gcc` without worrying about GCC.

### Building with your own GCC version

If you wrote a patch for GCC and want to test it without this backend, you will need
to do a few more things.
#### Other architectures
If you are on a host arch other than x86-64 Linux or are targeting an arch other than x86-64 Linux, you will need to build a custom GCC with `libgccjit.so`. **At this time, this requires the use of the [rust-lang/gcc](https://github.com/rust-lang/gcc) fork of GCC, which includes patches to libgccjit to enable the use of this codegen.**

To build it (most of these instructions come from [here](https://gcc.gnu.org/onlinedocs/jit/internals/index.html), so don't hesitate to take a look there if you encounter an issue):
Full instructions to build libgccjit are provided in the [libgccjit documentation](https://gcc.gnu.org/onlinedocs/jit/internals/index.html), so check there if you encounter issues, however brief directions for Debian-based Linux follow below. You may need to adapt them to your operating system and package manager. If you need to build a cross-compiler, see [Building a cross-compiler with rustc_codegen_gcc](./doc/cross.md).

```bash
$ git clone https://github.com/antoyo/gcc
$ git clone https://github.com/rust-lang/gcc gcc-source
$ sudo apt install flex libmpfr-dev libgmp-dev libmpc3 libmpc-dev
$ mkdir gcc-build gcc-install
$ cd gcc-build
$ ../gcc/configure \
$ ../gcc-source/configure \
--enable-host-shared \
--enable-languages=jit \
--enable-checking=release \ # it enables extra checks which allow to find bugs
--enable-checking=release \
--disable-bootstrap \
--disable-multilib \
--prefix=$(pwd)/../gcc-install
$ make -j4 # You can replace `4` with another number depending on how many cores you have.
$ make -j4
$ make install
```

If you want to run libgccjit tests, you will need to also enable the C++ language in the `configure`:
Notes:
* If you want to run libgccjit tests, you must also add C++ to the enabled languages: `--enable-languages=jit,c++`
* `--enable-host-shared` builds the compiler as position independent code and is required to build libgccjit. This results in a slower compiler; however, if building GCC in multiple passes, `jit` may be built in the first pass with `--enable-host-shared`, with both disabled in subsequent passes.
* `--enable-checking=release` and `--disable-bootstrap` speed up compilation by disabling self-checks and may be omitted.
* `--disable-multilib` is used as libgccjit only supports targeting one arch ABI variant.
* Adjust the `4` in `make -j4` to the number of threads available on your system to speed up compilation.
* Change `make install` to `make install-strip` to remove debug symbols from GCC, including `libgccjit.so`, to save on disk space.
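
As a rough illustration of the `-j` note above, the job count can be derived from the machine instead of being hard-coded. This is only a sketch: it assumes `nproc` (GNU coreutils) is available, falls back to `getconf`, and uses `JOBS` as an illustrative variable name that is not part of the project.

```bash
# Pick a parallel job count for make; JOBS is an illustrative name.
# nproc comes from GNU coreutils; getconf is tried next; default to 1 otherwise.
JOBS="$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"

# Print the command instead of running it, since this is only a sketch.
echo "would run: make -j${JOBS}"
```

In a real build the last line would simply be `make -j"$JOBS"`, run from the `gcc-build` directory.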

Once the build is complete, one may run libgccjit tests like so (requires DejaGnu to be installed):
```bash
darcagn marked this conversation as resolved.
--enable-languages=jit,c++
$ make -C gcc check-jit
```
Member

Why add `-C gcc`?


Then to run libgccjit tests:

To run one specific test:
```bash
$ cd gcc # from the `gcc-build` folder
$ make check-jit
# To run one specific test:
$ make check-jit RUNTESTFLAGS="-v -v -v jit.exp=jit.dg/test-asm.cc"
$ make -C gcc check-jit RUNTESTFLAGS="-v -v -v jit.exp=jit.dg/test-asm.cc"
```
Member

Same question.


**Put the path to your custom build of libgccjit in the file `config.toml`.**

You now need to set the `gcc-path` value in `config.toml` with the result of this command:

Now that you've compiled GCC with libgccjit support, the installed GCC toolchain with `libgccjit.so` can be found in the `gcc-install` directory created above. You will need to provide the path to the specific subdirectory containing `libgccjit.so` to rustc_codegen_gcc. Use the following commands to leave the `gcc-build` directory and find the absolute path within the `gcc-install` directory:
```bash
$ dirname $(readlink -f `find . -name libgccjit.so`)
$ cd ..
$ dirname $(readlink -f `find ./gcc-install -name libgccjit.so`)
```

and to comment the `download-gccjit` setting:
If desired, you may now delete the `gcc-source` and `gcc-build` directories to reclaim disk space, but keep the `gcc-install` directory.

Member

I wouldn't recommend that.

Contributor

@antoyo antoyo Jun 19, 2024

I guess that may be useful for people who want to use the project but not contribute to it. So perhaps we should move this to the Tips document?

Then, within the rustc_codegen_gcc repository directory, create a `config.toml` file with the following contents. Replace `[MY PATH]` with the path you found above.
```toml
gcc-path = "[MY PATH]"
# download-gccjit = true
```
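
The two steps above (locating `libgccjit.so` and writing `config.toml`) could be combined in a small script. This is a sketch under stated assumptions: `write_gcc_config` is a hypothetical helper name, `readlink -f` is assumed to be available (GNU coreutils), and only the first `libgccjit.so` found under the install directory is used.

```bash
# Hypothetical helper: write a config.toml whose gcc-path points at the
# directory containing libgccjit.so.
#   $1: GCC install directory (e.g. gcc-install)
#   $2: config.toml file to write
write_gcc_config() {
    install_dir="$1"
    config_file="$2"
    # Take the first match; this assumes a single libgccjit.so in the install tree.
    lib="$(find "$install_dir" -name libgccjit.so | head -n 1)"
    if [ -z "$lib" ]; then
        echo "libgccjit.so not found under $install_dir" >&2
        return 1
    fi
    # Resolve symlinks, then keep only the containing directory.
    lib_dir="$(dirname "$(readlink -f "$lib")")"
    printf 'gcc-path = "%s"\n' "$lib_dir" > "$config_file"
}
```

For example, `write_gcc_config ../gcc-install config.toml` run from the rustc_codegen_gcc directory would overwrite `config.toml` with a single `gcc-path` line, which leaves `download-gccjit` unset.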

Then you can run commands like this:
### Building rustc_codegen_gcc
Now that you have a `config.toml` file set up to use `libgccjit.so`, you can proceed to building the build system and sysroot. `prepare` will retrieve and patch the sysroot source, while `build` will build the sysroot.

```bash
$ ./y.sh prepare # download and patch sysroot src and install hyperfine for benchmarking
$ ./y.sh prepare
$ ./y.sh build --sysroot --release
Member

Please leave the comment.

```

To run the tests:

You may also run the tests:
```bash
Member

Please leave the original sentence.

$ ./y.sh test --release
```

## Usage

You have to run these commands, in the corresponding order:

The following example commands are run with `$CG_GCCJIT_DIR` representing the path to your rustc_codegen_gcc directory.
### Cargo
Member

I think we should remove all mentions of `$CG_GCCJIT_DIR`.

Contributor

Since being able to run `y.sh` from anywhere seems to be a recurring wish (@darcagn: is this why you mention this variable?), perhaps we could add a section explaining what I did to work around this issue: I created an alias, `y = /path/to/rustc_codegen_gcc/y.sh`.
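
The alias idea in this comment might look like the following in a shell startup file. This is a sketch only: a function is used in place of an alias so that it also works in non-interactive scripts, and `CG_GCC_DIR` is a placeholder variable that the project does not define.

```bash
# Placeholder: set this to wherever rustc_codegen_gcc is checked out.
CG_GCC_DIR="${CG_GCC_DIR:-$HOME/rustc_codegen_gcc}"

# Forward all arguments to the project's y.sh from any working directory.
y() {
    "$CG_GCC_DIR/y.sh" "$@"
}
```

With this in place, `y cargo build` would run `$CG_GCC_DIR/y.sh cargo build` from inside any project directory.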

To invoke `cargo`, run the following example command:
```bash
$ ./y.sh prepare
$ ./y.sh build --sysroot
$ CHANNEL="release" $CG_GCCJIT_DIR/y.sh cargo run
```
Member

No need for CHANNEL="release".

To check if all is working correctly, run:

You may verify your build is working properly by building a test project:
```bash
$ ./y.sh cargo build --manifest-path tests/hello-world/Cargo.toml
```

### Cargo

```bash
$ CHANNEL="release" $CG_GCCJIT_DIR/y.sh cargo run
$ CHANNEL="release" $CG_GCCJIT_DIR/y.sh cargo build --manifest-path $CG_GCCJIT_DIR/tests/hello-world/Cargo.toml
```
Member

Please revert.


If you compiled cg_gccjit in debug mode (aka you didn't pass `--release` to `./y.sh test`) you should use `CHANNEL="debug"` instead or omit `CHANNEL="release"` completely.
Note: If you compiled rustc_codegen_gcc in debug mode (i.e., you didn't pass `--release` to `./y.sh` above), you should use `CHANNEL="debug"` or omit `CHANNEL="release"` completely.

Member

Would be interesting to instead explain here that the CHANNEL environment variable can be used.

### LTO

To use LTO, you need to set the variable `FAT_LTO=1` and `EMBED_LTO_BITCODE=1` in addition to setting `lto = "fat"` in the `Cargo.toml`.
To use LTO, you need to set the environment variables `FAT_LTO=1` and `EMBED_LTO_BITCODE=1`, in addition to setting `lto = "fat"` in your project's `Cargo.toml`.
Don't set `FAT_LTO` when compiling the sysroot, though: only set `EMBED_LTO_BITCODE=1`.

Failing to set `EMBED_LTO_BITCODE` will give you the following error:

```
error: failed to copy bitcode to object file: No such file or directory (os error 2)
```
Failing to set `EMBED_LTO_BITCODE` will give you the following error: `error: failed to copy bitcode to object file: No such file or directory (os error 2)`.
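
Before exporting the LTO variables for a project build, it can help to confirm that the project's `Cargo.toml` actually requests fat LTO. A minimal sketch, where `has_fat_lto` is a hypothetical helper and the `grep` pattern is a simplification: it does not parse TOML and does not check which profile the key belongs to.

```bash
# Hypothetical helper: succeed if the given Cargo.toml has a line `lto = "fat"`.
has_fat_lto() {
    grep -Eq '^[[:space:]]*lto[[:space:]]*=[[:space:]]*"fat"' "$1"
}

# Only export both variables when the manifest in the current directory asks for
# fat LTO. When building the sysroot, only EMBED_LTO_BITCODE=1 should be set.
if has_fat_lto Cargo.toml 2>/dev/null; then
    export FAT_LTO=1 EMBED_LTO_BITCODE=1
fi
```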

### Rustc
### rustc

If you want to run `rustc` directly, you can do so with:
To invoke `rustc` instead of using `cargo`, you can do so with the following example command:

```bash
$ ./y.sh rustc my_crate.rs
$ $CG_GCCJIT_DIR/y.sh rustc my_crate.rs
```
Member

Please revert.


You can do the same manually (although we don't recommend it):
Although not recommended, you may manually invoke `rustc` directly. In this example, `$LIBGCCJIT_PATH` represents the path to the directory containing `libgccjit.so`.

```bash
$ LIBRARY_PATH="[gcc-path value]" LD_LIBRARY_PATH="[gcc-path value]" rustc +$(cat $CG_GCCJIT_DIR/rust-toolchain | grep 'channel' | cut -d '=' -f 2 | sed 's/"//g' | sed 's/ //g') -Cpanic=abort -Zcodegen-backend=$CG_GCCJIT_DIR/target/release/librustc_codegen_gcc.so --sysroot $CG_GCCJIT_DIR/build_sysroot/sysroot my_crate.rs
$ LIBRARY_PATH="$LIBGCCJIT_PATH" LD_LIBRARY_PATH="$LIBGCCJIT_PATH" rustc +$(cat $CG_GCCJIT_DIR/rust-toolchain | grep 'channel' | cut -d '=' -f 2 | sed 's/"//g' | sed 's/ //g') -Zcodegen-backend=$CG_GCCJIT_DIR/target/release/librustc_codegen_gcc.so --sysroot $CG_GCCJIT_DIR/build_sysroot/sysroot my_crate.rs
```
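
The `+$(...)` part of the command above extracts the toolchain name from the repository's `rust-toolchain` file. The sketch below runs the same pipeline against a sample file so each step can be inspected; the file contents, including the date, are made up for illustration.

```bash
# Create a sample rust-toolchain file (contents are illustrative only).
sample="$(mktemp)"
cat > "$sample" <<'EOF'
[toolchain]
channel = "nightly-2024-01-01"
components = ["rust-src", "rustc-dev"]
EOF

# Same steps as the README pipeline: keep the channel line, take the value
# after '=', then strip the quotes and the spaces.
channel="$(grep 'channel' "$sample" | cut -d '=' -f 2 | sed 's/"//g' | sed 's/ //g')"
echo "rustc +$channel"
rm -f "$sample"
```

The resulting `+nightly-...` argument tells the rustup-managed `rustc` which toolchain to use, so the compiler matches the one the codegen backend was built against.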

## Env vars
## Environment variables

* _**CG_RUSTFLAGS**_: Send additional flags to rustc. Can be used to build the sysroot without unwinding by setting `CG_RUSTFLAGS=-Cpanic=abort`.
* _**CG_GCCJIT_VERBOSE**_: Enables verbose output from the GCC driver.
* _**CG_GCCJIT_DUMP_ALL_MODULES**_: Enables dumping of all compilation modules. When set to "1", a dump is created for each module during compilation and stored in `/tmp/reproducers/`.
* _**CG_GCCJIT_DUMP_MODULE**_: Enables dumping of a specific module. When set with the module name, e.g., `CG_GCCJIT_DUMP_MODULE=module_name`, a dump of that specific module is created in `/tmp/reproducers/`.
* _**CG_RUSTFLAGS**_: Send additional flags to rustc. Can be used to build the sysroot without unwinding by setting `CG_RUSTFLAGS=-Cpanic=abort`.
* _**CG_GCCJIT_DUMP_TO_FILE**_: Dump a C-like representation to /tmp/gccjit_dumps and enable debug info in order to debug this C-like representation.
* _**CG_GCCJIT_DUMP_TO_FILE**_: Dump a C-like representation to `/tmp/gccjit_dumps` and enable debug info in order to debug this C-like representation.
* _**CG_GCCJIT_DUMP_RTL**_: Dumps RTL (Register Transfer Language) for virtual registers.
* _**CG_GCCJIT_DUMP_RTL_ALL**_: Dumps all RTL passes.
* _**CG_GCCJIT_DUMP_TREE_ALL**_: Dumps all tree (GIMPLE) passes.
@@ -158,12 +138,12 @@ $ LIBRARY_PATH="[gcc-path value]" LD_LIBRARY_PATH="[gcc-path value]" rustc +$(ca
* _**CG_GCCJIT_DUMP_GIMPLE**_: Dumps the initial GIMPLE representation.
* _**CG_GCCJIT_DUMP_EVERYTHING**_: Enables dumping of all intermediate representations and passes.
* _**CG_GCCJIT_KEEP_INTERMEDIATES**_: Keeps intermediate files generated during the compilation process.
* _**CG_GCCJIT_VERBOSE**_: Enables verbose output from the GCC driver.

## Extra documentation
## Additional documentation

More specific documentation is available in the [`doc`](./doc) folder:
Additional documentation is available in the [`doc`](./doc) folder:

* [Building a cross-compiler](./doc/cross.md)
* [Common errors](./doc/errors.md)
* [Debugging GCC LTO](./doc/debugging-gcc-lto.md)
* [Debugging libgccjit](./doc/debugging-libgccjit.md)
2 changes: 1 addition & 1 deletion config.example.toml
@@ -1,2 +1,2 @@
#gcc-path = "gcc-build/gcc"
#gcc-path = "gcc-install/lib"
download-gccjit = true
Member

Please revert. I never run make install and I'd like to keep it that way. You can mention the install folder too though.

Contributor

I believe make install can be necessary for e.g. using LTO.

28 changes: 28 additions & 0 deletions doc/cross.md
@@ -0,0 +1,28 @@
# How to build a cross-compiler with rustc_codegen_gcc

## Building libgccjit

* With **crosstool-ng**: follow the instructions on [this repo](https://github.com/cross-cg-gcc-tools/cross-gcc).
* **Other**: Build a GCC cross-compiler like usual, but add `--enable-languages=jit`, `--enable-host-shared`, and `--disable-multilib` to the `./configure` arguments for pass 1.

## Configuring rustc_codegen_gcc

* Make sure you set the path to the cross-compiling libgccjit in rustc_codegen_gcc's `config.toml`.
* Make sure you have the linker for your target (for instance `m68k-unknown-linux-gnu-gcc`) in your `$PATH`. Currently, the linker name is hardcoded as being `$TARGET-gcc`.
* Use `--cross` during the prepare step so that the sysroot is patched for the cross-compiling case:
* `./y.sh prepare --cross`
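
Since the linker name is hardcoded as `$TARGET-gcc`, a quick check that it is reachable can save a failed build. A minimal sketch, where `have_target_linker` is a hypothetical helper name and the target triple is the example used in this document:

```bash
# Hypothetical helper: succeed if `<target>-gcc` can be found on PATH.
#   $1: target triple, e.g. m68k-unknown-linux-gnu
have_target_linker() {
    command -v "$1-gcc" >/dev/null 2>&1
}

TARGET="m68k-unknown-linux-gnu"
if have_target_linker "$TARGET"; then
    echo "found $TARGET-gcc"
else
    echo "missing $TARGET-gcc; add your cross toolchain's bin directory to PATH" >&2
fi
```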

### rustc-supported targets
* If the target is already supported by rustc, use `--target-triple` to specify the target when building the sysroot:
* `./y.sh build --sysroot --target-triple m68k-unknown-linux-gnu`
* Specify the target when building your project:
* `./y.sh cargo build --target m68k-unknown-linux-gnu`

### rustc-unsupported targets
* If the target is not yet supported by the Rust compiler, create a [target specification file](https://docs.rust-embedded.org/embedonomicon/custom-target.html). Fake the `arch` specified in your target specification file by replacing it with one that is supported by the Rust compiler.
* To build the sysroot, use `--target-triple` to specify the real target, and use `--target` to add the **absolute path** to your target specification file:
* `./y.sh build --sysroot --target-triple m68k-unknown-linux-gnu --target $(pwd)/m68k-unknown-linux-gnu.json`
* Specify the target specification file when building your project:
* `./y.sh cargo build --target path/to/m68k-unknown-linux-gnu.json`

If you get the error `/usr/bin/ld: unrecognised emulation mode: m68kelf`, make sure you set `gcc-path` (in `config.toml`) to the install directory.
2 changes: 1 addition & 1 deletion doc/debugging-gcc-lto.md
@@ -1,3 +1,3 @@
# How to debug GCC LTO

Run do the command with `-v -save-temps` and then extract the `lto1` line from the output and run that under the debugger.
Run the command with `-v -save-temps` and then extract the `lto1` line from the output and run that under the debugger.
27 changes: 0 additions & 27 deletions doc/tips.md
@@ -51,30 +51,3 @@ If you wish to build a custom sysroot, pass the path of your sysroot source to `

If you need to check what gccjit is generating (GIMPLE), then take a look at how to
generate it in [gimple.md](./doc/gimple.md).

### How to build a cross-compiling libgccjit

#### Building libgccjit

* Follow the instructions on [this repo](https://github.com/cross-cg-gcc-tools/cross-gcc).

#### Configuring rustc_codegen_gcc

* Run `./y.sh prepare --cross` so that the sysroot is patched for the cross-compiling case.
* Set the path to the cross-compiling libgccjit in `gcc-path` (in `config.toml`).
* Make sure you have the linker for your target (for instance `m68k-unknown-linux-gnu-gcc`) in your `$PATH`. Currently, the linker name is hardcoded as being `$TARGET-gcc`. Specify the target when building the sysroot: `./y.sh build --sysroot --target-triple m68k-unknown-linux-gnu`.
* Build your project by specifying the target: `OVERWRITE_TARGET_TRIPLE=m68k-unknown-linux-gnu ../y.sh cargo build --target m68k-unknown-linux-gnu`.

If the target is not yet supported by the Rust compiler, create a [target specification file](https://docs.rust-embedded.org/embedonomicon/custom-target.html) (note that the `arch` specified in this file must be supported by the rust compiler).
Then, you can use it the following way:

* Add the target specification file using `--target` as an **absolute** path to build the sysroot: `./y.sh build --sysroot --target-triple m68k-unknown-linux-gnu --target $(pwd)/m68k-unknown-linux-gnu.json`
* Build your project by specifying the target specification file: `OVERWRITE_TARGET_TRIPLE=m68k-unknown-linux-gnu ../y.sh cargo build --target path/to/m68k-unknown-linux-gnu.json`.

If you get the following error:

```
/usr/bin/ld: unrecognised emulation mode: m68kelf
```

Make sure you set `gcc-path` (in `config.toml`) to the install directory.