Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rewrite bootstrapping stages #1650

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from
Draft
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
119 changes: 67 additions & 52 deletions src/building/bootstrapping.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,25 +2,30 @@

<!-- toc -->

[*Bootstrapping*][boot] is the process of using a compiler to compile itself.
More accurately, it means using an older compiler to compile a newer version
of the same compiler.

This raises a chicken-and-egg paradox: where did the first compiler come from?
It must have been written in a different language. In Rust's case it was
[written in OCaml][ocaml-compiler]. However it was abandoned long ago and the
only way to build a modern version of rustc is a slightly less modern
version.

This is exactly how `x.py` works: it downloads the current beta release of
rustc, then uses it to compile the new compiler.
[Bootstrapping][boot] is the process of using a compiler to produce a later version
of itself.

This raises a [chicken-or-the-egg
paradox](https://en.wikipedia.org/wiki/Chicken_or_the_egg): what rust compiler
was used to produce the very first rust compiler? The answer is that the first
compiler was not written in rust. It was [written in OCaml][ocaml-compiler]. Of
course, it has long been discarded and since then the only compiler that is
able to produce some version of `rustc` is a slightly earlier version of
`rustc`.

For this purpose a python script `x.py` is provided at the root of the
repository. `x.py` downloads a pre-compiled compiler—the stage 0 compiler—and
with it produces from the current source code a compiler—the stage 1 compiler.
Additionaly, it may use the stage 1 compiler to produce from the current source
code another compiler—the stage 2 compiler. Below describes this process in
some detail, including the reason for a stage 2 compiler and more.

Note that this documentation mostly covers user-facing information. See
[bootstrap/README.md][bootstrap-internals] to read about bootstrap internals.

[bootstrap-internals]: https://github.com/rust-lang/rust/blob/master/src/bootstrap/README.md

## Stages of bootstrapping
## The stages of bootstrapping

### Overview

Expand All @@ -29,8 +34,18 @@ Note that this documentation mostly covers user-facing information. See
- Stage 2: the truly current compiler
- Stage 3: the same-result test

Compiling `rustc` is done in stages. Here's a diagram, adapted from Joshua Nelson's
[talk on bootstrapping][rustconf22-talk] at RustConf 2022, with detailed explanations below.
Each stage involves:
- An existing compiler and its set of dependencies.
- [Objects][objects]: `std` and `rustc`.

Note: the compiler of a stage—e.g. "the stage 1 compiler"—refers to the
compiler that is produced at that stage, not the one that already exists.

Typically, in the first stage (stage 0) the compiler is obtained by downloading a pre-compiled
one and in following stages the compiler is the one that was produced in the previous stage.

Here's a diagram, adapted from Joshua Nelson's [talk on bootstrapping][rustconf22-talk]
at RustConf 2022, with detailed explanations below.

The `A`, `B`, `C`, and `D` show the ordering of the stages of bootstrapping.
<span style="background-color: lightblue; color: black">Blue</span> nodes are downloaded,
Expand Down Expand Up @@ -58,57 +73,57 @@ graph TD
classDef with-s1c fill: lightgreen;
```

### Stage 0: the pre-compiled compiler
[objects]: https://en.wikipedia.org/wiki/Object_code

The stage0 compiler is usually the current _beta_ `rustc` compiler
and its associated dynamic libraries,
which `x.py` will download for you.
(You can also configure `x.py` to use something else.)
### Stage 0: the pre-compiled compiler

The stage0 compiler is then used only to compile `src/bootstrap`, `std`, and `rustc`.
When compiling `rustc`, the stage0 compiler uses the freshly compiled `std`.
There are two concepts at play here:
a compiler (with its set of dependencies)
and its 'target' or 'object' libraries (`std` and `rustc`).
Both are staged, but in a staggered manner.
A pre-compiled compiler and its set of dependencies are downloaded. By default,
it is the current beta release. This is the stage 0 compiler.

### Stage 1: from current code, by an earlier compiler

The rustc source code is then compiled with the stage0 compiler to produce the stage1 compiler.
The stage 0 compiler produces, from current code, `src/bootstrap` and `std` and uses
them to produce, from current code, a compiler. This is the stage 1 compiler.

The stage 1 compiler is the first that is from current code. Yet, it is not
entirely up-to-date, because the compiler that produced it is of earlier code.
More on this below.

### Stage 2: the truly current compiler

We then rebuild our stage1 compiler with itself to produce the stage2 compiler.

In theory, the stage1 compiler is functionally identical to the stage2 compiler,
but in practice there are subtle differences.
In particular, the stage1 compiler itself was built by stage0
and hence not by the source in your working directory.
This means that the ABI generated by the stage0 compiler may not match the ABI that would have been
made by the stage1 compiler, which can cause problems for dynamic libraries, tests, and tools using
`rustc_private`.

Note that the `proc_macro` crate avoids this issue with a C FFI layer called `proc_macro::bridge`,
allowing it to be used with stage 1.

The `stage2` compiler is the one distributed with `rustup` and all other install methods.
However, it takes a very long time to build
because one must first build the new compiler with an older compiler
and then use that to build the new compiler with itself.
For development, you usually only want the `stage1` compiler,
which you can build with `./x.py build library`.
By default, the stage 1 libraries are copied into stage 2, because they are
expected to be identical.

The stage 1 compiler is used to produce from current code a compiler. This is
the stage 2 compiler.

The stage 2 compiler is the first that is both from current code and produced
by a compiler that is of current code. The compilers and libraries obtained by
`rustup` and other installation methods are all stage 2.

For most purposes a stage 1 compiler would suffice: `x.py build library`.
See [Building the compiler](./how-to-build-and-run.html#building-the-compiler).
Between the stage 2 and the stage 1 compiler are subtle differences:

- The symbol names used in the compiler source may not match the symbol names
that would have been made by the stage1 compiler. This is important when using
dynamic linking and due to the lack of ABI compatibility between versions. This
primarily manifests when tests try to link with any of the `rustc_*` crates or
use the (now deprecated) plugin infrastructure. These tests are marked with
`ignore-stage1`.

- The stage 2 compiler benefits from the compile-time optimizations
produces by a compiler that is of the current code.

### Stage 3: the same-result test

Stage 3 is optional. To sanity check our new compiler, we
can build the libraries with the stage2 compiler. The result ought
to be identical to before, unless something has broken.
If a verification that the stage 2 libraries that were copied from stage 1 are indeed
identical to those which would otherwise have been produced in stage 2 is necessary, the
stage 2 compiler is used to produce them and a comparison is made.

### Building the stages

`x.py` tries to be helpful and pick the stage you most likely meant for each subcommand.
These defaults are as follows:
`x.py` provides a reasonable default stage for each subcommand:

- `check`: `--stage 0`
- `doc`: `--stage 0`
Expand All @@ -118,7 +133,7 @@ These defaults are as follows:
- `install`: `--stage 2`
- `bench`: `--stage 2`

You can always override the stage by passing `--stage N` explicitly.
Of course, these can be overridden by passing `--stage <number>`.

For more information about stages, [see below](#understanding-stages-of-bootstrap).

Expand Down