Add performance and best practices documentation
Test: check gitiles view
Change-Id: I2fa4fa5f7ee91ba586e1900542c8c334eb727a6b
danw committed Feb 7, 2018
1 parent 7a26b70 commit bc20362
Showing 5 changed files with 353 additions and 0 deletions.
7 changes: 7 additions & 0 deletions README.md
@@ -170,6 +170,13 @@ logic receives module definitions parsed into Go structures using reflection
and produces build rules. The build rules are collected by blueprint and
written to a [ninja](http://ninja-build.org) build file.

## Other documentation

* [Best Practices](docs/best_practices.md)
* [Build Performance](docs/perf.md)
* [Generating CLion Projects](docs/clion.md)
* Make-specific documentation: [build/make/README.md](https://android.googlesource.com/platform/build/+/master/README.md)

## FAQ

### How do I write conditionals?
148 changes: 148 additions & 0 deletions docs/best_practices.md
@@ -0,0 +1,148 @@
# Build System Best Practices

## Read only source tree

Never write to the source directory during the build; always write to
`$OUT_DIR`. We expect to enforce this in the future.

If you want to verify or update a checked-in generated source file, generate
that file into `$OUT_DIR` during the build. If it differs from the checked-in
copy, fail the build with a message asking the user to run a command (a plain
command, a checked-in script, a generated script, etc.) that explicitly copies
the file from the output directory into the source tree.
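
A minimal sketch of such a check is shown below. The function name
`check_generated_source` and the way it is invoked are hypothetical and not
part of the build system; the sketch only assumes that the build has already
written the fresh copy somewhere under `$OUT_DIR`.

```shell
# Hypothetical helper: compare a freshly generated file (under $OUT_DIR)
# against the checked-in copy, without ever writing to the source tree.
check_generated_source() (
    generated="$1"   # fresh copy written under $OUT_DIR by the build
    checked_in="$2"  # copy checked in to the source tree

    # Identical contents: nothing to do, the build step succeeds.
    if cmp -s "$generated" "$checked_in"; then
        exit 0
    fi

    # Fail the build and tell the user how to update the source tree themselves.
    echo "error: $checked_in is out of date. To update it, run:" >&2
    echo "    cp $generated $checked_in" >&2
    exit 1
)
```

The copy into the source tree is only printed, never executed, so the build
itself stays read-only with respect to the source directory.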

## Network access

Never access the network during the build. We expect to enforce this in the
future, though there will be some level of exceptions for tools like `distcc`
and `goma`.

## Paths

Don't use absolute paths in Ninja files (with make's `$(abspath)` or similar),
as that could trigger extra rebuilds when a source directory is moved.

Assume that the source directory is `$PWD`. If a script is going to change
directories and needs to convert an input from a relative to absolute path,
prefer to do that in the script.
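
As a rough illustration, a wrapper script can resolve its input against the
starting directory before it changes directories. The function name
`run_in_dir` and the use of `cat` as a stand-in for the real tool are
assumptions made for this sketch.

```shell
# Hypothetical tool wrapper: takes an input path relative to the source
# directory ($PWD at startup) and a directory the tool must run in.
run_in_dir() (
    input="$1"
    work_dir="$2"

    # Resolve the input to an absolute path before changing directories,
    # so the build rule itself can keep passing a relative path.
    case "$input" in
        /*) abs_input="$input" ;;
        *)  abs_input="$PWD/$input" ;;
    esac

    cd "$work_dir" || exit 1
    # Stand-in for the real tool: read the input from the new directory.
    cat "$abs_input"
)
```

The absolute path exists only while the script runs; the Ninja file still
contains the relative path, so moving the source directory does not change the
command line.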

Don't encode absolute paths in build intermediates or outputs. This would make
it difficult to reproduce builds on other machines.

Don't assume that `$OUT_DIR` is `out`. The source and output trees are very
large these days, so some people put these on different disks. There are many
other uses as well.

Don't assume that `$OUT_DIR` is under `$PWD`; users can set it to a relative path
or an absolute path.

## $(shell) use in Android.mk files

Don't use `$(shell)` to write files, create symlinks, etc. We expect to
enforce this in the future. Encode these as build rules in the build graph
instead. Using `$(shell)` for these side effects is problematic in a number of
ways:

* `$(shell)` calls run at the beginning of every build. At minimum this slows
down build startup, but it can also trigger more build steps to run than are
necessary, since these files will change more often than necessary.
* It's no longer possible for a stripped-down product configuration to opt out
of these created files. It's better to have actual rules and dependencies set
up so that space isn't wasted, but the files are there when necessary.

## Headers

`LOCAL_COPY_HEADERS` is deprecated. Soong modules cannot use these headers, and
when the VNDK is enabled, System modules in Make cannot declare or use them
either.

The set of global include paths provided by the build system is also being
removed. They've been switched from using `-isystem` to `-I` already, and are
removed entirely in some environments (vendor code when the VNDK is enabled).

Instead, use `LOCAL_EXPORT_C_INCLUDE_DIRS`/`export_include_dirs`. These allow
access to the headers automatically if you link to the associated code.

If your library uses `LOCAL_EXPORT_C_INCLUDE_DIRS`/`export_include_dirs`, and
the exported headers reference a library that you link to, use
`LOCAL_EXPORT_SHARED_LIBRARY_HEADERS`/`LOCAL_EXPORT_STATIC_LIBRARY_HEADERS`/`LOCAL_EXPORT_HEADER_LIBRARY_HEADERS`
(`export_shared_lib_headers`/`export_static_lib_headers`/`export_header_lib_headers`)
to re-export the necessary headers to your users.

Don't use non-local paths in your `LOCAL_EXPORT_C_INCLUDE_DIRS`; use one of the
`LOCAL_EXPORT_*_HEADERS` instead. Non-local exported include dirs are not
supported in Soong. You may need to either move your module definition up a
directory (for example, if you have ./src/ and ./include/, you probably want to
define the module in ./Android.bp, not ./src/Android.bp), define a header
library and re-export it, or move the headers into a more appropriate location.

Prefer to use header libraries (`BUILD_HEADER_LIBRARY`/`cc_library_headers`)
only if the headers are actually standalone and do not have associated code.
Sometimes headers have header-only sections but also define interfaces to a
library. Prefer to split those header-only sections out into a separate
header library, and re-export that header library from the existing library.
This prevents accidentally linking more code than you need (slower at build
and/or runtime), or accidentally not linking to a library that's actually
necessary.

Prefer `LOCAL_EXPORT_C_INCLUDE_DIRS` over `LOCAL_C_INCLUDES` as well.
Eventually we'd like to remove `LOCAL_C_INCLUDES`, though significant cleanup
will be required first. This will be necessary to detect cases where modules
are using headers that shouldn't be available to them -- usually due to the
lack of ABI/API guarantees, but for various other reasons as well: layering
violations, planned deprecations, potential optimizations like C++ modules,
etc.

## Use defaults over variables

Soong supports variable definitions in Android.bp files, but in many cases,
it's better to use defaults modules like `cc_defaults`, `java_defaults`, etc.

* It moves more information next to the values -- knowing that an array of
strings will be used as a list of sources is useful for both humans and
automated tools. This is even more useful if it's used inside an
architecture- or target-specific property.
* It can collect multiple pieces of information together into logical
inheritable groups that can be selected with a single property.

## Custom build tools

If writing multiple files from a tool, declare them all in the build graph.
* Make: Use `.KATI_IMPLICIT_OUTPUTS`
* Android.bp: Just add them to the `out` list in genrule
* Custom Soong Plugin: Add to `Outputs` or `ImplicitOutputs`

Declare all files read by the tool, either with a dependency if you can, or by
writing a dependency file. Ninja supports a fairly limited set of dependency
file formats. You can verify that the dependencies are read correctly with:

```
NINJA_ARGS="-t deps <output_file>" m
```
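
To make the dependency-file idea concrete, here is a small sketch of a tool
that records every file it reads. The function name `concat_with_depfile` and
the manifest format are hypothetical; the dependency file uses the
Makefile-style `output: input...` syntax, and the sketch assumes paths contain
no spaces or other characters that would need escaping.

```shell
# Hypothetical tool: concatenates the files listed (one per line) in a
# manifest, and records every file it read in a Makefile-style dependency file.
concat_with_depfile() (
    manifest="$1"  # input listing the files to read, one path per line
    output="$2"    # file to generate
    depfile="$3"   # dependency file for Ninja to load

    : > "$output"
    printf '%s:' "$output" > "$depfile"
    while IFS= read -r src; do
        cat "$src" >> "$output" || exit 1
        # Each file read becomes a prerequisite of the output.
        printf ' %s' "$src" >> "$depfile"
    done < "$manifest"
    printf '\n' >> "$depfile"
)
```

The manifest itself would still be listed as a normal input of the rule; the
dependency file covers only the files discovered while the tool runs.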

Prefer to list input files on the command line, otherwise we may not know to
re-run your command when a new input file is added. Ninja does not treat a
change in dependencies as something that would invalidate an action -- the
command line would need to change, or one of the inputs would need to be newer
than the output file. If you don't include the inputs in your command line, you
may need to add the directories to your dependency list or dependency file,
so that any additions or removals from those directories would trigger your
tool to be re-run. That can be more expensive than necessary though, since many
editors will write temporary files into the same directory, so changing a
README could trigger the directory's timestamp to be updated.

Only control output files based on the command line, not by an input file. We
need to know which files will be created before any inputs are read, since we
generate the entire build graph before reading source files, or running your
tool. This comes up with Java based tools fairly often -- they'll generate
different output files based on the classes declared in their input files.
We've worked around these tools with the "srcjar" concept, which is just a jar
file containing the generated sources. Our Java compilation tasks understand
*.srcjar files, and will extract them before passing them on to the compiler.

## Libraries in PRODUCT_PACKAGES

Most libraries aren't necessary to include in `PRODUCT_PACKAGES`, unless
they're used dynamically via `dlopen`. If they're only used via
`LOCAL_SHARED_LIBRARIES` / `shared_libs`, then those dependencies will trigger
them to be installed when necessary. Adding unnecessary libraries into
`PRODUCT_PACKAGES` will force them to always be installed, wasting space.
195 changes: 195 additions & 0 deletions docs/perf.md
@@ -0,0 +1,195 @@
# Build Performance

## Debugging Build Performance

### Tracing

soong_ui has tracing built in, so that every build execution's trace can be
viewed. Just open `$OUT_DIR/build.trace.gz` in Chrome's <chrome://tracing>, or
with [catapult's trace viewer][catapult trace_viewer]. The last few traces are
stored in `build.trace.#.gz` (larger numbers are older). The associated logs
are stored in `soong.#.log`.

![trace example](./trace_example.png)

### Soong

Soong can be traced and profiled using the standard Go tools. It understands
the `-cpuprofile`, `-trace`, and `-memprofile` command line arguments, but we
don't currently have an easy way to enable them in the context of a full build.

### Kati

In general, the slow path of reading Android.mk files isn't particularly
performance sensitive, since it doesn't need to happen on every build. It is
important, however, for the fast path (detecting whether it needs to regenerate
the ninja file) to be fast. It also shouldn't hit the slow path too often -- so
don't rely on the output of a `$(shell)` command that includes the current
timestamp, or read a file that's going to change on every build.

#### Regen check is slow

In most cases, we've found that the fast-path is slow because all of the
`$(shell)` commands need to be re-executed to determine if their output changed.
The `$OUT_DIR/soong.log` contains statistics from the regen check:

```
.../kati.go:127: *kati*: regen check time: 1.699207
.../kati.go:127: *kati*: glob time (regen): 0.377193 / 33609
.../kati.go:127: *kati*: shell time (regen): 1.313529 / 184
.../kati.go:127: *kati*: 0.217 find device vendor -type f -name \*.pk8 -o -name verifiedboot\* -o -name \*.x509.pem -o -name oem\*.prop | sort
.../kati.go:127: *kati*: 0.105 cd packages/apps/Dialer ; find -L . -type d -name "res"
.../kati.go:127: *kati*: 0.035 find device vendor -maxdepth 4 -name '*_aux_variant_config.mk' -o -name '*_aux_os_config.mk' | sort
.../kati.go:127: *kati*: 0.029 cd frameworks/base ; find -L core/java graphics/java location/java media/java media/mca/effect/java media/mca/filterfw/java media/mca/filterpacks/java drm/java opengl/java sax/java telecomm/java telephony/java wifi/java lowpan/java keystore/java rs/java ../opt/telephony/src/java/android/telephony ../opt/telephony/src/java/android/telephony/gsm ../opt/net/voip/src/java/android/net/rtp ../opt/net/voip/src/java/android/net/sip -name "*.html" -and -not -name ".*"
.../kati.go:127: *kati*: 0.025 test -d device && find -L device -maxdepth 4 -path '*/marlin/BoardConfig.mk'
.../kati.go:127: *kati*: 0.023 find packages/apps/Settings/tests/robotests -type f -name '*Test.java' | sed -e 's!.*\(com/google.*Test\)\.java!\1!' -e 's!.*\(com/android.*Test\)\.java!\1!' | sed 's!/!\.!g' | cat
.../kati.go:127: *kati*: 0.022 test -d vendor && find -L vendor -maxdepth 4 -path '*/marlin/BoardConfig.mk'
.../kati.go:127: *kati*: 0.017 cd cts/tests/tests/shortcutmanager/packages/launchermanifest ; find -L ../src -name "*.java" -and -not -name ".*"
.../kati.go:127: *kati*: 0.016 cd cts/tests/tests/shortcutmanager/packages/launchermanifest ; find -L ../../common/src -name "*.java" -and -not -name ".*"
.../kati.go:127: *kati*: 0.015 cd libcore && (find luni/src/test/java -name "*.java" 2> /dev/null) | grep -v -f java_tests_blacklist
.../kati.go:127: *kati*: stat time (regen): 0.250384 / 4405
```

In this case, the total time spent checking was about 1.70 seconds, even though the
other "(regen)" numbers add up to more than that (some parts are parallelized
where possible). The biggest contributor is the `$(shell)` times -- 184
executions took a total of 1.31 seconds. The top 10 longest shell functions are
printed.
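
When the log is long, it can help to pull out just the summary lines. The
helper below is a sketch: the name `regen_summary` is hypothetical, and it
assumes the log lines look like the excerpt above (a source-location prefix
followed by `*kati*: `).

```shell
# Hypothetical helper: print the regen-check summary lines from a soong.log,
# dropping the source-location prefix that precedes "*kati*: ".
regen_summary() {
    sed -n -e '/regen check time/s/^.*\*kati\*: //p' \
           -e '/(regen)/s/^.*\*kati\*: //p' "$1"
}

# Example (assumes a previous build has written the log):
#   regen_summary "$OUT_DIR/soong.log"
```

The per-command timing lines are deliberately left out, so the output shows
only the totals for the regen check, globs, shell calls, and stats.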

The longest commands in this case are all variants of a call to `find`, but
this is also where using pure make functions instead of calling out to the
shell can make a performance difference -- many calls to check if `26 > 20` can
add up. We've added some basic math functions in `math.mk` to help with some
common use cases that used to be rather expensive when they were used too often.

There are some optimizations in place for find commands -- if Kati can
understand the find command, the built-in find emulator can turn some of them
into glob or stat checks (falling back to calling `find` if one of those
implies that the output may change). Many of the common macros produce find
commands that Kati can understand, but if you're writing your own, you may want
to experiment with other options if they're showing up in this list. For
example, if this were significantly more expensive (either in runtime, or
because it was called often):

```
.../kati.go:127: *kati*: 0.015 cd libcore && (find luni/src/test/java -name "*.java" 2> /dev/null) | grep -v -f java_tests_blacklist
```

It may be more efficient to move the grep into make, so that the `find` portion
can be rewritten and cached:

```
$(filter-out $(file <$(LOCAL_PATH)/java_tests_blacklist),$(call all-java-files-under,luni/src/test/java))
```

Others can be simplified by just switching to an equivalent find command that
Kati understands:

```
.../kati.go:127: *kati*: 0.217 find device vendor -type f -name \*.pk8 -o -name verifiedboot\* -o -name \*.x509.pem -o -name oem\*.prop | sort
```

By making the implicit `-a` explicit and moving the `| sort` to Make, this can
now be cached by Kati:

```
$(sort $(shell find device vendor -type f -a -name \*.pk8 -o -name verifiedboot\* -o -name \*.x509.pem -o -name oem\*.prop))
```

Kati is learning about the implicit `-a` in [this change](https://github.com/google/kati/pull/132).

#### Kati regens too often

Kati prints out what triggered the slow path to be taken -- this can be a
changed file, a changed environment variable, or different output from a
`$(shell)` command:

```
out/soong/Android-aosp_arm.mk was modified, regenerating...
```

The state is stored in `$OUT_DIR/.kati_stamp*` files, and can be (partially)
read with the `ckati_stamp_dump` tool in prebuilts/build-tools. More debugging
is available when ckati is run with `--regen_debug`, but that can be a lot of
data to understand.

### Ninja

#### Understanding why something rebuilt

Add `NINJA_ARGS="-d explain"` to your environment before a build; this will cause
ninja to print out explanations of why actions were taken. Start reading from the
beginning, as this much data can be hard to read:

```
$ cd art
$ mma
$ touch runtime/jit/profile_compilation_info.h
$ NINJA_ARGS="-d explain" mma
...
ninja explain: output out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-data/linux_glibc_x86_64/obj/art/tools/cpp-define-generator/main.o older than most recent input art/runtime/jit/profile_compilation_info.h (1516683538 vs 1516685188)
ninja explain: out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-data/linux_glibc_x86_64/obj/art/tools/cpp-define-generator/main.o is dirty
ninja explain: out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-data/linux_glibc_x86_64/cpp-define-generator-data is dirty
ninja explain: out/soong/host/linux-x86/bin/cpp-define-generator-data is dirty
ninja explain: out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-asm-support/gen/asm_support_gen.h is dirty
ninja explain: out/soong/.intermediates/art/cmdline/art_cmdline_tests/android_arm_armv7-a_core_cmdline_parser_test/obj/art/cmdline/cmdline_parser_test.o is dirty
...
```

In this case, art/cmdline/cmdline_parser_test.o was rebuilt because it uses
asm_support_gen.h, which was generated by cpp-define-generator-data, which uses
profile_compilation_info.h.
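
To follow a chain like this in a large log, it can be easier to first reduce
the output to the list of dirty outputs. The filter below is only a sketch:
the name `dirty_outputs` is hypothetical, and it assumes the explain lines
have the `ninja explain: <path> is dirty` shape shown above.

```shell
# Hypothetical filter: read `-d explain` output on stdin and print only the
# outputs that ninja marked dirty, one per line.
dirty_outputs() {
    sed -n 's/^ninja explain: \(.*\) is dirty$/\1/p'
}

# Example (2>&1 is used because the explanations may not be on stdout):
#   NINJA_ARGS="-d explain" mma 2>&1 | dirty_outputs
```

The lines that give the reason (such as "older than most recent input") are
dropped by this filter, so keep the full log around for the first entry in the
chain.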

You'll likely need to cross-reference this data against the build graph in the
various .ninja files. The files are (mostly) human-readable, but a (slow) web
interface can be used by running `NINJA_ARGS="-t browse <target>" m`.

#### Builds take a long time

If the long part in the trace view of a build is a relatively solid block, then
the performance is probably more related to how much time the actual build
commands are taking than having extra dependencies, or slowdowns in
soong/kati/ninja themselves.

Beyond looking at visible outliers in the trace view, we don't have any tooling
to help in this area yet. It's possible to aggregate some of the raw data
together, but since our builds are heavily parallelized, it's particularly easy
for build commands to impact unrelated build commands. This is an area we'd
like to improve -- we expect keeping track of user/system time per-action would
provide more reliable data, but tracking some full-system data (memory/swap
use, disk bandwidth, etc) may also be necessary.

## Known Issues

### Common

#### mm

Soong always loads the entire module graph, so as modules convert from Make to
Soong, `mm` is becoming closer to `mma`. This produces more correct builds, but
does slow down builds, as we need to verify/produce/load a larger build graph.

We're exploring a few options to speed up build startup, one being [an
experimental set of ninja patches][ninja parse optimization],
though that's not the current path we're working towards.

### Android 8.1 (Oreo MR1)

In some cases, a tree would get into a state where Soong would be run twice on
every incremental build, even if there was nothing to do. This was fixed in
master with [these changes][blueprint_microfactory], but they were too
significant to backport at the time. And while they fix this particular issue,
they appear to cause ninja to spend more time during every build loading the
`.ninja_log` / `.ninja_deps` files, especially as they become larger.

A workaround to get out of this state is to remove the build.ninja entry from
`$OUT_DIR/.ninja_log`:

```
sed -i "/\/build.ninja/d" $(get_build_var OUT_DIR)/.ninja_log
```

[catapult trace_viewer]: https://github.com/catapult-project/catapult/blob/master/tracing/README.md
[ninja parse optimization]: https://android-review.googlesource.com/c/platform/external/ninja/+/461005
[blueprint_microfactory]: https://android-review.googlesource.com/q/topic:%22blueprint_microfactory%22+status:merged
Binary file added docs/trace_example.png
3 changes: 3 additions & 0 deletions navbar.md
@@ -0,0 +1,3 @@
* [Home](/README.md)
* [Best Practices](/docs/best_practices.md)
* [Performance](/docs/perf.md)
