Zim is a caching build system that is ideal for teams using monorepos containing many components and dependencies. Its primary goal is fast incremental builds across a team by leveraging a shared cache of rule outputs. It is entirely language agnostic and has built-in support for cross-platform builds via Docker.
Components and rules are defined in YAML definitions that are conceptually similar to Makefiles. Each component may inherit from a base template, which yields a simple mechanism to build many components in a consistent and configurable manner.
Zim offers these advantages to teams developing in a monorepo:
- Fast, parallel builds. Rules run only if their inputs have changed, and outputs are pulled from a shared cache if someone else has already built them.
- Trivially define how to build new component types. Define build steps for components in a few lines of YAML.
- Gain the benefits of isolated build environments and cross-platform compilation via the built-in Docker support. Just specify the Docker image to be used when building a component.
- Flexible input and output resource types. Currently Zim is able to work with both files and Docker images as natively supported resources.
- Easy setup for a shared cache in S3 via an AWS CloudFormation stack.
- Lightweight & easy to install. Zim is written in Go which means it consists of a single binary when built.
This project draws inspiration from the core concepts of GNU Make along with the caching strategy from Buck, Bazel, and Please.
Like Make, Zim has a lightweight way to express new rules that define inputs, outputs, and the commands needed to create the outputs.
Like Buck, Zim computes Rule Keys which are used to determine whether the output of a rule is already available in the cache, based on the combined hashes of all the rule's inputs and configuration.
The following concepts are key to how Zim operates, although you can just skip ahead to the Getting Started section below if you want to skate by for now.
- Project - typically a Git repository that contains the source for multiple services.
- Component - a directory in the monorepo which typically relates to a single library, binary, or microservice. In Zim, a Component is described by a component.yaml definition.
- Rule - a definition that describes an action or build step. Each rule has optional input and output resources, dependencies, and associated commands.
- Key - a key is computed for each rule which is unique to the current state of its inputs, dependencies, and rule configuration. This key is used when storing and retrieving output artifacts from the shared cache.
- Resource - inputs and outputs from rules, which may be files or other types. Currently, files and Docker images are the only two supported types.
- Provider - new resource types may be added via providers. This consists of implementing a Go interface and recompiling Zim. Longer term, this could be changed to use IPC or another mechanism to make it easier to extend.
- Graph - internally, Zim builds a directed acyclic graph (DAG) containing the rules the user asks to run, along with their transitive dependencies. This graph is created and processed by the scheduler in order to execute rules in order of their dependencies.
- Scheduler - the scheduler is responsible for executing rules according to the DAG. The implementation uses Goroutines to parallelize execution of rules that are ready to run.
- Middleware - rule execution is decorated and customized by middleware in Zim, similar to how middleware is used to customize HTTP handlers. Logging, caching, and uploads of artifacts are all accomplished via middleware.
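To make the middleware idea concrete, here is a minimal Go sketch of how rule execution could be decorated in the same way HTTP handlers are. The names Runner, Middleware, Chain, Logging, and Caching are hypothetical and chosen for illustration; they are not taken from Zim's source, and the in-memory map stands in for the shared cache.

```go
package main

import "fmt"

// Runner executes one rule, identified here by its node ID.
// All names in this sketch are hypothetical, not Zim's actual API.
type Runner func(ruleID string) error

// Middleware wraps a Runner and returns a decorated Runner.
type Middleware func(next Runner) Runner

// Chain applies the middleware so that the first one listed is outermost.
func Chain(r Runner, mws ...Middleware) Runner {
	for i := len(mws) - 1; i >= 0; i-- {
		r = mws[i](r)
	}
	return r
}

// Logging appends a line to log before and after the wrapped Runner is called.
func Logging(log *[]string) Middleware {
	return func(next Runner) Runner {
		return func(ruleID string) error {
			*log = append(*log, "start "+ruleID)
			err := next(ruleID)
			*log = append(*log, "done "+ruleID)
			return err
		}
	}
}

// Caching skips the wrapped Runner when ruleID has already succeeded once.
// A real cache would be keyed by the rule key rather than the rule ID.
func Caching(seen map[string]bool) Middleware {
	return func(next Runner) Runner {
		return func(ruleID string) error {
			if seen[ruleID] {
				return nil // cache hit: do not execute the rule
			}
			if err := next(ruleID); err != nil {
				return err
			}
			seen[ruleID] = true
			return nil
		}
	}
}

// RunTwice runs the same rule twice through the chain and reports how many
// times the underlying rule executed and how many log lines were written.
func RunTwice(ruleID string) (executions int, logLines int) {
	var log []string
	base := func(string) error { executions++; return nil }
	run := Chain(base, Logging(&log), Caching(map[string]bool{}))
	_ = run(ruleID)
	_ = run(ruleID)
	return executions, len(log)
}

func main() {
	execs, lines := RunTwice("myservice.build")
	fmt.Println(execs, lines)
}
```

In this sketch the second run is served by the caching layer, so the underlying rule body executes only once while the logging layer still observes both runs.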
Install the Zim CLI by cloning this repo and running go install at the top level.
Run zim -h to see help regarding available commands and flags. Zim recognizes when it is run within a Git repository and will automatically discover component.yaml files within, which define components and their rules.
For each item in the repository that you would like to build with Zim, add a component.yaml file in the corresponding directory. A simple example to build a Go program is as follows.
name: myservice
rules:
  build:
    inputs:
      - "*.go"
    outputs:
      - ${NAME}
    command: go build -o ${OUTPUT}
With that definition saved, you can now enter zim run build to get it done.
$ zim run build --cache disabled
rule: myservice.build
cmd: go build -o ${OUTPUT}
rule: myservice.build in 1.347 sec [OK]
The outputs - an executable named myservice in this case - are stored in an artifacts directory located at the root level of the repository.
Currently Zim supports using AWS infrastructure for its cache backend. A CloudFormation stack containing an S3 bucket and a handful of other serverless resources can be provisioned using SAM.
Prerequisites:
- Install the AWS and SAM CLIs:
pip install awscli aws-sam-cli
- Download Go to build the Zim Lambdas
These steps were tested with the following version of the SAM CLI:
$ sam --version
SAM CLI, version 0.45.0
With those dependencies installed, run the following to provision cache infrastructure in AWS using a workflow guided by SAM:
$ make deploy
When the command completes, the URL of your Zim API is printed. This URL should be saved to ~/.zim.yaml as described in the following section.
Each developer should create the file ~/.zim.yaml on their development machine with two main variables:
- The team API URL from make deploy
- Your personal authentication token
With AWS credentials active for the account containing Zim, run the following command to create an authentication token for each team member:
$ zim add token --email "[email protected]" --name "Joe"
Each team member should now add the following to their ~/.zim.yaml:
url: "TEAM_API_URL"
token: "MY_TOKEN"
Alternatively, you can use the environment variables ZIM_URL and ZIM_TOKEN.
You may override the Zim CLI cache mode. The following modes are available:
- read-write - this is the default
- write-only - write to the cache but don't read from it
- disabled - operate in offline mode
To use this feature, set cache in ~/.zim.yaml as follows:
cache: disabled
Or use the command line flag:
$ zim run build --cache disabled
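The three cache modes can be read as a pair of permissions: whether the cache may be read and whether it may be written. The Go sketch below models that reading. The CacheMode type and its helper methods are hypothetical names for illustration, and treating an empty setting as read-write is an assumption based on read-write being described as the default.

```go
package main

import "fmt"

// CacheMode is a hypothetical type modelling the documented cache modes.
type CacheMode string

const (
	ReadWrite CacheMode = "read-write"
	WriteOnly CacheMode = "write-only"
	Disabled  CacheMode = "disabled"
)

// ParseCacheMode validates a mode string such as the value of --cache.
// Assumption: an empty value falls back to the default, read-write.
func ParseCacheMode(s string) (CacheMode, error) {
	switch CacheMode(s) {
	case "":
		return ReadWrite, nil
	case ReadWrite, WriteOnly, Disabled:
		return CacheMode(s), nil
	}
	return "", fmt.Errorf("unknown cache mode %q", s)
}

// CanRead reports whether outputs may be fetched from the cache.
func (m CacheMode) CanRead() bool { return m == ReadWrite }

// CanWrite reports whether outputs may be stored in the cache.
func (m CacheMode) CanWrite() bool { return m == ReadWrite || m == WriteOnly }

func main() {
	for _, s := range []string{"", "read-write", "write-only", "disabled"} {
		m, err := ParseCacheMode(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-10q read=%v write=%v\n", s, m.CanRead(), m.CanWrite())
	}
}
```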
To automatically run rules inside a Docker container, instead of on the host directly, define a Docker image for each component and set the Docker option.
For example, to build a Go service in a container, you could use the following:
name: myservice
docker:
  image: circleci/golang:1.12.4
toolchain:
  items:
    - name: go
      command: go version
rules:
  build:
    inputs:
      - "*.go"
    outputs:
      - ${NAME}
    command: go build -o ${OUTPUT}
When a Docker image is specified for the component, Zim mounts the repository as a volume and sets the component directory as the working directory when executing its rules.
Note the use of toolchain in the component.yaml. The above example includes the output of go version in the Rule Key so that builds on different architectures receive unique keys in the cache.
To opt out of using Docker for certain rules, set the native flag as follows:
rules:
  show-host-arch:
    native: true
    command: uname -a
You may target different architectures using Docker's multi-CPU architecture support. To set the Docker target platform, set platform in ~/.zim.yaml as follows:
platform: linux/amd64
Or use the command line flag:
$ zim run build --platform linux/amd64
You can list available platforms in Docker by running:
$ docker buildx ls
Rule keys are the basis for Zim caching. Zim uses SHA1 hashes to represent each key. Specifically, the hash is computed on a JSON document containing the following information for each rule:
- Project name
- Component name
- Rule name
- Docker image
- Output artifact count
- Input file relative paths and their SHA1 hashes
- Rule dependencies and their keys
- Environment variables set on the Component and Rule
- Toolchain
- Cache key version
- Rule commands
- Whether the rule is native
This information uniquely identifies all the inputs and configuration used by a rule. This means that, prior to executing a rule, Zim can determine the current rule key and check whether an output is stored with that key in the cache. If so, Zim downloads the output from the cache rather than executing the rule.
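As an illustration of the approach described above, the following Go sketch hashes a JSON document with SHA1 to derive a key. The KeyDoc struct, its field names, and its JSON layout are assumptions made for this example; the actual document that Zim hashes may be structured differently.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// InputFile pairs a relative path with the SHA1 hash of the file contents.
type InputFile struct {
	Path string `json:"path"`
	SHA1 string `json:"sha1"`
}

// KeyDoc is a hypothetical stand-in for the per-rule JSON document.
type KeyDoc struct {
	Project     string            `json:"project"`
	Component   string            `json:"component"`
	Rule        string            `json:"rule"`
	Image       string            `json:"image"`
	OutputCount int               `json:"output_count"`
	Inputs      []InputFile       `json:"inputs"`
	DepKeys     []string          `json:"deps"`
	Env         map[string]string `json:"env"`
	Toolchain   map[string]string `json:"toolchain"`
	Version     string            `json:"version"`
	Commands    []string          `json:"commands"`
	Native      bool              `json:"native"`
}

// RuleKey serializes the document to JSON and returns its SHA1 as hex.
// encoding/json writes map keys in sorted order, so equal documents
// always serialize to the same bytes and therefore the same key.
func RuleKey(doc KeyDoc) (string, error) {
	data, err := json.Marshal(doc)
	if err != nil {
		return "", err
	}
	sum := sha1.Sum(data)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	doc := KeyDoc{
		Project:     "repo",
		Component:   "myservice",
		Rule:        "build",
		OutputCount: 1,
		Inputs:      []InputFile{{Path: "main.go", SHA1: "0000"}}, // placeholder hash
		Toolchain:   map[string]string{"go": "go version output goes here"},
		Commands:    []string{"go build -o ${OUTPUT}"},
	}
	key, err := RuleKey(doc)
	fmt.Println(key, err)
}
```

Changing any field, such as the hash of one input file or the toolchain output, yields a different key, which is what lets a cache lookup by key stand in for re-running the rule.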
Zim assumes the rule commands are, in effect, a pure function. In practice this isn't always the case, but it is close enough. For example, when Python files are compiled to .pyc, a build timestamp is included, so the build will never be exactly the same, even with the same file inputs.
If you would like to see a key for a given rule for debugging purposes, you can use the following command:
$ zim key -r myservice.build
To retrieve the underlying information:
$ zim key -r myservice.build --detail
Zim supports dependencies between rules, both within a Component and across Components. Collectively, rule dependencies form a directed acyclic graph that Zim traverses when running rules.
To define a dependency, use the following syntax in a Component definition:
name: myservice
kind: go
rules:
  build:
    requires:
      - component: my_library_a
        rule: build
      - component: my_library_b
        rule: build
    inputs:
      - "*.go"
    outputs:
      - ${NAME}
    command: go build -o ${OUTPUT}
The above example declares two dependencies from myservice.build to my_library_a.build and my_library_b.build. Consequently, if a user runs zim run build -c myservice, Zim will first build the two libraries, and only when those complete successfully will it build myservice.
When declaring a requirement, if the Component is omitted, then it is assumed to be referring to another named Rule in the current Component.
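The ordering guarantee described here can be sketched in Go with one goroutine per rule, where each goroutine waits for its dependencies to finish before running. This is an illustrative sketch only, not Zim's scheduler: RunGraph and CompletionOrder are hypothetical names, every rule must appear as a key in the map, the graph is assumed to be acyclic, and failure handling is omitted.

```go
package main

import (
	"fmt"
	"sync"
)

// RunGraph runs each rule only after all of its dependencies have finished.
// deps maps a rule ID to the rule IDs it requires. Every rule must appear as
// a key, and the graph must be acyclic; a cycle would block forever here.
func RunGraph(deps map[string][]string, run func(id string)) {
	done := make(map[string]chan struct{}, len(deps))
	for id := range deps {
		done[id] = make(chan struct{})
	}
	var wg sync.WaitGroup
	for id, reqs := range deps {
		wg.Add(1)
		go func(id string, reqs []string) {
			defer wg.Done()
			for _, r := range reqs {
				<-done[r] // wait until the dependency has completed
			}
			run(id)
			close(done[id]) // signal completion to dependents
		}(id, reqs)
	}
	wg.Wait()
}

// CompletionOrder runs the graph with a trivial rule body and returns the
// order in which the rules finished.
func CompletionOrder(deps map[string][]string) []string {
	var mu sync.Mutex
	var order []string
	RunGraph(deps, func(id string) {
		mu.Lock()
		order = append(order, id)
		mu.Unlock()
	})
	return order
}

func main() {
	deps := map[string][]string{
		"myservice.build":    {"my_library_a.build", "my_library_b.build"},
		"my_library_a.build": nil,
		"my_library_b.build": nil,
	}
	fmt.Println(CompletionOrder(deps))
}
```

The two library rules may finish in either order because they run in parallel, but myservice.build always finishes last.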
In the case of one component depending on another's source code, the exported source files can be advertised. The following example declares source as a named export from my_go_lib:
name: my_go_lib
exports:
  source:
    resources:
      - go.mod
      - go.sum
      - "**/*.go"
    ignore:
      - "**/*_test.go"
That exported source can then be declared as a dependency of a binary:
name: my_exe
rules:
  build:
    requires:
      - component: my_go_lib
        export: source
Requiring an export in this way incorporates all files from the export into the component's rule key.
Rules are able to leverage environment variables from two sources. First, environment variables may be defined at the Component level, which makes them available to all rules of the Component:
name: myservice
environment:
  RETRY_COUNT: 3
  FOO: bar
Second, a handful of environment variables are automatically injected to provide Rule commands some context:
- COMPONENT - the Component name, e.g. "myservice"
- NAME - the Component name, e.g. "myservice"
- KIND - the Component kind, e.g. "go"
- RULE - the Rule name, e.g. "build"
- NODE_ID - ID in Graph for the Rule, e.g. "myservice.build"
- INPUT - the relative path to the first input
- OUTPUT - the relative path to the first output
- OUTPUTS - relative paths to all outputs (space separated)
- DEP - the relative path to the first dependency
- DEPS - relative paths to all dependencies (space separated)
- ARTIFACTS_DIR - absolute path to directory where outputs are placed
- ARTIFACT - absolute path to the first output
- ROOT - absolute path to the root of the project
As a trivial example, if a Rule lists "*.go" as an input and the Component has one Go file in the directory named "main.go", then INPUT=main.go is set in the Rule environment.
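To show how a command such as go build -o ${OUTPUT} resolves against these variables, here is a small Go sketch using os.Expand from the standard library. ExpandCommand is a hypothetical helper, and the values in the map are illustrative. Whether Zim substitutes the variables itself or leaves expansion to the shell is not stated here, so treat this purely as a model of the result.

```go
package main

import (
	"fmt"
	"os"
)

// ExpandCommand replaces ${VAR} and $VAR references in cmd with values from
// env. Variables missing from env expand to the empty string.
func ExpandCommand(cmd string, env map[string]string) string {
	return os.Expand(cmd, func(name string) string { return env[name] })
}

func main() {
	// Illustrative values; the real paths depend on the repository layout.
	env := map[string]string{
		"NAME":   "myservice",
		"RULE":   "build",
		"INPUT":  "main.go",
		"OUTPUT": "myservice",
	}
	fmt.Println(ExpandCommand("go build -o ${OUTPUT}", env))
}
```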
Zim offers some built-in commands that may be leveraged within rules. To use these, specify commands as a list in a rule definition instead of a simple command string. Here is an example showing how to create a zip file containing the contents of the dist directory in a build:
rules:
  build:
    inputs:
      - src/**
      - package.json
    outputs:
      - ${NAME}.zip
    commands:
      - cleandir: dist
      - run: yarn run build
      - zip:
          cd: dist
          input: "."
          output: ${ARTIFACT}
Available built-ins:
- run - runs the following commands in a shell
- mkdir - creates a directory and its parents as needed (mkdir -p)
- cleandir - removes and recreates the directory (rm -rf then mkdir -p)
- remove - removes files or directories (rm -rf)
- move - relocate files or directories (mv)
  - src - source locations
  - dst - destination locations
- copy - copy files or directories (cp -R)
  - src - source locations
  - dst - destination locations
  - options - cp command options (default -R)
- zip - create a zip archive
  - options - zip command options (default -qrFS)
  - input - path to input files (default .)
  - output - required zip output path
  - cd - optional directory to cd into before running the command
- unzip - unzip an archive
  - options - unzip command options (default -qo)
  - input - path to the zip file
  - output - optional directory to extract into
- archive - create a tgz archive
  - options - tar command options (default -czf)
  - input - required path(s) to input files
  - output - required path to output tgz
- unarchive - unpack a tgz archive
  - options - tar command options (default -xzf)
  - input - path to the tgz
  - output - optional directory to extract into
When a Component is Docker-enabled, these built-ins execute on the build host rather than in the container. This helps avoid I/O performance penalties, for example with Docker on macOS.
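As a rough illustration of what a host-side built-in amounts to, the Go sketch below implements the documented behavior of cleandir (rm -rf followed by mkdir -p) with the standard library. CleanDir and Demo are hypothetical names for this example, and the sketch is not taken from Zim's implementation.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// CleanDir removes dir and everything beneath it, then recreates it empty,
// mirroring "rm -rf dir" followed by "mkdir -p dir".
func CleanDir(dir string) error {
	if err := os.RemoveAll(dir); err != nil {
		return err
	}
	return os.MkdirAll(dir, 0755)
}

// Demo creates a temporary dist directory containing one stale file, cleans
// it, and returns how many entries remain afterwards.
func Demo() (int, error) {
	root, err := os.MkdirTemp("", "cleandir-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(root)

	dist := filepath.Join(root, "dist")
	if err := os.MkdirAll(dist, 0755); err != nil {
		return 0, err
	}
	stale := filepath.Join(dist, "stale.js")
	if err := os.WriteFile(stale, []byte("old"), 0644); err != nil {
		return 0, err
	}
	if err := CleanDir(dist); err != nil {
		return 0, err
	}
	entries, err := os.ReadDir(dist)
	if err != nil {
		return 0, err
	}
	return len(entries), nil
}

func main() {
	n, err := Demo()
	fmt.Println(n, err)
}
```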
Here are the most commonly used commands.
Run all build rules in the Project with the following. Note that build is an arbitrary rule name with no special behavior.
$ zim run build
Run the clean rule for two specific Components:
$ zim run clean -c comp1,comp2
Build a Component with the cache disabled:
$ zim run build --cache disabled -c comp1
Show the rule cache key for a specific Component and Rule:
$ zim key -r myservice.build
Show the detailed contents of a rule cache key:
$ zim key -r myservice.build --detail
Show all Components in the Project:
$ zim list components
Show all input files used by a Component:
$ zim list inputs -c myservice
Create a new authentication token during setup:
$ zim add token
Auto-completion is available for Components, Rules, and Kinds. Run the following for instructions:
$ zim completion -h