Garden Modes
Originally, Cloud Foundry used warden as its container runtime. This was a Linux container runtime written from scratch. In time, community container runtimes emerged and it made sense to use one of them instead. Warden was rewritten in Go, renamed to garden, and made to invoke runc to create containers and run processes in them.
Garden still had to maintain details about container state when using runc. Another project which wraps runc and manages that state is containerd. It made sense to continue the approach of replacing garden code with community maintained code by adopting containerd in garden.
For the sake of making small steps, it was decided to integrate containerd in two stages. First it would be used just for the purpose of container creation. This is known as containerd mode. Then finally it would be used to run processes in those containers too. This is known as containerd-for-processes mode. The original mode using only runc became known as runc mode.
The goal was to eventually deprecate runc mode, along with containerd mode, and have containerd fully managing containers and processes in containerd-for-processes mode. At the time of writing, we have confidence that containerd-for-processes mode is production ready, but containerd mode is the default mode in cf-deployment, and there is an option to run garden in runc mode.
The garden mode is selected by specifying containerd flags in the config file (or command line arguments).
See the config file structure.
If containerd-socket is defined with the path to the containerd socket, garden will be in one of the containerd modes.
The exact mode depends on the value of use-containerd-for-processes.
If true, garden is in containerd-for-processes mode;
if false, the default if omitted, garden is in containerd mode.
When containerd-socket is not defined, garden is in runc mode.
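The selection rule above can be sketched as a small function. This is only an illustration of the decision logic, not garden's actual code: it assumes the settings have already been parsed into a flat mapping keyed by the flag names, and the function name garden_mode is hypothetical.

```python
from typing import Any, Mapping


def garden_mode(config: Mapping[str, Any]) -> str:
    """Return the garden mode implied by the containerd-related settings.

    Assumption: `config` is a flat mapping keyed by the flag names used in
    this document. The real config file structure may differ.
    """
    # No containerd socket configured: garden drives runc itself.
    if not config.get("containerd-socket"):
        return "runc"
    # Socket configured: the processes flag picks between the two
    # containerd modes, and defaults to false when omitted.
    if config.get("use-containerd-for-processes", False):
        return "containerd-for-processes"
    return "containerd"
```

For example, a mapping that sets only containerd-socket yields "containerd", while an empty mapping yields "runc".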
In runc mode, garden wires runrunc as the containerizer.
A key feature is the depot.
This is normally stored at /var/vcap/data/garden/depot.
This is a directory storing the state of all running containers and processes.
When a container is created, a directory is created in the depot with the name of the container handle.
The container's OCI bundle, i.e. its spec, is serialized in that directory as 'config.json'.
This is used by runc to create the container, writing logs to 'create.log',
and writing the container process ID in 'pidfile'.
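To make the layout concrete, the following sketch computes where those files would live for a given handle. The helper name and the dictionary keys are hypothetical. Only the depot location and the three file names come from the description above.

```python
from pathlib import PurePosixPath

# Usual depot location, as described above. It may differ per deployment.
DEFAULT_DEPOT = PurePosixPath("/var/vcap/data/garden/depot")


def container_depot_files(handle, depot=DEFAULT_DEPOT):
    """Return the per-container depot paths for a container handle.

    Hypothetical helper for illustration. It only builds paths and does
    not touch the filesystem.
    """
    container_dir = PurePosixPath(depot) / handle
    return {
        "dir": container_dir,                         # named after the handle
        "bundle": container_dir / "config.json",      # serialized OCI spec
        "create_log": container_dir / "create.log",   # runc create logs
        "pidfile": container_dir / "pidfile",         # container process ID
    }
```

Calling container_depot_files("my-handle")["bundle"] gives /var/vcap/data/garden/depot/my-handle/config.json.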
In runc mode, when a user creates a container, the garden server calls out to runc run directly using a special init process,
which has the job of making sure there are no zombie processes.
When a user wants to exec a process in a running container, garden cannot just exec runc directly,
because if the process exits while the garden server is down its streams and exit code are lost forever.
There needs to be another process that shares the garden container process lifecycle
and takes care of reading its exit code and copying the content of its streams.
In garden this is handled by dadoo.
So when users are exec'ing processes the garden server is effectively running dadoo exec runc exec,
which makes dadoo the parent of the container process.
Dadoo waits for the container process to exit and writes its exit code, as well as the contents of its standard streams,
to several pipes that live in the depot under <container-handle>/<process-id>/.
The garden server uses those pipes to obtain information about container processes.
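The idea behind this arrangement can be illustrated with a deliberately simplified sketch: a supervisor runs the command, waits for it, and persists the results in a process directory so that a reader can pick them up later. This is not how dadoo is implemented. It uses regular files instead of pipes and runs the supervisor inside the calling process, and the file names stdout, stderr and exitcode are assumptions made up for the example.

```python
import subprocess
import sys
from pathlib import Path


def supervise(argv, process_dir):
    """Run argv, wait for it, and persist its streams and exit code.

    Simplified stand-in for the role dadoo plays. The file names
    ("stdout", "stderr", "exitcode") are hypothetical.
    """
    process_dir = Path(process_dir)
    process_dir.mkdir(parents=True, exist_ok=True)
    with open(process_dir / "stdout", "wb") as out, \
            open(process_dir / "stderr", "wb") as err:
        # The supervisor is the parent of the child and waits for it to exit.
        result = subprocess.run(argv, stdout=out, stderr=err)
    # Persist the exit code so it survives independently of the reader.
    (process_dir / "exitcode").write_text(str(result.returncode))
    return result.returncode


def read_exit_code(process_dir):
    """What a restarted server could do: read the persisted exit code."""
    return int((Path(process_dir) / "exitcode").read_text())
```

A caller could run supervise([sys.executable, "-c", "raise SystemExit(3)"], some_dir) and later recover the exit code with read_exit_code(some_dir), without having been alive while the child exited.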
Running a pea process is similar to running a normal process,
but since a pea has its own rootfs it needs to have its own runc bundle in the depot.
A directory is created for the pea process under <container-handle>/<process-id> and the bundle is serialized to config.json there.
The pea is invoked using dadoo run runc run with the process directory,
and the standard streams and the exit code are output to pipes as for normal processes.
The pea process ends up as a top-level process in the pea container,
but it does share its process namespace with its parent (sandbox) container.
If a user wants to list containers or attach to processes, the garden server will consult the depot directory, find the right container entry by handle and potentially read the pipes in it.
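Because each container has a directory named after its handle, listing containers in runc mode amounts to listing the depot's subdirectories. A minimal sketch of that lookup, assuming a hypothetical helper name and ignoring any validation the garden server performs:

```python
from pathlib import Path


def list_container_handles(depot):
    """Return the container handles found in the depot, sorted by name.

    Illustrative only: every subdirectory of the depot is treated as a
    container entry, and plain files are ignored.
    """
    depot = Path(depot)
    if not depot.is_dir():
        return []
    return sorted(entry.name for entry in depot.iterdir() if entry.is_dir())
```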
Containerd is a container management daemon wrapping runc. It handles all the state management that garden had to deal with in runc mode.
There are some differences in terminology when using containerd:
- A 'container' is basically a container definition or bundle.
- Creating one in containerd does not provide a running container.
- A running container is a containerd task.
- Processes running in a container are still processes.
Containerd has its own directory structure for state storage (usually rooted at /var/vcap/data/containerd).
For example, container bundles will be stored by containerd in <containerd-root-dir>/state/io.containerd.runtime.v2.task/<namespace>/<container-handle>/config.json, where <namespace> is usually 'garden'.
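As a quick illustration of that path template, the sketch below fills in the placeholders. The function name is hypothetical, and the defaults are the usual values mentioned above, which may differ in a given deployment.

```python
from pathlib import PurePosixPath


def containerd_bundle_config(handle,
                             root_dir="/var/vcap/data/containerd",
                             namespace="garden"):
    """Return the path where containerd keeps a container's config.json.

    Hypothetical helper that only builds the path from the template in
    this document. It does not check that the file exists.
    """
    return (PurePosixPath(root_dir) / "state" / "io.containerd.runtime.v2.task"
            / namespace / handle / "config.json")
```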
This means garden no longer has to store any information about container state in the depot.
Kawasaki does however create hosts and resolv.conf files there, so the depot is not completely gone in this mode.
Garden also has no need for dadoo in containerd-for-processes mode.
Containerd provides a shim component that handles the input/output of container processes and persistence across daemon restarts.
Unlike dadoo there is one shim per container that takes care of all the processes.
Keep in mind that pea processes are technically containers disguised as processes,
so each pea process will have its own shim sharing namespaces with the sandbox container.
Containerd provides the ctr CLI command, which may be useful for querying containerd about containers, tasks and processes.
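As a rough guide, the sketch below assembles ctr invocations for the three kinds of query. The helper name is hypothetical, the 'garden' namespace default follows the text above, and the socket address is left optional because its location depends on the deployment. The subcommands used in the examples (containers list, tasks list, tasks ps) are standard ctr subcommands.

```python
def ctr_command(*args, namespace="garden", address=None):
    """Build an argv list for a ctr invocation in the given namespace.

    Hypothetical helper: it only assembles the command line and does
    not run it.
    """
    argv = ["ctr"]
    if address:
        # Path to the containerd socket. Deployment-specific, so optional.
        argv += ["--address", address]
    argv += ["--namespace", namespace]
    argv += list(args)
    return argv


# Container definitions (bundles) known to containerd.
list_containers = ctr_command("containers", "list")
# Running containers, i.e. containerd tasks.
list_tasks = ctr_command("tasks", "list")
# Processes inside one task. "my-handle" is a placeholder container handle.
list_processes = ctr_command("tasks", "ps", "my-handle")
```

Each resulting list can be passed to subprocess.run on a host where ctr is installed and can reach the containerd socket.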
Containerd mode is the mixed mode, where garden creates containers with containerd but uses dadoo and runc to run processes in the containers. Dadoo is also used to run peas, which bypasses containerd for the pea container creation.
Since containerd is taking care of container state, the bundle is not serialized in the depot. However, process directories are created in the depot under the container handle directory and contain the same files as in the runc mode.