crik
is a project that aims to provide checkpoint and restore functionality for Kubernetes pods mainly targeted for
node shutdown and restart scenarios. It is a command wrapper that, under the hood, utilizes
criu
to checkpoint and restore process trees in a Pod
.
crik
is first revealed at KubeCon EU 2024: The Party Must Go on - Resume Pods After Spot Instance Shut Down - Muvaffak Onuş, QA Wolf
It is a work in progress and is not ready for production use.
crik
has two components:
crik
- a command wrapper that executes given command and checkpoints it when SIGTERM is received and restores from checkpoint when image directory contains a checkpoint.manager
- a kubernetes controller that watchesNode
objects and updates its internal map of states so thatcrik
can check whether it should checkpoint or restore depending on its node's state.
The only pre-requisite is to have a Kubernetes cluster running. You can use kind
to create a local cluster.
kind create cluster
Then, you can deploy the simple-loop example where a counter increases every second and you can delete the pod and see that it continues from where it left off in the new pod.
kubectl apply -f examples/simple-loop.yaml
Watch logs:
kubectl logs -f simple-loop-0
In another terminal, delete the pod:
kubectl delete pod simple-loop-0
Now, a new pod is created. See that it continues from where it left off:
kubectl logs -f simple-loop-0
The application you want to checkpoint and restore should be run with crik
command, like the following:
crik run -- app-binary
The following is an example Dockerfile
for your application that installs crik
and runs your application. It assumes
your application is entrypoint.sh
.
FROM ubuntu:22.04
RUN apt-get update && apt-get install --no-install-recommends --yes gnupg curl ca-certificates
# crik requires criu to be available.
RUN curl "https://keyserver.ubuntu.com/pks/lookup?op=get&search=0x4E2A48715C45AEEC077B48169B29EEC9246B6CE2" | gpg --dearmor > /usr/share/keyrings/criu-ppa.gpg \
&& echo "deb [signed-by=/usr/share/keyrings/criu-ppa.gpg] https://ppa.launchpadcontent.net/criu/ppa/ubuntu jammy main" > /etc/apt/sources.list.d/criu.list \
&& apt-get update \
&& apt-get install --no-install-recommends --yes criu iptables
# Install crik
COPY --from=ghcr.io/qawolf/crik/crik:v0.1.2 /usr/local/bin/crik /usr/local/bin/crik
# Copy your application
COPY entrypoint.sh /entrypoint.sh
# Run your application with crik
ENTRYPOINT ["crik", "run", "--", "/entrypoint.sh"]
Not all apps can be checkpointed and restored and for many of them, criu
may need additional configurations. crik
provides a high level configuration interface that you can use to configure crik
for your application. The following
is the minimum configuration you need to provide for your application and by default crik
looks for config.yaml
in
/etc/crik
directory.
kind: ConfigMap
metadata:
name: crik-simple-loop
data:
config.yaml: |-
imageDir: /etc/checkpoint
Configuration options:
imageDir
- the directory wherecrik
will store the checkpoint images. It needs to be available in the same path in the newPod
as well.additionalPaths
- additional paths thatcrik
will include in the checkpoint and copy back in the newPod
. Populate this list if you getfile not found
errors in the restore logs. The paths are relative to root/
and can be directories or files.inotifyIncompatiblePaths
- paths thatcrik
will delete before taking the checkpoint. Populate this list if you getfsnotify: Handle 0x278:0x2ffb5b cannot be opened
errors in the restore logs. You need to find the inode of the file by converting0x2ffb5b
to an integer, and then find the path of the file by runningfind / -inum <inode>
and add the path to this list. See this comment for more details.
Alpha feature. Not ready for production use.
You can optionally configure crik
to take checkpoint only if the node it's running on is going to be shut down. This is
achieved by deploying a Kubernetes controller that watches Node
events and updates its internal map of states so that
crik
can check whether it should checkpoint or restore depending on its node's state. This may include direct calls
to the cloud provider's API to check the node's state in the future.
Deploy the controller:
helm upgrade --install node-state-server oci://ghcr.io/qawolf/crik/charts/node-state-server --version 0.1.2
Make sure to include the URL of the server in crik
's configuration mounted to your Pod
.
# Assuming the chart is deployed to default namespace.
kind: ConfigMap
metadata:
name: crik-simple-loop
data:
config.yaml: |-
imageDir: /etc/checkpoint
nodeStateServerURL: http://crik-node-state-server.default.svc.cluster.local:9376
crik
will hit the /node-state
endpoint of the server to get the state of the node it's running on when it receives
SIGTERM and take checkpoint only if it returns shutting-down
as the node's state. However, it needs to provide the
node name to the server so make sure to add the following environment variable to your container spec in your Pod
:
env:
- name: KUBERNETES_NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
Build crik
:
go build -o crik cmd/crik/main.go
Taking checkpoints of processes and restoring them from within the container requires quite a few privileges to be given
to the container. The best approach is to execute these operations at the container runtime level and today, container
engines such as CRI-O and Podman do have native support for using criu
to checkpoint and restore the whole containers
and there is an ongoing effort to bring this functionality to Kubernetes as well. The first use case being the forensic
analysis via checkpoints as described here.
While it is the better approach, since it's such a low-level change, it's expected to take a while to be available in
mainstream Kubernetes in an easily consumable way. For example, while taking a checkpoint is possible through kubelet
API if you're using CRI-O, restoring it as another Pod
in a different Node
is not natively supported yet.
crik
allows you to use criu
to checkpoint and restore a Pod
to another Node
today without waiting for the native
support in Kubernetes. Once the native support is available, crik
will utilize it under the hood.
This project is licensed under the Apache License, Version 2.0 - see the LICENSE file for details.