Add new snapshots functionality #465

Open
wants to merge 15 commits into
base: main
from

amohoste
Contributor

Summary

This PR adds the ability to boot multiple uVMs from a single snapshot. This makes snapshots more flexible to use and opens up the possibility of adding remote snapshot restore functionality.

Implementation Notes ⚒️

The following external dependencies have been updated to support the new snapshot functionality. Once these changes have been merged, the go.mod file should be updated to point to the respective ease-lab repos:

  • Firecracker: now allows specifying a custom devmapper snapshot device
  • Firecracker-containerd: adds support for network namespaces, for creating a shim upon loading snapshots, for diff snapshots, and for specifying a custom devmapper snapshot device in Firecracker
  • Containerd: exposes devmapper snapshot device IDs in the stat routine

Since the new snapshotting without offloading is not yet compatible with REAP snapshots, an orchestrator interface has been added to support both the old logic with offloading and the new snapshots. To also support REAP snapshots with the new snapshotting logic in the future, we would need to obtain the UPF socket path to handle page faults before loading the snapshot, which is currently not possible. The two snapshotting implementations have been named "deduplicated" for the new snapshots and "regular" for the old snapshots with offloading, but these could probably use better names.

The following components have been added to support the new snapshots, but are also used in the old implementation with offloading:

  • imageManager: manages and serializes pulling of container images, avoiding potential unnecessary network congestion
  • networkManager: manages a pool of in-use and, optionally, ready-to-use network namespaces that are preconfigured with all the networking necessary to run a VM. Keeping a pool of ready-to-use network namespaces reduces cold-start latency by taking network initialization off the critical path of VM creation
  • snapshotManager: new component that manages the available snapshots. An interface is provided with two implementations: one for the offloading snapshots, where a single snapshot can only be used to boot a single VM, and one for the new snapshots, where the logic has been adjusted to account for the fact that a snapshot can be used to boot as many VMs as desired
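
The difference between the two snapshotManager implementations can be sketched roughly as follows. This is only an illustration of the two availability policies described above: the interface, type, and method names are hypothetical and are not taken from the PR's code.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrNoSnapshot is returned when no usable snapshot exists for an image.
var ErrNoSnapshot = errors.New("no snapshot available")

// SnapshotManager is a hypothetical interface, not the one defined in this PR.
type SnapshotManager interface {
	Add(image string)           // register a newly created snapshot
	Acquire(image string) error // reserve a snapshot to boot one VM
}

// offloadManager models the offloading ("regular") mode: every snapshot
// backs at most one VM, so Acquire consumes one stored snapshot.
type offloadManager struct {
	mu    sync.Mutex
	avail map[string]int
}

func newOffloadManager() *offloadManager {
	return &offloadManager{avail: make(map[string]int)}
}

func (m *offloadManager) Add(image string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.avail[image]++
}

func (m *offloadManager) Acquire(image string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.avail[image] == 0 {
		return ErrNoSnapshot
	}
	m.avail[image]--
	return nil
}

// sharedManager models the new ("deduplicated") mode: a snapshot stays
// available after use, so any number of VMs can be booted from it.
type sharedManager struct {
	mu      sync.RWMutex
	present map[string]bool
}

func newSharedManager() *sharedManager {
	return &sharedManager{present: make(map[string]bool)}
}

func (m *sharedManager) Add(image string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.present[image] = true
}

func (m *sharedManager) Acquire(image string) error {
	m.mu.RLock()
	defer m.mu.RUnlock()
	if !m.present[image] {
		return ErrNoSnapshot
	}
	return nil
}

func main() {
	managers := map[string]SnapshotManager{
		"offloading": newOffloadManager(),
		"shared":     newSharedManager(),
	}
	for name, m := range managers {
		m.Add("helloworld")
		booted := 0
		for i := 0; i < 3; i++ {
			if m.Acquire("helloworld") == nil {
				booted++
			}
		}
		fmt.Printf("%s: booted %d VMs from one snapshot\n", name, booted)
	}
}
```

With a single stored snapshot, the offloading policy admits one VM while the shared policy admits all three requests; a real implementation would also have to track memory files, patch files, and eviction, which this sketch omits.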

The following components have been added exclusively for the new snapshots:

  • devMapper: creates and manages device snapshots used to store container images. This component was necessary to boot multiple VMs from a single snapshot, although further optimizations are certainly possible here

@ustiugov
Member

ustiugov commented Feb 9, 2022

For containerd, please create a PR as you did for the others so it is easy to discuss what needs to be done. I reviewed the patch: you need to move the new arguments in api/services/snapshots/v1/snapshots.proto to the end of the protobuf messages, as you did in firecracker-containerd.

Member

@ustiugov ustiugov left a comment

Great work, Amory! But we do need more before merging it

  • I am sure that this regular/dedup code duplication is a bad SW design decision. There is a ton of replicated code. This bloats the code and makes code maintenance much harder. Please merge them and introduce the required runtime arguments, etc. The same goes for firecracker-containerd: it has to be one version.
  • Make sure that you allow the user to override the interface name, because a node may have several NICs and the user might want to drive the traffic through a particular interface (e.g., the highest-bandwidth NIC).
  • we need unit and integration tests for all the added functionality. This is a must.
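
The interface-override request could be handled along these lines. This is a minimal sketch, assuming the override arrives as a plain string (for example from a command-line flag); `pickInterface` is a hypothetical helper and does not appear in the PR.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// pickInterface is a hypothetical helper: it returns the user-supplied
// interface name when one is given and present on the host, and otherwise
// falls back to the first detected interface.
func pickInterface(override string, detected []string) (string, error) {
	if override != "" {
		for _, name := range detected {
			if name == override {
				return name, nil
			}
		}
		return "", fmt.Errorf("interface %q not found on this host", override)
	}
	if len(detected) == 0 {
		return "", errors.New("no network interface detected")
	}
	return detected[0], nil
}

func main() {
	ifaces, err := net.Interfaces()
	if err != nil {
		fmt.Println("cannot list interfaces:", err)
		return
	}
	names := make([]string, 0, len(ifaces))
	for _, ifc := range ifaces {
		names = append(names, ifc.Name)
	}
	// An empty override falls back to the first detected interface.
	name, err := pickInterface("", names)
	fmt.Println(name, err)
}
```

Rejecting an unknown override, rather than silently falling back, keeps a typo in the interface name from routing traffic through an unintended NIC.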

cri/firecracker/coordinator.go (outdated, resolved)
cri/firecracker/coordinator.go (outdated, resolved)
cri/firecracker/coordinator_test.go (resolved)
cri/firecracker/service.go (resolved)
cri/firecracker/service.go (resolved)
ctriface/iface_test.go (outdated, resolved)
go.mod (outdated, resolved)
metrics/metrics.go (outdated, resolved)
networking/networking.go (resolved)
scripts/install_pmutools.sh (outdated, resolved)
@ustiugov ustiugov self-requested a review March 8, 2022 12:21
Member

@ustiugov ustiugov left a comment

Thanks! LGTM

Please see my minor comments, and I look forward to seeing the unit and integration tests ASAP.

cri/firecracker/coordinator.go (outdated, resolved)
cri/firecracker/coordinator.go (outdated, resolved)
cri/firecracker/coordinator.go (outdated, resolved)
cri/firecracker/coordinator.go (outdated, resolved)
cri/firecracker/service.go (outdated, resolved)
cri/firecracker/service.go (outdated, resolved)
ctriface/iface.go (outdated, resolved)
ctriface/manual_cleanup_test.go (resolved)
vhive.go (outdated, resolved)
@amohoste amohoste force-pushed the new_snapshots branch 9 times, most recently from 29882fb to 870cb25 Compare March 13, 2022 15:38
@amohoste amohoste force-pushed the new_snapshots branch 17 times, most recently from 45bd9b3 to f6a953d Compare March 27, 2022 14:34
@amohoste amohoste force-pushed the new_snapshots branch 2 times, most recently from ed85025 to f5e0aee Compare May 16, 2022 20:35
@@ -99,6 +99,9 @@ if [ "$SANDBOX" == "gvisor" ]; then
fi

if [ "$SANDBOX" == "firecracker" ]; then
echo Cleaning snapshots
Member

indent seems off

github.com/ease-lab/vhive/examples/protobuf/helloworld => ./examples/protobuf/helloworld
github.com/firecracker-microvm/firecracker-containerd => github.com/ease-lab/firecracker-containerd v0.0.0-20210618165033-6af02db30bc4
github.com/firecracker-microvm/firecracker-containerd => github.com/amohoste/firecracker-containerd v1.0.0-enhanced-snap // TODO: change to vhive
Member

please change

@ustiugov
Member

@amohoste could you please update the PR's description, particularly the terms. Much of this text can be moved to the doc. Also, a lot of people are interested in how vHive can support remote snaps, can you please elaborate on that in the doc?

Member

@ustiugov ustiugov left a comment

Thanks, Amory! Just need to split up the lines into smaller ones for more granular version control.

@cvetkovic could you please read the docs and see if there are any issues with the explanations? is everything clear to you?

@@ -0,0 +1,25 @@
# vHive full local snapshots

When using Firecracker as the sandbox technology in vHive, two snapshotting modes are supported: a default mode and a full local mode. The default snapshot mode uses an offloading-based technique which leaves the shim and other resources running upon shutting down a microVM such that it can be re-used in the future. This technique has the advantage that the shim does not have to be recreated and the block and network devices of the previously stopped microVM can be reused, but limits the number of microVMs that can be booted from a snapshot to the number of microVMs that have been offloaded. The full local snapshot mode instead allows loading an arbitrary number of microVMs from a single snapshot. This is done by creating a new shim and the required block and network devices upon loading a snapshot and creating an extra patch file containing the filesystem differences written by the microVM upon snapshot creation. To enable the full local snapshot functionality, vHive must be run with the `-snapshots` and `-fulllocal` flags. In addition, the full local snapshot mode can be further configured using the following flags:
Member

please split into short lines for more granular version control

@ustiugov ustiugov assigned cvetkovic and unassigned cvetkovic Jun 7, 2022
@ustiugov ustiugov requested a review from cvetkovic June 7, 2022 20:59
@ustiugov ustiugov added the enhancement New feature or request label Jun 7, 2022
@ustiugov
Member

ustiugov commented Jun 7, 2022

@amohoste could you please update CHANGELOG.md?

# vHive full local snapshots

When using Firecracker as the sandbox technology in vHive, two snapshotting modes are supported: a default mode and a
full local mode. The default snapshot mode uses an offloading-based technique which leaves the shim and other resources
Contributor

Explain what a shim is here?

offloaded. The full local snapshot mode instead allows loading an arbitrary number of microVMs from a single snapshot.
This is done by creating a new shim and the required block and network devices upon loading a snapshot and creating an
extra patch file containing the filesystem differences written by the microVM upon snapshot creation. To enable the
full local snapshot functionality, vHive must be run with the `-snapshots` and `-fulllocal` flags. In addition, the
Contributor

fulllocal looks ugly. Maybe consider full_local.

reused, but limits the number of microVMs that can be booted from a snapshot to the number of microVMs that have been
offloaded. The full local snapshot mode instead allows loading an arbitrary number of microVMs from a single snapshot.
This is done by creating a new shim and the required block and network devices upon loading a snapshot and creating an
extra patch file containing the filesystem differences written by the microVM upon snapshot creation. To enable the
Contributor

Motivate why patching is needed so a user can understand what the incentives were for creating these two modes.

full local snapshot mode can be further configured using the following flags:

- `isSparseSnaps`: store the memory file as a sparse file to make its storage size closer to the actual size of the memory utilized by the microVM, rather than the memory allocated to the microVM
- `snapsStorageSize [capacityGiB]`: specify the amount of capacity that can be used to store snapshots
Contributor

amount of capacity - rewrite

## Remote snapshots

Rather than only using the snapshots available locally on a node, snapshots can also be transferred between nodes to
potentially accelerate cold start times and reduce memory utilization, given that proper mechanisms are in place to
Contributor

Improve, not reduce memory utilization.
