diff --git a/text/0000-deprecate-daemon.md b/text/0000-deprecate-daemon.md
new file mode 100644
index 000000000..e60883189
--- /dev/null
+++ b/text/0000-deprecate-daemon.md
@@ -0,0 +1,330 @@
+# Meta
+[meta]: #meta
+- Name: Export image in OCI format into layers
+- Start Date: 2022-01-12
+- Author(s): [jjbustamante](https://github.com/jjbustamante/)
+- Status: Draft
+- RFC Pull Request: (leave blank)
+- CNB Pull Request: (leave blank)
+- CNB Issue: (leave blank)
+- Supersedes: (put "N/A" unless this replaces an existing RFC, then link to that RFC)
+
+# Summary
+[summary]: #summary
+
+Allow the Lifecycle to export the output image in OCI format and save it to the file system, opening the possibility of deprecating the use of the Daemon in the future.
+
+See requirement(s):
+
+- [lifecycle #424](https://github.com/buildpacks/lifecycle/issues/423)
+
+
+# Definitions
+[definitions]: #definitions
+- A [Platform](https://buildpacks.io/docs/concepts/components/platform/) uses a lifecycle, Buildpacks (packaged in a builder), and application source code to produce an OCI image.
+- A [Lifecycle](https://buildpacks.io/docs/concepts/components/lifecycle/) orchestrates Buildpacks execution, then assembles the resulting artifacts into a final app image.
+- A Daemon is a service, popularized by Docker, for downloading container images, and executing and managing containers from those images.
+- A Registry is a long-running service used for storing and retrieving container images.
+- An [OCI Image Layout](https://github.com/opencontainers/image-spec/blob/main/image-layout.md) is the directory structure for OCI content-addressable blobs and location-addressable references.
+- An [image index](https://github.com/opencontainers/image-spec/blob/main/image-index.md) is a higher-level manifest which points to a list of manifests and descriptors.
+- An [Image manifest](https://github.com/opencontainers/image-spec/blob/main/manifest.md) provides a configuration and set of layers for a single container image for a specific architecture and operating system.
+- A [config](https://github.com/opencontainers/image-spec/blob/main/descriptor.md) is a property that references a configuration object for a container, by digest. It must support the media type `application/vnd.oci.image.config.v1+json`.
+
+
+Additionally, in order to document this RFC we are using the [C4 model](https://c4model.com), which is an "abstraction-first" approach to diagramming software architecture, based upon abstractions that reflect how software architects and developers think about and build software.
+
+# Motivation
+[motivation]: #motivation
+
+As we can see in the following landscape diagram, the lifecycle currently requires access to a Daemon or a Registry to do its job.
+
+![](https://i.imgur.com/lsCuY8h.png)
+
+
+
+This design makes it harder to maintain the current capabilities or extend them, because the differences between the two formats increase the complexity of keeping the same functionality for images saved in a Daemon and in a Registry.
+
+For example, the OCI image specification defines a Manifest file, but this concept doesn't exist in the Daemon. Likewise, if we try to add [annotations](https://github.com/buildpacks/rfcs/pull/196), we can't guarantee the same behavior for images saved in the Daemon.
+
+
+The main goal is to deprecate the use of the Daemon in the lifecycle, embrace the use of the [OCI specification](https://github.com/opencontainers/image-spec) and move the complexity of interacting with the Daemon into the **Platform** component.
+
+Some of the use cases this feature can support are:
+
+- TBD
+- TBD
+
+The expected output image generated by the lifecycle would be saved using the [OCI image layout](https://github.com/opencontainers/image-spec/blob/main/image-layout.md) in a path configured by the user.
+
+# What it is
+[what-it-is]: #what-it-is
+
+The general idea is to produce an [OCI image layout](https://github.com/opencontainers/image-spec/blob/main/image-layout.md) and save it in a file system accessible from the lifecycle execution. The proposal targets the Platform implementor because it delegates the following responsibilities to them:
+
+- Pull the required dependencies (the runtime image, for example) and pass them to the lifecycle
+- Save the resulting image to the daemon
+
+Let's see the updated landscape diagram after implementing the proposal.
+
+![](https://i.imgur.com/7xPpyMZ.png)
+
+
+The integration between *Lifecycle* and the *Daemon* is gone, and the *Platform* component now has more responsibilities: it no longer just forwards requests to the lifecycle.
+
+**Note** In this proposal, *Lifecycle* only interacts with filesystem storage, which avoids the complexity of determining whether the image is in a Registry or in the filesystem.
+
+
+# How it Works
+[how-it-works]: #how-it-works
+
+As we saw in the previous landscape diagram, the proposal involves changing the *Platform* and the *Lifecycle*. Let's check both in detail.
+
+## Platform
+
+This component takes on the responsibility of interacting with the Daemon to make the *Developer* experience easier. Let's suppose it prepares the images using a tool similar to [skopeo](https://github.com/containers/skopeo). The process looks like this:
+
+```shell=
+# Copy the run image from a daemon and save it locally
+
+> skopeo copy docker-daemon:alpine:latest oci:alpine:latest
+Getting image source signatures
+Copying blob 8d3ac3489996 done
+Copying config d539cd357a done
+Writing manifest to image destination
+Storing signatures
+
+> ls
+alpine
+
+# The structure in oci layout format
+
+> tree .
+.
+└── alpine/
+    ├── blobs/
+    │   └── sha256/
+    │       ├── 03..
+    │       ├── 69..
+    │       └── a0..
+    ├── index.json
+    └── oci-layout
+```
+
+This information will be sent to the *Lifecycle* during the invocation. Once the output from *Lifecycle* is received, *Platform* saves it into the Daemon. Let's suppose the output image is saved in a folder named `oci-dir`:
+
+```shell=
+.
+└── oci-dir/
+    ├── blobs/
+    │   └── sha256/
+    │       ├── 01..
+    │       ├── 55..
+    │       └── 86..
+    ├── index.json
+    └── oci-layout
+
+```
+
+*Platform* will push that image into a Daemon. An example of that process, using a tool similar to [skopeo](https://github.com/containers/skopeo), looks as follows:
+
+```shell=
+> skopeo copy oci:oci-dir:latest docker-daemon:my-oci-app:latest
+Getting image source signatures
+Copying blob e4ca327ec0e7 done
+Copying blob 55fae2d3d3bc done
+Copying blob 6faf60e5f26d done
+Copying blob e6c238050bf1 done
+Copying blob baa9584706b2 done
+Copying blob 963a25eb7bff done
+Copying blob fc094da8ae34 done
+Copying blob 9e1b79ed05fb done
+Copying config 01b31028ae done
+Writing manifest to image destination
+Storing signatures
+> docker images | grep my-oci-app
+my-oci-app   latest   01b31028ae5e   N/A   300MB
+
+```
+
+## Lifecycle
+
+The lifecycle phases affected by this new behavior are [Analyze](https://buildpacks.io/docs/concepts/components/lifecycle/analyze/), [Export](https://buildpacks.io/docs/concepts/components/lifecycle/export/) and [Create](https://buildpacks.io/docs/concepts/components/lifecycle/create/).
+
+
+The following new input is proposed to be added to these phases:
+
+| Input | Environment Variable | Default Value | Description |
+|-------|----------------------|---------------|-------------|
+| `` | `CNB_LAYOUT_DIR` | "" | The root directory where all the OCI images will be located, including the output image. A non-empty value for this environment variable enables the feature. |
+
+As we saw before, *Platform* needs to provide the *Lifecycle* with a store containing all the images needed to build the application. The proposed structure for this store can be summarized as follows:
+
+
+```shell=
+.
+└── /
+    ├── /
+    │   ├── blobs/
+    │   │   ├── (2)
+    │   │   ├── (3)
+    │   │   └── (4)
+    │   ├── oci-layout
+    │   └── index.json (1)
+    └── /
+        ├── blobs/
+        │   ├──
+        │   ├──
+        │   └──
+        ├── oci-layout
+        └── index.json
+
+```
+
+* The root folder of the store will be passed to *Lifecycle* using the new `-layout` flag or the `CNB_LAYOUT_DIR` environment variable
+* *Lifecycle* will attempt to load the run image or the previous image from this store, based on the `-layout` flag or `CNB_LAYOUT_DIR` and the name of the image.
+
+Let's see some examples of the expected behavior of the *Lifecycle*. For simplicity, let's suppose the root store path was set using the environment variable.
+
+```shell=
+export CNB_LAYOUT_DIR=oci
+
+# phase is one of {analyzer|exporter|creator}
+> cnb/lifecycle/$phase [-run-image|-previous-image]
+```
+
+| Image name provided | Expected behavior |
+| -------- | -------- |
+| `some-image` | load from $CNB_LAYOUT_DIR/some-image directory |
+| `some-image:0.0.1` | load from $CNB_LAYOUT_DIR/some-image directory and enforce that the `org.opencontainers.image.ref.name` annotation saved in $CNB_LAYOUT_DIR/some-image/index.json is equal to 0.0.1 |
+| `some-image:sha256:03...b` | load from $CNB_LAYOUT_DIR/some-image directory and enforce that the `manifest.digest` is equal to 'sha256:03...b' |
+| `gcr.io/my-org/some-image` | load from $CNB_LAYOUT_DIR/gcr.io/my-org/some-image directory |
+| `gcr.io/my-org/some-image:0.0.1` | load from $CNB_LAYOUT_DIR/gcr.io/my-org/some-image directory and enforce that the `org.opencontainers.image.ref.name` annotation saved in $CNB_LAYOUT_DIR/gcr.io/my-org/some-image/index.json is equal to 0.0.1 |
+| `gcr.io/my-org/some-image:sha256:03...b` | load from $CNB_LAYOUT_DIR/gcr.io/my-org/some-image directory and enforce that the `manifest.digest` is equal to 'sha256:03...b' |
+
+Once the input image references are loaded by the *Lifecycle*, it will keep its current behavior and, during the export phase, save the output in [OCI image layout](https://github.com/opencontainers/image-spec/blob/main/image-layout.md)
+format at the same store path defined by the user.
+
+### Proof of concept
+
+As part of this RFC we built a small PoC on [Lifecycle](https://github.com/buildpacks/lifecycle/pull/793) and [Pack](https://github.com/buildpacks/pack/pull/1314) that allows the user to export the image to their local machine.
+
+```shell=
+> ./out/pack build oci-example --lifecycle-image \
+ lifecycle-layout --builder cnbs/sample-builder:bionic \
+--verbose \
+--path /Users/jbustamante/workspace/buildpack.io/samples/apps/java-maven \
+--tag latest \
+--oci-dir .
+```
+
+
+In the following GIF we can see the output image exported to the filesystem in OCI layout format:
+
+![](https://i.imgur.com/MgHolnW.gif)
+
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+
+- A major drawback is look-ups. All known and anticipated images must be available on disk, since there's no registry or daemon to do further look-ups against during the build process.
+- Exploding the images onto disk could affect performance and be very costly.
+- This approach could conflict with the implementation of the Dockerfile [RFC](https://github.com/buildpacks/rfcs/blob/dockerfiles/text/0000-dockerfiles.md). If we try to customize a base image using a Dockerfile, there is no way for the Platform to know that it needs to pull an image declared in a `FROM` statement and save it into the filesystem storage.
+
+# Alternatives
+[alternatives]: #alternatives
+
+## Lifecycle hybrid mode approach
+In this solution the *Lifecycle* is able to determine whether the image should be loaded from the filesystem storage or from a Registry.
+It's similar to the original solution proposed, but it could simplify the task for the *Platform*.
+
+
+![](https://i.imgur.com/0ROGR4E.png)
+
+
+### Drawbacks
+* Exploding the images onto disk could affect performance and be very costly
+
+## Lifecycle registry only approach
+
+In this solution the *Lifecycle* ONLY interacts with a registry to pull and push the required images. We forget about the OCI layout format, and the responsibility for interacting with the daemon is still moved to the *Platform*.
+
+![](https://i.imgur.com/P0NxdMw.png)
+
+### Drawbacks
+* The biggest downside of this is probably how a platform like pack makes this work. It can stand up a registry, but I'm not sure how reliable it would be to expose that registry to the containers running on Docker without additional configuration in the Docker preferences (to allow insecureRegistries, for instance). Otherwise it would need to create a cert and have that trusted when containers boot.
+
+## Lifecycle daemon wrapper approach
+
+This solution is a variant of the registry only approach, but instead of delegating all the responsibility to *Platform*, a new system called *Wrapper Registry* is created. This component is responsible for exposing a Registry API and for synchronizing the data from this registry into the daemon.
+
+The high level idea is summarized in the following landscape diagram
+
+![](https://i.imgur.com/dpbDSpg.png)
+
+If we zoom in on the *Wrapper Registry* component
+
+![](https://i.imgur.com/qilyDoe.png)
+
+The *Daemon Sync* component must take care of synchronizing the data saved in the ephemeral registry with the daemon.
+As for implementation, there is a suggestion to consider [lazy image distribution](https://github.com/containerd/stargz-snapshotter) to avoid affecting performance.
+
+### Drawbacks
+
+## Lifecycle pluggable architecture approach
+Based on the PoC results, the hard work of enabling the export of images to OCI layout format was done by implementing the [Image interface](https://github.com/buildpacks/imgutil/blob/main/image.go) in imgutil. The idea is to use [Go plugins](https://pkg.go.dev/plugin) and convert the implementation of this interface into an external module that will be injected into the *Lifecycle* at runtime.
+
+The high level idea is summarized in the following landscape diagram
+
+![](https://i.imgur.com/oqfVxia.png)
+
+The *Plugin System* is external to the lifecycle because its code resides outside the *Lifecycle* implementation, and it could probably be designed as a new ecosystem, like the Buildpacks. To make the point clear, let's take the following example [code](https://github.com/buildpacks/lifecycle/blob/4ebc4456001e540792e9eef04706864ff1faeeb4/cmd/lifecycle/analyzer.go#L235) from the Analyzer in the Lifecycle:
+
+```go=
+func (aa analyzeArgs) localOrRemote(fromImage string) (imgutil.Image, error) {
+	if aa.useDaemon {
+		return local.NewImage(
+			fromImage,
+			aa.docker,
+			local.FromBaseImage(fromImage),
+		)
+	}
+
+	return remote.NewImage(
+		fromImage,
+		aa.keychain,
+		remote.FromBaseImage(fromImage),
+	)
+}
+
+```
+
+This code is a [Factory Method](https://en.wikipedia.org/wiki/Factory_method_pattern) responsible for creating an `imgutil.Image` interface instance. If we replaced that logic with some kind of `Plugin Engine` that delegates the instantiation of the interface to an external module, the complexity would move out of the *Lifecycle* and into those plugins. A more detailed interaction is shown in the following container diagram of the *Lifecycle*.
+
+![](https://i.imgur.com/TUl5hAR.png)
+
+Platform will be responsible for injecting the plugins into the *Lifecycle* executables at runtime, and those plugins will do the job of interacting with the sources of the images.
+
+The same abstraction can be applied in other sections of the core workflow and phases to expose interfaces that can be categorized as **pluggable**, which could help the community extend lifecycle behavior easily.
+
+### Drawbacks
+* Go plugins are not supported on Windows
+
+## Lifecycle export ONLY layers above base image
+
+Is there an alternative here for lifecycle exporting an OCI layout _just_ for the layers above the base image?
+
+Could the platform be responsible for outputting the base image + lifecycle output?
+
+### Drawbacks
+
+* On simple platforms like Tekton, how might they produce this final image? Given a layout or tar of the top layers from a build and the ref of the run image, could they use an existing tool to stitch that together?
+
+* Another drawback is probably somewhat related. Signing images would not be possible inside of lifecycle if it doesn't end up creating the final image.
+
+# Prior Art
+[prior-art]: #prior-art
+
+
+# Unresolved Questions
+[unresolved-questions]: #unresolved-questions
+
+
+# Spec. Changes (OPTIONAL)
+[spec-changes]: #spec-changes