-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make build layers read only for subsequent buildpacks #155
Draft
sambhav
wants to merge
5
commits into
buildpacks:main
Choose a base branch
from
sambhav:build-write-layers
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Changes from all commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
7413f05
Add a build-write flag for layers
sambhav 5548197
Add examples for build-write flag and add a caveat
sambhav e4b78ec
Move to a read only layers RFC
sambhav 041b1a5
Clarify output location for exported layers
sambhav 6cbf750
Add rfc feedback and add alternative proposal for different buildpack…
sambhav File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,104 @@ | ||
# Meta | ||
[meta]: #meta | ||
- Name: Make build layers read only | ||
- Start Date: 2021-04-15 | ||
- Author(s): [@samj1912](https://github.com/samj1912) | ||
- RFC Pull Request: (leave blank) | ||
- CNB Pull Request: (leave blank) | ||
- CNB Issue: (leave blank) | ||
- Supersedes: (put "N/A" unless this replaces an existing RFC, then link to that RFC) | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
This RFC proposes that build layers should be read-only for subsequent buildpacks. | ||
|
||
> NOTE: This RFC should be implemented as an atomic change alongside RFC https://github.com/buildpacks/rfcs/pull/163 | ||
|
||
# Definitions | ||
[definitions]: #definitions | ||
|
||
- Build layers: Layers from a specific buildpack that are available to subsequent buildpacks during the build process. | ||
- layerize: The process of converting a layer directory to a tarball which will be used to create the final application image during the export phase. | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
## Why should we do this? | ||
|
||
Currently the spec forbids layer modifictions by buildpacks that don't contribute a specific layer. This is not enforced by the lifecycle and leads to cases where subsequent buildpacks can unknowningly modify layer content as a side-effect of executing binaries provided by a build layer. Such unexpected modifications are also really hard to debug and pin-point the case as it can be really difficult to figure out which commands from subsequent buildpacks are causing them. | ||
|
||
On the other hand there are valid use cases for build layers that can be used as a collaborative workspace for multiple buildpacks and need to be cached in subsequent runs. | ||
|
||
## What use cases does it support? | ||
|
||
- Allowing buildpacks to explicitly indicate how the layers provided by them should be used and having this be guaranteed by the lifecycle. | ||
|
||
### Example use case for read-only build layers | ||
|
||
A build process consists of 2 buildpacks - a python buildpack and a pip buildpack | ||
|
||
1. Buildpack #1 provides a python binary that can be used by subsequent buildpacks but is also used in the app image | ||
2. Buildpack #2 uses the python binary to invoke pip to provide further dependencies. | ||
|
||
Unbeknownst to the buildpack author, python has a side-effect on execution where if it's installation directory is writable, it creates new cache objects or updates existing files. As a result, execution of buildpack #2 causes the content of the layers provided by buildpack #1 to change, causing the final layer imageID to change and hampers layer reproducibility and new layers have to be pushed out to the registry. | ||
|
||
This could easily be fixed if buildpack #1 only exposed its build layers as read-only to subsequent buildpacks or if it discarded the modifications made by subsequent buildpacks essentially making it "read-only" in the context of the build process. | ||
|
||
## What is the expected outcome? | ||
|
||
The spec and lifecycle is modified to support the above use cases. | ||
|
||
# What it is | ||
[what-it-is]: #what-it-is | ||
|
||
Updates to the lifecycle so that `build` layers created by buildpacks should be layerized immediately after the buildpack that provided the layer is finished with its build process. This way any modification made by the future buildpacks would be ignore while still making this layer available to future buildpacks. | ||
|
||
# How it Works | ||
[how-it-works]: #how-it-works | ||
|
||
The lifecycle will have to layerize all the necessary buildpack layers (either `launch = true` or `cache = true`) during the `build` phase of the lifecycle instead of the `export` phase. | ||
|
||
The layerized output from the `build` phase of lifecycle could be stored at `/<layers>/@exported` (which should not clash with any buildpack IDs) where `<layers>`. The exporter will just read from this to construct the final image. | ||
sambhav marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
For example the output could look like `<layers>/@exported/<buildpack-id>/<layer-name>.tar` and `<layers>/@exported/<buildpack-id>/<layer-name>.diffId` | ||
|
||
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
- Possibly mixing the concerns of the `build` and the `export` phases. | ||
|
||
# Alternatives | ||
[alternatives]: #alternatives | ||
|
||
- What other designs have been considered? | ||
|
||
We could change the the layer so that it is owned by root or a different user between the build steps of different buildpacks so that subsequent buildpacks cannot modify the layers created by other buildpacks. This would require elevated privileges during the build phase of the lifecycle. | ||
|
||
Alternatively we could also allow a buildpack to specify a different build `UID` and `GID` that it wants to run as. This will require the lifecycle to run a specific buildpack with the buildpack specified `UID` and `GID`. | ||
|
||
- Why is this proposal the best? | ||
|
||
This proposal allows you to achieve the above use cases without changing the expected UID/GID of the layers and without elevated permissions during the build phase. | ||
|
||
- What is the impact of not doing this? | ||
|
||
We could have unknown layer modifications by buildpacks which is currently a spec violation. | ||
|
||
sambhav marked this conversation as resolved.
Show resolved
Hide resolved
|
||
# Prior Art | ||
[prior-art]: #prior-art | ||
|
||
TODO | ||
|
||
# Unresolved Questions | ||
[unresolved-questions]: #unresolved-questions | ||
|
||
The impact this would have on the performance of the export step. The exact details of which parts of the lifecycle build and export phases that need to be changed. | ||
sambhav marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
How to handle the cases when users want a common build workspace that is cached. This will be covered in a subsequent RFC and that RFC and this RFC should be implemented as an "atomic" change to avoid existing use cases from being broken without a proper migration path. | ||
|
||
# Spec. Changes | ||
[spec-changes]: #spec-changes | ||
|
||
None. | ||
sambhav marked this conversation as resolved.
Show resolved
Hide resolved
|
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can a buildpack indicate that a layer should not be layerized until all buildpacks have run?
If a subsequent buildpack modified a layer from an existing buildpack, can we save that as a new layer? This may be inefficient overall, but it makes it much easy to develop buildpacks in some cases.