Add RFC for dependency mirrors #302

Merged
merged 3 commits into from
Apr 9, 2024


dmikusa
Contributor

@dmikusa dmikusa commented Mar 20, 2024

Summary

Presently, you can either take the dependencies shipped with a Paketo buildpack or create a whole bunch of dependency mapping bindings to override each dependency individually. There is no convenient option to override all dependencies and point them to a mirror.

Big thanks to @bitgully for the idea and libpak implementation.

Readable.

@dmikusa dmikusa requested a review from a team as a code owner March 20, 2024 02:56
Signed-off-by: Daniel Mikusa <[email protected]>
@loewenstein

@dmikusa I think we should support cases where the mirrors might not have a common root. Just imagine that the platform team taking care of buildpacks in an air-gapped environment is not the same team as the one maintaining the mirrors.

Pointing nodejs.org/dist and github.com/<org>/<repo>/releases/download/<release>/<asset> to individually maintained mirror locations would be essential for us.

Something like this for bindings

github.com: mirrors.example.org/public-github/
nodejs.org: mirrors.example.org/node-dist/

and something similar for the environment variable (the separators to use would probably need careful thought).

@shresthaujjwal

Thanks for the RFC details and also the changes to libpak.
A few questions:

  • Will the packit library also get the same change?
  • Does the override env only override the host/scheme? In that case we need to make sure the relative path matches our internal mirror, or else this won't work.
  • Also, when is the provenance verification of the mirrored dependency done? Are we going to verify that the SHA matches the mirrored dependency?
  • Is this an interim solution till we get the decouple dependency from buildpack RFC in place?

Thank you

@bitgully

@loewenstein The current implementation allows for setting mirror URIs using the {originalHost} placeholder. E.g.: "https://mirrors.example.org/{originalHost}".
This would look for all "github.com" dependencies in "https://mirrors.example.org/github.com/...", and all "nodejs.org" dependencies in "https://mirrors.example.org/nodejs.org/..."
Would this be sufficient for you? Or do you have to separate your repositories by a custom path name other than the original hostname?
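
For readers following along, the {originalHost} substitution described above can be sketched roughly as follows. This is a minimal illustration, not the libpak implementation: the function name applyMirror is hypothetical, and the sketch assumes the dependency path is appended unchanged and that query strings are ignored.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// applyMirror is a hypothetical helper. It replaces the {originalHost}
// placeholder in the mirror URI with the dependency's hostname and then
// appends the dependency's original path. Query strings are dropped
// (an assumption of this sketch).
func applyMirror(mirror, dependency string) (string, error) {
	dep, err := url.Parse(dependency)
	if err != nil {
		return "", fmt.Errorf("parsing dependency URI: %w", err)
	}
	base := strings.Replace(mirror, "{originalHost}", dep.Hostname(), 1)
	// Avoid a double slash when the mirror URI ends with "/".
	return strings.TrimSuffix(base, "/") + dep.EscapedPath(), nil
}

func main() {
	out, err := applyMirror(
		"https://mirrors.example.org/{originalHost}",
		"https://nodejs.org/dist/example.tar.gz", // illustrative path
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

With a mirror value that has no placeholder, the same function simply prefixes the original path with the mirror URI.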

@bitgully

@shresthaujjwal

@loewenstein

Yes @bitgully, I do not necessarily have control over the path, so {originalHost} is not sufficient.

@bitgully

Alright, @loewenstein, we could expand this according to your suggestion above and use the current "uri" binding as default/fallback to apply to all dependencies that don't have an explicit binding matching their hostname. If no "default" binding is set, we could simply override the ones where the hostname matches a binding and download all others from their original (public) location.
@dmikusa: Do you think such enhancement would make sense?

@dmikusa
Contributor Author

dmikusa commented Mar 21, 2024

@loewenstein @bitgully

What I'm taking away is that {originalHost} is not sufficient because teams might not use the hostname to differentiate. They might use an arbitrary key, so we need something to allow {originalHost} to be more flexible.

I'm not seeing that we need to be able to have a separate mirror for every dependency. I don't think we could support this because it's going to create a lot of complexity. I also think that bindings are uniquely positioned to meet this need because they already allow you to remap every individual dependency.

What I think we could do and what I think meets the needs that @loewenstein mentioned is to offer some sort of {originalHost} mapping. Perhaps a different env variable, like BP_DEPENDENCY_MIRROR_PREFIX_MAP={"example.com": "public-example-com"}.

So for the given example, we'd set BP_DEPENDENCY_MIRROR=https://mirrors.example.org/{originalHost} and then we'd set BP_DEPENDENCY_MIRROR_PREFIX_MAP={"github.com": "public-github", "nodejs.org": "node-dist"}. Then when we go to substitute {originalHost}, we'd check the prefix mapping and apply that to the original host, so you'd get the desired URLs. I think we'd add a corresponding binding key/value for this as well, but I want to have an env variable option not just binding options.

Do you think that would work?

I don't like that it requires JSON embedded into the env variable, but I couldn't think of another way to do that. Thoughts on that as well?
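
A rough sketch of the prefix-map idea described in this comment, for illustration only. The variable names BP_DEPENDENCY_MIRROR and BP_DEPENDENCY_MIRROR_PREFIX_MAP come from the comment; the function applyMirrorWithPrefixMap and its behavior for unmapped hosts (keep the original hostname) are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strings"
)

// applyMirrorWithPrefixMap is a hypothetical helper. It decodes the JSON
// prefix map, looks up the dependency's hostname in it, and substitutes
// the mapped value (or the hostname itself when unmapped) for
// {originalHost} in the mirror URI.
func applyMirrorWithPrefixMap(mirror, prefixMapJSON, dependency string) (string, error) {
	prefixes := map[string]string{}
	if prefixMapJSON != "" {
		if err := json.Unmarshal([]byte(prefixMapJSON), &prefixes); err != nil {
			return "", fmt.Errorf("parsing prefix map: %w", err)
		}
	}
	dep, err := url.Parse(dependency)
	if err != nil {
		return "", fmt.Errorf("parsing dependency URI: %w", err)
	}
	host := dep.Hostname()
	if mapped, ok := prefixes[host]; ok {
		host = mapped
	}
	base := strings.Replace(mirror, "{originalHost}", host, 1)
	return strings.TrimSuffix(base, "/") + dep.EscapedPath(), nil
}

func main() {
	out, err := applyMirrorWithPrefixMap(
		"https://mirrors.example.org/{originalHost}",
		`{"github.com": "public-github", "nodejs.org": "node-dist"}`,
		"https://nodejs.org/dist/example.tar.gz", // illustrative path
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```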

@dmikusa
Contributor Author

dmikusa commented Mar 21, 2024

Is this an interim solution till we get the decouple dependency from buildpack RFC in place

@shresthaujjwal The other RFC is pretty large and ambitious. This RFC is something that we can implement pretty easily. In libpak, it required changes in two files, probably less than 100 LOC. The idea is that this is something we can give people now, pretty easily, that will help with managing dependencies & make the buildpacks more accessible.

I also think that even when we move dependencies out of buildpack.toml, we'll still want to support options for dependency mirrors. This establishes mirrors as a first-class option, and we'll need to continue to support it regardless of where the dependency metadata lives.

@dmikusa
Contributor Author

dmikusa commented Mar 21, 2024

@loewenstein you'd also mentioned these two points on the original issue:

Given the original dependency is https://github.com/SAP/SapMachine/releases/download/sapmachine-11.0.22/sapmachine-jdk-11.0.22_linux-x64_bin.tar.gz would have to take the full path into the mirror, i.e. /SAP/SapMachine/releases/download/sapmachine-11.0.22/sapmachine-jdk-11.0.22_linux-x64_bin.tar.gz, right?

Yes, I don't think we can do any path mapping. We can insert a static prefix, we can insert the original host, and we can probably do something like I mentioned above where we do a mapping from the original host to some other value. I don't think we can remap the entire URL though.

We have to assume that the mirror is going to have the same file structure as the original link, but with a different host/prefix.

So you could set BP_DEPENDENCY_MIRROR=https://www.example.com/mirrors and then https://github.com/SAP/SapMachine/releases/download/sapmachine-11.0.22/sapmachine-jdk-11.0.22_linux-x64_bin.tar.gz would be tried at https://www.example.com/mirrors/SAP/SapMachine/releases/download/sapmachine-11.0.22/sapmachine-jdk-11.0.22_linux-x64_bin.tar.gz.

Does that work?

Should we consider to document this new feature at least in the repository's repo and maybe also on the paketo.io documentation?

Yes, 100%. When the RFC is sorted, we'll document this on Paketo's Docs page. I was thinking under the config section where we have the dependency mapping documented.

@loewenstein

@dmikusa How does the additional environment variable make things any easier? Asked differently, how is a host map going to be complex?

I.e. in https://github.com/paketo-buildpacks/libpak/blob/7d4bdbcd28e1a042b07f887b770f05513cd24fbc/dependency_cache.go#L425-L440

func (d DependencyCache) setDependencyMirror(urlD *url.URL) {
        if d.DependencyMirror != "" {
                d.Logger.Bodyf("%s Download URIs will be overridden.", color.GreenString("Dependency mirror found."))
                urlOverride, err := url.ParseRequestURI(d.DependencyMirror)


                if strings.ToLower(urlOverride.Scheme) == "https" || strings.ToLower(urlOverride.Scheme) == "file" {
                        urlD.Scheme = urlOverride.Scheme
                        urlD.User = urlOverride.User
                        urlD.Path = strings.Replace(urlOverride.Path, "{originalHost}", urlD.Hostname(), 1) + urlD.Path
                        urlD.Host = urlOverride.Host
                } else {
                        d.Logger.Debugf("Dependency mirror URI is invalid: %s\n%w", d.DependencyMirror, err)
                        d.Logger.Bodyf("%s is ignored. Have you used one of the supported schemes https:// or file://?", color.YellowString("Invalid dependency mirror"))
                }
        }
}

we would have access to the mapping via d.DependencyMirror and to the actual dependency's host via urlD.Host if I am not mistaken.

@dmikusa
Contributor Author

dmikusa commented Mar 21, 2024

How does the additional environment variable make things any easier? Asked differently, how is a host map going to be complex?

Sorry, let me clarify what I meant. I mean more complex for the user/from a UX perspective.

With the map as you propose and as I understand what you propose (if I've misunderstood, apologies, please correct me), everyone would need to do it that way. Even if you just have one host.

With the second env variable, BP_DEPENDENCY_MIRROR works the same for everyone, and you only need to worry about BP_DEPENDENCY_MIRROR_PREFIX_MAP if you need that mapping.

I do agree that the implementation is basically the same either way. My main concern is just hiding additional complexity from users who don't need it. I'm not locked into this way of doing it, but I think that's an important consideration.

@shresthaujjwal

The idea is that this is something we can give people now, pretty easily, that will help with managing dependencies & make the buildpacks more accessible.

@dmikusa @bitgully I fully agree, this is a nice solution. Awesome work thinking outside the box.

If we are thinking of adding BP_DEPENDENCY_MIRROR_PREFIX_MAP, then I would recommend adding regex search-and-replace mapping to it as well, for teams that don't have control over the relative path (by relative URL path I mean something like /SAP/SapMachine/releases/download/sapmachine-11.0.22/sapmachine-jdk-11.0.22_linux-x64_bin.tar.gz). Changing the paths for all of our existing mirrors that are already set up in Artifactory is a large, painful process. My recommendation is something like this:

{
    "github.com": {
        "mirror": "public-github",
        "path" : [
            {
                "regex-search-replace-description": "Replace /SAP/SapMachine/releases/download with /internal-SAP/generic/releases",
                "regex-search": "",
                "regex-replace": ""
            },
            {
                "regex-search-replace-description": "Replace /SAP/SapMachine/releases/download with /internal-SAP/generic/releases",
                "regex-search": "",
                "regex-replace": ""
            }
        ]
    },
    "nodejs.org": {
        "mirror": "node-dist",
        "path" : [
            {
                "regex-search-replace-description": "Replace /SAP/SapMachine/releases/download with /internal-SAP/generic/releases",
                "regex-search": "",
                "regex-replace": ""
            },
            {
                "regex-search-replace-description": "Replace /SAP/SapMachine/releases/download with /internal-SAP/generic/releases",
                "regex-search": "",
                "regex-replace": ""
            }
        ]
    }
}

@loewenstein

With the map as you propose and as I understand what you propose (if I've misunderstood, apologies, please correct me), everyone would need to do it that way. Even if you just have one host.

I would see three options to keep simple things simple and make complex things possible:

  1. We could allow * as a key as in BP_DEPENDENCY_MIRROR=*:https://mirror.example.com/{originalHost}
  2. We could allow a simple string and a json string
  3. We could have a simple env var and require the use of bindings for the complex case

With 1. we would have to think about separators to allow multiple entries, so I would tend to either 2 or 3 - slightly leaning towards 2, because it also keeps the more complex case a little simpler.

The 2. could also be augmented with the proposed regex capabilities for even more.
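
To make option 2 concrete, here is a minimal sketch of how a value that is either a plain string or a JSON object might be told apart. It is only an illustration of the idea under discussion: the function name parseMirrorSetting is hypothetical, and using "*" as the key for the default mirror borrows the notation from option 1 as an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// parseMirrorSetting is a hypothetical helper. A value that starts with
// "{" is decoded as a JSON object of host -> mirror URI; anything else
// is treated as a single mirror stored under the "*" key (an assumed
// convention for "applies to every host").
func parseMirrorSetting(value string) (map[string]string, error) {
	trimmed := strings.TrimSpace(value)
	if strings.HasPrefix(trimmed, "{") {
		hosts := map[string]string{}
		if err := json.Unmarshal([]byte(trimmed), &hosts); err != nil {
			return nil, fmt.Errorf("parsing mirror map: %w", err)
		}
		return hosts, nil
	}
	return map[string]string{"*": trimmed}, nil
}

func main() {
	simple, _ := parseMirrorSetting("https://mirror.example.com/{originalHost}")
	fmt.Println(simple["*"])

	mapped, _ := parseMirrorSetting(`{"github.com": "https://mirror.example.com/public-github"}`)
	fmt.Println(mapped["github.com"])
}
```

A mirror URI can never begin with "{", so checking the first character is enough to distinguish the two forms in this sketch.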

@bitgully

It's nice to see all the enthusiasm for adding more flexibility. It's just that the core intention was to give the user a simple string to redirect from the public internet to an on-prem location; kind of like a proxy setting in the OS.

Whilst addressing some additional corner cases, I would consider dynamic path replacement out of scope here.
@dmikusa mentioned something similar above:

Yes, I don't think we can do any path mapping. We can insert a static prefix, we can insert the original host, and we can probably do something like I mentioned above where we do a mapping from the original host to some other value.

The proposed solution of using BP_DEPENDENCY_MIRROR_PREFIX_MAP to use different path prefixes for each hostname might cover @loewenstein's use case. This would, however, reach its limits again if someone wanted to point to distinct mirror server instances (e.g. one for github.com, one for maven.org). I'm not sure how many real-world scenarios would bump into this, but implementation-wise it would take the same effort to allow additional envs like BP_DEPENDENCY_MIRROR_GITHUB_COM and BP_DEPENDENCY_MIRROR_MAVEN_ORG. This would remove the need for embedded JSON and might also be easier for users to grasp.

@loewenstein

Whilst addressing some additional corner cases, I would consider dynamic path replacement out of scope here.

I agree. I was just hinting at the possibility to extend the solution if we choose 2. for the initial implementation.

The proposed solution of using BP_DEPENDENCY_MIRROR_PREFIX_MAP to use different path prefixes for each hostname might cover @loewenstein's use case.

I think that BP_DEPENDENCY_MIRROR_PREFIX_MAP is unnecessarily complex on the UX side. I.e. with

BP_DEPENDENCY_MIRROR=https://mirror.example.org/{originalHost}
BP_DEPENDENCY_MIRROR_PREFIX_MAP={"github.com":"public-github"}

we need two variables instead of one, and we need to understand that the PREFIX_MAP will influence the originalHost that gets injected.

Instead with

BP_DEPENDENCY_MIRROR={"github.com":"https://mirror.example.org/public-github/{originalHost}"}

this is a single env variable and it is quite easy to grasp what it is doing. Note: I added {originalHost} although it probably doesn't make much sense in this concrete example, but to show that I would keep this logic also for the mapping case.

At the same time, we can continue to support the following.

BP_DEPENDENCY_MIRROR=https://mirror.example.org/{originalHost}

The idea of separate env variables encoding the host in the name has its appeal. But it doesn't match too well with also supporting the binding, does it?

@bitgully

I was hoping, we could solve the bindings like this:

                                       File Content
/platform
    └── bindings
        └── dependency-mirror
            ├── default                https://mirror.example.org/{originalHost}
            ├── github.com             https://mirror.example.org/public-github
            ├── nodejs.org             https://mirror.example.org/node-dist
            └── type                   dependency-mirror

And the environment variables like that:

BP_DEPENDENCY_MIRROR              https://mirror.example.org/{originalHost}
BP_DEPENDENCY_MIRROR_GITHUB_COM   https://mirror.example.org/public-github
BP_DEPENDENCY_MIRROR_NODEJS_ORG   https://mirror.example.org/node-dist

For most users, there wouldn't be any additional complexity as they have to set only one (default) value. For those needing more detailed separation, the configuration pattern stays the same. One issue with using JSON in environment variables is that we might also have to escape the double quotes when setting them (e.g. in Kubernetes manifests).

@dmikusa
Contributor Author

dmikusa commented Mar 22, 2024

@bitgully 100%. I'd be fine with that approach.

One implementation detail, if we were to take that approach. How do we get from BP_DEPENDENCY_MIRROR_GITHUB_COM to something useful? Is it sufficient to translate GITHUB_COM to github.com? i.e. change _ to . and lowercase?

I worry a little about edge cases. If a URL has https://GitHub.com/foo/... and we have GITHUB_COM -> github.com, does that match the URL? or do we need to lowercase all the URLs before we do the replacement? Do we need case to matter in the env variable name? i.e. BP_DEPENDENCY_MIRROR_GitHub_com is what would match that URL and we don't change case?

There are also some limitations as to what you can put into an env variable's name. Could we get into a case where a URL has some character in it that's not a character someone could put into the env variable name? Do we care, since there's the option to fall back and use bindings?

@dmikusa
Contributor Author

dmikusa commented Mar 22, 2024

Maybe a variation on the env variable approach? It would avoid having the hostname in the env variable name. I think it would be OK as well because you can't have an = in the hostname, so splitting on the first = should give us reliable results.

For a single mapping:

BP_DEPENDENCY_MIRROR              https://mirror.example.org/{originalHost}

For multiple mappings:

BP_DEPENDENCY_MIRRORS_0   https://mirror.example.org/{originalHost}
BP_DEPENDENCY_MIRRORS_1   github.com=https://mirror.example.org/public-github
BP_DEPENDENCY_MIRRORS_2   nodejs.org=https://mirror.example.org/node-dist

  1. Setting BP_DEPENDENCY_MIRROR or just BP_DEPENDENCY_MIRRORS_0 has the same result.
  2. We could make BP_DEPENDENCY_MIRRORS_0 be the default always. Or we could have a special prefix to indicate the default like default= or *=? or even the lack of a hostname= prefix means the default. It doesn't really matter the order, the implementation just needs to know which one is the default.
  3. We could start the index with 1 if that makes more sense to people :)
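
The numbered-variable variation above could be parsed along these lines. This is a sketch under assumptions: the function name parseMirrorEntries is hypothetical, and it picks the "lack of a hostname= prefix means the default" rule from point 2. Because a mirror URI may itself contain "=" (for example in a query string), the sketch also treats an entry as the default when the text before the first "=" contains "://".

```go
package main

import (
	"fmt"
	"strings"
)

// parseMirrorEntries is a hypothetical helper that takes the values of
// BP_DEPENDENCY_MIRRORS_0, _1, ... and splits each one on the first "=".
// Entries without a hostname prefix become the default mirror.
func parseMirrorEntries(entries []string) (string, map[string]string) {
	defaultMirror := ""
	byHost := map[string]string{}
	for _, entry := range entries {
		host, uri, found := strings.Cut(entry, "=")
		if !found || strings.Contains(host, "://") {
			// No "hostname=" prefix: the whole entry is the default mirror.
			defaultMirror = entry
			continue
		}
		// Hostnames are case-insensitive, so normalize the key.
		byHost[strings.ToLower(host)] = uri
	}
	return defaultMirror, byHost
}

func main() {
	def, hosts := parseMirrorEntries([]string{
		"https://mirror.example.org/{originalHost}",
		"github.com=https://mirror.example.org/public-github",
		"nodejs.org=https://mirror.example.org/node-dist",
	})
	fmt.Println(def)
	fmt.Println(hosts["github.com"])
	fmt.Println(hosts["nodejs.org"])
}
```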

@dmikusa
Contributor Author

dmikusa commented Mar 22, 2024

@shresthaujjwal - Sorry, I think full path replacement is out of scope. We can handle some prefix mapping, like what's been discussed with hostnames and mapping, but doing replacements to the actual URLs is a lot more work and a lot more configuration that has to be conveyed to the buildpack so it can do the mappings.

I apologize if that doesn't fit into your current mirror structure and I acknowledge rebuilding a mirror can be a lot of work. For what it's worth, I do want to create some tooling to quickly create a mirror from the given buildpacks, something where you can give it a list of buildpacks and it'll go download all the dependencies and create the file structure. Once this RFC lands and we have all the details sorted out, I'll look into that.

@bitgully

Is it sufficient to translate GITHUB_COM to github.com? i.e. change _ to . and lowercase?

After digging out some documents from around the days when I was born, I'd argue it is sufficient. It seems RFC-921 and RFC-952 referenced here suggest hostnames should be considered case-insensitive and consist only of a-z, 0-9 or hyphens. No underscores, and periods are allowed only for separating domain levels. I also did a quick search of the paketo-buildpacks repos. It looks like there are around 20 different hostnames in use for dependencies, all of which comply with this naming convention.

I'd prefer not to complicate this unnecessarily and refrain from using wildcards and mappings (=, :) in the actual values.

@dmikusa
Contributor Author

dmikusa commented Mar 22, 2024

@bitgully Hyphens are the ones that worry me. You can have them in hostnames, but they're not allowed in env variable names. Some shells might support it, but some don't. I've experienced this pain in the past.

If there were to be a domain like exam-ple.com that wouldn't be mappable because there's no way to represent the hyphen in an env variable name. BP_DEPENDENCY_MIRROR_EXAM-PLE_COM isn't valid. My understanding is you have uppercase letters, lowercase letters, and underscores for the env variable name.

Since you looked (thank you very much) and we don't have any dependencies that would be impacted by this, maybe we go ahead with the approach anyway. I do like it as the simplest approach.

I do think we should maybe consider what we would do if a hostname comes up with a - in it though. Cause invariably it will happen right after we implement this :) Maybe a __ gets converted to a -? Any ideas?

@bitgully

Oh, now I get it. Sorry. Yes, we already have hostnames with hyphens in existing paketo-buildpacks (e.g. download.bell-sw.com). The __ would be okay, I guess. Bearing in mind that only a few users will come across those cases, I would still prefer such substitution over having numbered envs with the original hostname being part of the value.

This would only affect users that

  • have to separate their mirror locations by the original hostname,
  • use environment variables (not bindings) to do so and
  • hit a dependency that includes a hyphen.
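
The name translation discussed in the last few comments (lowercase, _ becomes ., and __ becomes -) can be sketched as below. The function name hostFromEnvName is hypothetical; the substitution rules are the ones floated in this thread, not a confirmed implementation.

```go
package main

import (
	"fmt"
	"strings"
)

const mirrorEnvPrefix = "BP_DEPENDENCY_MIRROR_"

// hostFromEnvName is a hypothetical helper that turns the suffix of a
// BP_DEPENDENCY_MIRROR_<HOST> variable name into a hostname. It returns
// false when the name carries no host suffix.
func hostFromEnvName(name string) (string, bool) {
	if !strings.HasPrefix(name, mirrorEnvPrefix) || len(name) == len(mirrorEnvPrefix) {
		return "", false
	}
	suffix := strings.TrimPrefix(name, mirrorEnvPrefix)
	// Replace double underscores first so they are not read as two periods.
	host := strings.ReplaceAll(suffix, "__", "-")
	host = strings.ReplaceAll(host, "_", ".")
	return strings.ToLower(host), true
}

func main() {
	for _, name := range []string{
		"BP_DEPENDENCY_MIRROR_GITHUB_COM",
		"BP_DEPENDENCY_MIRROR_DOWNLOAD_BELL__SW_COM",
		"BP_DEPENDENCY_MIRROR",
	} {
		host, ok := hostFromEnvName(name)
		fmt.Printf("%s -> %q (host-specific: %v)\n", name, host, ok)
	}
}
```

Since hostnames are case-insensitive, the dependency URL's hostname would also need to be lowercased before it is compared with the result.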

@dmikusa
Contributor Author

dmikusa commented Mar 22, 2024

Ok, sounds like it's time for me to update the RFC. I'll take all this feedback and get it integrated this weekend.

Please keep the feedback coming! Thanks all!

Signed-off-by: Daniel Mikusa <[email protected]>
@dmikusa
Contributor Author

dmikusa commented Mar 25, 2024

Ok, please take a second look. I've updated and added a section to include the additional hostname mapping capabilities.

I also took out the unresolved question. I'll post it here if anyone has feedback on it.

It is not clear if it would be helpful to support having a mirror that refers to http://localhost. As this RFC is presently defined, that would not work because https is strictly required. One could in theory generate a TLS public/private key pair and use that to provide https://localhost, but it is unclear if even that would be useful.

The buildpacks run within a container and so localhost refers to the network in the actual container where it's very unlikely that a mirror would be running.

The reference implementation has added specific provisions to ignore TLS certificate verification for https://localhost so that one could potentially do this, even though it doesn't seem practical.

Other options would be to not do anything, in which case http://localhost is forbidden because it's http-only and https://localhost is forbidden because certificates would not verify, or to specifically allow an exception for http://localhost but not http-only for any other domains.

Where I'm landing on this is:

  1. I don't think we should have any special handling for localhost. At least not by default. If it turns out we need that, then we should enable an option to specifically add that. Since there's no clear use case for it now, I don't think it should be added to the RFC, and we should leave that for someone to raise as an issue when and if a use case emerges.

  2. We might want to consider having some special handling for host.docker.internal. This is a special domain name under Docker that resolves to the host where the container is running. I could see this being beneficial as you might want to run a server on your host and have the container reference that. Using host.docker.internal is a convenient way to do that. I'm leaning towards no because of 3.), or because you could use ca-certificates buildpack to trust your local server. If folks feel strongly that we should add special handling, like to trust all certs or allow http:// traffic, then please let me know.

  3. If running builds with Docker, then using the file:// scheme for your BP_DEPENDENCY_MIRROR and pointing to a volume mount allows you to achieve a very similar effect to using a localhost or host.docker.internal mirror. I would say that this should be documented as the preferred option for local builds with Docker. Again, please let me know if you see problems with that.
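
The scheme rule these points rely on (https is required, file:// is the alternative for local builds, and http://localhost gets no special handling) could be checked like this. A minimal sketch: validateMirror is a hypothetical name and the file:///mirror path stands in for an assumed volume mount.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// validateMirror is a hypothetical check that accepts only https:// and
// file:// mirror URIs, with no exception for localhost.
func validateMirror(mirror string) error {
	u, err := url.Parse(mirror)
	if err != nil {
		return fmt.Errorf("invalid dependency mirror %q: %w", mirror, err)
	}
	switch strings.ToLower(u.Scheme) {
	case "https", "file":
		return nil
	default:
		return fmt.Errorf("invalid dependency mirror %q: scheme must be https or file", mirror)
	}
}

func main() {
	for _, mirror := range []string{
		"https://mirror.example.org/{originalHost}",
		"file:///mirror", // assumed mount point of a Docker volume
		"http://localhost:8080",
	} {
		fmt.Printf("%s -> %v\n", mirror, validateMirror(mirror))
	}
}
```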

Thanks for all the great feedback!

@bitgully

The updated description looks good to me and I assume the mirror feature will become quite popular for those deploying paketo-buildpacks in corporate networks.

Regarding the unresolved question:
I couldn't think of a use case either, but I assume performing such local builds would be more manual/static in nature where option 3 (using file://), classic dependency mappings, or the mentioned use of the ca-certificates buildpack might be enough.

@loewenstein

Why do we require a default mirror to be set? If none is specified for a specific host I'd say we can just try to access the original host.

@dmikusa
Contributor Author

dmikusa commented Mar 25, 2024

Why do we require a default mirror to be set? If none is specified for a specific host I'd say we can just try to access the original host.

I was thinking about it from the security use case. If I'm setting a mirror to ensure that certain dependencies (perhaps ones our security team has approved) are consumed, then I don't want to have a case where the buildpack could accidentally pull other dependencies.

Requiring the dependency mirror means that it would pull from a mapping or fall back to the default.

Also, the way I've been thinking about mirrors and this feature is that using a dependency mirror signals that I want to redirect everything to my mirror. If I only wanted to redirect certain dependencies, then I'd use a dependency mapping like we have already.

How are others thinking about this? Is there a concrete use case for being able to use this feature to only map certain things?

@loewenstein

Is there a concrete use case for being able to use this feature to only map certain things?

It's not about a concrete use case for partial mapping actually. It's more about not having a meaningful default.

@dmikusa
Contributor Author

dmikusa commented Mar 25, 2024

Is there a concrete use case for being able to use this feature to only map certain things?
It's not about a concrete use case for partial mapping actually. It's more about not having a meaningful default.

Should we fail if there's no default then?

So options would be:

  • Single mapping, we always use this.
  • Host-name based mapping w/out default. It fails if there is no mapping for a given host.
  • Host-name based mapping w/default. It uses the default if there is no mapping for a given host, which might fail too if the dep doesn't exist on that mirror.

I'd be OK with that as it still preserves the concept that a mirror remaps everything (whereas dependency mappings are used for partial remappings).


I don't think we should do this for the RFC, but if we get an issue or a future use case surfaces where falling back to the original URLs is the right thing to do then we could add a configuration flag to make that work. I personally feel that the expectation if you use a mirror is that everything gets resolved from the mirror. If it can't resolve from the mirror, trying somewhere else feels magical and unexpected to me.

If others feel differently please let me know. That's just my opinion.

@bitgully

  • Host-name based mapping w/out default. It fails if there is no mapping for a given host.

I guess it would be okay to fall back to the original URL if no default is set. Even though it might not be reachable from most networks.

Maybe we cannot assume that all mirrors are resolving all types of artifacts. There might be cases where access to the internet would be available but certain host names should still be mapped to a local mirror.

E.g.: There is a mirror that provides docker images only. All images should be pulled from the mirror for performance reasons or because vulnerability scanning and quarantining are done here. But no other artifacts (binaries etc.) are hosted on-prem due to a missing functionality or a lack of control over what is mirrored in a certain organization.

@bitgully

@dmikusa I can see your point regarding unexpected magic happening. Maybe this is caused a little by our choice of the "dependency mirror" terminology and way of thinking now; assuming there either is one or there is not.

I think we should bear in mind that the original URL (from buildpack.toml) is the true DEFAULT location. The proposed change would just give us an option
    a) to override specific hosts, and
    b) to override all hosts that are not defined in a).

The functionality is the same but I find conceptually shifting away from having a default mirror, host-specific mirrors and the original location as a potential fallback makes a difference.
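
The lookup order described here (host-specific override, then the catch-all override, then the original URL from buildpack.toml) might look roughly like this. It is a sketch of the proposal only: the mirrorConfig type and resolve method are hypothetical names, and dropping query strings is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// mirrorConfig is a hypothetical container for the two kinds of override.
type mirrorConfig struct {
	Default string            // applies to hosts without a specific entry; may be empty
	ByHost  map[string]string // lowercase hostname -> mirror URI
}

// resolve returns the URI to download from: a host-specific mirror if one
// is configured, otherwise the default mirror, otherwise the original URI.
func (c mirrorConfig) resolve(dependency string) (string, error) {
	dep, err := url.Parse(dependency)
	if err != nil {
		return "", fmt.Errorf("parsing dependency URI: %w", err)
	}
	host := strings.ToLower(dep.Hostname())
	mirror, ok := c.ByHost[host]
	if !ok {
		mirror = c.Default
	}
	if mirror == "" {
		// Nothing configured for this host: keep the original location.
		return dependency, nil
	}
	base := strings.Replace(mirror, "{originalHost}", host, 1)
	return strings.TrimSuffix(base, "/") + dep.EscapedPath(), nil
}

func main() {
	cfg := mirrorConfig{
		ByHost: map[string]string{"github.com": "https://mirror.example.org/public-github"},
	}
	for _, dep := range []string{
		"https://github.com/org/repo/releases/download/v1/asset.tgz", // illustrative
		"https://nodejs.org/dist/example.tar.gz",                     // illustrative
	} {
		out, err := cfg.resolve(dep)
		if err != nil {
			panic(err)
		}
		fmt.Println(out)
	}
}
```

Failing instead of falling back when no default is set, as suggested earlier in the thread, would only change the branch that returns the original URI.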

@loewenstein

I'd prefer optional mirror behaviour like @bitgully describes over failing when some hosts are not mapped like @dmikusa describes.
The reason is that we are not in a situation with strictly air-gapped environments right now. Taking dependencies from a mirror is preferred, but in a SHOULD, not a MUST, way.

@dmikusa
Contributor Author

dmikusa commented Mar 28, 2024

Thinking about it more, I'm fine with this. My concern was that calling it a mirror might set a false expectation, but given the way we're naming the env variables I think we're being as direct with what it's doing as we can be.

  • If you're setting BP_DEPENDENCY_MIRROR https://mirror.example.org/{originalHost}, then sure there's an expectation that everything goes to the mirror.

  • If you're setting BP_DEPENDENCY_MIRROR_GITHUB_COM https://mirror.example.org/public-github we have the hostname in the env variable so one should only expect that the particular host goes to the mirror.

Let me update the RFC, and we can see how things look. Sounds like we're getting close to agreement on everything. Maybe we can have a call for votes next week.

@TisVictress
Contributor

I'm working on a draft PR to packit with the above implementation.

@dmikusa
Contributor Author

dmikusa commented Mar 28, 2024

RFC has been updated. Please all take a look and let me know if there's any more feedback. If it's OK, please add your votes! Thanks

@bitgully

This looks good to me.


@loewenstein loewenstein left a comment


LGTM

@bitgully

bitgully commented Apr 2, 2024

The libpak implementation has been adapted accordingly: paketo-buildpacks/libpak#322

@ForestEckhardt
Contributor

I am struggling to understand how the hostname-specific mirror functions. If I have BP_DEPENDENCY_MIRROR_NODEJS_ORG=https://mirror.example.org/node-dist and a dependency URL that looks like https://nodejs.org/path/to/dep, does that turn into https://mirror.example.org/node-dist/path/to/dep or do we lose the back half?

@bitgully

bitgully commented Apr 3, 2024

@ForestEckhardt The back half is preserved and the URL turns into https://mirror.example.org/node-dist/path/to/dep.

Contributor

@ForestEckhardt ForestEckhardt left a comment


Looks good!

@dmikusa
Contributor Author

dmikusa commented Apr 3, 2024

Thanks everyone! We have enough votes to move this forward.

This is a call for any final comments. You have until the end of the day Friday April 5th.

@dmikusa dmikusa merged commit fb96b6a into main Apr 9, 2024
1 check passed
@dmikusa dmikusa deleted the dep-mirrors branch April 9, 2024 03:02
6 participants