Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tink Worker preprend internal docker registry to any action #469

Closed
gianarb opened this issue Mar 26, 2021 · 23 comments
Closed

Tink Worker preprend internal docker registry to any action #469

gianarb opened this issue Mar 26, 2021 · 23 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

@gianarb
Copy link
Contributor

gianarb commented Mar 26, 2021

Right now tink-worker prepends the internal registry to all the actions specified as workflow. This is unexpected behavior for many people coming from docker where by default the registry is docker.io. it also makes it impossible to have a nice copy/paste experience from ArtifactHub or from another teammate because you have to double-check that all the images are proxied to the internal registry (that's a good practice but should not be a strong requirement as it is now).

A lot of tools like containerd, and skopeo who want to be a bit more docker-agnostic forced the idea of a canonical name that requires to specify the registry. So redis:latest does not work but docker.io/redis:latest does.

A possible solution is:

  1. When an action image is not canonical (hello-world) we prepend the internal registry
  2. When an action image is canonical we do not prepend anything

We can use this function to identify if an action image needs the prefix https://github.com/containers/image/blob/a5061e5a5f00333ea3a92e7103effd11c6e2f51d/docker/reference/reference.go#L241

@mfranczy
Copy link

I wanted to raise an issue about that, nice that you did that already, I think the solution you proposed @gianarb is the best in that case.

@gianarb
Copy link
Contributor Author

gianarb commented Mar 26, 2021

I started to pick up the code and obviously "the devil is in the details". This is the function that pulls the image:

func (r *RegistryConnDetails) pullImage(ctx context.Context, cli *client.Client, image string) error {
	authConfig := types.AuthConfig{
		Username:      r.user,
		Password:      r.pwd,
		ServerAddress: r.registry,
	}
	encodedJSON, err := json.Marshal(authConfig)
	if err != nil {
		return errors.Wrap(err, "DOCKER AUTH")
	}
	authStr := base64.URLEncoding.EncodeToString(encodedJSON)

	out, err := cli.ImagePull(ctx, r.registry+"/"+image, types.ImagePullOptions{RegistryAuth: authStr})
	if err != nil {
		return errors.Wrap(err, "DOCKER PULL")
	}
	defer out.Close()
	if _, err := io.Copy(os.Stdout, out); err != nil {
		return err
	}
	return nil
}

As you can see this is where the registry gets appended: out, err := cli.ImagePull(ctx, r.registry+"/"+image.

We can do what I just suggested prepending the registry only if image is not canonical. I think it will work for publicly available images (we have to check), but for sure it won't work for the private images because there is no multi registry support yet.

		Username:      r.user,
		Password:      r.pwd,

Are passed via flag when tink-worker starts. And it supports only one registry.

I think we can test out this change and make it clear that only public image is supported outside of the internal registry (until we figure out how to properly add support for multiple registries).

I think those are two different issues:

  1. Do we want to prepend the internal registry every time? I think we do not want to do it.
  2. How do we support multiple registries?! (I would like to leave this out to its own issue)

@mfranczy
Copy link

I think we can test out this change and make it clear that only public image is supported outside of the internal registry (until we figure out how to properly add support for multiple registries).

that sounds good to me

Do we want to prepend the internal registry every time? I think we do not want to do it.

would be possible to have some option to do it? Something like prepend-internal-registry that defaults to true or ENV? I can imagine situations where indeed we don't want to prepend.

@gianarb
Copy link
Contributor Author

gianarb commented Mar 26, 2021

would be possible to have some option to do it? Something like prepend-internal-registry that defaults to true or ENV? I can imagine situations where indeed we don't want to prepend.

It hurts me to have options for everything because it means you have to understand them, and it will sound clear to you now because you are discussing this topic specifically, but for new people or in long term a binary with 123 flags won't be manageable or understandable. And any issue I look at has a request for a flag, or at least that's my feeling eheh.

My solution already gives you an indirect feature flag, do you want to avoid your image being prefixed with the internal registry? Specify its canonical name. wdyt @mfranczy ?

@mfranczy
Copy link

mfranczy commented Mar 26, 2021

It hurts me to have options for everything because it means you have to understand them, and it will sound clear to you now because you are discussing this topic specifically, but for new people or in long term a binary with 123 flags won't be manageable or understandable. And any issue I look at has a request for a flag, or at least that's my feeling eheh.

I see your point, fair enough.

My solution already gives you an indirect feature flag, do you want to avoid your image being prefixed with the internal registry? Specify its canonical name. wdyt @mfranczy ?

Okay you're right, giving this a second thought it's always better to be explicit over canoncial name if you don't want to use internal registry. I was just thinking about my weird dev test setup that's why I ask about option. We should follow this what you wrote.

@gianarb
Copy link
Contributor Author

gianarb commented Mar 26, 2021

I was just thinking about my weird dev test setup that's why I ask about option

Unfortunately, the setup is so convoluted that makes me tired as well 👍 I am glad we can make it better!

@moadqassem
Copy link
Member

@gianarb I would always take whatever the user adds. As a user, I am expecting the image that I am writing down, will be pull exclusively, without assuming any bundling underneath the hood. If I would like to use the internal registry then I have to be clear about that and states it in my image url. I know that has the draw back of duplication(if the repo is the same), however, that removes any obscure assumptions.

If the users wish to use one registry for all the images, then they should adjust that in the docker-compose file DOCKER_REGISTRY. If they are running tinkerbell directly on the machine they should adjust to this flag https://github.com/tinkerbell/tink/blob/master/cmd/tink-worker/cmd/root.go#L50. Eventually documenting this.

Regarding private repos and the usage of uname and pass can be added as what you suggested. having those two properties per action, if they are there then this is a private repo, if not then it is public.

IMO this way we have more flexibility, however, if you have a hard requirement to verify those images, then your solution sounds like a good candidate :)

P.S: most of the actions are probably gonna be generated out of a template, which means those image names(urls) are also probably gonna be generated out of some configurations. Thus duplication isn't really a major issue.

@mfranczy
Copy link

mfranczy commented Mar 26, 2021

If I would like to use the internal registry then I have to be clear about that and states it in my image url. I know that has the draw back of duplication(if the repo is the same), however, that removes any obscure assumptions.

Since the internal registry is integral part of the project (and should be used for integration or production setups), I think this is not obscure assumption, at least to me. I think of overwriting registry mostly for debug/development use case.

@moadqassem
Copy link
Member

Since the internal registry is integral part of the project (and should be used for integration or production setups)

@mfranczy Sure, the registry is an integral part of the project, but wouldn't be better to let the user to decide for that, instead of enforcing that on them. IMO the registry shouldn't be treated as a core part unconditionally. There are some examples where registries and mirrors are important, especially in cases of a) the provisioning env is offline, b) remote repos introduces pull limits(thinking of docker hub here 😉 ). the registry should be optional(somehow like an addon to the tinkerbell stack) not enforcing it. Now if that's the case, then this issue won't be here in the first place.

I think of overwriting registry mostly for debug/development use case.

If registry was treated as an addon instead of being a core component, then this won't be needed at all, as the provisioner would identify that, there is no registry configured, thus treat the image url as it is.

Anyway, don't wanna open a whole new workstation here(making registry optional) 😅, but I would always let the users decide which datasources they are interested in and how to access them. However, for the current approach which has been implemented, @gianarb introduced method is good. My only concern is, the corner cases that tag along(if it public then we support it, if not then 🤷‍♂️).

@gianarb
Copy link
Contributor Author

gianarb commented Mar 29, 2021

@gianarb introduced method is good. My only concern is the corner cases that tag along(if it public then we support it, if not then 🤷‍♂️).

Yes, it is not easy to implement. Today when looking at the code and edge cases I realized that probably we should not do it. I like the idea that everything passes for an internal registry but we should do what we thought a few months ago and configure the registry to act as a proxy to quay.io/tinkerbell-actions for sandbox. In this way, the community at least get access to all the images we publish to https://quay.io/organization/tinkerbell-actions

@gianarb
Copy link
Contributor Author

gianarb commented Mar 29, 2021

This is the PR tinkerbell/playground#73

@thebsdbox
Copy link
Contributor

From the conversations that we've had, the general consensus is to remove any sort of prefixing of repository internal or otherwise. Given that this is the docker engine executing containers from registry, we should follow the already existing logic.

<repository_address>/<image>:<tag>

The documentation will need changing in places to reflect this, but most people will just need to update their templates to ensure that the actions are picked up from quay.io or their local registry, usually 192.168.1.1 (in sandbox)

@gianarb
Copy link
Contributor Author

gianarb commented Apr 9, 2021

Every time I look at it I learn something new. Let me summarise. Or at least let me try.

First of all, you have to remember that the pull request needs to be authenticated, public images can be downloaded without a token. Right now tink-worker deals with only one registry, the internal one. That's why the tink-worker binary gets registry URL, password, and username from the outside at start time as a flag.

The way docker cli handles multi registry authentication is via configuration file docker login creates a config.json and so on...

Having only one registry simplifies how tink-worker works. Removing the single registry and the prefixed URL means that we have to replicate something similar (or the same) login docker cli implements. Having an internal registry as the only gateway for image pull simplifies authentication because we move complicity elsewhere (for example to the registry itself that will have to act as a proxy (managing authentication as well). Am I missing something?

@thebsdbox
Copy link
Contributor

Perhaps we could look at this from the other direction? Tinkerbell comes with it's own optional internal registry, that requires auth/certs etc.. which is great for internal actions perhaps that have security restrictions. Externally the actions images are all public, such as the official ones on quay.

@gianarb
Copy link
Contributor Author

gianarb commented Apr 9, 2021

Ok, Tink Worker supports only one authenticated registry a the time. It can be the internal one, or even quay/docker if you replace the internal one with them.

if we want to send unauthorized requests when the specified image does not start with the registry passed to tink-worker we can do that. It will suffer rate-limit set by registries such as Docker Hub and Quay leaving a not-so-great experience.

check how the image name starts is the only way I think right now have to figure out if an image targets a registry we do not know how to authenticate to. Another way is to pull with auth and fall back to authorized if it fails. It sounds convoluted to explain and unexpected.

The more I look at what we are proposing here and more I think: "we want to simplify playing with Tinkerbell but it won't help all the other use cases"

What we have today at this point sounds way more linear, we should probably document it more clearly. Everything has to pass to an internal registry, we can make the internal registry smarter with a better proxy configuration so we won't have to pull/push EVERY image. 😖 😕 😫

But the alternative implementations we discussed sound all flaky to me at this point 😭 But the idea that you can't copy-paste images randomly from the web is also sad.

The right solution at the end is to implement what Docker CLI does with proper support for multiple authorized registries in tink-worker... But yes it is not a quick fix probably also because it opens the question: HOW DO YOU PASS THE AUTH CONFIGURATION TO OSIE/Hook (tinkerbell/roadmap#37) ? 👍

@jacobweinstock
Copy link
Member

jacobweinstock commented Apr 9, 2021

Not being able to natively support a public registry is also a not-so-great experience.

As you've suggested, checking the image name for the action and if it contains the registry that needs auth we add the auth, and if the image name doesn’t contain the registry don't add the auth feels straightforward and solid. this wouldn't break the existing experience and would add the flexibility users are wanting. We'd also update docs to explain this and the concerns with public registries, rate limits, etc.

This feels like a step toward multiple authorized registries too.

very rough example

func (r *RegistryConnDetails) pullImage(ctx context.Context, cli *client.Client, image string) error {
	authConfig := types.AuthConfig{
		Username:      r.user,
		Password:      r.pwd,
		ServerAddress: r.registry,
	}
	encodedJSON, err := json.Marshal(authConfig)
	if err != nil {
		return errors.Wrap(err, "DOCKER AUTH")
	}
	authStr := base64.URLEncoding.EncodeToString(encodedJSON)
    
	// New  bit here vv
	var pullOpts types.ImagePullOptions{}
	if strings.Contains(image, r.registry) {
		pullOpts = types.ImagePullOptions{RegistryAuth: authStr}
	}

	out, err := cli.ImagePull(ctx, image, pullOpts)
	if err != nil {
		return errors.Wrap(err, "DOCKER PULL")
	}
	defer out.Close()
	if _, err := io.Copy(os.Stdout, out); err != nil {
		return err
	}
	return nil
}

@gianarb
Copy link
Contributor Author

gianarb commented Apr 9, 2021

Yep, that works @jacobweinstock it sounds like something you do not expect (similar to what we are trying to fix (who expects that everything has the internal registry pre-pended)).

If we see it as an improvement you can push your code because it works (I think strings.Contains open the door to unpleasent edge cases like, the DNS name of the internal registry can't be contained to the image name) 👍

@jacobweinstock
Copy link
Member

yeah, i agree with the strings.Contains and edge cases. the library you suggest at the beginning would probably be better

@gianarb
Copy link
Contributor Author

gianarb commented Apr 9, 2021 via email

@jacobweinstock
Copy link
Member

jacobweinstock commented Apr 9, 2021

apologies, I'm going to retract any suggestion to modify pullImage to add logic. I think that pullImage should be left generic and only pull images based on what is passed in. I'll even go as far as to say that the authConfig and authStr in pullImage should be removed.

Any logic around when to or not to use auth should be pushed up to the caller of pullImage. pullImage should probably take an additional parameter of types.ImagePullOptions. This improves this function to single purpose and makes it clear from what is passed in what it does.

looks like pullImage is being called here. So we should probably modify it there at the caller site to have the logic to determine if auth should or should not be passed in.

Also, strings.Contains is probably not the best way to go about this determination. If we dont like/want to use reference.ParseNamed then strings.HasPrefix might get us started on a more ideal way.

@metahertz
Copy link

metahertz commented May 15, 2021

Also adding my +1 after being caught out by the "invisible" append, especially when all the documentation examples (ubuntu install, for example) seem to simply pull the public quay.io images which would never work in reality.

Finally found the issue when digging through the tink-worker logs and finding docker complaining about 10.x.x.x/quay.io/image etc.

My usecase requires pulling remote, public registry images, which should be supported IMHO as all the private information the container needs is coming from the metadata service anyway. The initial suggestion of keeping current logic, but dropping back to "unauthenticated pull attempt" if a repo is specified in the workflow makes a lot of sense to me?

image:latest - Current functionality, internal registry appended and auth used.
quay.io/image:tag - pass directly to the docker pull, no auth.
anything.tld/image:tag - pass directly to the docker pull, no auth.

@jacobweinstock
Copy link
Member

Sounds like supporting other registries with and without auth is something that is wanted. Maybe a good next step is to start with a PR and work from there?

@tstromberg tstromberg added kind/bug Categorizes issue or PR as related to a bug. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Aug 27, 2021
@mmlb
Copy link
Contributor

mmlb commented Feb 8, 2022

Lets close this and discuss in #586

@mmlb mmlb closed this as completed Feb 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.
Projects
None yet
Development

No branches or pull requests

8 participants