feat: implement watcher for oci bundles #4321

Merged

Contributor

@ricardomaraschini ricardomaraschini commented Apr 22, 2024

Description

Implements an OCI bundle watcher. This allows k0s to load new OCI bundles without requiring a restart of the process. The watcher acts upon Create() or Write() operations happening in the OCI bundles directory and events are debounced with a timeout of 10 seconds.

Fixes #4316

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

How Has This Been Tested?

  • Manual test
  • Auto test added

At this stage I have updated k0s locally and created/deleted/updated bundles manually in the OCI bundles directory. I have been checking the results by inspecting the logs (as follows) and with k0s ctr i ls:

root@ec:/usr/local/bin# journalctl -u k0scontroller.service -f | grep -i component=OCIBundleReconciler
Apr 22 10:49:44 ec k0s[2014911]: time="2024-04-22 10:49:44" level=info msg="started to watch events on /var/lib/k0s/images" component=OCIBundleReconciler
Apr 22 10:50:06 ec k0s[2014911]: time="2024-04-22 10:50:06" level=info msg="OCI bundle directory changed, reconciling" component=OCIBundleReconciler
Apr 22 10:50:08 ec k0s[2014911]: time="2024-04-22 10:50:08" level=info msg="Imported image docker.io/library/x:latest" component=OCIBundleReconciler
Apr 22 10:50:28 ec k0s[2014911]: time="2024-04-22 10:50:28" level=info msg="OCI bundle directory changed, reconciling" component=OCIBundleReconciler
Apr 22 10:50:29 ec k0s[2014911]: time="2024-04-22 10:50:29" level=info msg="Imported image docker.io/library/x:latest" component=OCIBundleReconciler
Apr 22 10:51:03 ec k0s[2014911]: time="2024-04-22 10:51:03" level=info msg="OCI bundle directory changed, reconciling" component=OCIBundleReconciler
Apr 22 10:51:04 ec k0s[2014911]: time="2024-04-22 10:51:04" level=info msg="Imported image docker.io/library/y:latest" component=OCIBundleReconciler
Apr 22 10:51:40 ec k0s[2014911]: time="2024-04-22 10:51:40" level=info msg="OCI bundle directory changed, reconciling" component=OCIBundleReconciler
Apr 22 10:51:59 ec k0s[2014911]: time="2024-04-22 10:51:59" level=info msg="Imported image docker.io/library/rhel-9-kubernetes-images-1.29.3:latest" component=OCIBundleReconciler
Apr 22 10:51:59 ec k0s[2014911]: time="2024-04-22 10:51:59" level=info msg="Imported image docker.io/kurl/rhel-7-k8s:1.29.3" component=OCIBundleReconciler
Apr 22 10:51:59 ec k0s[2014911]: time="2024-04-22 10:51:59" level=info msg="Imported image docker.io/library/dockerout-containerd-1.6.28-ubuntu-22.04:latest" component=OCIBundleReconciler

Checklist:

  • My code follows the style guidelines of this project
  • My commit messages are signed-off
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

implements an oci bundle watcher. this allows k0s to load new bundles
without requiring a restart of the process. the watcher acts upon create
or write operations happening in the oci bundles directory.

events are debounced with a timeout of 10 seconds.

Signed-off-by: Ricardo Maraschini <[email protected]>
@jnummelin
Member

The code might get a bit simpler / cleaner if you'd separate loading a single bundle file into its own func. Also, that way, when we see an FS event, we do not need to go through the entire dir. Although there are probably only a few files there anyway, so it should not be that big of an optimization. 😄

@ricardomaraschini
Contributor Author

ricardomaraschini commented Apr 23, 2024

> The code might get a bit simpler / cleaner if you'd separate loading a single bundle file into its own func. Also, that way, when we see an FS event, we do not need to go through the entire dir. Although there are probably only a few files there anyway, so it should not be that big of an optimization. 😄

I thought about that but haven't gone that route because I noticed we are connecting / disconnecting containerd's client every time. That, combined with the fact that we call client.ListImages(ctx) as some kind of "connection validation", made me believe there was more going on than meets the eye.

That all being said, I can certainly work to change what we have here. Do you think connecting / disconnecting from containerd for each file would be a problem? Are you aware of any sort of problem with long-lived containerd connections?

@ricardomaraschini ricardomaraschini marked this pull request as ready for review April 23, 2024 08:49
@ricardomaraschini ricardomaraschini requested a review from a team as a code owner April 23, 2024 08:49
create a function to import a single oci bundle. use it in our watcher.

Signed-off-by: Ricardo Maraschini <[email protected]>
@ricardomaraschini
Contributor Author

@jnummelin I added a function to import a single OCI bundle and now we are calling it from the watcher. Let me know what you think.

@jnummelin
Member

> Do you think connecting / disconnecting from containerd for each file would be a problem?

No, I don't think it'll be any issue. I mean there's not like gazillions of those files and it's "just" a local socket connection.

> Are you aware of any sort of problem with long-lived containerd connections?

I haven't seen anything but I don't think we've ever implemented ones either. So honestly, dunno 🤷

IMO it's better this way now where we can load single files based on events.

@twz123 WDYT?

@twz123
Member

twz123 commented Apr 24, 2024

> > Do you think connecting / disconnecting from containerd for each file would be a problem?
>
> No, I don't think it'll be any issue. I mean there's not like gazillions of those files and it's "just" a local socket connection.
>
> > Are you aware of any sort of problem with long-lived containerd connections?
>
> I haven't seen anything but I don't think we've ever implemented ones either. So honestly, dunno 🤷
>
> IMO it's better this way now where we can load single files based on events.
>
> @twz123 WDYT?

I'm not too concerned about the connect/probe/import/disconnect procedure for each individual bundle file. As you already said: there'll be just a few anyway and they won't change frequently. On the other hand, I wouldn't opt for a long-lived connection. That sounds like something that breaks after being idle for x weeks, just when a new file gets added.

But: instead of doing a directory listing in the loadAll method itself, we could add a new parameter with a slice of file names to import. We could inline the loadOne method again, and reuse a single connection for the whole slice, as it was before. Then we can still just use loadAll with a single-element slice from the watcher.

Member

@twz123 twz123 left a comment

Thanks for working on this!

9 review comments on pkg/component/worker/ocibundle.go (outdated, resolved)
ricardomaraschini and others added 5 commits April 24, 2024 20:56
Co-authored-by: Tom Wieczorek <[email protected]>
Signed-off-by: Ricardo Maraschini <[email protected]>
Co-authored-by: Tom Wieczorek <[email protected]>
Signed-off-by: Ricardo Maraschini <[email protected]>
Co-authored-by: Tom Wieczorek <[email protected]>
Signed-off-by: Ricardo Maraschini <[email protected]>
Co-authored-by: Tom Wieczorek <[email protected]>
Signed-off-by: Ricardo Maraschini <[email protected]>
Signed-off-by: Ricardo Maraschini <[email protected]>
@ricardomaraschini ricardomaraschini force-pushed the implement-watcher-for-oci-bundles branch from 95758d9 to 1dda734 Compare April 30, 2024 14:27
Signed-off-by: Ricardo Maraschini <[email protected]>
@ricardomaraschini
Contributor Author

@twz123 @jnummelin I think this is ready for another round of review. Let me know how it looks now. Thanks.

@ajp-io

ajp-io commented May 3, 2024

@twz123 @jnummelin Are you able to review this PR again sometime soon? We have a core feature that has a significant known issue until we get this fixed. Thanks!

@jnummelin jnummelin added this to the 1.30 milestone May 6, 2024
twz123 previously approved these changes May 6, 2024
Member

@twz123 twz123 left a comment

Looks good! Left a few small nits to be improved, if there's time and inclination.

3 review comments on pkg/component/worker/ocibundle.go (outdated, resolved)
Signed-off-by: Ricardo Maraschini <[email protected]>
@ricardomaraschini ricardomaraschini force-pushed the implement-watcher-for-oci-bundles branch from 36ce83b to debf414 Compare May 6, 2024 15:08
@ricardomaraschini
Contributor Author

@jnummelin @twz123 Thanks for your help so far, I really appreciate it. Are there any plans to backport this to v1.29 / v1.28? If needed, I can help with that.

@jnummelin
Member

We don't usually backport new features. One could argue, though, whether this is one. 😆 @twz123 WDYT, could/should we backport this?

@jnummelin jnummelin merged commit bd54328 into k0sproject:main May 7, 2024
75 checks passed
@jnummelin
Member

@ricardomaraschini Thanks for working on this 👍

@ricardomaraschini
Contributor Author

@jnummelin @twz123 Is there any chance of backporting this to the v1.29 and v1.28 release branches? This would be so valuable for our use case here. I tried to create a backport myself but, IIUC, I need to add the backport/release-1.2{8,9} label here and I don't have the permissions to do so. Welp.

@ricardomaraschini ricardomaraschini deleted the implement-watcher-for-oci-bundles branch May 15, 2024 15:15
@twz123
Member

twz123 commented May 16, 2024

Actually, I'm a pretty conservative guy when it comes to backports. I'd rather not port it back because it's a pretty substantial change that adds new behavior instead of fixing broken behavior. Shipping your own k0s builds with the backport applied wouldn't be an option for you in this case? (Just asking 🙃)

@twz123
Member

twz123 commented May 16, 2024

Another thing that comes to my mind: Eventually, we need to have a way to re-enable GC on containerd side for images that have previously been imported via k0s, but are no longer present. Otherwise, the images will pile up on the image fs, potentially leading to disk pressure in the long run.

Development

Successfully merging this pull request may close these issues.

autopilot update does not import new images if k0s version remains the same
4 participants