feat(k8s): Make RunContainer detect/report more error conditions like image pull errors #331
Codecov Report
@@ Coverage Diff @@
## master #331 +/- ##
==========================================
+ Coverage 81.18% 82.20% +1.01%
==========================================
Files 68 68
Lines 5273 5472 +199
==========================================
+ Hits 4281 4498 +217
+ Misses 848 831 -17
+ Partials 144 143 -1
We have nothing to do when that happens.
we have to use different list options, so they can't share the same factory.
This needs access to the PodTracker's EventLister, so we create the function when initializing the containerTracker.
I've documented a bunch of other event reasons to facilitate expanded event handling in the future. Also, I keep running across these messages, so this helps me remember what they are and where they come from.
Now that RunContainer listens for k8s-state changes, the tests need to actually simulate the state change like the WaitContainer tests do.
I'll re-add it later when needed.
// known kubelet event reasons are listed here:
// https://github.com/kubernetes/kubernetes/blob/v1.23.6/pkg/kubelet/events/event.go

// kubelet image event reasons.
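For readers following along, here is a minimal sketch of what a locally maintained copy of the image-related reasons might look like. The constant names and the `isImagePullFailure` helper are hypothetical and not taken from this PR; the string values are the image event reasons from the kubelet `event.go` file linked above.

```go
package main

import "fmt"

// Hypothetical local names. The string values mirror the image event
// reasons defined in kubelet's pkg/kubelet/events/event.go (v1.23.6).
const (
	reasonPulling           = "Pulling"
	reasonPulled            = "Pulled"
	reasonFailed            = "Failed"
	reasonInspectFailed     = "InspectFailed"
	reasonErrImageNeverPull = "ErrImageNeverPull"
	reasonBackOff           = "BackOff"
)

// isImagePullFailure reports whether an event reason may indicate that
// the image could not be pulled. This is an illustrative helper only.
func isImagePullFailure(reason string) bool {
	switch reason {
	case reasonFailed, reasonInspectFailed, reasonErrImageNeverPull, reasonBackOff:
		return true
	}
	return false
}

func main() {
	for _, r := range []string{reasonPulling, reasonPulled, reasonFailed, reasonBackOff} {
		fmt.Printf("%-8s pull failure: %v\n", r, isImagePullFailure(r))
	}
}
```

Note that kubelet reuses `Failed` and `BackOff` for container create/start problems as well, so a real handler would probably also need to look at the event message before treating these as image pull errors.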
Would it make sense to just use their constants or wrap them at least so we know when they change?
They have not packaged the code so that it can be used externally.
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/events/event.go
If I do go get github.com/kubernetes/kubernetes, I get an error: module declares its path as: k8s.io/kubernetes
If I do go get k8s.io/kubernetes, I get an error that k8s.io/kubernetes@[version] requires k8s.io/api@v0.0.0: reading k8s.io/api/go.mod at revision v0.0.0: unknown revision v0.0.0
They seem to carefully curate what can be imported by external projects, so we can't import this.
Cool, just checking. I just noticed when I went to the code they were public on the package. So, thought they might be importable.
Thought I'd provide a bit of clarity on this subject...
The tl;dr is you can vendor from k8s.io/kubernetes, but it's not pretty or straightforward 😅
Here's the reason why go get k8s.io/kubernetes fails:
kubernetes/kubernetes#79384 (comment)
And to actually vendor from k8s.io/kubernetes, you have to add a replace directive for all nested packages:
kubernetes/kubernetes#79384 (comment)
If we want, there are people who've scripted the approach to this so updating the version is easier:
kubernetes/kubernetes#79384 (comment)
To see a real world example of what this looks like in the go.mod file:
Eww... All that work just to reuse some constants... Not worth it imo.
Agreed 😄
LGTM 🐬
LGTM
I'm running into some issues with this, so I'm moving it back to draft for now.
This is an alternate implementation of #279 using the SharedInformer infra to watch events introduced in #519.
Closes #279
Overview
The Docker Runtime surfaces image pull errors naturally because of the blocking synchronous requests to pull an image or start a container.
The Kubernetes Runtime has to watch for events that show any errors with pulling images.
RunContainer changes in k8s runtime
Wait until one of these conditions before returning:
WaitContainer changes in k8s runtime
Wait until one of these conditions before returning:
containerTracker additions
I added signal channels for Running and ImagePulled. I also added a channel for receiving image pull errors in RunContainer.
podTracker changes
I need pod Name and Namespace in more places, so I copy those into the podTracker now.
I was surprised that the SharedInformerFactory I used to create the Informer/Lister for watching the build Pod did NOT work for watching events. Apparently, events don't get labeled, so I had to add a separate eventInformerFactory that uses a FieldSelector instead of a LabelSelector when making the list/watch API calls.
I did not rename informerFactory to podInformerFactory because it can easily be used for other k8s resources in the future (if needed), just not for Events.
I also added the Event event handler funcs (wow that was a mouthful - events about Events).
podTracker.inspectContainerEvent() (called by the Event event handlers) is responsible for:
RunContainer
ImagePulled signal channel when an event indicates that the pull was successful
I also adjusted podTracker.inspectContainerStatuses() so that it can also:
Running signal channel when the container status shows that it is running.
TODO: