Sidecars in Worker Template Causes Failure #378
Comments
Thanks for raising this! By default we name the worker container Do you have any interest in raising a PR to implement this?
Hey, thanks for taking a look at this @jacobtomlinson. I'm definitely interested in raising a PR. The other approach I was thinking of is to always select the first container. That would be 100% backwards compatible, because no one can run a cluster right now with more than one container, so if people overrode the container name in their worker-spec.yaml, their deployments will continue to work. I can do it either way, just let me know. The only complication is that I have a Mac with the M1 chip (ARM), so the test suite doesn't appear to be working (the dask dev docker container won't run on ARM, it seems). I might have some issues with the regression testing, but I'll see how far I can get.
That would work too, as long as the Dask container is always the first one. I have no preference really, as long as it's documented. Ah yeah, I also have an M1 Mac and Docker is just not possible. I tend to SSH to an amd64 machine. The CI should run the tests when you push up a PR, so feel free to open a draft PR and keep pushing commits to it to trigger the CI. I'll squash on merge anyway. I am curious what your use case is for running multiple containers inside one pod with
Perfect, thanks for the advice - that should work! Our use case here is to run machine learning models in batch. We build the ML models as a docker container and use Apache Arrow Flight as the external interface, so we can use the same container efficiently for both real time and batch use cases as a sidecar. It's kind of a more developed version of what I posted here: https://github.com/ehenry2/xgbatch
The classic
What happened:
Cluster failed to create when using a worker template that contains multiple containers (e.g. sidecar pattern).
The error:
I traced this back to the logs() function in the Pod class in core.py. It makes a call to read_namespaced_pod_log, which, when the pod has multiple containers, needs a "container=" argument passed to it with the name of the container.
What you expected to happen:
The cluster to be created correctly. I expected the logs() function to be smart enough to know which container is the dask container or iterate through each container until it recognized the logs it was looking for.
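To make the two options discussed in the comments concrete, here is a rough sketch of how the container could be chosen before reading logs. It is not the code in core.py; the helper name `pick_log_container`, the `preferred` parameter, and the example spec (container names and images) are all made up for illustration. The only real API referred to is the `container=` argument of read_namespaced_pod_log.

```python
from typing import Optional


def pick_log_container(pod_spec: dict, preferred: Optional[str] = None) -> str:
    """Return the name of the container whose logs should be read.

    Hypothetical helper: uses `preferred` if a container with that name
    exists in the spec, otherwise falls back to the first container.
    """
    containers = pod_spec.get("containers") or []
    if not containers:
        raise ValueError("pod spec has no containers")
    names = [c["name"] for c in containers]
    if preferred is not None and preferred in names:
        return preferred
    # Fallback: assume the Dask container is listed first.
    return names[0]


# Illustrative worker pod spec with a sidecar (names and images are assumptions).
spec = {
    "containers": [
        {"name": "worker", "image": "example/dask-image"},
        {"name": "model-sidecar", "image": "example/model-image"},
    ]
}

container = pick_log_container(spec)
print(container)

# The result would then be passed along when reading logs, e.g.:
#   core_api.read_namespaced_pod_log(name, namespace, container=container)
```

Falling back to the first container keeps single-container specs working regardless of what the container is called, at the cost of requiring the Dask container to be listed first when sidecars are present.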
Minimal Complete Verifiable Example:
And the worker-spec.yaml file:
Anything else we need to know?:
Environment: