Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using environment variables to update values in worker-spec.yml #92

Closed
jhamman opened this issue Aug 7, 2018 · 11 comments · Fixed by #101
Closed

Using environment variables to update values in worker-spec.yml #92

jhamman opened this issue Aug 7, 2018 · 11 comments · Fixed by #101

Comments

@jhamman
Copy link
Member

jhamman commented Aug 7, 2018

In the dask configuration module, it is possible to use environment variables to update configuration values. It doesn't seem that is currently possible here as the "worker-spec.yml" file is actually a kubernetes spec, rather than a dask config file. I'm curious if there is a way to do that now or if it would be feasible to add such functionality.

somewhat related to #88, #45

Background for my particular use case: jupyterhub/kubespawner#193, jupyterhub/repo2docker#363

@mrocklin
Copy link
Member

mrocklin commented Aug 8, 2018

dask-kubernetes provides two ways to specify the worker template, either as a separate standalone file with a reference to that file in the config, or by placing the entire worker template into the config file. Here is our current blank config file

kubernetes:
  name: "dask-{user}-{uuid}"
  namespace: null
  count:
    start: 0
    max: null
  host: '0.0.0.0'
  port: 0
  env: {}

  worker-template-path: null

  worker-template: {}
    # kind: Pod
    # metadata:
    #   labels:
    #     foo: bar
    #     baz: quux
    # spec:
    #   restartPolicy: Never
    #   containers:
    #   - image: daskdev/dask:latest
    #     args:
    #       - dask-worker
    #       - --nthreads
    #       - '2'
    #       - --no-bokeh
    #       - --memory-limit
    #       - 6GB
    #       - --death-timeout
    #       - '60'
    #     resources:
    #       limits:
    #         cpu: "1.75"
    #         memory: 6G
    #       requests:
    #         cpu: "1.75"
    #         memory: 6G

So one solution would be to just not use worker-spec.yaml, and instead put the worker spec into dask's proper configuration system.

@mrocklin
Copy link
Member

mrocklin commented Aug 8, 2018

We could also merge the two in dask-kubernetes using dask.config.merge(worker_spec_from_file, dask.config.get('kubernetes.worker-template')).

@jhamman
Copy link
Member Author

jhamman commented Aug 8, 2018

Okay, thanks. It seems setting DASK_KUBERNETES__WORKER_TEMPLATE__SPEC__IMAGE along with moving the worker-template to the dask config file will suffice.

Upon further reflection, I'm curious if removing the worker-template file might simplify things here. This would emphasize the use of the dask-config tools which do provide a lot of flexibility.

@mrocklin
Copy link
Member

mrocklin commented Aug 8, 2018 via email

@jhamman
Copy link
Member Author

jhamman commented Aug 8, 2018

Fair enough. I don't have strong feelings here as it seems I can do everything I needed to with existing tooling. Happy to see this closed as is unless others have thoughts on the last topics.

@jhamman
Copy link
Member Author

jhamman commented Aug 22, 2018

So dask/dask#3893 was just merged. How do we feel about expanding environment variables when loading the dask configs from dask-kubernetes?

@mrocklin
Copy link
Member

mrocklin commented Aug 22, 2018 via email

@jhamman
Copy link
Member Author

jhamman commented Aug 28, 2018

I think that one question is "are there any fields that might have environment variables that we *don't *want to expand?"

Not that I'm aware of in this space. @jacobtomlinson, any thoughts here?

@jacobtomlinson
Copy link
Member

I'm trying to think of a scenario where this would be a problem.

I guess there is potential for another package to update the environment and then dask to expand something nasty into the config. e.g updating the docker image with a crypto miner.

Personally I wouldn't be concerned about things like this though. If we are worrying about the user executing malicious code I can think of much worse things than this.

@jhamman
Copy link
Member Author

jhamman commented Aug 29, 2018

On the security point, dask is already vulnerable to that sort of environment variable manipulation. The reason we are discussing this is because the image specification in the worker template is in a list of arguments and dask's use of environment variables only works for nested dictionaries.

@jhamman
Copy link
Member Author

jhamman commented Aug 29, 2018

But I agree with your final point, I'm not particularly concerned about this potential.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants