DaskKubernetesEnvironment expose dask scheduler dashboard #3412

Closed
mgsnuno opened this issue Oct 1, 2020 · 4 comments
Labels
enhancement An improvement of an existing feature

Comments

@mgsnuno

mgsnuno commented Oct 1, 2020

Current behavior

DaskKubernetesEnvironment doesn't expose the dask scheduler dashboard.

Proposed behavior

Exposing the dashboard is very useful for debugging and checking the cluster status.

dask-kubernetes allows this to happen by setting dask.config.set({"kubernetes.scheduler-service-type": "LoadBalancer"}). It creates a service for clients and workers to connect to (dask/dask-kubernetes#259 (comment)).

In DaskKubernetesEnvironment:scheduler_spec_file I do the following:

env:
  - name: DASK_KUBERNETES__SCHEDULER_SERVICE_TYPE
    value: LoadBalancer
  - name: DASK__DISTRIBUTED__COMM__TIMEOUTS__CONNECT
    value: "200"
  - name: DASK__KUBERNETES__DEPLOY_MODE
    value: remote

This doesn't work: no LoadBalancer service gets created.

@mgsnuno mgsnuno added the enhancement An improvement of an existing feature label Oct 1, 2020
@jcrist

jcrist commented Oct 7, 2020

I think you're missing an underscore in the above; it should be DASK__KUBERNETES__SCHEDULER_SERVICE_TYPE (double underscore after DASK).

If that doesn't work we'll need to delve deeper, but I think that should fix it.

@mgsnuno

mgsnuno commented Oct 8, 2020

Thanks @jcrist, that was definitely wrong. Unfortunately it didn't solve the problem: no service was created, so there is no LoadBalancer with an external IP to connect to. Any ideas on what I could try next?

@jcrist

jcrist commented Oct 8, 2020

Hmmm, I'm not sure. My first guess is that the config isn't being set properly, and so isn't picked up by the job that creates the cluster. You might debug this by adding an on_start hook to your DaskKubernetesEnvironment to log some stuff and see if the dask settings are being picked up.

def on_start():
    # log things of interest here.
    # untested, and you might want to log more stuff here, not sure
    import os
    import dask.config
    from prefect.utilities.logging import get_logger
    logger = get_logger()
    dask_envs = [k for k in os.environ if k.startswith("DASK")]
    logger.info("Dask environment vars: %s", dask_envs)
    logger.info("Dask kubernetes config: %s", dask.config.get("kubernetes"))

flow.environment.on_start = on_start
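
To complement the hook above, the mapping from variable names to config keys can also be checked offline, without a cluster. The sketch below is a standalone re-implementation (an assumption based on Dask's configuration docs, not a call into dask) of the rule that Dask strips the `DASK_` prefix, lowercases the rest, and treats each double underscore as a nesting separator. `env_to_config_key` is a hypothetical helper name. Under that rule a name starting with `DASK__` would produce a key with a leading underscore, so it may be worth comparing the output against the names used earlier in this thread.

```python
def env_to_config_key(name):
    """Return the dotted config key an env var name would map to, or None.

    Assumed rule (from Dask's configuration docs): strip the "DASK_" prefix,
    lowercase the remainder, and turn each "__" into a nesting level.
    """
    prefix = "DASK_"
    if not name.startswith(prefix):
        return None
    return name[len(prefix):].lower().replace("__", ".")


# Names taken from this thread, in both single- and double-underscore forms.
names = [
    "DASK_KUBERNETES__SCHEDULER_SERVICE_TYPE",
    "DASK__KUBERNETES__SCHEDULER_SERVICE_TYPE",
    "DASK__DISTRIBUTED__COMM__TIMEOUTS__CONNECT",
    "DASK__KUBERNETES__DEPLOY_MODE",
]

for name in names:
    print(f"{name} -> {env_to_config_key(name)}")
```

If the assumed rule holds, only the first name resolves to a key under `kubernetes`; the others resolve to keys beginning with `_kubernetes` or `_distributed`. Dask treats `_` and `-` in key names as interchangeable, so `kubernetes.scheduler_service_type` should match `kubernetes.scheduler-service-type`.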

@jcrist

jcrist commented Dec 21, 2020

Closing as stale. Note that DaskKubernetesEnvironment is deprecated in favor of using KubernetesRun with a DaskExecutor. See https://docs.prefect.io/orchestration/flow_config/overview.html for more information.

@jcrist jcrist closed this as completed Dec 21, 2020