A common science-team workflow is to run several set-up steps to start a local Dask cluster on the Sandbox; these steps are not needed on the NCI (see GeoscienceAustralia/dea-notebooks#528).
Some of these steps, including dask.config.set, configure_s3_access and perhaps closing previous clients, seem like they could be run behind the scenes in the Sandbox, removing the need for users to run them every time a notebook is launched there.
Making Dask easier to use in this way could greatly improve performance for much of the code on the Sandbox (e.g. speed improvements of 200% for several of our notebooks), which in turn would improve the user experience during demonstrations.
import os
import dask
from datacube.utils.dask import start_local_dask
from datacube.utils.rio import configure_s3_access

if 'AWS_ACCESS_KEY_ID' in os.environ:
    # Sandbox: configure the dashboard link to go over the proxy
    dask.config.set({"distributed.dashboard.link":
                     os.environ.get('JUPYTERHUB_SERVICE_PREFIX', '/') + "proxy/{port}/status"})
    # close previous client if any
    client = locals().get('client', None)
    if client is not None:
        client.close()
        del client
    # start up a local cluster
    client = start_local_dask(mem_safety_margin='3Gb')
    # configure GDAL for S3 access
    configure_s3_access(aws_unsigned=True,
                        client=client);
else:
    # close previous client if any
    client = locals().get('client', None)
    if client is not None:
        client.close()
        del client
    # start up a local cluster
    client = start_local_dask(mem_safety_margin='3Gb')

# show the dask cluster settings
display(client)
robbibt changed the title from "Automate Sandbox-specific steps to set up local Dask cluster" to "Automate Sandbox-specific steps for setting up local Dask cluster" on Feb 26, 2020
At this stage I think we're going to wrap the above code in a dea-notebooks function, so that we can at least use Dask in notebooks more easily without scaring users off with the code block above. @Kirill888 @tom-butler @alexgleith, can you let @cbur24 or me know if there's any progress on getting some of those Sandbox-specific steps automated in the Sandbox?
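A rough sketch of what such a wrapper might look like, reusing only the calls from the code block in the issue. The function names `dashboard_link` and `create_local_dask_cluster` and the parameters `spare_mem` and `previous_client` are hypothetical, chosen here for illustration; this is not an existing dea-notebooks API.

```python
import os


def dashboard_link(environ=os.environ):
    """Build the proxied dashboard link template (hypothetical helper)."""
    prefix = environ.get("JUPYTERHUB_SERVICE_PREFIX", "/")
    return prefix + "proxy/{port}/status"


def create_local_dask_cluster(spare_mem="3Gb", previous_client=None):
    """Hypothetical wrapper: start a local Dask cluster and return its client."""
    # Imported lazily so this module can be imported without these packages.
    import dask
    from datacube.utils.dask import start_local_dask
    from datacube.utils.rio import configure_s3_access

    # Close a client left over from an earlier run, if one was passed in.
    if previous_client is not None:
        previous_client.close()

    # Same Sandbox check as the original code block.
    on_sandbox = "AWS_ACCESS_KEY_ID" in os.environ
    if on_sandbox:
        # Route the dashboard link through the JupyterHub proxy.
        dask.config.set({"distributed.dashboard.link": dashboard_link()})

    client = start_local_dask(mem_safety_margin=spare_mem)

    if on_sandbox:
        # Configure GDAL for S3 access, passing the client as in the original.
        configure_s3_access(aws_unsigned=True, client=client)

    return client
```

In a notebook this would reduce the set-up to something like `client = create_local_dask_cluster(previous_client=locals().get('client'))` followed by `display(client)`, with the same behaviour on the Sandbox and the NCI.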
Most of this should probably be done in the dask config file instead; then a notebook can just include a simple client connection line (similar to datacube).
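To make the config-file idea concrete, here is a minimal sketch that generates the YAML fragment equivalent to the `dask.config.set` call in the issue. It assumes the Sandbox image could write this fragment into one of the directories Dask searches for configuration (for example `~/.config/dask/`) when a user's server starts. The function name `dask_config_yaml` is hypothetical, and S3 access would still need to be handled separately.

```python
import os


def dask_config_yaml(service_prefix):
    """Return a YAML fragment that sets the proxied dashboard link."""
    # "{port}" is left as a literal placeholder for Dask to fill in later.
    link = service_prefix + "proxy/{port}/status"
    return (
        "distributed:\n"
        "  dashboard:\n"
        f'    link: "{link}"\n'
    )


if __name__ == "__main__":
    # On the Sandbox, JupyterHub sets this variable for each user's server.
    prefix = os.environ.get("JUPYTERHUB_SERVICE_PREFIX", "/")
    print(dask_config_yaml(prefix))
```

With the dashboard link configured this way, the notebook itself would no longer need the `dask.config.set` step and could go straight to starting or connecting a client.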