This repository includes easy-to-run example notebooks for Dask. They are intended to be educational and give users a start on common workflows.
They should be easy to run locally if you download this repository. They are also available on the cloud by clicking on the link below:
This repository is a great opportunity to start contributing to Dask. Please note that examples submitted to this repository should follow these guidelines:
- Run top-to-bottom without intervention from the user
- Not require external data sources that may disappear over time (external data sources that are highly unlikely to disappear are fine)
- Not be resource intensive, and should run within 2GB of memory
- Be clear and contain enough prose to explain the topic at hand
- Be concise and limited to one or two topics, such that a reader can get through the example within a few minutes of reading
- Be of general relevance to Dask users, and so not too specific on a particular problem or use case.
  As an example "how to do dataframe joins" is a great topic while "how to do dataframe joins in the particular case when one column is a categorical and the other is object dtype" is probably too specific
- If the example requires a library not included in `binder/environment.yml` then it should be installed with `pip` in the first cell of the notebook, with a brief explanation about what functionality the library adds. A brief example follows:

      ### Install Extra Dependencies

      We first install the library X for interacting with Y

      !pip install X
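
The last guideline above can be checked mechanically before submitting an example. The following is a minimal sketch, not part of this repository's tooling: the helper names `load_notebook` and `misplaced_pip_installs` are hypothetical, and it assumes that "first cell" means the first *code* cell and that the notebook is stored in the usual `.ipynb` JSON layout (a top-level `cells` list whose entries have `cell_type` and `source`).

```python
import json


def load_notebook(path):
    """Read an .ipynb file (plain JSON) into a dictionary."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def misplaced_pip_installs(notebook):
    """Return (cell_index, line) pairs for pip installs outside the first code cell."""
    code_cells = [
        (index, cell)
        for index, cell in enumerate(notebook.get("cells", []))
        if cell.get("cell_type") == "code"
    ]
    problems = []
    for position, (index, cell) in enumerate(code_cells):
        source = cell.get("source", "")
        # The notebook format stores source as a string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        for line in source.splitlines():
            stripped = line.strip()
            is_install = stripped.startswith(("!pip install", "%pip install"))
            # Assumption: installs are only acceptable in the first code cell.
            if is_install and position > 0:
                problems.append((index, stripped))
    return problems


# A tiny in-memory notebook used for illustration only.
example = {
    "cells": [
        {"cell_type": "markdown", "source": ["### Install Extra Dependencies\n"]},
        {"cell_type": "code", "source": ["!pip install X\n"]},
        {"cell_type": "code", "source": ["import dask\n", "!pip install Z\n"]},
    ]
}
print(misplaced_pip_installs(example))  # prints [(2, '!pip install Z')]
```

For a notebook on disk, `misplaced_pip_installs(load_notebook("my-example.ipynb"))` would apply the same check; the file name here is a placeholder.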

To update the environment specification in `binder/environment.yml`:

- Modify `binder/environment-base.yml` with new or updated dependencies
- Run a `linux/amd64` Docker container with `mamba` available. For example:

      docker run --platform=linux/amd64 -it --rm --mount type=bind,source=$(pwd)/binder,target=/binder condaforge/mambaforge /bin/bash

  This mounts the `./binder` folder in `/binder` in the Docker container
- Create the environment (run from the `/binder` directory inside the container, since the path is relative):

      mamba env create -f environment-base.yml

  This may take quite a while.
- Export the environment specification:

      mamba env export -n dask-examples --no-builds -f environment.yml
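
The steps above can also be chained into one non-interactive container run. The sketch below only builds the `docker run` argument list and prints it, so nothing is executed by accident. It rests on several assumptions: the helper name `build_update_command` is hypothetical, `-it` is dropped because no interactive shell is needed, absolute `/binder/...` paths replace the relative ones so the working directory inside the container does not matter, and `mamba` is assumed to be on the `PATH` for `bash -c` in the `condaforge/mambaforge` image.

```python
import shlex
from pathlib import Path


def build_update_command(binder_dir, env_name="dask-examples"):
    """Return a `docker run` argument list that creates and exports the environment."""
    binder_dir = Path(binder_dir).resolve()
    # The bind mount exposes the host folder as /binder inside the container,
    # so both files are addressed by absolute path.
    inner = (
        "mamba env create -f /binder/environment-base.yml && "
        f"mamba env export -n {shlex.quote(env_name)} --no-builds "
        "-f /binder/environment.yml"
    )
    return [
        "docker", "run", "--platform=linux/amd64", "--rm",
        "--mount", f"type=bind,source={binder_dir},target=/binder",
        "condaforge/mambaforge",
        "/bin/bash", "-c", inner,
    ]


cmd = build_update_command("binder")
print(shlex.join(cmd))
# To actually run it (requires Docker and may take quite a while):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems on the host; only the inner `bash -c` string is interpreted by a shell, and it contains no user-supplied text apart from the quoted environment name.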