
Automation #8 (Open)
wants to merge 8 commits into main
Conversation

postelrich (Collaborator)

This PR contains a bash script and the relevant asv configuration files to run benchmarking as a cron job. It is set up to run for both dask and distributed and to push the results to the gh-pages branch. You can already see the published benchmarks for both:

https://dask.github.io/dask-benchmarks/results/distributed
https://dask.github.io/dask-benchmarks/results/dask

Distributed was a little tricky because of its heavy dependency on dask. To work around this I set it up so that asv installs dask from GitHub master. Unfortunately this is not supported by asv, and a little hack was added to ensure a file ends up in the right place for HTML generation. I was also unable to run benchmarks for previous versions of distributed, because distributed.Client, which the benchmarks depend on, did not exist yet.
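To make the flow concrete, here is a minimal sketch of what the cron entry point could look like, assuming the config lives at $HOME/asv.dask.conf.json, $DASK_DIR points at the benchmark checkout, and asv's built-in gh-pages command is used for publishing; the actual script in this PR may differ:

#!/usr/bin/env bash
# Hedged sketch of the nightly cron job described above; the config path,
# $DASK_DIR, and the use of `asv gh-pages` are assumptions.
set -e
DASK_CONFIG=${DASK_ASV_CONFIG:-$HOME/asv.dask.conf.json}

cd "$DASK_DIR"
asv --config "$DASK_CONFIG" run NEW       # benchmark commits not yet run
asv --config "$DASK_CONFIG" publish       # regenerate the static HTML
asv --config "$DASK_CONFIG" gh-pages      # push the HTML to the gh-pages branch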

@mrocklin (Member)

At first glance everything here seems clear to me. However, I haven't actually run through things myself yet.

@TomAugspurger and @pitrou might also have thoughts.

DASK_CONFIG=${DASK_ASV_CONFIG:-$HOME/asv.dask.conf.json}
DISTRIBUTED_CONFIG=${DISTRIBUTED_ASV_CONFIG:-$HOME/asv.distributed.conf.json}

source activate dask-asv
Member

How is this environment created?

Collaborator (Author)

Ah yes, I forgot I wanted to add steps here that create the environment if it isn't already present.
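A minimal sketch of what that could look like, assuming the environment is named dask-asv as in the script and that asv is installable from the configured conda channels (otherwise it could be pip-installed into the env):

# Sketch: create the dask-asv environment only if it is not already present,
# then activate it as the script above does. Exact packages are assumptions.
if ! conda env list | grep -q "dask-asv"; then
    conda create -y -n dask-asv python=3 asv
fi
source activate dask-asv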


echo "Running dask benchmarks..."
cd $DASK_DIR
asv --config $DASK_CONFIG run NEW
Member

Should also run new benchmarks; see airspeed-velocity/asv#278.

Collaborator (Author)

Cool, will add.
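For context, one possible combination discussed around that issue (an assumption about the eventual script, not a confirmed snippet from it) is to pair the NEW range with an EXISTING pass that only fills in missing or failed results:

# Possible combination (assumption): NEW benchmarks commits that have never
# been run, while the EXISTING pass revisits already-benchmarked commits but
# skips benchmarks with previous successful results, so newly added
# benchmarks get filled in.
asv --config "$DASK_CONFIG" run NEW
asv --config "$DASK_CONFIG" run EXISTING --skip-existing-successful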

# can get it to work. However directory structure gets messed up and machine.json
# is not in the correct location to generate the graphs. Thus this hack to copy it
# to the right locations before running publish.
find /home/ec2-user/results/distributed/aws-ec2-c4.xlarge -type d -exec cp /home/ec2-user/results/distributed/aws-ec2-c4.xlarge/machine.json {} \;
Member

Is there a way to avoid hardcoding aws-ec2-c4.xlarge?

Collaborator (Author)

This whole line is just a workaround to install dask master. I have an open asv issue for it: airspeed-velocity/asv#494. I plan to take a look at applying pv's fix.
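Until that asv issue is resolved, one hedged way to avoid the hardcoded machine name could be to derive the machine directory from the results tree at runtime (directory layout assumed from the line above):

# Sketch: pick up the single machine directory instead of hardcoding
# aws-ec2-c4.xlarge; assumes exactly one machine directory under the results.
RESULTS_DIR=/home/ec2-user/results/distributed
MACHINE_DIR=$(find "$RESULTS_DIR" -mindepth 1 -maxdepth 1 -type d | head -n 1)
find "$MACHINE_DIR" -type d -exec cp "$MACHINE_DIR/machine.json" {} \;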

@postelrich (Collaborator, Author)

@pitrou OK, I added the conda env creation and the running of new benchmarks. It always reruns the distributed benchmarks at the EXISTING step, probably due to the pip hack. I need to investigate.

@TomAugspurger (Member)

No comments from me, other than a thanks to @postelrich for setting this up.
