Merge branch 'dev' into fastmri_workload_variants
priyakasimbeg committed Dec 5, 2023
2 parents 41c1c62 + 98146be commit 4cabfa6
Showing 14 changed files with 670 additions and 606 deletions.
29 changes: 28 additions & 1 deletion CALL_FOR_SUBMISSIONS.md
@@ -1,3 +1,30 @@
# MLCommons™ AlgoPerf: Call for Submissions

-🚧 **Coming soon!** 🚧
+## Announcing the AlgoPerf: Training Algorithms Benchmark Competition
+
+Neural networks must be trained to be useful. However, training is a resource-intensive task, often demanding extensive compute and energy resources.
+To promote faster training algorithms, the [MLCommons® Algorithms Working Group](https://mlcommons.org/en/groups/research-algorithms/) is delighted to present the **AlgoPerf: Training Algorithms** benchmark. This benchmark competition is designed to measure neural network training speedups due to *algorithmic improvements*. We welcome submissions that implement both novel and existing training algorithms, including, but not limited to:
+
+- Optimizer update rules
+- Hyperparameter tuning protocols, search spaces, or schedules
+- Data sampling strategies
+
+Submissions can compete under two hyperparameter tuning rulesets (with separate prizes and awards): an external tuning ruleset meant to simulate tuning with a fixed amount of parallel resources, or a self-tuning ruleset for hyperparameter-free algorithms.
+
+## Dates
+
+- **Call for submissions: November 28th, 2023**
+- Registration deadline to express non-binding intent to submit: January 28th, 2024
+- **Submission deadline: March 28th, 2024**
+- **Deadline for self-reporting preliminary results: May 28th, 2024**
+- [tentative] Announcement of all results: July 15th, 2024
+
+For a detailed and up-to-date timeline see the [Competition Rules](/COMPETITION_RULES.md).
+
+## Participation
+
+For details on how to participate in the competition, please refer to our [Competition Rules](/COMPETITION_RULES.md). To learn more about the benchmark, see our [technical documentation](/DOCUMENTATION.md). The benchmark is further motivated, explained, and justified in the accompanying [paper](https://arxiv.org/abs/2306.07179). We require all submissions to be provided under the open-source [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0).
+
+## Prize Money & Funding
+
+MLCommons has provided a total of $50,000 in prize money for eligible winning submissions. We would also like to express our gratitude to Google for their generous support in providing computational resources to score the top submissions, and resources to help score some promising submissions from submitters with more limited resources.
6 changes: 3 additions & 3 deletions CHANGELOG.md
@@ -1,5 +1,5 @@
-# Change log
+# Change Log

-## TODO: algorithmic-efficiency 0.1.0
+## algoperf-benchmark-0.1.0 (2023-11-28)

-First release of AlgoPerf benchmarking code.
+First release of the AlgoPerf: Training algorithms benchmarking code.
131 changes: 131 additions & 0 deletions COMPETITION_RULES.md

Large diffs are not rendered by default.

48 changes: 25 additions & 23 deletions CONTRIBUTING.md
@@ -2,15 +2,16 @@

## Table of Contents <!-- omit from toc -->

-- [Setup](#setup)
+- [Contributing to MLCommons](#contributing-to-mlcommons)
+- [Setup for Contributing](#setup-for-contributing)
- [Setting up a Linux VM on GCP](#setting-up-a-linux-vm-on-gcp)
- [Installing GPU Drivers](#installing-gpu-drivers)
- [Authentication for Google Cloud Container Registry](#authentication-for-google-cloud-container-registry)
- [Installation](#installation)
-- [Docker workflows](#docker-workflows)
+- [Docker Workflows](#docker-workflows)
- [Pre-built Images on Google Cloud Container Registry](#pre-built-images-on-google-cloud-container-registry)
-- [Trigger rebuild and push of maintained images](#trigger-rebuild-and-push-of-maintained-images)
-- [Trigger build and push of images on other branch](#trigger-build-and-push-of-images-on-other-branch)
+- [Trigger Rebuild and Push of Maintained Images](#trigger-rebuild-and-push-of-maintained-images)
+- [Trigger Build and Push of Images on Other Branch](#trigger-build-and-push-of-images-on-other-branch)
- [GCP Data and Experiment Integration](#gcp-data-and-experiment-integration)
- [Downloading Data from GCP](#downloading-data-from-gcp)
- [Saving Experiments to GCP](#saving-experiments-to-gcp)
@@ -19,10 +20,12 @@
- [Submitting PRs](#submitting-prs)
- [Testing](#testing)
- [Style Testing](#style-testing)
-- [Unit and integration tests](#unit-and-integration-tests)
-- [Regression tests](#regression-tests)
+- [Unit and Integration Tests](#unit-and-integration-tests)
+- [Regression Tests](#regression-tests)

-We invite everyone to look through our rules and codebase and submit issues and pull requests, e.g. for rules changes, clarifications, or any bugs you might encounter. If you are interested in contributing to the work of the working group and influence the benchmark's design decisions, please [join the weekly meetings](https://mlcommons.org/en/groups/research-algorithms/) and consider becoming a member of the working group.
+## Contributing to MLCommons
+
+We invite everyone to look through our technical documentation and codebase and submit issues and pull requests, e.g. for changes, clarifications, or any bugs you might encounter. If you are interested in contributing to the work of the working group and influence the benchmark's design decisions, please [join the weekly meetings](https://mlcommons.org/en/groups/research-algorithms/) and consider becoming a member of the working group.

The best way to contribute to the MLCommons is to get involved with one of our many project communities. You find more information about getting involved with MLCommons [here](https://mlcommons.org/en/get-involved/#getting-started).

@@ -32,7 +35,7 @@ To get started contributing code, you or your organization needs to sign the MLC

MLCommons project work is tracked with issue trackers and pull requests. Modify the project in your own fork and issue a pull request once you want other developers to take a look at what you have done and discuss the proposed changes. Ensure that cla-bot and other checks pass for your Pull requests.

-## Setup
+## Setup for Contributing

### Setting up a Linux VM on GCP

@@ -51,7 +54,7 @@ Use the gcloud credential helper as documented [here](https://cloud.google.com/a

## Installation

-If you have not installed the package and dependencies yet see [Installation](./README.md#installation).
+If you have not installed the package and dependencies yet see [Installation](/README.md#installation).

To use the development tools such as `pytest` or `pylint` use the `dev` option:

@@ -62,14 +65,14 @@ pre-commit install

To get an installation with the requirements for all workloads and development, use the argument `[full_dev]`.
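
As a concrete sketch, assuming a standard editable `pip` install from the repository root (the `dev` and `full_dev` extras names are taken from the text above):

```bash
# Install the package with development tools (e.g. pytest, pylint):
pip3 install -e '.[dev]'

# Install with all workload requirements plus development tools:
pip3 install -e '.[full_dev]'
```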

-## Docker workflows
+## Docker Workflows

We recommend developing in our Docker image to ensure a consistent environment between developing, testing and scoring submissions.

To get started see also:

-- [Installation with Docker](./README.md#docker)
-- [Running a submission inside a Docker Container](./getting_started.md#run-your-submission-in-a-docker-container)
+- [Installation with Docker](/GETTING_STARTED.md#docker)
+- [Running a submission inside a Docker Container](/GETTING_STARTED.md#run-your-submission-in-a-docker-container)

### Pre-built Images on Google Cloud Container Registry

@@ -100,15 +103,15 @@ Currently maintained images on the repository are:
To reference the pulled image you will have to use the full `image_path`, e.g.
`us-central1-docker.pkg.dev/training-algorithms-external/mlcommons-docker-repo/algoperf_jax_main`.
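
For example, pulling one of the maintained images by its full path (the path is copied verbatim from above):

```bash
docker pull us-central1-docker.pkg.dev/training-algorithms-external/mlcommons-docker-repo/algoperf_jax_main
```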

-### Trigger rebuild and push of maintained images
+### Trigger Rebuild and Push of Maintained Images

To build and push all images (`pytorch`, `jax`, `both`) on maintained branches (`dev`, `main`).

```bash
bash docker/build_docker_images.sh -b <branch>
```

-#### Trigger build and push of images on other branch
+#### Trigger Build and Push of Images on Other Branch

You can also use the above script to build images from a different branch.
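
As a sketch, reusing the `-b` flag from the command above with a placeholder branch name:

```bash
# my_feature_branch is a placeholder for any existing branch.
bash docker/build_docker_images.sh -b my_feature_branch
```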

@@ -121,9 +124,7 @@

### GCP Data and Experiment Integration

-The Docker entrypoint script can transfer data to and from
-our GCP buckets on our internal GCP project. If
-you are an approved contributor you can get access to these resources to automatically download the datasets and upload experiment results.
+The Docker entrypoint script can transfer data to and from our GCP buckets on our internal GCP project. If you are an approved contributor you can get access to these resources to automatically download the datasets and upload experiment results.
You can use these features by setting the `--internal_contributor` flag to 'true' for the Docker entrypoint script.
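
A hypothetical invocation sketch (only `--internal_contributor` is documented here; the image name, volume mounts, and GPU flag are illustrative assumptions):

```bash
# Illustrative only; every flag except --internal_contributor is an assumption.
docker run -t -d \
  -v $HOME/data/:/data/ \
  -v $HOME/experiment_runs/:/experiment_runs/ \
  --gpus all \
  <algoperf_image> \
  --internal_contributor true
```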

### Downloading Data from GCP
@@ -216,7 +217,7 @@ New PRs will be merged on the dev branch by default, given that they pass the pr

## Testing

-We run tests with GitHub Actions, configured in the [.github/workflows](https://github.com/mlcommons/algorithmic-efficiency/tree/main/.github/workflows) folder.
+We run tests with GitHub Actions, configured in the [.github/workflows](.github/workflows/) folder.

### Style Testing

@@ -253,14 +254,14 @@ pylint submission_runner.py

```bash
pylint tests
```

-## Unit and integration tests
-We run unit tests and integration tests as part of the of github actions as well.
+### Unit and Integration Tests
+
+We run unit tests and integration tests as part of GitHub Actions as well.
You can also use `python tests/reference_algorithm_tests.py` to run a single model update and two model evals for each workload using the reference algorithm in `reference_algorithms/target_setting_algorithms/`.
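
A possible invocation sketch (the flag names and paths below are assumptions for illustration; check the script's own argument definitions):

```bash
# Assumed flags; verify against the script before relying on them.
python tests/reference_algorithm_tests.py \
  --workload=mnist \
  --framework=jax \
  --global_batch_size=16 \
  --submission_path=reference_algorithms/target_setting_algorithms/jax_nadamw.py
```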

-### Regression tests
+### Regression Tests

-We also have regression tests available in [.github/workflows/regression_tests.yml](https://github.com/mlcommons/algorithmic-efficiency/tree/main/.github/workflows/regression_tests.yml) that can be run semi-automatically.
-The regression tests are shorter end-to-end submissions run in a containerized environment across all 8 workloads, in both the jax and pytorch frameworks.
+We also have regression tests available in [.github/workflows/regression_tests.yml](.github/workflows/regression_tests.yml) that can be run semi-automatically.
+The regression tests are shorter end-to-end submissions run in a containerized environment across all 8 workloads, in both the JAX and PyTorch frameworks.
The regression tests run on self-hosted runners and are triggered for pull requests that target the main branch. Typically these PRs will be from the `dev` branch,
so the tests will run containers based on images built from the `dev` branch.
To run a regression test: