Welcome to the runtime repository for the Detection Track of the Pose Bowl: Spacecraft Detection and Pose Estimation Challenge!

If you haven't already done so, start by reading the Code Submission Format page on the competition website.
This repository contains the definition of the environment where your code submissions will run. It specifies both the operating system and the software packages that will be available to your solution.
This repository has three primary uses for competitors:
💡 Working example solutions to help you get started with the challenge:
- Quickstart example: A minimal example that runs successfully in the runtime environment and outputs a properly formatted submission CSV. This will generate arbitrary predictions, so unfortunately you won't win the competition with this example, but you can use it as a guide for bringing in your own work and generating a real submission.
- Benchmark example: A modestly more advanced example that uses a YOLO pretrained model to generate bounding box predictions. You won't win the competition with this example either, but it's a step in the right direction.
🧪 Test your submission: Test your submission with a locally running version of the container to discover errors before submitting to the competition site. Generally, this should save you time and help you iterate faster.
📦 Request new packages in the official runtime: All packages required by your submission must be pre-installed, and your submission will not have internet access. If you want to use a package that is not in the runtime environment, make a pull request to this repository.
This section guides you through the steps to test a simple but valid submission for the competition.
First, make sure you have the following prerequisites:
- A clone or fork of this repository
- Enough free space on your machine for the spacecraft images dataset (at least 10 GB) and Docker container images (5 GB)
- Docker
- GNU make (optional, but useful for running commands in the Makefile)
Next, go to the challenge download page to start downloading the challenge data. Save `submission_format.csv` and `train_labels.csv` in this project's `data/` directory.

The images dataset is broken up into individual tar files, each approximately 10 GB in size. Download at least one of these tar files to get started, and then extract it to the `data/images` directory.
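If you prefer to do the extraction in Python rather than with a shell `tar` command, here is a minimal sketch using the standard library. The archive filename below is a placeholder (use whichever part you downloaded), and you may need to adjust the destination if the archive already contains an `images/` folder:

```python
# Minimal sketch: extract a downloaded image archive into data/images.
import tarfile
from pathlib import Path

archive = Path("data/downloaded_images_part.tar")  # hypothetical filename
destination = Path("data/images")
destination.mkdir(parents=True, exist_ok=True)

with tarfile.open(archive) as tar:
    tar.extractall(path=destination)
```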
Once everything is downloaded and in the right location, it should look something like this:
```
data/                         # Runtime data directory
├── images/                   # Directory containing image files
│   ├── a0a0d73d0e9a4b16a23bc210a264fd3f.png
│   ├── a0a6efb87e1fcd0c158ba35ced823250.png
│   └── ...
│
├── submission_format.csv     # CSV file showing how submission should be formatted
└── train_labels.csv          # CSV file with ground truth data
```
Later in this guide, when we launch a Docker container from your computer (or the "host" machine), the `data` directory on your host machine will be mounted as a read-only directory in the container at `/code_execution/data`. In the runtime, your code will then be able to access all the competition data at `/code_execution/data`, which will by default look to your script like `./data`, since your script will be invoked with `/code_execution` as the working directory.
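As a quick illustration of what this looks like from inside the container (a sketch, assuming PNG image files as shown in the tree above), a relative `data` path and the absolute `/code_execution/data` path refer to the same mounted files:

```python
# Sketch: because the working directory in the runtime is /code_execution, a relative
# "data" path points at the mounted (read-only) competition data directory.
from pathlib import Path

data_dir = Path("data")  # same as /code_execution/data inside the container
image_paths = sorted((data_dir / "images").glob("*.png"))
print(f"Found {len(image_paths)} images under {data_dir.resolve()}")
```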
A working example submission is provided in this project's `example_src/` directory to help you get started.

In that directory, you'll see the `main.sh` file that you're required to include in your submission. Below are the full contents of that file. For this simple example, the script just sets a couple of path variables and runs a Python script called `main.py`. You are welcome to make the `main.sh` behavior more complex as you develop your submission, and there's no requirement that you use it to call a Python script (we just think this will be a fairly common use pattern).
```bash
#!/usr/bin/env bash

DATA_DIR=/code_execution/data
SUBMISSION_PATH=/code_execution/submission/submission.csv

# call our script (main.py in this case) and tell it where the data and submission live
python main.py $DATA_DIR $SUBMISSION_PATH
```
The `main.py` script is fairly straightforward as well. For this quickstart example, the script doesn't even try to generate reasonable predictions; it just returns an arbitrary bounding box for each image in the dataset. That won't generate a very good score, but it will still be a valid submission, which is what we're interested in for starters.
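To make that concrete, here is a rough sketch of the kind of logic such a script implements. This is not the repository's actual `main.py`, and the index column (`image_id`) and bounding box column names (`xmin`, `ymin`, `xmax`, `ymax`) are assumptions here, so check `data/submission_format.csv` for the authoritative layout:

```python
# Hypothetical sketch of a quickstart-style main.py: read the submission format,
# fill in the same arbitrary bounding box for every image, and write the CSV.
import sys
from pathlib import Path

import pandas as pd


def main(data_dir: Path, output_path: Path) -> None:
    # The submission format file defines exactly which rows and columns we must produce.
    predictions = pd.read_csv(data_dir / "submission_format.csv", index_col="image_id")

    # Arbitrary placeholder box for every image; a real solution would run a model here.
    predictions["xmin"] = 0
    predictions["ymin"] = 0
    predictions["xmax"] = 100
    predictions["ymax"] = 100

    predictions.to_csv(output_path, index=True)


if __name__ == "__main__":
    # main.sh passes the data directory and the submission output path as arguments.
    main(Path(sys.argv[1]), Path(sys.argv[2]))
```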
The primary purpose of this runtime repository is to allow you to easily test your submission before making a submission to the DrivenData platform.
Your submission is going to run inside a Docker container on our code execution platform. This repository contains the definition of that [Docker container](https://github.com/drivendataorg/spacecraft-pose-object-detection-runtime/tree/main/runtime), as well as a few commands you can run to easily download the Docker image and test your submission. Below we walk through those commands.
First, make sure Docker is running. You'll then use three commands from the Makefile, explained below:
1. `make pull` downloads the latest official Docker image from the container registry (Azure). You'll need an internet connection for this.
2. `make pack-example` zips the contents of the `example_src` directory and saves it as `submission/submission.zip`. This is the file that you will upload to the DrivenData competition site for code execution. But first we'll test that everything looks good locally in step 3.
   - Note: When running this again in the future, you may need to first run `make clean` before you re-pack the example, both because the target won't rerun by default if the submission file already exists, and because previous Docker runs may have created files in the mounted submission directory with different permissions.
3. `make test-submission` does a test run of your submission, simulating what happens during actual code execution. This command runs the Docker container with the requisite host directories mounted and executes `main.sh` to produce a CSV file with your predictions at `submission/submission.csv`.
Putting it all together, run the following in your terminal:

```bash
make pull
make clean && make pack-example
make test-submission
```
🎉 Congratulations! You've just tested a submission for the Pose Bowl challenge. If everything worked as expected, you should see a new file has been generated at `submission/submission.csv`.
If you were ready to make a real submission to the competition, you would upload the `submission.zip` file from step 2 above to the competition Submissions page. Once submitted, our code execution platform would run your submission and generate a new `submission.csv` on the unseen test set, which would then be scored automatically using the competition scoring metric to determine your rank on the leaderboard.
In your local model development and cross validation, you may wish to use the same scoring metric that will be employed when your real submissions are scored. We have included a script that implements the same logic at `scripts/score.py`.
The usage is:
```
❯ python scripts/score.py --help
usage: score.py [-h] predicted_path actual_path

Calculates the Jaccard index score for the Pose Bowl: Spacecraft Detection and Pose Estimation Challenge.

    Args:
        predicted_path (str | Path): Path to predictions CSV file matching submission format
        actual_path (str | Path): Path to ground truth CSV file

    Returns:
        Dict[str, float]: Jaccard index score

positional arguments:
  predicted_path  Path to predictions CSV.
  actual_path     Path to ground truth CSV.

options:
  -h, --help      show this help message and exit
```
For example, using the `submission_format.csv` as the predictions with our training labels as the ground truth, we can verify that we achieve a (bad!) score:
```
❯ python scripts/score.py data/submission_format.csv data/train_labels.csv
{
  "score": 0.0
}
```
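For intuition about what the metric measures, here is a minimal, unofficial sketch of the Jaccard index (intersection over union) for a single pair of axis-aligned boxes in `(xmin, ymin, xmax, ymax)` form. The real scoring logic you should rely on is in `scripts/score.py`:

```python
# Sketch of the intersection-over-union (Jaccard index) idea behind the scoring metric.
def jaccard_index(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection rectangle (zero area if the boxes do not overlap).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    intersection = max(0, ix2 - ix1) * max(0, iy2 - iy1)

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


print(jaccard_index((0, 0, 100, 100), (50, 50, 150, 150)))  # ~0.143
```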
Now that you've gone through the quickstart example, let's talk about how to develop your own solution for the competition.
This section provides instructions on how to develop and run your code submission locally using the Docker container. To make things simpler, key processes are already defined in the `Makefile`. Commands from the `Makefile` are run with `make {command_name}`. The basic steps are:

```bash
make pull
make pack-submission
make test-submission
```
Let's walk through what you'll need to do, step-by-step. The overall process here is very similar to what we've already covered in the Quickstart, but we'll go into more depth this time around.
1. Download the official competition Docker image:

   ```bash
   $ make pull
   ```

2. ✏️ Save all of your submission files, including the required `main.sh` script, in the `submission_src` folder of the runtime repository. This is where the real work happens.

   - You are free to modify the `main.sh` scripts we've provided as examples. Just make sure that you adhere to the competition rules and you still produce a `submission.csv` in the correct format.
   - Also keep in mind that the runtime already contains a number of packages that might be useful for you (see: `environment.yml`). If there are other packages you'd like added, see the section below on updating runtime packages.
   - Finally, make sure any model weights or other files you need are also saved in `submission_src`.

3. Create a `submission/submission.zip` file containing your code and model assets in `submission_src`:

   ```bash
   $ make pack-submission
   ```

4. Test your submission. The command below will launch an instance of the competition Docker image, replicating the same inference process that takes place in the official code execution runtime. This will mount the requisite host directories on the Docker container, unzip `submission/submission.zip` into the root directory of the container, and then execute `main.sh` to produce a CSV file with your predictions at `submission/submission.csv`.

   ```bash
   $ make test-submission
   ```

   ⚠️ Remember that for local testing purposes, the `/code_execution/data` directory is just a mounted version of what you have saved locally in this project's `data` directory. So you will just be using the publicly available training files for local testing. In the official code execution environment, `/code_execution/data` will contain the actual test data, which no participants have access to, and this is what will be used to compute your score for the leaderboard.
When you run `make test-submission`, the logs will be printed to the terminal and written out to `submission/log.txt`. If you run into errors, use `log.txt` to determine what changes you need to make for your code to execute successfully. This same log will be kept when you make a submission on the platform, and you can access it through the user interface. Note: try to be judicious about what you keep in the log. If the log is overly chatty, it may get truncated when you view it on the platform.
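One illustrative way to keep logs concise (a sketch, not part of the official examples) is to log progress at a coarse interval rather than once per image; the interval and logger setup here are arbitrary choices:

```python
# Hypothetical logging setup for your script: log a progress line every N images
# instead of one line per image, so the submission log stays short and readable.
import logging

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger(__name__)

LOG_EVERY = 500  # arbitrary interval; tune to taste


def predict_all(image_paths):
    for i, path in enumerate(image_paths):
        if i % LOG_EVERY == 0:
            logger.info("Processed %d / %d images", i, len(image_paths))
        # ... run your model on `path` here ...
```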
All internet access is blocked in the runtime environment. This means that you will need to package any required resources into your `submission.zip`.

For example, it is common for models to download pre-trained weights. Since submissions do not have internet access, you will need to include all weights in your `submission.zip` and make sure that your code loads them from disk rather than from the internet.
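For instance, if you were using an ultralytics YOLO model like the benchmark example, loading weights from a file packaged with your submission might look roughly like the sketch below. The `assets/model.pt` path is an assumption for illustration, not a convention the runtime requires:

```python
# Hedged sketch: point the library at weights shipped inside your submission zip so
# nothing needs to be downloaded at inference time.
from pathlib import Path

from ultralytics import YOLO

ASSETS_DIR = Path(__file__).parent / "assets"  # e.g. packaged alongside main.py

# Passing a local .pt path keeps the library from reaching out to the internet.
model = YOLO(str(ASSETS_DIR / "model.pt"))
```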
If you want to use a package that is not in the environment, you are welcome to make a pull request to this repository. If you're new to the GitHub contribution workflow, check out this guide by GitHub. The runtime manages dependencies using conda environments. Here is a good general guide to conda environments. The official runtime uses a Python 3.10 environment.
To submit a pull request for a new package:
1. Fork this repository.

2. Edit the conda environment YAML file, `runtime/environment.yml`. There are two ways to add a requirement:

   - Add an entry to the `dependencies` section. This installs from a conda channel using `conda install`. Conda performs robust dependency resolution with other packages in the `dependencies` section, so we can avoid package version conflicts.
   - Add an entry to the `pip` section. This installs from PyPI using `pip`, and is an option for packages that are not available in a conda channel.

   For both methods, be sure to include a version, e.g., `numpy==1.20.3`. This ensures that all environments will be the same.

3. Locally test that the Docker image builds successfully:

   ```bash
   make build
   ```

4. Commit the changes to your forked repository.

5. Open a pull request from your branch to the `main` branch of this repository. Navigate to the Pull requests tab in this repository, and click the "New pull request" button. For more detailed instructions, check out GitHub's help page.

6. Once you open the pull request, GitHub Actions will automatically try building a new Docker image with your changes and running the tests in `runtime/tests`. These tests can take up to 30 minutes, and may take longer if your build is queued behind others. You will see a section on the pull request page that shows the status of the tests and links to the logs.

7. You may be asked to submit revisions to your pull request if the tests fail or if a DrivenData team member has feedback. Pull requests won't be merged until all tests pass and the team has reviewed and approved the changes.
Thanks for reading! Enjoy the competition, and hit up the forums if you have any questions!