[Cloud Deployment IV]: Simple neuroconv deployment #393
Closed

Changes from all commits (64 commits)
f96652b added helper function (CodyCBakerPhD)
da3ee44 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
aa3619a remake dockerfile; add dandi upload to YAML (CodyCBakerPhD)
1ae8034 debugged (CodyCBakerPhD)
8f50c80 Create aws_batch_deployment.rst (CodyCBakerPhD)
901f1e1 Delete dockerfile_neuroconv_with_rclone (CodyCBakerPhD)
d4ae252 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
5659b35 Merge branch 'main' into batch_helper (CodyCBakerPhD)
95ab319 Merge branch 'batch_helper' into simple_neuroconv_deployment (CodyCBakerPhD)
e822bc7 Merge branch 'main' into batch_helper (CodyCBakerPhD)
f1f7b9f typos and formatting (bendichter)
53258c4 Merge branch 'batch_helper' into simple_neuroconv_deployment (bendichter)
9739320 resolve conflicts
9213391 add changelog
a476ba7 correct merge conflict and changelog + imports
4f6489d format docstring
db51921 resolve conflicts
766185f add changelog
9ae7ace adjust changelog
c7fb810 split estimator to different PR
7fedcdd expose extra options and add tests
f15cb68 Merge branch 'batch_helper' into simple_neuroconv_deployment (CodyCBakerPhD)
935f038 debug import
7e8ef72 fix bad conflict
f2be008 add boto3 to requirements
a4e7bf5 pass AWS credentials in function and actions
16ef3f6 Merge branch 'main' into batch_helper (CodyCBakerPhD)
4939c60 pass secrets (CodyCBakerPhD)
7c66c82 correct keyword name (CodyCBakerPhD)
b115adb debug role fetching (CodyCBakerPhD)
dfcb148 fix syntax (CodyCBakerPhD)
57f65ce [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
38327f7 splinter out aws tests to reduce costs (CodyCBakerPhD)
90deef6 splinter out aws tests to reduce costs (CodyCBakerPhD)
0b6e429 temporarily disable (CodyCBakerPhD)
06e9bdb fix suffix (CodyCBakerPhD)
fe16dde limit matrix to reduce costs (CodyCBakerPhD)
7f40885 cancel previous (CodyCBakerPhD)
34328cf remove iam role stuff; has to be set on user (CodyCBakerPhD)
17898f4 fix API call (CodyCBakerPhD)
de4e18f update to modern standard; expose extra options; rename argument (CodyCBakerPhD)
47cc917 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
4eea2db fix keyword argument in tests (CodyCBakerPhD)
4b22903 add status helper (CodyCBakerPhD)
e16551d [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
29aa19b debug (CodyCBakerPhD)
37223c9 enhance doc (CodyCBakerPhD)
1b4d88f try not casting as strings (CodyCBakerPhD)
829e5f2 fix deserialization type (CodyCBakerPhD)
e76897f [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
df8cb10 debug (CodyCBakerPhD)
67e8405 expose submission ID (CodyCBakerPhD)
297476f fix datetime typing (CodyCBakerPhD)
2cfaf58 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
9ad5ef2 update test to new structure (CodyCBakerPhD)
4db6141 remove trigger (CodyCBakerPhD)
6949be0 restore trigger (CodyCBakerPhD)
c193c55 Merge branch 'batch_helper' into simple_neuroconv_deployment (CodyCBakerPhD)
c990ddf Merge remote-tracking branch 'origin/simple_neuroconv_deployment' int…
26c5f69 resolve conflict
37d5be4 finish initial structure for deployment helper (CodyCBakerPhD)
9af5b99 separate base code; add new entrypoint; adjust dockerfiles; add EFS c… (CodyCBakerPhD)
4022b60 fix tests; make deletion safe (CodyCBakerPhD)
4491a7f debugs (CodyCBakerPhD)
@@ -0,0 +1,42 @@
name: AWS Tests
on:
  schedule:
    - cron: "0 16 * * 1"  # Weekly at noon on Monday

concurrency:  # Cancel previous workflows on the same pull request
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

env:
  AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
  DANDI_API_KEY: ${{ secrets.DANDI_API_KEY }}

jobs:
  run:
    name: ${{ matrix.os }} Python ${{ matrix.python-version }}
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.12"]
        os: [ubuntu-latest]
    steps:
      - uses: actions/checkout@v4
      - run: git fetch --prune --unshallow --tags
      - name: Setup Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Global Setup
        run: |
          python -m pip install -U pip  # Official recommended way
          git config --global user.email "[email protected]"
          git config --global user.name "CI Almighty"

      - name: Install full requirements
        run: pip install .[aws,test]

      - name: Run subset of tests that use AWS live services
        run: pytest -rsx -n auto tests/test_minimal/test_tools/aws_tools.py
.github/workflows/build_and_upload_docker_image_latest_release_for_ec2_deployment.yml (48 additions, 0 deletions)
@@ -0,0 +1,48 @@
name: Build and Upload Docker Image of Latest Release for EC2 Deployment to GHCR

on:
  schedule:
    - cron: "0 16 * * 1"  # Weekly at noon EST on Monday
  workflow_dispatch:

concurrency:  # Cancel previous workflows on the same pull request
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  release-image:
    name: Build and Upload Docker Image
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Parse the version from the GitHub latest release tag
        id: parsed_version
        run: |
          git fetch --prune --unshallow --tags
          tags="$(git tag --list)"
          version_tag=${tags: -6 : 6}
          echo "version_tag=$version_tag" >> $GITHUB_OUTPUT
      - name: Printout parsed version for GitHub Action log
        run: echo ${{ steps.parsed_version.outputs.version_tag }}

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Login to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ secrets.DOCKER_UPLOADER_USERNAME }}
          password: ${{ secrets.DOCKER_UPLOADER_PASSWORD }}

      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          push: true  # Push is a shorthand for --output=type=registry
          tags: ghcr.io/catalystneuro/neuroconv_for_ec2_deployment:dev
          context: .
          file: dockerfiles/neuroconv_release_for_ec2_deployment
          provenance: false
@@ -0,0 +1,6 @@
FROM python:3.11.7-slim
LABEL org.opencontainers.image.source=https://github.com/catalystneuro/neuroconv
LABEL org.opencontainers.image.description="A docker image extending the dev branch of the NeuroConv package with modifications related to deployment on EC2 Batch."
ADD ./ neuroconv
RUN cd neuroconv && pip install .[full]
CMD printf "$NEUROCONV_YAML" > run.yml && python -m neuroconv_ec2 run.yml --data-folder-path "$NEUROCONV_DATA_PATH" --output-folder-path "$NEUROCONV_OUTPUT_PATH" --overwrite --upload-to-dandiset-id "$DANDISET_ID" --update-tracking-table "$TRACKING_TABLE" --tracking-table-submission-id "$SUBMISSION_ID" --efs-volume-name-to-cleanup "$EFS_VOLUME"
This file was deleted.
...files/neuroconv_latest_release_dockerfile → dockerfiles/neuroconv_release_dockerfile (1 addition, 1 deletion)
@@ -1,6 +1,6 @@
 FROM python:3.11.7-slim
 LABEL org.opencontainers.image.source=https://github.com/catalystneuro/neuroconv
-LABEL org.opencontainers.image.description="A docker image for the most recent official release of the NeuroConv package."
+LABEL org.opencontainers.image.description="A docker image for an official release of the full NeuroConv package."
 RUN apt update && apt install musl-dev python3-dev -y
 RUN pip install "neuroconv[full]"
 CMD ["python -m"]
@@ -0,0 +1,4 @@
FROM ghcr.io/catalystneuro/neuroconv:latest
LABEL org.opencontainers.image.source=https://github.com/catalystneuro/neuroconv
LABEL org.opencontainers.image.description="A docker image extending the official release of the NeuroConv package with modifications related to deployment on EC2 Batch."
CMD printf "$NEUROCONV_YAML" > run.yml && python -m neuroconv_ec2 run.yml --data-folder-path "$NEUROCONV_DATA_PATH" --output-folder-path "$NEUROCONV_OUTPUT_PATH" --overwrite --upload-to-dandiset-id "$DANDISET_ID" --update-tracking-table "$TRACKING_TABLE" --tracking-table-submission-id "$SUBMISSION_ID" --efs-volume-name-to-cleanup "$EFS_VOLUME"
@@ -0,0 +1,168 @@
One way of deploying items on AWS Batch is to manually set up the entire workflow through the AWS web UI and to manually submit each job in that manner.

Deploying hundreds of jobs in this way would be cumbersome.

Here are two other methods that allow simpler deployment using `boto3`.
Semi-automated Deployment of NeuroConv on AWS Batch
---------------------------------------------------

Step 1: Transfer data to Elastic File System (EFS)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The nice thing about using EFS is that we are only ever billed for the literal amount of disk storage used over time, and do not need to specify a particular fixed allocation or scaling strategy.

It is also relatively easy to mount across multiple AWS Batch jobs simultaneously.

Unfortunately, the one downside is that its pricing per GB-month is significantly higher than either S3 or EBS.

To easily transfer data from a Google Drive (or, in principle, any backend supported by `rclone`), set the following environment variables with your rclone credentials: `DRIVE_NAME`, `TOKEN`, `REFRESH_TOKEN`, and `EXPIRY`.
.. note::

    The eventual hope is to read and pass these values directly from a local `rclone.conf` file, but for now they must be set manually.
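In the meantime, here is a minimal sketch (not part of the package) of how those values could be pulled from a local `rclone.conf` using Python's standard `configparser`. It assumes the default rclone config location and a Google Drive remote named `MyDrive` (illustrative), whose `token` entry is a JSON blob containing the access token, refresh token, and expiry.

.. code:: python

    import configparser
    import json
    import os
    from pathlib import Path

    # Assumed default rclone config location on Linux/macOS
    config = configparser.ConfigParser()
    config.read(Path.home() / ".config" / "rclone" / "rclone.conf")

    drive_name = "MyDrive"  # illustrative remote name
    token_info = json.loads(config[drive_name]["token"])

    os.environ["DRIVE_NAME"] = drive_name
    os.environ["TOKEN"] = token_info["access_token"]
    os.environ["REFRESH_TOKEN"] = token_info["refresh_token"]
    os.environ["EXPIRY"] = token_info["expiry"]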
.. note::

    All path references must point to `/mnt/data/` as the base in order to persist across jobs.
.. code:: python

    import os
    from datetime import datetime

    from neuroconv.tools.data_transfers import submit_aws_batch_job

    job_name = "<unique job name>"
    docker_container = "ghcr.io/catalystneuro/rclone_auto_config:latest"
    efs_name = "<your EFS volume name>"

    log_datetime = str(datetime.now()).replace(" ", ":")  # no spaces in CLI
    RCLONE_COMMAND = f"{os.environ['RCLONE_COMMAND']} -v --config /mnt/data/rclone.conf --log-file /mnt/data/submit-{log_datetime}.txt"

    environment_variables = [
        dict(name="DRIVE_NAME", value=os.environ["DRIVE_NAME"]),
        dict(name="TOKEN", value=os.environ["TOKEN"]),
        dict(name="REFRESH_TOKEN", value=os.environ["REFRESH_TOKEN"]),
        dict(name="EXPIRY", value=os.environ["EXPIRY"]),
        dict(name="RCLONE_COMMAND", value=RCLONE_COMMAND),
    ]

    submit_aws_batch_job(
        job_name=job_name,
        docker_container=docker_container,
        efs_name=efs_name,
        environment_variables=environment_variables,
    )
An example `RCLONE_COMMAND` for a drive named 'MyDrive' and the GIN testing data stored under `/ephy_testing_data/spikeglx/Noise4Sam_g0/` of that drive would be

.. code:: python

    RCLONE_COMMAND = "sync MyDrive:/ephy_testing_data/spikeglx/Noise4Sam_g0 /mnt/data/Noise4Sam_g0"
Step 2: Run the YAML Conversion Specification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Continuing the example above, if we have the YAML file `test_batch.yml`
.. code:: yaml

    metadata:
      NWBFile:
        lab: My Lab
        institution: My Institution

    conversion_options:
      stub_test: True

    data_interfaces:
      ap: SpikeGLXRecordingInterface
      lf: SpikeGLXRecordingInterface

    experiments:
      ymaze:
        metadata:
          NWBFile:
            session_description: Testing batch deployment.

        sessions:
          - nwbfile_name: /mnt/data/test_batch_deployment.nwb
            source_data:
              ap:
                file_path: /mnt/data/Noise4Sam_g0/Noise4Sam_g0_imec0/Noise4Sam_g0_t0.imec0.ap.bin
              lf:
                file_path: /mnt/data/Noise4Sam_g0/Noise4Sam_g0_imec0/Noise4Sam_g0_t0.imec0.lf.bin
            metadata:
              NWBFile:
                session_id: test_batch_deployment
              Subject:
                subject_id: "1"
                sex: F
                age: P35D
                species: Mus musculus
then we can run the following stand-alone script to deploy the conversion, after confirming that Step 1 completed successfully.
.. code:: python

    from neuroconv.tools.data_transfers import submit_aws_batch_job

    job_name = "<unique job name>"
    docker_container = "ghcr.io/catalystneuro/neuroconv:dev_auto_yaml"
    efs_name = "<name of EFS>"

    yaml_file_path = "/path/to/test_batch.yml"

    with open(file=yaml_file_path) as file:
        # Double quotes are swapped for single quotes to avoid quoting issues
        # when the stream is passed through as an environment variable
        YAML_STREAM = file.read().replace('"', "'")

    environment_variables = [dict(name="YAML_STREAM", value=YAML_STREAM)]

    submit_aws_batch_job(
        job_name=job_name,
        docker_container=docker_container,
        efs_name=efs_name,
        environment_variables=environment_variables,
    )
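Before deploying, it can also be worth sanity-checking the specification on locally available data by invoking the YAML runner that the Docker entrypoint wraps. A sketch, assuming the `run_conversion_from_yaml` signature and hypothetical local paths (the `/mnt/data/` paths in the specification would need adjusting for a local machine):

.. code:: python

    from neuroconv.tools.yaml_conversion_specification import run_conversion_from_yaml

    run_conversion_from_yaml(
        specification_file_path="/path/to/test_batch.yml",
        data_folder_path="/path/to/local/data",  # hypothetical local copy of the source data
        output_folder_path="/path/to/local/output",  # hypothetical local output location
        overwrite=True,
    )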
Step 3: Ensure File Cleanup
~~~~~~~~~~~~~~~~~~~~~~~~~~~

TODO: write a dockerfile to perform this step with the API.
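Pending that, a rough sketch of what the API side of a cleanup check could look like with `boto3` (illustrative only; assumes AWS credentials and region are already configured):

.. code:: python

    import boto3

    efs_client = boto3.client("efs")

    # Look up the volume by name and report how much storage it is currently using
    for file_system in efs_client.describe_file_systems()["FileSystems"]:
        if file_system.get("Name") == "<your EFS volume name>":
            print(file_system["FileSystemId"], file_system["SizeInBytes"]["Value"], "bytes in use")
            # Once all mount targets are removed, the volume itself could be deleted with
            # efs_client.delete_file_system(FileSystemId=file_system["FileSystemId"])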
It's a good idea to confirm that you have access to your EFS from on-demand resources in case you ever need to go in and perform a manual cleanup operation.

Boot up an EC2 `t2.micro` instance using the Amazon Linux 2 image with minimal resources.

Create two new security groups: `EFS Target` (no policies set) and `EFS Mount` (inbound policy set to NFS with `EFS Target` as the source).

On the EC2 instance, change the security group to `EFS Target`. In the EFS network settings, add the `EFS Mount` group.

Connect to the EC2 instance and run
.. code::

    mkdir ~/efs-mount-point  # or any other name; keeping it in the home directory (~) is recommended for ease of access
    sudo mount -t nfs -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport fs-<efs number>.efs.us-east-2.amazonaws.com:/ ~/efs-mount-point
    # Note that any operations performed on the contents of the mounted volume must use sudo
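A quick way to confirm the mount succeeded is `df -h ~/efs-mount-point`, which should list the `fs-<efs number>` endpoint as the mounted filesystem.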
This step is known to be finicky, but if you did everything exactly as illustrated above it should work (last verified April 2, 2023).

You can now read, write, and, importantly, delete any contents on the EFS.

Until automated DANDI upload is implemented in the YAML functionality, you will need to use this method to manually remove the NWB file.

Even after that is implemented, you should double-check that the `cleanup=True` flag to that function executed properly.
Fully Automated Deployment of NeuroConv on AWS Batch
----------------------------------------------------

Coming soon...

The approach is essentially the same as the semi-automated one, except that all jobs are submitted at the same time, with each job depending on the completion of the one before it; see the sketch below.
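To illustrate the chaining in raw `boto3` (independent of the helper above; the queue and job definition names are placeholders):

.. code:: python

    import boto3

    batch_client = boto3.client("batch")  # assumes credentials/region are configured

    # Submit the data transfer job first...
    transfer_job = batch_client.submit_job(
        jobName="transfer-data",
        jobQueue="<your job queue>",
        jobDefinition="<your job definition>",
    )

    # ...then submit the conversion job, which will not start until the transfer succeeds
    conversion_job = batch_client.submit_job(
        jobName="run-conversion",
        jobQueue="<your job queue>",
        jobDefinition="<your job definition>",
        dependsOn=[dict(jobId=transfer_job["jobId"])],
    )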
Review comment: Set to trigger on GitHub release.