From e6c9fcae7031fd7740e3143211f63d511280a2ce Mon Sep 17 00:00:00 2001
From: geoffwoollard
Date: Tue, 2 Jul 2024 17:02:20 -0400
Subject: [PATCH 1/6] Update README.md

---
 README.md | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index b64513f..17ebc71 100644
--- a/README.md
+++ b/README.md
@@ -40,7 +40,7 @@ cd /path/to/Cryo-EM-Heterogeneity-Challenge-1
 pip install .
 ```
 
-## Devel installation
+## Developer installation
 If you are interested in testing the programs previously installed, please, install the repository in development mode with the following commands:
 
 ```bash
@@ -52,6 +52,7 @@ The test included in the repo can be executed with PyTest as shown below:
 
 ```bash
 cd /path/to/Cryo-EM-Heterogeneity-Challenge-1
+sh tests/scripts/fetch_test_data.sh # download test data from OSF
 pytest tests/test_preprocessing.py
 pytest tests/test_svd.py
 pytest tests/test_map_to_map.py
@@ -59,9 +60,14 @@ pytest tests/test_distribution_to_distribution.py
 ```
 
 # Running
-If you want to run our code, please check the notebooks in the [tutorials folder](https://github.com/flatironinstitute/Cryo-EM-Heterogeneity-Challenge-1/tree/main/tutorials).
+If you want to run our code on the full challenge data, or you own local data, please complete the following steps
 
-The tutorials explain how to setup the config files, and run the commands
+### 1. Dowload the full challenge data from [The Inaugural Flatiron Institute Cryo-EM Heterogeneity Community Challenge](https://osf.io/8h6fz/)
+You can do this through the web browser, or programatically with wget (you can get inspiration from [this script](https://github.com/flatironinstitute/Cryo-EM-Heterogeneity-Challenge-1/blob/main/tests/scripts/fetch_test_data.sh), which is just for the test data, not the full datasets)
+
+### 2. Modify the config files and run the commands on the full challenge data
+Point to the path where the data is locally
+The [tutorial notebooks](https://github.com/flatironinstitute/Cryo-EM-Heterogeneity-Challenge-1/tree/main/tutorials) explain how to setup the config files, and run the commands
 ```
 cryo_challenge run_preprocessing --config config_files/config_preproc.yaml
 cryo_challenge run_svd --config config_files/config_svd.yaml
@@ -69,6 +75,7 @@ cryo_challenge run_map2map_pipeline --config config_files/conf
 cryo_challenge run_distribution2distribution_pipeline --config config_files/config_distribution_to_distribution.yaml
 ```
 
+
 # Contributing
 If you find any bug or have a suggestion on the code feel free to open an issue [here](https://github.com/flatironinstitute/Cryo-EM-Heterogeneity-Challenge-1/issues).
 
From 3c672b16e44dd638dfe207d2d597ecbeef0c5d23 Mon Sep 17 00:00:00 2001
From: Geoffrey Woollard
Date: Tue, 9 Jul 2024 13:48:25 -0400
Subject: [PATCH 2/6] tutorial readme

---
 tutorials/README.md | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
 create mode 100644 tutorials/README.md

diff --git a/tutorials/README.md b/tutorials/README.md
new file mode 100644
index 0000000..a8d9ee5
--- /dev/null
+++ b/tutorials/README.md
@@ -0,0 +1,31 @@
+# TUTORIALS
+
+
+## Brief overview
+
+There are notebook tutorials for the each step in the analysis pipeline.
+
+1. `1_tutorial_preprocessing.ipynb`
+- input: folders with respective 80 `.mrc` files and a `populations.txt` file
+- output: anonymized `submission_?.pt` files with aligned populations
+This notebok walks through how to align the raw submissions, and the various options in this step.
+
+2. `2_tutorial_svd.ipynb`
+- input: `submission_?.pt` files
+- output: `svd_results.pt`
+
+This notebook walks through generating and analyzing (plots) the SVD results
+
+3. `3_tutorial_map2map.ipynb`
+- input: one `submission_?.pt` file
+- output: a `.pkl` file
+
+This notebook walks through generating and analyzing (plots) the map to map distance matrix results
+
+4. `4_tutorial_distribution2distribution.ipynb`
+- input: one `.pkl` file from the map2map step
+- output: a `.pkl` file
+
+4. `5_tutorial_plotting.ipynb`
+
+This notebook walks through parsing and analyzing (plots) the map to map and distribution to distribution results.
\ No newline at end of file

From c21e03e6dd74a6515e793e1dece45178a4e0e874 Mon Sep 17 00:00:00 2001
From: Geoffrey Woollard
Date: Tue, 9 Jul 2024 13:49:50 -0400
Subject: [PATCH 3/6] header

---
 tutorials/README.md | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tutorials/README.md b/tutorials/README.md
index a8d9ee5..6459a30 100644
--- a/tutorials/README.md
+++ b/tutorials/README.md
@@ -1,7 +1,4 @@
-# TUTORIALS
-
-
-## Brief overview
+# Brief overview
 
 There are notebook tutorials for the each step in the analysis pipeline.
 

From 9c44853a70f723767076b3b5543a9823876a4cc7 Mon Sep 17 00:00:00 2001
From: Geoffrey Woollard
Date: Tue, 9 Jul 2024 13:50:47 -0400
Subject: [PATCH 4/6] header

---
 tutorials/README.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tutorials/README.md b/tutorials/README.md
index 6459a30..d3488c7 100644
--- a/tutorials/README.md
+++ b/tutorials/README.md
@@ -2,27 +2,27 @@
 
 There are notebook tutorials for the each step in the analysis pipeline.
 
-1. `1_tutorial_preprocessing.ipynb`
+### `1_tutorial_preprocessing.ipynb`
 - input: folders with respective 80 `.mrc` files and a `populations.txt` file
 - output: anonymized `submission_?.pt` files with aligned populations
 This notebok walks through how to align the raw submissions, and the various options in this step.
 
-2. `2_tutorial_svd.ipynb`
+### `2_tutorial_svd.ipynb`
 - input: `submission_?.pt` files
 - output: `svd_results.pt`
 
 This notebook walks through generating and analyzing (plots) the SVD results
 
-3. `3_tutorial_map2map.ipynb`
+### `3_tutorial_map2map.ipynb`
 - input: one `submission_?.pt` file
 - output: a `.pkl` file
 
 This notebook walks through generating and analyzing (plots) the map to map distance matrix results
 
-4. `4_tutorial_distribution2distribution.ipynb`
+### `4_tutorial_distribution2distribution.ipynb`
 - input: one `.pkl` file from the map2map step
 - output: a `.pkl` file
 
-4. `5_tutorial_plotting.ipynb`
+### `5_tutorial_plotting.ipynb`
 
 This notebook walks through parsing and analyzing (plots) the map to map and distribution to distribution results.
\ No newline at end of file

From 6b7813c328c1c914cd6c880d0d33d5164a0cdd0c Mon Sep 17 00:00:00 2001
From: Geoffrey Woollard
Date: Tue, 9 Jul 2024 13:52:00 -0400
Subject: [PATCH 5/6] header

---
 tutorials/README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tutorials/README.md b/tutorials/README.md
index d3488c7..d427ba5 100644
--- a/tutorials/README.md
+++ b/tutorials/README.md
@@ -3,26 +3,26 @@
 There are notebook tutorials for the each step in the analysis pipeline.
 
 ### `1_tutorial_preprocessing.ipynb`
+This notebok walks through how to align the raw submissions, and the various options in this step.
 - input: folders with respective 80 `.mrc` files and a `populations.txt` file
 - output: anonymized `submission_?.pt` files with aligned populations
-This notebok walks through how to align the raw submissions, and the various options in this step.
+
 
 ### `2_tutorial_svd.ipynb`
+This notebook walks through generating and analyzing (plots) the SVD results.
 - input: `submission_?.pt` files
 - output: `svd_results.pt`
 
-This notebook walks through generating and analyzing (plots) the SVD results
 
 ### `3_tutorial_map2map.ipynb`
+This notebook walks through generating and analyzing (plots) the map to map distance matrix results.
 - input: one `submission_?.pt` file
 - output: a `.pkl` file
 
-This notebook walks through generating and analyzing (plots) the map to map distance matrix results
 
 ### `4_tutorial_distribution2distribution.ipynb`
 - input: one `.pkl` file from the map2map step
 - output: a `.pkl` file
 
 ### `5_tutorial_plotting.ipynb`
-
 This notebook walks through parsing and analyzing (plots) the map to map and distribution to distribution results.
\ No newline at end of file

From 9199bd229f568474c70cacb0f4f5074ef71e6401 Mon Sep 17 00:00:00 2001
From: sonyahanson
Date: Tue, 9 Jul 2024 18:35:29 -0400
Subject: [PATCH 6/6] fixed some typos

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 17ebc71..9dfb755 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-

Cryo-EM Heterogeniety Challenge

+

Cryo-EM Heterogeneity Challenge

@@ -62,7 +62,7 @@ pytest tests/test_distribution_to_distribution.py
 # Running
 If you want to run our code on the full challenge data, or you own local data, please complete the following steps
 
-### 1. Dowload the full challenge data from [The Inaugural Flatiron Institute Cryo-EM Heterogeneity Community Challenge](https://osf.io/8h6fz/)
+### 1. Download the full challenge data from [The Inaugural Flatiron Institute Cryo-EM Heterogeneity Community Challenge](https://osf.io/8h6fz/)
 You can do this through the web browser, or programatically with wget (you can get inspiration from [this script](https://github.com/flatironinstitute/Cryo-EM-Heterogeneity-Challenge-1/blob/main/tests/scripts/fetch_test_data.sh), which is just for the test data, not the full datasets)
 
 ### 2. Modify the config files and run the commands on the full challenge data