
merge dev to main #88

Merged: 47 commits from dev into main, Aug 15, 2024
67ebf6f
new general map to map distance class
geoffwoollard Jul 10, 2024
648a8c9
old and new methods numerically agreeing
geoffwoollard Jul 10, 2024
4438b92
remove old reference to l2, corr, and bioem3d
geoffwoollard Jul 10, 2024
59b409d
working...
geoffwoollard Jul 10, 2024
4dddfc0
working...
geoffwoollard Jul 10, 2024
e33801a
complted. FSCDistance class
geoffwoollard Jul 10, 2024
2730734
clean up. rename. remove print
geoffwoollard Jul 10, 2024
4cccfb1
Merge branch 'dev' into iss31_m2m
geoffwoollard Jul 10, 2024
f863466
res
geoffwoollard Jul 12, 2024
2ae38ac
global_store_of_running_results for res distance from fsc computed as…
geoffwoollard Jul 12, 2024
a035417
psize and box size from config
geoffwoollard Jul 12, 2024
3f86081
typo in npix
geoffwoollard Jul 16, 2024
eb5bb62
implement submission version and volume flipping
DSilva27 Aug 6, 2024
9751ae4
implement submission version and volume flipping
DSilva27 Aug 6, 2024
8660d1f
Merge pull request #72 from flatironinstitute/implement-flip-and-sub-…
DSilva27 Aug 6, 2024
0fcc4ed
added flavor keys
Miro-Astore Aug 8, 2024
ea60e55
added flavor_key to test json
Miro-Astore Aug 8, 2024
a5f84dd
added direct naming of flavors from json file
Miro-Astore Aug 9, 2024
13a94c0
removed ice cream dict and random assignment in favor of direct assig…
Miro-Astore Aug 9, 2024
3008c77
edit how the hash table is saved
Aug 9, 2024
3e8388e
remove seed assignment from config files
Aug 9, 2024
8ebc234
remove caching as it is causing false negatives in testing
Aug 9, 2024
e3a5655
Merge pull request #77 from flatironinstitute/45-remove-cache-from-gi…
DSilva27 Aug 9, 2024
4a24b6d
Merge branch 'dev' into adding_flavor_keys
Aug 9, 2024
fc99c96
Merge pull request #74 from flatironinstitute/adding_flavor_keys
DSilva27 Aug 9, 2024
d96836a
downsample testing data and remove download from osf
Aug 9, 2024
410fe89
Merge branch 'adding_flavor_keys' into 75-reduce-size-of-testing-files
Aug 9, 2024
a10f656
Merge branch 'dev' into resolution_metric
geoffwoollard Aug 12, 2024
f288e27
Merge branch 'dev' into iss31_m2m
geoffwoollard Aug 12, 2024
35f74e3
docstrings
geoffwoollard Aug 12, 2024
ab45763
override
geoffwoollard Aug 12, 2024
ed00ecd
refactor
geoffwoollard Aug 12, 2024
3988c47
passing tests
geoffwoollard Aug 12, 2024
9e323a9
rename config
geoffwoollard Aug 12, 2024
3d3c8f1
@override from typing_extensions
geoffwoollard Aug 12, 2024
3a588c6
Merge pull request #78 from flatironinstitute/75-reduce-size-of-testi…
DSilva27 Aug 12, 2024
e37aaef
Merge branch 'dev' into iss31_m2m
geoffwoollard Aug 12, 2024
18712d6
Merge pull request #36 from flatironinstitute/iss31_m2m
DSilva27 Aug 12, 2024
4822fa1
add linting check
Aug 12, 2024
c516088
fix linting and add how to use pre-commit to CONTRIBUTE.md
Aug 12, 2024
62563ef
Merge pull request #84 from flatironinstitute/83-add-linting-test-enf…
DSilva27 Aug 12, 2024
ad1d117
res distance refactored in
geoffwoollard Aug 12, 2024
1cd0225
precommit
geoffwoollard Aug 12, 2024
d1381c6
Merge pull request #85 from flatironinstitute/resolution_metric
DSilva27 Aug 12, 2024
5c6cec7
docs
geoffwoollard Aug 12, 2024
f651848
Merge branch 'dev' into resolution_metric
DSilva27 Aug 12, 2024
efb77bc
Merge pull request #87 from flatironinstitute/resolution_metric
DSilva27 Aug 12, 2024
2 changes: 1 addition & 1 deletion .github/workflows/main_merge_check.yml
Original file line number Diff line number Diff line change
@@ -11,4 +11,4 @@ jobs:
if: github.base_ref == 'main' && github.head_ref != 'dev'
run: |
echo "ERROR: You can only merge to main from dev."
exit 1
exit 1
12 changes: 12 additions & 0 deletions .github/workflows/ruff.yml
@@ -0,0 +1,12 @@
# Runs the Ruff linter and formatter.

name: Lint

on: [push]

jobs:
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: chartboost/ruff-action@v1
22 changes: 2 additions & 20 deletions .github/workflows/testing.yml
@@ -26,30 +26,12 @@ jobs:
python-version: ${{ matrix.python-version }}
cache: 'pip' # caching pip dependencies

- name: Cache test data
id: cache_test_data
uses: actions/cache@v3
with:
path: |
tests/data
data
key: venv-${{ runner.os }}-${{ env.pythonLocation }}-${{ hashFiles('**/tests/scripts/fetch_test_data.sh') }}

- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install .
pip install pytest omegaconf

- name: Get test data from OSF
if: ${{ steps.cache_test_data.outputs.cache-hit != 'true' }}
run: |
sh tests/scripts/fetch_test_data.sh


- name: Test with pytest
run: |
pytest tests/test_preprocessing.py
pytest tests/test_svd.py
pytest tests/test_map_to_map.py
pytest tests/test_distribution_to_distribution.py

pytest tests
5 changes: 1 addition & 4 deletions .gitignore
@@ -3,10 +3,7 @@ data/dataset_2_submissions
data/dataset_1_submissions
data/dataset_2_ground_truth

# data for testing and resulting outputs
tests/data/Ground_truth
tests/data/dataset_2_submissions/
tests/data/unprocessed_dataset_2_submissions/submission_x/
# testing results
tests/results/


22 changes: 21 additions & 1 deletion CONTRIBUTING.md
@@ -12,8 +12,28 @@ The "-e" flag will install the package in editable mode, which means you can edi

## Things to do before pushing to GitHub

In this project we use Ruff for linting, and pre-commit to make sure that the code being pushed is not broken and does not go against PEP 8 guidelines. When you run `git commit`, the pre-commit pipeline should run automatically. In the near future we will start using pytest and mypy to perform more checks.
### Using pre-commit hooks for code formatting and linting

When you install in developer mode with `".[dev]"` you will install the [pre-commit](https://pre-commit.com/) package. To set up this package, simply run

```bash
pre-commit install
```

Then, every time before committing (that is, before `git add` and `git commit`), run the following command:

```bash
pre-commit run --all-files
```

This will run `ruff` linting and formatting. If anything cannot be fixed automatically, the command will report the file and line that need to be fixed before you can commit. Once you have fixed everything, `git add` and `git commit` will go through without issue.
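For reference, a minimal `.pre-commit-config.yaml` wiring up the Ruff hooks might look like the fragment below. This is an illustrative sketch, not this repository's actual config; the pinned `rev` is a placeholder you would replace with a current release tag.

```yaml
# Hypothetical minimal pre-commit config using the ruff hooks.
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.4  # placeholder: pin to a current ruff-pre-commit release
    hooks:
      - id: ruff          # linting
      - id: ruff-format   # formatting
```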


### Make sure tests run

```bash
python -m pytest tests/
```

## Best practices for contributing

2 changes: 1 addition & 1 deletion config_files/config_distribution_to_distribution.yaml
@@ -12,4 +12,4 @@ cvxpy_solver: ECOS
optimal_q_kl:
n_iter: 100000
break_atol: 0.0001
output_fname: results/distribution_to_distribution_submission_0.pkl
output_fname: results/distribution_to_distribution_submission_0.pkl
@@ -1,15 +1,15 @@
data:
n_pix: 224
psize: 2.146
psize: 2.146
submission:
fname: data/dataset_2_ground_truth/submission_0.pt
volume_key: volumes
metadata_key: populations
label_key: id
ground_truth:
volumes: data/dataset_2_ground_truth/maps_gt_flat.pt
metadata: data/dataset_2_ground_truth/metadata.csv
mask:
volumes: data/dataset_2_ground_truth/maps_gt_flat.pt
metadata: data/dataset_2_ground_truth/metadata.csv
mask:
do: true
volume: data/dataset_2_ground_truth/mask_dilated_wide_224x224.mrc
analysis:
@@ -18,9 +18,10 @@ analysis:
- corr
- bioem
- fsc
- res
chunk_size_submission: 80
chunk_size_gt: 190
normalize:
do: true
method: median_zscore
output: results/map_to_map_distance_matrix_submission_0.pkl
output: results/map_to_map_distance_matrix_submission_0.pkl
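The diff above adds a `res` entry to the metrics list consumed by the map-to-map pipeline. As a minimal sketch of how such a config parses (assuming only the YAML structure visible in this diff; the loader below is illustrative and is not the package's validator), the relevant keys can be read with PyYAML:

```python
import yaml

# Fragment mirroring the map-to-map config shown in this diff
# (illustrative only; file paths and hidden keys are omitted).
config_text = """
data:
  n_pix: 224
  psize: 2.146
analysis:
  metrics:
    - corr
    - bioem
    - fsc
    - res
  chunk_size_submission: 80
  chunk_size_gt: 190
  normalize:
    do: true
    method: median_zscore
"""

config = yaml.safe_load(config_text)
print(config["analysis"]["metrics"])
```

The pipeline would then iterate over `config["analysis"]["metrics"]` and dispatch to the corresponding distance class for each entry.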
1 change: 0 additions & 1 deletion config_files/config_preproc.yaml
@@ -1,5 +1,4 @@
submission_config_file: submission_config.json
seed_flavor_assignment: 0
thresh_percentile: 93.0
BOT_box_size: 32
BOT_loss: wemd
4 changes: 3 additions & 1 deletion src/cryo_challenge/__init__.py
@@ -1 +1,3 @@
from cryo_challenge.__about__ import __version__
from cryo_challenge.__about__ import __version__

__all__ = ["__version__"]
4 changes: 2 additions & 2 deletions src/cryo_challenge/_commands/run_map2map_pipeline.py
@@ -6,7 +6,7 @@
import os
import yaml

from .._map_to_map.map_to_map_distance_matrix import run
from .._map_to_map.map_to_map_pipeline import run
from ..data._validation.config_validators import validate_input_config_mtm


@@ -46,5 +46,5 @@ def main(args):

if __name__ == "__main__":
parser = argparse.ArgumentParser(description=__doc__)
args = parser.parse_args()
# args = parser.parse_args()
main(add_args(parser).parse_args())
@@ -2,8 +2,6 @@
import numpy as np
import pickle
from scipy.stats import rankdata
import yaml
import argparse
import torch
import ot

@@ -14,10 +12,12 @@


def sort_by_transport(cost):
m,n = cost.shape
_, transport = compute_wasserstein_between_distributions_from_weights_and_cost(np.ones(m) / m, np.ones(n)/n, cost)
indices = np.argsort((transport * np.arange(m)[...,None]).sum(0))
return cost[:,indices], indices, transport
m, n = cost.shape
_, transport = compute_wasserstein_between_distributions_from_weights_and_cost(
np.ones(m) / m, np.ones(n) / n, cost
)
indices = np.argsort((transport * np.arange(m)[..., None]).sum(0))
return cost[:, indices], indices, transport
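The reformatted `sort_by_transport` above orders the columns of a cost matrix by the row index that sends them mass under the optimal transport plan. For the uniform marginals used here with a square cost matrix, the optimal plan is a permutation, so the same behavior can be sketched with `scipy.optimize.linear_sum_assignment` instead of the POT solver. This is a hypothetical standalone illustration, not the pipeline's code:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def sort_by_transport_sketch(cost):
    """Order columns of `cost` by the row that sends them mass.

    For uniform marginals with m == n, the optimal transport plan is a
    permutation matrix (scaled by 1/m), which the assignment solver
    recovers directly.
    """
    m, n = cost.shape
    assert m == n, "this sketch assumes a square cost matrix"
    row_ind, col_ind = linear_sum_assignment(cost)
    transport = np.zeros((m, n))
    transport[row_ind, col_ind] = 1.0 / m
    # Same reordering rule as sort_by_transport: barycenter of the
    # row indices feeding each column, sorted ascending.
    indices = np.argsort((transport * np.arange(m)[:, None]).sum(0))
    return cost[:, indices], indices, transport


# Toy cost matrix: row 0 matches column 0, row 1 matches column 2,
# row 2 matches column 1 (zeros mark the cheap pairings).
cost = np.array([[0.0, 1.0, 1.0],
                 [1.0, 1.0, 0.0],
                 [1.0, 0.0, 1.0]])
sorted_cost, indices, plan = sort_by_transport_sketch(cost)
```

After sorting, each column sits next to its matching row, so the cheap pairings line up on the diagonal of `sorted_cost`.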


def compute_wasserstein_between_distributions_from_weights_and_cost(
@@ -65,15 +65,14 @@ def make_assignment_matrix(cost_matrix):


def run(config):

metadata_df = pd.read_csv(config["gt_metadata_fname"])
metadata_df.sort_values("pc1", inplace=True)

with open(config["input_fname"], "rb") as f:
data = pickle.load(f)

# user_submitted_populations = np.ones(80)/80
user_submitted_populations = data["user_submitted_populations"]#.numpy()
user_submitted_populations = data["user_submitted_populations"] # .numpy()
id = torch.load(data["config"]["data"]["submission"]["fname"])["id"]

results_dict = {}
@@ -213,5 +212,5 @@ def optimal_q_kl(n_iter, x_start, A, Window, prob_gt, break_atol):
DistributionToDistributionResultsValidator.from_dict(results_dict)
with open(config["output_fname"], "wb") as f:
pickle.dump(results_dict, f)

return results_dict