
Evaluation scripts #2

Open
Hrovatin opened this issue Mar 10, 2023 · 1 comment
@Hrovatin
Collaborator

Hrovatin commented Mar 10, 2023

Evaluation is done via seml.
Seml runs are driven by YAML config files.

These YAML files specify:

  • the datasets used (path_adata; the adata objects already contain the selected genes (the same in all runs so that results are comparable), normalized and count data, and the metadata used for integration (system + batch) and evaluation (e.g. cell types)),
  • the covariates (group_key, batch_key, system_key),
  • where the results are saved (path_save, output_dir; the directories path_save, path_save/logs, and path_save/integration must be created manually in advance),
  • additional seml configs,
  • the param configs and their names (necessary for easy plotting/eval afterwards, where the different methods are compared),
  • and the random seeds to be used.

Ideally, a new method is simply added to the existing YAML files; everything will then use the right datasets and be saved to the right location, making results easy to compare afterwards.
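As an illustrative sketch only (all paths and values below are placeholders, and the exact keys in the repo's configs may differ), a config along these lines would combine the standard seml sections with the parameters described above:

```yaml
# Hypothetical seml config sketch; keys under `fixed`/`grid` mirror the
# parameters described above, values are placeholders.
seml:
  executable: run_seml.py
  output_dir: /path/to/results/logs   # created manually in advance

fixed:
  path_adata: /path/to/adata.h5ad     # shared preprocessed data, same genes in all runs
  path_save: /path/to/results/        # plus path_save/logs and path_save/integration
  group_key: cell_type
  batch_key: batch
  system_key: system

grid:
  seed:
    type: choice
    options: [0, 1, 2]
```

The `seml`/`fixed`/`grid` layout follows general seml conventions; the repo's actual configs should be used as the authoritative template.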

The arguments from the config are passed to the script that runs the Python process (https://github.com/theislab/cross_species_prediction/blob/main/notebooks/eval/run_seml.py), which calls the integration script for my models or scvi; similarly, a new script could be added and called for other methods (e.g. scGLUE).

The script called by seml passes params on to the integration scripts. Only the parameters actually used by a given integration script should be passed on, else it won't work as currently set up.
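One way to guard against passing unsupported params (not what the repo currently does, just a sketch; `run_scglue` and its arguments are hypothetical) is to filter the config dict against the integration function's signature:

```python
import inspect

def call_with_supported_params(func, params):
    """Pass only the keyword arguments that `func` actually accepts,
    silently dropping any other entries from the seml config."""
    accepted = set(inspect.signature(func).parameters)
    return func(**{k: v for k, v in params.items() if k in accepted})

# Hypothetical integration entry point for a new method
def run_scglue(path_adata, batch_key, seed=0):
    return {"path_adata": path_adata, "batch_key": batch_key, "seed": seed}

result = call_with_supported_params(
    run_scglue,
    {"path_adata": "adata.h5ad", "batch_key": "batch",
     "seed": 1, "group_key": "cell_type"},  # group_key is dropped
)
```

This keeps one shared config per dataset while letting each integration script accept only the params it needs.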

The results are then read from the shared directory for every dataset separately (for all successfully finished runs) and plotted in https://github.com/theislab/cross_species_prediction/blob/main/notebooks/eval/eval_summary_integration.py. I save Jupyter notebooks only on the server, not in git (I use jupytext to convert ipynb to py). My code is in /lustre/groups/ml01/code/karin.hrovatin/cross_species_prediction
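A minimal sketch of collecting per-run results from the shared directory (the file layout, naming scheme, and pickle format here are assumptions, not the repo's actual format):

```python
from pathlib import Path
import pickle

def collect_results(path_save, dataset):
    """Gather result files for one dataset from the shared results
    directory (path_save/integration); only files that exist are read,
    so unfinished runs are skipped naturally."""
    results = []
    for f in sorted(Path(path_save, "integration").glob(f"{dataset}*.pkl")):
        with open(f, "rb") as fh:
            results.append(pickle.load(fh))
    return results
```

The summary notebook would then build its comparison plots from the returned list, one dataset at a time.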

@Hrovatin
Collaborator Author

So, to add another method, one would ideally:

  • adapt the YAML files of every dataset to add the new run setting and any additionally needed params,
  • adapt the script called by seml so it can call the correct integration script, based on the run params, for the new method and pass on the correct params,
  • make a new integration script for the method, based on the current scripts.
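The second step above could be sketched as a small dispatch table (the registration mechanism and method names here are hypothetical; run_seml.py may dispatch differently):

```python
def make_dispatcher(integrators):
    """Return a runner that picks the integration routine by the
    `model` key in the run params."""
    def run(model, **params):
        if model not in integrators:
            raise ValueError(f"No integration script registered for {model!r}")
        return integrators[model](**params)
    return run

# Adding a new method is then one extra entry in the table;
# the lambdas stand in for the real integration entry points.
run = make_dispatcher({
    "scvi": lambda **p: ("scvi", p),
    "scglue": lambda **p: ("scglue", p),  # newly added method
})
```

With this shape, the seml-called script stays unchanged apart from registering the new method's entry point.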

@Hrovatin Hrovatin transferred this issue from another repository May 8, 2023