
Evaluation scripts #2

Open
Hrovatin opened this issue Mar 10, 2023 · 1 comment
@Hrovatin
Collaborator

Hrovatin commented Mar 10, 2023

Evaluation is done via seml.
Seml runs are driven by YAML config files.

These YAML files specify:

  • the datasets used (path_adata; the adata objects already contain the selected genes (the same in all runs so that results are comparable), normalized and count data, and the metadata used for integration (system + batch) and evaluation (e.g. cell types)),
  • the covariates (group_key, batch_key, system_key),
  • where the results are saved (path_save, output_dir; the directories path_save, path_save/logs, and path_save/integration must be created manually in advance),
  • additional seml configs,
  • the param configs and their names (necessary for easy plotting/eval afterwards, where the different methods are compared),
  • and the random seeds to be used.

Ideally, a new method is simply added to the existing YAML files; everything will then use the right datasets and be saved to the right location, making results easy to compare afterwards.
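As an illustrative sketch only (all paths and values below are placeholders, and the exact keys in the repo's configs may differ), a config along these lines would combine the standard seml sections with the parameters described above:

```yaml
# Hypothetical seml config sketch; keys under `fixed`/`grid` mirror the
# parameters described above, values are placeholders.
seml:
  executable: run_seml.py
  output_dir: /path/to/results/logs   # created manually in advance

fixed:
  path_adata: /path/to/adata.h5ad     # shared preprocessed data, same genes in all runs
  path_save: /path/to/results/        # plus path_save/logs and path_save/integration
  group_key: cell_type
  batch_key: batch
  system_key: system

grid:
  seed:
    type: choice
    options: [0, 1, 2]
```

The `seml`/`fixed`/`grid` layout follows general seml conventions; the repo's actual configs should be used as the authoritative template.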

The arguments from the config are passed to the script that runs the Python process (https://github.com/theislab/cross_species_prediction/blob/main/notebooks/eval/run_seml.py), which calls the integration script for my models or scvi; similarly, a new script could be added and called for other methods (e.g. scGLUE).

The script called by seml passes params on to the integration scripts. Only the parameters actually used by a given integration script should be passed on, else it won't work as currently set up.
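One way to guard against passing unsupported params (not what the repo currently does, just a sketch; `run_scglue` and its arguments are hypothetical) is to filter the config dict against the integration function's signature:

```python
import inspect

def call_with_supported_params(func, params):
    """Pass only the keyword arguments that `func` actually accepts,
    silently dropping any other entries from the seml config."""
    accepted = set(inspect.signature(func).parameters)
    return func(**{k: v for k, v in params.items() if k in accepted})

# Hypothetical integration entry point for a new method
def run_scglue(path_adata, batch_key, seed=0):
    return {"path_adata": path_adata, "batch_key": batch_key, "seed": seed}

result = call_with_supported_params(
    run_scglue,
    {"path_adata": "adata.h5ad", "batch_key": "batch",
     "seed": 1, "group_key": "cell_type"},  # group_key is dropped
)
```

This keeps one shared config per dataset while letting each integration script accept only the params it needs.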

The results are then read from the shared directory for every dataset separately (for all successfully finished runs) and plotted in https://github.com/theislab/cross_species_prediction/blob/main/notebooks/eval/eval_summary_integration.py. I save Jupyter notebooks only on the server, not in git (I use jupytext to convert ipynb to py). My code is in /lustre/groups/ml01/code/karin.hrovatin/cross_species_prediction
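A minimal sketch of collecting per-run results from the shared directory (the file layout, naming scheme, and pickle format here are assumptions, not the repo's actual format):

```python
from pathlib import Path
import pickle

def collect_results(path_save, dataset):
    """Gather result files for one dataset from the shared results
    directory (path_save/integration); only files that exist are read,
    so unfinished runs are skipped naturally."""
    results = []
    for f in sorted(Path(path_save, "integration").glob(f"{dataset}*.pkl")):
        with open(f, "rb") as fh:
            results.append(pickle.load(fh))
    return results
```

The summary notebook would then build its comparison plots from the returned list, one dataset at a time.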

@Hrovatin
Collaborator Author

So, to add another method, one would ideally:

  • adapt the YAML files of every dataset to add the new run setting and any additionally needed params,
  • adapt the script called by seml so it can call the correct integration script, based on the run params, for the new method and pass on the correct params,
  • make a new integration script for the method, based on the current scripts.
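The second step above could be sketched as a small dispatch table (the registration mechanism and method names here are hypothetical; run_seml.py may dispatch differently):

```python
def make_dispatcher(integrators):
    """Return a runner that picks the integration routine by the
    `model` key in the run params."""
    def run(model, **params):
        if model not in integrators:
            raise ValueError(f"No integration script registered for {model!r}")
        return integrators[model](**params)
    return run

# Adding a new method is then one extra entry in the table;
# the lambdas stand in for the real integration entry points.
run = make_dispatcher({
    "scvi": lambda **p: ("scvi", p),
    "scglue": lambda **p: ("scglue", p),  # newly added method
})
```

With this shape, the seml-called script stays unchanged apart from registering the new method's entry point.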

@Hrovatin Hrovatin transferred this issue from another repository May 8, 2023