BeckResearchLab/learn2thermML

learn2thermML

Machine learning of low and high temperature proteins

Getting started

Environment

Create and activate the environment specified in environment.yml

conda env create --file environment.yml
conda activate learn2thermML

Ensure that the following environment variables are set for pipeline execution:

  • LOGLEVEL (optional) - Logging level used when running the package, e.g. 'INFO' or 'DEBUG'
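
As an illustration of how an optional LOGLEVEL variable is typically consumed, here is a minimal sketch. It is not taken from the package: the function name resolve_log_level, the 'INFO' default, and the logger name are assumptions made for this example.

```python
import logging
import os


def resolve_log_level(default: str = "INFO") -> int:
    """Translate the LOGLEVEL environment variable into a logging constant.

    Hypothetical helper: the 'INFO' fallback is an assumption, since the
    README only says the variable is optional.
    """
    name = os.environ.get("LOGLEVEL", default).upper()
    level = getattr(logging, name, None)
    # Reject anything that is not one of the integer level constants.
    if not isinstance(level, int):
        raise ValueError(f"Unknown LOGLEVEL: {name!r}")
    return level


# Configure the root logger once, before any pipeline code emits messages.
logging.basicConfig(level=resolve_log_level())
logger = logging.getLogger("example")  # hypothetical logger name
logger.debug("Only shown when LOGLEVEL=DEBUG")
```

To try it, run export LOGLEVEL=DEBUG in the shell before starting Python.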

Execution

Data Version Control (DVC) is used to track data, parameters, metrics, and execution pipelines.

To use a DVC remote, see the documentation.

DVC tracked data, metrics, and models are found in ./data while scripts and parameters can be found in ./pipeline. To execute pipeline steps, run dvc exp run <stage-name> where stages are listed below:

  • ogt_protein_classifier_data_prep
  • ogt_protein_classifier_train_evaluate

Note that script execution is expected to occur with the top level as the current working directory, and paths are specified with respect to the repo top level.
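
The working-directory requirement can be checked before a stage is launched. The sketch below wraps the dvc exp run command in a small Python helper. The helper names are hypothetical, and testing for data/ and pipeline/ directories is only a heuristic for "repo top level", not something the repository prescribes.

```python
import subprocess
from pathlib import Path
from typing import List, Optional

# Stage names as listed in this README.
STAGES = (
    "ogt_protein_classifier_data_prep",
    "ogt_protein_classifier_train_evaluate",
)


def build_command(stage: str) -> List[str]:
    """Return the dvc command line for a known stage."""
    if stage not in STAGES:
        raise ValueError(f"Unknown stage: {stage!r}")
    return ["dvc", "exp", "run", stage]


def at_repo_top_level(cwd: Path) -> bool:
    """Heuristic (an assumption): the top level holds data/ and pipeline/."""
    return (cwd / "data").is_dir() and (cwd / "pipeline").is_dir()


def run_stage(stage: str, cwd: Optional[Path] = None) -> None:
    """Run one stage, refusing to start from the wrong directory."""
    cwd = cwd or Path.cwd()
    if not at_repo_top_level(cwd):
        raise RuntimeError(f"{cwd} does not look like the repo top level")
    # check=True raises CalledProcessError if dvc exits with a non-zero status.
    subprocess.run(build_command(stage), cwd=cwd, check=True)
```

Calling run_stage("ogt_protein_classifier_data_prep") from the repository root would then be equivalent to typing the dvc command by hand, with an early error if the directory is wrong.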

Python package

Installable, importable code is found in l2tml_utils and should already be installed after following the steps in the Environment section above.

Directory

-data/                                      # Contains DVC tracked data, models, and metrics
-pipeline/                                  # Contains DVC tracked executable pipeline steps and parameters
-notebooks/                                 # notebooks for testing and decision making
-environment.yml                            # Conda dependencies
-docs/                                      # repository documentation
-l2tml_utils/                               # python package
