deepmodeling/rid-kit

About Rid-kit

Rid-kit is a Python package for enhanced sampling with reinforced dynamics (RiD). It learns the free energy surface on the fly during an MD run and uses it as the bias potential in the next MD run. Its main advantage is the ability to handle a large number of collective variables (CVs), on the order of 100, so it can be used to simulate conformational changes of large molecules, for example in protein folding.

Rid-kit is based on dflow, so you can choose whether to run a workflow in a Kubernetes (k8s) environment. Running the workflow in a k8s environment is recommended unless that is not feasible.

For more information, check the documentation.

Use Rid-kit with Bohrium k8s

Quick start

The dflow team provides a community version of k8s, deepmodeling k8s, which makes Rid-kit very convenient to use. To use it, first register a Bohrium account and learn a few concepts (job, jobgroup, project id) from the Bohrium website documentation. After that, using rid-kit is straightforward.

Set the environment variables

Set the environment variables according to your personal Bohrium account information:

export DFLOW_HOST=https://workflows.deepmodeling.com
export DFLOW_K8S_API_SERVER=https://workflows.deepmodeling.com
export DFLOW_S3_REPO_KEY=oss-bohrium
export DFLOW_S3_STORAGE_CLIENT=dflow.plugins.bohrium.TiefblueClient
export BOHRIUM_USERNAME="<bohrium-email>"
export BOHRIUM_PASSWORD="<bohrium-password>"
export BOHRIUM_PROJECT_ID="<bohrium-project-id>"

A convenient way is to put these lines in a bash script such as env.sh and then run source env.sh.

Note that "bohrium-project-id" is the ID of a project under your own Bohrium account, which you can create in Bohrium Project after logging in to your account. You can create multiple project IDs for one account.
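Before submitting, it can help to confirm that all seven variables are actually set in the current shell. The following is a minimal sketch, not part of Rid-kit: the helper name `missing_vars` is hypothetical, and the variable names are copied from the export lines above.

```python
import os

# Variable names taken from the export lines above.
REQUIRED_VARS = (
    "DFLOW_HOST",
    "DFLOW_K8S_API_SERVER",
    "DFLOW_S3_REPO_KEY",
    "DFLOW_S3_STORAGE_CLIENT",
    "BOHRIUM_USERNAME",
    "BOHRIUM_PASSWORD",
    "BOHRIUM_PROJECT_ID",
)


def missing_vars(env=None):
    """Return the required variables that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All Bohrium/dflow variables are set.")
```

Run it in the same shell in which you ran source env.sh, since exported variables are only visible to processes started from that shell.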

Install Rid-kit

Install the latest rid-kit

pip install setuptools_scm
pip install -U rid-kit

Run an example

Change to the rid-kit directory

cd rid-kit

Run an example of Ala-dipeptide on Bohrium using dihedrals as CVs:

In the "machine_bohrium_k8s.json" file, set "email", "password" and "program_id" to your own Bohrium account information. Note that program_id in the machine JSON file must be an integer, not a string.
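The integer requirement on program_id is easy to get wrong when pasting the ID in quotes. The sketch below checks a parsed machine file for it. Because the layout of the machine JSON is not shown here, the code searches for the key at any nesting depth; the function names and the example fragments are hypothetical.

```python
import json


def find_values(node, key):
    """Yield every value stored under `key` anywhere in a parsed JSON tree."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)


def program_id_problems(machine_config):
    """Return the program_id values that are not plain integers."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    return [
        value
        for value in find_values(machine_config, "program_id")
        if isinstance(value, bool) or not isinstance(value, int)
    ]


# Hypothetical fragments; the real file has more fields and may nest differently.
good = json.loads('{"resources": {"email": "a@b.c", "program_id": 1234}}')
bad = json.loads('{"resources": {"email": "a@b.c", "program_id": "1234"}}')
print(program_id_problems(good))  # []
print(program_id_problems(bad))   # ['1234']
```

To check a real file, load it with json.load and pass the result to program_id_problems; an empty list means every program_id found is an integer.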

Then run this command to submit the jobs

rid submit -i ./tests/data/000 -c ./rid/template/rid_gmx_dih.json -m ./rid/template/machine_bohrium_k8s.json

After a successful submission, you can log in to your Bohrium account at deepmodeling k8s to monitor the job status; the workflow is displayed in the Argo UI.

If you want to run Rid-kit with your own customized CVs, an example is given below (see Configure simulations for more details).

Note that the lists under the angular_mask, weights and kappas keys in rid.json must all have the same length as the number of CVs you define.

rid submit -i ./tests/data/001 -c ./rid/template/rid_gmx_custom.json -m ./rid/template/machine_bohrium_k8s.json
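The length rule for angular_mask, weights and kappas noted above can be checked before submitting. This is a minimal sketch: the function name is hypothetical, and the lists are passed in directly because the exact location of these keys inside rid.json is not shown here. The example values are made up for illustration.

```python
def cv_length_problems(num_cvs, angular_mask, weights, kappas):
    """Return one message for every list whose length differs from num_cvs."""
    lists = {"angular_mask": angular_mask, "weights": weights, "kappas": kappas}
    return [
        f"{name} has {len(values)} entries, expected {num_cvs}"
        for name, values in lists.items()
        if len(values) != num_cvs
    ]


# Hypothetical values for a system with two customized CVs.
print(cv_length_problems(2, [1, 1], [1.0, 1.0], [500, 500]))
# []
print(cv_length_problems(2, [1, 1], [1.0], [500, 500]))
# ['weights has 1 entries, expected 2']
```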

You can also run the example on a Slurm machine (but you first need to configure a conda environment on the Slurm machine, see Installation):

rid submit -i ./tests/data/000 -c ./rid/template/rid_gmx_dih.json -m ./rid/template/machine_slurm_k8s.json

You can specify the workflow name by providing WORKFLOW_ID after "-d", for example:

rid submit -i ./tests/data/000 -c ./rid/template/rid_gmx_dih.json -m ./rid/template/machine_bohrium_k8s.json -d ala-dipeptide-1

Note that the workflow ID may contain only lowercase alphanumeric characters and the special character "-".
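The naming rule can be expressed as a regular expression. The sketch below encodes only the rule stated above (lowercase letters, digits and "-"); the function name is hypothetical, and the workflow backend may impose further limits that are not covered here.

```python
import re

# Only lowercase letters, digits and "-" are allowed, per the note above.
_WORKFLOW_ID = re.compile(r"[a-z0-9-]+")


def is_valid_workflow_id(workflow_id):
    """Return True if workflow_id uses only the allowed characters."""
    return _WORKFLOW_ID.fullmatch(workflow_id) is not None


print(is_valid_workflow_id("ala-dipeptide-1"))   # True
print(is_valid_workflow_id("Ala_Dipeptide_1"))   # False
```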

You can also use other types of CVs, such as distances or any customized CVs; for a detailed explanation, see Configure simulations.

Continue from an old workflow

Use resubmit to continue from an old workflow:

# suppose the original workflow id is OLD_ID
rid resubmit -i ./tests/data/000 -c ./rid/template/rid_gmx_dis.json -m ./rid/template/machine_bohrium_k8s.json OLD_ID -d NEW_ID 

If you want to resubmit from a particular iteration and step:

rid resubmit -i your_dir -c path_to_rid.json -m path_to_machine.json OLD_ID -t ITERATION-ID -p STEP-KEY -d NEW_ID

ITERATION-ID (counted from 1) is the number of the iteration, that is, the nth iteration the workflow has executed. STEP-KEY is one of the following steps in rid: prep-exploration, run-exploration, prep-select, run-select, prep-label, run-label, label-stats, collect-data, merge-data, train, model-devi.
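When resubmitting from scripts, a typo in STEP-KEY or an iteration number below 1 is easy to miss. The sketch below assembles the resubmit command from the flags shown above and validates both values first. The function name is hypothetical, and requiring -t and -p together is an assumption based on the documented form, which always shows them as a pair.

```python
import shlex

# Step keys as listed above.
STEP_KEYS = (
    "prep-exploration", "run-exploration", "prep-select", "run-select",
    "prep-label", "run-label", "label-stats", "collect-data",
    "merge-data", "train", "model-devi",
)


def resubmit_command(input_dir, rid_json, machine_json, old_id, new_id,
                     iteration=None, step_key=None):
    """Build the argument list for `rid resubmit` (hypothetical helper)."""
    # Assumption: the restart point is given as an iteration AND a step, or not at all.
    if (iteration is None) != (step_key is None):
        raise ValueError("give both iteration and step_key, or neither")
    cmd = ["rid", "resubmit", "-i", input_dir, "-c", rid_json,
           "-m", machine_json, old_id]
    if iteration is not None:
        if iteration < 1:
            raise ValueError("iterations are counted from 1")
        if step_key not in STEP_KEYS:
            raise ValueError(f"unknown step key: {step_key}")
        cmd += ["-t", str(iteration), "-p", step_key]
    cmd += ["-d", new_id]
    return cmd


cmd = resubmit_command("your_dir", "rid.json", "machine.json",
                       "old-id", "new-id", iteration=3, step_key="train")
print(shlex.join(cmd))
# rid resubmit -i your_dir -c rid.json -m machine.json old-id -t 3 -p train -d new-id
```

The returned list can be passed to subprocess.run as is, which avoids shell quoting issues in paths.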

Download files from the workflow

rid download WORKFLOW_ID -p STEP-KEY -f FILE_NAME -a ITERATION_START -e ITERATION_END -o OUTPUT_DIR

Typically we want the trajectory information from each exploration step. Suppose we ran a workflow for 20 iterations:

rid download WORKFLOW_ID -p run-exploration -f trajectory -a 1 -e 20 -o my_protein_out

A detailed explanation of the other files in a RiD run can be found as follows:

Reduce the dimension of free energy model

After the Rid-kit run, the workflow will have generated several free energy models (.pb). Rid-kit currently supports MCMC to reduce the dimension of the free energy model, for example:

rid redim -i ./tests/data/models -c ./rid/template/rid_mcmc_cv_dih.json -m ./rid/template/machine_bohrium_k8s_mcmc.json

Then you will get the projected free energy surface for ala-dipeptide

[Images: projected free energy surface of ala-dipeptide]

You can also place a .out file containing the CV output in the directory specified by the -i parameter; the CV output will then be plotted on top of the free energy surface.

Use Rid-kit with Local k8s

A tutorial on using Rid-kit with a k8s environment that you configure yourself can be found as follows:

Use Rid-kit without k8s environment

To run the workflow without a k8s environment, you can use the debug mode of dflow. In this mode, however, you cannot monitor the workflow in the Argo UI.

Install the computation environment

If you want to run Rid-kit on a local machine or server without a k8s environment, you need to configure a conda environment on your machine (see Installation).

Install the latest rid-kit

pip install setuptools_scm
pip install -U rid-kit

Run an example

To run the workflow locally on a Slurm machine, change to the rid-kit directory and run the following command (adjust the machine file to your Slurm configuration):

DFLOW_DEBUG=1 rid submit -i ./tests/data/000 -c ./rid/template/rid_gmx_dih.json -m ./rid/template/machine_slurm_local.json -d ala-dipeptide-1

Main procedure of RiD

RiD runs in iterations. Every iteration contains the tasks below:

  1. Biased MD;
  2. Restrained/Constrained MD;
  3. Training neural network.

Biased MD

Like metadynamics, RiD samples under a bias potential, which here is given by NN models. An uncertainty indicator directs the process of adding the bias potential.
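To make the idea of an uncertainty indicator concrete, the sketch below measures how much several models disagree on the force at one point in CV space. It assumes, without confirmation from this README, that the indicator is the spread of predictions across an ensemble of models; the function name and the numbers are hypothetical, and Rid-kit's actual implementation may differ.

```python
import math


def model_deviation(force_predictions):
    """Standard deviation of force vectors predicted by several models.

    force_predictions: one list of force components per model, all for the
    same point in CV space. Assumed definition, for illustration only.
    """
    n_models = len(force_predictions)
    n_components = len(force_predictions[0])
    mean = [
        sum(model[i] for model in force_predictions) / n_models
        for i in range(n_components)
    ]
    # Average squared distance of each model's force vector from the mean vector.
    variance = sum(
        sum((model[i] - mean[i]) ** 2 for i in range(n_components))
        for model in force_predictions
    ) / n_models
    return math.sqrt(variance)


# Models that agree give zero deviation; disagreement gives a positive value.
print(model_deviation([[1.0, 0.0], [1.0, 0.0]]))   # 0.0
print(model_deviation([[1.0, 0.0], [-1.0, 0.0]]))  # 1.0
```

Under this assumption, a small deviation suggests the region is already well learned, while a large one suggests the models should not yet be trusted there.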

Restrained/Constrained MD

This procedure calculates the mean force from the sampling results of restrained MD or constrained MD, which generates the data set for training.

Neural network training

A fully connected NN is trained on the sampled data. This network provides a map from the selected CVs to the free energy.
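Mean-force data can train a free energy model because the mean force is the negative gradient of the free energy with respect to the CVs. The sketch below illustrates that relation on a toy analytic surface, which is an assumption made for illustration and not the network used by Rid-kit: the analytic force is compared with a finite-difference gradient of the free energy.

```python
import math


def free_energy(cvs):
    """Toy free energy over dihedral-like CVs (hypothetical, for illustration)."""
    return sum(1.0 - math.cos(s) for s in cvs)


def mean_force(cvs):
    """Analytic mean force: the negative gradient of the toy free energy."""
    return [-math.sin(s) for s in cvs]


def numerical_force(energy_fn, cvs, step=1e-6):
    """Central finite-difference estimate of -dF/ds_i for each CV."""
    forces = []
    for i in range(len(cvs)):
        up = list(cvs)
        down = list(cvs)
        up[i] += step
        down[i] -= step
        forces.append(-(energy_fn(up) - energy_fn(down)) / (2.0 * step))
    return forces


point = [0.3, -1.2, 2.5]
for analytic, numeric in zip(mean_force(point), numerical_force(free_energy, point)):
    print(f"{analytic:+.6f} {numeric:+.6f}")
```

In a trained model the analytic surface would be replaced by the network, and its gradient would be fitted to the mean forces collected in the restrained or constrained MD step.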

A more detailed description of RiD has been published; please see:

[1] Zhang, L., Wang, H., & E, W. (2018). Reinforced dynamics for enhanced sampling in large atomic and molecular systems. The Journal of Chemical Physics, 148(12), 124113.

[2] Wang, D., Wang, Y., Chang, J., Zhang, L., & Wang, H. (2022). Efficient sampling of high-dimensional free energy landscapes using adaptive reinforced dynamics. Nature Computational Science, 2(1), 20-29.

Preparing files in the input directory

Rules for preparing files in the input directory can be found in

Configure simulations

Configure machine resources

Configure mcmc

Installation of environment on Slurm

If you submit the RiD workflow to Bohrium, you do not need to install the environment yourself; Bohrium will pull the Docker images to do the computation. If you submit the workflow to a Slurm machine, however, you have to install the computation environment yourself. Details of the installation can be found in

Installation with DeepMD potential support

Instructions for installing the computation environment with DeepMD potential support on Slurm machines can be found in

Workflow Synopsis

  • [Image: workflow synopsis]

Troubleshooting