OpenKnotScorePipeline

Python pipeline to generate OpenKnotScores for Eterna sequence libraries

How to use this pipeline

The notebooks in the notebooks directory are ordered and meant to be run in sequence, each one transforming the data for the next stage of the pipeline. In general, the stages are:

We start with an RDAT file containing reactivity data for an RNA sequence library. We extract the library (sequence, reactivity data, reads, etc.) into a dataframe. If you're planning to process the dataset in batch mode on Sherlock, refer to this notebook to split the dataframe into multiple subsets.
Next, we compute in silico predictions using a range of RNA structure prediction algorithms. The actual script to generate these predictions is available; this notebook provides more details if you're planning to run on Sherlock. If you do use batch processing on Sherlock to generate the predictions, you'll need to collate the processed subset files into a single dataframe for the next step. Alternatively, if you already have a CSV of predicted structures that was exported from this pipeline, you can merge that data instead.
Now that the sequence library has structure predictions, we can calculate the OpenKnotScore for each sequence. This step creates a new dataframe that adds the scoring details to the sequence library.
Finally, we extract relevant scoring details from the library and add them to the original RDAT file for upload to Eterna.
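
The split and collate steps used for batch processing can be sketched as below. This is a minimal illustration, not the code from the notebooks: the helper names (chunk_bounds, split_rows, collate_rows) and the example records are hypothetical, and plain lists of dictionaries stand in for the dataframe so the sketch runs without extra dependencies. With pandas, the same row ranges could be applied with df.iloc[start:stop], and the subsets recombined with pd.concat.

```python
def chunk_bounds(n_rows, n_chunks):
    """Return (start, stop) row ranges splitting n_rows into n_chunks near-equal parts."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    base, extra = divmod(n_rows, n_chunks)
    bounds = []
    start = 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one additional row.
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def split_rows(rows, n_chunks):
    """Split a sequence of records into contiguous subsets, preserving order."""
    return [rows[start:stop] for start, stop in chunk_bounds(len(rows), n_chunks)]


def collate_rows(subsets):
    """Concatenate processed subsets back into a single list, in subset order."""
    merged = []
    for subset in subsets:
        merged.extend(subset)
    return merged


# Hypothetical library records; real rows would come from the RDAT file.
library = [
    {"sequence": "GGGAAACCC", "reads": 120},
    {"sequence": "GGCAAAGCC", "reads": 85},
    {"sequence": "GCGAAACGC", "reads": 240},
    {"sequence": "GGGUUUCCC", "reads": 60},
    {"sequence": "GCCAAAGGC", "reads": 310},
]

subsets = split_rows(library, 2)      # one subset per batch job
restored = collate_rows(subsets)      # after all jobs have finished
```

Keeping the subsets contiguous and in order means the collated result has the same row order as the original library, which makes it easier to check that no sequences were lost between batch jobs.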

Notes on Sherlock processing

If you plan on running these notebooks/scripts on Stanford's Sherlock computing cluster (which is a good idea if you have a large sequence library to process), you may also want to review https://daslab.github.io/arnie/#/sherlock/environment for some tips on how to properly set up an arnie environment on Sherlock. The structure generation relies on having a wide range of folding algorithms available, and Python environments on Sherlock can be tricky.
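
Before submitting batch jobs, it can help to check that the folding packages are actually reachable from the environment. The sketch below assumes that arnie locates its packages through a file named by the ARNIEFILE environment variable, and that this file contains lines of the form "key: path"; check the arnie documentation linked above for the exact format. The function names are hypothetical and are not part of arnie or this pipeline.

```python
import os


def parse_arnie_file(text):
    """Parse 'key: path' lines, skipping blank lines and '#' comments (assumed format)."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        entries[key.strip()] = value.strip()
    return entries


def missing_paths(entries):
    """Return the keys whose configured path does not exist on this machine."""
    return sorted(key for key, path in entries.items() if not os.path.exists(path))


def check_arnie_environment(environ=os.environ):
    """Return keys with missing paths, or None if ARNIEFILE is not set."""
    arnie_file = environ.get("ARNIEFILE")
    if not arnie_file:
        return None
    with open(arnie_file) as handle:
        return missing_paths(parse_arnie_file(handle.read()))


if __name__ == "__main__":
    result = check_arnie_environment()
    if result is None:
        print("ARNIEFILE is not set")
    elif result:
        print("Paths not found for:", ", ".join(result))
    else:
        print("All configured paths exist")
```

An entry reported as missing is not necessarily an error, since some values in the file may not be filesystem paths; treat the output as a prompt to inspect the configuration, not as a definitive check.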
