This project contains the scripts to run the experiments and reproduce the figures in the manuscript "Old dog, new tricks: Exact seeding strategy improves RNA design performances".
The project is heavily based on ViennaRNA (RNAlib) and LinearBPDesign. The latter is installed as a git submodule. To initialize it, run
    git submodule update --init --recursive
Other dependencies required for the figures are numpy, pandas, seaborn, matplotlib, varnaapi, logomaker ...
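
Before running the notebooks, it can help to check which of the packages named above are importable in the current environment. The sketch below is not part of the project; it assumes that each package's import name equals the name listed above, and the helper name `missing_packages` is hypothetical. The dependency list above is open-ended, so the check is not exhaustive.

```python
from importlib.util import find_spec

# Names copied from the dependency list above (assumed to match import names).
# The list in this README ends with "...", so other packages may be needed too.
FIGURE_DEPENDENCIES = ["numpy", "pandas", "seaborn", "matplotlib", "varnaapi", "logomaker"]


def missing_packages(names):
    """Return the names that cannot be located as top-level importable modules."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(FIGURE_DEPENDENCIES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```
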
The folder script contains the scripts used to run the experiments described in the manuscript:
rundesign.py: designs a given target structure using RNAinverse with different seeds. For example, the command below returns 10 solutions for the structure ..(((((..((.((((.......)))).))..)))))... using the Biseparable seed with modulo up to 3.
    python script/rundesign.py -n 10 --seed linearbp -m 3 "..(((((..((.((((.......)))).))..)))))..."
immediate_solutions.py: samples seeds and checks whether they are T-designs (without RNAinverse). The command below draws 20 Boltzmann-sampled seeds for each of the 1st to 10th structures in xxx.txt (one structure per line, in dot-bracket notation).
    python script/immediate_solutions.py xxx.txt -n 20 -s 1 -e 10 --seed bpenergy
gen_ss.py: generates uniform or MFE structures. The command below uniformly generates 1,000 structures of 100 nt with helix length 3 or more.
    python script/gen_ss.py 100 -n 1000 --helix_length 3
parseDesignResult.py: evaluates (ensemble defect, diversity ...) the designs produced by rundesign.py and stores the results as a pandas DataFrame in a pickle file. The script assumes that the result for the target with index i is stored in puzzle_i.csv. The command below parses all puzzle_*.csv results under xxx and stores the DataFrame in yyy.pkl.
    python script/parseDesignResult.py xxx yyy.pkl
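
The puzzle_i.csv naming assumption can be illustrated with a small sketch that lists the result files in a folder and orders them by target index. This is not the code of parseDesignResult.py; the helper name `collect_puzzle_files` is hypothetical, and only the file naming scheme is taken from the description above.

```python
import re
from pathlib import Path

# Naming scheme taken from the description above: puzzle_<i>.csv
PUZZLE_PATTERN = re.compile(r"^puzzle_(\d+)\.csv$")


def collect_puzzle_files(result_dir):
    """Map each target index i to the path of puzzle_i.csv inside result_dir."""
    found = {}
    for path in Path(result_dir).glob("puzzle_*.csv"):
        match = PUZZLE_PATTERN.match(path.name)
        if match:  # skip names without a numeric index, e.g. puzzle_final.csv
            found[int(match.group(1))] = path
    # Sort numerically so that puzzle_10.csv comes after puzzle_2.csv.
    return dict(sorted(found.items()))
```

Calling `collect_puzzle_files("xxx")` returns a dictionary keyed by target index, which makes missing indices easy to spot before parsing. Since the output is described as a pickled pandas DataFrame, it can presumably be loaded back with `pandas.read_pickle("yyy.pkl")`.
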
The folder notebooks contains Jupyter notebooks to reproduce the figures in the manuscript; the file name of each notebook indicates what it reproduces.
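
Several scripts in the script folder take target structures in dot-bracket notation. As a reading aid, the sketch below parses such a string into base pairs, measures its helix lengths, and checks whether a sequence is compatible with it. It is independent of ViennaRNA and of the project's code; the function names are hypothetical, a helix is assumed to mean a maximal run of directly stacked pairs, and the allowed pairs are assumed to be the canonical and G-U wobble pairs. Compatibility is only a necessary condition for a sequence to be a design.

```python
CANONICAL_PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}


def pair_table(structure):
    """Return {i: j} for every base pair, 0-based, stored in both directions."""
    stack, pairs = [], {}
    for pos, char in enumerate(structure):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            left = stack.pop()
            pairs[left], pairs[pos] = pos, left
        elif char != ".":
            raise ValueError(f"unexpected character {char!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


def helix_lengths(structure):
    """Lengths of maximal runs of stacked pairs (i, j), (i+1, j-1), ..."""
    pairs = pair_table(structure)
    lengths = []
    for i, j in sorted(pairs.items()):
        if i > j:
            continue  # visit each pair once, from its opening side
        if pairs.get(i - 1) == j + 1:
            continue  # not the outermost pair of its helix
        length = 1
        while i + length < j - length and pairs.get(i + length) == j - length:
            length += 1
        lengths.append(length)
    return lengths


def is_compatible(sequence, structure):
    """True if every paired position holds a canonical or wobble pair."""
    if len(sequence) != len(structure):
        return False
    pairs = pair_table(structure)
    return all(sequence[i] + sequence[j] in CANONICAL_PAIRS
               for i, j in pairs.items() if i < j)


target = "..(((((..((.((((.......)))).))..)))))..."
print(helix_lengths(target))  # [5, 2, 4]
```

For the example structure used with rundesign.py, the helices have lengths 5, 2 and 4, so under the assumed definition it would not satisfy a minimum helix length of 3.
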