These are the complete instructions to install LIGYSIS from scratch.
DSSP is incompatible with all the other environments, so it must go in an environment of its own. You can also install it locally; all that is needed is the location of the executable. The version of the libboost library must be pinned to the one below; otherwise, DSSP will not run.
conda create -n DSSP salilab::dssp=3.0.0 libboost=1.73.0
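Since later steps need the location of the DSSP executable, the following is a minimal sketch for confirming where it is. find_mkdssp is a hypothetical helper, not part of DSSP or conda, and it assumes the executable is named mkdssp and sits in the bin directory of the environment prefix.

```shell
# Hypothetical helper: print the path to mkdssp inside an environment prefix.
# $1 = environment prefix, e.g. the directory of the DSSP conda environment.
find_mkdssp() {
    exe="$1/bin/mkdssp"
    if [ -x "$exe" ]; then
        # Found an executable file: print its full path.
        printf '%s\n' "$exe"
    else
        # Report the problem on stderr and signal failure.
        printf 'mkdssp not found under %s\n' "$1" >&2
        return 1
    fi
}
```

With a default conda setup, where environments live under the base installation, it could be called as find_mkdssp "$(conda info --base)/envs/DSSP". If environments are stored elsewhere, pass that prefix instead.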
HMMER is also installed in its own environment, simply because it is easier.
conda create -n HMMER hmmer=3.4
The first step to install LIGYSIS is to clone the repository with git.
# git clone LIGYSIS from repository
git clone -b running-arpeggio https://github.com/JavierSanchez-Utges/LIGYSIS.git
The following commands will create an environment and install pdbe-arpeggio.
# install pdbe-arpeggio environment
conda create -n ARPEGGIO python=3.9 gemmi openbabel biopython -c conda-forge
# activating pdbe-arpeggio environment
conda activate ARPEGGIO
# install pdbe-arpeggio
pip install pdbe-arpeggio
# test pdbe-arpeggio with help function
pdbe-arpeggio -h
# change directory to environments directory
cd ligysis_custom/ENVS
# install DEEP_LEARNING environment
conda env create -f DEEP_LEARNING.yml
The following instructions install VarAlign. For more information, refer to the VarAlign repository.
# install LIGYSIS environment
conda env create -f LIGYSIS.yml
# VarAlign installation (Grabbed from URL)
# change directory to main working directory
cd ../..
# git clone specific branch of VarAlign from repository
git clone -b JSU_branch https://github.com/bartongroup/SM_VarAlign.git
# change directory to VarAlign directory
cd SM_VarAlign
# activate LIGYSIS environment
conda activate LIGYSIS
# install VarAlign
pip install .
The following instructions install ProIntVar. For more information, refer to the ProIntVar repository.
# ProIntVar installation (Grabbed from URL)
# change directory to main working directory
cd ..
# git clone specific branch of ProIntVar from repository
git clone -b JSU_branch https://github.com/bartongroup/ProIntVar.git
# change directory to ProIntVar directory
cd ProIntVar
# pip install ProIntVar dependencies
pip install -r requirements.txt
# then install ProIntVar
python setup.py install
ProIntVar needs to be configured in order to run DSSP. The path to the DSSP binary must be added.
# change directory to VarAlign directory
cd ../SM_VarAlign
# set up ProIntVar configuration
ProIntVar-config-setup new_config.ini
# edit the following values in new_config.ini
dssp_bin = /path/to/anaconda/envs/DSSP/bin/mkdssp
# Update the settings according to user preferences and push them
ProIntVar-config-load new_config.ini
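The manual edit of dssp_bin described above can also be scripted before the configuration is loaded. This is a sketch only: set_ini_value is a hypothetical helper, and it assumes the key appears at the start of a line in the form key = value, as in the example above. The value must not contain the characters | or &, which are special to this sed expression.

```shell
# Hypothetical helper: replace the value of a key in an INI-style file.
# $1 = file, $2 = key, $3 = new value
set_ini_value() {
    tmp_file="$1.tmp"
    # Use | as the sed delimiter so that paths containing / need no escaping.
    # Write to a temporary file first, then move it over the original.
    sed "s|^$2[[:space:]]*=.*|$2 = $3|" "$1" > "$tmp_file" && mv "$tmp_file" "$1"
}
```

For example, set_ini_value new_config.ini dssp_bin /full/path/to/mkdssp would rewrite the dssp_bin line and leave every other line untouched.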
The configuration file can be found here. It includes the necessary paths to databases and binaries needed to run the pipeline. It looks like this:
### LIGYSIS CONFIG FILE ###
[paths]
## BINARIES
arpeggio_python_bin = /path/to/miniconda/envs/ARPEGGIO/bin/python
arpeggio_bin = /path/to/miniconda/envs/ARPEGGIO/bin/pdbe-arpeggio
## DATABASES
ensembl_sqlite = /path/to/.varalign/ensembl_cache.sqlite
gnomad_vcf = /path/to/gnomad/gnomad.exomes.r2.0.1.sites.vcf.gz
swissprot = /path/to/swissprot.fasta
[other]
max_retry = 3
sleep_time = 0.5
### END OF CONFIG FILE ###
These are mock paths and need to be replaced with the right ones for the pipeline to run successfully.
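Before running the pipeline, it may help to confirm that no mock path is left in the configuration file. The sketch below is not part of LIGYSIS: check_config_paths is a hypothetical helper that reads key = value lines, treats any value starting with / as a path, and reports those that do not exist on disk.

```shell
# Hypothetical helper: report config entries whose path does not exist.
# $1 = configuration file. Returns 1 if any path is missing, 0 otherwise.
check_config_paths() {
    status=0
    while IFS='=' read -r key value; do
        # Strip surrounding whitespace from the key and the value.
        key=$(printf '%s' "$key" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
        value=$(printf '%s' "$value" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
        case "$value" in
            /*)
                # Only absolute paths are checked; numbers, section
                # headers and comment lines are skipped.
                if [ ! -e "$value" ]; then
                    printf 'missing: %s = %s\n' "$key" "$value"
                    status=1
                fi
                ;;
        esac
    done < "$1"
    return "$status"
}
```

Calling check_config_paths on the configuration file prints one line per missing path and returns a non-zero status if any is found, so it can gate a pipeline launch in a script.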
It is necessary to apply a patch to pdbe-arpeggio, which crashes when CIF files do not contain the _chem_comp. fields. See GitHub issue here. This is the case for all the preferred biological assemblies in the PDBe. The patch is a quick fix until the issue is properly addressed. It simply comments out the lines that extract the _chem_comp. information, which is not needed in LIGYSIS.
This is how the patch is applied:
# change directory to where pdbe-arpeggio is installed
cd /path/to/environments/ARPEGGIO/lib/python3.9/site-packages/
# apply patch
git apply /path/to/LIGYSIS/OTHER/JSU_patch.diff
This should have commented out these 7 lines of code in the arpeggio/core/interactions.py file.
+ # self.component_types = protein_reader.get_component_types(filename)
+ # result_entry['bgn']['label_comp_type'] = self.component_types[utils.get_residue_name(contact.bgn_atom)]
+ # result_entry['end']['label_comp_type'] = self.component_types[utils.get_residue_name(contact.end_atom)]
+ # result_entry['bgn']['label_comp_type'] = self.component_types[utils.get_residue_name(contact.bgn_res)]
+ # result_entry['end']['label_comp_type'] = self.component_types[utils.get_residue_name(contact.end_res)]
+ # result_entry['bgn']['label_comp_type'] = self.component_types[utils.get_residue_name(contact.bgn_atom)]
+ # result_entry['end']['label_comp_type'] = self.component_types[utils.get_residue_name(contact.end_res)]
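A quick way to confirm the patch took effect is to count the commented lines that mention self.component_types, since all seven lines listed above contain it. This is a sketch under that assumption: count_commented_component_types is a hypothetical helper, and the count is only meaningful if the unpatched file had no such commented lines to begin with.

```shell
# Hypothetical helper: count commented-out lines that reference
# self.component_types in the given Python file.
# $1 = path to arpeggio/core/interactions.py
count_commented_component_types() {
    # Match lines whose first non-blank character is #.
    grep -c '^[[:space:]]*#.*self\.component_types' "$1"
}
```

Run from the site-packages directory used above, count_commented_component_types arpeggio/core/interactions.py should print 7 if the patch commented out exactly the lines listed.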