Welcome to LigTMap, a target and activity prediction method for small molecules. The method currently supports prediction for 17 target classes covering 6000+ protein targets. This code includes the main prediction workflow and all data and models, so it can be run offline on your own computer. However, for better visualization of the prediction results, our web server is recommended.
Visit our online server at https://cbbio.online/LigTMap/
The code is still at an early stage. Feedback and contributions are welcome to help make in silico target prediction a truly powerful method for novel drug discovery for everyone!
LigTMap requires Anaconda, RDKit, Openbabel, MOPAC2016, ODDT, PSOVina, MGLTools, and several Python libraries.
Specifically, our method has been tested with these versions:
- python 2.7 (from anaconda)
- rdkit-2016.03.4
- numpy-1.11.3
- openbabel-3.0.0
- pychem-1.0
- pybel-0.12.2
- scikit-learn-0.19.2
- scipy-1.1.0
- pandas-0.23.4
- boost-1.59.0
This code was tested on macOS 11.2 and on CentOS 7.6 and 7.8. We would be glad to know if it also works on your platform!
MOPAC2016 requires a license. Download the package from http://openmopac.net/MOPAC2016.html and request a license on the homepage; the license key will be emailed to you.
In essence, the installation steps are:
Create the directory:
% sudo mkdir -p /opt/mopac
% sudo chmod 777 /opt/mopac
Copy over the MOPAC executable and library that are obtained after unpacking the downloaded package:
% cp <source-path>/MOPAC2016.exe /opt/mopac
% cp <source-path>/libiomp5.so /opt/mopac
% chmod +x /opt/mopac/MOPAC2016.exe
Add the following lines to your .bashrc start-up script:
alias mopac='/opt/mopac/MOPAC2016.exe'
export LD_LIBRARY_PATH=/opt/mopac:$LD_LIBRARY_PATH
Source the start-up script, e.g.
% source ~/.bashrc
Install the license key that you have received in your email:
% /opt/mopac/MOPAC2016.exe <license-key>
Test the installation using the given example:
% mopac Example_data_set.mop
If the run completes and writes its output to Example_data_set.out, your installation is successful!
Download and install the latest version of Anaconda from https://www.anaconda.com/download/.
Simply run the Anaconda3-xxx.sh file and provide an installation directory, e.g.
% ./Anaconda3-2020.11-Linux-x86_64.sh
...
/home/user/opt/anaconda3
It's good to organize your program files in one central place like /home/user/opt/.
Create and activate a conda environment for LigTMap:
% conda create -n ligtmap -c rmg rdkit python=2.7
% conda activate ligtmap
After activation, your default python interpreter should be the one from the ligtmap env.
Check to confirm:
% which python
e.g. /home/user/opt/anaconda3/envs/ligtmap/bin/python
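The same check can be scripted. The following is a minimal sketch, assuming the environment lives under an `envs/<name>/bin/` directory as in the example path above; `python_in_env` is a hypothetical helper name, not part of LigTMap or conda.

```shell
# Hypothetical helper (not part of LigTMap or conda): succeed only if the
# given interpreter path ends in envs/<env-name>/bin/python.
python_in_env() {
    case "$1" in
        */envs/"$2"/bin/python) return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage (run after "conda activate ligtmap"):
#   python_in_env "$(command -v python)" ligtmap && echo "ligtmap env is active"
```

If the function fails, the environment is probably not activated in the current shell.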
Follow http://openbabel.org/wiki/Category:Installation to install the Openbabel build that suits your platform, for example with conda:
% conda install -c conda-forge openbabel
Download and install PyChem from https://code.google.com/archive/p/pychem/downloads
% tar xvfz pychem-1.0.tar.gz
% cd pychem-1.0
% python setup.py install
% python -m pip install pybel
% conda install -c conda-forge scikit-learn=0.19.2
% conda install -c conda-forge pandas=0.23.4
Make sure all of the libraries above are installed with the correct versions before installing ODDT:
% python -m pip install oddt
If you run into errors along the way and want to start over, remove the environment and repeat from the conda environment creation step:
% conda env remove --name ligtmap
Download and install boost_1_59_0.tar.gz from https://sourceforge.net/projects/boost/files/boost/1.59.0/ if boost is not yet on your system.
% tar xfz boost_1_59_0.tar.gz
% cd boost_1_59_0
% ./bootstrap.sh --prefix=/home/user/opt/boost-1.59.0
% ./b2 -j 4
% ./b2 install
Add boost to the library path in .bashrc
export LD_LIBRARY_PATH=$HOME/opt/boost-1.59.0/lib:$LD_LIBRARY_PATH
Once boost is in place, download and install psovina-2.0.tar.gz from https://sourceforge.net/projects/psovina/
% tar xfz psovina-2.0.tar.gz
% cd psovina-2.0/build/<your-platform>/release
Modify the Makefile to suit your system settings; specifically, give the location of boost, e.g.:
BASE=/home/user/opt/boost-1.59.0
% make
% mkdir /home/user/opt/psovina-2.0
% cp psovina psovina_split /home/user/opt/psovina-2.0
Make it accessible by adding the location of the compiled psovina to the PATH in .bashrc:
export PATH=/home/user/opt/psovina-2.0:$PATH
Download and install MGLTools of your platform from http://mgltools.scripps.edu/downloads
% tar xfz mgltools_x86_64Linux2_1.5.6.tar.gz
% mv mgltools_x86_64Linux2_1.5.6 /home/user/opt
% cd /home/user/opt/mgltools_x86_64Linux2_1.5.6
% ./install.sh
Follow the instructions at the end of the installation to add some variables to your .bashrc file.
On macOS, install the GNU utilities via Homebrew; in particular, gsplit is needed as an alternative to the Darwin split.
% brew install coreutils
Download and unpack ligtmap-0.1.tar.gz. You can move the program directory anywhere.
% tar xfz ligtmap-0.1.tar.gz
% mv ligtmap-0.1 /home/user/opt
Define the necessary environment variables in the .bashrc start-up script file:
export LIGTMAP=/home/user/opt/ligtmap-0.1
export MGLTools=/home/user/opt/mgltools_x86_64Linux2_1.5.6/
Finally, source the script file.
% source ~/.bashrc
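Before the first run, it can help to confirm that the variables and programs set up above are visible in the current shell. The following is a minimal sketch of such a check; `check_dir_var` and `check_tools` are hypothetical helper names, not part of LigTMap, and the list of programs to check is an assumption based on the installation steps in this document.

```shell
# Hypothetical helper (not part of LigTMap): report whether an environment
# variable is set and points to an existing directory.
check_dir_var() {
    name=$1
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "MISSING  $name is not set"
        return 1
    elif [ ! -d "$value" ]; then
        echo "MISSING  $name=$value is not a directory"
        return 1
    fi
    echo "OK       $name=$value"
}

# Hypothetical helper: report whether each named command is found on PATH.
# Returns non-zero if at least one command is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "OK       $tool"
        else
            echo "MISSING  $tool"
            missing=1
        fi
    done
    return $missing
}

# Example usage (names taken from the installation steps above):
#   check_dir_var LIGTMAP
#   check_dir_var MGLTools
#   check_tools python psovina psovina_split
```

Note that `mopac` is defined as an alias in .bashrc, so it is only visible to this kind of check in a shell where .bashrc has been sourced.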
- Prepare your molecule(s) to be predicted in input.smi, e.g. our benchmark molecules for HIV. Make sure you don't leave any empty lines in the file:
c1ccccc1Oc(ccc2)c(c23)n(c(=O)[nH]3)CC
c1c(C)cc(C)cc1Oc(ccc2)c(c23)n(c(=O)[nH]3)CC
N#Cc(c1)cc(Cl)cc1Oc(ccc2)c(c23)n(c(=O)[nH]3)CC
N#Cc(c1)cc(Cl)cc1Oc(ccc2)c(c23)n(C)c(=O)[nH]3
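Since empty lines in input.smi must be avoided, a quick check before running can save a failed job. The following is a minimal sketch using standard grep; `count_blank_lines` and `strip_blank_lines` are hypothetical helper names, not part of LigTMap, and lines containing only whitespace are treated as empty here (an assumption).

```shell
# Hypothetical helper: print the number of lines that are empty or
# contain only whitespace.
count_blank_lines() {
    grep -c '^[[:space:]]*$' "$1" || true
}

# Hypothetical helper: write a copy of the input file ($1) without
# empty or whitespace-only lines to the output file ($2).
strip_blank_lines() {
    grep -v '^[[:space:]]*$' "$1" > "$2" || true
}

# Example usage:
#   count_blank_lines input.smi
#   strip_blank_lines input.smi input.clean.smi && mv input.clean.smi input.smi
```

Writing to a separate file first keeps the original intact until the cleaned copy has been inspected.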
- Prepare the list of targets in target.lst. For a complete list of supported targets, refer to $LIGTMAP/target.lst.
HIV
HCV
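A misspelled target name is easy to miss, so the local list can be compared against the supported one. This is a minimal sketch, assuming that $LIGTMAP/target.lst holds one target name per line, spelled exactly as expected in your own target.lst; `unsupported_targets` is a hypothetical helper name, not part of LigTMap.

```shell
# Hypothetical helper: print every line of the user's list ($1) that does
# not appear as a whole line in the reference list ($2).
# Assumes both files contain one target name per line.
unsupported_targets() {
    grep -v -x -F -f "$2" "$1" || true
}

# Example usage:
#   unsupported_targets target.lst "$LIGTMAP/target.lst"
```

No output means every requested target was found in the reference list.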
- Activate the conda environment
% conda activate ligtmap
- Run the prediction
% $LIGTMAP/predict
The run will generate two directories, Input and Output.
Input stores each molecule's SMILES in a separate file: input_00001, input_00002, ...
Output stores the prediction results for each molecule in a separate directory.
If a previous run exists, its Input and Output directories will be backed up to Input.xxx and Output.xxx.
- Examine prediction results
In the summary section, a target class for which target proteins have been identified for the query molecule is marked Complete; otherwise it is marked Fail.
For a molecule Input_xxxxx, the top-ranked targets sorted by LigTMapScore can be found in Output/Input_xxxxx/IFP_result.csv.
This file contains 9 columns of data for the identified targets:
- PDB
- Class
- TargetName
- LigandName
- LigandSimilarityScore
- BindingSimilarityScore
- LigTMapScore
- PredictedAffinity
- DockingScore
The binding mode (PDB) of the molecule at the target protein can be found in the corresponding directory Output/Input_xxxxx/TargetName/Complex.
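For a quick look at the results on the command line, the top rows of IFP_result.csv can be trimmed to a few columns. This is a minimal sketch under several assumptions: the file is comma-separated, has one header row, stores the columns in the order listed above, and contains no commas inside fields. `top_targets` is a hypothetical helper name, not part of LigTMap.

```shell
# Hypothetical helper: print the header and the first N data rows
# (default 5) of a results file, keeping only columns 1, 3 and 7
# (PDB, TargetName, LigTMapScore under the assumed column order).
top_targets() {
    head -n "$(( ${2:-5} + 1 ))" "$1" | cut -d, -f1,3,7
}

# Example usage, top 3 targets for every molecule of a run:
#   for f in Output/*/IFP_result.csv; do
#       echo "== $f"
#       top_targets "$f" 3
#   done
```

Because the file is described as already sorted by LigTMapScore, the sketch does not re-sort the rows.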
Our method paper is currently under review:
Shaikh, Faraz; Tai, Hio Kuan; Desai, Nirali; Siu, Shirley (2020): LigTMap: Ligand and Structure-Based Target Identification and Activity Prediction for Small Molecular Compounds. ChemRxiv. Preprint. https://doi.org/10.26434/chemrxiv.12923474.v2
Developer: Faraz Shaikh ([email protected]), Giotto Tai ([email protected])
Project PI: Shirley Siu ([email protected] | [email protected] | https://twitter.com/ShirleyWISiu)
Computational Biology and Bioinformatics Lab (CBBIO)
University of Macau