Commit

merged no_align
bjornwallner committed Mar 23, 2024
2 parents 796b927 + 5b5be5a commit 327aac8
Showing 11 changed files with 165 additions and 15,066 deletions.
48 changes: 48 additions & 0 deletions .github/workflows/main.yml
@@ -0,0 +1,48 @@
name: CI

on: [push, pull_request, workflow_dispatch]

env:
  MIN_COVERAGE_REQUIRED: 50


jobs:
  test:
    if: github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name != github.event.pull_request.base.repo.full_name
    runs-on: ubuntu-latest
    timeout-minutes: 10
    defaults:
      run:
        shell: bash -l {0}

    steps:
      - uses: actions/checkout@v3
      - name: Test outputs
        run: |
          python -m pip install .
          python -m pip install coverage
          bash run_test.sh
      - name: Test with old biopython
        run: |
          python -m pip install biopython==1.79
          bash run_test.sh
      - name: Coverage
        run: |
          test $(coverage report | grep TOTAL | awk '{ print $4 }' | tr -d "%") -ge $MIN_COVERAGE_REQUIRED
  #test_oldbio:
  #  if: github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name != github.event.pull_request.base.repo.full_name
  #  runs-on: ubuntu-latest
  #  timeout-minutes: 10
  #  defaults:
  #    run:
  #      shell: bash -l {0}
  #
  #  steps:
  #    - uses: actions/checkout@v3
  #    - name: Test pipeline output
  #      run: |
  #        python -m pip install biopython==1.79
  #        python -m pip install coverage
  #        python -m pip install .
  #        bash ./run_test.sh
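
The `Coverage` step in the workflow above reads the `TOTAL` line of `coverage report` and compares the percentage with `MIN_COVERAGE_REQUIRED`. A minimal Python sketch of the same check follows. The helper names and the sample report are illustrative assumptions, not part of the repository, and the sketch takes the last column of the `TOTAL` line rather than a fixed fourth column.

```python
def total_coverage(report_text: str) -> int:
    """Return the integer percentage from the TOTAL line of a coverage report."""
    for line in report_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "TOTAL":
            # The last column holds the percentage, e.g. "75%".
            return int(fields[-1].rstrip("%"))
    raise ValueError("no TOTAL line found in coverage report")


def meets_threshold(report_text: str, minimum: int = 50) -> bool:
    """Mirror the shell test `<total> -ge $MIN_COVERAGE_REQUIRED`."""
    return total_coverage(report_text) >= minimum


# Made-up report text, used only to exercise the helpers.
sample = """\
Name       Stmts   Miss  Cover
------------------------------
DockQ.py     400    100    75%
------------------------------
TOTAL        400    100    75%
"""

print(meets_threshold(sample, minimum=50))
```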
674 changes: 0 additions & 674 deletions LICENSE

This file was deleted.

255 changes: 81 additions & 174 deletions README.md
@@ -1,64 +1,90 @@
# DockQ
Requires python packages: `numpy` and `Biopython`
![CI status](https://github.com/bjornwallner/DockQ/actions/workflows/main.yml/badge.svg)

Installation
# DockQ: A Quality Measure for Protein-Protein Docking Models

Just clone the repository; the main script `DockQ.py` is in the cloned folder:
## Installation

Clone the repository, then install the necessary libraries with `pip`:

```
git clone https://github.com/bjornwallner/DockQ/
cd DockQ
pip install .
```

Install (i) `numpy` (a prerequisite to install 'Biopython') and (ii) `Biopython`
## Quick start:

- Numpy: http://www.scipy.org/install.html
- Biopython: http://biopython.org/wiki/Download#Installation_Instructions
After installing DockQ with `pip`, the `DockQ` binary will be in your path. Just run DockQ with:

This version of DockQ has been tested with numpy v1.21.6 and biopython v1.79.
`DockQ <model> <native>`

Quick start for two interacting partners (two-chain-models) run with:
**Example**

`./DockQ.py <model> <native>`
When running DockQ on model/native complexes with one or more interfaces, you will get a result for each interface. Results are computed to maximise the average DockQ across all interfaces:

Example
```
$ DockQ examples/1A2K_r_l_b.model.pdb examples/1A2K_r_l_b.pdb
****************************************************************
* DockQ *
* Scoring function for protein-protein docking models *
* Statistics on CAPRI data: *
* 0.00 <= DockQ < 0.23 - Incorrect *
* 0.23 <= DockQ < 0.49 - Acceptable quality *
* 0.49 <= DockQ < 0.80 - Medium quality *
* DockQ >= 0.80 - High quality *
* Ref: S. Basu and B. Wallner, DockQ: A quality measure for *
* protein-protein docking models *
* doi:10.1371/journal.pone.0161879 *
* For comments, please email: [email protected] *
****************************************************************
Model : examples/1A2K_r_l_b.model.pdb
Native : examples/1A2K_r_l_b.pdb
Total DockQ over 3 native interfaces: 1.959
Native chains: A, B
Model chains: B, A
DockQ_F1: 0.996
DockQ: 0.994
irms: 0.000
Lrms: 0.000
fnat: 0.983
Native chains: A, C
Model chains: B, C
DockQ_F1: 0.567
DockQ: 0.511
irms: 1.237
Lrms: 6.864
fnat: 0.333
Native chains: B, C
Model chains: A, C
DockQ_F1: 0.500
DockQ: 0.453
irms: 2.104
Lrms: 8.131
fnat: 0.500
```
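
The quality classes in the banner can be applied to the per-interface scores with a few lines of Python. This is a minimal sketch: `dockq_quality` is a hypothetical helper, not a function exported by DockQ, and the scores are copied from the example output above.

```python
def dockq_quality(dockq: float) -> str:
    """Map a DockQ score to the quality class listed in the DockQ banner."""
    if dockq >= 0.80:
        return "High quality"
    if dockq >= 0.49:
        return "Medium quality"
    if dockq >= 0.23:
        return "Acceptable quality"
    return "Incorrect"


# Per-interface DockQ scores from the example, keyed by native chain pair.
scores = {("A", "B"): 0.994, ("A", "C"): 0.511, ("B", "C"): 0.453}

for (chain1, chain2), score in scores.items():
    print(f"{chain1}-{chain2}: {score:.3f} {dockq_quality(score)}")
```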

bash$ ./DockQ.py examples/model.pdb examples/native.pdb
***********************************************************
* DockQ *
* Scoring function for protein-protein docking models *
* Statistics on CAPRI data: *
* 0 < DockQ < 0.23 - Incorrect *
* 0.23 <= DockQ < 0.49 - Acceptable quality *
* 0.49 <= DockQ < 0.80 - Medium quality *
* DockQ >= 0.80 - High quality *
* Reference: Sankar Basu and Bjorn Wallner, DockQ:... *
* For comments, please email: [email protected] *
***********************************************************
Number of equivalent residues in chain A 1492 (receptor)
Number of equivalent residues in chain B 912 (ligand)
Fnat 0.533 32 correct of 60 native contacts
Fnonnat 0.238 10 non-native of 42 model contacts
iRMS 1.232
LRMS 1.516
CAPRI Medium
DockQ_CAPRI Medium
DockQ 0.700
A more compact output option is available with the flag `--short`:

```
$ DockQ examples/1A2K_r_l_b.model.pdb examples/1A2K_r_l_b.pdb --short
DockQ 0.994 DockQ_F1 0.996 Fnat 0.983 iRMS 0.000 LRMS 0.000 Fnonnat 0.008 clashes 0 mapping BA:AB examples/1A2K_r_l_b.model.pdb B A -> examples/1A2K_r_l_b.pdb A B
DockQ 0.511 DockQ_F1 0.567 Fnat 0.333 iRMS 1.237 LRMS 6.864 Fnonnat 0.000 clashes 0 mapping BC:AC examples/1A2K_r_l_b.model.pdb B C -> examples/1A2K_r_l_b.pdb A C
DockQ 0.453 DockQ_F1 0.500 Fnat 0.500 iRMS 2.104 LRMS 8.131 Fnonnat 0.107 clashes 0 mapping AC:BC examples/1A2K_r_l_b.model.pdb A C -> examples/1A2K_r_l_b.pdb B C
Help page
```
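
Each `--short` line is a flat sequence of key/value pairs followed by the file and chain names, which makes it easy to post-process. The sketch below assumes the field layout shown in the example (eight key/value pairs, ending with `mapping`); `parse_short_line` is a hypothetical helper, not part of DockQ.

```python
def parse_short_line(line: str) -> dict:
    """Parse the leading key/value pairs of one `--short` output line."""
    tokens = line.split()
    record = {}
    # Assumed layout: the first 16 tokens are 8 key/value pairs.
    for key, value in zip(tokens[0:16:2], tokens[1:16:2]):
        if key == "mapping":
            record[key] = value          # e.g. "BA:AB"
        elif key == "clashes":
            record[key] = int(value)
        else:
            record[key] = float(value)
    return record


line = (
    "DockQ 0.994 DockQ_F1 0.996 Fnat 0.983 iRMS 0.000 LRMS 0.000 "
    "Fnonnat 0.008 clashes 0 mapping BA:AB "
    "examples/1A2K_r_l_b.model.pdb B A -> examples/1A2K_r_l_b.pdb A B"
)
record = parse_short_line(line)
print(record["DockQ"], record["mapping"])
```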
bash$ ./DockQ.py -h
usage: DockQ.py [-h] [-short] [-verbose] [-useCA] [-skip_check] [-no_needle]
[-perm1] [-perm2]
[-model_chain1 model_chain1 [model_chain1 ...]]
[-model_chain2 model_chain2 [model_chain2 ...]]
[-native_chain1 native_chain1 [native_chain1 ...]]
[-native_chain2 native_chain2 [native_chain2 ...]]
<model> <native>

**Other uses**

Run DockQ with `-h/--help` to see a list of the available flags:

```
bash$ DockQ -h
usage: DockQ [-h] [--capri_peptide] [--short] [--verbose] [--use_CA] [--no_align] [--optDockQF1] [--allowed_mismatches ALLOWED_MISMATCHES] [--mapping MODELCHAINS:NATIVECHAINS]
<model> <native>
DockQ - Quality measure for protein-protein docking models
@@ -68,138 +94,19 @@ positional arguments:
optional arguments:
-h, --help show this help message and exit
-short short output
-verbose talk a lot!
-useCA use CA instead of backbone
-skip_check skip initial check fo speed up on two chain examples
-no_needle do not use global alignment to fix residue numbering
between native and model during chain permutation (use
only in case needle is not installed, and the residues
between the chains are identical
-perm1 use all chain1 permutations to find maximum DockQ
(number of comparisons is n! = 24, if combined with
-perm2 there will be n!*m! combinations
-perm2 use all chain2 permutations to find maximum DockQ
(number of comparisons is n! = 24, if combined with
-perm1 there will be n!*m! combinations
-model_chain1 model_chain1 [model_chain1 ...]
pdb chain order to group together partner 1
-model_chain2 model_chain2 [model_chain2 ...]
pdb chain order to group together partner 2
(complement to partner 1 if undef)
-native_chain1 native_chain1 [native_chain1 ...]
pdb chain order to group together from native partner
1
-native_chain2 native_chain2 [native_chain2 ...]
pdb chain order to group together from native partner
2 (complement to partner 1 if undef)
```


##### Multi-chain functionality

For targets with more than two interacting chains (for instance, a
dimer interacting with a partner), there are options to control which
chains to group together and in which order to combine
them. There are also options to try all possible chain combinations
(`-perm1` and `-perm2`). This is important if, for instance, a
homo-oligomer is interacting asymmetrically with a third partner, or if
there are symmetries that make multiple solutions possibly correct.

If there are missing residues, this mode requires the global
alignment program `needle` from the [EMBOSS
package](http://emboss.sourceforge.net/download/) to be in your
path, i.e. `which needle` should return its location.

This mode is illustrated by a homodimer that is interacting with a
third partner asymmetrically (A,B) <-> C (1A2K from docking benchmark
5.0).

The following commands will put the chains A,B as one partner and the
remaining chain, C, as the second partner. It will assume the chain
naming is the same in the model protein:

`./DockQ.py examples/1A2K_r_l_b.model.pdb examples/1A2K_r_l_b.pdb -native_chain1 A B -model_chain1 A B -native_chain2 C -model_chain2 C`

Assuming the chains are the same in the model and native, it is enough to specify one set of chains to group; the second group will be formed from the complement, using the remaining chains.

`./DockQ.py examples/1A2K_r_l_b.model.pdb examples/1A2K_r_l_b.pdb -native_chain1 A B`
(chain C is remaining)

These are also equivalent:

`./DockQ.py examples/1A2K_r_l_b.model.pdb examples/1A2K_r_l_b.pdb -native_chain1 C`
(chains A and B are remaining)

`./DockQ.py examples/1A2K_r_l_b.model.pdb examples/1A2K_r_l_b.pdb -native_chain1 A B -model_chain1 A B`

This will reverse the relative chain order of AB, comparing modelBA with nativeAB interacting with chain C:

`./DockQ.py examples/1A2K_r_l_b.model.pdb examples/1A2K_r_l_b.pdb -native_chain1 A B -model_chain1 B A` (observe how the score increases)


Try all permutations for model_chain1 and observe that the reverse
mapping BA -> AB gets the best score:

```
bash$ ./DockQ.py examples/1A2K_r_l_b.model.pdb examples/1A2K_r_l_b.pdb -native_chain1 A B -perm1
1/2 AB -> C 0.00972962403319
Current best 0.00972962403319
2/2 BA -> C 0.476267208558
Current best 0.476267208558
Best score ( 0.476267208558 ) found for model -> native, chain1:BA -> AB chain2:C -> C
***********************************************************
* DockQ *
* Scoring function for protein-protein docking models *
* Statistics on CAPRI data: *
* 0.00 <= DockQ < 0.23 - Incorrect *
* 0.23 <= DockQ < 0.49 - Acceptable quality *
* 0.49 <= DockQ < 0.80 - Medium quality *
* DockQ >= 0.80 - High quality *
* Reference: Sankar Basu and Bjorn Wallner, DockQ:... *
* For comments, please email: [email protected] *
***********************************************************
Model : examples/1A2K_r_l_b.model.pdb
Native : examples/1A2K_r_l_b.pdb
Best score ( 0.476267208558 ) found for model -> native, chain1:BA -> AB chain2:C -> C
Number of equivalent residues in chain A 248 (receptor)
Number of equivalent residues in chain B 196 (ligand)
Fnat 0.491 26 correct of 53 native contacts
Fnonnat 0.103 3 non-native of 29 model contacts
iRMS 1.988
LRMS 7.300
CAPRI Medium
DockQ_CAPRI Acceptable
DockQ 0.476
--capri_peptide use version for capri_peptide (DockQ cannot not be trusted for this setting)
--short short output
--verbose, -v talk a lot!
--use_CA, -ca use CA instead of backbone
--no_align Do not align native and model using sequence alignments, but use the numbering of residues instead
--optDockQF1 optimize on DockQ_F1 instead of DockQ
--allowed_mismatches ALLOWED_MISMATCHES
number of allowed mismatches when mapping model sequence to native sequence.
--mapping MODELCHAINS:NATIVECHAINS
Specify a chain mapping between model and native structure. If the native contains two chains "H" and "L" while the model contains two chains "A" and "B",
and chain A is a model of native chain H and chain B is a model of native chain L, the flag can be set as: '--mapping AB:HL'. This can also help limit the
search to specific native interfaces. For example, if the native is a tetramer (ABCD) but the user is only interested in the interface between chains B and
C, the flag can be set as: '--mapping :BC' or the equivalent '--mapping *:BC'.
```
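
To make the `--mapping MODELCHAINS:NATIVECHAINS` convention from the help text concrete, here is a minimal sketch of how such a string could be interpreted. It is not DockQ's own parser: `parse_mapping` is a hypothetical helper, and it assumes single-character chain identifiers.

```python
def parse_mapping(mapping: str):
    """Split 'MODELCHAINS:NATIVECHAINS' into (model->native dict or None, native chains).

    Assumes single-character chain IDs. An empty or '*' model part means
    the model chains are left unspecified and only the native chains
    are restricted.
    """
    model_part, sep, native_part = mapping.partition(":")
    if not sep:
        raise ValueError("mapping must contain ':'")
    native_chains = list(native_part)
    if model_part in ("", "*"):
        return None, native_chains
    model_chains = list(model_part)
    if len(model_chains) != len(native_chains):
        raise ValueError("model and native parts must have the same length")
    return dict(zip(model_chains, native_chains)), native_chains


print(parse_mapping("AB:HL"))  # model A pairs with native H, model B with native L
print(parse_mapping(":BC"))    # only the native B/C interface is of interest
print(parse_mapping("*:BC"))   # equivalent to ':BC'
```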


To try all permutations for model_chain1 and model_chain2 (partner 2 has only one chain in this example):

`./DockQ.py examples/1A2K_r_l_b.model.pdb examples/1A2K_r_l_b.pdb -native_chain1 A B -perm1 -perm2`

For a dimer interacting with one partner this is only 2 (2!\*1!).
For larger oligomers, however, the number of possible permutations
explodes: for two tetramers the number will be 576 (4!\*4!).
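
The counts quoted here follow directly from n!\*m!, where n and m are the numbers of chains in the two partners. A small sketch using only the Python standard library (the function name is an illustrative choice):

```python
from itertools import permutations
from math import factorial


def permutation_count(n_chains1: int, n_chains2: int = 1) -> int:
    """Number of chain orderings to compare when both partners are permuted."""
    return factorial(n_chains1) * factorial(n_chains2)


print(permutation_count(2, 1))  # dimer with one partner
print(permutation_count(4))     # one tetramer permuted, other partner fixed
print(permutation_count(4, 4))  # two tetramers, both permuted

# Enumerate the orderings for the dimer case explicitly.
orderings = ["".join(p) for p in permutations("AB")]
print(orderings)
```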

Multimeric biological assemblies of proteins (of higher order than
dimers) are also found to interact in nature, e.g. PDB IDs
3J07, 4IXZ, 4IY7, 4IYO, 3IYW, 2VQ0, 1E57 and 2IZN. They are particularly
common in viral envelopes and capsids. We chose an instance of two
interacting tetramers (1EXB) to further demonstrate the multi-chain
functionality of DockQ.

Tetramer example (24 combinations):

`./DockQ.py examples/1EXB_r_l_b.model.pdb examples/1EXB_r_l_b.pdb -native_chain1 A B C D -perm1`

Tetramer example with all possible permutations (576 combinations):

`./DockQ.py examples/1EXB_r_l_b.model.pdb examples/1EXB_r_l_b.pdb -native_chain1 A B C D -perm1 -perm2`


Binary file added examples/1EXB.cif.gz
Binary file not shown.