Tutorial: Martini 3 IDPs
We can use the Martini 3 library in Polyply to generate topologies for intrinsically disordered proteins (IDPs) from a sequence FASTA file. These topologies have been modified from the default Martini 3 amino acid topologies to adjust protein-water interactions and to improve the bonded parameters for IDPs.
PLEASE NOTE: This is not designed to generate topologies for Martini 3 proteins with folded domains. If you are not sure you are working with an IDP, check your structure and sequence before using this tool. Metapredict may be a useful tool for checking the sequence in particular. If you have a multidomain protein, it is best to use Martinize2 as described in the Martini 3 Go model paper. Where such a protein contains significant disordered regions, Martinize2 can selectively apply the disordered parameters to those regions. For more detail on how disordered domains can be handled in Martinize2, please see the documentation.
Please cite the preprint that describes this work.
To begin, we need a FASTA file with the IDP sequence. The FASTA file must specify PROTEIN in the header for polyply to interpret it correctly. Here, we use the example of an artificial disordered protein, as designed by Dzuricky et al., with 10 octapeptide repeat units. We'll call the file WT10.fasta:
> WT10 PROTEIN
SKGPGRGDSPYSGRGDSPYSGRGDSPYSGRGDSPYSGRGDSPYSGRGDSPYSGRGDSPYSGRGDSPYSGRGDSPYSGRGDSPYSGY
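If the FASTA file is written from a script, the header requirement above is easy to check before calling polyply. The following is a minimal sketch, not part of Polyply: the helper names `write_protein_fasta` and `header_marks_protein` are hypothetical, and the check only looks for the word PROTEIN in the first header line.

```python
from pathlib import Path


def write_protein_fasta(path, name, sequence):
    """Write a single-record FASTA file whose header carries the PROTEIN tag.

    Hypothetical helper for illustration; it is not part of Polyply.
    """
    # Remove whitespace and normalise to upper-case one-letter codes.
    sequence = "".join(sequence.split()).upper()
    if not sequence.isalpha():
        raise ValueError("sequence must contain only one-letter amino acid codes")
    path = Path(path)
    path.write_text(f"> {name} PROTEIN\n{sequence}\n")
    return path


def header_marks_protein(path):
    """Return True if the first header line contains the word PROTEIN."""
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                return "PROTEIN" in line[1:].split()
    return False
```

Calling `write_protein_fasta("WT10.fasta", "WT10", sequence)` with the sequence above should reproduce the file shown, and `header_marks_protein("WT10.fasta")` can then serve as a quick sanity check.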
Once we have our disordered sequence, we can use the gen_params program of Polyply to generate the simulation input topology:
polyply gen_params -seqf WT10.fasta -name WT10 -o WT10.itp -lib martini3
This generates a topology file, WT10.itp, containing the parameters for the input protein.
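When several sequences need topologies, the same command can be driven from Python. The sketch below only assembles and runs the command line shown above, using the flags that appear in this tutorial (`-seqf`, `-name`, `-o`, `-lib`). The function names are hypothetical, and we assume the `polyply` executable is on the PATH.

```python
import shutil
import subprocess


def gen_params_command(fasta, name, out_itp, lib="martini3"):
    """Assemble the gen_params call from this tutorial as an argument list."""
    return [
        "polyply", "gen_params",
        "-seqf", str(fasta),
        "-name", name,
        "-o", str(out_itp),
        "-lib", lib,
    ]


def run_gen_params(fasta, name, out_itp, lib="martini3"):
    """Run polyply gen_params and raise if it is missing or exits with an error."""
    if shutil.which("polyply") is None:
        raise RuntimeError("polyply executable not found on PATH")
    subprocess.run(gen_params_command(fasta, name, out_itp, lib), check=True)
    return out_itp
```

For the example above, `run_gen_params("WT10.fasta", "WT10", "WT10.itp")` issues the same command as the one typed by hand.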
The Martini 3 approach to IDPs uses virtual Go sites along the backbone to effectively adjust the backbone-water interaction. Using the protocol above to generate parameters automatically introduces these virtual sites (along with other improved bonded interactions). For example, the first few residues of the topology for the WT10 IDP discussed above now read:
[ atoms ]
1 Q5 1 SER BB 1 1
2 TP1 1 SER SC1 1 0.0
3 VS 1 SER CA 1 0.0 0.0
4 P2 2 LYS BB 2 0.0
5 SC3 2 LYS SC1 2 0.0
6 SQ4p 2 LYS SC2 2 1.0
7 VS 2 LYS CA 2 0.0 0.0
8 SP1 3 GLY BB 3 0.0
9 VS 3 GLY CA 3 0.0 0.0
10 SP2a 4 PRO BB 4 0.0
11 SC3 4 PRO SC1 4 0.0
12 VS 4 PRO CA 4 0.0 0.0
...
where an atom called CA of type VS has been introduced into each residue. Before using the input files generated with this method, you must ensure that in your main itp file:
- VS is defined in your [ atomtypes ] directive, e.g.:
...
[ atomtypes ]
...
TX1er 36.0 0.000 A 0.0 0.0
W 72.0 0.000 A 0.0 0.0
SW 54.0 0.000 A 0.0 0.0
TW 36.0 0.000 A 0.0 0.0
U 24.0 0.000 A 0.0 0.0
VS 0.00 0.000 V 0.0 0.0
[ nonbond_params ]
P6 P6 1 4.700000e-01 4.990000e+00
P6 P5 1 4.700000e-01 4.730000e+00
P6 P4 1 4.700000e-01 4.480000e+00
...
- An interaction is defined between VS and W in your [ nonbond_params ] directive, e.g.:
...
TX2er SQ1n 1 3.660000e-01 3.528000e+00
TX2er TQ1n 1 3.520000e-01 5.158000e+00
TX1er Q1n 1 3.950000e-01 1.981000e+00
TX1er SQ1n 1 3.780000e-01 3.098000e+00
TX1er TQ1n 1 3.660000e-01 4.422000e+00
VS W 1 0.4650000000 0.5000000000
The suggested parameters for the latter from the Go Martini 3 paper are
Once these additional parameters have been included in the input force field files, the IDP topologies can be used as with any other input files for preparing simulations with Polyply or running them with Gromacs.
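The two requirements above, and the presence of a virtual site in every residue, can be checked with a few lines of Python before starting a simulation. This is a rough sketch under simplifying assumptions: it reads a single itp text, does not follow #include statements, and assumes the standard Gromacs column order in [ atoms ] (nr, type, resnr, residue, atom, ...). All function names are hypothetical.

```python
def read_sections(text):
    """Map each [ directive ] name to a list of whitespace-split data lines."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop Gromacs ';' comments
        if not line or line.startswith("#"):  # skip blanks and preprocessor lines
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line.split())
    return sections


def vs_is_defined(sections, vs="VS"):
    """True if the virtual-site type appears in [ atomtypes ]."""
    return any(fields[0] == vs for fields in sections.get("atomtypes", []))


def vs_water_pair_defined(sections, vs="VS", water="W"):
    """True if [ nonbond_params ] has a VS-W entry, in either order."""
    return any(
        len(fields) >= 2 and {fields[0], fields[1]} == {vs, water}
        for fields in sections.get("nonbond_params", [])
    )


def residues_without_vs(sections, vs="VS"):
    """Return residue numbers in [ atoms ] that have no atom of type VS."""
    all_residues, vs_residues = set(), set()
    for fields in sections.get("atoms", []):
        if len(fields) < 5 or not fields[2].isdigit():
            continue  # skip elided or malformed lines
        all_residues.add(int(fields[2]))
        if fields[1] == vs:
            vs_residues.add(int(fields[2]))
    return sorted(all_residues - vs_residues)
```

Passing the text of the main force field itp to `read_sections` and then calling `vs_is_defined` and `vs_water_pair_defined` flags a missing definition early. `residues_without_vs` applied to the generated protein itp should return an empty list.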
As of Polyply v1.X.X, polyply gen_params supports protein modifications. Each modification is written as a residue name and number, a colon, and the modification name. For example:
polyply gen_params -lib martini3 -seq GLY:10 -name pGLY -o pGLY.itp -mods GLY1:N-ter GLY10:C-ter
will generate the parameters for polyglycine with 10 residues, with N and C termini at neutral pH. Note that these terminal modifications are applied automatically when polyply determines that the input sequence is a protein, so the same topology would be achieved with:
polyply gen_params -lib martini3 -seq GLY:10 -name pGLY -o pGLY.itp
In addition to terminal modifications, many of the usual protein modifications available in Martinize2 are supported, and they may be combined as desired. For example:
polyply gen_params -lib martini3 -seq HIS:5 -name HIS5_mods -o HIS5_mods.itp -mods HIS1:HIS-HD HIS1:NH2-ter
generates a histidine pentapeptide with a neutralised N terminus, in which the side chain of the same histidine is modified to represent neutral histidine with the hydrogen on the delta nitrogen.
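To make the form of the -mods arguments concrete, the sketch below splits specifications such as GLY1:N-ter into residue name, residue number and modification name, and groups them per residue. It is an illustration of the syntax as it appears in the examples above, not Polyply's own parser. The pattern assumes residue names consist of letters only, and the function names are hypothetical.

```python
import re

# Residue name (letters), residue number (digits), a colon, then the modification.
# Assumption for illustration: residue names contain letters only.
MOD_PATTERN = re.compile(r"^([A-Za-z]+)(\d+):(\S+)$")


def parse_mod(spec):
    """Split one specification such as 'GLY10:C-ter' into its three parts."""
    match = MOD_PATTERN.match(spec)
    if match is None:
        raise ValueError(f"not a valid modification specification: {spec!r}")
    resname, resid, modification = match.groups()
    return resname, int(resid), modification


def group_mods(specs):
    """Collect the modifications requested for each (resname, resid) pair."""
    grouped = {}
    for spec in specs:
        resname, resid, modification = parse_mod(spec)
        grouped.setdefault((resname, resid), []).append(modification)
    return grouped
```

For the histidine example, `group_mods(["HIS1:HIS-HD", "HIS1:NH2-ter"])` collects both modifications under the key `("HIS", 1)`, which mirrors how two modifications are combined on one residue.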