Symmetrising AMBER topology files

From Docswiki
Jump to navigation Jump to search

Symmetrising a Simple Protein System

The Basics

When LEap is used to generate AMBER topology files (coords.prmtop) it doesn't do a very good job of ensuring that improper dihedral angle energies are invariant with respect to permutations.

These energies can be made invariant by rearranging the permutable atoms within the relevant dihedral term in the coords.prmtop file. For example, the residue aspartate has two permutable oxygen atoms (described by OD1 and OD2 respectively, with the relevant dihedral angle also including two carbon atoms, CB and CG). The dihedral describing these atoms is typically represented in coords.prmtop as CB-OD1-CG-OD2. In order to symmetrise these oxygen atoms, this must be rearranged to OD1-CB-CG-OD2 (i.e. the first permutable atom must be listed first in the dihedral, and the second permutable atom listed fourth). The reasons for this are somewhat esoteric but the paper here addresses it rather well.

The %FLAG DIHEDRALS_INC_HYDROGEN and %FLAG DIHEDRALS_WITHOUT_HYDROGEN sections in coords.prmtop list the groups of dihedrals present in your system. Strangely, these are listed not in terms of atomic indices but instead in terms of the atomic index multiplied by 3, with 1 then subtracted. You can probably imagine that trying to find all of the relevant permutable atoms to swap in the coords.prmtop file (which can be quite a lot in a protein system!) can become very tedious indeed, if done manually.

Fortunately, scripts have been written (perm-prmtop.*.py, which can be found in ~/svn/SCRIPTS/AMBER/symmetrise_prmtop) that can identify these atoms within the dihedral terms that need to be swapped, and then does it for you.

Note that the 1-4 non-bonded interactions are not calculated for improper torsions to avoid double counting. The double counting can be understood by considering a torsion 1-2-3-4 as a normal one and 1-3-2-4 as improper torsion. Improper torsions are designed to keep atoms in a plane. In the AMBER topology file, the dihedrals are defined such that the actual atom index A =N/3 +1 where N is the absolute value of coordinate array index (http://ambermd.org/prmtop.pdf).

Ensuring the termini are correct

The python scripts mentioned above only work if the termini of your protein system are properly considered. This is because the improper dihedrals for terminal residues are different.

It is possible to alter the coords.prmtop file manually to account for these termini. First, find the section of your file formatted as follows:

%FLAG RESIDUE_LABEL     
%FORMAT(20a4)
ALA GLU PHE
%FLAG RESIDUE_POINTER

The chances are that your protein system will be much bigger and contain many more residues but the principles are the same. Each residue technically is assigned four spaces, although one of these is usually blank. This is no longer the case when the terminal residues are altered to reflect whether they are N-terminal or C-terminal. In this case, ALA is transformed to NALA and PHE to CPHE. This gives:

%FLAG RESIDUE_LABEL     
%FORMAT(20a4)
NALAGLU CPHE
%FLAG RESIDUE_POINTER

There is a script, A9prmtoptermini.sh (which can also be found in ~/svn/SCRIPTS/AMBER/symmetrise_prmtop), which does this for you. A typical command would be something like this:

./A9prmtoptermini.sh initial.prmtop final.prmtop initial.pdb ff02

Please note that to run this script, you need a .pdb file as well as a .prmtop file. ff02 specifies which AMBER forcefield to use (and therefore which of the perm-prmtop.*.py files to select).


Symmetrising a Protein System with Cofactors

Including these cofactors in the first place

The method above works if your system comprises only protein residues. However, you may be interested in how a certain molecule or group of molecules interacts with the protein. The addition of these cofactors presents many challenges. If your cofactor is very unusual, you may even need to generate parameters yourself. In which case, it may be useful to refer to some of the AMBER tutorials, such as this one here. Fortunately, many of the most-commonly used cofactors have already been parameterised, particularly by the Bryce Group in Manchester.

There is already a page, Setting up, which deals quite comprehensively in how to generate AMBER input files using LEap. When you want to include cofactors in your system, all you have to do is ensure that the relevant .frcmod and .prepin (or .lib) files are included when you use tleap or xleap. It's also useful to have a .pdb file from another database or source (such as from an X-ray crystallography structure on the Protein Data Bank), although this is not strictly necessary. These files may not be formatted in an appropriate way (again, for details, see Setting up).

Below is an example. After launching tleap, and ensuring all of the relevant files were in my PATH or working directory, I generated a set of AMBER input files for a protein called HemS alongside haem and NADH cofactors via the following commands:

source leaprc.ff99SB
loadamberparams hem.frcmod
loadamberprep hem.prepin
loadoff nadh.lib
loadamberparams nadh.frcmod
mol=loadpdb  1.pdb
savepdb mol  start.pdb
saveamberparm mol coords.prmtop.old coords.inpcrd

Symmetrising the cofactors as well as the residues

Of course, this creates AMBER input files where the permutable atoms aren't necessarily properly symmetrised. Depending upon what cofactors you're including, these could contain permutable atoms as well. Therefore, we must ensure that when finding the correct symmetry, we include the relevant cofactor libraries too or else only the atoms in the standard residues will be treated.

A four-point plan to deal with protein systems containing cofactors that need symmetrising is as follows:

1.

  • First, make sure the cofactors in you system are already listed in perm-prmtop.*.py.
  • To do this: grep -r 'group8 =' .
  • This should give a list of all of the cofactors contained in the various perm-prmtop files (perm-prmtop.ff02.py, perm-prmtop.ff03.py and perm-prmtop.ff14.py should have identical lists).
  • If your cofactor is present, proceed to step 2.
  • If it's not present, you will have to insert the relevant library for your cofactor, and list which atoms need to be symmetrised. Please do this for perm-prmtop.ff02.py, perm-prmtop.ff03.py and perm-prmtop.ff14.py in order to keep them consistent (this should be a simple copy and paste job). Also, it would be helpful once you've done this if you add your new cofactor to the group8 list.

2.

  • Ensure that your relevant cofactor libraries are in your working directory, e.g. if the cofactors in your system are haem (HEM) and NADH (NAD), make sure that hem.lib and nadh.lib are present.

3.

  • Make a note of the first atom in each of your cofactors. This corresponds to the third column in your pdb file. Using HEM and NAD again as examples, the first atoms for each are FE and P1 respectively.

4.

  • Now we have all of the information we need to launch our symmetrisation. Just add on the atoms you noted in step 3 as extra arguments, e.g.
    • ./A9prmtoptermini.sh initial.prmtop final.prmtop initial.pdb ff02 FE P1


And behold, your system is properly symmetrised. The natural order of things, alongside Rangers beating Celtic in the Old Firm or Scotland thumping England in the Rugby, is returned to the universe. You are now ready to launch your GMIN or OPTIM calculations.

--adk44 17.00, 9 May 2019 (BST)