Calculating order parameters

From Docswiki
Jump to navigation Jump to search

When constructing a potential/free energy surface, or trying to understand the landscape topology by colouring a disconnectivity graph, you need to calculate order parameters for your system. These are structural quantities, such as the radius of gyration, or the number of native contacts (Q).

Often, these are calculated after the fact i.e. a system is studied with OPTIM/PATHSAMPLE and then for each identified minimum, the coordinates are extracted, and the quantities of interest are calculated.

In this article, you can find examples of some useful order parameters - and how to use CHARMM to calculate them. I have also included example script segments which you can use to extract the values from the CHARMM output.

Setting up CHARMM

NOTE: some of these only work with newer versions of CHARMM, so please compile the source in ~/svn/CHARMM35, and use the executable produced.

Reading in the structure of interest

Before asking for order parameters, we need to set CHARMM up, and load in the coordinates for the structure of interest, in this case test.crd. Here is the top of a CHARMM input file which does this for TZ2:

! Example CHARMM input file to calculate three order parameters for TZ2
! Chris Whittleston 25/01/12

! Set paths for topology and parameter files
set pardir "/home/csw34/svn/CHARMM31/toppar"
set top "toph19_eef1_perm.inp"
set par "param19_eef1_perm.inp"

! BOMLev sets the level of warnings what do not cause the program to exit. -5 = very lax
BOMLev -5

! PRNLEV sets the ammount of output you get from CHARMM. 0 = small
PRNLEV 3

! Read standard topology and parameter files
OPEN READ CARD UNIT 1 NAME @pardir/@top
READ RTF CARD UNIT 1
CLOSE UNIT 1

OPEN READ CARD UNIT 2 NAME @pardir/@par
READ PARAMETER CARD UNIT 2
CLOSE UNIT 2

! Generate the PSF for TZ2
READ SEQUence CARD
*
12
SER TRP THR TRP GLU GLY ASN LYS TRP THR TRP LYS
GENErate FIRS NTER LAST CT2 SETUp

OPEN UNIT 20 NAME test.crd READ CARD
READ COOR UNIT 20 CARD FREE
CLOSE UNIT 20

! Build the internal coordinate tables
IC FILL PRESERVE
IC PARAMETERS
IC PURGE
IC BUILD

! Set up the EEF1 solvent model
eef1 setup temp 298.15 unit 93 name "/home/csw34/svn/CHARMM31/toppar/solvpar.inp"
update ctonnb 7. ctofnb 9. cutnb 15. group rdie

This file is read into CHARMM by piping it in like this:

charmm < order_parameters.inp > charmm.out

If you do this for the above, the charmm.out file will show that the coordinates were read, but not much else.

Making selections for groups of interest to save time

In TZ2, we are interested in the TRP sidechains, as it is the interaction of these rings which give the small hairpin its remarkable stability. Because we are likely to want to consider them multiple times, it saves time to pre-define a CHARMM atom selection for each - like this:

! Set up selections for each TRP ring in TZ2 (saves typing later!)
define TRP2 sele resid 2 .and. .not. ( type n .or. type ca .or. type c .or. -
       type o .or. type oct* .or. type ha* .or. type hn .or. type ht* ) end
define TRP4 sele resid 4 .and. .not. ( type n .or. type ca .or. type c .or. -
       type o .or. type oct* .or. type ha* .or. type hn .or. type ht* ) end
define TRP9 sele resid 9 .and. .not. ( type n .or. type ca .or. type c .or. -
       type o .or. type oct* .or. type ha* .or. type hn .or. type ht* ) end
define TRP11 sele resid 11 .and. .not. ( type n .or. type ca .or. type c .or. -
       type o .or. type oct* .or. type ha* .or. type hn .or. type ht* ) end

This selection is quite simple. For the second residue, TRP2, we select all atoms in residue 2 (resid 2), and then exclude those in the backbone, including in any terminal residues. A full explaination of CHARMM selection logic can be found on this really useful page.

Calculating and extracting order parameters

Once we can read in the coordinates, we can add to the bottom of the CHARMM input file commands to calculate the order parameters for us. Sometimes we cannot get the exact result we want right away, but with some post-processing in a bash script, anything is possible! :) For each example, the CHARMM input file segment is accompanied by a bash script block demonstrating how to get the single value of interest out of the CHARMM output file.

Number of native backbone hydrogen-bonds

For TZ2, there are four hydrogen-bonds between backbone atoms in the native state. As we know these will not be formed when the peptide is extended, monitoring how many are present should give us a rough idea of how folded the hairpin is:

! First BB H-bond
COOR HBOND sele resid 5 .and. type O end sele resid 8 .and. type H end

! Second BB H-bond
COOR HBOND sele resid 3 .and. type O end sele resid 10 .and. type H end

! Third BB H-bond
COOR HBOND sele resid 10 .and. type O end sele resid 3 .and. type H end

! Fourth BB H-bond
COOR HBOND sele resid 12 .and. type O end sele resid 1 .and. type HT1 end

Here, for each hydrogen bond, we specify the atoms involved exactly. For example, one bond is between O in residue 5, and H in residue 8. This produces the following CHARMM output:

 CHARMM>    ! First BB H-bond
 CHARMM>    COOR HBOND sele resid 5 .and. type O end sele resid 8 .and. type H end
 SELRPN>      1 atoms have been selected out of    147
 SELRPN>      1 atoms have been selected out of    147

 NONBOND OPTION FLAGS:
     ELEC     VDW      GROUps   RDIElec  SWITch   VGROup   VSWItch
     BYGRoup  NOEXtnd  NOEWald
 CUTNB  = 15.000 CTEXNB =999.000 CTONNB =  7.000 CTOFNB =  9.000
 WMIN   =  1.500 WRNMXD =  0.500 E14FAC =  0.400 EPS    =  1.000
 NBXMOD =      5
 There are        0 atom  pairs and      658 atom  exclusions.
 There are     1502 group pairs and      139 group exclusions.


 Analysing hydrogen bond patterns for:
              #donors  #acceptors
 1st sel:           0           1
 2nd sel:           1           0



   hydrogen bonds. Total time=      0.0ps. Resolution=  0.000ps.
 Cutoff distance=   2.40 time=    0.0 occupancy= 0.00
          ATOM            <occupancy>      <lifetime> (ps)
 1    GLU  5    O              1.0000

We are only interested in the presence of the <occupancy> field in the last couple of lines. A value of 1.0000 here shows this bond is present, so if we have four of these commands in the CHARMM input file, we will get four of these output blocks. This bash script fragment can be used to extract the number of occupied bonds:

# NUMBER OF NATIVE BACKBONE H-BONDS
   bbhbond=`grep -3 "hydrogen bonds. Total" charmm.out | grep 1.0000 | wc | awk '{print $1}'`
   echo "There are ${bbhbond} native backbone hydrogen bonds present"
   echo $bbhbond >> bbhbond.dat

TRP sidechain COM/COM distances

If you want to get an idea for how many of the TRP rings are forming hydrophobic contacts with each other, you can check the distance between the centres of mass of each pair of rings. If this distance is less than a certain cutoff, you consider that a 'contact', and count it. In this section of CHARMM input, we consider each possible pairing of the 4 TRP rings:

! COM/COM distance TRP2/TRP4 sidechains
quick sele TRP2 end sele TRP4 end mass

! COM/COM distance TRP2/TRP9 sidechains
quick sele TRP2 end sele TRP9 end mass

! COM/COM distance TRP2/TRP11 sidechains
quick sele TRP2 end sele TRP11 end mass

! COM/COM distance TRP9/TRP11 sidechains
quick sele TRP9 end sele TRP11 end mass

! COM/COM distance TRP4/TRP11 sidechains
quick sele TRP4 end sele TRP11 end mass

! COM/COM distance TRP4/TRP9 sidechains
quick sele TRP4 end sele TRP9 end mass

Because we specify two selections (using the ring groups we defined above), the quick function knows to calculate a distance. The mass keyword on the end of each line specifies that the distance should be between the centres of mass of each group. Each call to quick results in the following output:

CHARMM>    ! COM/COM distance TRP2/TRP4 sidechains
 CHARMM>    quick sele TRP2 end sele TRP4 end mass
 SELRPN>     12 atoms have been selected out of    147
 SELRPN>     12 atoms have been selected out of    147
 QUICKA: The distance is:     5.41712

There will be six blocks like this in this case, as there are six possible pairs. The following bash script fragment is an example of how the number of contacts can be extracted in the case where we use a cutoff of 6.0A:

# TRP-TRP HYDROPHOBIC CONTACTS (COM/COM DISTANCE <6A)
   comcutoff=6.0
   nhydro=0
# Extract distances to file
   grep -4 'COM/COM distance' charmm.out | grep QUICKA | awk '{print $5}' > comcom.tmp
# Check each distance in the file to see if it is less than the cutoff ($comcutoff)
   for dist in `cat comcom.tmp`
   do
   compare=`echo "${dist} <= ${comcutoff}" | bc`
      if [ $compare -eq 1 ]; then
# If the distance is less than the cutoff, increment the number of contacts
         nhydro=$((nhydro + 1))
      fi
   done
   echo "There are ${nhydro} hydrophobic contacts between TRP rings (COM/COM)"
   echo $nhydro >> nhydro.dat

Because bash can't handle floating point numbers like these distances, we resort to a trick here to compare each to the cutoff by piping it to bc which returns 1 (true) if the equality is satisfied.

End-to-end distance (CA-CA)

Another measure of how folded the hairpin is, could be the distance between the ends, the termini of the peptide. when it is folded, we know this distance is short (<6A), but it can be very large when it is unfolded (>30A). This distance is very easy to calculate, using the quick function in the CHARMM input:

quick sele resid 1 .and. type CA end sele resid 12 .and. type CA end

As before, because we specify two selections (the CA atoms in the first and last residues), quick knows to calculate the distance between them. This results in the following CHARMM output:

 CHARMM>    ! 4) END-TO-END DISTANCE (CA-CA)
 CHARMM>    quick sele resid 1 .and. type CA end sele resid 12 .and. type CA end
 SELRPN>      1 atoms have been selected out of    147
 SELRPN>      1 atoms have been selected out of    147
 QUICKA: The distance is:     5.15931

All we want here is the distance itself, which is easily extracted using a simple bash script:

# END-TO-END DISTANCE
   endtoend=`grep -4 "END-TO-END DISTANCE" charmm.out | tail -n1 | awk '{print $5}'`
   echo "End-to-end distance (A) is ${endtoend}"
   echo $endtoend >> endtoend.dat

Calculating order parameters for a whole PATHSAMPLE database

Most often, you'll want to calculate such order parameters for a whole database of minima, not just a single structure. This can be done by simply looping over each minimum in a bash script, extracting it using PATHSAMPLE with the EXTRACTMIN keyword, and then running your CHARMM input on the resulting structure, and extracting the values from the output.

Here is the top of such a bash script for interest:

# Script to extract order parameters from CHARMM output
nmin=`wc min.data | awk '{print $1}'`

# Loop over minima
for ((min=1; min <= $nmin ; min++))
do
   echo "Working on minimum ${min}"

# Run PATHSAMPLE to extract crd file
   cp pathdata pathdata.tmp
   sed 's/.*EXTRACTMIN.*/EXTRACTMIN '$min'/' pathdata.tmp > pathdata
   PATHSAMPLE > output.extract
# Rename the extracted structure
   mv extractedmin.crd test.crd
# Run CHARMM to calculate the order parameters
   charmm_new < order_parameters.inp > charmm.out

NOTE: This script assumes that you have EXTRACTMIN already in your pathdata file before you start. It can be commented - but it must be present!