Calculating order parameters
When constructing a potential/free energy surface, or trying to understand the landscape topology by colouring a disconnectivity graph, you need to calculate order parameters for your system. These are structural quantities, such as the radius of gyration, or the number of native contacts (Q).
Often, these are calculated after the fact i.e. a system is studied with OPTIM/PATHSAMPLE and then for each identified minimum, the coordinates are extracted, and the quantities of interest are calculated.
In this article, you can find examples of some useful order parameters - and how to use CHARMM to calculate them. I have also included example script segments which you can use to extract the values from the CHARMM output.
Setting up CHARMM
NOTE: some of these only work with newer versions of CHARMM, so please compile the source in ~/svn/CHARMM35, and use the executable produced.
Reading in the structure of interest
Before asking for order parameters, we need to set CHARMM up, and load in the coordinates for the structure of interest, in this case test.crd. Here is the top of a CHARMM input file which does this for TZ2:
! Example CHARMM input file to calculate three order parameters for TZ2 ! Chris Whittleston 25/01/12 ! Set paths for topology and parameter files set pardir "/home/csw34/svn/CHARMM31/toppar" set top "toph19_eef1_perm.inp" set par "param19_eef1_perm.inp" ! BOMLev sets the level of warnings what do not cause the program to exit. -5 = very lax BOMLev -5 ! PRNLEV sets the ammount of output you get from CHARMM. 0 = small PRNLEV 3 ! Read standard topology and parameter files OPEN READ CARD UNIT 1 NAME @pardir/@top READ RTF CARD UNIT 1 CLOSE UNIT 1 OPEN READ CARD UNIT 2 NAME @pardir/@par READ PARAMETER CARD UNIT 2 CLOSE UNIT 2 ! Generate the PSF for TZ2 READ SEQUence CARD * 12 SER TRP THR TRP GLU GLY ASN LYS TRP THR TRP LYS GENErate FIRS NTER LAST CT2 SETUp OPEN UNIT 20 NAME test.crd READ CARD READ COOR UNIT 20 CARD FREE CLOSE UNIT 20 ! Build the internal coordinate tables IC FILL PRESERVE IC PARAMETERS IC PURGE IC BUILD ! Set up the EEF1 solvent model eef1 setup temp 298.15 unit 93 name "/home/csw34/svn/CHARMM31/toppar/solvpar.inp" update ctonnb 7. ctofnb 9. cutnb 15. group rdie
This file is read into CHARMM by piping it in like this:
charmm < order_parameters.inp > charmm.out
If you do this for the above, the charmm.out file will show that the coordinates were read, but not much else.
Making selections for groups of interest to save time
In TZ2, we are interested in the TRP sidechains, as it is the interaction of these rings which give the small hairpin its remarkable stability. Because we are likely to want to consider them multiple times, it saves time to pre-define a CHARMM atom selection for each - like this:
! Set up selections for each TRP ring in TZ2 (saves typing later!) define TRP2 sele resid 2 .and. .not. ( type n .or. type ca .or. type c .or. - type o .or. type oct* .or. type ha* .or. type hn .or. type ht* ) end define TRP4 sele resid 4 .and. .not. ( type n .or. type ca .or. type c .or. - type o .or. type oct* .or. type ha* .or. type hn .or. type ht* ) end define TRP9 sele resid 9 .and. .not. ( type n .or. type ca .or. type c .or. - type o .or. type oct* .or. type ha* .or. type hn .or. type ht* ) end define TRP11 sele resid 11 .and. .not. ( type n .or. type ca .or. type c .or. - type o .or. type oct* .or. type ha* .or. type hn .or. type ht* ) end
This selection is quite simple. For the second residue, TRP2, we select all atoms in residue 2 (resid 2), and then exclude those in the backbone, including in any terminal residues. A full explaination of CHARMM selection logic can be found on this really useful page.
Calculating and extracting order parameters
Once we can read in the coordinates, we can add to the bottom of the CHARMM input file commands to calculate the order parameters for us. Sometimes we cannot get the exact result we want right away, but with some post-processing in a bash script, anything is possible! :) For each example, the CHARMM input file segment is accompanied by a bash script block demonstrating how to get the single value of interest out of the CHARMM output file.
Number of native backbone hydrogen-bonds
For TZ2, there are four hydrogen-bonds between backbone atoms in the native state. As we know these will not be formed when the peptide is extended, monitoring how many are present should give us a rough idea of how folded the hairpin is:
! First BB H-bond COOR HBOND sele resid 5 .and. type O end sele resid 8 .and. type H end ! Second BB H-bond COOR HBOND sele resid 3 .and. type O end sele resid 10 .and. type H end ! Third BB H-bond COOR HBOND sele resid 10 .and. type O end sele resid 3 .and. type H end ! Fourth BB H-bond COOR HBOND sele resid 12 .and. type O end sele resid 1 .and. type HT1 end
Here, for each hydrogen bond, we specify the atoms involved exactly. For example, one bond is between O in residue 5, and H in residue 8. This produces the following CHARMM output:
CHARMM> ! First BB H-bond CHARMM> COOR HBOND sele resid 5 .and. type O end sele resid 8 .and. type H end SELRPN> 1 atoms have been selected out of 147 SELRPN> 1 atoms have been selected out of 147 NONBOND OPTION FLAGS: ELEC VDW GROUps RDIElec SWITch VGROup VSWItch BYGRoup NOEXtnd NOEWald CUTNB = 15.000 CTEXNB =999.000 CTONNB = 7.000 CTOFNB = 9.000 WMIN = 1.500 WRNMXD = 0.500 E14FAC = 0.400 EPS = 1.000 NBXMOD = 5 There are 0 atom pairs and 658 atom exclusions. There are 1502 group pairs and 139 group exclusions. Analysing hydrogen bond patterns for: #donors #acceptors 1st sel: 0 1 2nd sel: 1 0 hydrogen bonds. Total time= 0.0ps. Resolution= 0.000ps. Cutoff distance= 2.40 time= 0.0 occupancy= 0.00 ATOM <occupancy> <lifetime> (ps) 1 GLU 5 O 1.0000
We are only interested in the presence of the <occupancy> field in the last couple of lines. A value of 1.0000 here shows this bond is present, so if we have four of these commands in the CHARMM input file, we will get four of these output blocks. This bash script fragment can be used to extract the number of occupied bonds:
# NUMBER OF NATIVE BACKBONE H-BONDS bbhbond=`grep -3 "hydrogen bonds. Total" charmm.out | grep 1.0000 | wc | awk '{print $1}'` echo "There are ${bbhbond} native backbone hydrogen bonds present" echo $bbhbond >> bbhbond.dat
TRP sidechain COM/COM distances
If you want to get an idea for how many of the TRP rings are forming hydrophobic contacts with each other, you can check the distance between the centres of mass of each pair of rings. If this distance is less than a certain cutoff, you consider that a 'contact', and count it. In this section of CHARMM input, we consider each possible pairing of the 4 TRP rings:
! COM/COM distance TRP2/TRP4 sidechains quick sele TRP2 end sele TRP4 end mass ! COM/COM distance TRP2/TRP9 sidechains quick sele TRP2 end sele TRP9 end mass ! COM/COM distance TRP2/TRP11 sidechains quick sele TRP2 end sele TRP11 end mass ! COM/COM distance TRP9/TRP11 sidechains quick sele TRP9 end sele TRP11 end mass ! COM/COM distance TRP4/TRP11 sidechains quick sele TRP4 end sele TRP11 end mass ! COM/COM distance TRP4/TRP9 sidechains quick sele TRP4 end sele TRP9 end mass
Because we specify two selections (using the ring groups we defined above), the quick function knows to calculate a distance. The mass keyword on the end of each line specifies that the distance should be between the centres of mass of each group. Each call to quick results in the following output:
CHARMM> ! COM/COM distance TRP2/TRP4 sidechains CHARMM> quick sele TRP2 end sele TRP4 end mass SELRPN> 12 atoms have been selected out of 147 SELRPN> 12 atoms have been selected out of 147 QUICKA: The distance is: 5.41712
There will be six blocks like this in this case, as there are six possible pairs. The following bash script fragment is an example of how the number of contacts can be extracted in the case where we use a cutoff of 6.0A:
# TRP-TRP HYDROPHOBIC CONTACTS (COM/COM DISTANCE <6A) comcutoff=6.0 nhydro=0 # Extract distances to file grep -4 'COM/COM distance' charmm.out | grep QUICKA | awk '{print $5}' > comcom.tmp # Check each distance in the file to see if it is less than the cutoff ($comcutoff) for dist in `cat comcom.tmp` do compare=`echo "${dist} <= ${comcutoff}" | bc` if [ $compare -eq 1 ]; then # If the distance is less than the cutoff, increment the number of contacts nhydro=$((nhydro + 1)) fi done echo "There are ${nhydro} hydrophobic contacts between TRP rings (COM/COM)" echo $nhydro >> nhydro.dat
Because bash can't handle floating point numbers like these distances, we resort to a trick here to compare each to the cutoff by piping it to bc which returns 1 (true) if the equality is satisfied.
End-to-end distance (CA-CA)
Another measure of how folded the hairpin is, could be the distance between the ends, the termini of the peptide. when it is folded, we know this distance is short (<6A), but it can be very large when it is unfolded (>30A). This distance is very easy to calculate, using the quick function in the CHARMM input:
quick sele resid 1 .and. type CA end sele resid 12 .and. type CA end
As before, because we specify two selections (the CA atoms in the first and last residues), quick knows to calculate the distance between them. This results in the following CHARMM output:
CHARMM> ! 4) END-TO-END DISTANCE (CA-CA) CHARMM> quick sele resid 1 .and. type CA end sele resid 12 .and. type CA end SELRPN> 1 atoms have been selected out of 147 SELRPN> 1 atoms have been selected out of 147 QUICKA: The distance is: 5.15931
All we want here is the distance itself, which is easily extracted using a simple bash script:
# END-TO-END DISTANCE endtoend=`grep -4 "END-TO-END DISTANCE" charmm.out | tail -n1 | awk '{print $5}'` echo "End-to-end distance (A) is ${endtoend}" echo $endtoend >> endtoend.dat
Calculating order parameters for a whole PATHSAMPLE database
Most often, you'll want to calculate such order parameters for a whole database of minima, not just a single structure. This can be done by simply looping over each minimum in a bash script, extracting it using PATHSAMPLE with the EXTRACTMIN keyword, and then running your CHARMM input on the resulting structure, and extracting the values from the output.
Here is the top of such a bash script for interest:
# Script to extract order parameters from CHARMM output nmin=`wc min.data | awk '{print $1}'` # Loop over minima for ((min=1; min <= $nmin ; min++)) do echo "Working on minimum ${min}" # Run PATHSAMPLE to extract crd file cp pathdata pathdata.tmp sed 's/.*EXTRACTMIN.*/EXTRACTMIN '$min'/' pathdata.tmp > pathdata PATHSAMPLE > output.extract # Rename the extracted structure mv extractedmin.crd test.crd # Run CHARMM to calculate the order parameters charmm_new < order_parameters.inp > charmm.out
NOTE: This script assumes that you have EXTRACTMIN already in your pathdata file before you start. It can be commented - but it must be present!