Calculating rate constants (GT and fastest path)
So, what do you do with the database of stationary points when you've finished the sampling part of the process? Usually, calculate rate constants (of various flavours) and examine mechanistically discrete paths with significant rate constants (i.e., make nice movies!).
Additionally, it's often helpful as you're building up the database to monitor the various rate constants that pathsample produces, by running separate one-processor pathsample jobs as described below. For safety, there should only be one running pathsample job in a directory at a time, though, so do this on a copy or between restarts of the sampling jobs.
It's also possible to ask pathsample to calculate the GT rate constant every n cycles of a sampling run (put n as the second integer on the GT keyword line in the pathdata input file), but this can be prohibitively expensive if n is small and the database is large.
pathdata should contain the usual keywords that set up information about the system: NATOMS, PERMDIST, etc., as used in the database-generating runs. However, remember to remove or comment out the sampling commands: SHORTCUT, FREEPAIRS, etc. A common gotcha: DIRECTION AB means we're interested in paths from the B state to the A state. The same is true with the subscripts on rate constants: i.e., we're using the spectroscopists' convention. The direction can also be changed, if desired, from the sampling runs.
The following keywords must be present:
CYCLES 0 TEMPERATURE x
where x is the reduced temperature, i.e. for CHARMM, it's the energy in the prevailing units of kcal/mol corresponding to the desired temperature: 0.592 kcal/mol for 298 K. The CYCLES 0 ensures that no further sampling is performed. The temperature need not be the same as that used for the sampling runs; MAXTSENERGY y can also changed (or included) in the rate constant calculations.
The different flavours of rate constant calculation...
Find the 'fastest' path: the single discrete path that makes the largest individual contribution to the phenomenological DPS steady-state rate constant (equation 8 of Wales02) in the desired direction, by including the following keyword:
DIJKSTRA i
where minima with integer i direct connections or fewer are removed from the analysis. Dijkstra's shortest-path algorithm is used with an edge-weight formulation based on log-weighted adjacency matrix elements, as described in EvansW04. It's slightly adapted from the original to ensure that the edge weights are non-negative (see Appendix E of Semen Trygubenko's Ph.D. thesis). It's important to note that this analysis only compares the contribution to the rate constant from paths that don't revisit any of the minima: i.e., it doesn't include the effect of full recrossings of TS's, whereby a particular set of stationary points gives rise to an infinite number of paths.
This use of Dijkstra's algorithm is also distinct from that used in the CONNECT algorithm (CarrTW05), with a distance-based edge-weight formulation.
Key output:
standard output file: search for 'Largest contribution to SS rate constant' for the contribution to the DPS steady-state (SS) rate constant from this 'fastest' path and the number of TS's on it. The mean first-passage times (= the reciprocal of the rate constants) in the forwards and backwards directions from a GT calculation on just the stationary points on this path are also quoted on the 'MFPT for fastest path' line. This latter time includes the effect of recrossings (which acts to increase the resulting rate constant from the SS contribution) but does not take account of off-path transitions (which also acts to increase the resulting rate, as the presence of off-path transitions reduces a rate constant). It's the rate constants corresponding to these GT times that are given on the 'rates without conditional probability' and 'rates with conditional probability' lines. The conditional probability bit refers to the factor p_x/p_X, where x=a,b and X=A,B: the probability of occupying minimum x given that we're in state X.
Sample chunk of output:
Dijkstra> 47099 minima with 1 connections or fewer will not be considered Dijkstra> steps to A region converged in 47 cycles; maximum= 97 disconnected= 48089 Dijkstra> steps to B region converged in 38 cycles; maximum= 78 disconnected= 48089 Dijkstra> Best path for min 1 and any B minimum, k^SS B<-A 0.2648920258E-26 82 27545 78123 76027 93353 93573 103810 104873 104876 108780 113457 113455 113816 111592 111596 111595 11159 113595 113540 111670 108193 107117 114410 113282 106425 106426 109207 113864 107981 92364 92361 92362 29005 81267 49822 13885 13883 13880 82260 82261 26172 1 Dijkstra> Best path between any A minimum and any B minimum: 82 27545 78123 76027 93353 93573 103810 104873 104876 108780 113457 113455 113816 111592 111596 111595 111599 113595 113540 111670 108193 107117 114410 113282 106425 106426 109207 113864 107981 92364 92361 92362 29005 81267 49822 13885 13883 13880 82260 82261 26172 1 Dijkstra> Largest contribution to SS rate constant B<-A for any A and B is 0.2648920258E-26 for 41 transition states: Dijkstra> Note that path is printed backwards starting with B, ending with A E+ Ets E- 82 -555.5906153117 36918 -549.1427629896 27545 -550.7885223554 27545 -550.7885223554 105580 -543.5187142799 78123 -549.6194842919 78123 -549.6194842919 106531 -548.9845557840 76027 -551.8813095578 76027 -551.8813095578 130259 -540.4728394131 93353 -542.4178176348 93353 -542.4178176348 130266 -542.3606545656 93573 -542.3609660040 93573 -542.3609660040 145394 -534.7644579433 103810 -535.6463272297 103810 -535.6463272297 159813 -534.0226300928 104873 -534.7646896816 104873 -534.7646896816 159646 -525.0030493670 104876 -526.2912996602 104876 -526.2912996602 159359 -523.8936274377 108780 -530.2551921098 108780 -530.2551921098 159972 -524.2397843902 113457 -527.3469929975 113457 -527.3469929975 159495 -522.7281206378 113455 -522.8085028867 113455 -522.8085028867 162260 -518.9479471804 113816 -519.9294092195 113816 -519.9294092195 160045 -518.4098972167 111592 -520.7709917382 111592 -520.7709917382 156953 -520.6519499996 111596 -520.6537222572 111596 -520.6537222572 156948 -509.2607120466 111595 -510.7824074831 111595 -510.7824074831 156952 -510.7615417725 111599 -514.0064177058 111599 -514.0064177058 159697 -513.9535007538 113595 -513.9756163730 113595 -513.9756163730 159695 -513.2434291865 113540 -515.2702307728 113540 -515.2702307728 159696 -514.0369077195 111670 -516.4878259009 111670 -516.4878259009 157039 -513.3376855362 108193 -513.4635144282 108193 -513.4635144282 152012 -511.8132316150 107117 -513.5373413547 107117 -513.5373413547 161138 -510.2038266180 114410 -512.1777532197 114410 -512.1777532197 161142 -512.1643025601 113282 -512.2181171004 113282 -512.2181171004 161140 -503.8542455138 106425 -508.1494849751 106425 -508.1494849751 149226 -503.2919483110 106426 -506.8230208740 106426 -506.8230208740 159225 -506.6494193252 109207 -507.4352131723 109207 -507.4352131723 160121 -496.8941335219 113864 -497.9365618057 113864 -497.9365618057 160122 -497.2537286432 107981 -500.2176671595 107981 -500.2176671595 151632 -496.4669107556 92364 -497.0219772986 92364 -497.0219772986 128174 -496.3379764022 92361 -497.0175900124 92361 -497.0175900124 132028 -495.5050402587 92362 -495.9664182959 92362 -495.9664182959 130702 -495.2502792770 29005 -496.1568838964 29005 -496.1568838964 110518 -485.9257443857 81267 -490.5071327086 81267 -490.5071327086 110523 -490.4186741312 49822 -491.3243280411 49822 -491.3243280411 63967 -491.3031137536 13885 -491.3671659476 13885 -491.3671659476 110345 -486.3772296810 13883 -487.1372548014 13883 -487.1372548014 22162 -481.8341455402 13880 -483.2413610903 13880 -483.2413610903 112061 -476.0721186686 82260 -478.9507687081 82260 -478.9507687081 112062 -478.9115947145 82261 -479.8779631229 82261 -479.8779631229 112063 -476.4618088629 26172 -477.8753536184 26172 -477.8753536184 160106 -476.4919243777 1 -476.9018216251 Dijkstra> dumping minima and transition state coordinates to file redopoints, steps= 41 Dijkstra> detailed balance, ratio should be one if SS applies: 136.2100556 Dijkstra> MFPT for fastest path: A<-B value= 0.7030943368E+25 B<-A value= 0.2425804971E-08 Dijkstra> rates without conditional probability: A<-B value= 0.1422284248E-24 B<-A value= 412234294.2 Dijkstra> rates with conditional probability: A<-B value= 0.1422284248E-24 B<-A value= 412234294.2 main> CPU time spent in Dijkstra = 51.940 s main> total CPU time spent in PATHSAMPLE = 101.49 s
File EofS: energies of the stationary points on the fastest path as a function of the integrated Cartesian path length. Feed into gnuplot or similar graph-plotting program!
File redopoints: contains the coordinates of the stationary points in bare xyz format. Give as input to OPTIM to re-create the approximate steepest-descent paths, writing more frames of coordinates, to make smoother movies :-)
For CHARMM: file stationary.points.pdb contains the coordinates for the sequence of stationary points in protein data bank format. Feed into a protein structure viewer as a trajectory file.
More to come...
--jmc49 21:45, 2 July 2008 (BST)