Finding an initial path with OPTIM and starting up PATHSAMPLE

From Docswiki

Now that we have two interesting end point structures, we can make an initial discrete path (=min1--TS1--min2--TS2--min3...minN) between them using OPTIM. Create odata and min.in files containing the following:

odata

UPDATES 6000
NEWCONNECT 15 3 2.0 20.0 30 0.5
COMMENT NEWNEB 30 500 0.01
NEBK 10.0
DIJKSTRA EXP
DUMPALLPATHS
REOPTIMISEENDPOINTS
COMMENT MAXTSENERGY -4770.0
EDIFFTOL  1.0D-4
MAXERISE 1.0D-4 1.0D0
GEOMDIFFTOL  0.05D0
BFGSTS 500 10 100 0.01 100
NOIT
BFGSMIN 1.0D-6
PERMDIST
MAXSTEP  0.1
TRAD     0.2
MAXMAX   0.3
BFGSCONV 1.0D-6
PUSHOFF 0.1
STEPS 800
BFGSSTEPS 2000
MAXBFGS 0.1
NAB start

The most important of these options specify the following:

  • BFGSMIN/BFGSCONV: the RMS force must be less than 10^-6 for a structure to be considered a stationary point.
  • BFGSSTEPS: up to 2000 BFGS steps will be done to minimise the structure, with a maximum step size of 0.1 coordinate units.
  • REOPTIMISEENDPOINTS: If the given initial structures are not already minima, according to the above RMS force criterion, then they will be locally minimized before being connected.
  • DUMPALLPATHS: all min--TS--min triples found during the connection run will be written to the path.info file. This is important so that we do not throw away possibly useful information.
  • DIJKSTRA EXP: the Dijkstra shortest-path algorithm with an exponential edge weight function will be used to select pairs of minima to connect in the iterative connection-making algorithm based on the doubly-nudged elastic band (DNEB) method.
  • NEWCONNECT: we're doing a connection-making run. In order, the options mean: 15 connection-making cycles/iterations (search for CONNECT CYCLE in the output file); a maximum of 3 attempts to connect the same pair of minima (incrementing other DNEB parameters each time) before giving up on that pair; a DNEB image density of 2.0 per unit distance along the vector separating the end points; an iteration density in the DNEB algorithm of 20.0 per image; a maximum of 30 images in the DNEB; and an increment of 0.5 in the image density when retrying the same pair of minima.
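As a rough illustration of how the NEWCONNECT densities translate into search effort, the sketch below computes image and iteration counts for a given end-point separation (dneb_effort is a hypothetical helper for illustration only; OPTIM's exact rounding rules may differ):

```python
import math

def dneb_effort(distance, image_density=2.0, iter_density=20.0, max_images=30):
    """Return (n_images, n_iterations) for end points separated by
    'distance' coordinate units, using the NEWCONNECT densities above.
    Hypothetical helper; OPTIM's internal rounding may differ."""
    n_images = min(math.ceil(image_density * distance), max_images)
    n_iters = round(iter_density * n_images)
    return n_images, n_iters

# End points ~5 units apart: 10 images and ~200 DNEB iterations.
print(dneb_effort(5.0))
# A very distant pair is capped at the 30-image maximum.
print(dneb_effort(100.0)[0])
```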

min.in

STOP
 &cntrl
  imin   = 1,
  ncyc = 1,
  maxcyc = 1,
  igb = 2, saltcon=0.1,
  ntb    = 0,
  cut    = 999.99,
  rgbmax = 25.0,
  ifswitch=1,
 /

As before, line by line:

  • comment line
  • starts the &cntrl control block; these settings are for the AMBER program SANDER (see the AMBER manual for more!)
  • we're doing minimisation
  • no AMBER SD steps (just energy call)
  • no AMBER CG steps (just energy call)
  • using the igb=2 generalised born solvent model with a salt concentration of 0.1
  • not using periodic boundary conditions
  • interaction cutoff of 999.99A (infinite)
  • the cutoff distance used when computing the effective generalised Born radii is 25A (the default)
  • smooth force switching turned on (custom keyword) to remove discontinuity in the solvent potential

Initial coordinates

Also put the coordinates of the starting end point into a file called start in the working directory, and the finish end point into finish. The format of these two files is bare XYZ coordinates (x, y, z on each line; no atom labels and no header lines at all), like this:

     3.55466675623289         3.16995629699793         1.54270798594802
     3.74753723250149         2.38942840233047         2.15184547061042
     4.17047717876711         3.93030795815968         1.78617683056756
     2.60386805295699         3.46607619676665         1.69960134494847
     3.73685649622452         2.77553090114839        0.123895029803436
     3.05032512058915         1.95214203202409       -0.786736327696273E-01
     3.35753139113690         3.91791869742228       -0.840504280699272
     2.30617919243626         4.17838802995329       -0.707688694615758
     3.96596515078617         4.80439047228292       -0.653169465698656
     3.50530517168077         3.59312703684487        -1.87189577557318
     5.14998969752088         2.21803054972706       -0.118519057183609
     5.32848416664402        0.995978917943404       -0.100324651337933
     6.15633697502413         3.08299331420796       -0.333025793368451
     5.93881982817854         4.06968704515201       -0.359551838035292
     7.57887293760818         2.71575857855800       -0.469528491560451
     7.67569566749208         2.02302478673195        -1.30623525156648
     8.38649561194879         3.98040568453036       -0.807315727170310
     7.94420445205871         4.45665448807773        -1.68406513918905
     8.30633077214107         4.68583450802602        0.214476997361609E-01
     9.85473280227943         3.72765387154945        -1.10114580198275
     10.2469598219409         3.24521826282487        -2.36509558341843
     9.50018457761475         3.05071206353910        -3.12384108206122
     11.6071325158249         3.01534996277546        -2.64455423147238
     11.9051878291689         2.64420970286146        -3.61747682547994
     12.5801817066824         3.26743026352440        -1.66087949639059
     13.6269130317628         3.09047449172246        -1.87578123663855
     12.1924529889115         3.74904973902444       -0.397640907207177
     12.9414713078565         3.94253331400061        0.360436563497512
     10.8322540264783         3.97886891829675       -0.118091695569543
     10.5377206130995         4.34750878884673        0.855785184894536
     8.13312305141417         1.99487549970490        0.778494881750552
     7.91746950457804         2.48841016169658         1.91162811585053
     8.77327437045932        0.929071062747699        0.621985551381477

So it could be a copy of a points.final file from an OPTIM local minimization, or a coords.<number> from the final quenches of a GMIN run, or something hacked by hand :-)
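If your structures are instead in standard XYZ format (atom count line, comment line, then label x y z), a short script can strip them down to the bare-coordinate form needed for start and finish. This is a sketch; strip_xyz is a hypothetical helper, not part of OPTIM:

```python
def strip_xyz(infile, outfile):
    """Convert a single-frame standard XYZ file (atom count line,
    comment line, then 'label x y z' lines) into the bare x y z
    format OPTIM expects for the start and finish files."""
    with open(infile) as f:
        lines = f.read().splitlines()
    natoms = int(lines[0].split()[0])
    with open(outfile, "w") as out:
        for line in lines[2:2 + natoms]:
            fields = line.split()
            x, y, z = (float(v) for v in fields[1:4])  # drop the atom label
            out.write(f"{x:20.14f} {y:20.14f} {z:20.14f}\n")
```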

Remember that you also need files coords.prmtop and coords.inpcrd (which may contain any old coordinates as long as there are the correct number of them) in the working directory.

Also, now is the time to bring the perm.allow file into the working directory (from your initial setup), if it's not there already.

Running A9OPTIM and checking the output

Assuming you have compiled A9OPTIM, now run

A9OPTIM > out &

Check the out file to see if it found a connected path:

 Connected path found
  ts        E+         Ets - E+          Ets       Ets - E-          E-          S       D      gamma   ~N
   6     -152.1547993 0.44154        -151.7132543 0.80458E-01    -151.7937119   5.161   5.080   2.191  15.060
   4     -151.7937119  3.4194        -148.3743518  3.3870        -151.7613738  13.537  12.878   1.437  22.970
   5     -151.7613738  2.2821        -149.4793084  1.2224        -150.7017332   6.943   5.868   2.648  12.462
   2     -150.7017332  58.767         -91.9349718  58.767        -150.7017332   3.265   2.382  14.873   2.219
   3     -150.7017332 0.48419        -150.2175395 0.18851        -150.4060492   5.113   5.034   1.724  19.137

 Number of TS in the path       =      5
 Number of cycles               =      2

 Elapsed time=                                18.63
 OPTIM> # of energy calls=                          0 time=           0.00 %=  0.0
 OPTIM> # of energy+gradient calls=             39852 time=          15.76 %= 84.6
 OPTIM> # of energy+gradient+Hessian calls=       133 time=           1.40 %=  7.5

Hoorah! Have a look at EofS with your favourite graphing program; it contains the energy of the stationary points on the connected pathway as a function of their integrated path length. Note that it is in "triples" format, i.e. min1,TS1,min2,min2,TS2,min3,... hence the duplication of lines. Likewise, path.xyz contains the coordinates of the stationary points on the connected path, as triples and in XYZ format. So you can visualize a path using an appropriate molecular viewer.
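Because of the triples format, consecutive minima appear twice in EofS. A small sketch for dropping the duplicated points before plotting (this assumes two numeric columns per line, integrated path length then energy; check your own EofS file to confirm the column layout):

```python
def read_eofs(path):
    """Read an EofS-style file with two columns per line (assumed:
    integrated path length, energy) and drop consecutive duplicate
    points arising from the min,TS,min triples format."""
    points = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) < 2:
                continue
            s, e = float(fields[0]), float(fields[1])
            if not points or (s, e) != points[-1]:
                points.append((s, e))
    return points
```

The resulting list of (path length, energy) pairs can be passed straight to your favourite plotting tool.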

path.info is the key file needed as input for PATHSAMPLE. It contains

  • the energy
  • the order of the molecular point group and its symbol
  • 3N Hessian eigenvalues, with N=number of atoms. Note the 6 zero eigenvalues, which may not seem very close to zero but just need to be orders-of-magnitude smaller than the 3N-6 non-zero ones.
  • 3N coordinates

for each structure in a min-TS-min triple, for all triples found during the OPTIM run.
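A quick sanity check on the eigenvalues in path.info, in the spirit of the note above, is to count how many are orders of magnitude smaller than the largest (count_zero_modes is a hypothetical helper; the ratio threshold is an assumption you may need to tune for your system):

```python
def count_zero_modes(eigenvalues, ratio=1e-4):
    """Count eigenvalues that are 'zero' in the sense described above:
    orders of magnitude smaller in magnitude than the largest one.
    For a non-linear molecule you expect exactly 6."""
    mags = [abs(e) for e in eigenvalues]
    largest = max(mags)
    return sum(1 for m in mags if m < ratio * largest)
```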

If the run failed to find a connected path :-( there could be several reasons: the single-ended TS searches failed to converge, there was trouble minimising, the number of connect cycles was too small given the initial path length and expected "complexity" of the paths, or the DNEB parameters were poorly chosen. Talk to someone with expertise (I'm volunteering DJW...:-) ) to diagnose and fix the problem.

Running PATHSAMPLE to read in the information from this OPTIM connect run

Now you know that you can make connected paths, you can use the above OPTIM job to seed a PATHSAMPLE kinetic transition network (or database).

You can run this job in the same directory as the OPTIM connect run; make sure you have at least these files present:

  • path.info.startup, which is a copy (for future reference) of path.info from the OPTIM connect job.
  • perm.allow
  • pathdata, containing:
STARTFROMPATH path.info.startup 0 0
STARTTRIPLES
TRIPLES
NATOMS         33
ETOL           1.0D-4
ITOL 1.0D1
GEOMDIFFTOL 1.0D-1
TEMPERATURE   0.592
CYCLES 0
PERMDIST
AMBER9

Make sure the value of NATOMS is correct for your system! Also note that for AMBER (and CHARMM) the quantity specified on the TEMPERATURE line is kT in units of kcal/mol. So 0.592 corresponds to a chosen temperature of 298 K.
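The conversion between a target temperature in kelvin and the kT value for the TEMPERATURE keyword is just multiplication by the gas constant in kcal/(mol K):

```python
R_KCAL_PER_MOL_K = 1.987204259e-3  # gas constant in kcal/(mol K)

def kT_kcal(T_kelvin):
    """kT in kcal/mol, the units the TEMPERATURE keyword uses
    for AMBER and CHARMM potentials."""
    return R_KCAL_PER_MOL_K * T_kelvin

print(round(kT_kcal(298.0), 3))  # 0.592, matching the pathdata above
```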

N.B. If you have existing min.data, min.A, min.B, ts.data, points.[min,ts] files in the working directory then a STARTFROMPATH job will stop with a useful message without doing anything.

Run PATHSAMPLE like this:

PATHSAMPLE >& output &

Check the output file for consistency with the OPTIM out file. The bottom of output should look like this:

getallpaths> writing data for new ts to ts.data
getallpaths> writing data for      1 new minima to min.data
getallpaths> writing data for new ts to ts.data
getallpaths> writing data for      2 new minima to min.data
getallpaths> writing data for new ts to ts.data
getallpaths> writing data for      2 new minima to min.data
getallpaths> writing data for new ts to ts.data
getallpaths> writing data for new ts to ts.data
getallpaths> writing data for new ts to ts.data
getallpaths> writing data for      1 new minima to min.data
setup> The unique A and B minima are     0 and     0 respectively

Has it got the right number of minima and TSs? Compare with the last CONNECT cycle in out. Note that PATHSAMPLE has removed the duplicates from the triples in the path.info file, and we are faking the indices of the minima we want to be referred to as A and B (the two zeroes on the STARTFROMPATH line of pathdata).

You'll now have files min.data, ts.data, min.A, min.B, points.min, points.ts. See the List of output files for PATHSAMPLE for an explanation of their format.

Now you must edit the min.A/min.B files (by hand, using a text editor) so that the number on the second line of each is the PATHSAMPLE index of the start/finish end point minimum (i.e. the corresponding line number in min.data). These are NOT the same as the 1,2 indices of the end point minima in OPTIM, because they are determined by the order of triples written to the path.info file!!!

You will know the energies of the start and finish minima from previous OPTIM/GMIN runs; to find the PATHSAMPLE indices of these minima, search for those energies (to an appropriate number of significant figures) in the newly-made min.data file. The line number on which you find the entry for that energy is the index that should go into min.A/B. Make a note of whether A is start and B is finish, or the other way round. It will be important to know this later.

Note also that, in general, you can have more than one minimum in each of the A and B sets, and no minimum can belong to both simultaneously. You can edit the min.A/B files as you wish between PATHSAMPLE jobs - the definition of the end point sets is important for certain types of run (when the concept of a discrete path is explicitly used, e.g. SHORTCUT, DIJKSTRA, or when we're calculating phenomenological quantities like committor probabilities or overall rate constants).
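The energy search described above can be scripted. A sketch (find_min_index is a hypothetical helper, assuming the energy is the first column of min.data, which is its documented layout):

```python
def find_min_index(min_data_file, target_energy, tol=1e-4):
    """Return the 1-based line numbers in min.data whose first column
    (the energy) matches target_energy to within tol; that line number
    is the PATHSAMPLE index to put on the second line of min.A/min.B."""
    hits = []
    with open(min_data_file) as f:
        for i, line in enumerate(f, start=1):
            fields = line.split()
            if fields and abs(float(fields[0]) - target_energy) < tol:
                hits.append(i)
    return hits
```

More than one hit means the tolerance is too loose, or you have near-degenerate minima; tighten tol or compare coordinates before committing an index to min.A/min.B.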