Pathsampling short paths


If your initial connected path is short, and there weren't very many off-pathway stationary points found in the initial connection run, then you should think about the underlying energy landscape before pathsampling.

Go back to the GMIN run (or set one up!) and check the energy range of all the minima found (plot the file "energy" in your favourite graphing program). Also look at "markov" and the overall acceptance ratio for the run to see how effective the chosen input settings were. Look at the structures of the x lowest-energy minima found (lowest[1-x].pdb) to examine the structural range covered. To explore more of the landscape, you could try rerunning with a higher temperature, with more final quenches, or with different Monte Carlo steps.
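If you would rather get the energy range as numbers than read it off a plot, something like the following sketch may help. It reads whitespace-separated text and reports the lowest value, the highest value and the spread of one column. The layout of the GMIN "energy" file is an assumption here (by default the last field on each line is taken as the energy, so check your own file and adjust `column`), and the function name `energy_range` is purely illustrative.

```python
def energy_range(lines, column=-1):
    """Return (lowest, highest, spread) of one numeric column.

    `column` is an assumption about the file layout: by default the last
    whitespace-separated field on each line is treated as the energy.
    Blank lines, and lines whose chosen field is not a number, are skipped.
    """
    values = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        try:
            # Accept Fortran-style exponents such as -1.0D1 as well.
            text = fields[column].replace("D", "E").replace("d", "e")
            values.append(float(text))
        except (ValueError, IndexError):
            continue
    if not values:
        raise ValueError("no numeric energies found")
    lowest, highest = min(values), max(values)
    return lowest, highest, highest - lowest
```

Any iterable of lines will do, so an open file handle for "energy" can be passed in directly.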

One, several, or none of the sections below may then describe appropriate pathsample runs for your system:

Multiple conformational families

If you have minima corresponding to more than two distinct conformational families (for example, among the structures found in the GMIN run), you can set up additional, separate OPTIM connection-making runs between all the pairs of end points of interest, in the same way as described in "Finding an initial path with OPTIM and starting up PATHSAMPLE". However, once you have the path.info file(s) from OPTIM, you may not want to use them to start a new database. Instead, you can add them, one at a time, to an existing PATHSAMPLE kinetic transition network (KTN) with a pathdata file like this:

ADDPATH path.info.add
ADDTRIPLES
TRIPLES
NATOMS         33
ETOL           1.0D-4
ITOL 1.0D1
GEOMDIFFTOL 1.0D-1
TEMPERATURE   0.592
CYCLES 0
PERMDIST
AMBER9

Copy the relevant OPTIM path.info to ./path.info.add in the directory containing the existing KTN, then run this PATHSAMPLE job there. Note that this run will NOT change or add anything to the min.A and min.B files. Also, DON'T RUN an ADDPATH job while a different kind of PATHSAMPLE job is running in the same directory, as that could corrupt the database files! In fact, this is a general rule: don't run multiple PATHSAMPLE jobs at the same time in the same directory.
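With several path.info files to add, the copy-then-run step can be scripted. The sketch below is one possible way to do it, not part of PATHSAMPLE: it stages each file as path.info.add and runs one job at a time. `run_pathsample` is a hypothetical function that you supply to launch PATHSAMPLE in the KTN directory and wait for it to finish, and the `pathsample.lock` file is a convention of this sketch only, used to refuse to start if another scripted job is still active in that directory.

```python
import shutil
from pathlib import Path


def add_paths_one_at_a_time(path_info_files, ktn_dir, run_pathsample):
    """Stage each OPTIM path.info as path.info.add and run one ADDPATH job.

    `run_pathsample` is a caller-supplied (hypothetical) function that
    launches PATHSAMPLE in `ktn_dir` and returns only when the job is done.
    The lock file is a convention of this sketch, not a PATHSAMPLE feature.
    """
    ktn_dir = Path(ktn_dir)
    lock = ktn_dir / "pathsample.lock"
    added = []
    for source in path_info_files:
        # Exclusive creation raises FileExistsError if a lock is present.
        with open(lock, "x"):
            pass
        try:
            shutil.copyfile(source, ktn_dir / "path.info.add")
            run_pathsample(ktn_dir)
            added.append(Path(source))
        finally:
            lock.unlink()
    return added
```

Running the jobs strictly in sequence keeps to the one-job-per-directory rule above; the lock only protects against other copies of this script, not against PATHSAMPLE jobs started by hand.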

Systematically increasing the number of connections per minimum

Starting from an existing KTN, with all the necessary files for AMBER in the working directory, add two new files called odata.tssearch and odata.path. These two files ask OPTIM, respectively, to do a single-ended transition state (TS) search and to find the pathway defined by an input TS and the two connected minima reached by stepping off the TS parallel and antiparallel to the eigenvector whose eigenvalue is negative.

odata.tssearch:

DUMPVECTOR
ENDHESS
UPDATES 6000
EDIFFTOL  1.0D-4
MAXERISE 1.0D-4 1.0D0
GEOMDIFFTOL  0.05D0
BFGSTS 500 10 100 0.01 100
NOIT
PERMDIST
MAXSTEP  0.1
TRAD     0.2
MAXMAX   0.3
BFGSCONV 1.0D-6
PUSHOFF 0.1
STEPS 800
BFGSSTEPS 2000
MAXBFGS 0.1
NAB start

odata.path:

PATH
DUMPPATH
ENDHESS
UPDATES 6000
COMMENT MAXTSENERGY -4770.0
EDIFFTOL  1.0D-4
MAXERISE 1.0D-4 1.0D0
GEOMDIFFTOL  0.05D0
BFGSTS 500 10 100 0.01 100
NOIT
BFGSMIN 1.0D-6
PERMDIST
MAXSTEP  0.1
TRAD     0.2
MAXMAX   0.3
BFGSCONV 1.0D-6
PUSHOFF 0.1
STEPS 800
BFGSSTEPS 2000
MAXBFGS 0.1
NAB start

Many of these options are the same as in the odata file described in "Finding an initial path with OPTIM and starting up PATHSAMPLE", but there are some important changes in the first few lines of each file.

Then, pathdata should contain something like this:

CHECKCONNECTIONS
CONNECTIONS    6
PERTURB 0.7
CYCLES 0
TRIPLES
NATOMS         33
ETOL           1.0D-4
ITOL 1.0D1
GEOMDIFFTOL 1.0D-1
TEMPERATURE   0.592
COPYFILES perm.allow min.in coords.prmtop coords.inpcrd
EXEC /home/jmc49/bin/A9OPTIM
PERMDIST
AMBER9

The commands at the top are the most important: CHECKCONNECTIONS, CONNECTIONS and PERTURB. Play around with the values of CONNECTIONS (the target number of directly-connected TSs that lead to different minima for each minimum in the database) and PERTURB (the initial maximum size of the Cartesian perturbation applied to each coordinate, in the prevailing units) to find something that works well for your system. Of course, don't forget to change EXEC and the other system-specific settings appropriately :-) Note that this kind of PATHSAMPLE job can currently only be run as a serial calculation.
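To get a feel for a sensible CONNECTIONS value, it can help to see how many minima currently fall short of a given target. The sketch below counts distinct neighbouring minima per minimum from the lines of ts.data. Two things are assumptions here: that the indices of the two minima joined by each TS sit in the fourth and fifth whitespace-separated fields (check this against your own ts.data and change `min_columns` if needed), and that counting distinct neighbours matches what CONNECTIONS targets. PATHSAMPLE's own bookkeeping may differ.

```python
from collections import defaultdict


def minima_below_target(ts_lines, n_minima, target, min_columns=(3, 4)):
    """List minima with fewer than `target` distinct neighbouring minima.

    Assumption: each ts.data line holds the indices of the two minima a TS
    connects in the zero-based fields given by `min_columns`.
    """
    neighbours = defaultdict(set)
    for line in ts_lines:
        fields = line.split()
        if len(fields) <= max(min_columns):
            continue  # blank or short line
        first, second = (int(fields[c]) for c in min_columns)
        if first == second:
            continue  # TS links a minimum to itself: no different minimum
        neighbours[first].add(second)
        neighbours[second].add(first)
    return [m for m in range(1, n_minima + 1) if len(neighbours[m]) < target]
```

Here `n_minima` would be the number of lines in min.data, and `target` the CONNECTIONS value you are considering.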

Such a PATHSAMPLE run will add new structures found in the OPTIM jobs to the PATHSAMPLE KTN (min.data, ts.data, points.min and points.ts will be updated on the fly). This type of run is most appropriate when the starting KTN is very small (i.e. few minima and TSs), to generate more stationary points before running a CONNECTREGION job (see below).

Increasing the overall connectivity of the KTN using CONNECTREGION

In the directory containing the files from the pathsampling runs so far, edit pathdata to include only the following:

TRIPLES
NATOMS         33
ETOL           1.0D-4
ITOL 1.0D1
GEOMDIFFTOL 1.0D-1
TEMPERATURE   0.592
CONNECTIONS    1
COPYFILES perm.allow min.in coords.prmtop coords.inpcrd
EXEC /home/jmc49/bin/A9OPTIM
CONNECTREGION 1 2
CYCLES 50
PAIRLIST 1
PERMDIST
AMBER9

and submit this PATHSAMPLE job, on multiple processors if available (add a JOBSPERNODE <n> or CPUS <n> line to pathdata as appropriate). Set EXEC and CYCLES appropriately: CYCLES tells PATHSAMPLE how many cycles of OPTIM jobs to run, while the number of parallel, independent OPTIM jobs per cycle is determined by the JOBSPERNODE/CPUS settings and the number of compute nodes the PATHSAMPLE job itself is running on. As far as OPTIM data files are concerned, a CONNECTREGION run only needs an odata.connect.
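When choosing CYCLES it is useful to have a rough idea of how many OPTIM jobs the run will launch in total. The following is a back-of-the-envelope sketch only: it assumes that each cycle launches one job per available slot, i.e. the JOBSPERNODE (or CPUS) value multiplied by the number of nodes, which you should check against how your own runs behave. The function and argument names are illustrative.

```python
def total_optim_jobs(cycles, jobs_per_node, nodes=1):
    """Estimate the number of OPTIM jobs launched over a whole run.

    Assumption: each cycle launches jobs_per_node * nodes independent jobs.
    """
    if min(cycles, jobs_per_node, nodes) < 0:
        raise ValueError("arguments must be non-negative")
    return cycles * jobs_per_node * nodes
```

Under that assumption, CYCLES 50 with four jobs per node on two nodes would correspond to `total_optim_jobs(50, 4, 2)`, i.e. 400 OPTIM jobs.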

The result of such a run should be an expanded KTN, i.e. more entries in min.data, ts.data and the corresponding points files.
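A quick way to confirm that the KTN really has grown is to count the entries before and after the run. The sketch below assumes one entry per non-blank line in min.data and ts.data; the function names are illustrative and not part of PATHSAMPLE.

```python
from pathlib import Path


def ktn_size(ktn_dir):
    """Count non-blank lines in min.data and ts.data.

    Assumption: each file holds one stationary point per line.
    A missing file is reported as zero entries.
    """
    sizes = {}
    for name in ("min.data", "ts.data"):
        path = Path(ktn_dir) / name
        if path.exists():
            with open(path) as handle:
                sizes[name] = sum(1 for line in handle if line.strip())
        else:
            sizes[name] = 0
    return sizes


def ktn_growth(before, after):
    """Return how many entries each file gained between two snapshots."""
    return {name: after[name] - before.get(name, 0) for name in after}
```

Take one snapshot with `ktn_size(".")` before submitting the job and another when it has finished, then pass both to `ktn_growth`.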