OPTIM and PY ellipsoids tutorial
GMIN: Find some minima
First run GMIN to find the global minimum and a high energy minimum (or any two stationary points).
OPTIM: Finding a single pathway
Input:
- odata -- has lines like those in GMIN's data file to specify the potential, energy criteria, etc.
  - NEWCONNECT (to specify an attempt to connect POINTS and finish)
  - BFGSMIN (to specify the minimizer)
  - BFGSTS (to specify the transition state search)
  - PATH # (to specify the number of slices along the path for rbpath.xyz if a connection is found)
  - NEBK # (adjusts the spring constant in the elastic band)
  - RBSYM (needs the file rbsymops to specify symmetry operations, see below)
  - DUMPALLPATHS (if you need to make a database; creates path.info)
  - DIJKSTRA -- DIJKSTRA 0 for calculating rate constants along a connected pathway; DIJKSTRA EXP for finding a path; DIJINITSTART n can also be used to find a path
  - CYCLES -- 0 for DIJKSTRA 0, non-zero otherwise. I use CYCLES 1000 when I want something to run until I stop it.
  - lots of other lines
  - POINTS followed by lines starting with PY that give the starting geometry
- finish -- has finishing geometry (no PY atom labels)
- perm.allow -- for identical rigid bodies, need "1 / N 0 / 1 2 ... N". This file cannot be omitted for rigid body setups.
- pysites.xyz
- rbsymops -- has syntax: [number of operations] / [blank line] / [x] [y] [z] [deg] / ..., where xyz is the axis of rotation and deg is the rotation angle in degrees. I've never gotten rbsymops to behave to my satisfaction!
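Since perm.allow and rbsymops are short, fixed-layout files, they can be generated rather than typed by hand. The sketch below assumes that each "/" in the notation above stands for a line break. The function names are made up for illustration and are not part of OPTIM.

```python
def perm_allow_text(n_bodies):
    """Contents of perm.allow for one group of n_bodies identical rigid bodies.

    Layout assumed from the notation "1 / N 0 / 1 2 ... N",
    reading each "/" as a line break.
    """
    if n_bodies < 1:
        raise ValueError("need at least one rigid body")
    ids = " ".join(str(i) for i in range(1, n_bodies + 1))
    return f"1\n{n_bodies} 0\n{ids}\n"


def rbsymops_text(operations):
    """Contents of rbsymops for a list of (x, y, z, deg) tuples.

    (x, y, z) is the rotation axis and deg the rotation angle in degrees.
    Layout assumed: operation count, one blank line, then one operation per line.
    """
    lines = [str(len(operations)), ""]
    for x, y, z, deg in operations:
        lines.append(f"{x} {y} {z} {deg}")
    return "\n".join(lines) + "\n"
```

For example, `open("perm.allow", "w").write(perm_allow_text(8))` would write the file for eight identical bodies. Whether a given set of operations makes rbsymops behave is a separate question that this sketch does not address.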
Output from running OPTIM:
- EofS -- energy as a function of integrated path length for first pathway found
- EofS.1, ... -- like EofS for other paths found
- energies -- contains energy values dumped by DEBUG
- min.data.info -- empty unless DUMPDATA is specified in odata.start and odata.finish
- odata.read -- possibly obsolescent; used for IO with programs that use OPTIM
- path.1.xyz, ... -- like rbpath.xyz but for the other pathways from EofS.1, ...
- path.info -- contains stationary points and transition states in PATHSAMPLE format
- path.xyz -- empty when using MULTISITEPY, since the data is placed instead into rbpath.xyz
- points -- dumped coordinates from DEBUG (?)
- rbpath.xyz -- like ellipsoid.xyz but contains slices along the path
To visualize [should fix this up..]:
- Run vmd
- source plotGMINmspath.tcl (I keep mine in ~/tools)
- initGMINms rbpath.xyz
- plotGMINms 0.0
Creating a PATHSAMPLE database
Starting from an OPTIM run
Input: Even if the OPTIM job did not find a connection between the two minima, it should have produced a number of stationary points and transition states and put them in path.info.
- odata.connect -- PATHSAMPLE provides endpoints to OPTIM and supplies an odata.# derived from this file, so it should be like the odata file used to find the initial pathway, but remove all the lines after POINTS, since PATHSAMPLE will provide different ones.
- path.input -- path.info from OPTIM run, renamed to avoid a naming clash
- pathdata
  - STARTFROMPATH path.input
  - RBAA # -- number of rigid bodies (mutually exclusive with NATOMS specification)
  - lots of other lines
- perm.allow
- pysites.xyz
- rbsymops
Output:
- min.A -- number of A minima, then IDs of those minima. Initially set to 1 / 0.
- min.B -- like min.A for B minima
- min.data -- one line per minimum. The ID of a minimum is its line number in this file
- path.info
- points.min -- binary file containing the configurations of the minima
- points.ts -- binary file containing transition state configurations
- ts.data -- one line per transition state
Starting from minima only
Input:
- odata.start
  - DUMPDATA (so PATHSAMPLE can get at the minima)
  - Only has lines for geometry optimization, so no CONNECT, BFGSTS, etc.
  - Ends with POINTS and then the starting geometry
- odata.finish -- like odata.start, but with finishing geometry
- pathdata
  - DIJINITSTART
- odata.connect (can be omitted if you run DIJINITCONT afterwards)
  - DUMPALLPATHS
Output: As usual. Rewrites any database that was there previously.
To continue the job, change DIJINITSTART to DIJINITCONT.
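Swapping one pathdata keyword for another is easy to script when restarting jobs often. The following is a minimal sketch, assuming each keyword is the first whitespace-separated token on its line and is written in upper case as in this tutorial; `rename_keyword` is a hypothetical helper, not part of PATHSAMPLE.

```python
def rename_keyword(pathdata_text, old, new):
    """Replace keyword `old` with `new` wherever it starts a line.

    Any arguments following the keyword are kept unchanged.
    Returns the new text and the number of lines that were changed.
    """
    out = []
    changed = 0
    for line in pathdata_text.splitlines():
        tokens = line.split(None, 1)
        if tokens and tokens[0] == old:
            rest = tokens[1] if len(tokens) > 1 else ""
            line = (new + " " + rest).rstrip()
            changed += 1
        out.append(line)
    return "\n".join(out) + "\n", changed
```

Calling `rename_keyword(text, "DIJINITSTART", "DIJINITCONT")` on the contents of pathdata returns the edited text. A change count of zero is a hint that the keyword was missing or spelled differently.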
Expanding a PATHSAMPLE database
Part I
Input: It might be better to just copy everything from the PATHSAMPLE database creation step to a new folder and expand the database there.
- odata.connect
- min.A -- change the ID from 0 to desired start minimum (probably global minimum). Later on you can put more structures in here to work with kinetics.
- min.B -- change the ID from 0 to desired end minimum (probably a high energy minimum)
- min.data
- pathdata
  - remove the STARTFROMPATH line
  - DIJKSTRA 0 (again, only if the paths are connected!)
  - CYCLES 0
- perm.allow
- points.min
- points.ts
- pysites.xyz
- rbsymops
- ts.data
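The min.A and min.B edits in the list above can be scripted with a sanity check on the chosen IDs. This is a sketch under two assumptions: the ID of a minimum is its line number in min.data (as stated earlier), and when a set holds several IDs they go one per line after the count. The second point is a guess extrapolated from the initial "1 / 0" layout. The helper names are hypothetical.

```python
def count_minima(min_data_text):
    """Number of minima in the database: one line of min.data per minimum."""
    return len(min_data_text.splitlines())


def min_set_text(ids, n_minima):
    """Contents of min.A or min.B: the number of minima, then their IDs.

    Raises ValueError if an ID is not a valid line number in min.data.
    Assumed layout: count on the first line, then one ID per line.
    """
    for min_id in ids:
        if not 1 <= min_id <= n_minima:
            raise ValueError(
                f"minimum ID {min_id} is not a line number in min.data (1..{n_minima})"
            )
    return "\n".join([str(len(ids))] + [str(i) for i in ids]) + "\n"
```

For instance, `min_set_text([1], count_minima(open("min.data").read()))` gives the text for a min.A that points at the first minimum in the database.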
Output:
- Epath -- fastest pathway calculated by DIJKSTRA 0 from the database
- min.A.fastest -- stationary points from original database that are in pathway in Epath
- min.B.fastest
- min.data.fastest -- minima along pathway in Epath
- points.min.fastest -- binary file of coordinates of minima in min.data.fastest
- points.ts.fastest -- as above, for ts.data.fastest
- redopoints -- can be fed into an OPTIM connect job (see below)
- ts.data.fastest -- cf min.data.fastest
Part II
If DIJKSTRA 0 calculates some rate constants, then the A and B minima are connected. To continue expanding the database, you can either
- place a different minimum ID in min.B or
- set CYCLES 1000 and use the UNTRAP (for potential energy minima) or FREEPAIRS (for free energy minima) keyword in pathdata to automate the new selection process. Consider also using CONNECTREGION, SHORTCUT, or SHORTCUT with BARRIER.
If PATHSAMPLE complains that there is no connection between the A and B sets, then we need to make some changes to continue expanding the database. Change pathdata to include DIJINITCONT and CYCLES 1000 to establish a connection.
When PATHSAMPLE starts slowing down in its discovery of minima and the topology of the disconnectivity graph stops changing, you can probably safely stop the computation. If you are unsatisfied with the topology, it's time to try a new connection scheme.
Finding an optimized pathway using redopoints
The PATHSAMPLE job will produce a redopoints file. Feeding this file back into OPTIM will produce an rbpath.xyz that may be smoother than the one produced initially. Without the PATH keyword, OPTIM produces an rbpath.xyz that has just transition states and minima. Using PATH will put in more frames along the pathway that are not transition states or minima.
Input:
- min.A -- same as used to create PATHSAMPLE database
- min.B
- odata
  - PATH 100 or something
Output:
- Normal connect output, but now with a pathway that has more frames
Making a disconnectivity graph
Input:
- min.data
- ts.data
- dinfo -- There is apparently no documentation for these keywords!
  - MINIMA min.data
  - TS ts.data
  - CENTREGMIN -- puts the global minimum in the middle and generally prettifies the tree
  - somehow you can color according to some metric
  - NCONNMIN # -- displays only those minima that have at least # connections with other minima in the database
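To keep the dinfo keywords listed above in one place while experimenting with NCONNMIN values, the file can be assembled by a small script. The sketch covers only the keywords named in this section; a usable dinfo may need other keywords that are not covered here. `dinfo_text` is a hypothetical helper.

```python
def dinfo_text(minima_file="min.data", ts_file="ts.data",
               centre_gmin=True, nconnmin=None):
    """Assemble dinfo lines from the keywords described in this tutorial.

    nconnmin: if given, only minima with at least this many connections
    to other minima in the database are displayed.
    """
    lines = [f"MINIMA {minima_file}", f"TS {ts_file}"]
    if centre_gmin:
        lines.append("CENTREGMIN")
    if nconnmin is not None:
        lines.append(f"NCONNMIN {nconnmin}")
    return "\n".join(lines) + "\n"
```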
Output:
- tree.ps
I got clever here and created a sibling directory to the one I was running PATHSAMPLE in. I made soft links to min.data and ts.data so that I could run disconnectionDPS in this separate place while using the most up-to-date data from PATHSAMPLE.
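The sibling-directory setup can be reproduced with a few lines of Python. This is a sketch only: the directory names are whatever you choose, and `link_database_files` is a hypothetical helper that links the two input files listed above.

```python
import os


def link_database_files(pathsample_dir, graph_dir,
                        names=("min.data", "ts.data")):
    """Create soft links in graph_dir that point at the live PATHSAMPLE files."""
    os.makedirs(graph_dir, exist_ok=True)
    for name in names:
        target = os.path.abspath(os.path.join(pathsample_dir, name))
        link = os.path.join(graph_dir, name)
        # Replace a stale link, but never delete a regular file of the same name:
        # os.symlink raises FileExistsError in that case.
        if os.path.islink(link):
            os.remove(link)
        os.symlink(target, link)
```

Because the links point at the original files, every run of disconnectionDPS in the graph directory sees the current state of the database without any copying.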
Running PATHSAMPLE in parallel
PATHSAMPLE should not be run on a number of nodes greater than the number of possible OPTIM jobs. For example, if you start a database with just two minima, then you should only run on one node. Otherwise PATHSAMPLE will just hang.
Input:
- pathdata
  - change CPUS 1 to PBS 1
  - EXEC (to OPTIM location)
  - COPYFILES perm.allow pysites.xyz rbsymops
  - SSH (only if running on sinister!)
- pbsscript
  - #PBS -q (set the appropriate queue)
  - #PBS -l (set the number of nodes)
  - /home/???/bin/PATHSAMPLE (should point to yours, or at least a functional copy)
To go back to running interactively, switch back to CPUS 1. You may have to delete the nodes.info file.
Checking up on your job
- Look at logfile (using, e.g., tail -f logfile)
- Look at the OPTIM.* files produced by each run
- Check the nodes in use (listed in 'output'): ssh to a node, use top to see whether OPTIM is running, and look in /scratch/??? for the actual OPTIM job files.
Common PATHSAMPLE gotchas
- weird permission problems -> SSH keyword should be turned on only on sinister
- perm.allow has too many bodies -> fix the RBAA line in pathdata
- PATHSAMPLE stops abruptly -> change CYCLES 0 to CYCLES 1000
- "WAIT return system code error" -> JOBSPERNODE/PBS versus CPUS
- "cannot stat" -> check that DUMPALLPATHS is on in odata.connect
- Too many minima -> OPTIM might be misidentifying TS as minima, so use
  - CHECKINDEX [no parameters need to be specified as long as BFGSTS is defined first]
  - USEEV 10
  - NOIT
Cure-alls
- Remove the pair information (rm pair*)
Unresolved gotchas
- points.min attempt to read non-existent record ->
- "mv: cannot move `something' to ..." -> permissions? JOBSPERNODE/PBS vs CPUS again?
- "Operation not permitted" ... "error code returned by host stdio - 1." -> close the terminal and open a new one!?
- Runtime Error: Record too long for input buffer; Program terminated by I/O error on unit 11 (File="min.data",Formatted,Sequential) -> ??
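For the last error, one thing worth ruling out is a malformed line in min.data. The sketch below does not fix anything and is not a known cure; it only reports the longest line and any lines whose number of fields differs from the most common one, on the assumption that every line of a healthy file has the same number of fields. `scan_records` is a hypothetical helper.

```python
from collections import Counter


def scan_records(text):
    """Return (line number of longest line, its length, odd line numbers).

    "Odd" lines are those whose field count differs from the most common
    field count in the file. Line numbers start at 1.
    """
    lines = text.splitlines()
    if not lines:
        return None, 0, []
    lengths = [len(line) for line in lines]
    longest = max(range(len(lines)), key=lengths.__getitem__)
    field_counts = [len(line.split()) for line in lines]
    usual = Counter(field_counts).most_common(1)[0][0]
    odd = [i + 1 for i, n in enumerate(field_counts) if n != usual]
    return longest + 1, lengths[longest], odd
```

Running `scan_records(open("min.data").read())` and inspecting the reported lines in an editor is a cheap first check before digging further.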