Identifying the k fastest paths between endpoints using KSHORTESTPATHS

When analysing a folding mechanism from the fastest path (the path that makes the largest contribution to the steady-state rate constant), you may want to examine how common the features you identify are across the ensemble of pathways between your endpoints. This can be done in many ways, but the first step is to extract the k fastest paths for comparison. This is done using the KSHORTESTPATHS keyword in PATHSAMPLE.

From the documentation:

KSHORTESTPATHS npaths nconnmin: calculate the npaths paths with the largest contributions to the rate constant when intervening 
minima are put in steady state. nconnmin specifies that minima with nconnmin connections or fewer should be 
disregarded in this analysis.

The default setting removes 'islands' of minima and transition states from the database. As these are disconnected from both endpoints, they do not influence the rate, and therefore we do not want to waste time and memory considering them. This routine generates database files containing only the remaining minima and transition states:

  • points.min.removed
  • min.data.removed
  • min.B.removed
  • min.A.removed
  • ts.data.removed
  • points.ts.removed

Copy these files into a new directory, rename them by removing the .removed suffix, and work from the renamed files.
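The copy-and-rename step can be scripted. The following is a minimal sketch, not part of PATHSAMPLE: the function name strip_removed_suffix and the directory names in the usage note are hypothetical, and it assumes the .removed files sit together in one directory.

```python
import shutil
from pathlib import Path


def strip_removed_suffix(src_dir, dest_dir, suffix=".removed"):
    """Copy every file ending in `suffix` from src_dir into dest_dir,
    dropping the suffix from the copied file's name.

    Returns the list of new file names, in sorted order of the originals.
    """
    src = Path(src_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    copied = []
    for path in sorted(src.iterdir()):
        # Only touch regular files such as min.data.removed or points.ts.removed.
        if path.is_file() and path.name.endswith(suffix):
            target = dest / path.name[: -len(suffix)]
            shutil.copy2(path, target)  # the original file is left in place
            copied.append(target.name)
    return copied
```

For example, strip_removed_suffix(".", "kshortest_run") would copy min.data.removed to kshortest_run/min.data, and so on for the other files listed above. The originals are copied rather than moved, so the full database stays intact; other input files that your run needs are not handled by this sketch.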

Before we begin: minimising database size

As this process is memory-intensive, you may want to reduce the size of your database beforehand by removing any 'islands' of minima and transition states using REMOVEUNCONNECTED:

REMOVEUNCONNECTED set : the database is rewritten to files min.data.removed, points.min.removed, etc. Minima disconnected from set, 
where set is A B or AB, are removed, along with all transition states that connect them. Minima with NCONNMIN connections or fewer are 
also removed (this condition is applied recursively). The default for set is AB, which will remove minima that have no connection to 
any member of the A or B sets.

Some words of warning

  • The paths dumped are ranked according to their Dijkstra contribution. The MFPT and rates given in the PATHSAMPLE output, however, are those obtained when GT has been used to account for recrossings along the path. This GT analysis is done for each of the k paths individually and therefore does not consider any off-pathway minima.
  • Because the two measures differ, the Dijkstra ranking and the GT ranking can disagree: for k=250, the GT rate for pathway 250 may be significantly faster than that for pathway 50. One way around this problem is to dump many more paths than you need (say 1000), re-rank them according to their GT rates, and take only the fastest 250. This reduces the chance that you have excluded a path with a high GT rate but a low Dijkstra contribution, but it does not eliminate it. Be careful!
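The re-ranking workaround described in these warnings can be sketched as follows. This is an illustration only: the PathRecord type and both function names are hypothetical, and it assumes you have already extracted one GT rate per dumped path from the PATHSAMPLE output (the parsing step is not shown, since the output format is not described here).

```python
from typing import NamedTuple


class PathRecord(NamedTuple):
    dijkstra_rank: int  # 1 = largest Dijkstra contribution
    gt_rate: float      # GT rate reported for this path on its own


def fastest_by_gt(paths, keep):
    """Return the `keep` paths with the largest GT rates, fastest first."""
    return sorted(paths, key=lambda p: p.gt_rate, reverse=True)[:keep]


def promoted_paths(paths, keep):
    """Return the Dijkstra ranks of paths that make the GT top `keep`
    even though their Dijkstra rank lies beyond `keep`.

    These are the paths that would have been missed by dumping only
    `keep` paths in the first place.
    """
    return [p.dijkstra_rank for p in fastest_by_gt(paths, keep)
            if p.dijkstra_rank > keep]
```

With 1000 dumped paths, fastest_by_gt(paths, 250) gives the set to analyse, and a long list from promoted_paths(paths, 250) is a sign that the Dijkstra and GT rankings disagree strongly for your database. In that case it may be worth dumping more paths before trusting the selection.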

To do

  • dumping the fastest paths
  • looking at correlation between Dijkstra contribution and GT rates for paths
  • visual inspection of a selection of paths to compare general features of mechanism
  • construct tree from only shortest paths as in Jo's beta3s paper (colour minima if they are in both fastest and slowest of set, if they are in just fastest or just slowest)