Difference between revisions of "CHECKSPMUTATE"

From Docswiki

Revision as of 13:32, 2 June 2020

Purpose

It can take an awfully long time to create a large database, or to fully optimise a specific feature of it, such as a fully connected pathway showing complex protein folding. The problem is particularly acute for large proteins and protein+ligand systems.

What if we are interested in examining how a Wild Type protein behaves with respect to some carefully selected mutants? Or in comparing one protein against a close homologue? It would seem like a colossal waste of time to create large databases for each of these similar cases, completely independently from each other. This is where CHECKSPMUTATE comes in.

CHECKSPMUTATE uses the CHECKSPODATA routine, which reoptimises the minima/transition states of a database. CHECKSPMUTATE extends this by allowing user-selected sections of the coordinates of the stationary points comprising the database to be mutated before the reoptimisation takes place. Thus a database can be transformed so that it describes the behaviour of a mutated protein as opposed to the Wild Type. Though this new database will need to be tidied up through the use of, e.g., SHORTCUT, SHORTCUT 2 BARRIER and UNTRAP, this process should be far quicker than starting a whole new database from scratch. In the example below, I shall show how a pathway describing the approach of the cofactor, NADH, towards another cofactor, haem, within the pocket of HemS (a pathway which took months to find and fully connect) could be quickly replicated in a system where the wt HemS has been replaced by a mutated form (or even by another protein entirely).

Example of Mutation

Preparation

To use CHECKSPMUTATE, we first need a database of interest, or a subset of it. In my example, I had used DIJKSTRA to extract the minima and transition states comprising the pathway I was interested in, and moved the new min.data, points.min, points.ts and ts.data files to a new directory. Each of the stationary points in my database therefore described a stage along this pathway. I wanted to see how this pathway, describing the approach of NADH to haem within the wt HemS pocket, changed when certain mutations were made to the HemS structure. One residue of interest was a phe-gate (which appeared to regulate the approach of NADH), and so a mutation from phenylalanine to alanine (F104A) was made. I made input files for the new mutated system using tleap, as described elsewhere in Preparing an AMBER topology file for a protein plus ligand system and Symmetrising AMBER topology files. This gave me coords.inpcrd, coords.prmtop and perm.allow files for my new system. These were moved to the same directory where I had the min.data, points.min, points.ts and ts.data files for the original system.

Before running the reoptimisations, we need to prepare a series of auxiliary files. In all, we should have the following files in our directory:

  • aa_ringdata.pyc list of parameters/definition of planes for residues with rings. Only required if we are mutating to a residue with a ring.
  • amino_acids.pyc list of parameters for all residues.
  • atomnumberlog list of indices of the first atom of the residues to be mutated.
  • coordinates_mut.pyc script which mutates the selected residue.
  • coords.inpcrd ensure this is for the system we are CHANGING TO.
  • coords.mdcrd for use with min.in, may not be required.
  • coords.prmtop ensure this is for the system we are CHANGING TO.
  • min.A needs to be present, although the index listed in it is unimportant.
  • min.B needs to be present, although the index listed in it is unimportant.
  • min.data ensure this is for the system we are CHANGING FROM. Could be an entire database, or a section of it (such as a pathway) found using DIJKSTRA.
  • min.in for use with AMBER. Defines certain aspects of the model, such as solvent being used.
  • mutate_aa.py organises the residues to be mutated, and does the mutations in conjunction with coordinates_mut.pyc.
  • newreslog list of codes for residues which we are CHANGING TO.
  • nresidueslog list of the total number of mutations to be made to the system.
  • odata.checksp list of conditions for optimisation carried out on each stationary point by OPTIM.
  • oldreslog list of codes for residues which we are CHANGING FROM.
  • original_protein.pdb pdb file for the system we are CHANGING FROM. All lines which do not describe an ATOM (e.g. TITLE, TER and END) must be removed, so that the number of lines in the file corresponds to the number of atoms in the system.
  • pathdata organises OPTIM jobs. Certain keywords are required, described below.
  • perm.allow full description of the groups of permutable atoms in the system we are CHANGING TO.
  • points.min ensure this is for the system we are CHANGING FROM. Could be an entire database, or a section of it (such as a pathway) found using DIJKSTRA.
  • points.ts ensure this is for the system we are CHANGING FROM. Could be an entire database, or a section of it (such as a pathway) found using DIJKSTRA.
  • resnumberlog list of indices of the residues to be mutated.
  • ts.data ensure this is for the system we are CHANGING FROM. Could be an entire database, or a section of it (such as a pathway) found using DIJKSTRA.
  • submission_script
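
Since a single missing file can waste a queued job, it may be worth checking the directory before submitting. The following is a minimal sketch, not part of CHECKSPMUTATE itself: the function names missing_files and keep_atom_records are hypothetical helpers introduced here, and the file names are simply copied from the list above. The submission script is left out of the check because its name is site-specific. The second helper shows one way to produce the stripped original_protein.pdb; it assumes that every atom in the system is written as an ATOM record, so if your pdb file labels ligand atoms as HETATM you would probably need to pass record_names=("ATOM", "HETATM") to keep the line count equal to the atom count.

```python
from pathlib import Path

# File names copied from the list above (assumption: names are used verbatim).
REQUIRED_FILES = [
    "amino_acids.pyc", "atomnumberlog", "coordinates_mut.pyc",
    "coords.inpcrd", "coords.prmtop", "min.A", "min.B", "min.data",
    "min.in", "mutate_aa.py", "newreslog", "nresidueslog", "odata.checksp",
    "oldreslog", "original_protein.pdb", "pathdata", "perm.allow",
    "points.min", "points.ts", "resnumberlog", "ts.data",
]


def missing_files(directory, needs_ring_data=False):
    """Return the required file names that are absent from `directory`.

    aa_ringdata.pyc is only checked when mutating to a residue with a ring.
    coords.mdcrd is never checked, since it may not be required.
    """
    directory = Path(directory)
    wanted = list(REQUIRED_FILES)
    if needs_ring_data:
        wanted.append("aa_ringdata.pyc")
    return [name for name in wanted if not (directory / name).is_file()]


def keep_atom_records(pdb_lines, record_names=("ATOM",)):
    """Keep only lines whose record name is in `record_names`.

    This drops TITLE, TER, END and similar lines, so that the number of
    remaining lines can be compared with the number of atoms.
    """
    return [line for line in pdb_lines if line.startswith(tuple(record_names))]


if __name__ == "__main__":
    # Hypothetical usage: check the current directory before submitting.
    for name in missing_files(".", needs_ring_data=False):
        print("missing:", name)
```

For the F104A example above, alanine has no ring, so needs_ring_data can stay False; mutating in the opposite direction (to phenylalanine) would need aa_ringdata.pyc as well.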