QCMagic: Difference between revisions

Revision as of 12:59, 26 March 2019

QCMagic is a set of python libraries to manipulate qchem output, orbital-coefficient etc. files. You can clone it with

  git clone ch-thom@git.csx.cam.ac.uk:QCHEM/qcmagic

The current main branch is winterClean and needs some TLC and documentation.

Documentations are underway for the side branch BCHSummer2017.

Usage

Having cloned QCMagic, assuming it is in your home directory, add the following to your .bashrc:

  export PATH=$PATH:$HOME/qcmagic
  export PYTHONPATH=$PYTHONPATH:$HOME/qcmagic
  export VMDSIMPLEGRAPHICS=1

The last is to ensure you can open a VMD window (provided X forwarding is enabled, and you have VMD installed locally).

If there are problems using VMD via SSH try using xpra

Probably the best way to explore the capabilities of qcmagic is to use iPython. Say you have a Q-Chem output file myfile.out containing a single job; in iPython type:

  from QCManager import *
  O = OutputFile('myfile.out') # The class OutputFile is defined in QCOutput.py
  O.Parse()                    # Extract data from myfile.out. Returns False if no Q-Chem errors are found.

Some data will be stored as values in a container, which is a special type of dictionary defined in QCSupport.py, whilst others are regular attributes of O. For example, if myfile.out is the result of a Q-Chem optimisation ('jobtype opt'), you could find the initial and final energies with:

  Ei = O.Steps[0]['InternalEnergy']
  Ef = O.Steps[-1]['InternalEnergy']

and the initial and final geometries with:

  Xi = O.Steps[0]['System'].Atoms
  Xf = O.Steps[-1]['System'].Atoms
  # Or equivalently:
  Xf = O.System.Atoms

makePlots.py

Note This currently appears to work with qchem input files, but not output files.

Example command:

  python makePlots.py my_qchem_file.inp -T tag

This runs a qchem calculation on my_qchem_file.inp and plots the orbitals in VMD. Output files will be saved to a new directory tag, or Unnamed if tag is unspecified. Run with --help to view all possible options.

Once the plot is up, the display can be controlled entirely from the terminal. For example, view orbital #3 by typing 3. Type help to view all available commands.

scanSurface.py

Overview

A general command for scanSurface has the following form:

  scanSurface.py [options] qchem_file [qchem_file_2 ...] Tag

where qchem_file's are old Q-Chem input or output files whose inputs are used as the system (and assumed to be setup as above). If these files contain saved minima, they are stored and relocated. Outputs from scanSurface are stored in Tag.sd (a pickled Python data structure containing all calculation results), Tag.dat (a printout of energies at different geometries), and when necessary, Tag.dat2 (a printout of NOCI energies at different geometries).

The principle of working of scanSurface is rather simple: from the initial geometry of the system supplied in the old Q-Chem inputs or outputs, each electronic state requested will be followed as the geometry of the molecule is varied. The electron density of the state at the previous geometry is read in as the initial guess for the convergence procedure performed by Q-Chem at the current geometry. Upon convergence, a state distance is calculated between the electron densities at the current and previous geometry, so that only when the new density is located within a certain maximum threshold of the old density can it be accepted as belonging to the same state. scanSurface saves the Q-Chem output file for every calculation as Run*.out.

An example command is:

  scanSurface.py --stretch=0,1,1,0.01 my_qchem_file.out Tag

which runs a scan on a molecule specified in the input section of my_qchem_file.out, calling Q-Chem at points on a surface defined by the stretching of the bond between atom0 and atom1 by 10 units in steps of 0.01, and writing the results to Tag.sd and Tag.dat.

If multiple options are used, such as

  scanSurface.py --stretch=0,1,1,0.01 --stretch=0,2,1,0.01 my_qchem_file.out Tag

the scan will zigzag to cover the whole surface.

Options

scanSurface can be run with --help to view all possible options. In this Section, important remarks for some options are presented.

`--read-minima`

This option is used to specify the minima located by metadynamics in the specified Q-Chem output file that scanSurface should read in and follow. It must be noted that this is one-based and not zero-based, so --read-minima=1 ensures only the first minimum located by metadynamics is scanned. On the other hand, --read-minima=0 would have no effect on the scan. For example,

  scanSurface.py --read-minima=1,2,6 --stretch=0,1,1,0.01 my_qchem_file.out Tag

instructs scanSurface to read in minima 1, 2 and 6 from the old Q-Chem output file my_qchem_file.out, follow each of them as the bond between atom0 and atom1 is stretched by 10 units in steps of 0.01, then store the results as states 0, 1 and 2 in Tag.sd and print out the energies in Tag.dat.

Note that in order for scanSurface to read in the requested minima, the Q-Chem output file must have been generated with the rem option print_orbitals set to true.

`--read-only`

If on, scanSurface only reads and stores the states from the Q-Chem output file to an SD file without trying to reconverge to any of them. This is useful when one seeks to export the states located by metadynamics in a Q-Chem output file to an SD file for further manipulation, analysis or extraction.

`--normal-rotate`

This option may be used in simple cases to change a single bond angle. It relies on being able to work out the connectivity from knowledge of the bond lengths and VDW radii, so can be problematic if (a) the bond lengths are unusual, and (b) there are multiple paths between two atoms. For example,

   --normal-rotate=0,1,2,90,0.1

will:

split the molecule between atom0 and atom1;
check if each of the other atoms is connected to atom0; if they are, group them with atom0 and if not, group them with atom1; and
rotate the group of atoms containing atom1 about an axis through atom1, and normal to the plane containing atom0, atom1 and atom2.

`--tie`

This option allows multiple parameters to change in sync with one another. For example, one might want to investigate the effect of varying two bond lengths at the same time, but would like to ignore what happens when one is stretched and the other unchanged. This could be very useful in the context of molecular vibrations, or when certain symmetry elements need to be preserved.

For each parameter that needs to be tied, two integers are required:

one referring to the option:

ext_charge -- 0, scale -- 1, stretch -- 2, rotate -- 3, normal_rotate -- 4, orth_rotate -- 5, all_normal_rotate -- 6, rotate_axis -- 7;

another one specifying the particular instance of said option (zero-based).

An example of a command using --tie:

  scanSurface.py --stretch=0,1,1,0.01 --rotate=2,3,90,0.9 --scale=2 --tie=2,0,3,0 my_qchem_file.out Tag

This will:

simultaneously stretch bond atom0-atom1 by 1 unit, and rotate about bond atom2-atom3 by 90°, in 100 steps;
scale the whole molecule up to twice its original size. As the --scale option is not tied, scanSurface will scale separately every possible geometry allowed by the previous (tied) options.

Note that an equal number of steps for each tied parameter must be ensured in order for --tie to work correctly!

`--focus`

If this flag is on, once scanSurface fails to locate the correct minimum at a geometry, it will not attempt to find other minima at that geometry but move on to the next geometry instead, where it will use the minimum on the same state at the last well-converged geometry as reference. If scanSurface locates the correct minimum after a few other minima that belong to different states, only the correct minimum is saved and used as reference for the next geometry, while all other superfluous minima are discarded.

This option is useful when following states that pass through regions of discontinuity (and non-convergence), as it prevents scanSurface from getting trapped in a cascading landslide where it tries to look for other minima and follow them, only to be stuck in more regions of discontinuity (and non-convergence) of these new minima later on, halting the state-tracking process altogether.

qcSDExtract.py

Currently qcSDExtract only works with traditional, non-augmented SDS objects (i.e. those that do not contain variations in exchange-correlation).

Overview

qcSDExtract reads in states stored in a single SD file generated by scanSurface, performs further calculations on them if need be, and extracts out their energies, atomic charges and spins from Mulliken analysis, as well as oxidation states of individual atoms from LOBA analysis. qcSDExtract can parametrise geometries using multiple bond lengths, bond angles, and dihedral angles, allowing multi-dimensional surfaces to be traced out.

A general command for qcSDExtract has the following form:

  qcSDExtract.py [options] SDfile Tag

where SD_file is an SD file containing information of all states read in and possibly followed by scanSurface. Several output files are generated by qcSDExtract:

Tag.energy, containing all geometrical parameters requested and the energies of the associated states;
Tag.atomi.charge, containing all geometrical parameters requested and the Mullikan charges of atomi in the associated states;
Tag.atomi.spinz, containing all geometrical parameters requested and the Mullikan spin charges (i.e. spin imbalances) of atomi in the associated states; and
Tag.atomi.oxstate, containing all geometrical parameters requested and the LOBA oxidation states of atomi in the associated states.

An example command is:

  qcSDExtract.py --read-minima=0,1,2 --read-geometries=0,1,2,3,4 --bondlength=0,1,0,2,1,2 --bondangle=1,0,2 --rem="LOBA 12" --atoms=0 mydata.sd Tag

which:

reads in states 0, 1 and 2 at geometries 0, 1, 2, 3 and 4 stored in my_data.sd;
parametrises these geometries using bond lengths atom0--atom1, atom0--atom2, atom1--atom2, and bond angle atom1--atom0--atom2;
performs additional LOBA 12 calculations on these minima; and
outputs Tag.energy, Tag.atom0.charge, Tag.atom0.spinz and Tag.atom0.oxstate, where the three latter files contain details for atom0.

In addition, the actual Q-Chem output file generated when qcSDExtract calls Q-Chem to reconverge into state j at geometry i (i and j are 0-based indices) and/or to perform additional calculations is saved as _SDfile.gi.sj.out (note the underscore prefix).

Options

`--read-minima`

At least one 0-based index of one or more states to read in from the SD file must be specified. Multiple indices are separated by commas. For example,

   --read-minima=0,1,2

instructs qcSDExtract to read in minima 0, 1 and 2 from the SD file.

Q: How do I know which (1-based) minimum in the original Q-Chem output file corresponds to which (0-based) minimum in the SD file after scanSurface has been run?

A: Examine the stdout of scanSurface and look for the pattern "Old min XYZ -> New sol ABC" which shows the mapping of the (1-based) indices in the original Q-Chem output file, XYZ, to the (0-based) indices in the SD file, ABC.

`--read-geometries`

This option allows one to specify the 0-based indices of one or more geometries to be examined, separated by commas. If no indices are specified, or if this option is omitted altogether, all geometries will be scanned. For example, if only geometries with indices 0, 5 and 10 are needed, one should specify

   --read-geometries=0,5,10

which is useful when only certain interesting portions of the surface are required to be examined.

`--atoms`

This option allows the specification of a list of atom indices (0-based) from which chemical properties are extracted. If this option is omitted, properties will be extracted from only atom0. For example, to extract chemical properties of atom0 and atom1 along state 0 at all geometries, one would use the command

   qcSDExtract.py --read-minima=0 --atoms=0,1 --rem="LOBA 12" my_file.sd Tag

`--bondlength`

This is where one can specify the bond lengths that should be used to parametrise the geometries. This, together with the following two options, --bondangle and --bonddihedral, provides a workaround to construct a full multi-dimensional surface from the flatten-down one-dimensional storage of geometries originally implemented in scanSurface.

Here, one specifies a list of 0-based indices of pairs of atoms whose bond lengths will be used to parametrise the geometries. Therefore, the list must contain an even number of elements, otherwise qcSDExtract will exit with an error. For example,

   --bondlength=0,1,0,2,1,2

parametrises the geometries using three bond lengths: atom0--atom1, atom0--atom2, and atom1--atom2. If this option is omitted, atom0--atom1 bond length will be used by default.

`--bondangle`

Here, one specifies a list of 0-based indices of triplets of atoms whose bond angles will be used to parametrise the geometries. The list must contain a multiple of three of elements. In each triplet, the middle index is the centre atom. For example,

   --bondangle=1,0,2,1,0,3

uses the bond angles atom1--atom0--atom2 and atom1--atom0--atom3 to parametrise the geometries. There is no default for this option.

`--bonddihedral`

This option specifies a list of 0-based indices of triplets of atoms whose bond dihedrals will be used to parametrise the geometries. The list must contain a multiple of four of elements. In each quartet, the first three atoms form the initial plane. For example,

   --bonddihedral=0,1,2,3

uses the dihedral angle between the two planes atom0--atom1--atom2 and atom1--atom2--atom3 to parametrise the geometries. There is no default for this option.

varyXCB.py

Overview

varyXCB reads in minima from a Q-Chem output file or an SD file (traditional or augmented) and follows them as exchange-correlation functionals are varied, or as basis sets are changed.

A general command for varyXCB has the following form:

  varyXCB.py [options] file Tag

where file is a Q-Chem output file or an SD file containing minima to be read in. If the SD file contains an instance of class SystemDataSet, it will be augmented to an instance of class SystemDataSetXC, placing all existing minima at exchange-correlation functional index 0 (see later). However, if the SD file already contains a SystemDataSetXC object, it will be read in as-is, but an index for exchange-correlation functionals must be specified via --read-xc so that varyXCB will know which minima to follow (see later).

When run in exchange-correlation following mode, varyXCB outputs the energies of the minima along the parametrised path in the exchange-correlation space in Tag.gi.sj.dat, and stores all minima at all exchange-correlation functionals along the path in Tag.xc.sd, which contains a SystemDataSetXC object. On the other hand, when run in basis projection mode, varyXCB outputs the energies of the minima in the new basis set in Tag.newbasis.gi.sj.dat, and stores all minima in the new basis set in Tag.b.sd, which contains a SystemDataSetXC object.

An example command for exchange-correlation following mode is:

  varyXCB.py --xc-template=xctemplate --read-minima=1,2,3 --para=lambda,0,1,0.1 my_qchem_file.out Tag

which:

reads in minima 1, 2 and 3 from my_qchem_file.out and stores them at exchange-correlation functional index 0 of states 0, 1 and 2 respectively in the SystemDataSetXC object in Tag.xc.sd;
uses the content of xctemplate as a template for $xc_functional section in Q-Chem runs;
understands that lambda in xctemplate represents a variable parameter that needs to be increased from 0 to 1 in steps of 0.1 to define a path in the exchange-correlation space;
runs Q-Chem at different exchange-correlation functionals; and
stores all calculation results in Tag.xc.sd and outputs Tag.gi.sj.dat for j=0,1,2 which contains the energies of states 1, 2 and 3 (read from the Q-Chem output file) as followed along the specified exchange-correlation path.

Another example command for basis projection mode is:

  varyXCB.py --basis-template=changebasis.template --purecart=1112 --read-minima=0,2,5 --read-geometries=0 --read-xc=1
             --rem="SCF_MAX_CYCLES 100" --rem="SCF_MINFIND_READDISTTHRESH 00500" file.sd Tag

which:

reads in minima 0, 2 and 5 at geometry 0 and exchange-correlation functional index 1 from file.sd and stores them at exchange-correlation functional index 0 of states 0, 1 and 2 respectively in the SystemDataSetXC object in Tag.b.sd;
reads in the new basis set specified in changebasis.template;
uses purecart specification 1112 for the new basis set;
runs basis projection calculations with additional rem options;
outputs Tag.newbasis.g0.si.dat containing the energies of state i=0,1,2 at geometry 0 in the old and new basis sets;
outputs Run-new_basis.gi.sj.xc1.out which is the Q-Chem output file for the actual basis-set projection calculation of state i=0,1,2 at geometry 0 and exchange-correlation 1; and
stores all calculated data in Tag.b.sd.

Structure of a `QCSystemData.SystemDataSetXC` Object (for developers only)

QCSystemData.SystemDataSetXC (SystemDataSetXC for short) inherits from the base class QCSystemData.SystemDataSet (SystemDataSet for short) with some existing methods overridden and some new methods implemented to cope with the incorporation of exchange-correlation functional variations.

An SD file is a pickled Python dictionary containing three keys: 'Output', 'Job' and 'SysData'. Let us call this dictionary D. D['SysData'] is then an instance of the class SystemDataSet or the class SystemDataSetXC, depending on how it has been constructed. Here we will only focus on the latter case.

D['SysData'] contains various useful methods and variables. However, all important calculation information is contained within the two variables GeomList and StateList.

`GeomList`

GeomList is a one-dimensional list containing elements which are instances of class QCMolecule.System. Each element therefore carries attributes for describing and manipulating a particular geometry (i.e. atom arrangement) of the system. For example, GeomList could contain successive geometries of a diatomic molecule, XY, where the bond length X--Y is gradually increased in a particular range.

In other words, if GL is a GeomList, then GL[i] is an instance of class QCMolecule.System and contains information of geometry with index i.

`StateList`

StateList is a one-dimensional list containing elements which are dictionaries. Each dictionary then contains key-value pairs, each of which corresponds to a state whose the index is given by the key. The corresponding value is yet another list whose elements are instances of QCSystemData.SCFStateRich (a class inheriting from QCSystemData.SCFState and able to store more information such as LOBA oxidation states, atomic charges and spins, as well as exchange-correlation functionals) and describe the state at different exchange-correlation functionals.

In other words, if SL is a StateList, then

SL is a list with a one-to-one correspondence to GL;
SL[i] is a dictionary containing information of all states at geometry with index i, GL[i];
SL[i][j] is a list containing information at all exchange-correlation functionals of state with index j at geometry with index i; and
SL[i][j][k] is an instance of QCSystemData.SCFStateRich containing information at exchange-correlation functional with index k of state with index j at geometry with index i.

Options

varyXCB can be run with -h or --help to display all options supported and how to use them. Here we explain the two most important options that control the working of varyXCB.

`--xc-template`

In this option, the path to the template file containing specifications of exchange and correlation functionals in terms of parameters is specified. There is no default value for this option, so if it is omitted, no exchange-correlation tracking will be carried out. The template file must have the following format:

   $xc_functional
       X   exchange_functional_name       parametrised_expression
       ...
       C   correlation_functional_name    parametrised_expression
       ...
       K                                  parametrised_expression !This is an alternative specification for Hartree-Fock exchange -- see Q-Chem manual
   $end

The file must begin with $xc_functional and end with $end, and its format must follow that of the $xc_functional section in Q-Chem inputs, noting that the parametrised expressions must contain no white spaces, e.g.

   1-(0.80*lambda)+gamma

where lambda and gamma are parameters. Any parameters used in the template file must then be declared using the --para option of varyXCB in order for varyXCB to parse the template file correctly. Therefore, for a template file containing the above expression, the following options must be present:

   --para=gamma,0,1,0.5 --para=lambda,0,1,0.1

`--basis-template`

In this option, the path to the template file containing the specification of a general, larger basis set is indicated. There is no default value for this option, so if it is omitted, no basis set projection will be carried out. This file must begin with $basis and end with $end, and its format must follow that of the $basis section in Q-Chem inputs. In particular,

   $basis
       general_basis_set_specification
   $end

When run in basis projection mode, varyXCB reads in minima in the smaller basis set (this must be one of the standard basis sets in Q-Chem and not a general basis set) and projects them into the larger, general basis set specified in the template file.

Note that the option --purecart MUST be specified when --basis-template is used, so that Q-Chem will know what types of functions (pure or Cartesian) to use for the general basis set.

radDist.py (not fantastic -- to be deprecated soon)

Overview

radDist can be used to extract and manipulate radial wavefunctions associated with multiple real spherical harmonics that contribute to the form of a general orbital, which might very well be symmetry-broken. For the time being, the orbital must be centred at the origin of the .cube file in order for radDist to work properly.

More specifically, for a general orbital centred at the origin that can be written in the following form,

   $\psi ({\mathbf {r}})=\sum _{lm_{l}}R_{lm_{l}}(r)Y_{lm_{l}}(\theta ,\phi ),$

where we have allowed for multiple real spherical harmonics, $Y_{lm_{l}}$ , to contribute, the main purpose of radDist is to pick out the various radial components by exploiting the orthogonality of real spherical harmonics,

   $R_{lm_{l}}(r)={\frac {\int Y_{lm_{l}}(\theta ,\phi )\psi ({\mathbf {r}})\sin \theta \,{\text{d}}\theta {\text{d}}\phi }{\int Y_{lm_{l}}^{2}(\theta ,\phi )\sin \theta \,{\text{d}}\theta {\text{d}}\phi }}=\int Y_{lm_{l}}(\theta ,\phi )\psi ({\mathbf {r}})\sin \theta \,{\text{d}}\theta {\text{d}}\phi$

as the spherical harmonics are normalised. However, numerically, we must calculate this as

   $R_{lm_{l}}(r)\approx {\frac {\sum _{i}Y_{lm_{l}}(\theta _{i},\phi _{i})\psi ({\mathbf {r}}_{i})\sin \theta _{i}}{\sum _{i}Y_{lm_{l}}^{2}(\theta _{i},\phi _{i})\sin \theta _{i}}}$

where the summation runs over all data points in the same bin $r\rightarrow r+{\text{d}}r$ .

Extracting radial wavefunction components: no flags

radDist has several modes of running. The standard mode is for extracting radial wavefunctions. A general command for radDist running in standard mode requires no special flags, and has the following format:

  varyXCB.py [options] File1 [File2 File3...] Tag

where File1 (File2, File3,...) is a .cube file of an orbital generated using Q-Chem. radDist can work on multiple .cube files (sequentially though, unfortunately) and extract the requested spherical harmonic components.

An example command for radDist in standard mode is:

  radDist.py --dr=0.1 --rmax=5.0 -l 0,1,2 mo.40.cube mo.41.cube d.all.50.2

which:

instructs radDist to consider the orbitals stored in the two files mo.40.cube and mo.41.cube and extract the radial components corresponding to one s ( $l=0$ ), three p ( $l=1$ ), and five d ( $l=2$ ) spherical harmonics from the origin (by default) to a maximum radius of 5.0 Bohr radii in increments of 0.1 Bohr radii;
stores the results in d.all.50.2.mo.40.cube.dat and d.all.50.2.mo.41.cube.dat for the two .cube files respectively. In each .dat file (i.e. for each orbital), the various radial components (not normalised individually) will be stored as columns of a matrix in increasing order of $l$ and $m_{l}$ within each $l$ (i.e. 0,0 -- 1,-1 -- 1,0 -- 1,+1 -- 2,-2 -- 2,-1 -- 2,0 -- 2,+1 -- 2,+2). The very first column of the matrix contains the radial distances, and the very last one contains the effective radial function, which is defined as the signed square root of the sum of the squares of all the spherical harmonic components considered (i.e. the Pythagoras hypotenuse, see later). The sign of the effective radial function follows that of the largest (magnitude) spherical harmonic contribution (a bit like the heavy-atom method in X-ray crystallography, but not quite, since this is not the full phase of the effective radial function). The effective radial function should be close to being normalised in the sense that

   $\int _{r=0}^{\infty }|R_{\text{eff}}(r)|^{2}r^{2}\,{\text{d}}r\approx \sum _{i}|R_{\text{eff}}(r_{i})|^{2}r_{i}^{2}\Delta r_{i}\approx 1;$

collects the results for all .cube files in d.all.50.2.sd, which is a pickled Python dictionary whose keys are the mocubefilenames, mo.40.cube and mo.41.cube, and whose corresponding values are the corresponding radial function matrices.

Combining radial wavefunctions from different runs: `--combine`

This mode is accessed by switching on the flag --combine. In this mode, radDist reads in several .sd files and attempts to combine the radial function matrices of the same orbitals. It is thus necessary that these matrices have the same columns (i.e. they must contain the same spherical harmonic components). The combined matrices are also sorted based on the radial distance, $r$ .

This mode is useful when one needs to combine the radial wavefunction components of the same orbital but generated with different --dr and from various .cube files of different dots per Anstrom (dpa) densities. One might need to plot the same orbital with different dpa's so as to obtain better resolutions for regions where there is a steep variation, e.g. that of an exponential decay.

An example command for radDist in Combine mode is:

  radDist.py --combine --combine-range=9,1000,0,1000 -l 0,1,2 d.all.50.2.sd d.all.100.-1.sd d.all.combined

which:

combines the radial wavefunction matrix of each orbital stored in d.all.50.2.sd (from row 9 to row 1000 or the end, whichever is earlier) with the radial wavefunction matrix of the same orbital in d.all.100.-1.sd (from row 0 to row 1000 or the end, whichever is earlier), if there is one;
sorts the combined matrices based on column 0, which contains the radial distance r;
adjusts the normalisation of the effective radial wavefunction (the last column) in the matrix of each orbital by demanding that ${\frac {1}{N^{2}}}\sum _{i}|R_{\text{eff}}(r_{i})|^{2}r_{i}^{2}\Delta r_{i}=1$ exactly;
stores the combined matrices of all orbitals in d.all.combined.sd; and
writes the combined matrix of each orbital to d.all.combined.orbitalname.dat.

Procedure for obtaining a normalised effective radial wavefunction with varying resolutions along $r$

Let's say we have two symmetry-broken orbitals (which are orbitals 40 and 41 in calculation.out that resemble d-orbitals but are oriented in strange manners. We would expect the radial wavefunction corresponding to each spherical harmonic for each orbital to have a strong exponential increase/decay near the origin, and so we seek to generate a .cube file with high dpa for the inner region close to the origin. For the outer region, a low-dpa .cube file would suffice. We thus run the following commands:

  makePlots.py -d --tag=orbitals.50.2 -b 2 --range=40,41 --dpa=50 calculation.out
  makePlots.py -d --tag=orbitals.100.-1 -b -1 --range=40,41 --dpa=100 calculation.out

where:

the tag -d is to suppress VMD (feel free to leave this out if opening VMD is desirable, perhaps for checking the orbital);
the tag -b specifies the border (in Angstrom) added to each side of the bounding box. This is particularly important in the high-dpa plot run, because if the volume of the bounding box were large, it would be very resource-consuming to perform a high-dpa plot.

When both commands return, two directories, orbitals.50.2 and orbitals.100.-1, have been created. We seek to decompose each orbital into its s, p, and d components, as follows:

  cd orbitals.50.2/_systemfragments_plot.scratch/plots
  radDist.py --dr=0.1 --rmax=5.0 -l 0,1,2 mo.40.cube mo.41.cube dlike.all.50.2
  cp dlike.all.50.2.* ../../../
  cd ../../../orbitals.100.-1/_systemfragments_plot.scratch/plots
  radDist.py --dr=0.02 --rmax=5.0 -l 0,1,2 mo.40.cube mo.41.cube dlike.all.100.-1
  cp dlike.all.100.-1.* ../../../
  cd ../../../

We note that:

The dpa=100 orbitals are sampled into a smaller bin size (--dr=0.02) due to their denser plot points.
Th request to extract the s, p, and d spherical harmonic components is made through the use of the -l 0,1,2 tag.
The .dat files generated at this stage, dlike.all.50.2.mo.40.cube.dat, dlike.all.50.2.mo.41.cube.dat, dlike.all.100.-1.mo.40.cube.dat, and dlike.all.100.-1.mo.41.cube.dat, contain the various spherical harmonic components of each of the two orbitals, as well as the corresponding unnormalised effective radial wavefunction.
The radial wavefunction matrices for both orbitals are also stored in dlike.all.50.2.sd and dlike.all.100.-1.sd.
What we have are essentially high-resolution plots of radial wavefunctions in intervals of 0.02 Angstrom within the radius 0.90 Angstrom, and medium-resolution plots in intervals of 0.10 Angstrom beyond the radius of 0.90 Angstrom.

We now need to combine the results from the two runs for each orbital. This is achieved by:

  radDist.py --combine --combine-range=9,1000,0,1000 -l 0,1,2 dlike.all.50.2.sd dlike.all.100.-1.sd dlike.all.combined

This does the followings:

For each orbital, rows 9 to 1000 (or the end, whichever is earlier) of the radial wavefunction matrix in dlike.all.50.2.sd are appended by rows 0 to 1000 (or the end, whichever is earlier) of the corresponding radial wavefunction matrix in dlike.all.100.-1.sd;
The resulted matrices are then sorted based on the first column containing values of radial distances from the origin;
The ${\textstyle \sum _{i}|R_{\text{eff}}(r_{i})|^{2}r_{i}^{2}\Delta r_{i}}$ value for each effective radial wavefunction is computed (which should be close to 1). But to ensure exact normalisation, each effective radial wavefunction is divided by ${\textstyle {\sqrt {\sum _{i}|R_{\text{eff}}(r_{i})|^{2}r_{i}^{2}\Delta r_{i}}}}$ ; and
The combined matrices are printed out in dlike.all.combined.mo.40.cube.dat and dlike.all.combined.mo.41.cube.dat, as well as stored in dlike.all.combined.sd.

Calculating the root-mean-square (rms) of multiple effective radial wavefunctions: `--rmsReff`

Occasionally it is necessary to compute the rms of the effective radial wavefunctions of M different orbitals, defined as

   $|R_{\text{rms}}(r)|={\sqrt {{\frac {1}{M}}\sum _{i}^{M}|R_{{\text{eff}},i}(r)|^{2}}}.$

The normalisation of $R_{{\text{eff}},i}(r)$ ensures that $R_{\text{rms}}(r)$ is also normalised. For example, when we have identified a set of symmetry-broken d-orbitals with the 3d-subshell, we seek to determine a unique effective radial wavefunction for this 3d-subshell, which can be achieved by finding the rms of the effective radial wavefunctions of these symmetry-broken 3d-orbitals.

An example command is:

  radDist.py --rmsReff --rmsReffmo=mo.40.cube,mo.41.cube --rmsReffname=3drms dlike.all.combined.sd dlike.all.Reffrms

which:

computes the rms of the effective radial wavefunctions of the orbitals mo.40.cube and mo.41.cube stored in dlike.all.combined.sd;
constructs a matrix whose first column contains the radial distances (copied over from one of the matrices in dlike.all.combined.sd), whose last column contains the calculated rms, and whose columns in between are the effective radial wavefunctions of the specified orbitals copied over;
stores this matrix in dlike.all.Reffrms.sd under the key named 3drms; and
writes this matrix to dlike.all.Reffrms.dat.

Note that when the flag --rmsReff is on, radDist only considers one .sd input. If more than one are specified, only the first one is considered.

Converting real radial wavefunction components to corresponding complex ones: `--realtocomplex` (in development)

In some calculations such as that for spin-orbit coupling, it is necessary to expand the orbital in complex spherical harmonics as

   $\psi ({\mathbf {r}})=\sum _{lm'_{l}}{\bar {R}}_{lm'_{l}}(r){\bar {Y}}_{lm'_{l}}(\theta ,\phi ),$

where a bar is used to denote a complex quantity. But earlier we have been able to expand the same orbital in real spherical harmonics and obtain the real radial wavefunctions:

   $\psi ({\mathbf {r}})=\sum _{lm_{l}}R_{lm_{l}}(r)Y_{lm_{l}}(\theta ,\phi ),$

so if the real and complex spherical harmonics are related by

   $Y_{lm_{l}}(\theta ,\phi )=\sum _{m'_{l}}S_{m_{l}m'_{l}}{\bar {Y}}_{lm'_{l}}(\theta ,\phi ),$

it can be shown that the complex radial wavefunctions are related to the real counterparts by

   ${\bar {R}}_{lm'_{l}}(r)=\sum _{m_{l}}R_{lm_{l}}(r)S_{m_{l}m'_{l}}.$

The transformation matrix ${\mathbf {S}}$ is defined as shown in [1].

when the flag --realtocomplex is switched on, radDist can read in a single .sd file containing real radial wavefunctions, go through each orbital and convert the real radial wavefunctions for each specified angular momentum l to corresponding complex radial wavefunctions.

An example command for this is:

   radDist.py --realtocomplex --realtocomplex-l=1,2 --realtocomplex-range=2,4,5,9 dlike.all.combined.sd dlike.all.r2c

From earlier, dlike.all.combined.sd contains real radial wavefunction matrices for mo.40.cube and mo.41.cube, each of which has s, p, and d components. But here we are only interested in, say, the p and d components,

Symmetry Tools

The symmetry tools are a series of commands used run symmetry related jobs on molecules whose geometry has got symmetry elements other than the identity (i.e. n-fold axis of rotation, inversion centre, mirror plane). The symmetry tools allow us to transform some set of orbitals according to some symmetry operations be able to compute relative energies or the orbitals. This allows us to then identify orbital degeneracy, excited states and much more. The symmetry tools are handled as shown below;

Step 1: metadynamics

The first step to using the symmetry tools is to run a metadynamics job on QChem which involves running an SCF job while settings the "SCF_SAVEMINIMA = N" as a $rem variable where N >= 1. Note: metadynamics is when N > 1. However, a single SCF determinant can also be used instead of having several minima. A sample job for metadynamics on Qchem which prints out 10 energy minima for the water molecule is shown below (The input file is water.inp);

$molecule O -2.56756 2.36129 -0.00000 H -1.57928 2.40480 -0.00000 H -2.85788 3.30911 -0.00000 $end

$rem

   BASIS                 6-31G
   METHOD                B3LYP
   UNRESTRICTED          true
   SCF_GUESS             core
   SCF_ALGORITHM         diis
   THRESH                18
   SCF_CONVERGENCE       7
   GEN_SCFMAN            false
   SCF_SAVEMINIMA        10
   SCF_MAX_CYCLES        100
   MOM_START             1
   PRINT_ORBITALS        true
   DIIS_SEPARATE_ERRVEC  true
   SYM_IGNORE true
   SCF_MINFIND_READDISTTHRESH 00500
   purecart              1111
   MEM_STATIC            1024

$end

The above job finishes successfully printing 10 minima in the output file, "water.out".

Step 2: runscanSurface.py

Once you have got your single determinant(s) (solution(s) for your SCF job) from step 1 above, you run the "runscanSurface.py" on the output file (water.out)as follows

$ runscanSurface.py -p 6 --read-minima=1,2,3,4,5,6,7,8,9,10 --read-only water.out YourNewOutputFileName > terminal.out Where -p 6 represents the number or CPU core used.

     --read-minima=1,2,3,4,5,6,7,8,9,10 represents the 10 minima printed in our water.out file. 
                                        You should only include the minimum or minima that you are interested in here. 
                                        You must not use all 10 minima
     --read-only water.out specifies the output file you are reading the minima from
     YourNewOutputFileName specifies the new output file for the runscanSurface.py run, which for the present 
                           job is waterExtract (No fill extension is neccesary here)

The above command prints files with .dat, .dat2 and .sd files in addition to the terminal.out file (for this example we generate, waterExtract.dat, waterExtract.dat2, waterExtract.sd and terminal.out ). The terminal.out simply lists the sequence of the run partly shown below;

Queue software is Local Adding Geometry 0 ind: 0 Trying a step with 10 minima... Looking for requested minimum with 1-based index 1... Solution with requested 1-based index 1 found. Old min 1 -> New sol 0 Looking for requested minimum with 1-based index 2... Solution with 1-based index 1 found but not requested, so ignored. Solution with requested 1-based index 2 found. Old min 2 -> New sol 1 Looking for requested minimum with 1-based index 3... Solution with 1-based index 1 found but not requested, so ignored. Solution with 1-based index 2 found but not requested, so ignored. Solution with requested 1-based index 3 found. Old min 3 -> New sol 2 Looking for requested minimum with 1-based index 4... Solution with 1-based index 1 found but not requested, so ignored. Solution with 1-based index 2 found but not requested, so ignored. Solution with 1-based index 3 found but not requested, so ignored. Solution with requested 1-based index 4 found. Old min 4 -> New sol 3 Looking for requested minimum with 1-based index 5... Solution with 1-based index 1 found but not requested, so ignored. Solution with 1-based index 2 found but not requested, so ignored. Solution with 1-based index 3 found but not requested, so ignored. Solution with 1-based index 4 found but not requested, so ignored. Solution with requested 1-based index 5 found. Old min 5 -> New sol 4 Looking for requested minimum with 1-based index 6... Solution with 1-based index 1 found but not requested, so ignored. Solution with 1-based index 2 found but not requested, so ignored. Solution with 1-based index 3 found but not requested, so ignored. Solution with 1-based index 4 found but not requested, so ignored. Solution with 1-based index 5 found but not requested, so ignored. Solution with requested 1-based index 6 found. . . . .

Step 3: Mapping the "Old min" to the "New sol"

The next step involves the use of the grep command to map "Old min" to the "New sol" in the terminal.out file using the following command

$ grep "Old min" terminal.out > oldminsol This simply copies all the lines having "Old min" in the terminal.out file and get them stored in a new "oldminsol" file (the file name could be anything). This is necessary because the old minima in our water.out file which are numbered 1 to 10 have changed to 0 to 9 in the .sd file generated after the "runscanSurface.py" run. So we need to specify this new order before executing step 4 below in which the .sd file is used.

Step 4: runqcSDExtract.py

The waterExtract.sd file generated in step 2 contains information for all specified minima in that step. Using that waterExtract.sd file from step 2, we run the "runqcSDExtract.py" command as follows;

$ runqcSDExtract.py -p 6 --reconverge waterExtract.sd waterReconverged.file

This command tries to reconverge every single minimum in the waterExtract.sd file and reconverges then into separate files. The number of output files from the run is dependent on the number of minima in the .sd file. In our example, we will have 10 of such files for each of "waterReconverged.file.g0.sn.xc0.reconverged.err", "waterReconverged.file.g0.sn.xc0.reconverged.out"(reconverged single solution). The bold n in the .out and .err file names proceeding "s" specifies the solution index number, in our example, this will range from 0 to 9 corresponding to solutions 1 to 10 in water.out file. In addition, we print the following files, "waterReconverged.file.xc0.atom0.charge.dat" (information on charge for each solution state), "waterReconverged.file.xc0.atom0.oxstate.dat"(information on oxidation state for each solution state), "waterReconverged.file.xc0.atom0.spinz.dat"(information on spin state for each solution state), "waterReconverged.file.xc0.energy.dat"(information on energy for each solution state). Some .scratch and .in folders are also printed. Note: All printed files in this step are for inspection purposes, only the .out file, (waterReconverged.file.g0.sn.xc0.reconverged.out) is needed and will be used in the next step (step 5).

Step 5: rearrangesol.py

It is a good practice to run each job in a new step on a separate folder. Hence, move all the required .out files (waterReconverged.file.g0.sn.xc0.reconverged.out) from step 4 run to a new folder (e.g. waterstep5 folder). In addition to the .out files, move the "oldminsol" file from step 3 into the "waterstep5" folder, alternatively, you could use the file path of the "oldminsol" if that works best for you. Then run the following python patch script as;

$ python2.7 rearrangesol.py oldminsol

The above command simply arranges each of the waterReconverged.file.g0.sn.xc0.reconverged.out into separate folders with folder names "oN", where N is 0 to 9 for the 10 solutions we are using in our current example. If you are using a single solution, this is not necessary.

Note: Only do this if you are using multiple solutions and have access to the rearrangesol.py patch file which is a simple python script that simply places each of the separate (waterReconverged.file.g0.sn.xc0.reconverged.out) file into a separate folder, otherwise move each output to a new folder and continue to step 6.

Then we copy the oldminsol file from the 2nd-action into the 4th-action folder as well as the rearrangesol.py file. Afterwhich we run the follwing python command below

$ python2.7 rearrangesol.py oldminsol

Useful Classes (for developers only)

QCMagic includes a number of useful python classes that may be used to generate or analyse data from Q-Chem calculations. Some of the more useful classes and their functions are listed below. QCManager imports most of these and thus importing QCManager sets up QCMagic for easy usage.

QCMolecule.Atom

A class used to keep track of atoms and their positions via self.Position.

QCMolecule.System

As a relatively fundamental class in QCMagic, this is used to keep track of a set of atoms contained in self.Atoms, and can be initiated with no inputs followed by addition of atoms, either individually or from a .xyz file/text. Most noteworthy are:

numerous methods for moving and rotating both the entire system and individual atoms;
and the method MakeInput() for generating a $molecule section of a Q-Chem input file and self.Charge, self.SpinZ to keep track of spin and charge.

QCSystemData.SystemDataSet

This class is used to keep track of multiple systems and their associated electronic states. Systems can be added either from SCF states (QCSystemData.SCFState(system,minimum)) or from a QCOutput.OutputFile object. Some important methods and variables in the class are:

self.GeomList is an instance variable, which is a list of all systems contained in the SDS;
self.StateList is another instance variable, which is the corresponding list of dictionaries (dicts) with each dictionary (dict) containing the minima (electronic states) associated with that system. Each state has a unique StateList index;
RunQChem() is a class method for running a Q-Chem calculation. This takes as arguments a Queue and Job (see below), the geometry index (GI) of the System for which the calculation is to be performed, as well as a list (FindMinima) of minima to use as initial guesses given by their StateList indices. if OverwriteGeom=True, the geometry of the GI will be used, otherwise the geometry from the the next pending Job will be used.

QCOutput.OutputFile

As the name suggests, this is a class used for analysing Q-Chem output files. the Parse() method extracts data from the output file and adds it to self.Outputs. If only one calculation has been performed, self.Outputs[0] contains the data of interest. This includes steps; a list of steps corresponding to the different geometries of e.g. a geometry optimisation and the corresponding System classes can be extracted. Each step (and CurStep / Steps[-1] for a single point calculation) contains a list of Minima which can be used to instantiate a QCSystemData.SCFState object and add to an SDS. The Minima also contain orbital coefficients and distance metrics.

QCInput.Input

A class used to generate and manipulate inputs for Q-Chem calculations. An instance of this class contains a section list, self.Sections, and a section dictionary, self.SectionDict, which contain sections to be added to a Q-Chem input file. The rem section is also added as a dictionary and its elements may be changed by e.g. self.Rem['jobtype']='opt'. Sections can be added, removed or edited. self.System contains the System for which an input is to be written; n.b. this overwrites any changes to the molecule section! MakeInputText() generates the text that goes into a Q-Chem input file, but jobs may more easily be submitted directly via QCMagic.

newJob.Job

A class used to managed Q-Chem jobs. A new input can be added to the job with the AddInput() method. This input is automatically added to self.Pending from which jobs are run when the Job class instance is used to do a Q-Chem calculation.

QCQueue.QCQueue

A class used for running Q-Chem calculations. This is where input files, submission scripts, scratch directories etc. are actually written and the job submitted. This is also where details about nodes and cores are determined and may need some editing as the hardware and software used to run the calculations is changed.

QCRegisterOptions.RegisterOptions

Made to register options for scanSurface calculations (see below) but can be partly copied, imitated or imported for easy addition of own options to new scripts.

A useful way of saving and loading such class instances is to use the pickle module.

QCMagic: Difference between revisions

Revision as of 12:59, 26 March 2019

Usage

makePlots.py

scanSurface.py

Overview

Options

--read-minima

--read-only

--normal-rotate

--tie

--focus

qcSDExtract.py

Overview

Options

--read-minima

--read-geometries

--atoms

--bondlength

--bondangle

--bonddihedral

varyXCB.py

Overview

Structure of a QCSystemData.SystemDataSetXC Object (for developers only)

GeomList

StateList

Options

--xc-template

--basis-template

radDist.py (not fantastic -- to be deprecated soon)

Overview

Extracting radial wavefunction components: no flags

Combining radial wavefunctions from different runs: --combine

Procedure for obtaining a normalised effective radial wavefunction with varying resolutions along r {\displaystyle r}

Calculating the root-mean-square (rms) of multiple effective radial wavefunctions: --rmsReff

Converting real radial wavefunction components to corresponding complex ones: --realtocomplex (in development)

Symmetry Tools

Step 1: metadynamics

Step 2: runscanSurface.py

Step 3: Mapping the "Old min" to the "New sol"

Step 4: runqcSDExtract.py

Step 5: rearrangesol.py

Useful Classes (for developers only)

Navigation menu

Search

`--read-minima`

`--read-only`

`--normal-rotate`

`--tie`

`--focus`

`--read-minima`

`--read-geometries`

`--atoms`

`--bondlength`

`--bondangle`

`--bonddihedral`

Structure of a `QCSystemData.SystemDataSetXC` Object (for developers only)

`GeomList`

`StateList`

`--xc-template`

`--basis-template`

Combining radial wavefunctions from different runs: `--combine`

Procedure for obtaining a normalised effective radial wavefunction with varying resolutions along $r$

Calculating the root-mean-square (rms) of multiple effective radial wavefunctions: `--rmsReff`

Converting real radial wavefunction components to corresponding complex ones: `--realtocomplex` (in development)