SimpleDMC

From Thom Group Wiki
Jump to navigation Jump to search

Theory

Compilation of SimpleDMC

Examples

The Hydrogen Atom

Gaussian

Let's begin by generating a trial wavefunction with Gaussian03. To start with, we need to define an atomic orbital basis, then tell Gaussian03 to use it in conjunction with some ab initio method. We'll use the STO-6G basis set, where 6 Gaussian primitives are used to represent an atomic orbital. We'll create an input file for Gaussian03, called H.com,

  $ cat -n H.com
     1   %chk=H.chk
     2   #p UHF/STO-6G 6d 10f SCF Test IOP(3/24=10) IOP(3/81=1) Pop=Full
     3   
     4   Hydrogen Atom - UHF STO-6G
     5
     6   0 2
     7   H   0.000   0.000   0.000
     8

This may seem cryptic for the moment, but each line (and entry therein) is necessary. Line 1 tells Gaussian03 to put some output in an encrypted file called H.chk. Line 2 is the ``route section": The IOP arguments are internal options, and tell Gaussian03 to spit out data in a certain format. Line 4 contains a title section, and must be included, as must the sandwiching blank lines. Line 6 specifies the charge and multiplicity (number of unpaired electrons + 1). Finally, on line 7, we specify the molecular configuration; atom types and Cartesian position vectors of the constituent atoms are enumerated here.

Now, let's run Gaussian03,

  $ g03 H.com

No return value is normally a good sign- you'll typically be greeted with an (unhelpful) SEGFAULT of some kind if the calculation was unsuccessful. It's also worth explicitly checking for errors, since Gaussian is also able to die silently

  $ grep "Error" H.log

Gaussian03 writes to two files- H.log and H.chk; the contents of the former are obvious while we've already briefly discussed the contents of the latter. In the output, all we care about is the energy, so invoke

  $ grep -n "SCF Done" H.log
  219: SCF Done:  E(UHF) = -0.471039054196     A.U. after     1 cycles

This is a rather poor value relative to the analytical -0.5 Hartrees, but at least we have something to improve upon. We must now do some conversions to extract information from the checkpoint file

  $ formchk H.chk H.fchk
  Read checkpoint file H.chk
  Write formatted file H.fchk
  Rotating derivatives, DoTrsp=T IDiff= 1 LEDeriv=    197 LFDPrp=       0 LDFDPr=       0.

formchk is a Gaussian utility that converts the unreadable H.chk to a formatted checkpoint file, H.fchk. Since this formatted file is more amenable to parsing, it can be used by a number of visualisation/ post-Gaussian tools.

qWalk

qWalk is a piece of software we can benchmark the simpleDMC software against, possessing both variational and diffusion Monte Carlo (VMC and DMC, respectively) functionalities. The simpleDMC software also uses the same input file formats as qWalk, and so we are able to use their file conversion tools to extract data from a host of ab initio software suites.

First, we parse the H.fchk file into a number of qWalk input files:

  $ ./g032qmc -log H.log -fchk H.fchk 
  --- read fchk file ---
  Number of atoms: 1
  Number of basis functions: 1
  Number of independant functions: 1
  Multiplicity: 2
  Number of alpha electrons: 1
  Number of beta electrons: 0
  Total Energy: -0.471039054195684
  Pure/Cartesian d shells: 6D
  Pure/Cartesian f shells: 10F
  found beta component, scftpe=UHF
  --- read log file ---
  basis set is from 'AO basis set' section
  --- have read files ---
  norb=1
  spin up end size=1
  moCoeff total size=2
  writing qwalk input files ... 
  writing orb file.  This could take a few seconds.
  output: sample.orb
  output: sample.basis
  output: sample.slater
  output: sample.jast2
  output: sample.sys
  output: sample.hf
  output: sample.opt
  Done
   
  Warning: Now the basisset must be the same for the same specie of atoms

We'll try to reproduce the UHF results by use of VMC with qWalk. Let's have a look at sample.hf:

  $ cat sample.hf
  METHOD { VMC nconfig 1000 } 
   
   
  INCLUDE sample.sys  
  TRIALFUNC { INCLUDE sample.slater}

Note that we have added "nconfig 1000", which gives us control over how many walkers are to be used for the VMC. sample.sys contains the system configuration (number of electrons with a given spin, atomic coordinates, etc.), while sample.slater contains the orbital information for the Slater determinant (in conjunction with sample.orb and sample.basis). This Slater determinant will be used to sample the trial wavefunction created by g03. We now run qWalk:

  $ ./qwalk sample.hf
  output to sample.hf.o
  System
  Wave function
  Pseudopotential
  Could not open sample.hf.config. Generating configurations from scratch
 

and upon completion, are left with three files,

sample.hf.o Information about the run.
sample.hf.log Log for statistical analysis.
sample.hf.config Checkfile for the run.

We can extract averages from the VMC by invoking

  $ ./gosling sample.hf.log      
  #####################
  vmc:  100 total blocks reblocked into 100
  #####################
  Threw out the first 0 blocks in the averaging.
  total_energy0        -0.4702684082 +/-  0.000922386677 (sigma     0.425114576 ) 
  kinetic0              0.7706545412 +/-  0.003605881864 (sigma     1.448514687 ) 
  potential0            -1.240922949 +/-  0.003148015019 (sigma      1.21325133 ) 
  nonlocal0                        0 +/-               0 (sigma               0 ) 
  weight0                          1 +/- 6.661338148e-17 (sigma               0 ) 
  
  --------Properties differences-------------
  approximate number of independent points: 212415.3753

We see that the VMC estimate of the energy of the trial wavefunction (-0.470(2)) matches relatively well with the UHF energy (-0.471039054195684). We have also obtained the sample.hf.config file, containing the distribution of walkers sampling the trial wavefunction.

  $ head -8 sample.hf.config
  walker { 
  nElec 1 ndim 3
  -0.731706383078456 0.620589802605041 -0.092157439050284 
  } 
  walker { 
  nElec 1 ndim 3
  0.366970593352899 0.216757302067973 -0.48313699793852 
  }


Known Issues

A bug has been found pertaining to the parsing of the Gaussian *.fchk file, by the qWalk executable "g032qmc". You may attempt the $ ./g032qmc -log H.log -fchk H.fchk command, only to receive the following error:

 --- read fchk file ---
 Number of atoms: 3
 Number of basis functions: 13
 Number of independent functions: 0
 Multiplicity: 1
 Number of alpha electrons: 5
 Number of beta electrons: 5
 Total Energy: -75.5859595605798
 Pure/Cartesian d shells: 6D
 Pure/Cartesian f shells: 7F
 main_read_g03fchkfile:  error in alpha MO

The issue is that qWalk must seach for the key phrase "Number of independent functions" in the *.fchk file, however, certain versions of Gaussian will write the *.fchk file with the following on line 10:

 " Number of independant functions            I        <number> "

The simple fix to this problem is to correct the spelling in the *.fchk file to instead read "independent" (the correct, english, spelling), and qWalk will then successfully parse the file.

simpleDMC

We can use this distribution of walkers as a starting point for our DMC calculations with simpleDMC. We require an input file to run simpleDMC, sample.simple:

  $ cat sample.simple
  simple { 
     nWalkers             1000     # Number of walkers
     nSteps              10000     # Maximum number of steps
     dTau                0.001     # Timestep
     dETrial              -0.5     # Trial energy should start (> FCI energy but close to it)
     dWeightInit           1.0     # Initial weight
     bConstWalkerNum     false     # Are we enforcing a constant number of walkers? (Currently not working)
     nWalkersSpace       20480	    # Walker growth space (Maximum number of walkers permitted)
     dWeightTarget     10000.0     # Total weight above which we undertake population control
     dWeightMin         1500.0     # Total weight below which we undertake population control
     dLowThresh            1.0     # If walker has weight below this, try to either kill or recombine it
     dWeightBranch         1.5     # If walker has weight above this, we branch the walker
     nShiftSteps             1     # Vary the shift every nShiftSteps steps.
     dShiftDamp            1.0     # Larger shift damping is more radical damping
     dOldWeight            0.0     # The total weight at the previous cycle for calculating the shift
     bUpdateShift        false     # Are we updating the shift every nShiftSteps cycles?
  }
                                                            

simpleDMC reuses the input files of qWalk, and can be invoked by listing the files as command line arguments (in any order)

  $ ./simpleDMC.x sample.{sys,slater,orb,basis,simple,hf.config,hf.log} >> sample.log

We pass all of the output to sample.log since the majority of information concerning the simulation is streamed straight to STDOUT. We are primarily concerned with the final nSteps lines, since these give the various energetic quantities associated with the system of walkers at each timestep. In fact, with a little foresight, the fifth column corresponds to the total energy averaged over all walkers at that timestep. Averaging over this column, we obtain the DMC energy:

  $ export nsteps=10000
  $ tail -n $nsteps sample.log | awk '{ sum += $5; n++ } END { if (n > 0) print sum / n; }'
  -0.498772

This quantity is in very good agreement with the exact energy of -0.5 Hartrees.


The Jastrow Factor

One further usable output from qWalk is the Jastrow factor parameters. This helps to add in explicit correlation terms between electron-electron pairs (known two body terms) and electron-ion pairs (one body terms). After having performed the parsing of the *.fchk file with ./g032qmc, a blank template of the Jastrow parameters will be created, called *.jast2. This is designed to include both two-body and one-body interaction, hence the suffix of "2".

A good tutorial for how to optimise the Jastrow parameters is given in the qWalk documentation ( [1] ), but a potted version which will consistently produce the same result is outlined here. Nothing needs to be altered in the *.jast2 file, but the *.opt file requires some additions. For a system called sample, Edit the sample.opt file so that the very first line reads:


 METHOD { OPTIMIZE iterations 1000 READCONFIG sample.hf.config  }
 
 INCLUDE sample.sys
 TRIALFUNC {
   SLATER-JASTROW
   WF1 { INCLUDE sample.slater }
   WF2 { INCLUDE sample.jast2 }
 }


The "iterations" keyword extends the number of Jastrow optimisation cycles beyond the default which is only 30 (which in practice is often not enough). The "READCONFIG" file forces qWalk to explicitly read in the electron configurations from a previous qWalk run (if a more accurate *.dmc.config file exists, then of course use that!). The advantage of this keyword is that qWalk does not have to make up its own configurations, which will make the Jastrow optimisation unreproducible and, quite often, spurious.

Once edited, simply run: $ qwalk sample.opt

A new file called "sample.opt.wfout" will have been produced, which contains the combined Slater and Jastrow parameters. Since we are only interested in the Jastrow parameters, and the parser gets confused with the Slater bit, delete the first portion of the file up until the lines which read:

 JASTROW2
 GROUP {
   OPTIMIZEBASIS
 [etc...]

If the system only contains a single atom, then you are done, and the file should be successfully parsed! The only addition you will need to make is in the Make.include file, whereby the .opt.wfout must be added to the line which calls the executable, and tells it which files to read. For example, for the sample, the following needs to exist in Make.include:

 [...]
 sample: program
         $(EXECWITH) bin/simpleDMC.x examples/SAMPLE_DIR/sample.{sys,slater,orb,basis,simple,hf.config,hf.log,opt.wfout}
 [etc...]


Water Example

One further complication arises when more than one atom type is present, and one further amendment must be made to the *opt.wfout file. Out of the box, qWalk will produce the following line for a monomer of water whose Jastrow has been optimised:

 [...]
 GROUP {
   OPTIMIZEBASIS
   ONEBODY {
     COEFFICIENTS { O  0.0552961257026809  -0.0570651722411533  0.0412059756740772   }
     COEFFICIENTS { H  -0.0203739180406208  -0.0465365828480515  0.225417700569402   }
   }
   TWOBODY {
     COEFFICIENTS { 0.163979168113989  -0.0968499964559348  -0.0100359372373529   }
   }
 [etc...]

The parser will not be able to read this, as it looks for the keyword "ONEBODY" and expects only a single line to follow. Therefore, the following addition will fix this problem, and a working file should look like:

 [...]
 GROUP {
   OPTIMIZEBASIS
   ONEBODY {
     COEFFICIENTS { O  0.0552961257026809  -0.0570651722411533  0.0412059756740772   }
   }
   ONEBODY{
     COEFFICIENTS { H  -0.0203739180406208  -0.0465365828480515  0.225417700569402   }
   }
   TWOBODY {
     COEFFICIENTS { 0.163979168113989  -0.0968499964559348  -0.0100359372373529   }
   }
 [etc...]