Difference between revisions of "CamCASP/Programming/7"

From CUC3
Jump to navigation Jump to search
import>Am592
import>Am592
Line 64: Line 64:
 
A_{k,k'} = -4 \sum_{ar} D_{ar,k} D_{ar,k'} \frac{(e_r - e_a)}{(e_r-e_a)^2 - w^2}
 
A_{k,k'} = -4 \sum_{ar} D_{ar,k} D_{ar,k'} \frac{(e_r - e_a)}{(e_r-e_a)^2 - w^2}
 
</math>
 
</math>
This is a rather simple term to evaluate, but it takes a while to evaluate becuase it must be done for every pair of auxiliry basis function. I had naively assumed that some sort of pre-screening might work for this; something along the lines described above. Yes, there is no reason to assume this as there is no underlying integral to screen. But if we assume that for basis functions ''far away'' (in the overap sense) <math>A_{k,k}=0</math> we get garbage. The elements of <math>A</math> don't seem to have any correlation with the overlap of basis functions <math>\int \chi_k \chi_{k'} dr</math>. Well, I don't know what's to be done about this. It seems to me that the solution may be to first make the DF linear-scaling (using local auxiliary basis sets), though I can't say how this would help here...
+
This is a rather simple term, but it takes a while to evaluate becuase it must be done for every pair of auxiliry basis function. I had naively assumed that some sort of pre-screening might work for this; something along the lines described above. Yes, there is no reason to assume this as there is no underlying integral to screen. But if we assume that for basis functions ''far away'' (in the overap sense) <math>A_{k,k}=0</math> we get garbage. The elements of <math>A</math> don't seem to have any correlation with the overlap of basis functions <math>\int \chi_k \chi_{k'} dr</math>. Well, I don't know what's to be done about this. It seems to me that the solution may be to first make the DF linear-scaling (using local auxiliary basis sets), though I can't say how this would help here...
 
* Hybrid functionals in new propagator module.
 
* Hybrid functionals in new propagator module.
   

Revision as of 13:29, 17 May 2010

CamCASP => Programming => Propagator

The new propagator module (in CamCASP 5.5-dev) can create the propagator in the LDAX approximation very efficiently. Construction scales as <math>O(N^4)</math>. Further, the kernel integrals are calculated with <math>O(N^3)</math> scaling (one power for grid points and two powers for auxiliary basis functions). So the new propagator can be calculated very efficiently. But the ALDAX approximation in the kernel results in significant errors in the polarizabilities and (probably dispersion energies).

As a temporary measure we could calculate the hybrid ADALX+CHF (not ALDA - I still don't have the VWN correlation functional programmed, but this is not a large error) by back transforming the kernel using the DF solution. So we go from a two-index object to a four-index object, then form the full propagator, and then transform back to a two-index density-fitted propagator. This is a round-about way, but should work and will still benefit from the efficiency gains in evaluating the kernel integrals.

Status

  • Molecular grids need to be explicitly defined before calculating kernel integral internally. This is because the default grid is too fine. Use
 BEGIN GRID
   MOLECULE <name>
   ANGULAR 100
   RADIAL  60
 END
  • <math>N^6</math> algorithm: Kernel can be constructed internally using the ALDAX(DF)+CHF approximation. Energies and properties can use this route. Activate with
 HESSIANS INTERNAL 

in the propagator block. Accuracies high.

  • <math>N^4</math> algorithm: Kernel can be constructed in the ALDAX approximation. As yet, no fraction of CHF can be added. Low accuracies when used with hybrid functionals like PBE0. Likely higher accuracies with GGAs. Have to test this. As yet only allowed for properties calculations. Needs commands
 SET NEW-PROP
   KERNEL-EPS 0.00001
 END
 BEGIN POLARIZABILITY
   MOLECULE <name>
   NEW-PROP
   ...
 END

Very fast and uses far less memory than the <math>N^6</math> algorithm. Can now be used in energy routines. Just add NEW-PROP to the commands.

  • Basis sets: Any basis allowed. No restrictions on type of GTOs in auxiliary basis.
  • ALDAX: The contribution from the correlation functional is still missing. This (I believe) is responsible for the residual (1-2%) errors in molecular properties and dispersion energies.
  • Coded for closed-shell systems.

--alston 19:15, 27 April 2010 (BST)

To Do & Thoughts

  • Use screening when calculating kernel integrals. Probably use the overlap matrix as a screen.

Done this. Using the pre-screening idea Fred Manby had suggested some time ago: Assume all functions are s-functions and evaluate the overlap between these assumed-s-functions. If larger than epsilon, evaluate the integral. I used the following bits of code (in num_integrals.F90):

!Calculate screening integrals:
allocate(OvrS(mol%aux%nshells,mol%aux%nshells),stat=info)
call check_allocate(info,'OvrS',this_routine,__LINE__,.false.)
if (info/=0) then; info =-2; return; endif
do kshell = 1, mol%aux%nshells
  do lshell = kshell, mol%aux%nshells
    call screening_s_ovr(kshell,mol%aux,lshell,mol%aux,OvrS(kshell,lshell))
    OvrS(kshell,lshell) = abs(OvrS(kshell,lshell))
    OvrS(lshell,kshell) = OvrS(kshell,lshell)
  enddo
enddo
!
...
do kshell = 1, mol%aux%nshells
  ... 
  l_loop: do lshell = 1, mol%aux%nshells
    !Screening test:
    if (OvrS(kshell,lshell)<kernel_integral_cutoff) then
      !Skip this pair of shells
      number_skipped = number_skipped + 1
      cycle l_loop
    endif
  enddo l_loop
enddo

The resulting improvement in efficiency is enormous - XVI2 kernel integrals used to take 3.5 hours; they are now calculated in 1 hour. kernel_integral_cutoff should be a lot larger that the standard integral cutoff. A value of 1.0e-6 seems to work well. Need to do a systematic test of accuracy and this parameter.

  • Timings for XVI2 show that screening makes the calculation of kernel integrals faster than the calculation of the A-matrix in subroutine A_DtLdD_ALDA. This calculation scales as <math>M^2ov</math>. And is the bottleneck in this calculation. The term to be evaluated is very simple:

<math> A_{k,k'} = -4 \sum_{ar} D_{ar,k} D_{ar,k'} \frac{(e_r - e_a)}{(e_r-e_a)^2 - w^2} </math> This is a rather simple term, but it takes a while to evaluate becuase it must be done for every pair of auxiliry basis function. I had naively assumed that some sort of pre-screening might work for this; something along the lines described above. Yes, there is no reason to assume this as there is no underlying integral to screen. But if we assume that for basis functions far away (in the overap sense) <math>A_{k,k}=0</math> we get garbage. The elements of <math>A</math> don't seem to have any correlation with the overlap of basis functions <math>\int \chi_k \chi_{k'} dr</math>. Well, I don't know what's to be done about this. It seems to me that the solution may be to first make the DF linear-scaling (using local auxiliary basis sets), though I can't say how this would help here...

  • Hybrid functionals in new propagator module.

Concerns

Numerical Integrals

  • Becke smoothing parameter: Surely the integration result should not depend on the choice of this parameter. But it does. I have not systematically explored this issue, but here are some energies for the water dimer:
Energies in kJ/mol
                         elst     ind2C    disp2C      exch   exind2C  exdisp2C 
        becke100.out   -28.499   -10.832    -9.971    24.756     5.624     1.670 
         becke10.out   -28.499   -10.829    -9.964    24.756     5.622     1.669 
          becke3.out   -28.499   -10.829    -9.962    24.756     5.622     1.668

The default parameter is 3. Increasing it makes the swicthover 'harder' - that is, more abrupt. But I certainly do not fully understand what this parameter does. Should dig into the matter.--alston 12:09, 3 May 2010 (BST)

The problem could arise from the cutoffs. The kernel integrals use a cutoff to evaluate <math>\rho^{-2/3}</math>. This is done only for densities larger than <math>\epsilon</math>; else this term is dropped. Changing the Becke parameter changes the weights (obviously). Now if my chosen value of <math>\epsilon</math> is too large, then the Becke parameter may have an effect on the results. I should experiment with smaller values of <math>\epsilon</math>. --alston 16:33, 3 May 2010 (BST)

The problem doesn't seem straightforward. Here are some observations:

  • EFFICIENCY: The grid consists of atomic grids. Connected atoms share points. So the grid contains several instances of points. This is very wasteful. There must be some way of reducing the atom grids into one molecular grid. I must look into it.
  • ACCURACY: We can achieve 4 decimal numerical accuracy for the dipole polarizability of water if we use very large grids (400 radial, 600 angular). For such grids, the Becke smoothing parameter doesn't change results (I used parameter values of 2, 4 and 10). But even with this large grids, the quadrupole polarizabilities differ in the second or third place as the smoothing parameter is varied.
  • APPROXIMATIONS: Using extremely large grids is not practical. So we must compromise. For a given choice of KernelEps and grid size (60 radial and 100 angular) what Becke smoothing parameter is appropriate? Based on the dipole-dipole polarizabilities of water, it seems like a value around 4 is appropriate. This is the default. Larger values (~10) cause significant change to the polarizability anisotropy. Smaller values (~2) seem to result in slightly larger errors in the quadrupole polarizabilities, but these are not quite significant enough. So, I'd say a value of 2 to 4 should be OK. The default value is 3.
Water aTZ/aTZ PBE0/AC ALDAX
Radial  Angular Becke KernelEps  alpha  anisotropy
==================================================
Reference values:
400     600     10     1.0e-8    8.2391  1.0236
400     600      2     1.0e-8    8.2390  1.0238
300     400     10     1.0e-8    8.2391  1.0236
300     400      2     1.0e-8    8.2390  1.0238
200     400     10     1.0e-8    8.2392  1.0238
200     400      2     1.0e-8    8.2391  1.0239
-------------------------------------------------- 
 60     100     10     1.0e-8    8.2409  1.0178
 60     100      4     1.0e-8    8.2387  1.0235
 60     100      2     1.0e-8    8.2386  1.0237
--------------------------------------------------

The reference values use very large grids and result in dipole-dipole polarizabiliites that are fairly insensitive to the Becke parameter. There is a larger variation for the smaller grid, with the agreement getting rather good for smaller values. --alston 18:55, 3 May 2010 (BST)

Kernel Notation

  • AUX-ALDA : ALDA kernel constructed directly in auxiliary basis.
  • AUX-ALDAX : ALDAX exchange-only LDA kernel constructed directly in auxiliary basis.
  • DF-(ALDA+CHF) : Hybrid ALDA+CHF kernel constructed in main basis, and then reduced in size using density-fitting.
  • DF-(ALDAX(DF)+CHF) : As above, but with exchange-only ALDAX kernel calculated from AUX-ALDAX and converted into an ov x ov object using the DF solution matrix.

Examples

Benzene

  • Basis: Sadlej
  • Old timings obtained using different processor (AMD), so using pyridine/sadlej timings (note: smaller basis set!):
    • DALTON: Kernel integrals: 3.3 hours
    • CamCASP: (properties) 2.5 hours
  • New timings (AUX-ALDAX kernel):
    • Kernel: 6.5 minutes
    • CamCASP (only total polarizabilities): 12 minutes

Not quite a good comparison, but you get the idea. The new propagator is very very fast.

Accuracy is poor. ALDAX is just not good enough for <math>\pi</math>-conjugated systems. Here are the dipole-dipole polarizabilities:

DF-(ALDA+CHF) kernel:

    43.247483     0.000000     0.000000
     0.000000    80.710408     0.000000
     0.000000     0.000000    80.713103
    Isotropic polarizability:    68.22366473
  Anisotropic polarizability:    37.46427266

AUX-ALDAX kernel:

 Order:    1 by    1
    39.035569     0.000000     0.000000
     0.000000    70.478442     0.000000
     0.000000     0.000000    70.487565
    Isotropic polarizability:    60.00052553
  Anisotropic polarizability:    31.44743535

That's a 12% error.

I've now made the new kernel integral routines accessible from the old densfit_prop.F90 module. This is a temporary work-around to allow us to use the hybrid Hessians. Here's what I get for benzene:

DF-(ALDAX(DF)+CHF) kernel:

    42.787209     0.000000     0.000000
     0.000000    80.353396     0.000000
     0.000000     0.000000    79.887600
    Isotropic polarizability:    67.67606832
  Anisotropic polarizability:    37.33546875

The error is now 0.8%. There still is a difference for these reasons (I think):

  1. I have constructed the ov x ov kernel integrals using the expression <math>H = D K D^T</math> where <math>D</math> is the OV density-fitting solution matrix and <math>K</math> is the kernel calculated in the auxiliary basis.
  2. The kernel integral is calculated in the ALDAX approximation. I still do not have the contribution from the correlation functional.
  3. The auxiliary basis uses Spherical GTOs.

It's hard to say which is the dominant source of error. But the total error is quite small now, so perhaps we can live with this for now.

Timings: it's slower than the previous method (using AUX-ALDAX), but only by a minute.

Completed the Cartesian aux basis calculation: DF-(ALDAX(DF)+CHF) kernel:

    42.614128     0.000000     0.000000
     0.000000    79.929958     0.000000
     0.000000     0.000000    79.930117
    Isotropic polarizability:    67.49140129
  Anisotropic polarizability:    37.31590957

Error here is 1.1%. More than the spherical aux case. This is strange. Well, anyway, the error is still small. So the residual error in the calculation must be from the DF or ALDAX approximations.

--alston 19:31, 26 April 2010 (BST)

  • PBE noAC
  • ALDAX kernel
    44.072016     0.000000     0.000000
     0.000000    83.389575     0.000000
     0.000000     0.000000    83.385620
    Isotropic polarizability:    70.28240379
  Anisotropic polarizability:    39.31558132

PBE overestimates the polarizability. This is a failure of GGAs in general and has been shown to get worse as the HOMO-LUMO gap reduces. But for systems the size of benzene, PBE with the ALDAX kernel should be a very good approximation. The error in the polarizability is only 3% (overestimation). This would likely amount to a 6% error in the long-range dispersion. At short range overlap effects often compensate, though this cannot be relied upon.

Time: This calculation took 20 minutes (full properties) in CamCASP and 9 minutes in DALTON.

--alston 13:07, 30 April 2010 (BST)

Water

  • Basis: aug-cc-pVTZ
  • Aux: aux-aug-cc-pVTZ Cartesian GTOs
  • XC-Func: PBE0

Here is the dipole-dipole polarizability using the DF-(ALDA+CHF), DF-(ALDAX(DF)+CHF) and AUX-ALDAX kernels.

DF-(ALDA+CHF):

     9.391882     0.000000     0.000000
     0.000000    10.034214     0.000000
     0.000000     0.000000     8.750848
    Isotropic polarizability:     9.39231483
  Anisotropic polarizability:     1.11142748

DF-(ALDAX(DF)+CHF):

     9.259201    -0.000000     0.000000
    -0.000000     9.901715     0.000000
     0.000000     0.000000     8.617225
    Isotropic polarizability:     9.25938032
  Anisotropic polarizability:     1.11240073

AUX-ALDAX:

     8.227020    -0.000000     0.000000
    -0.000000     8.808915     0.000000
     0.000000     0.000000     7.683247
    Isotropic polarizability:     8.23972716
  Anisotropic polarizability:     0.97504333

Once again large (12.2%) errors with the AUX-ALDAX kernel but far better agreement (1.4%) with the DF-(ALDAX(DF)+CHF) kernel. --alston 19:00, 26 April 2010 (BST)

Helium dimer

  • Basis: aTZ/MC+
  • Geom: R = 5.6 a.u.

Here's a comparison of interaction energies from the original method (ALDA+CHF kernel using DALTON, Cartesian auxiliary basis) and the new method (DF-(ALDAX+CHF) kernel using CamCASP, spherical auxiliary basis):

Energy-units cm-1
                              elst  exch    disp2C     ind2C    exind2C  exdisp2C   Eint(2)  
 Original DF-(ALDA+CHF)      -1.27  8.88  -15.684454 -0.216497  0.179052  0.449355 -7.657907  
 New      DF-(ALDAX(DF)+CHF) -1.22  8.88  -15.417550 -0.464630  0.383947  0.441358 -7.390518  

There is a large change in the induction energies because of the use of spherical GTOs in the auxiliary basis - this energy is very sensitive to the accuracy of the density fitting. There is an error of 1.7% in the dispersion energy.

Water dimer

  • Basis: Sadlej/DC+
  • Geom: Global minimum in energy.
Energy-units kJ/mol
                                elst     ind2C    disp2C      exch   exind2C  exdisp2C     E2int
 DF-(ALDA+CHF)       PBE0/AC  -28.941   -10.567    -9.728    23.740     5.507     1.649    -18.340
 DF-(ALDAX(DF)+CHF)  PBE0/AC  -28.941   -10.467    -9.581    23.740     5.457     1.624    -18.169
 DF-ALDAX            PBE0/AC  -28.941    -9.756    -8.872    23.740     5.097     1.504    -17.229
 DF-ALDAX            PBE/AC   -28.499   -10.832    -9.962    24.756     5.624     1.668    -17.246
 DF-ALDAX            PBE      -29.032   -13.262   -10.636    28.511     8.000     1.988    -14.430

Errors: PBE0/AC

  • DF-(ALDAX(DF)+CHF)
    • Dispersion: 1.5% in E2disp,pol
    • E2int : 0.9%
  • DF-ALDAX
    • Dispersion: 10.3% in e2disp,pol
    • E2int : 6%

The error made by the DF-(ALDAX(DF)+CHF) propagator are not large, but should be eliminated at some point soon! On the other hand, the errors made by the DF-ALDAX propagator are quite substantial.

As can be seen, the asymptotic correction is very important for water. The effect is mainly in the exchange energies.

XVI dimer

  • From this year's Blind Test.
  • Quite a large molecule: C6ON2H4 (Benzene ring with C=O next to C=N=N).
  • Basis: Sadlej/MC+ : size=404 spherical GTOs
  • Aux: aug-cc-pVTZ/DC+ : 2953 Cartesian GTOs
  • Geometries: Angstrom, Degrees.
    INDX  Rx          Ry         Rz         Alpha     Nx         Ny         Nz
    1   -3.962203  -0.306308   3.248817   0.000000   0.000000   0.000000   1.000000
    2   -1.176349   6.896238  -0.246782   0.000000   0.000000   0.000000   1.000000
  • Geometry 1:
                       elst     ind2C    disp2C      exch   exind2C  exdisp2C    Total  Diff Timings (DAL,CamCASP) 
 PBE0  AC   ALDA+CHF  -6.114   -10.220   -25.462    18.197     8.448     2.002  -13.149      36h      11h20m (Dalton direct, conv 10-7)
 PBE   noAC ALDAX     -6.563   -11.424   -25.981    20.223     9.529     2.092  -12.125 7.8% 11h45m   12h56m (Dalton direct, conv 10-7) 
 PBE   AC   ALDAX     -6.093   -10.207   -25.394    18.867     8.332     1.940  -12.554 4.5% 12h      11h45m (Dalton no direct, conv 10-6)

With PBE/noAC-ALDAX there is only a 0.5 kJ/mol difference in the dispersion energy. Most of the error arises from the electrostatic, induction and exchange energies - all, I'm sure, because of the lack of the asymptotic correction. There is a substantial decrease in the time taken per job when the ALDAX kernel is used. The lack of AC also contributes. DALTON takes a total of 18h to calculate the Kernel integrals. CamCASP takes 3h46m. This is the biggest time reduction. PBE/AC-ALDAX is probably the way to go. The large errors in the exchange energies that we see with PBE/noAC are largely corrected with the AC. The error in the interaction energy is now under 5%. This is acceptable. Also, without the .DIRECT option, DALTON seems quite fast and is able to implement the AC in almost the same time.

  • Geometry 2:
                       elst     ind2C      disp2C      exch   exind2C  exdisp2C    Total  Diff Timings (DAL,CamCASP)
 PBE0  AC   ALDA+CHF  -12.058    -4.139   -11.157     9.047     1.424     0.827  -16.056       31h48m  10h33m
 PBE   noAC ALDAX     -12.069    -4.862   -11.919    11.528     1.997     0.981  -14.342 10.7% 10h47m  12h48m

There is no screening used in the kernel integral routines in CamCASP so we see a very small reduction in time compared with Geometry 1. --alston 12:53, 3 May 2010 (BST)