Development of stochastic density fitting approaches

Project aims/abstract

One of the main bottlenecks in practical applications of quantum chemistry is the storage and the AO-MO transformation of two-electron integrals. These processes are highly memory intensive, which makes calculations challenging for systems of chemical interest.

In this project, we implement a technique based on tensor factorisation and exploit the inherent sparsity of the resulting intermediates to reduce this memory requirement. The procedure is based on the “RI-type” density fitting technique: we sample from the set of small but still non-negligible integrals in order to decrease the number of integrals stored.
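For orientation, the RI-type factorisation is commonly written as below in the Coulomb metric (the form discussed in the Vahtras, Almlöf and Feyereisen paper listed under Interesting references). The index labels, the symbol B and the auxiliary-basis size N_aux are notation chosen here for illustration and are not taken from the project code.

```latex
(\mu\nu|\lambda\sigma) \;\approx\; \sum_{PQ} (\mu\nu|P)\,[\mathbf{J}^{-1}]_{PQ}\,(Q|\lambda\sigma)
\;=\; \sum_{P} B^{P}_{\mu\nu}\, B^{P}_{\lambda\sigma},
\qquad
B^{P}_{\mu\nu} = \sum_{Q} (\mu\nu|Q)\,[\mathbf{J}^{-1/2}]_{QP},
\qquad
J_{PQ} = (P|Q)
```

Only two- and three-index quantities need to be stored in this form, so the formal storage cost falls from O(N^4) for the four-index integrals to O(N^2 N_aux) for the three-index intermediates.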

The method will be tested in the framework of deterministic methods (Hartree-Fock and MP2), and its performance will be compared with that of existing techniques. In addition, it will be implemented within coupled cluster quantum Monte Carlo, which requires only sufficiently good estimators of the exact values and is therefore expected to tolerate integral errors better. Stochastic fitting may thus extend the range of feasible calculations even further in that framework.

Current state of the project and next steps

Earlier stages of the development involved a review of the different types of density fitting techniques and the choice of a deterministic variant to serve as the basis for the stochastic algorithm. In addition, a density-fitted Hartree-Fock and MP2 code has been developed in C++ and is now compatible with several proposed algorithms for stochastic fitting. These explore different combinations of sampling the objects that result from the tensor decomposition, as well as the effect of contraction order. The algorithm is currently based on the Vose alias method and uses the integral values explicitly, so it does not yet deliver the desired memory advantage (although the fitting time is currently half that of the original technique).
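As a rough illustration of the sampling step, the sketch below builds Vose alias tables and uses them to form an unbiased estimate of a sum by sampling terms with probability proportional to their magnitude. It is written in Python for brevity rather than in the project's C++, and every function name (`build_alias_table`, `sample_index`, `stochastic_sum`) is hypothetical; it is not the project's implementation. Like the current algorithm described above, it needs the explicit values to set up the probabilities.

```python
import random


def build_alias_table(weights):
    """Build Vose alias tables (prob, alias) for non-negative weights."""
    n = len(weights)
    total = sum(weights)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive weight")
    # Scale the weights so that their average is exactly 1.
    scaled = [w * n / total for w in weights]
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    prob = [1.0] * n          # chance of keeping column i itself
    alias = list(range(n))    # index returned otherwise
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]
        alias[s] = l
        # The large entry fills the rest of column s with its own mass.
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    # Any leftovers (round-off only) keep prob = 1 and alias = self.
    return prob, alias


def sample_index(prob, alias, rng):
    """Draw one index in O(1): pick a column, then keep it or take its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


def stochastic_sum(values, n_samples, rng):
    """Unbiased estimate of sum(values), sampling i with p_i proportional to |values[i]|."""
    weights = [abs(v) for v in values]
    total = sum(weights)
    prob, alias = build_alias_table(weights)
    estimate = 0.0
    for _ in range(n_samples):
        i = sample_index(prob, alias, rng)
        p_i = weights[i] / total
        estimate += values[i] / p_i   # importance-sampling reweighting
    return estimate / n_samples
```

Building the tables costs O(n) and each draw costs O(1), which is why the alias method suits repeated sampling from a large, fixed distribution. Zero-weight entries are never selected, so the division by `p_i` is safe.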

The next steps are to enable integral screening, to estimate the selection probabilities from bounds (without calculating the explicit integral values), and to parallelise the sampling process. After this, we will use test-set results to determine general thresholds for choosing which integrals are treated stochastically, and examine the performance of the resulting approach with both deterministic and stochastic electronic structure methods.
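One way the bound-based step could look is sketched below. The text does not say which bounds are intended; the Schwarz inequality, |(mn|P)| <= sqrt((mn|mn)) sqrt((P|P)), is a standard choice in integral screening and is assumed here. The split into "keep exactly", "sample" and "drop" classes, the two thresholds, and all function names are likewise illustrative assumptions, written in Python for brevity.

```python
import math


def schwarz_bounds(pair_diag, aux_diag):
    """Upper bounds sqrt((mn|mn)) * sqrt((P|P)) on |(mn|P)|.

    pair_diag[i] holds the diagonal integral (mn|mn) of orbital pair i,
    aux_diag[p] holds (P|P) of auxiliary function p (assumed inputs).
    """
    return [[math.sqrt(d_mn) * math.sqrt(d_p) for d_p in aux_diag]
            for d_mn in pair_diag]


def classify_by_bound(bounds, keep_threshold, drop_threshold):
    """Split (pair, aux) entries into exact, sampled and dropped classes.

    Entries with bound >= keep_threshold are treated exactly, entries below
    drop_threshold are discarded, and the rest receive a selection
    probability proportional to their bound.
    """
    exact = []
    sampled = {}
    for i, row in enumerate(bounds):
        for p, b in enumerate(row):
            if b >= keep_threshold:
                exact.append((i, p))
            elif b >= drop_threshold:
                sampled[(i, p)] = b
    total = sum(sampled.values())
    probs = {k: b / total for k, b in sampled.items()} if total > 0 else {}
    return exact, probs
```

The probabilities are built from the diagonal integrals alone, so no off-diagonal three-index value has to be evaluated before the sampling decision. The resulting weights could then be passed to an alias-table sampler.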

Useful skills and knowledge

This project requires knowledge of the following (so this is what you can prepare in advance if you wish):

Theoretical

Familiarity with the integral types of electronic structure theory (including their symmetries), and with efficient integral transformation
Basic notions of linear algebra, and matrix decomposition techniques
Understanding of scaling arguments (memory and computational cost)
Understanding of Hartree-Fock and MP2 theory, and the basic notions of coupled cluster theory (derivation is not required)

Practical

Basic understanding of C++ syntax (or of the syntax of another programming language, e.g. Python, together with a willingness to explore how C++ works)
Some familiarity with terminal commands, bash scripting, and the vi editor

Learning outcomes

Theoretical

Navigating electronic structure literature on integrals, and finding relevant information for understanding/implementation purposes
Knowledge of existing approximation techniques that are used extensively in the current literature
Understanding the context of fitting (where and why we use it in the methods we are interested in, and what the advantages and limitations of the proposed technique are)
Knowledge of relevant statistical measures for performance testing

Practical

Knowledge of C++-specific structures
Familiarity with, and practical use of, OpenMP/MPI parallelisation techniques
Efficiency optimisation of codes: using relevant matrix operation packages, and appropriate computational algorithms
Efficient ways of dealing with test sets and extracting data (bash/Python scripting)
Using Linux-based systems, computer clusters and schedulers

Interesting references

O. Vahtras, J. Almlöf, and M. W. Feyereisen, Chem. Phys. Lett. 213, 5–6, 514–518 (1993).
M. Vose, IEEE Transactions on Software Engineering 17, 9, 972–975 (1991).
Practical account of the alias method
T. Y. Takeshita, W. A. de Jong, D. Neuhauser, R. Baer, and E. Rabani, J. Chem. Theory Comput. 13, 4605–4610 (2017).
