Open Access Article

This Open Access Article is licensed under a

Creative Commons Attribution 3.0 Unported Licence

DOI: 10.1039/D4FD00062E
(Paper)
Faraday Discuss., 2024, Advance Article

Kemal Atalar*^{a},
Yannic Rath^{ab},
Rachel Crespo-Otero^{c} and
George H. Booth*^{a}
^{a}Department of Physics and Thomas Young Centre, King’s College London, Strand, London, WC2R 2LS, UK. E-mail: kemal.atalar@kcl.ac.uk; george.booth@kcl.ac.uk
^{b}National Physical Laboratory, Teddington, TW11 0LW, UK
^{c}Department of Chemistry University College London, 2020 Gordon St., London, WC1H 0AJ, UK

Received
17th March 2024
, Accepted 2nd May 2024

First published on 2nd May 2024

We build on the concept of eigenvector continuation to develop an efficient multi-state method for the rigorous and smooth interpolation of a small training set of many-body wavefunctions through chemical space at mean-field cost. The inferred states are represented as variationally optimal linear combinations of the training states transferred between the many-body bases of different nuclear geometries. We show that analytic multi-state forces and nonadiabatic couplings from the model enable application to nonadiabatic molecular dynamics, developing an active learning scheme to ensure a compact and systematically improvable training set. This culminates in application to the nonadiabatic molecular dynamics of a photoexcited 28-atom hydrogen chain, with surprising complexity in the resulting nuclear motion. With just 22 DMRG calculations of training states from the low-energy correlated electronic structure at different geometries, we infer the multi-state energies, forces and nonadiabatic coupling vectors at 12000 geometries with provable convergence to high accuracy along an ensemble of molecular trajectories, which would not be feasible with a brute force approach. This opens up a route to bridge the timescales between accurate single-point correlated electronic structure methods and timescales of relevance for photo-induced molecular dynamics.

While there are a number of alternative formulations for efficient propagation of coupled electron and nuclear motion, for many purposes mixed quantum-classical approaches can be devised whereby the nuclear propagation is still treated classically, relying on information from all relevant electronic states in the dynamics. These electronic states can therefore be obtained under the Born–Oppenheimer (BO) approximation^{6} at each nuclear geometry without explicit coupling to quantum nuclear degrees of freedom. In this work, we consider the popular ‘fewest-switches surface hopping’ (FSSH) approach,^{7–9} although many other approaches also exist.^{1,2} In FSSH, the full electronic state is evolved at each timestep as a superposition over multiple Born–Oppenheimer adiabatic electronic states, with the nuclei propagated on a single adiabatic electronic surface. When the simulations are done using adiabatic states, nonadiabatic stochastic jumps between the electronic states occur with a probability depending on the nonadiabatic couplings, whilst ensuring overall energy conservation. While this approach can be motivated from higher levels of theory,^{10} it lacks rigorous interference and decoherence effects of quantum nuclei (noting recent work incorporating nuclear decoherence and tunneling processes into the formulation^{8,9,11–13}). Provided ad hoc corrections are incorporated to address some of these issues, FSSH has proven to be very effective in the modelling and prediction of photo-chemical processes over relevant atomic timescales.^{2}

However, the outstanding limitation in the scope of application of NAMD is the electronic structure problem at its heart,^{4,14} which has been found to be generally more important to the quality of results than the specifics of different NAMD approximations.^{15} This is a demanding electronic structure challenge, requiring many consecutive calculations as the nuclei are propagated in time, each with a finely balanced description of multiple electronic states. Furthermore, nuclear forces are required for each state to determine the propagation of the nuclei, as well as nonadiabatic couplings (NACs) between the states for the propagation of the electronic part and stochastic hopping. Although it is possible in some cases to approximate these gradients and NACs if they are not available analytically from the electronic structure solver used, this introduces additional approximations, uncertainty and potentially overheads for the calculation. While density functional theory (DFT) is almost ubiquitous for ab initio molecular dynamics on ground states, the requirement of multiple electronic states limits its applicability to NAMD (noting recent approximations for DFT-based NAMD, especially in simulations of transport^{16,17}). Furthermore, single-reference electronic structure theory based on linear response (such as TD-DFT,^{18} CC2^{19} or EOM-CCSD^{20}) are also generally not suitable where transitions to the ground state are required, due to their explicit formulation as excitations from a single electronic state, which precludes the critical region around conical intersections where this gap vanishes and ground and excited states must be treated on the same footing.^{14} These descriptions may not give the required balance in the accuracy of the ground and excited states, especially where excited states have significant charge rearrangement compared to the ground state.

This places a significant burden on multireference electronic structure methods for the description of these adiabatic states, in particular state-averaged complete active space self-consistent field (CASSCF) which can be considered the workhorse of NAMD, and which has analytic gradients and NACs^{21,22} available in many packages. Unfortunately, there are several drawbacks in these traditional multireference approaches, in particular the lack of dynamical correlations, which can unbalance the description of different states and manifest in e.g. dramatic overestimation of the relative energy of ionic states.^{23,24} Extensions to internally-contracted multireference perturbation theory (MRPT) and configuration interaction (MRCI) methods are more recently available, but they can still underestimate vertical excitations^{25} and are substantially more expensive with more challenging gradient theory and NACs.^{26,27} More fundamentally, these approaches depend significantly on the choice of active space^{28} and require the definition of a small active space to enable them to be tractable, which must remain consistent across the changes in geometry over the trajectory. This can often be hard or impossible, as relevant orbitals at one geometry can adiabatically change into an irrelevant subspace at another geometry, while orbitals crossing and entering or leaving the subspace can result in a discontinuous surface, difficulties with intruder states and issues with energy conservation during the NAMD simulations.^{29–31}

This context of NAMD for photochemistry, catalysis and beyond, is a prime motivation for developments in electronic structure methodology. However, new wavefunction-based approaches emerging in the last couple of decades have not yet impacted upon this field, underlining the challenges faced in this area. Modern accurate and systematically improvable approaches such as DMRG (density matrix renormalization group),^{32–34} selected configuration interaction (sCI),^{35,36} and a number of stochastic methods (including full configuration interaction quantum Monte Carlo,^{37–39} auxillary field QMC^{40,41} and advances in variational Monte Carlo models^{42,43}) can in principle be used as multiconfigurational solvers within NAMD, at least to extend the size of active spaces and mitigate the difficulty in their appropriate selection. However, despite much progress in accurately describing excited states within these frameworks,^{33,44–47} as well as their analytic nuclear gradients,^{30,48–51} the community generally still lack nonadiabatic couplings in these methods, while stochastic noise in the electronic structure can often be challenging for precise molecular dynamics that conserves total energy (noting significant work in ref. 52). More generally, while these methods can be powerful in obtaining a small number of single-point energies (often both for ground and excited states), they are still expensive for the number of calculations required in molecular dynamics. Furthermore, they often lack a robust black-box simulation protocol that hinders the reliable execution of these approaches for several thousand consecutive calculations, each relying on consistent convergence for all prior points for reliable trajectories, without manual checking of convergence and simulation parameters. This puts the timescales necessary to describe realistic photochemical processes out of their reach (a fact often even true with a CASSCF description).

As an alternative paradigm, machine learning (ML) potentials have been tremendously successful in breaking this computational barrier for ground state potential energy surfaces (PES). However, the application of ML methodologies to NAMD has various limitations and challenges (beyond those well-understood in ground-state ML force fields such as the local energy decomposition) despite showing great promise by accessing nanosecond NAMD trajectories for small molecules both with MR-CISD^{53} and CASSCF^{54} training data. Although progress has been made in predicting excited state energies and forces in these frameworks,^{53–59} direct learning of NACs tend to lead to their underestimation,^{59,60} while approximations to them based purely on energies and gradients can also lead to a substantial misrepresentation of true nonadiabatic processes.^{58,61,62} Capturing conical intersections (CoIns) is also a challenge, as small and sharp energy gaps near CoIns tend to be overestimated and smoothened since they are necessarily poorly represented in the training set.^{60,63} Furthermore, all of the same electronic structure challenges still exist in obtaining appropriate training data required to build these models.^{63}

The approach outlined in this work takes a step to address these shortcomings, in a largely method-agnostic framework for the robust and compact interpolation of accurate wave functions through chemical space. Initially described in ref. 64 for the interpolation of ground states, we show how this approach can be extended for the balanced interpolation of both ground and excited states for ab initio systems over changing atomic geometries with mean-field cost. Importantly, we also show how both excited state gradients and nonadiabatic couplings between the states are straightforwardly extracted from the interpolation framework, allowing application to NAMD. This allows for the first (to the best of our knowledge) NAMD simulation using modern DMRG derived electronic states,^{65,66} obtaining high-accuracy molecular dynamics of excited hydrogen chains – a system for which we believe no other solver would be effective. We discuss both the convergence of the dynamics with respect to increasing numbers of training DMRG calculations supporting the interpolated model, and an approach for efficient selection of these training geometries. Finally, we end with a perspective of the use of this framework more generally for practical NAMD applications.

We assume that we have a ‘training’ set of M non-orthogonal many-body states, as vectors in the electronic Hilbert space, {|C_{a}〉} = C^{(a)}_{n}, where a, b, … labels these distinct training states and n denote the occupation number vectors of the orthonormal many-electron configurations at the nuclear geometry of interest, R. We project the ab initio Hamiltonian at this nuclear configuration into the space spanned by these states. We can then consider a valid wave function at this geometry as resulting from a variational optimization within the span of this many-body subspace. This can be found in closed form as a generalized eigendecomposition of the M × M Hamiltonian in this basis,

〈C_{a}|Ĥ(R)|C_{b}〉x^{(A)}_{b}(R) = E_{A}(R)〈C_{a}|C_{b}〉x^{(A)}_{b}(R).
| (1) |

The training basis (comprising M elements) is small enough that this generalized eigenvalue problem can be completely solved at all geometries of interest, giving a variational approximation to both the ground state and excitation spectrum at each arbitrary ‘test’ geometry, E_{A}(R). The eigenvectors, x^{(A)}_{b}(R), denote the component of the many-body training vector in each interpolated state (these interpolated states are denoted by upper-case indices A, B, …). Through this scheme, wavefunctions at arbitrary test geometries and their observables can be inferred by sampling few wavefunctions at training geometries. As this training subspace is enlarged, the energies of all states from the subspace must necessarily variationally lower towards their exact eigenvalues, as ensured by the eigenvalue interlacing theorem.^{75} Due to the linearity of the model, the number of electronic states described remains constant as geometries change, and their energies must vary smoothly with changes in the nuclear potential (away from state crossings). The key questions now are how these training states are chosen such that they faithfully span the low-energy eigenstates as geometries change, as well as how the subspace Hamiltonian can be efficiently constructed without requiring the training data expressed in the exponentially large many-electron Hilbert space.

As indicated via the explicit dependence on R for certain quantities in eqn (1), the training vectors are defined such that their numerical values remain fixed in an abstract orthonormal Hilbert space, regardless of the test geometry R at which we are evaluating the subspace model. This critically ensures that the overlap metric between many-body training states, 〈C_{a}|C_{b}〉, is independent of R. However, if this training basis is to be interpreted as a set of physical wave functions at each geometry rather than abstract vectors, then it is worth stressing that their character does indeed change with R, since the underlying Hilbert space of electronic configurations will change. Therefore, it is essential that we have a consistent and orthonormal representation of the Hilbert space at each geometry, such that the numerically fixed training vectors are transferrable between geometries and still span a space of relevance for the low-energy states of interest, rather than them spanning increasingly irrelevant parts of the many-body Hilbert space.

To motivate a judicious choice for these training vectors, we assume that they come from the exact, full configuration interaction (FCI), solution for a small number of low-energy eigenstates of the Born–Oppenheimer electronic Hamiltonian at select ‘training’ geometries of the system. However, we also need to fix a consistent representation of the orbitals, χ_{i}(r; R), which define the many-body Hilbert space for these vectors, such that the probability amplitudes of states {|C_{a}〉} can be effectively transferred between geometries. To this end, we choose the orbital basis of the training states, χ_{i}(r; R), to be the symmetrically (Löwdin) orthonormalized atomic-orbital (SAO) basis.^{76,77} These are simply and uniquely derived from an underlying atom-centered atomic orbital basis, {ϕ_{α}(r; R)}, as

(2) |

(3) |

These SAOs are defined to remain (in a least-squares sense) as close as possible to the underlying local atomic orbital basis, while ensuring the required orthonormality of this abstract basis at each atomic geometry.‡ This choice of representation for the training states {|C_{a}〉} (obtained over a range of geometries) is motivated by the fact that much of the local character of the correlated many-electron state in this basis will remain similar as atoms move by small amounts. In addition, nearby atoms with similar chemical bonding will also have common features in their many-electron quantum fluctuations characterizing e.g. covalent bonding character. Therefore, the numerical values of the probability amplitudes of the states in this representation will plausibly remain ‘close’ to the states of interest at modified geometries, ensuring that the numerical values of the FCI many-electron states change least between the different electronic Hilbert spaces as the atomic positions change, and that the states can act as a general projector into the low-energy space as atoms move.

While this is a heuristic choice, we have the rigorous conditions that the inferred wave functions are strictly variational (for all geometries) with respect to increasing training data, as well as the exactness of all inferred states at test geometries which coincide with training geometries. From an ML perspective in its application to MD, the variationality of the model therefore ensures an inductive bias away from regions of the phase space which are poorly represented by the training data. The inferred states are at all points represented as a variationally optimal linear combination of the training states in their SAO representations, as

(4) |

No special structure is relied upon for the description of any of these states (other than the fact that their SAO-represented FCI vectors over the geometries of interest remain sufficiently close to a linear combination of the training states). Indeed, no mean-field information is used at any point in the framework. This provides confidence that strong electronically correlated states can be described, and that the procedure should be relatively unbiased for a description of multiple electronic states simultaneously, providing those states are equally represented in the training states. Furthermore, in contrast to the widespread machine-learning derived force fields now prevalent in molecular dynamics, no local energy decomposition is employed, and the fact that a valid many-electron state is propagated to all geometries means that all properties of interest can be predicted from the same model. This includes analytic nuclear forces, which are particularly straightforward due to the geometry-independence of the subspace definition and variational formulation.^{64} Furthermore, as we shall show in Sec. 2.3, nonadiabatic coupling vectors between inferred states are also straightforwardly obtainable, which places the approach as a suitable candidate to be an electronic structure solver for NAMD. Testing these assertions is at the heart of this work.

Finally, we note that FCI probability amplitudes in a local SAO representation are invariant to translation and rigid body rotations of the molecule (provided a consistent ordering of the underlying AOs). Furthermore, at dissociation the SAO probability amplitudes are unchanging for all states regardless of the extent of the dissociation, ensuring that all inferred wave functions in this important strongly-correlated limit should be consistently described. The appropriate choice of training geometries in which to support the model across e.g. a molecular dynamics trajectory is however critical to the success of the method. In Sec. 3.3 we develop an active-learning protocol to greedily update the training states in a self-consistent procedure across an MD trajectory, demonstrating convergence to near-exactness of NAMD over the trajectory.

(5) |

enabling the projection to be found as

(6) |

Crucially, once these tRDMs and overlaps between the training states are known, the simulation proceeds with the subspace Hamiltonian constructed via the contraction of eqn (6) in cost, where M is the number of training states and L is the number of basis functions in the system. This relatively low polynomial scaling contrasts with the generally exponential costs of accurate wave function solutions to the electronic structure problem to compute the training states, highlighting the significant speed up in this interpolation when many calculations are required across chemical space (and especially where multiple electronic states are required). Properties can then be extracted from the inferred state via its one- and two-body reduced density matrices (RDMs) represented in the SAO basis of each geometry, without any explicit recourse to the training states, as

(7) |

(8) |

This framework is demonstrated in Fig. 1, where we show a simple proof-of-principle for the one-dimensional phase space corresponding to the symmetric stretch of four Hydrogen atoms. The six lowest-energy FCI solutions are shown, including a state-crossing, and the training states supporting the interpolation in each panel are indicated by orange crosses. For the interpolated state, the span of the training space can be enlarged systematically by including either a larger number of higher-lying eigenstates at each geometry (which will necessarily be orthogonal at the same geometry), or a larger number of geometries (which will be non-orthogonal between different geometries). We show that the three lowest lying states can be smoothly interpolated to near-exactness (including a state crossing) with just three geometries, each contributing three states to the training space. This system is further benchmarked in Fig. 2. In all results of this work, we select the same number of training states at each geometry as the number of low-energy states of interest for the interpolation unless stated otherwise.

Finally, we note that while the framework was described with the use of exact (FCI) training states, other approximate electronic structure methods could also be used to define these training states. As long as N-representable tRDMs corresponding to physical wave functions, as well as overlaps, are defined in the method, then any approach could be used within the scheme and a variational state would be inferred at all points, motivating the framework as a tool to accelerate many different electronic structure methods. For the larger systems in Sec. 4, we therefore use the density matrix renormalization group (DMRG),^{78} which defines training states as matrix product states (MPS).

(9) |

(10) |

(11) |

The first order nonadiabatic coupling vectors between inferred states A and B are given, in its usual form, as

(12) |

The derivative of both the subspace expansion coefficients, x^{(A)}_{b} and the orbital basis need to be accounted for with two separate terms,

d_{AB} = d^{coef}_{AB} + d^{orb}_{AB}.
| (13) |

The coefficient dependent term can be found using the subspace expansion in eqn (4), the orthonormality condition between the inferred states, x^{T}_{(A)}Qx_{(B)} = δ_{A,B} (where Q = Q_{ab} = 〈C_{a}|C_{b}〉 is the overlap matrix between the training states), and the generalized Hellman–Feynman theorem, as

(14) |

(15) |

(16) |

(17) |

(18) |

Following a similar procedure to analytic CASSCF nonadiabatic coupling vectors,^{21,22} the orbital contribution, d^{orb}_{AB} is formulated as:

(19) |

(20) |

The electronic dynamics are propagated by the time-dependent Schrödinger equation as a superposition over the adiabatic surfaces at a given nuclear geometry as

(21) |

(22) |

(23) |

In this work we used the Newton-X^{85,86} package to perform the FSSH simulations, interfaced to our eigenvector continuation code which was called to obtain all interpolated electronic structure properties, and NACs between all pairs of interpolated states were included for the propagation of the electronic wave function and stochastic hopping. This also relied on PySCF for the computation of the required Hamiltonian, derivative integrals and FCI training data where used.^{81,82} This workflow was also initially benchmarked against the FCI Newton-X interface to OpenMolcas.^{62,87} A 5^{th} order multistep integrator of Butcher^{88} was used to integrate eqn (21) with 20 multisteps. Decoherence corrections were applied through the simplified decay of mixing approach^{11} with a decay parameter of 0.1 Ha. All trajectories were checked for energy conservation.

Fig. 3 Comparison of the interpolated trajectory (dotted lines) against the exact results (solid lines) of a four-atom hydrogen chain starting in the equilibrium equidistant S_{1} state of a STO-3G basis. (a) Adiabatic energies along the trajectory (shown with black stars) with a nonadiabatic hop to the ground state. Insets showing the nuclear geometries. (b) Populations of propagated electronic wavefunction over the low-energy adiabatic states considered. (c) The average interatomic distance of the full chain (R_{1},_{Natm}/(N_{atm} − 1)) and the distance between the middle hydrogens (R_{2,3}). (d) The transition probability from the current state of the trajectory to the other two states. The training wavefunctions used in the interpolation are taken from the three equidistant geometries shown in Fig. 2. |

We now consider the performance of the eigenvector continuation compared to the exact propagation in this toy system, where we select FCI training states only at the same three equidistant and evenly-spaced geometries as shown in Fig. 2. The key question is whether these non-equidistant, semi-dissociated nuclear geometries visited in the trajectory are well described in the scheme compared to FCI with these somewhat unrepresentative training points. On top of that, we want to see if it can capture the correct internal conversion time to the ground state as small changes in the stochastic hopping transition can result in significant differences in the dynamics. Fig. 3 demonstrates the success of the continuation even without any particular care in the training geometry selection. Despite the simplicity of this four-electron system, nine FCI training wavefunctions computed at three equidistant geometries seem to represent the low-energy Hilbert space (with a total of 256 Slater determinants) at all the relevant geometries. In addition to the energies, populations and nuclear geometries, the transition probabilities between different adiabatic states in Fig. 3(d) are also captured to a very high quantitative accuracy. This indicates the methods capabilities to extract accurate statistical averages over multiple surface hopping trajectories. It could perhaps be argued that the success here is only reflective of the simple ratio of the Hilbert space size to number of training states. We therefore address this question for larger systems in Sec. 4.

The crucial component of any active learning strategy is the selection criteria for the new data for the training set at each iteration. It is common to use a distance metric to choose the data that is most different from the ones in the training set. Since the main inputs for our inference are the 1- and 2-electron integrals at the geometries of interest, which determine all electronic states, we employed this ‘Hamiltonian distance’:

(24) |

as our distance metric where R_{t} are the geometries in the existing training set and, h^{(1)}_{ij} and h^{(2)}_{ijkl} are the electron integrals. This can be simply evaluated along with the inference over the MD trajectory, and satisfies the important property that the metric is zero for the interpolation at existing training geometries.

This metric was applied in a previous publication to converge the training states for ground state dynamics with eigenvector continuation, where the geometry with largest D_{min} over the trajectory was solved with the electronic structure solver and added to the training data for the next MD run, iteratively improving the trajectory until convergence.^{64} This ensures that geometries from the trajectory are added to the training set that are ‘furthest’ (in this Hamiltonian distance sense) from the nearest training state. However, simply taking the maximum D_{min} over the entire trajectory doesn’t yield the fastest convergence, as these points tend to correspond to the final geometries visited in the trajectory, where the furthest parts of the phase space are being explored. A better measure is to consider the turning points, i.e. peaks in the Hamiltonian distance, over the trajectory, as they signal two important scenarios. Firstly, they can point to true dynamical extrema such as the minimum and maximum separation of a nuclear oscillation. For an optimal interpolation (rather than extrapolation) of that motion, inclusion of these extrema is beneficial. The second scenario is when the continuation trajectory is in an explorative phase (i.e. we are searching for additional training geometries required, rather than running the final converged trajectory). As the scheme introduced in this paper necessarily overestimates the energies of atomic geometries in regions of phase space far from the training data due its variational nature, the MD has an inductive bias away from these spuriously high-energy regions. Following this, the trajectory hits an overestimated energy barrier as it moves towards these poorly described geometries and bounces back to a known part of the phase space. Thus, peaks in Hamiltonian distance also hint at these regions of the nuclear phase space where the addition of training data would be particularly beneficial to optimally improve the overall accuracy of the interpolation.

As the nuclear phase space explored during the dynamics changes as more training data is added to the model (since the inferred electronic potential changes), consideration needs to be made as to the trade-off between exploration and exploitation in the learning. Therefore some bias needs to be included in the heuristic metric for the choice of the training geometry to include, to preferentially select geometries from earlier times in the trajectory, before it has diverged too much from the exact converged path. This avoids unnecessary additional electronic structure calculations to include training geometries which may be unrepresentative of the phase space explored in the final converged trajectory. To account for this, the peaks in D_{min} are weighted according to eqn (25), to bias the selection of training geometries that occur earlier in the trajectory,

(25) |

Here, the exponent x is a hyperparameter that determines the degree of weighting towards earlier geometries. At its limiting cases, it can reduce to either selecting the highest peak (x = 0) or the first peak (x → ∞) in the Hamiltonian distance, D_{min}. We found x = 3 to result in a reasonably fast convergence, for the hydrogen chains studied in this work, providing a good balance between exploring the most unfamiliar geometries and including earlier peaks to ensure systematic convergence of the true trajectory from earlier to later times.

This learning strategy is demonstrated in Fig. 4 for converging the NAMD trajectory over five inferred states with respect to number of geometries used in the training (N) of an eight-atom hydrogen chain, using the five lowest-energy FCI states from each training geometry. The interpolated surfaces with the shown number of training geometries (N) are shown with orange dashed lines in panels (a)–(e), reflecting both the error in the interpolation and the divergence of the trajectories as a result, compared to the exact adiabatic surfaces from FCI (solid blue lines). All adiabatic surfaces were restricted to be from the same spatial symmetry as the ground state. The trajectory was started from the third excited state, S_{3}, at the equilibrium equidistant linear chain geometry with zero nuclear kinetic energy and propagated with a timestep of 0.1 fs in the minimal STO-3G basis. The interpolated trajectory agrees almost exactly with the FCI trajectory by the time N = 14, as shown in 4(e). In this, the propagation follows S_{3} for around 20 fs which pushes the system to form two separate H_{4} clusters, before the trajectory hops to the S_{1} state pushing the middle hydrogens of each of these H_{4} clusters to form energetic dimers while ejecting the end hydrogens in opposite directions. Both of these can be seen in 4(o) where the dimerization of the second and third hydrogen atoms and the rapid increase in the end-to-end distance is observed just after 20 fs. After this, the system stays in S_{1} for 15 fs before jumping to the ground state around 35 fs. This, in turn, stabilises the dimers and allows the formation of another dimer in the middle between the fourth and fifth hydrogens that were ejected from their respective clusters. This leads to a final configuration of three vibrating H_{2} dimers in the middle with two atomic hydrogen atoms dissociating from the chain. This complex NAMD trajectory explores a lot of different regions of the nuclear phase space that would be difficult to select as training states a priori.

Fig. 4 Convergence of the active learning scheme for the nonadiabatic molecular dynamics trajectory of an eight-atom hydrogen chain with respect to number of geometries selected for training the model (N). Top panels (a–e) show the adiabatic energies along this trajectory (exact as blue solid lines, interpolated from N training FCI calculations as orange dashed lines). Bottom panels (k–o) display the scaled end-to-end distance of the chain, R_{1,Natm}/(N_{atm} − 1) and the distance between the 2^{nd} and 3^{th} hydrogen atoms along the chain, R_{2,3}. Middle panels (f–j) show the minimum Hamiltonian distance (eqn (24)) over the geometries along the continuation trajectory with respect to the training geometries available at that iteration. The vertical lines in this panel represent the next geometry along the trajectory selected for inclusion in the model, based on the metric discussed in eqn (25). |

We can also use this to analyze the convergence of the active learning scheme for selected numbers of geometries up to the converged N = 14 trajectory, with the Hamiltonian-distance metric of eqn (24) shown in Fig. 4 panels (f)–(j). The geometries selected for subsequent inclusion in the training set from the path of the inferred trajectory are indicated by vertical lines in these panels, according to eqn (25). It can be seen that the peaks in the Hamiltonian distance serve as qualitative indicators for points of divergence between the trajectory on the inferred potential and the FCI trajectory. This is especially evident for N = 5 and N = 8 trajectories where the data selection in Fig. 4(g) and (h) corresponds to the geometries where the divergence starts in Fig. 4(b) and (c), respectively. Notably, the N = 5 trajectory underscores the significance of weighting the selection to earlier times since selecting the maximum D_{min} would have lead to configurations unexplored within the true trajectory. This allows for a systematic convergence of data selection towards the true trajectory from earlier to later times. Moreover, prioritizing geometries in this way also facilitates the exploration of relevant regions in the phase space at later times in the MD, as illustrated for N = 11 in Fig. 4(d).

The focus on improving the accuracy of the model initially for earlier times ensures that the transition region between S_{1} → S_{0} can be predicted correctly before trying to iteratively improve the later time sampling of the relevant post-transition ground state dynamics. These are targeted from N = 11 onwards (see the data selection at 35 fs in Fig. 4(i)) when relevant parts of the phase space in these trajectories are being sampled.

Overall, the proposed heuristics appear to address the challenges of selecting a compact set of training geometries in a black-box and rapidly convergent fashion, minimizing the number of explicit electronic structure calculations required. This balances the challenges of considering both the time and the magnitude of the peak in the heuristic error estimate in the interpolated energy surfaces over the trajectories sampled. Overall, in this system qualitative convergence is achieved by the 11^{th} training geometry, with quantitative accuracy attained after a total of 14 training geometries. Considering both the nuclear phase space explored and electronic complexity of this trajectory, the proposed learning scheme can train and replicate the true trajectory with remarkably few FCI training points. We anticipate however that further improvements could be made by including a description of the inferred states themselves in these heuristics since only the Hamiltonian distance to a single training point is considered. This will be investigated in future work.

In recent years, there has been progress in obtaining approximate DMRG-SCF excited state gradients and NACs^{33,91} as well as quantum dynamics simulations of realistic vibronic Hamiltonians with time-dependent DMRG.^{92–94} However, running full NAMD simulations with ab initio DMRG has practically been out of reach due to difficulties with exact analytical NACs, high computational cost and difficulty in ensuring a fully black-box and robust workflow over the many calculations. Here, we present the first NAMD simulation with fully ab initio MPS wavefunctions accelerated through the eigenvector continuation. We obtain DMRG training data via the spin-adapted and multi-state implementation in the block2 library, which can also straightforwardly obtain the required tRDMs and overlaps between the MPS training states.^{66,95,96}

We consider the 28-atom one-dimensional hydrogen chain in a STO-6G basis with FSSH, where we initialize the chain with an equilibrium equidistant nuclear configuration with zero nuclear kinetic energy, photo-excited to the first excited electronic S_{1} state, and we consider the electron and nuclear dynamics over these two interpolated states. The training MPS wavefunctions were optimized to near-exact accuracy by exponentially increasing the bond dimension at each training geometry while decreasing the noise during the DMRG sweeps, simultaneously optimizing the ground and S_{1} state. A timestep of 0.5 fs was used for the FSSH trajectories.

The active learning scheme described in Section 3.3 was applied to converge a single nonadiabatic molecular dynamics trajectory of the H_{28} system that explores the relevant phase space and samples the dynamics on both S_{0} and S_{1} potential energy surfaces. It should be stressed that the training points selected by the scheme will not (in general) be featured in the final trajectory, since they are taken from different potential energy surfaces as the inferred model is iteratively improved. This suggests that while it converges the trajectory from a single seed, it should exhibit little bias towards this single trajectory in the final data selection. Similar accuracy should be found over a stochastic ensemble of trajectories, with the selection relatively insensitive to the precise details of the trajectory. The convergence was achieved with N = 22 geometries as shown in Fig. 5(a). In this, we show the convergence of the average variational energy of the S_{0} and S_{1} states with respect to N over all geometries of the same final converged trajectory. We find that the addition of the final two training points changes this metric for both states by less than chemical accuracy (1 kcal mol^{−1}) across the entire trajectory, indicating confidence in convergence with respect to the training data. It is also worth noting that only two training geometries were sufficient to surpass the variational energy of Hartree–Fock over the trajectory, despite no mean-field information being considered in the interpolated states. The all-important S_{0} → S_{1} average excitation energy over the trajectory also converges faster than the absolute energies of the states, as shown in the inset.

For some context as to the accuracy of the interpolated surfaces, Fig. 5(b) illustrates the error in the interpolated S_{0} and S_{1} energies at the final N = 22 training geometries, compared to the explicit DMRG used in its training, along with additional S_{0} energy comparison to Hartree–Fock and Coupled Cluster Singles and Doubles (CCSD) theory. We find that the interpolated energies are exact (to numerical precision) compared to the DMRG calculations, as expected. However, at two geometries, the continuation scheme yields a slightly lower variational energy than the DMRG training energies for the S_{1} state, indicating that a small further optimization of this state with DMRG would be possible by increasing the bond dimension. The H_{28} system is close to an ideal system for DMRG due to its one-dimensional nature, so suboptimal convergence behavior was rare in this case. However, this showcases that convergence difficulties in the training for a particular geometry is not as problematic for the performance of the continuation model, as it would likely be during a traditional MD trajectory. This is because the model will ‘borrow’ electronic character from other training states to ensure that the surfaces remain smooth and therefore variationally improve upon unoptimized training states even at the training geometries themselves. This behavior was more evident when applied to the ground state dynamics of the Zundel cation reaction in the previous work.^{64}

This unreliability in convergence of multiple single-point calculations can even be evidenced in comparison CCSD energies at the training geometries – generally quite a robust electronic structure method, shown in Fig. 5(b). Although it is well known that CCSD will struggle to capture the multi-reference nature of the more dissociative geometries due to its intrinsic single-reference formulation (geometry 2), it can also converge to the wrong state (geometry 17) or fail to converge at all (geometry 22) with simple default simulation parameters in widely used software packages. Isolated convergence issues are common when running lots of single-point calculations with different electronic character as are necessary in MD with many-body methodologies, and the proposed scheme ensures the smoothness of the potential energy surface and thus the physicality of dynamics. Even without these points however, CCSD would not reach chemical accuracy (even for energy differences) across the trajectory.

Using these DMRG training wavefunctions for H_{28} we find the dynamics over 60 fs for an ensemble of 100 trajectories with different random seeds, all starting at the equilibrium geometry with equally separated hydrogen atoms, in the S_{1} electronic state. This is a total of 12000 single-point calculations, all inferred from the same 22 DMRG training calculations. The averages over this ensemble of trajectories for the electronic population of each state and atomic distances are shown in Fig. 6 to better understand the nature of the nonadiabatic molecular dynamics in this H_{28} system.

The average electronic state populations are shown in Fig. 6(a), with the fastest rate of internal conversion between 30–50 fs after release, indicated in particular by two sharp jumps, with almost all trajectories in the ground S_{0} state after this time. In addition, there exists a handful of trajectories that hop back to the higher-energy S_{1} state after hopping to S_{0}, especially among the trajectories that transition to S_{0} early (before 10 fs), which can be seen by the dips in S_{0} population around 5–15 fs. Considering the nuclear geometries along the trajectories, it is clear that the hydrogen atoms at the ends of the chains rapidly dimerize for all trajectories, which is shown in Fig. 6(b). This contrasts with the H_{4} and H_{8} systems, where their terminal atoms were ejected before being able to dimerize. In the longer chain dynamics, the frequency and amplitude of these terminal dimer vibrations also seems to change with time, which differ from the ground state behaviour where a constant frequency and amplitude is maintained throughout the dynamics. This might imply that the first excited state of H_{28} is not as rapidly dissociative (or dissociative at all) unlike the smaller chains, or that the smaller excitation energy in this longer chain leads to faster decay to the ground state which favors dimerization.

The end-to-end distance of the 28-atom hydrogen chain increases linearly with time as seen in Fig. 6(c) with relatively little scatter in this rate over the different trajectories. The rate of this increase is very similar to the one observed in ground state hydrogen chain dynamics in Rath et al.^{64} Note that the dynamics in that work started from a geometry that was 10% stretched from the equilibrium, while this work starts from the first excited state of the equilibrium equidistant geometry. This suggests that the electronic energy gained by that initial stretch results in a equivalent chain length behaviour as a photo-excitation to S_{1}. Fig. 6(e) shows the maximum distance between any two neighbouring hydrogen atoms, which essentially describes the distance between the terminal dimers in the chain which move apart from each other at the fastest rate (i.e. the H_{2}–H_{3} distance).

Unlike the consistent dimerization behavior at the ends of the chain, the behaviour in the middle of the chain seems to vary significantly depending on the trajectory, as seen in Fig. 6(d), exploring a large phase space. Continuous oscillations of some trajectories about 1.5 a_{0} are visible, indicating that a portion of trajectories form dimers between the 14^{th} and 15^{th} hydrogen atoms. This would result in a frustrated dimerization of the overall chain, as full dimerization would require the 14^{th} and 15^{th} hydrogen atoms to dimerize with their other neighbours, not with each other. This therefore requires other non-dimer configurations in the chain for those trajectories, either forming trimers, unbound hydrogen atoms, or other larger hydrogen complexes.

We analyze the geometries that emerge in the last 5 fs of the 100 trajectories computed in this work, to identify the different nuclear configurations that result. Of these trajectories, 27% form perfect dimerization throughout the chain, where the distance within dimers is always smaller than the distance to their next neighbour, leaving the majority without this expected order. 39% of all trajectories have hydrogen atoms being shared with two separate dimers where the bond separation within both of these dimers is less than the distance to their other neighbours. Only 9% of trajectories can be characterized as having a ‘free’ hydrogen where the distance to its neighbours is always more than the distance between the next consecutive hydrogens. Finally, the rest of the trajectories (25%) don’t fit any of these criteria and can be thought of as transition states between these configurations. Videos showcasing representative trajectories are available in the ESI.†

The variation in the trajectories all appears simply due to the stochastic nature of the hopping to the ground state in the FSSH approach, which can result in significant deviations in this system and the exploration of different local minima in the nuclear phase space. A consideration of nonadiabatic models beyond FSSH would be interesting, to see whether these differences persist, as well as to gain further insights into the electronic behaviour over time (e.g. (transition) dipole moments and absorption spectra), which are all accessible from the interpolated states. In addition, there are a number of other physical mechanisms to consider, in particular the effect of proton tunnelling, as well as the incomplete nature of the basis set for the electronic states, which will be considered in future work. While the electronic phase of the clamped equidistant hydrogen chain was found to exhibit surprisingly rich physics,^{90} it appears that the nuclear dynamics similarly exhibits significant intriguing and subtle physics. We suggest that this system could also be used as an effective benchmarking test bed for nuclear dynamics developments in correlated electron systems, as well as for the electronic structure solvers on which they so heavily depend.

We demonstrate the approach with fewest-switches surface hopping on one-dimensional hydrogen chains, which exhibits a surprisingly rich dynamical behaviour and wide range of trajectories. For the largest H_{28} system, we use 22 training states as matrix product states from single-point DMRG calculations, and infer 12000 points on a smooth multi-state electronic surface converged to chemical accuracy from this DMRG training (including their analytic atomic forces and NACs) to sample the nuclear trajectories at (hybrid) mean-field scaling. The electronic wavefunctions at each point are represented as variationally optimized linear combinations of the training states, which are themselves optimized in the training phase at selected nuclear geometries. This relies on a consistent and transferable representation of the many-body probability amplitudes, described in an atomic-local representation to facilitate this transferability of the low-energy probability amplitudes to different nuclear configurations. We discover a surprisingly broad spectrum of surface-hopping trajectories for this hydrogen chain, where the simple dimerization of all atomic pairs represent only a minority of the observed trajectories. While nuclear quantum effects are likely to contribute in this system and be an interesting direction for future work, the transitions are below the total initial energy and proceed energetically downhill and are therefore unlikely to be a leading order effect. The nonadiabatic dynamics of this simple system could therefore represent an effective sandpit in the further development and comparison of hybrid classical–quantum molecular dynamics schemes, as well as their dependence on underlying correlated electronic structure methodologies.

From a machine-learning perspective, this approach circumvents many of the traditional approximations and assumptions of local energy based decompositions for machine-learned force field parameterizations. The presence of an explicit many-body wavefunction and variational energy at each nuclear configuration ensures an inductive bias away from poorly represented regions of phase space. All desired observables are accessible within the same model, with a smoothly varying and physical electronic state at all times. The scope for accelerating the numerous situations where multiple sequential electronic structure calculations are required is clear – from molecular dynamics to geometry or transition path optimization, vibrational spectroscopy and beyond.

Of course, many questions still remain. Chief amongst them is the question of how the number of training points required for a given accuracy scales with the nuclear phase space sampled. While this is likely to be system-dependent, it builds on the fundamental question of how transferable the low-energy physics is between these locally represented many-body states. While the nuclear phase space of the H_{28} was relatively small given the one-dimensional and inversion-preserving configurations sampled due to the initial conditions, it was still large enough to make this a highly non-trivial interpolation from just 22 training points. This question of training set size must also necessarily consider the effect of more realistic basis set sizes. Currently, limitations still exist in obtaining the training data, and more approximate methods will need to be investigated, as well as circumventing the quartic scaling memory costs for the 2-tRDMs which are required. These are all critical questions for the long-term viability of this approach, which holds the promise for efficient acceleration of a wide number of correlated electronic structure methods.

- B. F. E. Curchod and T. J. Martínez, Chem. Rev., 2018, 118, 3305–3336 CrossRef PubMed .
- R. Crespo-Otero and M. Barbatti, Chem. Rev., 2018, 118, 7026–7068 CrossRef .
- T. R. Nelson, A. J. White, J. A. Bjorgaard, A. E. Sifain, Y. Zhang, B. Nebgen, S. Fernandez-Alberti, D. Mozyrsky, A. E. Roitberg and S. Tretiak, Chem. Rev., 2020, 120, 2215–2287 CrossRef .
- S. Matsika, Chem. Rev., 2021, 121, 9407–9449 CrossRef PubMed .
- S. Matsika and P. Krause, Annu. Rev. Phys. Chem., 2011, 62, 621–643 CrossRef CAS PubMed .
- M. Born and R. Oppenheimer, Ann. Phys., 1927, 389, 457–484 CrossRef .
- J. C. Tully, J. Chem. Phys., 1990, 93, 1061–1071 CrossRef CAS .
- J. E. Subotnik, A. Jain, B. Landry, A. Petit, W. Ouyang and N. Bellonzi, Annu. Rev. Phys. Chem., 2016, 67, 387–417 CrossRef CAS PubMed .
- A. Jain and J. E. Subotnik, J. Chem. Phys., 2015, 143, 134107 CrossRef .
- J. E. Subotnik, W. Ouyang and B. R. Landry, J. Chem. Phys., 2013, 139, 214107 CrossRef .
- G. Granucci and M. Persico, J. Chem. Phys., 2007, 126, 134114 CrossRef .
- G. Granucci, M. Persico and A. Zoccante, J. Chem. Phys., 2010, 133, 134111 CrossRef .
- M. Ben-Nun and T. J. Martínez, in Ab Initio Quantum Molecular Dynamics, John Wiley & Sons, Ltd, 2002, pp. 439–512 Search PubMed .
- F. Plasser, R. Crespo-Otero, M. Pederzoli, J. Pittner, H. Lischka and M. Barbatti, J. Chem. Theory Comput., 2014, 10, 1395–1405 CrossRef CAS .
- J. Janoš and P. Slavíček, J. Chem. Theory Comput., 2023, 19, 8273–8284 CrossRef .
- M. Barbatti and R. Crespo-Otero, in Surface Hopping Dynamics with DFT Excited States, ed. N. Ferré, M. Filatov and M. Huix-Rotllant, Springer International Publishing, Cham, 2016, pp. 415–444 Search PubMed .
- A. V. Akimov and O. V. Prezhdo, J. Chem. Theory Comput., 2013, 9, 4959–4972 CrossRef CAS PubMed .
- B. F. E. Curchod, U. Rothlisberger and I. Tavernelli, ChemPhysChem, 2013, 14, 1314–1340 CrossRef CAS .
- O. Christiansen, H. Koch and P. Jørgensen, Chem. Phys. Lett., 1995, 243, 409–418 CrossRef CAS .
- J. F. Stanton and R. J. Bartlett, J. Chem. Phys., 1993, 98, 7029–7039 CrossRef .
- I. Lengsfield, H. Byron, P. Saxe and D. R. Yarkony, J. Chem. Phys., 1984, 81, 4549–4553 CrossRef .
- I. F. Galván, M. G. Delcey, T. B. Pedersen, F. Aquilante and R. Lindh, J. Chem. Theory Comput., 2016, 12, 3636–3653 CrossRef PubMed .
- P. Slavíek and T. J. Martínez, J. Chem. Phys., 2010, 132, 234102 CrossRef PubMed .
- C. Angeli, J. Comput. Chem., 2009, 30, 1319–1333 CrossRef PubMed .
- Y. J. Bomble, K. W. Sattelmeyer, J. F. Stanton and J. Gauss, J. Chem. Phys., 2004, 121, 5236–5240 CrossRef PubMed .
- J. W. Park and T. Shiozaki, J. Chem. Theory Comput., 2017, 13, 2561–2570 CrossRef PubMed .
- J. W. Park and T. Shiozaki, J. Chem. Theory Comput., 2017, 13, 3676–3683 CrossRef PubMed .
- J. J. Szymczak, M. Barbatti and H. Lischka, Int. J. Quantum Chem., 2011, 111, 3307–3315 CrossRef CAS .
- F. Plasser, M. Barbatti, A. J. A. Aquino and H. Lischka, Theor. Chem. Acc., 2012, 131, 1073 Search PubMed .
- T. Iino, T. Shiozaki and T. Yanai, J. Chem. Phys., 2023, 158, 054107 CrossRef CAS .
- J. W. Park, R. Al-Saadon, M. K. MacLeod, T. Shiozaki and B. Vlaisavljevich, Chem. Rev., 2020, 120, 5878–5909 CrossRef CAS .
- A. Baiardi and M. Reiher, J. Chem. Phys., 2020, 152, 040903 CrossRef CAS PubMed .
- L. Freitag and M. Reiher, in The Density Matrix Renormalization Group for Strong Correlation in Ground and Excited States, John Wiley & Sons, Ltd, 2020, ch. 7, pp. 205–245 Search PubMed .
- G. K.-L. Chan and S. Sharma, Annu. Rev. Phys. Chem., 2011, 62, 465–481 CrossRef CAS .
- R. Cimiraglia and M. Persico, J. Comput. Chem., 1987, 8, 39–47 CrossRef CAS .
- A. A. Holmes, N. M. Tubman and C. J. Umrigar, J. Chem. Theory Comput., 2016, 12, 3674–3680 CrossRef CAS PubMed .
- G. H. Booth, A. J. W. Thom and A. Alavi, J. Chem. Phys., 2009, 131, 054106 CrossRef PubMed .
- K. Guther, R. J. Anderson, N. S. Blunt, N. A. Bogdanov, D. Cleland, N. Dattani, W. Dobrautz, K. Ghanem, P. Jeszenszki, N. Liebermann, G. L. Manni, A. Y. Lozovoi, H. Luo, D. Ma, F. Merz, C. Overy, M. Rampp, P. K. Samanta, L. R. Schwarz, J. J. Shepherd, S. D. Smart, E. Vitale, O. Weser, G. H. Booth and A. Alavi, J. Chem. Phys., 2020, 153, 034107 CrossRef PubMed .
- R. J. A. J. J. Halson and G. H. Booth, Mol. Phys., 2020, 118, e1802072 CrossRef .
- M. Motta and S. Zhang, Wiley Interdiscip. Rev. Comput. Mol. Sci., 2018, 8, e1364 CrossRef .
- J. Lee, H. Q. Pham and D. R. Reichman, J. Chem. Theory Comput., 2022, 18, 7024–7042 CrossRef PubMed .
- K. Choo, A. Mezzacapo and G. Carleo, Nat. Commun., 2020, 11, 2368 CrossRef PubMed .
- Y. Rath and G. H. Booth, Phys. Rev. B, 2023, 107, 205119 CrossRef .
- L. Otis and E. Neuscamman, Wiley Interdiscip. Rev. Comput. Mol. Sci., 2023, 13, e1659 CrossRef .
- A. Baiardi, A. K. Kelemen and M. Reiher, J. Chem. Theory Comput., 2022, 18, 415–430 CrossRef CAS .
- N. S. Blunt, S. D. Smart, G. H. Booth and A. Alavi, J. Chem. Phys., 2015, 143, 134117 CrossRef CAS .
- H. Lischka, D. Nachtigallová, A. J. A. Aquino, P. G. Szalay, F. Plasser, F. B. C. Machado and M. Barbatti, Chem. Rev., 2018, 118, 7293–7361 CrossRef CAS PubMed .
- W. Hu and G. K.-L. Chan, J. Chem. Theory Comput., 2015, 11, 3000–3009 CrossRef CAS .
- R. E. Thomas, D. Opalka, C. Overy, P. J. Knowles, A. Alavi and G. H. Booth, J. Chem. Phys., 2015, 143, 054108 CrossRef PubMed .
- T. Jiang, W. Fang, A. Alavi and J. Chen, General Analytical Nuclear Force and molecular potential energy surface from full configuration interaction quantum Monte Carlo, arXiv, 2022, preprint, arXiv:2204.13356v1 [physics.chem-ph], DOI:10.48550/arXiv.2204.13356.
- S. Chen and S. Zhang, Phys. Rev. B, 2023, 107, 195150 CrossRef CAS .
- Y. Luo and S. Sorella, Front. Mater., 2015, 2, 29 Search PubMed .
- J. Westermayr, M. Gastegger, M. F. S. J. Menger, S. Mai, L. González and P. Marquetand, Chem. Sci., 2019, 10, 8100–8107 RSC .
- J. Li, P. Reiser, B. R. Boswell, A. Eberhard, N. Z. Burns, P. Friederich and S. A. Lopez, Chem. Sci., 2021, 12, 5302–5314 RSC .
- W. Pronobis, K. T. Schütt, A. Tkatchenko and K.-R. Müller, Eur. Phys. J. B, 2018, 91, 178 CrossRef .
- B.-X. Xue, M. Barbatti and P. O. Dral, J. Phys. Chem. A, 2020, 124, 7199–7210 CrossRef CAS .
- P. O. Dral and M. Barbatti, Nat. Rev. Chem, 2021, 5, 388–405 CrossRef CAS .
- J. Westermayr, M. Gastegger and P. Marquetand, J. Phys. Chem. Lett., 2020, 11, 3828–3834 CrossRef CAS PubMed .
- S. Axelrod, E. Shakhnovich and R. Gómez-Bombarelli, Nat. Commun., 2022, 13, 3440 CrossRef CAS PubMed .
- J. Li and S. A. Lopez, Acc. Chem. Res., 2022, 55, 1972–1984 CrossRef CAS .
- M. T. do Casal, J. M. Toldo, M. Pinheiro Jr and M. Barbatti, Open Research Europe, 2022, 1, 49 Search PubMed .
- I. C. D. Merritt, D. Jacquemin and M. Vacher, J. Chem. Theory Comput., 2023, 19, 1827–1842 CrossRef CAS PubMed .
- J. Westermayr and P. Marquetand, Mach. Learn.: Sci. Technol., 2020, 1, 043001 Search PubMed .
- Y. Rath and G. H. Booth, Interpolating many-body wave functions for accelerated molecular dynamics on near-exact electronic surfaces, arXiv, 2024, preprint, arXiv:2402.11097v1 [physics.chem-ph], DOI:10.48550/arXiv.2402.11097.
- R. Olivares-Amaya, W. Hu, N. Nakatani, S. Sharma, J. Yang and G. K.-L. Chan, J. Chem. Phys., 2015, 142, 034102 CrossRef PubMed .
- H. Zhai and G. K.-L. Chan, J. Chem. Phys., 2021, 154, 224116 CrossRef CAS .
- D. Frame, R. He, I. Ipsen, D. Lee, D. Lee and E. Rrapaj, Phys. Rev. Lett., 2018, 121, 032501 CrossRef CAS .
- P. Demol, T. Duguet, A. Ekström, M. Frosini, K. Hebeler, S. König, D. Lee, A. Schwenk, V. Somà and A. Tichai, Phys. Rev. C, 2020, 101, 041302 CrossRef CAS .
- T. Duguet, A. Ekström, R. J. Furnstahl, S. König and D. Lee, Eigenvector Continuation and Projection-Based Emulators, 2023 Search PubMed .
- S. König, A. Ekström, K. Hebeler, D. Lee and A. Schwenk, Phys. Lett. B, 2020, 810, 135814 CrossRef .
- N. Yapa, K. Fossez and S. König, Phys. Rev. C, 2023, 107, 064316 CrossRef CAS .
- C. Drischler, M. Quinonez, P. Giuliani, A. Lovell and F. Nunes, Phys. Lett. B, 2021, 823, 136777 CrossRef CAS .
- S. E. Schrader and S. Kvaal, J. Chem. Phys., 2023, 158, 114116 CrossRef CAS .
- C. Mejuto-Zaera and A. F. Kemper, Electron. Struct., 2023, 5, 045007 CrossRef CAS .
- S.-G. Hwang, Am. Math. Mon., 2004, 111, 157–159 CrossRef .
- P. Löwdin, J. Chem. Phys., 2004, 18, 365–375 CrossRef .
- I. Mayer, Int. J. QuantumChem., 2002, 90, 63–65 CrossRef CAS .
- S. R. White, Phys. Rev. Lett., 1992, 69, 2863–2866 CrossRef .
- R. P. Feynman, Phys. Rev., 1939, 56, 340–343 CrossRef CAS .
- Q. Sun, J. Comput. Chem., 2015, 36, 1664–1671 CrossRef CAS .
- Q. Sun, T. C. Berkelbach, N. S. Blunt, G. H. Booth, S. Guo, Z. Li, J. Liu, J. D. McClain, E. R. Sayfutyarova, S. Sharma, S. Wouters and G. K.-L. Chan, Wiley Interdiscip. Rev.: Comput. Mol. Sci., 2018, 8, e1340 Search PubMed .
- Q. Sun, X. Zhang, S. Banerjee, P. Bao, M. Barbry, N. S. Blunt, N. A. Bogdanov, G. H. Booth, J. Chen, Z.-H. Cui, J. J. Eriksen, Y. Gao, S. Guo, J. Hermann, M. R. Hermes, K. Koh, P. Koval, S. Lehtola, Z. Li, J. Liu, N. Mardirossian, J. D. McClain, M. Motta, B. Mussard, H. Q. Pham, A. Pulkin, W. Purwanto, P. J. Robinson, E. Ronca, E. R. Sayfutyarova, M. Scheurer, H. F. Schurkus, J. E. T. Smith, C. Sun, S.-N. Sun, S. Upadhyay, L. K. Wagner, X. Wang, A. White, J. D. Whitfield, M. J. Williamson, S. Wouters, J. Yang, J. M. Yu, T. Zhu, T. C. Berkelbach, S. Sharma, A. Y. Sokolov and G. K.-L. Chan, J. Chem. Phys., 2020, 153, 024109 CrossRef CAS PubMed .
- B. Bamieh, A Tutorial on Matrix Perturbation Theory (Using Compact Matrix Notation), 2022 Search PubMed .
- M. Barbatti, J. Chem. Theory Comput., 2021, 17, 3010–3018 CrossRef CAS PubMed .
- M. Barbatti, M. Ruckenbauer, F. Plasser, J. Pittner, G. Granucci, M. Persico and H. Lischka, Wiley Interdiscip. Rev. Comput. Mol. Sci., 2014, 4, 26–33 CrossRef CAS .
- M. Barbatti, M. Bondanza, R. Crespo-Otero, B. Demoulin, P. O. Dral, G. Granucci, F. Kossoski, H. Lischka, B. Mennucci, S. Mukherjee, M. Pederzoli, M. Persico, M. Pinheiro Jr, J. Pittner, F. Plasser, E. Sangiogo Gil and L. Stojanovic, J. Chem. Theory Comput., 2022, 18, 6851–6865 CrossRef CAS PubMed .
- I. F. Galván, M. Vacher, A. Alavi, C. Angeli, F. Aquilante, J. Autschbach, J. J. Bao, S. I. Bokarev, N. A. Bogdanov, R. K. Carlson, L. F. Chibotaru, J. Creutzberg, N. Dattani, M. G. Delcey, S. S. Dong, A. Dreuw, L. Freitag, L. M. Frutos, L. Gagliardi, F. Gendron, A. Giussani, L. González, G. Grell, M. Guo, C. E. Hoyer, M. Johansson, S. Keller, S. Knecht, G. Kovačević, E. Källman, G. Li Manni, M. Lundberg, Y. Ma, S. Mai, J. P. Malhado, P. Å. Malmqvist, P. Marquetand, S. A. Mewes, J. Norell, M. Olivucci, M. Oppel, Q. M. Phung, K. Pierloot, F. Plasser, M. Reiher, A. M. Sand, I. Schapiro, P. Sharma, C. J. Stein, L. K. Sørensen, D. G. Truhlar, M. Ugandi, L. Ungur, A. Valentini, S. Vancoillie, V. Veryazov, O. Weser, T. A. Wesołowski, P.-O. Widmark, S. Wouters, A. Zech, J. P. Zobel and R. Lindh, J. Chem. Theory Comput., 2019, 15, 5925–5964 CrossRef .
- J. C. Butcher, J. Assoc. Comput. Mach., 1965, 12, 124–135 CrossRef .
- M. Motta, D. M. Ceperley, G. K.-L. Chan, J. A. Gomez, E. Gull, S. Guo, C. A. Jiménez-Hoyos, T. N. Lan, J. Li, F. Ma, A. J. Millis, N. V. Prokof’ev, U. Ray, G. E. Scuseria, S. Sorella, E. M. Stoudenmire, Q. Sun, I. S. Tupitsyn, S. R. White, D. Zgid and S. Zhang, Phys. Rev. X, 2017, 7, 031059 Search PubMed .
- M. Motta, C. Genovese, F. Ma, Z.-H. Cui, R. Sawaya, G. K.-L. Chan, N. Chepiga, P. Helms, C. Jiménez-Hoyos, A. J. Millis, U. Ray, E. Ronca, H. Shi, S. Sorella, E. M. Stoudenmire, S. R. White and S. Zhang, Phys. Rev. X, 2020, 10, 031058 CAS .
- L. Freitag, Y. Ma, A. Baiardi, S. Knecht and M. Reiher, J. Chem. Theory Comput., 2019, 15, 6724–6737 CrossRef .
- Y. Yao, K.-W. Sun, Z. Luo and H. Ma, J. Phys. Chem. Lett., 2018, 9, 413–419 CrossRef PubMed .
- A. Baiardi and M. Reiher, J. Chem. Theory Comput., 2019, 15, 3481–3498 CrossRef PubMed .
- J. Ren, W. Li, T. Jiang, Y. Wang and Z. Shuai, Wiley Interdiscip. Rev. Comput. Mol. Sci., 2022, 12, e1614 CrossRef .
- H. Zhai, H. R. Larsson, S. Lee, Z.-H. Cui, T. Zhu, C. Sun, L. Peng, R. Peng, K. Liao, J. Tölle, J. Yang, S. Li and G. K.-L. Chan, J. Chem. Phys., 2023, 159, 234801 CrossRef .
- J. J. Dorando, J. Hachmann and G. K.-L. Chan, J. Chem. Phys., 2007, 127, 084109 CrossRef PubMed .

## Footnotes |

† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4fd00062e |

‡ It is possible to work directly with a non-orthogonal AO representation of the abstract many-electron representation of the training states, but this then requires a rotation of the many-body states at each geometry. This was explored in ref. 74, but entails exponential complexity for each inference and therefore it primarily motivated application for quantum computers. |

This journal is © The Royal Society of Chemistry 2024 |