The importance of sampling the dynamical modes: Reevaluating benchmarks for invariant and equivariant features of machine learning potentials for simulation of free energy landscapes

G Perez-Lemus and YN Xu and YZ Jin and PZ Rico and J de Pablo, JOURNAL OF CHEMICAL PHYSICS, 161, 244703 (2024).

DOI: 10.1063/5.0237399

Machine learning interatomic potentials (MLIPs) are rapidly gaining interest for molecular modeling, as they provide a balance between quantum-mechanical level descriptions of atomic interactions and reasonable computational efficiency. However, questions remain regarding the stability of simulations using these potentials, as well as the extent to which the learned potential energy function can be extrapolated safely. Past studies have encountered challenges when MLIPs are applied to classical benchmark systems. In this work, we show that some of these challenges are related to the characteristics of the training datasets, particularly the inefficient exploration of the dynamical modes and the inclusion of rigid constraints. We demonstrate that long stability in simulations with MLIPs can be achieved by generating unconstrained datasets using unbiased classical simulations, provided that the important dynamical modes are correctly sampled. In addition, we emphasize that in order to achieve precise energy predictions, it is important to resort to enhanced sampling techniques for dataset generation, and we demonstrate that safe extrapolation of MLIPs depends on judicious choices related to the system's underlying free energy landscape and the symmetry features embedded within the machine learning models.
