Atomistic machine learning with the LAMMPS-FitSNAP ecosystem
Machine learning (ML) enables interatomic potentials that promise the accuracy of first-principles methods while retaining the low cost and parallel efficiency of empirical potentials. ML potentials traditionally use atom-centered descriptors as inputs, but different models, such as linear regression and neural networks, can map these descriptors to atomic energies and forces. This raises the question: how much does model complexity improve accuracy, irrespective of the choice of descriptors? We curate three datasets to investigate this question in terms of ab initio energy and force errors: (1) solid and liquid silicon, (2) gallium nitride, and (3) the superionic conductor LGPS. We further investigate how these errors propagate to properties simulated with these models and assess whether improvements in fitting error correspond to measurable improvements in property prediction. Since linear and nonlinear regression models have different advantages and disadvantages, the results presented here can help researchers choose models for their particular applications. Across models, we observe correlations between fitting-quantity error (e.g., atomic force error) and simulated-property error with respect to ab initio values. The tools needed to perform these analyses are implemented in the LAMMPS-FitSNAP ecosystem, aiding researchers in determining the level of accuracy, and hence model complexity, needed for their systems of interest.
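The distinction between linear and nonlinear models over shared descriptors can be sketched as follows. This is a minimal illustration, not FitSNAP's actual API: the descriptor array, configuration counts, and network sizes are all hypothetical stand-ins, with random data in place of real atom-centered descriptors and ab initio energies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 50 configurations, 8 atoms each, 4 descriptors per atom.
n_conf, n_atoms, n_desc = 50, 8, 4
B = rng.normal(size=(n_conf, n_atoms, n_desc))   # per-atom descriptors
E = rng.normal(size=n_conf)                      # reference total energies

# Linear model: each atomic energy is a dot product of that atom's
# descriptors with one coefficient vector beta, so the total energy is
# linear in the summed descriptors and beta can be fit by least squares.
A = B.sum(axis=1)                                # (n_conf, n_desc)
beta, *_ = np.linalg.lstsq(A, E, rcond=None)
E_lin = A @ beta                                 # predicted total energies

# Nonlinear model: the same descriptors pass through a small neural
# network; per-atom outputs are summed to give the total energy.
def nn_energy(B, W1, b1, w2):
    h = np.tanh(B @ W1 + b1)                     # hidden layer, per atom
    return (h @ w2).sum(axis=1)                  # sum of atomic energies

W1 = rng.normal(size=(n_desc, 16)) * 0.1         # untrained toy weights
b1 = np.zeros(16)
w2 = rng.normal(size=16) * 0.1
E_nn = nn_energy(B, W1, b1, w2)
```

Both models consume identical descriptor inputs, which is what allows the accuracy comparison in this work to isolate model complexity from descriptor choice.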