Contributed Talk

Natural Language to LAMMPS: LLMs as interfaces between researchers and scientific software


Juan Carlos Verduzco
Purdue University
Ethan Holbrook
Purdue University
Alejandro Strachan
Purdue University

Large language models (LLMs) offer new capabilities for bridging natural language and domain-specific languages in scientific computing. Here, we investigate the use of LLMs to translate English task descriptions into LAMMPS input scripts. Our workflow pairs the LLM-generated input scripts with our newly developed parser, which checks structural correctness, validates syntax, and assists in debugging. We systematically evaluate model performance across tasks of increasing complexity, from basic thermalization to multi-fix, multi-region configurations. For simple simulations, most GPT-generated input files are runnable and accurate; the remaining cases typically require only minor, easily correctable adjustments.
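
As an illustration of the simplest task class, a "basic thermalization" input of the kind targeted here might look like the minimal sketch below; the script structure follows standard LAMMPS usage, but the specific parameter values are illustrative assumptions rather than files from the evaluation set.

    # Minimal Lennard-Jones thermalization in reduced units (illustrative values)
    units        lj
    atom_style   atomic
    lattice      fcc 0.8442
    region       box block 0 10 0 10 0 10
    create_box   1 box
    create_atoms 1 box
    mass         1 1.0
    pair_style   lj/cut 2.5
    pair_coeff   1 1 1.0 1.0 2.5
    velocity     all create 1.0 87287
    fix          thermalize all nvt temp 1.0 1.0 0.5    # Nose-Hoover thermostat
    thermo       100
    timestep     0.005
    run          5000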

Common errors include mismatched pair style definitions, unsupported arguments, and improper formatting. In more complex setups, additional issues arise from the ordering of region and fix declarations (sketched below). Beyond file generation, the models also produce human-readable explanations that help users understand and document their simulations. While verification and validation remain the responsibility of the researcher, LLMs significantly reduce setup effort, support onboarding, and improve the clarity and reproducibility of computational studies.
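
The constructed snippets below sketch two of these error classes; the commands are standard LAMMPS syntax, but the specific values, potential file name, and region ID are hypothetical, chosen only to illustrate the failure modes described above.

    # (a) Mismatched pair style / coefficients: lj/cut-style numeric arguments
    #     supplied to a style that expects a potential file and element mapping
    pair_style   eam/alloy
    pair_coeff   1 1 1.0 1.0 2.5              # invalid for eam/alloy
    # pair_coeff * * AlCu.eam.alloy Al Cu     # correct form (file name illustrative)

    # (b) Ordering: a fix that references a region declared only afterwards
    fix          heat_src all heat 10 1.0 region hotspot   # 'hotspot' does not exist yet
    region       hotspot block 0 5 0 5 0 5                 # must be declared before the fix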