Yu-Hang Tang
Lawrence Berkeley National Laboratory

Design and Application of a Graph Kernel Library for Learning on Molecules

Machine learning has shown great promise in accelerating the discovery, synthesis, and characterization of new materials with desired properties. Molecular structures, when represented at the level of atoms and bonds, are inherently discrete and non-sequential. In this talk, I will present our recent work on the design, implementation, and application of a family of marginalized graph kernels that can directly consume structured discrete data and exploit topological information for similarity quantification, classification, and prediction of material properties. Using just-in-time compilation, we implemented a GPU-accelerated Python package for the marginalized graph kernel family. Because base kernels can be composed, the family admits a potentially unlimited number of 'flavors', and the package achieves speedups of hundreds of times over existing single-flavor CPU implementations. The package allows accurate predictive models on graphs to be trained within minutes. The kernels can also be used to implement active learning protocols on atomistic systems, where new samples are added on the fly, to predict various molecular properties.
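
To make the idea of a marginalized graph kernel with composable base kernels concrete, the following is a minimal sketch in plain NumPy. It is not the package's API or its GPU implementation. It assumes the classic random-walk formulation with a uniform starting distribution and a constant stopping probability q, and it solves the resulting linear system densely on the CPU. All names (kronecker_delta, square_exponential, tensor_product, marginalized_graph_kernel, path_graph) and the toy node features (element symbol, attached hydrogen count) are hypothetical and chosen only for illustration.

```python
import numpy as np


def kronecker_delta(h):
    """Base kernel on discrete labels: 1 if equal, otherwise h (0 <= h < 1)."""
    def k(a, b):
        return 1.0 if a == b else h
    return k


def square_exponential(length_scale):
    """Base kernel on scalar features."""
    def k(a, b):
        return float(np.exp(-0.5 * (a - b) ** 2 / length_scale ** 2))
    return k


def tensor_product(*kernels):
    """Compose base kernels: multiply one kernel per component of a feature tuple."""
    def k(x, y):
        out = 1.0
        for ki, xi, yi in zip(kernels, x, y):
            out *= ki(xi, yi)
        return out
    return k


def marginalized_graph_kernel(g1, g2, node_kernel, edge_kernel, q=0.1):
    """Expected similarity of simultaneous random walks on two graphs.

    Each graph is (node_features, adjacency, edge_features). Assumes every
    node has at least one neighbor, uniform start, stopping probability q.
    """
    (nodes1, A1, E1), (nodes2, A2, E2) = g1, g2
    n1, n2 = len(nodes1), len(nodes2)
    # Transition probabilities; each row sums to 1 - q.
    P1 = (1.0 - q) * A1 / A1.sum(axis=1, keepdims=True)
    P2 = (1.0 - q) * A2 / A2.sum(axis=1, keepdims=True)
    # Node base kernel over all node pairs (i, j), flattened as i * n2 + j.
    kv = np.array([node_kernel(a, b) for a in nodes1 for b in nodes2])
    # Product-graph weights: joint transition probability times edge kernel.
    W = np.zeros((n1 * n2, n1 * n2))
    for i, k in np.argwhere(A1 > 0):
        for j, l in np.argwhere(A2 > 0):
            W[i * n2 + j, k * n2 + l] = (
                P1[i, k] * P2[j, l] * edge_kernel(E1[i, k], E2[j, l])
            )
    # Solve R = kv * (q^2 + W R), i.e. (I - diag(kv) W) R = q^2 kv.
    R = np.linalg.solve(np.eye(n1 * n2) - kv[:, None] * W, q * q * kv)
    return float(R.mean())  # average over uniform starting pairs


def similarity(g1, g2, node_kernel, edge_kernel):
    """Cosine-normalized kernel value."""
    k12 = marginalized_graph_kernel(g1, g2, node_kernel, edge_kernel)
    k11 = marginalized_graph_kernel(g1, g1, node_kernel, edge_kernel)
    k22 = marginalized_graph_kernel(g2, g2, node_kernel, edge_kernel)
    return k12 / np.sqrt(k11 * k22)


def path_graph(nodes, bond_orders):
    """Build a chain graph; edge features are bond orders."""
    n = len(nodes)
    A, E = np.zeros((n, n)), np.zeros((n, n))
    for i, b in enumerate(bond_orders):
        A[i, i + 1] = A[i + 1, i] = 1.0
        E[i, i + 1] = E[i + 1, i] = b
    return nodes, A, E


# One "flavor": element label (delta kernel) times hydrogen count (SE kernel).
node_kernel = tensor_product(kronecker_delta(0.3), square_exponential(1.0))
edge_kernel = square_exponential(1.0)

# Toy heavy-atom graphs with (element, attached hydrogens) per node.
ethanol = path_graph([("C", 3), ("C", 2), ("O", 1)], [1.0, 1.0])
ethylamine = path_graph([("C", 3), ("C", 2), ("N", 2)], [1.0, 1.0])
ethanol_reversed = path_graph([("O", 1), ("C", 2), ("C", 3)], [1.0, 1.0])

print(similarity(ethanol, ethylamine, node_kernel, edge_kernel))
```

Swapping in different base kernels, or composing more of them with tensor_product, yields a different kernel without touching the solver, which is the sense in which composition multiplies the available 'flavors'. The dense solve here scales poorly with graph size and is meant only to show the structure of the computation.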