How AI is Revolutionizing the Discovery of Materials

Materials Changed Societies and Enabled New Technology: Stone → Bronze → Iron → … → Silicon Age
wikimedia.org, Google images | 2

Materials Discovery Precursor to Progress in Society
14 Grand Challenges for Humanity in the 21st Century
www.engineeringchallenges.org | 3

How are Materials Discovered?

Stainless steel, vulcanized rubber (car tires), Teflon, Play-doh, Saccharin, Super Glue,…

Edisonian (Trial and Error) Approach: Edison tested over 6,000 plant materials to find the final light bulb filament

wikimedia.org, Google images | 4

Materials-Discovery over Time

Of all (solid state) materials that we know of today, how many were discovered in the last 10 years?

pollev.com/peterschindler

| 5

Materials-Discovery over Time

Materials in ICSD: doubles every ~22 yrs

Top 500 HPC: doubles every ~1.3 yrs

~33%

1st: Empirical Science (Experiments)

2nd: Model-based Science (Physical Laws)

3rd: Computational Science (DFT, MD)

4th: Data-driven Science (ML, Clustering)
| 6

The 3rd Paradigm: Computational Discovery

“ab initio”

Structural

Lattice constants

Bond length

Mechanical

Bulk modulus

Stress tensor

Optical / Electrical

Dielectric constant

Absorption spectra

Density of states

Band structure

Surface

Work function

Surface/cleavage energy

Adsorption energy

Magnetic

Magnetic ordering

Magnetic moment

Thermodynamic

Vibrational entropy

Phase stability (Hull diagram)

σij
| 7

Time Required for Experiment vs. Computation

Experiments (Synthesis) ~ weeks to months (to a Ph.D.)

First principles calculations ~ hours to days (to weeks)

Still too long to screen >100,000 candidates

Discovery Cluster

Over 20,000 CPU Cores and Over 200 GPUs

Hosted at MGHPCC (90,000-square-foot facility)
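
A back-of-the-envelope estimate makes the screening bottleneck concrete. The sketch below is illustrative only: the candidate count and the core count come from the slides, while the cost per calculation (24 h on 32 cores) and the function name are assumptions chosen for the example.

```python
def screening_wall_time_days(n_candidates, hours_per_calc, cores_per_calc, total_cores):
    """Wall-clock days to run one calculation per candidate on a shared cluster."""
    concurrent_jobs = total_cores // cores_per_calc   # jobs that fit at once
    core_hours = n_candidates * hours_per_calc * cores_per_calc
    wall_hours = core_hours / (concurrent_jobs * cores_per_calc)
    return wall_hours / 24.0

# From the slides: >100,000 candidates and a cluster with over 20,000 CPU cores.
# Assumed for illustration: 24 h per first-principles calculation on 32 cores.
days = screening_wall_time_days(
    n_candidates=100_000, hours_per_calc=24.0, cores_per_calc=32, total_cores=20_000
)
print(f"~{days:.0f} days with the whole cluster dedicated to the screen")
```

Under these assumptions the screen would occupy the entire cluster for about 160 days, which is why a cheaper surrogate is attractive.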

| 8

The 4th Paradigm: Data-Driven Discovery

Physical & Chemical insight/intuition

Database

* = fingerprint = feature (vector)

Surrogate ML model

Hyperparameter optimization

Derived or measured properties

Adapted from L. Himanen, et al. Comp. Phys. Comm., 247, (2020).
| 9
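
The workflow on this slide (database, fingerprint, surrogate ML model, hyperparameter optimization, predicted property) can be sketched in a few lines. This is a minimal illustration, not the method from the cited paper: the fingerprints and the target property are synthetic, kernel ridge regression with a Gaussian kernel is an assumed stand-in for the surrogate model, and all function names are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian kernel between the rows of A and the rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def fit_krr(X, y, gamma, alpha):
    """Kernel ridge regression: solve (K + alpha * I) w = y."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + alpha * np.eye(len(X)), y)

def predict_krr(X_train, weights, X_new, gamma):
    return rbf_kernel(X_new, X_train, gamma) @ weights

# Toy "database": random 3-component fingerprints and a synthetic property.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
y = np.sin(2.0 * X[:, 0]) + X[:, 1] ** 2 - 0.5 * X[:, 2]

X_train, y_train = X[:120], y[:120]
X_val, y_val = X[120:160], y[120:160]
X_test, y_test = X[160:], y[160:]

# Hyperparameter optimization: keep the (gamma, alpha) with the lowest validation MAE.
best = None
for gamma in (0.1, 1.0, 10.0):
    for alpha in (1e-6, 1e-3, 1e-1):
        w = fit_krr(X_train, y_train, gamma, alpha)
        mae = np.abs(predict_krr(X_train, w, X_val, gamma) - y_val).mean()
        if best is None or mae < best[0]:
            best = (mae, gamma, alpha, w)

val_mae, gamma, alpha, weights = best
test_mae = np.abs(predict_krr(X_train, weights, X_test, gamma) - y_test).mean()
baseline_mae = np.abs(y_train.mean() - y_test).mean()  # always predict the mean
print(f"gamma={gamma}, alpha={alpha}, test MAE={test_mae:.3f}, baseline MAE={baseline_mae:.3f}")
```

Holding out a separate test set keeps the reported error independent of the hyperparameter search.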

High-Quality Data?

| 10

Types of Materials Data

Text

Scientific Literature

NLP and LLMs

Tshitoyan, V., et al. Nature 571, 95–98 (2019)
| 11

Types of Materials Data

Experimental

Spectra, Images

Micrographs

XRD, TEM, etc.

Text

Scientific Literature

NLP and LLMs

Oviedo, F., et al. Comput Mater 5, 60 (2019).

Ziatdinov, M., et al. Nat Mach Intell 4, (2022).
| 12

Types of Materials Data

Experimental Computational

Big Data

Atomistic simulations

Small Data

Materials properties

High-throughput computations of properties

Spectra, Images

Micrographs

XRD, TEM, etc.

Text

Scientific Literature

NLP and LLMs

| 13

ML Paradigms in Materials Science

Model-centric AI

How to change the model/architecture to improve performance?

Data-centric AI

How to systematically change the data (x/y) to improve performance?

Big Data “Good Data”

Shallow ML + feature engineering

Active Learning

Deep Learning

Transfer Learning

Data types each method is mapped onto: Computational (Big Data), Experimental (Small Data), Spectra/Images

Todorović, et al. npj Comput Mater 5, 35 (2019) Choudhary & DeCost, npj Comput Mater 7, 185 (2021)
| 14

Materials Descriptors

Examples of Data-Driven Discovery

Industrial Perspective

Crystal Structures vs. Molecules

Crystal

Coordinates

Atom types

Lattice vectors

Periodic

Molecule

Coordinates

Atom types

Non-periodic

Space group symmetry

Point group symmetry

E(3) invariant
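
The difference between the two representations can be sketched as data structures. This is a minimal illustration with hypothetical class names: both store atom types and coordinates, but the crystal also stores lattice vectors, so interatomic distances must account for periodic images.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Molecule:
    species: list            # atom types
    coords: np.ndarray       # Cartesian coordinates, shape (n_atoms, 3)

    def distance(self, i, j):
        return float(np.linalg.norm(self.coords[i] - self.coords[j]))

@dataclass
class Crystal:
    species: list            # atom types in the unit cell
    frac_coords: np.ndarray  # fractional coordinates, shape (n_atoms, 3)
    lattice: np.ndarray      # rows are the lattice vectors, shape (3, 3)

    def distance(self, i, j):
        """Shortest distance between atom i and any periodic image of atom j."""
        delta = self.frac_coords[j] - self.frac_coords[i]
        delta = delta - np.round(delta)          # wrap into [-0.5, 0.5]
        best = np.inf
        # Also search the neighbouring cells; sufficient unless the cell is strongly skewed.
        for shift in np.ndindex(3, 3, 3):
            image = delta + np.array(shift) - 1
            best = min(best, float(np.linalg.norm(image @ self.lattice)))
        return best

# Toy example: the same two atoms, once as a molecule and once in a 4 x 4 x 4 cubic cell.
molecule = Molecule(["A", "B"], np.array([[0.0, 0.0, 0.0], [3.6, 0.0, 0.0]]))
crystal = Crystal(["A", "B"], np.array([[0.0, 0.0, 0.0], [0.9, 0.0, 0.0]]), 4.0 * np.eye(3))

print(molecule.distance(0, 1), crystal.distance(0, 1))
```

In the periodic cell the nearest image of the second atom sits across the cell boundary, so the crystal distance is much shorter than the molecular one.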

| 16

Requirements for an Ideal Materials Descriptor

i. Meaningful and Universal (and fixed in number)

ii. Compact and Cheap(er) to Compute

iii. Invariant Under Crystal Symmetries (and atom permutations)

iv. Continuous (small change in atomic structure = small change in descriptor)

v. Reversible

vi. Unique

vii. Additive

viii. Uncorrelated
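
Requirements iii and iv can be checked numerically for a simple descriptor. The sketch below uses the sorted eigenvalue spectrum of the Coulomb matrix, a descriptor for non-periodic molecules that is chosen here only as an assumed example; it is not known to be the one used in the talk, and it does not satisfy every requirement in the list (it is not unique, for instance). The geometry is a water-like toy structure in arbitrary units.

```python
import numpy as np

def coulomb_eigenvalues(atomic_numbers, coords):
    """Sorted eigenvalues of the Coulomb matrix (M_ii = 0.5 Z_i^2.4, M_ij = Z_i Z_j / r_ij)."""
    Z = np.asarray(atomic_numbers, dtype=float)
    R = np.asarray(coords, dtype=float)
    dists = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)
    np.fill_diagonal(dists, 1.0)            # placeholder to avoid division by zero
    M = np.outer(Z, Z) / dists
    np.fill_diagonal(M, 0.5 * Z ** 2.4)     # overwrite the diagonal with the self-term
    return np.sort(np.linalg.eigvalsh(M))[::-1]

def random_rotation(rng):
    """Random proper rotation from the QR decomposition of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1.0
    return Q

rng = np.random.default_rng(1)
Z = np.array([8, 1, 1])
R = np.array([[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]])

reference = coulomb_eigenvalues(Z, R)

# iii. Invariance: rotate, translate, and permute the atoms.
perm = np.array([2, 0, 1])
R_moved = (R @ random_rotation(rng).T + np.array([1.0, -2.0, 0.5]))[perm]
moved = coulomb_eigenvalues(Z[perm], R_moved)

# iv. Continuity: a small displacement changes the descriptor only slightly.
R_nudged = R + 1e-4 * rng.normal(size=R.shape)
nudged = coulomb_eigenvalues(Z, R_nudged)

print(np.allclose(reference, moved), np.abs(reference - nudged).max())
```

The descriptor length here equals the number of atoms, so requirement i (fixed in number) would need padding or a different construction for datasets with varying system sizes.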

Musil et al., Chem. Rev. 2021 | 17

Hierarchy of Materials Descriptors

| 18

Graph Convolutional NNs

CGCNN

Crystal Graph Convolutional NN

ALIGNN

Atomistic Line Graph NN

Others: M3GNet, SchNet, PointNet, PaiNN, DimeNet++, …

Invariant to E(3)

| 19

E(3) Equivariant GNNs

Requires data augmentation

(inefficient & not physical)

No additional data required

Improved transferability and data efficiency
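
What equivariance means in practice can be checked numerically. The sketch below is an assumed toy example, not a GNN: forces derived from a pair potential rotate together with the structure, while a generic linear map on raw coordinates (hypothetical `naive_forces`) has no such symmetry built in and would have to learn it from rotated copies of the data.

```python
import numpy as np

def pair_energy_and_forces(coords):
    """Toy pair potential E = sum_{i<j} exp(-r_ij); forces are F_i = -dE/dr_i."""
    diff = coords[:, None, :] - coords[None, :, :]      # r_i - r_j
    dist = np.linalg.norm(diff, axis=-1)
    off_diag = ~np.eye(len(coords), dtype=bool)
    energy = 0.5 * np.exp(-dist[off_diag]).sum()        # each pair is counted twice
    safe = np.where(off_diag, dist, 1.0)                # avoid 0/0 on the diagonal
    weights = np.where(off_diag, np.exp(-dist) / safe, 0.0)
    forces = (weights[:, :, None] * diff).sum(axis=1)   # repulsive, along r_i - r_j
    return energy, forces

def naive_forces(coords, W):
    """Generic linear map on raw coordinates: no symmetry built in."""
    return (W @ coords.ravel()).reshape(coords.shape)

rng = np.random.default_rng(0)
coords = rng.normal(size=(4, 3))
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal matrix (rotation or reflection)
shift = np.array([0.3, -1.2, 2.0])
moved = coords @ Q.T + shift                   # apply an E(3) transformation

e_ref, f_ref = pair_energy_and_forces(coords)
e_moved, f_moved = pair_energy_and_forces(moved)

W = rng.normal(size=(12, 12))
naive_ref = naive_forces(coords, W)
naive_moved = naive_forces(moved, W)

print("energy invariant:    ", np.isclose(e_ref, e_moved))
print("forces equivariant:  ", np.allclose(f_moved, f_ref @ Q.T))
print("naive map equivariant:", np.allclose(naive_moved, naive_ref @ Q.T))
```

An equivariant architecture guarantees the second property by construction, which is the sense in which no additional (augmented) data is required.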

| 20

E(3) Equivariant GNNs

Interested in the math/CS details?

→ Prof. Robin Walters at Northeastern (Khoury College)

Prof. Boris Kozinsky and Dr. Simon Batzner
Batzner, S., et al. Nat. Comm. 13 (2022) | 21

“Affordable Accuracy”

| 22

HIV Capsid with 44 Million Atoms

| 23
Dr. Simon Batzner
Materials Descriptors

Examples of Data-Driven Discovery

Data-driven Discovery of High-Brightness Photocathodes
E.R. Antoniuk, Y. Yue, Y. Zhou, P. Schindler, et al. Physical Review B, 101 (2020)
E.R. Antoniuk, P. Schindler, et al. Advanced Materials, 33, 44 (2021)

Data-driven Discovery of Ultralow Work Function Materials
P. Schindler, et al. (in preparation), preprint: arXiv:2011.10905

Industrial Perspective

Majority of Energy Goes to Waste(-Heat)

| 25

Thermionic Energy Converter (TEC)

Anode

Vacuum gap

Cathode

Heat Input

• No moving parts

• Power output scales with area

| 26
Load

TEC Efficiency

| 27

High-Throughput DFT Workflow

| 28
P. Schindler, et al. (under preparation)

Work Function Database

| 29
P. Schindler, et al. (in preparation)

Model Performance with Physics-motivated Descriptors

Elemental features: E_ionization, n_Mendeleev, χ, 1/r

Structural features

200 features

15 features

~10^5 times faster than DFT
| 30
P. Schindler, et al. (in preparation)

Promising New Low Work Function Surfaces

After ionic relaxation: Discovery of metallic surfaces with WF < 1.5 eV:

CsScCl3, Hexagonal, (100) Surface

BaX [X=Si, Sn, Ge], Orthorhombic, (110) Surface

| 31
P. Schindler, et al. (in preparation)

Discovery of New High-Brightness Photocathodes for XFEL

Electron emission from Photocathodes Depends on Work Function

Work Function

Photocathode Brightness ∝ 1 / spread in transverse momentum of electrons

Intrinsic Emittance

E.R. Antoniuk, Y. Yue, Y. Zhou, P. Schindler, W.A. Schroeder, B. Dunham, P. Pianetta, T. Vecchione, & E.J. Reed. Generalizable DFT-based photoemission model for the accelerated development of photocathodes and other photoemissive devices. Physical Review B, 101 (2020).

| 33

Ab-initio Photoemission Model

| 34
E.R. Antoniuk, Y. Yue, Y. Zhou, P. Schindler, W.A. Schroeder, B. Dunham, P. Pianetta, T. Vecchione, & E.J. Reed. Physical Review B, 101 (2020).

Novel Ultra-bright and Air-Stable Photocathodes

Discovered through ML/DFT Driven Screening

11 materials with intrinsic emittance < 0.3 µm/mm

+ 3 air stable low intrinsic emittance materials M2O (M = Na, K, Rb)

E.R. Antoniuk, P. Schindler, W.A. Schroeder, B. Dunham, P. Pianetta, T. Vecchione, & E.J. Reed. Advanced Materials, 33, 44 (2021)

| 35

Materials Descriptors

Examples of Data-Driven Discovery

Industrial Perspective

“Materials Informatics” in Industry?

| 37

Industry Perspective: Aionics Inc.

Structure generator leveraging AI potentials to construct crystals, surfaces, molecules, etc.

Database of 10B+ candidates, searchable by physical properties, safety, supply chain, price, etc.

Synthesize New Formulations

AI platform to incorporate latest data, train new model, and guide next selection

Cloud-based DFT to compute properties of candidate

Source at Production Scale

Co-innovation partnerships with electrochemical materials manufacturers:

Aionics tools used internally to lead client’s in-house R&D

| 38

Physics-informed ML & ML-informed Physics

| 39
Acknowledgments

• The D2R2 Group members

• Reed Group at Stanford

• Aionics Inc.

• Mentors and collaborators

The late Prof. Evan Reed

Prof. Ricardo Baeza-Yates, Liz Roderick, and EAI Team!

Curious? Write me: p.schindler@northeastern.edu

www.d2r2group.com
| 40

Audience Questions – Peter Answers

Q: Can you say, perhaps again (or maybe you are about to say): in addition to crystallinity and structure, which and how many different bulk material properties can be predicted using these models? Are there any material properties that are particularly easy or difficult to predict from crystallinity and structure?

A: There is virtually no limit to which properties can be predicted with this framework, so long as the ground-truth technique (e.g., density functional theory) that is used to generate the training data can predict the property of interest with reasonable accuracy. For example, DFT cannot accurately predict excited-state properties, and hence the data generated would not be meaningful for creating an ML model.

Q: Is synthesizing these newly discovered materials easy, hard, … possible?

A: That’s a great question. If you use computational data to predict promising new materials candidates, the synthesis process itself is not easier or harder than before. However, the search space has been drastically reduced, and hence your time spent synthesizing has a much higher hit rate in terms of exhibiting the targeted property (given that one was indeed able to synthesize it). What I have not mentioned in my talk is that there are approaches in materials informatics to predict the synthesizability of new compounds and even predict potential pathways for synthesis. One paper in that field I can recommend (by a collaborator of mine): https://www.nature.com/articles/s41524-023-01114-4

| 41

Audience Questions – Peter Answers

Q: Great talk! The domain of applicability of these methods can become significantly broader if they can be extended to materials theory development. Can you talk about extending the framework toward searching for new materials behavior?

A: Interesting question! I take this question two ways: 1) Can physics directly be used to constrain materials ML? I have seen this in other fields (like fluid dynamics, where a “physics loss” is added in addition to the conventional mean-absolute-error loss) but not as much in materials science. I think the closest to this approach would be CHGNet (https://www.nature.com/articles/s42256-023-00716-3), where the magnetic moment of atoms is used to constrain the charge state of atoms in a crystal structure. Apart from that, I am not aware of many other examples, but this is likely a field that is rapidly evolving. 2) Can AI be used to develop new materials theory? If we think of the concept of “affordable accuracy” I mentioned in my talk: extending the time/length scales in which we can simulate with near ab-initio accuracy indeed enables new physics/materials insights. I have also seen approaches to learn Hamiltonians directly with ML and also learn accurate exchange-correlation functionals (https://www.nature.com/articles/s41467-020-17265-7).
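
The idea of adding a “physics loss” to a conventional mean-absolute-error loss can be sketched as follows. This is a generic illustration under assumed choices, not how CHGNet or any fluid-dynamics model is implemented: the physics term penalizes a net force on an isolated system, and the function names and the weight of 0.1 are hypothetical.

```python
import numpy as np

def mae_loss(pred, target):
    """Conventional data term: mean absolute error."""
    return float(np.abs(pred - target).mean())

def net_force_penalty(pred_forces):
    """Physics residual: forces on an isolated system should sum to zero."""
    return float(np.linalg.norm(pred_forces.sum(axis=0)))

def total_loss(pred_forces, target_forces, weight=0.1):
    """Data term plus weighted physics term."""
    return mae_loss(pred_forces, target_forces) + weight * net_force_penalty(pred_forces)

target = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])      # reference forces, sum to zero
physical = np.array([[0.9, 0.0, 0.0], [-0.9, 0.0, 0.0]])     # prediction with zero net force
unphysical = np.array([[1.1, 0.0, 0.0], [-0.9, 0.0, 0.0]])   # same MAE, net force of 0.2

print(total_loss(physical, target), total_loss(unphysical, target))
```

Both predictions have the same mean absolute error, so only the physics term distinguishes them.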

Q: Thanks for your lecture. We know that the integration of machine learning is the future of computational materials science. We note that machine learning alone will probably not be a solution to all problems, as it is fundamentally tied to the limitations of the training datasets. Please elaborate.

A: Yes, we are fundamentally limited by the training set. However, there are few limitations when it comes to creating new, larger/better datasets. This goes back to what I mentioned about the two paradigms: 1) More data and bigger models are better (similar to what is currently observed in LLMs) and 2) Smaller, but high-fidelity (or very domain-specific) data is better. Which paradigm is more feasible and successful depends on how much computing/data resources one has access to.

| 42
Thank You!
Publications and Contact Info: d2r2group.com
