Abstract
Active learning—the field of machine learning (ML) dedicated to optimal experiment design—has played a part in science as far back as the 18th century when Laplace used it to guide his ...discovery of celestial mechanics. In this work, we focus a closed-loop, active learning-driven autonomous system on another major challenge, the discovery of advanced materials against the exceedingly complex synthesis-processes-structure-property landscape. We demonstrate an autonomous materials discovery methodology for functional inorganic compounds which allow scientists to fail smarter, learn faster, and spend less resources in their studies, while simultaneously improving trust in scientific results and machine learning tools. This robot science enables science-over-the-network, reducing the economic impact of scientists being physically separated from their labs. The real-time closed-loop, autonomous system for materials exploration and optimization (CAMEO) is implemented at the synchrotron beamline to accelerate the interconnected tasks of phase mapping and property optimization, with each cycle taking seconds to minutes. We also demonstrate an embodiment of human-machine interaction, where human-in-the-loop is called to play a contributing role within each cycle. This work has resulted in the discovery of a novel epitaxial nanocomposite phase-change memory material.
Abstract
The Joint Automated Repository for Various Integrated Simulations (JARVIS) is an integrated infrastructure to accelerate materials discovery and design using density functional theory (DFT), ...classical force-fields (FF), and machine learning (ML) techniques. JARVIS is motivated by the Materials Genome Initiative (MGI) principles of developing open-access databases and tools to reduce the cost and development time of materials discovery, optimization, and deployment. The major features of JARVIS are: JARVIS-DFT, JARVIS-FF, JARVIS-ML, and JARVIS-tools. To date, JARVIS consists of ≈40,000 materials and ≈1 million calculated properties in JARVIS-DFT, ≈500 materials and ≈110 force-fields in JARVIS-FF, and ≈25 ML models for material-property predictions in JARVIS-ML, all of which are continuously expanding. JARVIS-tools provides scripts and workflows for running and analyzing various simulations. We compare our computational data to experiments or high-fidelity computational methods wherever applicable to evaluate error/uncertainty in predictions. In addition to the existing workflows, the infrastructure can support a wide variety of other technologically important applications as part of the data-driven materials design paradigm. The JARVIS datasets and tools are publicly available at the website:
https://jarvis.nist.gov
.
Manual attribution of crystallographic phases from high-throughput x-ray diffraction studies is an arduous task, and represents a rate-limiting step in high-throughput exploration of new materials. ...Here, we demonstrate a semi-supervised machine learning technique, SS-AutoPhase, which uses a two-step approach to identify automatically phases from diffraction data. First, clustering analysis is used to select a representative subset of samples automatically for human analysis. Second, an AdaBoost classifier uses the labeled samples to identify the presence of the different phases in diffraction data. SS-AutoPhase was used to identify the metallographic phases in 278 diffraction patterns from a FeGaPd composition spread sample. The accuracy of SS-AutoPhase was >82.6% for all phases when 15% of the diffraction patterns were used for training. The SS-AutoPhase predicted phase diagram showed excellent agreement with human expert analysis. Furthermore it was able to determine and identify correctly a previously unreported phase.
With more than a hundred elements in the periodic table, a large number of potential new materials exist to address the technological and societal challenges we face today; however, without some ...guidance, searching through this vast combinatorial space is frustratingly slow and expensive, especially for materials strongly influenced by processing. We train a machine learning (ML) model on previously reported observations, parameters from physiochemical theories, and make it synthesis method-dependent to guide high-throughput (HiTp) experiments to find a new system of metallic glasses in the Co-V-Zr ternary. Experimental observations are in good agreement with the predictions of the model, but there are quantitative discrepancies in the precise compositions predicted. We use these discrepancies to retrain the ML model. The refined model has significantly improved accuracy not only for the Co-V-Zr system but also across all other available validation data. We then use the refined model to guide the discovery of metallic glasses in two additional previously unreported ternaries. Although our approach of iterative use of ML and HiTp experiments has guided us to rapid discovery of three new glass-forming systems, it has also provided us with a quantitatively accurate, synthesis method-sensitive predictor for metallic glasses that improves performance with use and thus promises to greatly accelerate discovery of many new metallic glasses. We believe that this discovery paradigm is applicable to a wider range of materials and should prove equally powerful for other materials and properties that are synthesis path-dependent and that current physiochemical theories find challenging to predict.
Extensive efforts to gather materials data have largely overlooked potential data redundancy. In this study, we present evidence of a significant degree of redundancy across multiple large datasets ...for various material properties, by revealing that up to 95% of data can be safely removed from machine learning training with little impact on in-distribution prediction performance. The redundant data is related to over-represented material types and does not mitigate the severe performance degradation on out-of-distribution samples. In addition, we show that uncertainty-based active learning algorithms can construct much smaller but equally informative datasets. We discuss the effectiveness of informative data in improving prediction performance and robustness and provide insights into efficient data acquisition and machine learning training. This work challenges the "bigger is better" mentality and calls for attention to the information richness of materials data rather than a narrow emphasis on data volume.
Insufficient availability of molten salt corrosion‐resistant alloys severely limits the fruition of a variety of promising molten salt technologies that could otherwise have significant societal ...impacts. To accelerate alloy development for molten salt applications and develop fundamental understanding of corrosion in these environments, here an integrated approach is presented using a set of high‐throughput (HTP) alloy synthesis, corrosion testing, and modeling coupled with automated characterization and machine learning. By using this approach, a broad range of CrFeMnNi alloys are evaluated for their corrosion resistances in molten salt simultaneously demonstrating that corrosion‐resistant alloy development can be accelerated by 2 to 3 orders of magnitude. Based on the obtained results, a sacrificial protection mechanism is unveiled in the corrosion of CrFeMnNi alloys in molten salts which can be applied to protect the less unstable elements in the alloy from being depleted, and provided new insights on the design of high‐temperature molten salt corrosion‐resistant alloys.
An integrated approach using a set of high‐throughput alloy synthesis, corrosion testing, and modeling coupled with automated characterization and machine learning is developed to accelerate the molten salt corrosion‐resistant alloy development for molten salt applications by 2 to 3 orders of magnitude.
The creation of composition-processing-structure relationships currently represents a key bottleneck for data analysis for high-throughput experimental (HTE) material studies. Here we propose an ...automated phase diagram attribution algorithm for HTE data analysis that uses a graph-based segmentation algorithm and Delaunay tessellation to create a crystal phase diagram from high throughput libraries of X-ray diffraction (XRD) patterns. We also propose the sample-pair based objective evaluation measures for the phase diagram prediction problem. Our approach was validated using 278 diffraction patterns from a Fe-Ga-Pd composition spread sample with a prediction precision of 0.934 and a Matthews Correlation Coefficient score of 0.823. The algorithm was then applied to the open Ni-Mn-Al thin-film composition spread sample to obtain the first predicted phase diagram mapping for that sample.
With their ability to rapidly elucidate composition-structure-property relationships, high-throughput experimental studies have revolutionized how materials are discovered, optimized, and ...commercialized. It is now possible to synthesize and characterize high-throughput libraries that systematically address thousands of individual cuts of fabrication parameter space. An unresolved issue remains transforming structural characterization data into phase mappings. This difficulty is related to the complex information present in diffraction and spectroscopic data and its variation with composition and processing. We review the field of automated phase diagram attribution and discuss the impact that emerging computational approaches will have in the generation of phase diagrams and beyond.