Abstract
Active learning, the field of machine learning (ML) dedicated to optimal experiment design, has played a part in science as far back as the 18th century, when Laplace used it to guide his discovery of celestial mechanics. In this work, we focus a closed-loop, active learning-driven autonomous system on another major challenge: the discovery of advanced materials against the exceedingly complex synthesis-processes-structure-property landscape. We demonstrate an autonomous materials discovery methodology for functional inorganic compounds which allows scientists to fail smarter, learn faster, and spend fewer resources in their studies, while simultaneously improving trust in scientific results and machine learning tools. This robot science enables science-over-the-network, reducing the economic impact of scientists being physically separated from their labs. The real-time closed-loop, autonomous system for materials exploration and optimization (CAMEO) is implemented at the synchrotron beamline to accelerate the interconnected tasks of phase mapping and property optimization, with each cycle taking seconds to minutes. We also demonstrate an embodiment of human-machine interaction, where a human in the loop is called to play a contributing role within each cycle. This work has resulted in the discovery of a novel epitaxial nanocomposite phase-change memory material.
Abstract
Superconductivity has been the focus of enormous research effort since its discovery more than a century ago. Yet, some features of this unique phenomenon remain poorly understood; prime among these is the connection between superconductivity and chemical/structural properties of materials. To bridge the gap, several machine learning schemes are developed herein to model the critical temperatures ($T_c$) of the 12,000+ known superconductors available via the SuperCon database. Materials are first divided into two classes based on their $T_c$ values, above and below 10 K, and a classification model predicting this label is trained. The model uses coarse-grained features based only on the chemical compositions. It shows strong predictive power, with out-of-sample accuracy of about 92%. Separate regression models are developed to predict the values of $T_c$ for cuprate, iron-based, and low-$T_c$ compounds. These models also demonstrate good performance, with learned predictors offering potential insights into the mechanisms behind superconductivity in different families of materials. To improve the accuracy and interpretability of these models, new features are incorporated using materials data from the AFLOW Online Repositories. Finally, the classification and regression models are combined into a single integrated pipeline and employed to search the entire Inorganic Crystallographic Structure Database (ICSD) for potential new superconductors. We identify >30 non-cuprate and non-iron-based oxides as candidate materials.
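The labeling and featurization steps described above can be sketched as follows. This is a minimal illustration, not the authors' feature set: the helper names (`parse_formula`, `composition_features`, `tc_class`), the choice of atomic mass as the only elemental property, and the small mass table are assumptions made for the example. Only the 10 K threshold comes from the abstract.

```python
import math
import re

# Standard atomic masses (u) for a handful of elements; an illustrative
# subset, not the property tables used in the original study.
ATOMIC_MASS = {
    "Mg": 24.305, "B": 10.81, "Nb": 92.906, "Sn": 118.71, "Al": 26.982,
    "Pb": 207.2, "Y": 88.906, "Ba": 137.327, "Cu": 63.546, "O": 15.999,
}


def parse_formula(formula):
    """Split a simple formula such as 'YBa2Cu3O7' into element counts."""
    counts = {}
    for symbol, amount in re.findall(r"([A-Z][a-z]?)(\d*\.?\d*)", formula):
        counts[symbol] = counts.get(symbol, 0.0) + (float(amount) if amount else 1.0)
    return counts


def composition_features(formula):
    """Coarse-grained features computed from the composition alone (assumed set)."""
    counts = parse_formula(formula)
    total = sum(counts.values())
    fractions = {el: n / total for el, n in counts.items()}
    mean = sum(f * ATOMIC_MASS[el] for el, f in fractions.items())
    var = sum(f * (ATOMIC_MASS[el] - mean) ** 2 for el, f in fractions.items())
    return {"n_elements": len(counts), "mean_mass": mean, "std_mass": math.sqrt(var)}


def tc_class(tc_kelvin, threshold=10.0):
    """Binary label used for classification: above or below the 10 K threshold."""
    return "above_10K" if tc_kelvin > threshold else "below_10K"


if __name__ == "__main__":
    # Approximate textbook critical temperatures, not SuperCon records.
    for formula, tc in [("MgB2", 39.0), ("Nb3Sn", 18.3), ("Pb", 7.2), ("Al", 1.2)]:
        print(formula, tc_class(tc), composition_features(formula))
```

A classifier would then be trained on the feature dictionaries with `tc_class` as the target; the regression stage would be fitted separately within each material family.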
The structural solution problem can be a daunting and time‐consuming task. Especially in the presence of impurity phases, current methods, such as indexing, become more unstable. In this work, the novel approach of semi‐supervised learning is applied towards the problem of identifying the Bravais lattice and the space group of inorganic crystals. The reported semi‐supervised generative deep‐learning model can train on both labeled data, i.e. diffraction patterns with the associated crystal structure, and unlabeled data, i.e. diffraction patterns that lack this information. This approach allows the models to take advantage of the troves of unlabeled data that current supervised learning approaches cannot, which should result in models that can more accurately generalize to real data. In this work, powder diffraction patterns are classified into all 14 Bravais lattices and 144 space groups (the number is limited due to sparse coverage in crystal structure databases), which covers more crystal classes than other studies. The reported models also outperform current deep‐learning approaches for both space group and Bravais lattice classification while using less training data.
A semi‐supervised model to predict crystal structures from powder neutron diffraction patterns has been developed. The models have higher accuracies than current approaches while covering more space groups.
With their ability to rapidly elucidate composition-structure-property relationships, high-throughput experimental studies have revolutionized how materials are discovered, optimized, and commercialized. It is now possible to synthesize and characterize high-throughput libraries that systematically address thousands of individual cuts of fabrication parameter space. Transforming structural characterization data into phase mappings remains an unresolved issue. This difficulty is related to the complex information present in diffraction and spectroscopic data and its variation with composition and processing. We review the field of automated phase diagram attribution and discuss the impact that emerging computational approaches will have in the generation of phase diagrams and beyond.
Abstract
Machine learning techniques have proven invaluable for managing the ever-growing volume of materials research data produced as developments continue in high-throughput materials simulation, fabrication, and characterization. In particular, machine learning techniques have demonstrated their utility in rapidly and automatically identifying potential composition–phase maps from structural characterization data of composition spread libraries, enabling rapid materials fabrication-structure-property analysis and functional materials discovery. A key issue in the development of an automated phase-diagram determination method is the choice of dissimilarity measure, or kernel function. The desired measure reduces the impact of confounding structural data issues on analysis performance. These issues include peak height changes and peak shifting due to lattice constant change as a function of composition. In this work, we investigate the choice of dissimilarity measure in X-ray diffraction-based structure analysis and how that choice affects the performance of automatic composition–phase map determination. Nine dissimilarity measures are investigated for their impact in analyzing X-ray diffraction patterns for a Fe–Co–Ni ternary alloy composition spread. The cosine, Pearson correlation coefficient, and Jensen–Shannon divergence measures are shown to provide the best performance in the presence of peak height change and peak shifting (due to lattice constant change) when the magnitude of peak shifting is unknown. With prior knowledge of the maximum peak shifting, dynamic time warping in a normalized constrained mode provides the best performance. This work also serves to demonstrate a strategy for rapid analysis of a large number of X-ray diffraction patterns in general, beyond data from combinatorial libraries.
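Three of the measures named above can be written down in a few lines. The sketch below uses the standard textbook definitions (one minus cosine similarity, one minus Pearson correlation, and base-2 Jensen–Shannon divergence on intensity-normalized patterns); it is not the authors' implementation, and the synthetic single-peak patterns and the helper `gaussian_peak` are assumptions made for illustration.

```python
import math


def cosine_dissimilarity(p, q):
    """1 - cosine similarity; insensitive to an overall intensity scale."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return 1.0 - dot / (norm_p * norm_q)


def pearson_dissimilarity(p, q):
    """1 - Pearson correlation, i.e. cosine dissimilarity of mean-centered data."""
    mean_p = sum(p) / len(p)
    mean_q = sum(q) / len(q)
    return cosine_dissimilarity([a - mean_p for a in p], [b - mean_q for b in q])


def jensen_shannon_divergence(p, q):
    """Base-2 JS divergence of two patterns normalized to unit total intensity."""
    total_p, total_q = sum(p), sum(q)
    p = [a / total_p for a in p]
    q = [b / total_q for b in q]
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0 and b > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def gaussian_peak(grid, center, height, width=0.5):
    """Synthetic single-peak diffraction pattern (illustrative only)."""
    return [height * math.exp(-0.5 * ((t - center) / width) ** 2) for t in grid]


two_theta = [40 + 0.05 * i for i in range(321)]   # 40 to 56 degrees
reference = gaussian_peak(two_theta, 45.0, 1.0)
scaled = gaussian_peak(two_theta, 45.0, 2.0)      # peak height change only
shifted = gaussian_peak(two_theta, 45.2, 1.0)     # small peak shift
other = gaussian_peak(two_theta, 52.0, 1.0)       # a different "phase"

for name, pattern in [("scaled", scaled), ("shifted", shifted), ("other", other)]:
    print(name,
          round(cosine_dissimilarity(reference, pattern), 4),
          round(pearson_dissimilarity(reference, pattern), 4),
          round(jensen_shannon_divergence(reference, pattern), 4))
```

In this toy setting, a pure height change leaves all three measures near zero, a small shift produces a small dissimilarity, and a peak at a different position produces a large one, which is the behavior the abstract asks of a good measure.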
Bayesian optimization (BO) is a well-developed machine learning (ML) field for black-box function optimization. In BO, a surrogate predictive model, here a Gaussian process, is used to approximate the black-box function. The estimated mean and uncertainty of the surrogate model are paired with an acquisition function to decide where to sample next. In this study, we applied this technique to known ferromagnetic thin-film materials such as $(\mathrm{Fe}_{100-y}\mathrm{Ga}_{y})_{1-x}\mathrm{B}_{x}$ ($x$ = 0–21 and $y$ = 9–17) to demonstrate optimization of structure-property relationships, specifically the effect of dopant concentration or stoichiometry on magnetostriction and ferromagnetic resonance linewidth. Our results demonstrated that BO can be deployed to optimize structure-property relationships in FeGaB and FeGaC thin films. We have shown through simulation that using BO methods to guide experiments reduced the number of samples required to statistically determine the maximum or minimum by 50% compared to traditional methods. Our results suggest that BO can be used to save time and resources when optimizing ferromagnetic films. This method is transferable to other ferromagnetic material structure-property relationships, providing an accessible implementation of ML for magnetic materials development.
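The loop described above (fit a Gaussian process, score candidates with an acquisition function, measure the best-scoring one) can be sketched in plain Python. Everything specific here is an assumption for illustration: the squared-exponential kernel and its length scale, the expected-improvement acquisition function, the candidate grid, and `toy_property`, a made-up single-peak curve standing in for a measured property. It is not FeGaB data and not the authors' code.

```python
import math


def rbf(a, b, length=0.15):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def solve_upper(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_posterior(xs, ys, x_new, noise=1e-5):
    """Posterior mean and standard deviation of the surrogate at x_new."""
    offset = sum(ys) / len(ys)  # constant mean, removed before fitting
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_upper(L, solve_lower(L, [y - offset for y in ys]))
    k_star = [rbf(a, x_new) for a in xs]
    mean = offset + sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    variance = max(1.0 - sum(t * t for t in v), 0.0)
    return mean, math.sqrt(variance)


def expected_improvement(mean, std, best):
    """Expected improvement over the best value seen so far (maximization)."""
    if std == 0.0:
        return 0.0
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (mean - best) * cdf + std * pdf


def toy_property(x):
    """Hypothetical property versus normalized dopant fraction (not real data)."""
    return math.exp(-((x - 0.62) / 0.12) ** 2)


def bayesian_optimize(objective, candidates, initial, n_iter=10):
    xs = list(initial)
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        best = max(ys)
        remaining = [c for c in candidates if c not in xs]
        # The surrogate is refitted per candidate for brevity, not efficiency.
        scores = [expected_improvement(*gp_posterior(xs, ys, c), best)
                  for c in remaining]
        x_next = remaining[scores.index(max(scores))]
        xs.append(x_next)
        ys.append(objective(x_next))
    return xs, ys


candidates = [i / 100 for i in range(101)]
xs, ys = bayesian_optimize(toy_property, candidates, initial=[0.0, 0.5, 1.0])
print("best sampled x:", xs[ys.index(max(ys))], "value:", round(max(ys), 4))
```

Expected improvement is only one common acquisition choice; an upper-confidence-bound rule would slot into `bayesian_optimize` in the same place.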
Today's cities generate tremendous amounts of data, thanks to a boom in affordable smart devices and sensors. The resulting big data creates opportunities to develop diverse sets of context-aware services and systems, ensuring smart city services are optimized to the dynamic city environment. Critical resources in these smart cities will be more rapidly deployed to regions in need, and those regions predicted to have an imminent or prospective need. For example, crime data analytics may be used to optimize the distribution of police, medical, and emergency services. However, as smart city services become dependent on data, they also become susceptible to disruptions in data streams, such as data loss due to signal quality reduction or due to power loss during data collection. This paper presents a dynamic network model for improving service resilience to data loss. The network model identifies statistically significant shared temporal trends across multivariate spatiotemporal data streams and utilizes these trends to improve data prediction performance in the case of data loss. Dynamics also allow the system to respond to changes in the data streams such as the loss or addition of new information flows. The network model is demonstrated on city-based crime rates reported in Montgomery County, MD, USA. A resilient network is developed utilizing shared temporal trends between cities to provide improved crime rate prediction and robustness to data loss, compared with the use of single city-based auto-regression. A maximum improvement in performance of 7.8% for Silver Spring is found and an average improvement of 5.6% among cities with high crime rates. The model also correctly identifies all the optimal network connections, according to prediction error minimization. City-to-city distance is designated as a predictor of shared temporal trends in crime and weather is shown to be a strong predictor of crime in Montgomery County.
Thin film libraries of Fe-Co-V were fabricated by combinatorial sputtering to study magnetic and structural properties over wide ranges of composition and thickness by high-throughput methods: synchrotron X-ray diffraction, magnetometry, composition, and thickness were measured across the Fe-Co-V libraries. In-plane magnetic hysteresis loops were shown to have a coercive field of 23.9 kA m$^{-1}$ (300 G) and magnetization of 1000 kA m$^{-1}$. The out-of-plane direction revealed enhanced coercive fields of 207 kA m$^{-1}$ (2.6 kG), which was attributed to the shape anisotropy of columnar grains observed with electron microscopy. Angular dependence of the switching field showed that the magnetization reversal mechanism is governed by 180° domain wall pinning. In the thickness-dependent combinatorial study, co-sputtered composition spreads had a thickness ranging from 50 to 500 nm and (Fe$_{70}$Co$_{30}$)$_{100-x}$V$_{x}$ compositions of x = 2-80. Comparison of high-throughput magneto-optical Kerr effect and traditional vibrating sample magnetometer measurements shows agreement of trends in coercive fields across large composition and thickness regions.
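The paired SI and CGS coercive-field values quoted above can be checked with the standard conversion 1 Oe = 1000/(4π) A m$^{-1}$. The sketch below assumes the parenthetical gauss values are meant as field strengths in oersted (the two are numerically equal in free space); the function name is illustrative.

```python
import math


def oersted_to_ka_per_m(field_oe):
    """Convert a magnetic field strength from Oe to kA/m (1 Oe = 1000/(4*pi) A/m)."""
    amperes_per_metre = field_oe * 1000.0 / (4.0 * math.pi)
    return amperes_per_metre / 1000.0


in_plane = oersted_to_ka_per_m(300.0)       # quoted as 23.9 kA/m
out_of_plane = oersted_to_ka_per_m(2600.0)  # quoted as 207 kA/m
print(round(in_plane, 1), round(out_of_plane, 1))
```

Both quoted SI values agree with the CGS figures to the stated precision.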
Abstract
Analyzing large X-ray diffraction (XRD) datasets is a key step in high-throughput mapping of the compositional phase diagrams of combinatorial materials libraries. Optimizing and automating this task can help accelerate the process of discovery of materials with novel and desirable properties. Here, we report a new method for pattern analysis and phase extraction of XRD datasets. The method expands the Nonnegative Matrix Factorization method, which has been used previously to analyze such datasets, by combining it with custom clustering and cross-correlation algorithms. This new method is capable of robust determination of the number of basis patterns present in the data which, in turn, enables straightforward identification of any possible peak-shifted patterns. Peak-shifting arises due to continuous change in the lattice constants as a function of composition and is ubiquitous in XRD datasets from composition spread libraries. Successful identification of the peak-shifted patterns allows proper quantification and classification of the basis XRD patterns, which is necessary in order to decipher the contribution of each unique single-phase structure to the multi-phase regions. The process can be utilized to determine accurately the compositional phase diagram of a system under study. The presented method is applied to one synthetic and one experimental dataset and demonstrates robust accuracy and identification abilities.
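For orientation, the plain Nonnegative Matrix Factorization step that the method builds on can be sketched with the standard Lee–Seung multiplicative updates. This covers only the baseline factorization; the custom clustering and cross-correlation stages described above are not reproduced. The synthetic two-phase dataset, the fixed rank of 2, and all function names are assumptions made for the example.

```python
import math
import random


def matmul(A, B):
    """Dense matrix product for lists of lists."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    return [list(row) for row in zip(*A)]


def frobenius_error(X, W, H):
    """Frobenius norm of the residual X - W H."""
    WH = matmul(W, H)
    return math.sqrt(sum((x - y) ** 2
                         for rx, ry in zip(X, WH) for x, y in zip(rx, ry)))


def nmf(X, rank, iterations=200, seed=0, eps=1e-12):
    """Factor X (samples x angles) into W (abundances) and H (basis patterns)."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T X) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H


def peak(center, n_points=61, width=3.0):
    """Synthetic single-peak pattern on an integer grid (illustrative only)."""
    return [math.exp(-0.5 * ((i - center) / width) ** 2) for i in range(n_points)]


# Eleven synthetic patterns that mix two single-phase patterns linearly.
phase_a, phase_b = peak(20), peak(40)
X = [[(1 - i / 10) * a + (i / 10) * b for a, b in zip(phase_a, phase_b)]
     for i in range(11)]

W, H = nmf(X, rank=2)
print("residual:", round(frobenius_error(X, W, H), 4))
```

Each row of `H` is a candidate basis pattern and each row of `W` gives its weight in one measured pattern. Plain NMF has no notion of peak shifting, which is the gap the clustering and cross-correlation steps are meant to close.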