Accelerating materials research by integrating automation with artificial intelligence is increasingly recognized as a grand scientific challenge to discover and develop materials for emerging and future technologies. While the solid state materials science community has demonstrated a broad range of high throughput methods and effectively leveraged computational techniques to accelerate individual research tasks, revolutionary acceleration of materials discovery has yet to be fully realized. This perspective review presents a framework and ontology to outline a materials experiment lifecycle and visualize materials discovery workflows, providing a context for mapping the realized levels of automation and the next generation of autonomous loops in terms of scientific and automation complexity. Expanding autonomous loops to encompass larger portions of complex workflows will require integration of a range of experimental techniques as well as automation of expert decisions, including subtle reasoning about data quality, responses to unexpected data, and model design. Recent demonstrations of workflows that integrate multiple techniques and include autonomous loops, combined with emerging advancements in artificial intelligence and high throughput experimentation, signal the imminence of a revolution in materials discovery.
Integrating automation with artificial intelligence will enable scientists to spend more time identifying important problems and communicating critical insights, accelerating discovery and development of materials for emerging and future technologies.
The photocatalytic conversion of the greenhouse gas CO₂ to chemical fuels such as hydrocarbons and alcohols continues to be a promising technology for renewable generation of energy. Major advancements have been made in improving the efficiencies and product selectivities of currently known CO₂ reduction electrocatalysts; nonetheless, materials discovery is needed to enable economically viable, industrial-scale CO₂ reduction. We report here the largest CO₂ photocathode search to date, starting with 68860 candidate materials, using a rational first-principles computation-based screening strategy to evaluate synthesizability, corrosion resistance, visible-light absorption, and compatibility of the electronic structure with fuel synthesis. The results confirm the observation in the literature that few materials meet the stringent CO₂ photocathode requirements, with only 52 materials meeting all requirements. The results are well validated with respect to the literature: 9 of these materials have been studied for CO₂ reduction, and the remaining 43 are discoveries from our pipeline that merit further investigation.
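The staged screening described above can be pictured as a funnel of successive filters. The following is a minimal sketch only, not the pipeline used in the work: the field names, the threshold values, and the five placeholder candidates are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    label: str                  # placeholder name, not a real material
    e_above_hull: float         # eV/atom, hypothetical proxy for synthesizability
    aqueous_instability: float  # eV/atom, hypothetical proxy for corrosion resistance
    band_gap: float             # eV, used for the visible-light absorption check
    band_edge_ok: bool          # hypothetical flag for electronic-structure compatibility


# Every threshold below is an illustrative assumption, not a value from the source.
FILTERS: List[Tuple[str, Callable[[Candidate], bool]]] = [
    ("synthesizability", lambda c: c.e_above_hull <= 0.1),
    ("corrosion resistance", lambda c: c.aqueous_instability <= 0.5),
    ("visible-light absorption", lambda c: 1.0 <= c.band_gap <= 3.0),
    ("electronic structure", lambda c: c.band_edge_ok),
]


def run_funnel(candidates, filters=FILTERS):
    """Apply each filter in order and record how many candidates survive it."""
    survivors = list(candidates)
    counts = [("start", len(survivors))]
    for name, keep in filters:
        survivors = [c for c in survivors if keep(c)]
        counts.append((name, len(survivors)))
    return survivors, counts


# Made-up toy records; each of B to E fails exactly one stage.
toy = [
    Candidate("A", 0.00, 0.1, 2.0, True),
    Candidate("B", 0.30, 0.1, 2.0, True),
    Candidate("C", 0.05, 1.2, 2.0, True),
    Candidate("D", 0.02, 0.2, 4.1, True),
    Candidate("E", 0.00, 0.0, 1.8, False),
]
survivors, counts = run_funnel(toy)
for stage, n in counts:
    print(f"{stage}: {n}")
```

Ordering the filters from cheapest to most expensive property is a common design choice for such funnels, since later stages then run on fewer candidates; whether the original screening was ordered this way is not stated in the abstract.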
We present a first-principles-based formalism to provide a quantitative measure of the thermodynamic instability and propensity for electrochemical stabilization, passivation, or corrosion of metastable materials in aqueous media. We demonstrate that this formalism can assess the relative Gibbs free energy of candidate materials in aqueous media as well as their decomposition products, combining solid and aqueous phases, as a function of pH and potential. On the basis of benchmarking against 20 stable as well as metastable materials reported in the literature and also our experimental characterization of metastable triclinic-FeVO4, we present quantitative estimates for the relative Gibbs free energy and corresponding aqueous regimes where these materials are most likely to be stable, form inert passivating films, or steadily corrode to aqueous species. Furthermore, we show that the structure and composition of the passivating films formed on triclinic-FeVO4 are also in excellent agreement with the Point Defect Model, as proposed by the corrosion community. An open-source web application based on the formalism is made available at https://materialsproject.org.
High-throughput experimentation provides efficient mapping of composition–property relationships, and its implementation for the discovery of optical materials enables advancements in solar energy and other technologies. In a high throughput pipeline, automated data processing algorithms are often required to match experimental throughput, and we present an automated Tauc analysis algorithm for estimating band gap energies from optical spectroscopy data. The algorithm mimics the judgment of an expert scientist, which is demonstrated through its application to a variety of high throughput spectroscopy data, including the identification of indirect or direct band gaps in Fe2O3, Cu2V2O7, and BiVO4. The applicability of the algorithm to estimate a range of band gap energies for various materials is demonstrated by a comparison of direct-allowed band gaps estimated by expert scientists and by the automated algorithm for 60 optical spectra.
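For readers unfamiliar with Tauc analysis, the basic idea can be sketched as follows. This is not the automated algorithm reported above, whose details are not given in the abstract. It is a simplified illustration that takes the textbook Tauc relation, (αhν)^m proportional to (hν − E_g) with m = 2 for direct-allowed and m = 1/2 for indirect-allowed transitions, finds the steepest sliding-window linear fit, and extrapolates it to a zero baseline. The function names, the window size, and the synthetic spectra are assumptions.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def tauc_band_gap(energies_ev, absorption, exponent, window=15):
    """Estimate a band gap (eV) from the steepest linear segment of a Tauc plot.

    exponent: 2 for direct-allowed, 0.5 for indirect-allowed transitions.
    Returns None if no window has a positive slope.
    """
    # Tauc variable; negative absorption values are clipped to zero.
    tauc = [max(a * e, 0.0) ** exponent for e, a in zip(energies_ev, absorption)]
    best_slope, best_gap = 0.0, None
    for start in range(len(tauc) - window + 1):
        slope, intercept = linear_fit(
            energies_ev[start:start + window], tauc[start:start + window]
        )
        if slope > best_slope:
            # x-intercept of the fitted line, assuming a zero baseline
            best_slope, best_gap = slope, -intercept / slope
    return best_gap


# Synthetic, noise-free spectra with known gaps (illustrative only).
energies = [1.5 + 0.01 * i for i in range(201)]
direct = [max(e - 2.4, 0.0) ** 0.5 / e for e in energies]    # direct gap at 2.4 eV
indirect = [max(e - 2.0, 0.0) ** 2.0 / e for e in energies]  # indirect gap at 2.0 eV

gap_direct = tauc_band_gap(energies, direct, exponent=2)
gap_indirect = tauc_band_gap(energies, indirect, exponent=0.5)
print(round(gap_direct, 3), round(gap_indirect, 3))
```

On measured spectra the steepest-window heuristic is sensitive to noise and to sub-gap absorption, which is presumably where the expert-like judgment of a dedicated algorithm matters; a nonzero baseline would also need its own linear fit rather than the zero baseline assumed here.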
As the materials science community seeks to capitalize on recent advancements in computer science, the sparsity of well-labelled experimental data and limited throughput by which it can be generated have inhibited deployment of machine learning algorithms to date. Several successful examples in computational chemistry have inspired further adoption of machine learning algorithms, and in the present work we present autoencoding algorithms for measured optical properties of metal oxides, which can serve as an exemplar for the breadth and depth of data required for modern algorithms to learn the underlying structure of experimental materials science data. Our set of 178 994 distinct materials samples spans 78 distinct composition spaces, includes 45 elements, and contains more than 80 000 unique quinary oxide and 67 000 unique quaternary oxide compositions, making it the largest and most diverse experimental materials set utilized in machine learning studies. The extensive dataset enabled training and validation of 3 distinct models for mapping between sample images and absorption spectra, including a conditional variational autoencoder that generates images of hypothetical materials with tailored absorption properties. The absorption patterns auto-generated from sample images capture the salient features of ground truth spectra, and band gap energies extracted from these auto-generated patterns are quite accurate with a mean absolute error of 180 meV, which is the approximate uncertainty from traditional extraction of the band gap energy from measurements of the full transmission and reflection spectra. Optical properties of materials are not only ubiquitous in materials applications but also emblematic of the confluence of underlying physical phenomena yielding the type of complex data relationships that merit and benefit from neural network-type modelling.
Sequential learning (SL) strategies, i.e. iteratively updating a machine learning model to guide experiments, have been proposed to significantly accelerate materials discovery and research. Applications on computational datasets and a handful of optimization experiments have demonstrated the promise of SL, motivating a quantitative evaluation of its ability to accelerate materials discovery, specifically in the case of physical experiments. The benchmarking effort in the present work quantifies the performance of SL algorithms with respect to a breadth of research goals: discovery of any "good" material, discovery of all "good" materials, and discovery of a model that accurately predicts the performance of new materials. To benchmark the effectiveness of different machine learning models against these goals, we use datasets in which the performance of all materials in the search space is known from high-throughput synthesis and electrochemistry experiments. Each dataset contains all pseudo-quaternary metal oxide combinations from a set of six elements (chemical space); the performance metric chosen is the electrocatalytic activity (overpotential) for the oxygen evolution reaction (OER). A diverse set of SL schemes is tested on four chemical spaces, each containing 2121 catalysts. The presented work suggests that research can be accelerated by up to a factor of 20 compared to random acquisition in specific scenarios. The results also show that certain choices of SL models are ill-suited for a given research goal, resulting in substantial deceleration compared to random acquisition methods. The results provide quantitative guidance on how to tune an SL strategy for a given research goal and demonstrate the need for a new generation of materials-aware SL algorithms to further accelerate materials discovery.
Benchmarking metrics for materials discovery via sequential learning are presented, to assess the efficacy of existing algorithms and to be scientific in our assessment of accelerated science.
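To make the benchmarking idea concrete, the sketch below enumerates a six-element search space, runs a toy sequential-learning loop against a fully known landscape, and compares it with the expected cost of random acquisition. Everything beyond the abstract is an assumption: the 10 at% composition step (which does reproduce the quoted 2121 compositions with at most four elements), the synthetic overpotential function, the nearest-neighbour surrogate with a distance bonus, and the top-1% definition of a "good" material. None of these is claimed to match the schemes or metrics of the benchmarked study.

```python
import itertools
import random


def enumerate_compositions(n_elements=6, steps=10, max_components=4):
    """Grid compositions (fractions in units of 1/steps) with at most
    `max_components` nonzero elements."""
    space = []
    for head in itertools.product(range(steps + 1), repeat=n_elements - 1):
        last = steps - sum(head)
        if last < 0:
            continue
        parts = head + (last,)
        if sum(p > 0 for p in parts) <= max_components:
            space.append(tuple(p / steps for p in parts))
    return space


def toy_overpotential(x, optimum=(0.3, 0.0, 0.5, 0.2, 0.0, 0.0)):
    """Synthetic stand-in for a measured OER overpotential (V); lower is better."""
    return 0.3 + sum((a - b) ** 2 for a, b in zip(x, optimum))


def experiments_to_first_hit(space, values, good, seed=0, n_init=5, kappa=0.05):
    """Count experiments until any index in `good` (must be non-empty) is measured.

    Surrogate: each candidate is predicted by its nearest measured neighbour;
    acquisition picks the lowest prediction minus a distance bonus (exploration).
    """
    rng = random.Random(seed)
    queue = rng.sample(range(len(space)), n_init)  # random initial experiments
    nearest_d = [float("inf")] * len(space)
    nearest_v = [0.0] * len(space)
    measured = set()
    while True:
        if queue:
            i = queue.pop()
        else:
            i = min(
                (j for j in range(len(space)) if j not in measured),
                key=lambda j: nearest_v[j] - kappa * nearest_d[j],
            )
        measured.add(i)
        if i in good:
            return len(measured)
        # Incrementally update every candidate's nearest measured neighbour.
        for j, x in enumerate(space):
            d = sum((a - b) ** 2 for a, b in zip(x, space[i])) ** 0.5
            if d < nearest_d[j]:
                nearest_d[j], nearest_v[j] = d, values[i]


space = enumerate_compositions()
values = [toy_overpotential(x) for x in space]
n_good = max(1, len(space) // 100)  # top 1% counted as "good" (assumption)
good = set(sorted(range(len(space)), key=values.__getitem__)[:n_good])

n_sl = experiments_to_first_hit(space, values, good)
# Expected number of random draws (without replacement) to hit one of K in N.
n_random = (len(space) + 1) / (len(good) + 1)
acceleration = n_random / n_sl
print(len(space), n_sl, round(acceleration, 2))
```

The ratio printed last plays the role of an acceleration factor for the "any good material" goal. Values below 1 would correspond to the deceleration relative to random acquisition that the abstract warns about. The other two goals (finding all good materials, learning an accurate model) would need different stopping criteria.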
Machine learning for materials discovery has largely focused on predicting an individual scalar rather than multiple related properties, where spectral properties are an important example. Fundamental spectral properties include the phonon density of states (phDOS) and the electronic density of states (eDOS), which individually or collectively are the origins of a breadth of materials observables and functions. Building upon the success of graph attention networks for encoding crystalline materials, we introduce a probabilistic embedding generator specifically tailored to the prediction of spectral properties. Coupled with supervised contrastive learning, our materials-to-spectrum (Mat2Spec) model outperforms state-of-the-art methods for predicting ab initio phDOS and eDOS for crystalline materials. We demonstrate Mat2Spec's ability to identify eDOS gaps below the Fermi energy, validating predictions with ab initio calculations and thereby discovering candidate thermoelectrics and transparent conductors. Mat2Spec is an exemplar framework for predicting spectral properties of materials via strategically incorporated machine learning techniques.
Multimetallic nanoclusters (MMNCs) offer unique and tailorable surface chemistries that hold great potential for numerous catalytic applications. The efficient exploration of this vast chemical space necessitates an accelerated discovery pipeline that supersedes traditional “trial-and-error” experimentation while guaranteeing uniform microstructures despite compositional complexity. Herein, we report the high-throughput synthesis of an extensive series of ultrafine and homogeneous alloy MMNCs, achieved by 1) a flexible compositional design by formulation in the precursor solution phase and 2) the ultrafast synthesis of alloy MMNCs using thermal shock heating (i.e., ∼1,650 K, ∼500 ms). This approach is remarkably facile and easily accessible compared to conventional vapor-phase deposition, and the particle size and structural uniformity enable comparative studies across compositionally different MMNCs. Rapid electrochemical screening is demonstrated by using a scanning droplet cell, enabling us to discover two promising electrocatalysts, which we subsequently validated using a rotating disk setup. This demonstrated high-throughput material discovery pipeline presents a paradigm for facile and accelerated exploration of MMNCs for a broad range of applications.
The limited number of known low-band-gap photoelectrocatalytic materials poses a significant challenge for the generation of chemical fuels from sunlight. Using high-throughput ab initio theory with experiments in an integrated workflow, we find eight ternary vanadate oxide photoanodes in the target band-gap range (1.2–2.8 eV). Detailed analysis of these vanadate compounds reveals the key role of VO₄ structural motifs and electronic band-edge character in efficient photoanodes, initiating a genome for such materials and paving the way for a broadly applicable high-throughput-discovery and materials-by-design feedback loop. Considerably expanding the number of known photoelectrocatalysts for water oxidation, our study establishes ternary metal vanadates as a prolific class of photoanode materials for generation of chemical fuels from sunlight and demonstrates our high-throughput theory–experiment pipeline as a prolific approach to materials discovery.