Data are a crucial raw material of this century. The amount of data that has been created in materials science thus far, and that continues to be created every day, is immense. Without a proper infrastructure that allows for collecting and sharing data, the envisioned success of big data-driven materials science will be hampered. For the field of computational materials science, the NOMAD (Novel Materials Discovery) Center of Excellence (CoE) has changed the scientific culture toward comprehensive and findable, accessible, interoperable, and reusable (FAIR) data, opening new avenues for mining materials science big data. Novel data-analytics concepts and tools turn data into knowledge and help in the prediction of new materials and in the identification of new properties of already known materials.
We present a parameter-free method for an accurate determination of long-range van der Waals interactions from mean-field electronic structure calculations. Our method relies on the summation of interatomic C6 coefficients, derived from the electron density of a molecule or solid and accurate reference data for the free atoms. The mean absolute error in the C6 coefficients is 5.5% when compared to accurate experimental values for 1225 intermolecular pairs, irrespective of the employed exchange-correlation functional. We show that the effective atomic C6 coefficients depend strongly on the bonding environment of an atom in a molecule. Finally, we analyze the van der Waals radii and the damping function in the C6 R^-6 correction method for density-functional theory calculations.
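The abstract above does not spell out its formulas, so the following is only a minimal sketch of a damped pairwise C6 R^-6 sum under stated assumptions. The squared volume-ratio scaling of the free-atom C6, the combination rule for mixed pairs, and the Fermi-type damping function with steepness d = 20 are the forms commonly associated with this type of correction and are assumed here. All function names and all numerical inputs are hypothetical.

```python
import math


def effective_c6(c6_free, volume_ratio):
    """Scale a free-atom C6 by the squared effective/free volume ratio (assumed form)."""
    return volume_ratio ** 2 * c6_free


def combine_c6(c6_a, c6_b, alpha_a, alpha_b):
    """Heteronuclear C6 from homonuclear C6 values and static polarizabilities (assumed form)."""
    return 2.0 * c6_a * c6_b / ((alpha_b / alpha_a) * c6_a + (alpha_a / alpha_b) * c6_b)


def fermi_damping(r, r0, d=20.0, s_r=1.0):
    """Fermi-type damping: close to 0 at short range, close to 1 at long range."""
    return 1.0 / (1.0 + math.exp(-d * (r / (s_r * r0) - 1.0)))


def pairwise_vdw_energy(positions, c6, alpha, r0, d=20.0, s_r=1.0):
    """Sum -f_damp * C6_ij / R_ij^6 over all unique atom pairs.

    positions: list of (x, y, z) tuples; c6, alpha, r0: per-atom effective values.
    Any consistent unit system works; the inputs used below are placeholders.
    """
    energy = 0.0
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(positions[i], positions[j])
            c6_ij = combine_c6(c6[i], c6[j], alpha[i], alpha[j])
            # The pair vdW radius is taken as the sum of the atomic radii (assumption).
            energy -= fermi_damping(r, r0[i] + r0[j], d, s_r) * c6_ij / r ** 6
    return energy


if __name__ == "__main__":
    # Hypothetical two-atom system with made-up parameters, for illustration only.
    atoms = [(0.0, 0.0, 0.0), (0.0, 0.0, 10.0)]
    print(pairwise_vdw_energy(atoms, c6=[10.0, 10.0], alpha=[5.0, 5.0], r0=[1.5, 1.5]))
```

In this sketch the damping parameter s_r is the only quantity that would be tied to a particular exchange-correlation functional; the C6 coefficients themselves enter only through the density-derived volume ratio.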
The Novel Materials Discovery (NOMAD) Laboratory is a user-driven platform for sharing and exploiting computational materials science data. It accounts for the various aspects of data being a crucial raw material and most relevant to accelerate materials research and engineering. NOMAD, with the NOMAD Repository, and its code-independent and normalized form, the NOMAD Archive, comprises the world's largest data collection of this field. Based on its findable, accessible, interoperable, reusable (FAIR) data infrastructure, various services are offered, comprising advanced visualization, the NOMAD Encyclopedia, and artificial-intelligence tools. The latter are realized in the NOMAD Analytics Toolkit. A prerequisite for all this is the NOMAD metadata, a unique and thorough description of the data that are produced by all important computer codes of the community. Uploaded data are tagged by a persistent identifier, and users can also request a digital object identifier to make data citable. Developments and advancements of parsers and metadata are organized jointly with users and code developers. In this work, we review the NOMAD concept and implementation, highlight its orthogonality to and synergistic interplay with other data collections, and provide an outlook regarding ongoing and future developments.
The understanding of adsorption and reactions of (large) organic molecules at metal surfaces plays an increasingly important role in modern surface science and technology. Such hybrid inorganic/organic systems (HIOS) are relevant for many applications in catalysis, light-emitting diodes, single-molecule junctions, molecular sensors and switches, and photovoltaics. Predictive modeling and understanding of the structure and stability of such hybrid systems is an essential prerequisite for tuning their electronic properties and functions. At present, density-functional theory (DFT) is the most promising approach to study the structure, stability, and electronic properties of complex systems, because it can be applied to both molecules and solids comprising thousands of atoms. However, state-of-the-art approximations to DFT do not provide a consistent and reliable description for HIOS, which is largely due to two issues: (i) the self-interaction of the electrons, arising from the Hartree term of the total energy, which is not fully compensated in approximate exchange-correlation functionals, and (ii) the lack of the long-range part of the ubiquitous van der Waals (vdW) interactions. The self-interaction errors sometimes lead to an incorrect description of charge transfer and electronic level alignment in HIOS, although for molecules adsorbed on metals these effects will often cancel out in total energy differences. Regarding vdW interactions, several promising vdW-inclusive DFT-based methods have recently been demonstrated to yield remarkable accuracy for intermolecular interactions in the gas phase. However, the majority of these approaches neglect the nonlocal collective electron response in the vdW energy tail, an effect that is particularly strong in condensed phases and at interfaces between different materials.
Here we show that the recently developed DFT+vdW^surf method, which accurately accounts for the collective electronic response effects, enables reliable modeling of structure and stability for a broad class of organic molecules adsorbed on metal surfaces. This method was demonstrated to achieve quantitative accuracy for aromatic hydrocarbons (benzene, naphthalene, anthracene, and diindenoperylene), C60, and sulfur/oxygen-containing molecules (thiophene, NTCDA, and PTCDA) on close-packed and stepped metal surfaces, leading to an overall accuracy of 0.1 Å in adsorption heights and 0.1 eV in binding energies with respect to state-of-the-art experiments. An unexpected finding is that vdW interactions contribute more to the binding of strongly bound molecules on transition-metal surfaces than to that of molecules physisorbed on coinage metals. The accurate inclusion of vdW interactions also significantly improves tilting angles and adsorption heights for all the studied molecules, and can qualitatively change the potential-energy surface for adsorbed molecules with flexible functional groups. Activation barriers for molecular switches and reaction precursors are modified as well.
The random-phase approximation (RPA) as an approach for computing the electronic correlation energy is reviewed. After a brief account of its basic concept and historical development, the paper is devoted to the theoretical formulations of RPA and its applications to realistic systems. With several illustrative applications, we discuss the implications of RPA for computational chemistry and materials science. The computational cost of RPA, which is critical for its widespread use in future applications, is also addressed. In addition, current correction schemes going beyond RPA and directions of further development are discussed.
Computational methods that automatically extract knowledge from data are critical for enabling data-driven materials science. A reliable identification of lattice symmetry is a crucial first step for materials characterization and analytics. Current methods require a user-specified threshold, and are unable to detect average symmetries for defective structures. Here, we propose a machine learning-based approach to automatically classify structures by crystal symmetry. First, we represent crystals by calculating a diffraction image, then construct a deep learning neural network model for classification. Our approach is able to correctly classify a dataset comprising more than 100,000 simulated crystal structures, including heavily defective ones. The internal operations of the neural network are unraveled through attentive response maps, demonstrating that it uses the same landmarks a materials scientist would use, although never explicitly instructed to do so. Our study paves the way for crystal structure recognition of (possibly noisy and incomplete) three-dimensional structural data in big-data materials science.
An efficient method is developed for the microscopic description of the frequency-dependent polarizability of finite-gap molecules and solids. This is achieved by combining the Tkatchenko-Scheffler van der Waals (vdW) method [Phys. Rev. Lett. 102, 073005 (2009)] with the self-consistent screening equation of classical electrodynamics. This leads to a seamless description of polarization and depolarization for the polarizability tensor of molecules and solids. The screened long-range many-body vdW energy is obtained from the solution of the Schrödinger equation for a system of coupled oscillators. We show that the screening and the many-body vdW energy play a significant role even for rather small molecules, becoming crucial for an accurate treatment of conformational energies for biomolecules and binding of molecular crystals. The computational cost of the developed theory is negligible compared to the underlying electronic structure calculation.
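The coupled-oscillator step mentioned above can be illustrated with a minimal sketch, assuming isotropic quantum harmonic oscillators coupled through the bare dipole-dipole tensor in atomic units. The sketch deliberately omits the self-consistent screening of the polarizabilities and any short-range damping of the coupling, so it is not the method of the abstract. The function names and the input values are hypothetical.

```python
import numpy as np


def dipole_tensor(r_vec):
    """Bare dipole-dipole coupling tensor T = (r^2 I - 3 r r^T) / r^5."""
    r = np.linalg.norm(r_vec)
    return (np.eye(3) * r ** 2 - 3.0 * np.outer(r_vec, r_vec)) / r ** 5


def coupled_oscillator_energy(positions, alpha, omega):
    """Interaction energy of dipole-coupled isotropic oscillators (atomic units).

    Builds the 3N x 3N coupling matrix, diagonalizes it, and returns
    0.5 * sum(sqrt(eigenvalues)) - 1.5 * sum(omega), i.e. the zero-point
    energy of the coupled system minus that of the uncoupled oscillators.
    """
    positions = np.asarray(positions, dtype=float)
    n = len(positions)
    c = np.zeros((3 * n, 3 * n))
    for i in range(n):
        c[3 * i:3 * i + 3, 3 * i:3 * i + 3] = omega[i] ** 2 * np.eye(3)
        for j in range(i + 1, n):
            t = dipole_tensor(positions[i] - positions[j])
            block = omega[i] * omega[j] * np.sqrt(alpha[i] * alpha[j]) * t
            c[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
            c[3 * j:3 * j + 3, 3 * i:3 * i + 3] = block.T
    eigenvalues = np.linalg.eigvalsh(c)
    if np.any(eigenvalues <= 0.0):
        # Without short-range damping the bare coupling can become unstable.
        raise ValueError("coupling too strong: non-positive eigenvalue")
    return 0.5 * np.sum(np.sqrt(eigenvalues)) - 1.5 * np.sum(omega)


if __name__ == "__main__":
    # Two identical hypothetical oscillators, 20 bohr apart along z.
    energy = coupled_oscillator_energy(
        [(0.0, 0.0, 0.0), (0.0, 0.0, 20.0)], alpha=[10.0, 10.0], omega=[0.5, 0.5]
    )
    print(energy)
```

For two identical oscillators at large separation R, this energy approaches the pairwise London limit -(3/4) omega alpha^2 / R^6, which gives a simple consistency check; many-body contributions only appear for three or more oscillators or at shorter distances.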
Statistical learning of materials properties or functions so far starts with a largely silent, unchallenged step: the choice of the set of descriptive parameters (termed descriptor). However, when the scientific connection between the descriptor and the actuating mechanisms is unclear, the causality of the learned descriptor-property relation is uncertain. Thus, a trustworthy prediction of new promising materials, identification of anomalies, and scientific advancement are doubtful. We analyze this issue and define requirements for a suitable descriptor. For a classic example, the energy difference of zinc blende or wurtzite and rocksalt semiconductors, we demonstrate how a meaningful descriptor can be found systematically.
Although machine learning (ML) models promise to substantially accelerate the discovery of novel materials, their performance is often still insufficient to draw reliable conclusions. Improved ML models are therefore actively researched, but their design is currently guided mainly by monitoring the average model test error. This can render different models indistinguishable although their performance differs substantially across materials, or it can make a model appear generally insufficient while it actually works well in specific sub-domains. Here, we present a method, based on subgroup discovery, for detecting domains of applicability (DA) of models within a materials class. The utility of this approach is demonstrated by analyzing three state-of-the-art ML models for predicting the formation energy of transparent conducting oxides. We find that, despite having a mutually indistinguishable and unsatisfactory average error, the models have DAs with distinctive features and notably improved performance.
The random-phase approximation (RPA) for the electron correlation energy, combined with the exact-exchange (EX) energy, represents the state-of-the-art exchange-correlation functional within density-functional theory. However, the standard RPA practice (evaluating both the EX and the RPA correlation energies using Kohn-Sham (KS) orbitals from local or semilocal exchange-correlation functionals) leads to a systematic underbinding of molecules and solids. Here we demonstrate that this behavior can be corrected by adding a "single excitation" contribution, so far not included in the standard RPA scheme. A similar improvement can also be achieved by replacing the non-self-consistent EX total energy by the corresponding self-consistent Hartree-Fock total energy, while retaining the RPA correlation energy evaluated using KS orbitals. Both schemes achieve chemical accuracy for a standard benchmark set of noncovalent intermolecular interactions.