Machine learning methods are becoming integral to scientific inquiry in numerous disciplines. We demonstrated that machine learning can be used to predict the performance of a synthetic reaction in multidimensional chemical space using data obtained via high-throughput experimentation. We created scripts to compute and extract atomic, molecular, and vibrational descriptors for the components of a palladium-catalyzed Buchwald-Hartwig cross-coupling of aryl halides with 4-methylaniline in the presence of various potentially inhibitory additives. Using these descriptors as inputs and reaction yield as output, we showed that a random forest algorithm provides significantly improved predictive performance over linear regression analysis. The random forest model was also successfully applied to sparse training sets and out-of-sample prediction, suggesting its value in facilitating adoption of synthetic methodology.
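The abstract above does not say how the out-of-sample test set was constructed. A minimal sketch of one common reading, assuming that every reaction containing a chosen component (here an additive) is withheld from training, might look like the following. The component names, the dictionary layout, and the `out_of_sample_split` helper are hypothetical placeholders, not the authors' scripts.

```python
from itertools import product


def out_of_sample_split(reactions, key, held_out):
    """Send every reaction whose `key` component is in `held_out` to the test set.

    The model then never sees the held-out components during training, which is
    a stricter check than a random split over individual reactions.
    """
    train = [r for r in reactions if r[key] not in held_out]
    test = [r for r in reactions if r[key] in held_out]
    return train, test


# Hypothetical placeholder components; real data would carry descriptors and yields.
aryl_halides = ["ArX-1", "ArX-2", "ArX-3"]
additives = ["add-1", "add-2", "add-3", "add-4"]

# Full-factorial grid of reactions, as in a plate-based screening design.
reactions = [
    {"aryl_halide": a, "additive": d} for a, d in product(aryl_halides, additives)
]

train, test = out_of_sample_split(reactions, key="additive", held_out={"add-4"})
print(len(train), len(test))
```

A regression model (random forest or linear) would be fitted on `train` only and scored on `test`, so the reported error reflects additives the model has not encountered.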
Innovations in synthetic chemistry have enabled the discovery of many breakthrough therapies that have improved human health over the past century. In the face of increasing challenges in the pharmaceutical sector, continued innovation in chemistry is required to drive the discovery of the next wave of medicines. Novel synthetic methods not only unlock access to previously unattainable chemical matter, but also inspire new concepts as to how we design and build chemical matter. We identify some of the most important recent advances in synthetic chemistry as well as opportunities at the interface with partner disciplines that are poised to transform the practice of drug discovery and development.
Conspectus. The structural complexity of pharmaceuticals presents a significant challenge to modern catalysis. Many published methods that work well on simple substrates often fail when attempts are made to apply them to complex drug intermediates. The use of high-throughput experimentation (HTE) techniques offers a means to overcome this fundamental challenge by facilitating the rational exploration of large arrays of catalysts and reaction conditions in a time- and material-efficient manner. Initial forays into the use of HTE in our laboratories for solving chemistry problems centered around screening of chiral precious-metal catalysts for homogeneous asymmetric hydrogenation. The success of these early efforts in developing efficient catalytic steps for late-stage development programs motivated the desire to increase the scope of this approach to encompass other high-value catalytic chemistries. Doing so, however, required significant advances in reactor and workflow design and automation to enable the effective assembly and agitation of arrays of heterogeneous reaction mixtures and retention of volatile solvents under a wide range of temperatures. Associated innovations in high-throughput analytical chemistry techniques greatly increased the efficiency and reliability of these methods. These evolved HTE techniques have been utilized extensively to develop highly innovative catalysis solutions to the most challenging problems in large-scale pharmaceutical synthesis. Starting with Pd- and Cu-catalyzed cross-coupling chemistry, subsequent efforts expanded to other valuable modern synthetic transformations such as chiral phase-transfer catalysis, photoredox catalysis, and C–H functionalization. As our experience and confidence in HTE techniques matured, we envisioned their application beyond problems in process chemistry to address the needs of medicinal chemists.
Here the problem of reaction generality is felt most acutely, and HTE approaches should prove broadly enabling. However, the quantities of both time and starting materials available for chemistry troubleshooting in this space generally are severely limited. Adapting to these needs led us to invest in smaller predefined arrays of transformation-specific screening “kits” and push the boundaries of miniaturization in chemistry screening, culminating in the development of “nanoscale” reaction screening carried out in 1536-well plates. Grappling with the problem of generality also inspired the exploration of cheminformatics-driven HTE approaches such as the Chemistry Informer Libraries. These next-generation HTE methods promise to empower chemists to run orders of magnitude more experiments and enable “big data” informatics approaches to reaction design and troubleshooting. With these advances, HTE is poised to revolutionize how chemists across both industry and academia discover new synthetic methods, develop them into tools of broad utility, and apply them to problems of practical significance.
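For readers unfamiliar with the 1536-well format mentioned above, a minimal sketch of how a flat well index maps to a plate position follows. It assumes the conventional 32-row by 48-column layout with rows lettered A to Z and then AA to AF; the function names are illustrative and do not come from any particular liquid-handler software.

```python
import string

ROWS, COLS = 32, 48  # conventional 1536-well layout: 32 x 48 = 1536


def row_label(row):
    """Return the letter label for a zero-based row: A..Z, then AA..AF."""
    letters = string.ascii_uppercase
    if row < 26:
        return letters[row]
    return "A" + letters[row - 26]


def well_label(index):
    """Convert a zero-based, row-major well index into a label such as 'B7'."""
    if not 0 <= index < ROWS * COLS:
        raise ValueError(f"well index {index} is outside a {ROWS * COLS}-well plate")
    row, col = divmod(index, COLS)
    return f"{row_label(row)}{col + 1}"


print(well_label(0), well_label(48), well_label(1535))
```

Row-major indexing is a design choice made here for simplicity; instruments that dispense column by column would need the `divmod` arguments swapped accordingly.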
Over the past two decades, there have been major developments in transition metal–catalyzed aminations of aryl halides to form anilines, a common structure found in drug agents, natural product isolates, and fine chemicals. Many of these approaches have enabled highly efficient and selective coupling through the design of specialized ligands, which facilitate reductive elimination from a destabilized metal center. We postulated that a general and complementary method for carbon–nitrogen bond formation could be developed through the destabilization of a metal amido complex via photoredox catalysis, thus providing an alternative approach to the use of structurally complex ligand systems. Here, we report the development of a distinct mechanistic paradigm for aryl amination using ligand-free nickel(II) salts, in which facile reductive elimination from the nickel metal center is induced via a photoredox-catalyzed electron-transfer event.
We demonstrate that the chemical-feature model described in our original paper is distinguishable from the nongeneralizable models introduced by Chuang and Keiser. Furthermore, the chemical-feature model significantly outperforms these models in out-of-sample predictions, justifying the use of chemical featurization from which machine learning models can extract meaningful patterns in the dataset, as originally described.
At the forefront of new synthetic endeavors, such as drug discovery or natural product synthesis, large quantities of material are rarely available and timelines are tight. A miniaturized automation platform enabling high-throughput experimentation for synthetic route scouting to identify conditions for preparative reaction scale-up would be a transformative advance. Because automated, miniaturized chemistry is difficult to carry out in the presence of solids or volatile organic solvents, most of the synthetic "toolkit" cannot be readily miniaturized. Using palladium-catalyzed cross-coupling reactions as a test case, we developed automation-friendly reactions to run in dimethyl sulfoxide at room temperature. This advance enabled us to couple the robotics used in biotechnology with emerging mass spectrometry–based high-throughput analysis techniques. More than 1500 chemistry experiments were carried out in less than a day, using as little as 0.02 milligrams of material per reaction.
Although metal-catalyzed direct arylation reactions of non- or weakly acidic C–H bonds have recently received much attention, chemists have relied heavily on substrates with appropriately placed directing groups to steer reactivity. To date, examples of intermolecular arylation of unactivated C(sp3)–H bonds in the absence of a directing group remain scarce. We report herein the first general, high-yielding, and scalable method for palladium-catalyzed C(sp3)–H arylation of simple diarylmethane derivatives with aryl bromides at room temperature. This method facilitates access to a variety of sterically and electronically diverse hetero- and nonheteroaryl-containing triarylmethanes, a class of compounds with various applications and interesting biological activity. Key to the success of this approach is an in situ metalation of the substrate via C–H deprotonation under catalytic cross-coupling conditions, which is referred to as a deprotonative-cross-coupling process (DCCP). Base and catalyst identification were performed by high-throughput experimentation (HTE) and led to a unique base/catalyst combination KN(SiMe3)2/Pd–NiXantphos that proved to efficiently promote the room-temperature DCCP of diarylmethanes. Additionally, the DCCP exhibits remarkable chemoselectivity in the presence of substrates that are known to undergo O-, N-, enolate-, and C(sp2)–H arylation.
Conspectus. The synthetic chemistry literature traditionally reports the scope of new methods using simple, nonstandardized test molecules that have uncertain relevance in applied synthesis. In addition, published examples heavily favor positive reaction outcomes, and failure is rarely documented. In this environment, synthetic practitioners have inadequate information to know whether any given method is suitable for the task at hand. Moreover, the incomplete nature of published data makes it poorly suited for the creation of predictive reactivity models via machine learning approaches. In 2016, we reported the concept of chemistry informer libraries as standardized sets of medium- to high-complexity substrates with relevance to pharmaceutical synthesis as demonstrated using a multidimensional principal component analysis (PCA) comparison to the physicochemical properties of marketed drugs. We showed how informer libraries could be used to evaluate leading synthetic methods with the complete capture of success and failure and how this knowledge could lead to improved reaction conditions with a broader scope with respect to relevant applications. In this Account, we describe the progress made and lessons learned in subsequent studies using informer libraries to profile eight additional reaction classes. Examining broad trends across multiple types of bond disconnections against a standardized chemistry “measuring stick” has enabled comparisons of the relative potential of different methods for applications in complex synthesis and has identified opportunities for further development. Furthermore, the powerful combination of informer libraries and 1536-well-plate nanoscale reaction screening has allowed the parallel evaluation of scores of synthetic methods in the same experiment and as such illuminated an important role for informers as part of a larger data generation workflow for predictive reactivity modeling.
Using informer libraries as problem-dense, strong filters has allowed broad sets of reaction conditions to be narrowed down to those that display the highest tolerance to complex substrates. These best conditions can then be used to survey broad swaths of substrate space using nanoscale chemistry approaches. Our experiences and those of our collaborators from several academic laboratories applying informer libraries in these contexts have helped us identify several areas for potential improvements to the approach that would increase their ease of use, utility in generating interpretable results, and resulting uptake by the broader community. As we continue to evolve the informer library concept, we believe it will play an ever-increasing role in the future of the democratization of high-throughput experimentation and data science-driven synthetic method development.
Although much current research focuses on developing new boron reagents and identifying robust catalytic systems for the cross-coupling of these reagents, the fundamental preparation of the nucleophilic partners (i.e., boronic acids and derivatives) has been studied to a lesser extent. Most current methods to access boronic acids are indirect and require harsh conditions or expensive reagents. A simple and efficient palladium-catalyzed, direct synthesis of arylboronic acids from the corresponding aryl chlorides using an underutilized reagent, tetrahydroxydiboron B2(OH)4, is reported. To ensure preservation of the carbon−boron bond, the boronic acids were efficiently converted to the trifluoroborate derivatives in good to excellent yields without the use of a workup or isolation. Further, the intermediate boronic acids can be easily converted to a wide range of useful boronates. Finally, a two-step, one-pot method was developed to couple two aryl chlorides efficiently in a Suzuki−Miyaura-type reaction.
The Open Reaction Database
Kearnes, Steven M; Maser, Michael R; Wleklinski, Michael ...
Journal of the American Chemical Society, 11/2021, Volume 143, Issue 45
Journal Article, peer reviewed, open access
Chemical reaction data in journal articles, patents, and even electronic laboratory notebooks are currently stored in various formats, often unstructured, which presents a significant barrier to downstream applications, including the training of machine-learning models. We present the Open Reaction Database (ORD), an open-access schema and infrastructure for structuring and sharing organic reaction data, including a centralized data repository. The ORD schema supports conventional and emerging technologies, from benchtop reactions to automated high-throughput experiments and flow chemistry. The data, schema, supporting code, and web-based user interfaces are all publicly available on GitHub. Our vision is that a consistent data representation and infrastructure to support data sharing will enable downstream applications that will greatly improve the state of the art with respect to computer-aided synthesis planning, reaction prediction, and other predictive chemistry tasks.
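To illustrate what a structured reaction record means in practice, the sketch below defines a small record type and round-trips it through JSON. This is explicitly not the ORD schema: the `Component` and `ReactionRecord` classes, their field names, and the example values are hypothetical and chosen only to show the idea of a machine-readable record that captures failed reactions as well as successful ones.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Component:
    """One reaction input (hypothetical fields, not the ORD schema)."""
    name: str
    role: str  # e.g. "reactant", "catalyst", "base", "solvent"
    amount_mmol: float


@dataclass
class ReactionRecord:
    """A single experiment, including its outcome even when the yield is zero."""
    reaction_id: str
    components: List[Component] = field(default_factory=list)
    temperature_c: Optional[float] = None
    yield_percent: Optional[float] = None


def to_json(record):
    """Serialize a record to a JSON string with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


def from_json(text):
    """Rebuild a ReactionRecord, including nested components, from JSON."""
    data = json.loads(text)
    data["components"] = [Component(**c) for c in data["components"]]
    return ReactionRecord(**data)


record = ReactionRecord(
    reaction_id="example-001",  # illustrative identifier
    components=[
        Component("aryl halide (placeholder)", "reactant", 0.10),
        Component("amine (placeholder)", "reactant", 0.12),
    ],
    temperature_c=25.0,
    yield_percent=0.0,  # a failed reaction is still recorded
)
restored = from_json(to_json(record))
print(restored == record)
```

Storing every experiment in one consistent shape is what lets downstream code, such as a model-training script, consume data from different laboratories without per-source parsing.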