Development of new products often relies on the discovery of novel molecules. While conventional molecular design involves using human expertise to propose, synthesize, and test new molecules, this process can be cost and time intensive, limiting the number of molecules that can be reasonably tested. Generative modeling provides an alternative approach to molecular discovery by reformulating molecular design as an inverse design problem. Here, we review recent advances in the state‐of‐the‐art of generative molecular design and discuss the considerations for integrating these models into real molecular discovery campaigns. We first review the model design choices required to develop and train a generative model, including common 1D, 2D, and 3D representations of molecules and typical generative modeling neural network architectures. We then describe different problem statements for molecular discovery applications and explore the benchmarks used to evaluate models based on those problem statements. Finally, we discuss the important factors that play a role in integrating generative models into experimental workflows. Our aim is that this review will equip the reader with the information and context necessary to utilize generative modeling within their domain.
This article is categorized under:
Data Science > Artificial Intelligence/Machine Learning
Generative modeling approaches can be used to discover novel and diverse compounds.
•Developed a framework to identify resins that remove orthogonal sets of impurities.
•Screened protein/resin libraries and quantified each resin’s performance.
•Cation and anion exchangers were orthogonal.
•Strong and salt tolerant anion exchangers were not orthogonal.
•Salt tolerant and multimodal cation exchangers were orthogonal.
•Our framework facilitates engineering resins to optimize separability/orthogonality.
Recent studies have shown that by combining orthogonal, non-affinity chromatography steps, it is possible to rapidly develop efficient purification processes for molecules of interest. Here, we build upon previous work to develop a flexible framework for identifying resins that remove optimally orthogonal sets of impurities for a wide variety of products. Our approach involves screening a library of proteins with diverse properties (pI ranging from 5.0 to 11.4 and varying hydrophobicity measured by retention in a HIC gradient) on a library of resins and quantifying each resin’s ability to separate every protein pair in the library. Orthogonality is then defined as the degree to which two resins separate mutually exclusive sets of protein pairs. We applied this approach to a library of model proteins and a series of strong, salt tolerant, and multimodal ion exchangers and evaluated which resin combinations performed well and which performed poorly. In particular, we found that strong cation and strong anion exchangers were orthogonal, while strong and salt tolerant anion exchangers were not orthogonal. Interestingly, salt tolerant and multimodal cation exchangers were found to be orthogonal, and the best resin combination included a multimodal cation exchange resin and a tentacular anion exchange resin. This approach for quantifying orthogonality is valuable in that it can be used as a criterion for both resin design and process design. We envision that, using this framework, it will be possible to design a set of next generation chromatography ligands that are explicitly engineered to optimize separability and orthogonality.
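The pairwise definition of orthogonality above can be illustrated with a minimal Python sketch. It is not the authors' metric: the abstract does not give the scoring formula, so the sketch assumes that a resin "separates" a protein pair when their elution positions differ by at least a chosen threshold, and it scores orthogonality as the fraction of separated pairs that only one of the two resins resolves. The function names, the threshold, and the retention values are all hypothetical.

```python
from itertools import combinations


def separated_pairs(retention, threshold):
    """Return the set of protein pairs that one resin separates.

    retention: dict mapping protein name -> elution position
               (hypothetical units, e.g. normalized gradient position).
    threshold: assumed minimum difference in elution position for a
               pair to count as separated.
    """
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(retention), 2)
        if abs(retention[a] - retention[b]) >= threshold
    }


def orthogonality(pairs_a, pairs_b):
    """Assumed score: share of separated pairs resolved by only one resin.

    1.0 means the two resins separate mutually exclusive sets of pairs;
    0.0 means they separate exactly the same pairs (or none at all).
    """
    union = pairs_a | pairs_b
    if not union:
        return 0.0
    return len(pairs_a ^ pairs_b) / len(union)


# Hypothetical retention data for three proteins on two resins.
resin_x = {"P1": 0.10, "P2": 0.50, "P3": 0.55}
resin_y = {"P1": 0.30, "P2": 0.32, "P3": 0.80}

pairs_x = separated_pairs(resin_x, threshold=0.1)
pairs_y = separated_pairs(resin_y, threshold=0.1)
score = orthogonality(pairs_x, pairs_y)
print(f"orthogonality(resin_x, resin_y) = {score:.2f}")
```

In this toy data set, each resin resolves one pair that the other misses and both resolve the P1/P3 pair, so the score falls between the two extremes. A graded separation measure, such as a resolution value per pair, could replace the binary threshold without changing the overall structure.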
We investigated gramicidin A (gA) subunit dimerization in lipid bilayers using microsecond-long replica-exchange umbrella sampling simulations, millisecond-long unbiased molecular dynamics simulations, and machine learning. Our simulations led to a dimer structure that is indistinguishable from the experimentally determined gA channel structures, with the two gA subunits joined by six hydrogen bonds (6HB). The simulations also uncovered two additional dimer structures, with different gA-gA stacking orientations that were stabilized by four or two hydrogen bonds (4HB or 2HB). When examining the temporal evolution of the dimerization, we found that two bilayer-inserted gA subunits can form the 6HB dimer directly, with no discernible intermediate states, as well as through paths that involve the 2HB and 4HB dimers.