Development of new products often relies on the discovery of novel molecules. While conventional molecular design involves using human expertise to propose, synthesize, and test new molecules, this process can be cost- and time-intensive, limiting the number of molecules that can reasonably be tested. Generative modeling provides an alternative approach to molecular discovery by reformulating molecular design as an inverse design problem. Here, we review recent advances in the state-of-the-art of generative molecular design and discuss the considerations for integrating these models into real molecular discovery campaigns. We first review the model design choices required to develop and train a generative model, including common 1D, 2D, and 3D representations of molecules and typical generative modeling neural network architectures. We then describe different problem statements for molecular discovery applications and explore the benchmarks used to evaluate models based on those problem statements. Finally, we discuss the important factors that play a role in integrating generative models into experimental workflows. Our aim is that this review will equip the reader with the information and context necessary to utilize generative modeling within their domain.
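As a concrete illustration of the 1D molecular representations mentioned above, the sketch below one-hot encodes a SMILES string; the vocabulary, maximum length, and molecule are invented for illustration and are far smaller than those used in practice.

```python
import numpy as np

# Minimal sketch: encode a SMILES string as a one-hot matrix, the typical
# input form for 1D generative models. Real models derive a much larger
# vocabulary from the training dataset.
VOCAB = ["C", "N", "O", "(", ")", "=", "1"]
CHAR_TO_IDX = {c: i for i, c in enumerate(VOCAB)}

def smiles_to_onehot(smiles: str, max_len: int = 12) -> np.ndarray:
    """Encode a SMILES string as a (max_len, vocab_size) one-hot matrix,
    zero-padded past the end of the string."""
    onehot = np.zeros((max_len, len(VOCAB)), dtype=np.float32)
    for pos, char in enumerate(smiles[:max_len]):
        onehot[pos, CHAR_TO_IDX[char]] = 1.0
    return onehot

# Ethanol-like fragment "CCO": three characters, each one-hot encoded.
x = smiles_to_onehot("CCO")
```

In a VAE or language-model setting, this matrix (or the corresponding index sequence) would be fed to the encoder, and the decoder would emit a distribution over the same vocabulary at each position.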
This article is categorized under:
Data Science > Artificial Intelligence/Machine Learning
Generative modeling approaches can be used to discover novel and diverse compounds.
Rapid antimicrobial susceptibility testing (AST) is an integral tool to mitigate the unnecessary use of powerful and broad-spectrum antibiotics that leads to the proliferation of multi-drug-resistant bacteria. Using a sensor platform composed of surface-enhanced Raman scattering (SERS) sensors with control of nanogap chemistry and machine learning algorithms for analysis of complex spectral data, bacterial metabolic profiles after antibiotic exposure are correlated with susceptibility. Deep neural network models are able to discriminate the responses of Escherichia coli and Pseudomonas aeruginosa to antibiotics from untreated cells in SERS data within 10 min of antibiotic exposure, with greater than 99% accuracy. Deep learning analysis is also able to differentiate responses from untreated cells at antibiotic dosages up to 10-fold lower than the minimum inhibitory concentration observed in conventional growth assays. In addition, analysis of SERS data using a generative model, a variational autoencoder, identifies spectral features in the P. aeruginosa lysate data associated with antibiotic efficacy. From this insight, a combinatorial dataset of metabolites is selected to extend the latent space of the variational autoencoder. This culture-free dataset dramatically improves classification accuracy, enabling selection of an effective antibiotic treatment in 30 min. Unsupervised Bayesian Gaussian mixture analysis of the extended latent space achieves 99.3% accuracy in discriminating antibiotic-susceptible from antibiotic-resistant cultures in SERS data. Discriminative and generative models thus rapidly provide high classification accuracy from small sets of labeled data, which enormously reduces the time needed to validate phenotypic AST with conventional growth assays. This work therefore outlines a promising approach toward practical rapid AST.
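The latent-space discrimination step can be sketched as follows. A per-class diagonal Gaussian classifier stands in for the paper's unsupervised Bayesian Gaussian mixture, and random clusters stand in for VAE latent codes of SERS spectra; all dimensions, means, and spreads are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for VAE latent codes of SERS spectra: susceptible and
# resistant cultures form two separated clusters in an 8-D latent space.
susceptible = rng.normal(loc=-2.0, scale=0.5, size=(100, 8))
resistant = rng.normal(loc=2.0, scale=0.5, size=(100, 8))

def fit_diag_gaussian(codes):
    """Per-class diagonal Gaussian: mean and variance of each latent dim."""
    return codes.mean(axis=0), codes.var(axis=0) + 1e-6

def log_likelihood(x, mean, var):
    """Log density of x under a diagonal Gaussian."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

params = [fit_diag_gaussian(susceptible), fit_diag_gaussian(resistant)]

def classify(x):
    """0 = susceptible, 1 = resistant, by maximum log-likelihood."""
    return int(np.argmax([log_likelihood(x, m, v) for m, v in params]))

test_point = rng.normal(loc=2.0, scale=0.5, size=8)  # resistant-like code
label = classify(test_point)
```

The actual pipeline fits the mixture components without labels; the supervised fit above only illustrates why well-separated latent clusters make the discrimination step accurate.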
To provide precise recommendations, traditional recommender systems (RS) collect personal data, user preferences, and feedback, all of which are sensitive for each user if maliciously used for further analysis. In recent years, differential privacy (DP) has been widely applied in RS to protect such sensitive information. Prior studies explored the combination of DP and RS but neglected the disparate effect on the accuracy of imbalanced subgroups: large user groups dominate the trained model, and DP can significantly worsen this disparate effect by degrading recommender-system performance. Moreover, the number of contributions uploaded for training a recommender system can differ among users, so it is necessary to set a user-level privacy guarantee.
In this paper, we make four contributions. First, we propose an efficient way of constructing datasets for training a recommender system based on prior theories. Second, we compute user-level priors based on user metadata to optimize the VAE model, adding noise to the calculation process to protect the metadata. Third, we analyze and propose a tighter theoretical bound on gradient updates for DP stochastic gradient descent (DPSGD). Finally, we exploit these theoretical results and propose a novel DP-VAE-based recommender system. Extensive experimental results on multiple datasets show that our system can achieve high recommendation precision while maintaining a reasonable privacy guarantee.
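The gradient-update mechanism underlying DPSGD can be sketched as follows: clip each per-example gradient to a fixed L2 norm, aggregate, and add Gaussian noise. The clip norm and noise multiplier below are illustrative, not the values analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def dp_sgd_update(per_example_grads, clip_norm=1.0, noise_multiplier=1.1):
    """One DP-SGD aggregation step: clip each per-example gradient to L2
    norm `clip_norm`, sum, add Gaussian noise scaled to the clip norm,
    and average over the batch."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(per_example_grads)

grads = [rng.normal(size=4) * 5.0 for _ in range(32)]
update = dp_sgd_update(grads)

# Clipping bounds each example's influence on the update by clip_norm,
# which is what makes the added noise yield a formal privacy guarantee.
max_clipped_norm = max(
    np.linalg.norm(g * min(1.0, 1.0 / np.linalg.norm(g))) for g in grads
)
```

User-level (rather than example-level) guarantees additionally bound the number of contributions per user before clipping, as the abstract above motivates.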
Data-driven fault diagnostics of safety-critical systems often faces the challenge of a complete lack of labeled data from faulty system conditions at training time. Since faults of unknown types can arise during deployment, fault diagnostics in this scenario is an open-set learning problem. Without labels and samples from the possible fault types, the open-set diagnostics problem is typically reformulated as fault detection and fault segmentation tasks. Traditional approaches to these tasks, such as one-class classification and unsupervised clustering, do not typically leverage all the available labeled and unlabeled data in the learning algorithm, so their performance is sub-optimal. In this work, we propose an adapted version of the variational autoencoder (VAE) that leverages all available data at training time and has two new design features: (1) implicit supervision on the latent representation of the healthy conditions and (2) implicit bias in the sampling process. The proposed method induces a compact and informative latent representation, thus enabling good detection and segmentation of previously unseen fault types. In an extensive comparison using two turbofan engine datasets, we demonstrate that the proposed method outperforms other learning strategies and deep learning algorithms, yielding significant performance improvements in fault detection and fault segmentation.
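The detection half of the problem can be sketched with reconstruction error: a model fitted only to healthy data reconstructs healthy samples well and unseen fault types poorly. Here a linear projection onto the healthy subspace stands in for a trained VAE; the data, dimensions, and threshold are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic "healthy" sensor data confined to a 3-D subspace of a 4-D
# measurement space (the 4th channel is inactive in healthy operation).
healthy = rng.normal(size=(200, 3)) @ np.array([[1.0, 0.0, 0.0, 0.0],
                                                [0.0, 1.0, 0.0, 0.0],
                                                [0.0, 0.0, 1.0, 0.0]])

# PCA-style basis of the healthy subspace; projection onto it plays the
# role of a trained autoencoder's reconstruction.
U, _, _ = np.linalg.svd(healthy.T @ healthy)
basis = U[:, :3]

def reconstruction_error(x):
    """Distance between x and its projection onto the healthy subspace."""
    return float(np.linalg.norm(x - basis @ (basis.T @ x)))

threshold = 0.5  # would be calibrated on held-out healthy data
healthy_sample = np.array([1.0, -0.5, 0.3, 0.0])
fault_sample = np.array([0.0, 0.0, 0.0, 3.0])  # energy outside the subspace
is_fault = reconstruction_error(fault_sample) > threshold
```

A VAE replaces the linear projection with a nonlinear one and, in the proposed method, additionally shapes the latent space so that unseen fault types also separate into clusters for segmentation.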
The variational autoencoder (VAE) is one of the most valuable generative models in unsupervised learning. Owing to its construction, however, the VAE has insufficient precision for high-resolution image reconstruction. In this paper, a prior-variant VAE based on the Gaussian Cloud Model is proposed to optimize the latent-variable sampling method, the network structure, and the loss function. First, the Gaussian Cloud Model replaces the prior distribution of the VAE. Second, the sampling process is changed into two consecutive Gaussian distributions. Finally, a new loss function based on the envelope curve of the Gaussian Cloud Model is presented for approximating the real data distribution. The method is evaluated qualitatively and quantitatively on several datasets to demonstrate its correctness and effectiveness.
•GCMVAE adds representation learning for reconstructed data.
•The Gaussian cloud can be understood as two consecutive Gaussian distributions.
•GCMVAE increases the probability of capturing fine detail in the latent variables during sampling.
•The data generated by GCMVAE are smoother and more continuous.
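The "two consecutive Gaussian distributions" can be sketched with the standard forward normal-cloud generator, which the Gaussian Cloud Model sampling follows; the parameter values below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def cloud_drops(Ex, En, He, n):
    """Forward normal-cloud generator: two consecutive Gaussian draws.
    First draw a per-drop entropy En' ~ N(En, He^2), then draw the drop
    x ~ N(Ex, En'^2). Ex = expectation, En = entropy, He = hyper-entropy."""
    En_prime = rng.normal(En, He, size=n)
    return rng.normal(Ex, np.abs(En_prime))

drops = cloud_drops(Ex=0.0, En=1.0, He=0.1, n=10_000)
sample_mean = float(drops.mean())
```

With He = 0 this collapses to an ordinary Gaussian; He > 0 thickens the tails, which is the extra flexibility the GCMVAE prior exploits over the standard normal prior.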
Cartoon style transfer has attracted widespread attention. Although many researchers have proposed methods to advance this field, two areas for improvement remain: (1) existing image-to-image cartoon style transfer methods can only perform domain-to-domain transfer, neglecting the specific color and texture of individual cartoon images, and (2) arbitrary style transfer methods only transfer the style of the style image onto the content image, neglecting the style information of the style domain. To address these issues, we observe that artists often refer to specific paintings to fine-tune the color of their artworks. This behavior inspires us to propose a method, based on variational autoencoders, that dynamically encodes the style information of a specific cartoon image, allowing the style feature to be cast onto the content feature dynamically. We also introduce a cartoon contrastive learning loss that pulls a stylized image closer to images with the same cartoon style and pushes it away from others. Extensive experiments demonstrate that our proposed method, Caster, generates higher-quality stylized images, carrying both image-specific and domain-level cartoon style information, than state-of-the-art cartoon style transfer methods.
Large datasets are necessary for deep learning, as the performance of the algorithms used increases with the size of the dataset. Poor data management practices and the low level of digitisation of the construction industry are a major hurdle to compiling big datasets, which in many cases can be prohibitively expensive. In other fields, such as computer vision, data augmentation techniques and synthetic data have been used successfully to address issues with limited datasets. In this study, undercomplete, sparse, deep and variational autoencoders are investigated as methods for data augmentation and generation of synthetic data. Two financial datasets of underground and overhead power transmission projects are used as case studies. The datasets were augmented using the autoencoders, and the project cost was predicted using a deep neural network regressor. All the augmented datasets yielded better results than the original dataset. On average, the autoencoders provide a model score improvement of 7.2% and 11.5% for the underground and overhead datasets, respectively. MAE and RMSE are lower for all autoencoders as well; the average error improvement for the underground and overhead datasets is 22.9% and 56.5%, respectively. Variational autoencoders provided more robust results and better represented the non-linear correlations among the attributes in both datasets. The novelty of this study is that it presents an approach to improve existing datasets and thus the generalisation of deep learning models when other approaches are not feasible. Moreover, this study provides practitioners with methods to address limited access to big datasets, a visualisation method to extract insights from non-linear correlations in data, and a way to improve data privacy and enable sharing of sensitive data via analogous synthetic data.
The main contribution of this study is a data augmentation technique for transformation-variant data. Many techniques developed for transformation-invariant data have contributed to improving the performance of deep learning models; this study shows that autoencoders are a good option for data augmentation of transformation-variant data.
•Small datasets limit the adoption of deep learning.
•Compiling big datasets is prohibitively expensive for most companies.
•Autoencoders are used for data augmentation to improve model performance.
•Variational autoencoders proved a robust method for data augmentation.
•Latent representation visualisations for project classifications are presented.
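The augmentation idea above can be sketched as encode-perturb-decode. Identity encoder/decoder stand-ins replace the trained autoencoders here, and the feature count, noise scale, and number of copies are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

def augment(records, noise_scale=0.05, copies=3):
    """Return the original records plus `copies` perturbed variants each.
    In a real pipeline, `records` would first be mapped through the
    autoencoder's encoder, jittered in latent space, and decoded back;
    identity mappings stand in for those networks in this sketch."""
    synthetic = [records]
    for _ in range(copies):
        latent = records  # encoder(records) in a trained model
        jittered = latent + rng.normal(0.0, noise_scale, size=latent.shape)
        synthetic.append(jittered)  # decoder(jittered) in a trained model
    return np.vstack(synthetic)

original = rng.normal(size=(50, 6))  # 50 projects, 6 cost-driver features
augmented = augment(original)
```

Perturbing in latent space rather than in feature space is what lets the autoencoder respect the non-linear correlations among attributes that the study highlights.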
•Modelling framework that enables Bayesian semi-supervised learning.
•Bayesian semi-supervised learning improves overall performance and uncertainty calibration.
•Models generalize standard deep generative models for semi-supervised learning.
Generative models can be used for a wide range of tasks, and have the appealing ability to learn from both labelled and unlabelled data. In contrast, discriminative models cannot learn from unlabelled data, but tend to outperform their generative counterparts in supervised tasks. We develop a framework to jointly train deep generative and discriminative models, enjoying the benefits of both. The framework allows models to learn from labelled and unlabelled data, as well as naturally account for uncertainty in predictive distributions, providing the first Bayesian approach to semi-supervised learning with deep generative models. We demonstrate that our blended discriminative and generative models outperform purely generative models in both predictive performance and uncertainty calibration in a number of semi-supervised learning tasks.
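The blended objective can be sketched as a generative term (the ELBO over labelled and unlabelled data) plus a weighted discriminative term (classifier log-likelihood on labelled data). The weighting scheme and numbers below are illustrative assumptions, not the paper's exact objective.

```python
import numpy as np

def blended_loss(elbo_unlabelled, elbo_labelled, log_p_y_given_x, alpha=0.5):
    """Sketch of a blended semi-supervised objective: maximise the
    generative evidence lower bound on all data plus an alpha-weighted
    discriminative log-likelihood on the labelled subset; return the
    negated sum as a loss to minimise."""
    generative = np.mean(elbo_unlabelled) + np.mean(elbo_labelled)
    discriminative = np.mean(log_p_y_given_x)
    return -(generative + alpha * discriminative)

loss = blended_loss(
    elbo_unlabelled=np.array([-120.0, -115.0]),  # per-sample ELBOs (nats)
    elbo_labelled=np.array([-110.0]),
    log_p_y_given_x=np.array([-0.2, -0.4]),      # classifier log-probs
)
```

The unlabelled data only enters through the generative term, which is how such models learn from it, while the discriminative term sharpens predictions on the labelled subset.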
Lifelong Teacher-Student Network Learning. Ye, Fei; Bors, Adrian G. IEEE Transactions on Pattern Analysis and Machine Intelligence, October 1, 2022, Volume 44, Issue 10. Journal article, peer reviewed, open access.
A unique cognitive capability of humans is their ability to acquire new knowledge and skills from a sequence of experiences. Artificial intelligence systems, in contrast, are good at learning only the most recently given task, without being able to remember the databases learnt in the past. We propose a novel lifelong learning methodology employing a Teacher-Student network framework. While the Student module is trained on a newly given database, the Teacher module reminds the Student of the information learnt in the past. The Teacher, implemented by a Generative Adversarial Network (GAN), is trained to preserve and replay past knowledge corresponding to the probabilistic representations of previously learnt databases. The Student module, implemented by a Variational Autoencoder (VAE), infers its latent variable representation from both the output of the Teacher module and the newly available database. Moreover, the Student module is trained to capture both continuous and discrete underlying data representations across different domains. The proposed lifelong learning framework is applied in supervised, semi-supervised and unsupervised training.
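The replay mechanism can be sketched as follows: when a new database arrives, the Student trains on a mixture of new data and Teacher-generated samples of past tasks. A Gaussian sampler stands in for the trained GAN Teacher; the sizes and mixing ratio are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

def teacher_replay(n):
    """Stand-in for the GAN Teacher replaying past-task pseudo-samples;
    a real Teacher would generate them from its learnt representation."""
    return rng.normal(loc=-1.0, size=(n, 4))

def build_training_batch(new_data, replay_fraction=0.5):
    """Mix new-task data with replayed pseudo-samples of old tasks, so
    the Student sees past knowledge while learning the new database."""
    n_replay = int(len(new_data) * replay_fraction)
    return np.vstack([new_data, teacher_replay(n_replay)])

new_task = rng.normal(loc=1.0, size=(64, 4))
batch = build_training_batch(new_task)
```

Training the Student VAE on such mixed batches is what prevents catastrophic forgetting without storing the original past databases.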
•Energy disaggregation is performed using a variational autoencoder framework.
•Skip connections have been introduced to the NILM model to enhance load reconstruction performance.
•The performance has been evaluated on two public datasets (UK-DALE and REFIT).
•Tests have been conducted on houses held out during training.
•On average, the proposed approach outperforms five state-of-the-art algorithms.
Non-intrusive load monitoring (NILM) is a technique that uses a single sensor to measure the total power consumption of a building. Using an energy disaggregation method, the consumption of individual appliances can be estimated from the aggregate measurement. Recent disaggregation algorithms have significantly improved the performance of NILM systems. However, the generalization of these methods to unseen houses and the disaggregation of multi-state appliances remain major challenges. In this paper, we address these issues and propose an energy disaggregation approach based on the variational autoencoder framework. The probabilistic encoder makes this approach an efficient model for encoding information relevant to the reconstruction of the target appliance consumption. In particular, the proposed model accurately generates more complex load profiles, thus improving the power signal reconstruction of multi-state appliances. Moreover, its regularized latent space improves the generalization capabilities of the model across different houses. The proposed model is compared to state-of-the-art NILM approaches on the UK-DALE and REFIT datasets and yields competitive results: the mean absolute error is reduced by 18% on average across all appliances compared to the state-of-the-art, and the F1-score increases by more than 11%, showing improved detection of the target appliance in the aggregate measurement.
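The two reported metrics can be sketched directly: mean absolute error on the disaggregated power signal, and F1-score on appliance on/off state (power above a threshold). The signals, appliance, and 10 W threshold below are invented for illustration.

```python
import numpy as np

def mae(pred, truth):
    """Mean absolute error between estimated and true appliance power."""
    return float(np.mean(np.abs(pred - truth)))

def on_off_f1(pred, truth, threshold=10.0):
    """F1-score on on/off state, where 'on' means power above threshold."""
    p, t = pred > threshold, truth > threshold
    tp = np.sum(p & t)
    fp = np.sum(p & ~t)
    fn = np.sum(~p & t)
    precision = tp / max(tp + fp, 1)
    recall = tp / max(tp + fn, 1)
    return float(2 * precision * recall / max(precision + recall, 1e-12))

truth = np.array([0.0, 0.0, 2000.0, 2000.0, 0.0])   # kettle ground truth (W)
pred = np.array([5.0, 0.0, 1900.0, 2050.0, 300.0])  # disaggregated estimate
error = mae(pred, truth)
f1 = on_off_f1(pred, truth)
```

MAE measures how well the load profile is reconstructed, while F1 captures whether appliance activations are detected at all, which is why the paper reports both.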