Learning disentangled and interpretable representations is an important aspect of information understanding. In this paper, we propose a novel deep learning model representing both discrete and continuous latent variable spaces, which can be used in either supervised or unsupervised learning. The proposed model is trained using an optimization function employing the mutual information maximization criterion. For the unsupervised learning setting, we define a lower bound on the mutual information between the joint distributions of the latent variables corresponding to the real data and those generated by the model. Maximizing this lower bound during training induces the learning of disentangled and interpretable data representations. Such representations can be used for attribute manipulation and image editing tasks.
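As an illustration of the kind of criterion involved, a standard variational lower bound on the mutual information between a discrete code c and the data x is I(c; x) ≥ H(c) + E[log q(c|x)]. The sketch below computes this bound from samples; the bound over joint latent distributions used in the paper may differ, and all names here are illustrative:

```python
import numpy as np

def mi_lower_bound(prior, q, true_codes):
    """Variational lower bound I(c; x) >= H(c) + E[log q(c|x)].
    prior:      (K,) distribution over the discrete code c.
    q:          (N, K) approximate posterior rows q(.|x_i).
    true_codes: (N,) code actually used to generate each x_i.
    This is the standard InfoGAN-style bound, shown as an
    illustration only."""
    entropy = -np.sum(prior * np.log(prior))          # H(c)
    log_q = np.log(q[np.arange(len(true_codes)), true_codes])
    return entropy + log_q.mean()                     # H(c) + E[log q(c|x)]
```

With a uniform prior and a completely uninformative posterior, the bound is zero; as q(c|x) concentrates on the true code, the bound approaches H(c).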
We propose a new class of physics-informed neural networks, called the Physics-Informed Variational Auto-Encoder (PI-VAE), to solve stochastic differential equations (SDEs) or inverse problems involving SDEs. In these problems the governing equations are known, but only a limited number of measurements of the system parameters are available. PI-VAE consists of a variational autoencoder (VAE) that generates samples of system variables and parameters. This generative model is integrated with the governing equations; in this integration, the derivatives of the VAE outputs are readily calculated using automatic differentiation and used in the physics-based loss term. In this work, the loss function is chosen to be the Maximum Mean Discrepancy (MMD) for improved performance, and the neural network parameters are updated iteratively using the stochastic gradient descent algorithm. We first test the proposed method on approximating stochastic processes. Then we study three types of problems related to SDEs: forward and inverse problems, together with mixed problems in which system parameters and solutions are calculated simultaneously. The satisfactory accuracy and efficiency of the proposed method are demonstrated numerically in comparison with the physics-informed Wasserstein generative adversarial network (PI-WGAN).
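The MMD loss mentioned above compares two sets of samples through kernel mean embeddings. A minimal numpy sketch of the (biased) squared-MMD estimator with an RBF kernel, using an illustrative bandwidth rather than the paper's settings:

```python
import numpy as np

def mmd2_rbf(X, Y, sigma=1.0):
    """Biased estimator of squared Maximum Mean Discrepancy with an RBF
    kernel k(a, b) = exp(-||a - b||^2 / (2 sigma^2)). X, Y: (n, d) sample
    arrays. Sketch of the distribution-matching loss; the bandwidth
    sigma is an illustrative choice."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    # MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)]
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()
```

Identical sample sets give zero; samples drawn from well-separated distributions give a strictly larger value, which is what makes the estimator usable as a training loss.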
Identifying drug-target interactions is a key step in drug discovery. Many computational methods have been proposed to directly determine whether drugs and targets can interact. Drug-target binding affinity is another type of data, which quantifies the strength of the binding interaction between a drug and a target. However, predicting drug-target binding affinity is more challenging, and thus very few studies follow this line. In our work, we propose a novel co-regularized variational autoencoder (Co-VAE) model to predict drug-target binding affinity based on drug structures and target sequences. The Co-VAE model consists of two VAEs for generating drug SMILES strings and target sequences, respectively, and a co-regularization part for generating the binding affinities. We theoretically prove that the Co-VAE model maximizes a lower bound of the joint likelihood of the drug, the protein, and their affinity. Co-VAE can predict drug-target affinity and generate new drugs that share similar targets with the input drugs. Experimental results on two datasets show that Co-VAE predicts drug-target affinity better than existing affinity prediction methods such as DeepDTA and DeepAffinity, and generates more new valid drugs than existing methods such as GAN and VAE.
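The joint objective described above can be pictured as the sum of the two VAE losses plus a co-regularization (affinity) term. A hypothetical sketch, in which the squared-error affinity term and the weight `lam` are illustrative assumptions rather than the paper's exact formulation:

```python
import numpy as np

def co_vae_loss(recon_drug, kl_drug, recon_target, kl_target,
                affinity_pred, affinity_true, lam=1.0):
    """Hypothetical combined objective: two (negative) ELBOs plus a
    co-regularization term tying the latent codes to the observed
    binding affinity. All weights and the MSE form are illustrative."""
    affinity_term = np.mean(
        (np.asarray(affinity_pred) - np.asarray(affinity_true)) ** 2)
    return ((recon_drug + kl_drug)        # drug VAE negative ELBO
            + (recon_target + kl_target)  # target VAE negative ELBO
            + lam * affinity_term)        # co-regularization
```

Minimizing this combined loss corresponds to maximizing a lower bound of the joint likelihood, which is the property the abstract states is proved theoretically.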
In this paper, we propose a new method to perform data augmentation in a reliable way in the High Dimensional Low Sample Size (HDLSS) setting using a geometry-based variational autoencoder (VAE). Our approach combines 1) a new VAE model, whose latent space is modeled as a Riemannian manifold and which combines Riemannian metric learning with normalizing flows, and 2) a new generation scheme which produces more meaningful samples, especially in the context of small data sets. The method is tested through a wide experimental study in which its robustness to data sets, classifiers, and training sample sizes is stressed. It is also validated on a medical imaging classification task on the challenging ADNI database, where a small number of 3D brain magnetic resonance images (MRIs) are considered and augmented using the proposed VAE framework. In each case, the proposed method allows for a significant and reliable gain in the classification metrics. For instance, balanced accuracy jumps from 66.3% to 74.3% for a state-of-the-art convolutional neural network classifier trained with 50 MRIs of cognitively normal (CN) subjects and 50 Alzheimer's disease (AD) patients, and from 77.7% to 86.3% when trained with 243 CN and 210 AD, while greatly improving sensitivity and specificity.
In this paper, we present a learning scheme for Joint Source-Channel Coding (JSCC) over analog independent additive noise channels. We formulate the learning problem by showing that the minimization loss function from rate-distortion theory is upper bounded by the loss function of the Variational Autoencoder (VAE). We show that when the source dimension is greater than the channel dimension, the encodings of two source samples that lie in each other's neighborhood need not be near each other. Such a discontinuous projection must be accounted for by using multiple encoders and selecting an encoder to encode samples on a particular side of the discontinuity. We explore two selection methodologies: one based on an intuitive rule, and one where selection is posed as a learning task in a Mixture-of-Experts (MoE) setup. We analyze the gradients of these methods and explain why the latter is better at avoiding local optima. We demonstrate the efficacy of the proposed methodology by simulating the performance of the system for JSCC of Gaussian sources over AWGN channels, showing that the learned solutions are close to or better than those proposed earlier. The proposed methodology also naturally generalizes to other source distributions, which we showcase by simulating for Laplace sources. The learned systems are also robust to changes in channel conditions. Further, a single system can be trained to generalize over a range of channel conditions, provided the channel conditions are known at both the transmitter and the receiver. Finally, we evaluate our proposed methodology on three different image datasets and showcase consistent improvement over existing methods due to the VAE formulation.
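The "intuitive rule" for encoder selection can be sketched as follows: encode a sample with every expert, decode, and keep the expert with the lowest distortion. The linear encoder/decoder matrix pairs below are placeholders for the trained networks, not the paper's architecture:

```python
import numpy as np

def select_encoder(x, experts):
    """experts: list of (E, D) encoder/decoder matrix pairs, where E maps
    the source vector to the (lower-dimensional) channel space and D maps
    it back. Returns the index of the expert with the lowest squared
    distortion on x, mimicking a minimum-distortion selection rule."""
    errs = [np.sum((x - D @ (E @ x)) ** 2) for E, D in experts]
    return int(np.argmin(errs))
```

With two experts, each preserving one coordinate of a 2-D source over a 1-D channel, the rule routes each sample to the expert that preserves its dominant coordinate, i.e. each expert handles one side of the discontinuity.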
The formation of distorted lamellar phases, distinguished by their arrangement of crumpled, stacked layers, is frequently accompanied by the disruption of long-range order, leading to the formation of interconnected network structures commonly observed in the sponge phase. Nevertheless, traditional scattering functions grounded in deterministic modeling fall short of fully representing these intricate structural characteristics. Our hypothesis posits that a deep learning method, in conjunction with the generalized leveled wave approach used for describing the structural features of distorted lamellar phases, can quantitatively unveil the inherent spatial correlations within these phases.
This report outlines a novel strategy that integrates convolutional neural networks and variational autoencoders, supported by stochastically generated density fluctuations, into a regression analysis framework for extracting structural features of distorted lamellar phases from small angle neutron scattering data. To evaluate the efficacy of our proposed approach, we conducted computational accuracy assessments and applied it to the analysis of experimentally measured small angle neutron scattering spectra of AOT surfactant solutions, a frequently studied lamellar system.
The findings unambiguously demonstrate that deep learning provides a dependable and quantitative approach for investigating the morphology of a wide range of distorted lamellar phases. It is adaptable for deciphering structures from the lamellar to the sponge phase, including intermediate structures exhibiting fused topological features. This research highlights the effectiveness of deep learning methods in tackling complex issues in the field of soft matter structural analysis and beyond.
With cyberattacks growing in frequency and sophistication, effective anomaly detection is critical for securing networks and systems. This study provides a comparative evaluation of deep generative models for detecting anomalies in network intrusion data. The key objective is to determine the most accurate model architecture. Variational autoencoders (VAEs), VAE-GANs, and adversarial autoencoders (AAEs) are tested on the NSL-KDD dataset containing normal traffic and different attack types. Results show that AAEs significantly outperform VAEs and VAE-GANs, achieving AUC scores up to 0.96 and F1 scores of 0.76 on novel attacks. The adversarial regularization of AAEs enables superior generalization capabilities compared to standard VAEs. VAE-GANs exhibit better accuracy than VAEs, demonstrating the benefits of adversarial training. However, VAE-GANs have higher computational requirements. The findings provide strong evidence that AAEs are the most effective deep anomaly detection technique for intrusion detection systems. This study delivers novel insights into optimizing deep learning architectures for cyber defense. The comparative evaluation methodology and results will aid researchers and practitioners in selecting appropriate models for operational network security.
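A common way to score anomalies with autoencoder-style models is by reconstruction error, evaluated with ROC AUC as reported above. A self-contained rank-based (Mann-Whitney) AUC computation, assuming higher scores indicate anomalies and no tied scores:

```python
import numpy as np

def auc_from_scores(scores, labels):
    """ROC AUC via the Mann-Whitney U statistic.
    scores: anomaly scores (higher = more anomalous), e.g. per-sample
    reconstruction error from a VAE/AAE; labels: 1 = attack, 0 = normal.
    Assumes untied scores for simplicity."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    order = scores.argsort()
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)   # rank 1 = lowest score
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    # AUC = (sum of positive ranks - minimal possible sum) / (n_pos * n_neg)
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

An AUC of 1.0 means every attack scores above every normal sample; 0.5 is chance level, which is why AUC is a natural threshold-free metric for comparing the model families above.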
•We combined a dual-VAE structure with a GAN to build the D-Vae/Gan framework.
•GAN-based inter-modality knowledge distillation was introduced for feature learning.
•The model training process was divided into cascade stages via a three-stage strategy.
•Reconstructions on four fMRI datasets were objectively and subjectively identifiable.
Reconstructing a perceived stimulus (image) solely from human brain activity measured with functional Magnetic Resonance Imaging (fMRI) is a significant task in brain decoding. However, the inconsistent distribution and representation between fMRI signals and visual images create a large 'domain gap'. Moreover, the limited fMRI data instances generally suffer from low signal-to-noise ratio (SNR), extremely high dimensionality, and limited spatial resolution. Existing methods are often affected by these issues, so a satisfactory reconstruction remains an open problem. In this paper, we show that it is possible to obtain a promising solution by learning visually-guided latent cognitive representations from the fMRI signals and inversely decoding them to the image stimuli. The resulting framework is called Dual-Variational Autoencoder/Generative Adversarial Network (D-Vae/Gan), which combines the advantages of adversarial representation learning with knowledge distillation. In addition, we introduce a novel three-stage learning strategy which enables the (cognitive) encoder to gradually distill useful knowledge from the paired (visual) encoder during the learning process. Extensive experimental results on both artificial and natural images demonstrate that our method achieves surprisingly good results and outperforms the available alternatives.
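One plausible form of the distillation step described above is a feature-matching term that pulls the cognitive (fMRI) encoder's latent code toward the paired visual encoder's code while preserving reconstruction quality. The weight `alpha` and the MSE forms below are illustrative assumptions, not the paper's exact losses:

```python
import numpy as np

def distillation_loss(z_cognitive, z_visual, recon, target, alpha=0.5):
    """Sketch of a feature-level knowledge-distillation objective:
    z_cognitive: latent code from the fMRI (student) encoder.
    z_visual:    latent code from the paired visual (teacher) encoder,
                 treated as fixed during this stage.
    recon/target: reconstructed and ground-truth images.
    alpha trades off feature matching against reconstruction; both
    terms and the weighting are illustrative."""
    match = np.mean((z_cognitive - z_visual) ** 2)   # distill teacher code
    rec = np.mean((recon - target) ** 2)             # keep reconstruction
    return alpha * match + (1.0 - alpha) * rec
```

In a staged schedule of the kind the abstract describes, the teacher code would come from an encoder trained on images first, with the cognitive encoder fitted against it in a later stage.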
Modal-decomposition techniques are data-driven computational frameworks aimed at identifying a low-dimensional space that captures the dominant flow features: the so-called modes. We propose a deep probabilistic-neural-network architecture for learning a minimal and near-orthogonal set of non-linear modes from high-fidelity turbulent-flow data, useful for flow analysis, reduced-order modeling, and flow control. Our approach is based on β-variational autoencoders (β-VAEs) and convolutional neural networks (CNNs), which enable extracting non-linear modes from multi-scale turbulent flows while encouraging the learning of independent latent variables and penalizing the size of the latent vector. Moreover, we introduce an algorithm for ordering VAE-based modes with respect to their contribution to the reconstruction. We apply this method to non-linear mode decomposition of the turbulent flow through a simplified urban environment, where the flow-field data are obtained from well-resolved large-eddy simulations (LESs). We demonstrate that by constraining the shape of the latent space, it is possible to promote orthogonality and extract a set of parsimonious modes sufficient for high-quality reconstruction. Our results show the excellent reconstruction performance of the method compared with linear-theory-based decompositions: the energy percentage captured by the proposed method with five modes is 87.36%, against 32.41% for proper orthogonal decomposition (POD). Moreover, we compare our method with available AE-based models and show the ability of our approach to extract near-orthogonal modes, with the determinant of the correlation matrix equal to 0.99, which may lead to improved interpretability.
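The mode-ordering idea can be sketched as an ablation: zero out one latent component ("mode") at a time and measure the increase in reconstruction error. The decoder below is a stand-in for the trained CNN decoder, and the exact ranking criterion used in the paper may differ:

```python
import numpy as np

def rank_modes(Z, decode, X):
    """Order latent dimensions by their contribution to the reconstruction.
    Z: (n, k) latent codes; decode: latent -> data mapping (placeholder
    for the trained decoder); X: (n, d) reference data.
    Zeroing a high-contribution mode degrades the reconstruction most."""
    base = np.mean((decode(Z) - X) ** 2)
    contrib = []
    for j in range(Z.shape[1]):
        Zj = Z.copy()
        Zj[:, j] = 0.0                               # ablate mode j
        contrib.append(np.mean((decode(Zj) - X) ** 2) - base)
    return np.argsort(contrib)[::-1]                 # most important first
```

For a linear decoder this ranking reduces to ordering modes by captured variance, analogous to the energy ordering of POD modes that the abstract compares against.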
•Learning a minimal and near-orthogonal set of non-linear modes from turbulent flows.
•Based on variational autoencoders (VAEs) and convolutional neural networks (CNNs).
•Ranking VAE-based modes with respect to their contribution to the reconstruction.
•Leading to the extraction of interpretable non-linear modes.