Advances in Variational Inference
Zhang, Cheng; Butepage, Judith; Kjellstrom, Hedvig ...
IEEE Transactions on Pattern Analysis and Machine Intelligence, 08/2019, Volume 41, Issue 8
Journal Article
Peer reviewed
Open access
Many modern unsupervised or semi-supervised machine learning algorithms rely on Bayesian probabilistic models. These models are usually intractable and thus require approximate inference. Variational inference (VI) lets us approximate a high-dimensional Bayesian posterior with a simpler variational distribution by solving an optimization problem. This approach has been successfully applied to various models and large-scale applications. In this review, we give an overview of recent trends in variational inference. We first introduce standard mean field variational inference, then review recent advances focusing on the following aspects: (a) scalable VI, which includes stochastic approximations; (b) generic VI, which extends the applicability of VI to a large class of otherwise intractable models, such as non-conjugate models; (c) accurate VI, which includes variational models beyond the mean field approximation or with atypical divergences; and (d) amortized VI, which implements the inference over local latent variables with inference networks. Finally, we provide a summary of promising future research directions.
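As a minimal illustration of the optimization at the heart of VI, the sketch below fits a Gaussian variational distribution q(z) = N(mu, s²) to a toy target posterior using reparameterized ("pathwise") stochastic gradients of the ELBO. The target posterior, learning rate, and step count are illustrative choices, not taken from the review.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy unnormalized log-posterior: log p(z) = -(z - 2)^2 / (2 * 0.5^2) + const,
# i.e. the true posterior is N(2, 0.5^2).
def dlogp(z):
    return -(z - 2.0) / 0.25  # d/dz log p(z)

mu, rho = 0.0, 0.0            # variational parameters; s = exp(rho) > 0
lr, n_samples = 0.01, 64

for _ in range(4000):
    s = np.exp(rho)
    eps = rng.standard_normal(n_samples)
    z = mu + s * eps                      # reparameterization trick
    g = dlogp(z)
    grad_mu = g.mean()                    # pathwise gradient of E_q[log p(z)]
    grad_s = (g * eps).mean() + 1.0 / s   # plus entropy gradient d/ds log s
    mu += lr * grad_mu
    rho += lr * grad_s * s                # chain rule through s = exp(rho)

s = np.exp(rho)
```

With these settings the iterates approach the true posterior mean 2.0 and standard deviation 0.5; because the gradient flows through the sample z = mu + s·eps rather than through the density, the same pattern scales to the stochastic and amortized variants surveyed above.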
The most puzzling aspect of the ‘strange metal’ behavior of correlated electron compounds is that the linear in temperature resistivity often extends down to low temperatures, lower than natural microscopic energy scales. We consider recently proposed deconfined critical points (or phases) in models of electrons in large dimension lattices with random nearest-neighbor exchange interactions. The criticality is in the class of Sachdev–Ye–Kitaev models, and exhibits a time reparameterization soft mode representing gravity in dual holographic theories. We compute the low temperature resistivity in a large M limit of models with SU(M) spin symmetry, and find that the dominant temperature dependence arises from this soft mode. The resistivity is linear in temperature down to zero temperature at the critical point, with a coefficient universally proportional to the product of the residual resistivity and the coefficient of the linear in temperature specific heat. We argue that the time reparameterization soft mode offers a promising and generic mechanism for resolving the strange metal puzzle.
Full text
Available for:
GEOZS, IJS, IMTLJ, KILJ, KISLJ, NLZOH, NUK, OILJ, PNG, SAZU, SBCE, SBJE, UILJ, UL, UM, UPCLJ, UPUK, ZAGLJ, ZRSKP
Recent research has successfully adapted vision-based convolutional neural network (CNN) architectures for audio recognition tasks using Mel-Spectrograms. However, these CNNs have high computational costs and memory requirements, limiting their deployment on low-end edge devices. Motivated by the success of efficient vision models like InceptionNeXt and ConvNeXt, we propose AudioRepInceptionNeXt, a single-stream architecture. Its basic building block breaks down the parallel multi-branch depth-wise convolutions with descending scales of k×k kernels into a cascade of two multi-branch depth-wise convolutions. The first multi-branch consists of parallel multi-scale 1×k depth-wise convolutional layers, followed by a similar multi-branch employing parallel multi-scale k×1 depth-wise convolutional layers. This reduces the computational and memory footprint while separating the time and frequency processing of Mel-Spectrograms. The large kernels capture global frequencies and long activities, while the small kernels capture local frequencies and short activities. We also reparameterize the multi-branch design during inference to further boost speed without losing accuracy. Experiments show that AudioRepInceptionNeXt reduces parameters and computations by more than 50% and improves inference speed by 1.28× over state-of-the-art CNNs such as Slow-Fast while maintaining comparable accuracy. It also learns robustly across a variety of audio recognition tasks.
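The parameter saving from the factorization described above can be checked with simple arithmetic: a k×k depth-wise kernel costs k² parameters per channel, while the cascaded 1×k followed by k×1 branches cost 2k. The kernel sizes below are illustrative, not the paper's exact configuration.

```python
# Per-channel parameter count: parallel square k x k depth-wise branches
# vs. the cascaded 1 x k then k x 1 factorization. Sizes are hypothetical.
kernel_sizes = [3, 7, 11]

square_params = sum(k * k for k in kernel_sizes)   # parallel k x k branches
cascade_params = sum(k for k in kernel_sizes) * 2  # 1 x k branches + k x 1 branches

reduction = 1 - cascade_params / square_params
print(square_params, cascade_params, round(reduction, 3))
```

For these sizes the cascade needs 42 parameters per channel versus 179 for the square kernels, a reduction well above the 50% figure quoted in the abstract; the gap widens as the largest kernel grows.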
The Generative Adversarial Network (GAN) is a thriving generative model, and considerable effort has been made to enhance its generation capability by redesigning the adversarial framework (e.g., the discriminator and the generator) or the penalty function. Although existing models have been demonstrated to be very effective, their generation capabilities have limitations. Existing GAN variants either produce identical generated instances or generate low-quality simulation data when the training data are diverse and extremely limited (a dataset consists of a set of classes but each class holds several or even one single sample) or extremely imbalanced (one category holds a set of samples while the other categories hold a single sample each). In this paper, we present an approach that uses a joint distribution together with the reparameterization method to reparameterize the randomized space as a mixture model and to learn the parameters of this mixture model jointly with those of the GAN. Accordingly, we term our approach Joint Distribution GAN (JDGAN). In our work, we show that the JDGAN can not only generate high-quality simulation data with diversity, but also increase the overlapping area between the generating distribution and the raw data distribution. We proceed to conduct extensive experiments, utilizing the MNIST, CIFAR10 and Mass Spectrometry datasets, all using extremely limited amounts of data, to demonstrate the significant performance of JDGAN in both achieving the smallest Fréchet Inception Distance (FID) score and producing diverse generated data.
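The key idea, reparameterizing the randomized input space as a mixture model, can be sketched in isolation: draw a component index, then apply the standard reparameterization within that component, so samples cover multiple modes instead of collapsing to one. The component means, weights, and scales below are arbitrary placeholders, not JDGAN's learned values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-component Gaussian mixture over the latent space.
weights = np.array([0.5, 0.5])
means = np.array([-3.0, 3.0])
stds = np.array([0.5, 0.5])

n = 1000
k = rng.choice(len(weights), size=n, p=weights)  # component assignments
eps = rng.standard_normal(n)
z = means[k] + stds[k] * eps                     # reparameterized draw per component

# Fraction of samples near each mode: both modes are populated.
near_neg = np.mean(np.abs(z + 3.0) < 1.5)
near_pos = np.mean(np.abs(z - 3.0) < 1.5)
```

Because z is an affine function of eps given the component, gradients with respect to the mixture parameters can flow through the samples during training, which is what lets the mixture be learned jointly with the GAN.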
In recent times, there has been notable and swift advancement in the field of image dehazing. Several deep learning techniques have demonstrated remarkable proficiency in resolving homogeneous dehazing issues. Nonetheless, current dehazing approaches are generally formulated to deal with homogeneous haze, an assumption often violated in real-world scenarios due to uncertain haze dispersion. In this paper, we propose a dehazing model named RepDehazeNet that combines a structurally reparameterized encoder-decoder subnet with a full-resolution attention subnet. Specifically, the structural reparameterization idea is introduced into the encoder-decoder subnet to strengthen the feature extraction of dehazed images and improve the feature extraction speed. RepDehazeNet is compared with seven SOTA models on different datasets in terms of PSNR, SSIM, parameter count, and inference time. Compared to the DW-GAN model, the proposed RepDehazeNet model reduces the number of parameters by 2.7 million and improves inference speed by 90.3%, while achieving a PSNR 0.5 dB higher on the NH-Haze2021 dataset. The experimental results demonstrate that the proposed RepDehazeNet model can effectively improve real-time performance and accuracy when dehazing both synthesized and nonhomogeneous haze images.
•Structural reparameterization dehazenet: outstanding performance, faster speed.
•Replacing Tanh with ReLU leads to better results.
•Transfer learning addresses the problem of insufficient samples.
•Dual subnets method proves highly effective in datasets of different scales.
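Structural reparameterization, as used here and in AudioRepInceptionNeXt above, relies on the linearity of convolution: parallel training-time branches can be summed into a single inference-time kernel. The 1-D sketch below (kernel values are arbitrary) fuses a 3-tap branch, a point-wise branch, and an identity skip into one 3-tap kernel and checks that the outputs match.

```python
import numpy as np

def conv1d_same(x, k):
    """'Same' cross-correlation with zero padding (odd kernel length)."""
    r = len(k) // 2
    xp = np.pad(x, r)
    return np.array([np.dot(xp[i:i + len(k)], k) for i in range(len(x))])

rng = np.random.default_rng(0)
x = rng.standard_normal(16)

k3 = np.array([0.2, -0.5, 0.3])  # 3-tap branch (arbitrary weights)
k1 = np.array([0.7])             # point-wise (1-tap) branch

# Training-time: sum of parallel branches (3-tap + point-wise + identity skip).
y_branches = conv1d_same(x, k3) + conv1d_same(x, k1) + x

# Inference-time: fuse all branches into one 3-tap kernel by zero-padding the
# point-wise kernel and writing the identity as a centered delta.
fused = k3 + np.array([0.0, 0.7, 0.0]) + np.array([0.0, 1.0, 0.0])
y_fused = conv1d_same(x, fused)
```

The fused kernel is mathematically identical to the branched form, which is why the multi-branch structure can improve training without any inference-time cost.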
•Using a Bayesian classifier reparameterization for classifier uncertainty analysis.
•A label augmentation strategy models the distribution for label uncertainty.
•Our strategies significantly improve the backbones under different SGG tasks.
As a challenging task in computer vision, scene graph generation (SGG) aims to model the underlying semantic relationships among objects in a given image for scene understanding. Due to their increasing scale and subjectivity, the annotations of existing SGG benchmarks inevitably suffer from uncertainty issues, making it hard for models to learn the relationships comprehensively. In this work, we address the uncertainty from the perspectives of both classifier parameters and relationship labels. On one hand, we handle the classifier uncertainty by learning a Bayesian classifier reparameterization, whose weights are sampled from a latent space spanned by a prior distribution. On the other hand, we assume that each relationship label is sampled from a latent label space and mitigate the label uncertainty by estimating the latent relationship distribution. As a result, the distribution of the classifier parameters is comprehensively learned under the supervision of the estimated relationship labels, thus improving the model’s generalization ability. Experimental results on the popular benchmark demonstrate that the proposed strategies significantly improve different baseline models on different SGG tasks.
•Modeling the uncertainty of attention modules.
•Improving the generalization ability of attention models.
•Mitigating the degradation issue that appears in the reparameterized attention.
•Improving the image classification performance of different attention models on different datasets consistently.
The attention mechanism has been widely explored for neural networks, as it can effectively model the interdependencies among channels, spatial positions, and frames. A neural network with attention modules has uncertainty in its parameters, but deterministic training hardly captures it. Modeling the parameter uncertainty of the attention module helps flexibly capture representative patterns, thus promoting the generalization of the models. In this work, we propose a novel reparameterized attention strategy that models the uncertainty of the parameters in the attention module and performs uncertainty-aware optimization. Instead of learning deterministic parameters for the attention modules, our strategy learns variational posterior distributions. The experimental results show that our strategy consistently improves different models’ accuracy and reduces the generalization gap without extra computation.
The latent Dirichlet allocation model (LDA) has been widely used in topic modeling. Recent works have shown the effectiveness of integrating neural network mechanisms with this generative model for learning text representations. However, a significant limitation of LDA is its Dirichlet prior, whose covariance structure is restrictive: all its variables are assumed to be negatively correlated. In practice, topics can be positively or negatively correlated. To address this problem, we propose a generalized Dirichlet variational autoencoder (GD-VAE) for topic modeling. The Generalized Dirichlet (GD) distribution has a more general covariance structure than the Dirichlet distribution because it accounts for both positively and negatively correlated topics in the corpus. Our proposed model leverages rejection sampling variational inference using a reparameterization trick for effective training. GD-VAE compares favorably to recent works on topic models on several benchmark corpora. Experiments show that accounting for topics’ positive and negative correlations results in better performance. We further validate the superiority of our proposed framework on two image data sets. GD-VAE demonstrates its significance as an integral part of a classification architecture. For reproducibility and further research purposes, code for this work can be found at https://github.com/hormone03/GD-VAE.
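The Generalized Dirichlet distribution admits a stick-breaking construction from independent Beta draws (the Connor-Mosimann construction), which is what makes it amenable to reparameterized training. The sketch below samples topic proportions this way and checks that they lie on the simplex; the shape parameters are arbitrary, not fitted values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary Generalized Dirichlet shape parameters for K + 1 = 5 topics.
a = np.array([2.0, 1.5, 3.0, 0.8])
b = np.array([1.0, 2.5, 0.5, 1.2])

def sample_gd(a, b, rng):
    """Stick-breaking sample: u_i ~ Beta(a_i, b_i); break off fraction u_i
    of the remaining stick at each step; the last topic takes the leftover."""
    u = rng.beta(a, b)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - u)))
    x = u * remaining[:-1]                      # first K proportions
    return np.concatenate((x, remaining[-1:]))  # leftover stick = last topic

theta = sample_gd(a, b, rng)
```

Because the (a_i, b_i) pairs are free per dimension rather than tied as in the Dirichlet, the construction can induce positive as well as negative correlations between topic proportions.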
•We propose GD-VAE to capture correlations and learn complex distributions.
•We show that capturing all correlations leads to improved performance in GD-VAE.
•We address training instability by introducing a weighted objective function.
•The comprehensive experiments show that GD-VAE outperformed state-of-the-art models.
•We demonstrate the effectiveness of GD-VAE on data augmentation with image data sets.
In the context of medical image segmentation, the segmentation of polyps holds significant importance for early cancer screening and preoperative planning. Polyp images encapsulate substantial semantic information, including the main body of the colon, internal wall folds, and the polyp itself. These components exhibit numerous similar features, while the characteristics of polyps vary significantly across locations. This paper proposes an improved U-Net neural network model designed to address the imbalance between segmentation accuracy and generalization capability in existing models. Incorporating a reparameterization module into the backbone network integrates multiscale features, while adopting a training-prediction separation pattern ensures the accuracy of the model. To avoid compromising global spatial information while enhancing global perceptual capability, we employ a convolutional block attention module to compensate for the feature loss introduced by skip connections. Additionally, we devise a loss computation method specific to this model, named CDLoss, to achieve more effective gradient optimization and enhance the model’s ability to segment polyp boundaries. Our model undergoes comprehensive validation on the Kvasir-SEG and CVC-ClinicDB datasets. The segmentation Intersection over Union (IoU) and Dice values reach 96.56% and 98.16%, respectively. The generalization capability achieves 98.57% and 97.33% on CVC-ClinicDB, surpassing other current state-of-the-art polyp image segmentation models.
•The reparameterization module helps balance the model’s extraction of detailed and broad features.
•The skip-connected CBAM addresses semantic gaps in multiscale polyp features, enhancing RCNU-Net’s polyp segmentation with richer content.
•The CDLoss function merges pixel-level classification accuracy and target overlap measurement to enhance model generalization.
•Experimental results demonstrate the superiority of this model over previous research in polyp segmentation.
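The abstract does not define CDLoss; a common way to merge pixel-level classification accuracy with target-overlap measurement, as the highlights describe, is a weighted sum of binary cross-entropy and Dice loss. The sketch below shows that generic combination and is not necessarily the authors' exact formulation.

```python
import numpy as np

def bce_dice_loss(pred, target, w=0.5, eps=1e-7):
    """Weighted sum of binary cross-entropy (pixel accuracy) and Dice loss
    (region overlap). pred: foreground probabilities; target: binary mask.
    A generic CE + Dice combination, not necessarily the paper's CDLoss."""
    p = np.clip(pred, eps, 1 - eps)
    bce = -np.mean(target * np.log(p) + (1 - target) * np.log(1 - p))
    inter = np.sum(p * target)
    dice = 1 - (2 * inter + eps) / (np.sum(p) + np.sum(target) + eps)
    return w * bce + (1 - w) * dice

target = np.array([0.0, 0.0, 1.0, 1.0])
good = bce_dice_loss(np.array([0.05, 0.1, 0.9, 0.95]), target)
bad = bce_dice_loss(np.array([0.9, 0.8, 0.2, 0.1]), target)
```

The Dice term rewards overlap with the target region regardless of class imbalance, while the cross-entropy term keeps per-pixel gradients informative, which is why such combinations are popular for boundary-sensitive tasks like polyp segmentation.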
Physics-based differentiable rendering is becoming increasingly crucial for tasks in inverse rendering and machine learning pipelines. To address discontinuities caused by geometric boundaries and occlusion, two classes of methods have been proposed: 1) edge-sampling methods that directly sample light paths at the scene discontinuity boundaries, which require nontrivial data structures and precomputation to select the edges, and 2) reparameterization methods that avoid discontinuity sampling but are currently limited to hemispherical integrals and unidirectional path tracing. We introduce a new mathematical formulation that enjoys the benefits of both classes of methods. Unlike previous reparameterization work that focused on hemispherical integrals, we derive the reparameterization in the path space. As a result, to estimate derivatives using our formulation, we can apply advanced Monte Carlo rendering methods, such as bidirectional path tracing, while avoiding explicit sampling of discontinuity boundaries. We show differentiable rendering and inverse rendering results to demonstrate the effectiveness of our method.
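A 1-D caricature of why reparameterization helps with discontinuities (an illustration of the general idea only, not the paper's path-space formulation): consider F(θ) = E over x ~ U(0,1) of 1(x < θ) = θ, whose derivative is 1. Naively differentiating the indicator inside the expectation gives 0 almost everywhere; warping the integration variable so the discontinuity location no longer depends on θ recovers the correct gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 0.3
n = 10_000

# F(theta) = E_{x ~ U(0,1)}[ 1(x < theta) ] = theta, so dF/dtheta = 1.

# Naive pathwise estimator: the indicator's derivative w.r.t. theta is 0
# almost everywhere, so differentiating inside the expectation yields 0.
naive_grad = 0.0

# Reparameterized estimator: substitute x = theta * y over the region where
# the indicator is 1. The discontinuity 1(theta * y < theta) = 1(y < 1) no
# longer depends on theta, and the Jacobian contributes the factor theta:
#   F(theta) = E_y[ theta * 1(y < 1) ]  =>  dF/dtheta = E_y[ 1(y < 1) ] = 1.
y = rng.uniform(0.0, 1.0, n)
rep_value = np.mean(theta * (y < 1.0))
rep_grad = np.mean((y < 1.0).astype(float))
```

The same principle, moving boundary dependence out of the discontinuous integrand and into a smooth change of variables, is what the path-space reparameterization generalizes to full light-transport integrals.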