Peer reviewed
  • A topic modeling and image ...
    Ojo, Akinlolu Oluwabusayo; Bouguila, Nizar

    Pattern Recognition, February 2024, Volume 146
    Journal Article

    The latent Dirichlet allocation (LDA) model has been widely used in topic modeling. Recent works have shown the effectiveness of integrating neural network mechanisms with this generative model for learning text representations. However, a significant limitation of LDA is that it is based on a Dirichlet prior, which has a restrictive covariance structure: all of its variables are negatively correlated. In practice, topics can be positively or negatively correlated. To address this problem, we propose a generalized Dirichlet variational autoencoder (GD-VAE) for topic modeling. The generalized Dirichlet (GD) distribution has a more general covariance structure than the Dirichlet distribution because it can account for both positively and negatively correlated topics in the corpus. Our proposed model leverages rejection sampling variational inference with a reparameterization trick for effective training. GD-VAE compares favorably to recent topic models on several benchmark corpora. Experiments show that accounting for both positive and negative correlations between topics results in better performance. We further validate the proposed framework on two image data sets, where GD-VAE proves useful as an integral part of a classification architecture. For reproducibility and further research, code for this work can be found at https://github.com/hormone03/GD-VAE.

    Highlights
    • We propose GD-VAE to capture correlations and learn complex distributions.
    • We show that capturing all correlations leads to improved performance in GD-VAE.
    • We address training instability by introducing a weighted objective function.
    • The comprehensive experiments show that GD-VAE outperformed state-of-the-art models.
    • We demonstrate the effectiveness of GD-VAE on data augmentation with image data sets.
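    The covariance argument in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it draws plain Monte Carlo samples (no variational inference or rejection sampling) and assumes the standard stick-breaking construction of the generalized Dirichlet from independent Beta variables. The function name `sample_generalized_dirichlet` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def sample_generalized_dirichlet(a, b, size, rng):
    """Draw samples on the simplex via stick-breaking with Beta(a_j, b_j) breaks.

    Illustrative helper (assumed construction): returns len(a) + 1 components,
    where the last component is the leftover stick.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Independent Beta draws, one column per break.
    z = rng.beta(a, b, size=(size, a.shape[0]))
    # Stick length remaining after each break: prod_{i<=j} (1 - z_i).
    remaining = np.cumprod(1.0 - z, axis=1)
    # Stick length available before each break (1 for the first break).
    before = np.concatenate([np.ones((size, 1)), remaining[:, :-1]], axis=1)
    x = z * before
    return np.concatenate([x, remaining[:, -1:]], axis=1)


rng = np.random.default_rng(0)
n = 20000

# Dirichlet: every pair of components is negatively correlated.
dir_samples = rng.dirichlet([1.0, 1.0, 1.0], size=n)
dir_corr = np.corrcoef(dir_samples, rowvar=False)

# Generalized Dirichlet: a highly variable first break and a tight second
# break make components 1 and 2 rise and fall together.
gd_samples = sample_generalized_dirichlet([0.5, 50.0], [0.5, 50.0], n, rng)
gd_corr = np.corrcoef(gd_samples, rowvar=False)

print("Dirichlet correlations:\n", np.round(dir_corr, 2))
print("Generalized Dirichlet correlations:\n", np.round(gd_corr, 2))
```

    In this sketch the off-diagonal Dirichlet correlations are all negative, while the generalized Dirichlet sample shows a positive correlation between its second and third components and a negative one between the first and second. This is the extra flexibility the abstract attributes to the GD prior.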