Massive multiple-input multiple-output (MIMO) systems operating in millimeter wave (mmWave) frequency bands are considered to be one of the key enablers of beyond-fifth-generation cellular systems. Although the highest spectral efficiency in such systems would be achieved using fully-digital precoding, the large number of antennas in massive MIMO systems makes using a radio frequency (RF) chain for each antenna expensive and currently infeasible in practice. A common alternative is hybrid beamforming, which combines analog beamforming with digital precoding and reduces the required number of RF chains. The primary goal of hybrid beamforming is to provide precoding performance as close as possible to that of a fully-digital precoder. In this work, we consider two variants of the generative adversarial network (GAN), namely the conditional GAN (CGAN) and the Wasserstein CGAN (WCGAN), to develop the hybrid precoder. The CGAN is used to implement the (partially-connected) analog beamformer, and the WCGAN is used for the digital precoder. Our simulation results demonstrate that the proposed method yields an improvement in spectral efficiency of about 12-19% over some existing hybrid beamforming schemes and achieves up to 87% of the performance of fully-digital precoding.
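To make the hybrid-beamforming setup above concrete, the following sketch evaluates the standard spectral-efficiency expression for a hybrid precoder F = F_rf F_bb with a partially-connected (block-diagonal, phase-only) analog stage. The dimensions, SNR, and random precoders are illustrative assumptions, not the paper's GAN-designed matrices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumed, not from the paper):
Nt, Nrf, Ns = 16, 4, 2   # transmit antennas, RF chains, data streams
Nr = 4                   # receive antennas

# Random narrowband channel (Rayleigh, for illustration only).
H = (rng.standard_normal((Nr, Nt)) + 1j * rng.standard_normal((Nr, Nt))) / np.sqrt(2)

# Partially-connected analog beamformer: each RF chain drives a disjoint
# subarray of Nt/Nrf antennas, so F_rf is block-diagonal with unit-modulus
# (phase-only) entries on its block.
sub = Nt // Nrf
F_rf = np.zeros((Nt, Nrf), dtype=complex)
for k in range(Nrf):
    phases = rng.uniform(0.0, 2.0 * np.pi, sub)
    F_rf[k * sub:(k + 1) * sub, k] = np.exp(1j * phases)

# Low-dimensional digital precoder (random here), normalized so the hybrid
# precoder F = F_rf @ F_bb meets the total power constraint ||F||_F^2 = Ns.
F_bb = rng.standard_normal((Nrf, Ns)) + 1j * rng.standard_normal((Nrf, Ns))
F = F_rf @ F_bb
F *= np.sqrt(Ns) / np.linalg.norm(F, 'fro')

# Spectral efficiency: log2 det(I + (snr/Ns) * H F (H F)^H).
snr = 10.0  # linear SNR
Heff = H @ F
M = np.eye(Nr) + (snr / Ns) * Heff @ Heff.conj().T
se = np.log2(np.linalg.det(M).real)
```

Comparing `se` against the same expression evaluated with the unconstrained SVD-based precoder is how the "up to 87% of fully-digital performance" figure would be measured.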
High-quality, diverse, and photorealistic images can now be generated by unconditional GANs (e.g., StyleGAN). However, limited options exist to control the generation process using (semantic) attributes while still preserving the quality of the output. Further, due to the entangled nature of the GAN latent space, performing edits along one attribute can easily result in unwanted changes along other attributes. In this article, in the context of conditional exploration of entangled latent spaces, we investigate the two sub-problems of attribute-conditioned sampling and attribute-controlled editing. We present StyleFlow as a simple, effective, and robust solution to both sub-problems, formulating conditional exploration as an instance of conditional continuous normalizing flows in the GAN latent space conditioned on attribute features. We evaluate our method using the face and car latent spaces of StyleGAN, and demonstrate fine-grained disentangled edits along various attributes on both real photographs and StyleGAN-generated images. For example, for faces, we vary camera pose, illumination, expression, facial hair, gender, and age. Finally, via extensive qualitative and quantitative comparisons, we demonstrate the superiority of StyleFlow over prior and several concurrent works. Project Page and Video: https://rameenabdal.github.io/StyleFlow.
State-of-the-art methods in image-to-image translation are capable of learning a mapping from a source domain to a target domain with unpaired image data. Though existing methods have achieved promising results, they still produce visual artifacts: they can translate low-level information but not the high-level semantics of input images. One possible reason is that generators lack the ability to perceive the most discriminative parts between the source and target domains, making the generated images low quality. In this article, we propose a new Attention-Guided Generative Adversarial Network (AttentionGAN) for the unpaired image-to-image translation task. AttentionGAN can identify the most discriminative foreground objects and minimize changes to the background. The attention-guided generators in AttentionGAN produce attention masks and then fuse the generation output with these masks to obtain high-quality target images. Accordingly, we also design a novel attention-guided discriminator that considers only attended regions. Extensive experiments are conducted on several generative tasks with eight public datasets, demonstrating that the proposed method is effective in generating sharper and more realistic images than existing competitive models. The code is available at https://github.com/Ha0Tang/AttentionGAN.
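The fusion step described above (generated content blended with the input via attention masks, so the background is carried over from the input) can be sketched as follows. This is a minimal illustration of the general mask-fusion idea, not AttentionGAN's exact formulation; the function name and the softmax normalization across masks are assumptions:

```python
import numpy as np

def attention_fuse(inp, content, masks):
    """Fuse generated content with the input image using attention masks.

    inp:     (H, W, 3) input image
    content: (n, H, W, 3) generated foreground candidates
    masks:   (n+1, H, W) unnormalized maps; a softmax over the first axis
             yields n foreground masks plus one background mask that
             together sum to 1 at every pixel.
    """
    m = np.exp(masks - masks.max(axis=0, keepdims=True))
    m = m / m.sum(axis=0, keepdims=True)            # softmax across masks
    fg = (m[:-1, ..., None] * content).sum(axis=0)  # attended generated content
    bg = m[-1, ..., None] * inp                     # preserved background
    return fg + bg
```

Because the masks sum to one per pixel, wherever the background mask dominates the output simply copies the input, which is exactly the "minimize the change of the background" behavior the abstract describes.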
On the cover: The cover image is based on the Research Article Regeneration of pavement surface textures using M‐sigmoid‐normalized generative adversarial networks by Jiale Lu et al., ...https://doi.org/10.1111/mice.12987.
Neural networks are commonly used for post-stack and pre-stack seismic inversion. With sufficient labeled data, neural network-based seismic inversion results are more accurate than those obtained with traditional seismic inversion methods. However, when labeled data are insufficient, the accuracy of neural network-based inversion decreases and can even fall below that of traditional methods. In addition, seismic inversion results based on neural networks generally suffer from lateral discontinuity, which further reduces their accuracy. To tackle these problems, we propose a pre-stack seismic amplitude variation with offset (AVO) inversion method based on a Closed-Loop Multi-task conditional Wasserstein Generative Adversarial Network (CMcWGAN). CMcWGAN enables simultaneous and accurate inversion of P-wave velocity (Vp), S-wave velocity (Vs), and density (ρ). Moreover, it uses the low-frequency information of the elastic parameters as a conditional input to alleviate lateral discontinuity in the inversion results. Experimental results on simulated data show that the inversion results based on CMcWGAN are more accurate than those based on traditional AVO inversion methods. In addition, when the seismic angle gathers are noisy, CMcWGAN is more robust than the traditional methods. CMcWGAN also obtains reasonable AVO inversion results on field seismic angle gather data.
Despite the recent advances of Generative Adversarial Networks (GANs) in high-fidelity image synthesis, there is still limited understanding of how GANs map a latent code sampled from a random distribution to a photo-realistic image. Previous work assumes the latent space learned by GANs follows a distributed representation but observes the vector arithmetic phenomenon. In this work, we propose a novel framework, called InterFaceGAN, for semantic face editing by interpreting the latent semantics learned by GANs. In this framework, we conduct a detailed study of how different semantics are encoded in the latent space of GANs for face synthesis. We find that the latent code of well-trained generative models actually learns a disentangled representation after linear transformations. We explore the disentanglement between various semantics and manage to decouple entangled semantics with subspace projection, leading to more precise control of facial attributes. Besides manipulating gender, age, expression, and the presence of eyeglasses, we can even vary the face pose and correct artifacts accidentally generated by GAN models. The proposed method is further applied to real image manipulation when combined with GAN inversion methods or encoder-involved models. Extensive results suggest that learning to synthesize faces spontaneously brings a disentangled and controllable facial attribute representation.
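The two operations the abstract describes, editing by moving a latent code along a semantic direction and decoupling two entangled semantics via subspace projection, have a simple linear-algebra core that can be sketched as follows. The function names and the unit-normalization conventions are illustrative assumptions:

```python
import numpy as np

def edit_latent(z, n, alpha):
    """Edit along a semantic direction: z' = z + alpha * n.

    z: latent code, n: (unit) normal of a semantic boundary,
    alpha: signed edit strength.
    """
    return z + alpha * n

def decouple(n1, n2):
    """Project direction n1 onto the subspace orthogonal to n2.

    Editing along the returned direction changes attribute 1 while,
    to first order, leaving attribute 2 (with boundary normal n2) fixed.
    """
    n2u = n2 / np.linalg.norm(n2)
    d = n1 - (n1 @ n2u) * n2u   # remove the component along n2
    return d / np.linalg.norm(d)
```

In InterFaceGAN the directions themselves come from linear boundaries fitted to attribute labels in latent space; here they would just be given vectors.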
Deep learning has shown huge potential in the field of hyperspectral image (HSI) classification. However, most deep learning models depend heavily on the quantity of available training samples. In this article, we propose a multitask generative adversarial network (MTGAN) to alleviate this issue by taking advantage of the rich information in unlabeled samples. Specifically, we design a generator network that simultaneously undertakes two tasks: a reconstruction task and a classification task. The former reconstructs an input hyperspectral cube, both labeled and unlabeled, whereas the latter recognizes the category of the cube. Meanwhile, we construct a discriminator network to discriminate whether an input sample comes from the real distribution or is a reconstruction. Through adversarial learning, the generator network produces real-like cubes, thus indirectly improving the discrimination and generalization ability of the classification task. More importantly, in order to fully exploit the useful information from shallow layers, we adopt skip-layer connections in both the reconstruction and classification tasks. The proposed MTGAN model is evaluated on three standard HSIs, and the experimental results show that it achieves higher performance than other state-of-the-art deep learning models.
Recently, generative adversarial networks (GANs) have progressed enormously, enabling them to learn complex data distributions, in particular those of faces. Increasingly efficient GAN architectures have been designed and proposed to learn the different variations of faces, such as cross-pose, age, expression, and style. These GAN-based approaches need to be reviewed, discussed, and categorized in terms of architectures, applications, and metrics. Several reviews focusing on the use and advances of GANs in general have been proposed. However, to the best of our knowledge, GAN models applied to the face, which we call facial GANs, have never been surveyed. In this article, we review facial GANs and their different applications. We mainly focus on architectures, problems, and performance evaluation with respect to each application and the datasets used. More precisely, we review the progress of architectures and discuss the contributions and limits of each. Then, we describe the problems encountered by facial GANs and propose solutions to handle them. Additionally, as GAN evaluation has become a notable current challenge, we investigate state-of-the-art quantitative and qualitative evaluation metrics and their applications. We conclude this work with a discussion of face generation challenges and propose open research issues.
Given two video frames X_0 and X_{n+1}, we aim to generate a series of intermediate frames Y_1, Y_2, ..., Y_n, such that the resulting video consisting of frames X_0, Y_1, ..., Y_n, and X_{n+1} appears realistic to a human viewer. Such video generation has numerous important applications, including video compression, movie production, slow-motion filming, video surveillance, and forensic analysis. Yet video generation is highly challenging due to the vast search space of possible frames. Previous methods, mostly based on video prediction and/or video interpolation, tend to generate poor-quality videos with severe motion blur. This paper proposes a novel, end-to-end approach to video generation using generative adversarial networks (GANs). In particular, our design involves two concatenated GANs, one capturing motions and the other generating frame details. The loss function is carefully engineered to include an adversarial loss, a gradient difference loss (for motion learning), and a normalized product correlation loss (for frame details). Experiments on three video datasets, namely Google Robotic Push, KTH human actions, and UCF101, demonstrate that the proposed solution generates high-quality, realistic, and sharp videos, whereas previous solutions output noisy and blurry results.
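The two non-adversarial terms named above can be sketched in a few lines. These are common textbook forms of a gradient difference loss and a normalized-correlation loss; the paper's exact definitions and weighting may differ, and the function names and `lam_*` weights are assumptions:

```python
import numpy as np

def gradient_difference_loss(y, x):
    """Penalize mismatch between the image-gradient magnitudes of the
    generated frame y and the target frame x (encourages sharp motion edges)."""
    gy_h, gy_v = np.abs(np.diff(y, axis=1)), np.abs(np.diff(y, axis=0))
    gx_h, gx_v = np.abs(np.diff(x, axis=1)), np.abs(np.diff(x, axis=0))
    return np.mean(np.abs(gy_h - gx_h)) + np.mean(np.abs(gy_v - gx_v))

def normalized_product_correlation_loss(y, x, eps=1e-8):
    """1 - normalized cross-correlation between the two frames:
    0 when the frames match up to brightness/contrast, larger otherwise."""
    yc, xc = y - y.mean(), x - x.mean()
    ncc = (yc * xc).sum() / (np.linalg.norm(yc) * np.linalg.norm(xc) + eps)
    return 1.0 - ncc

def generator_loss(adv, y, x, lam_gdl=1.0, lam_npc=1.0):
    """Combined generator objective: adversarial term plus the two
    reconstruction-quality terms, with assumed weights."""
    return adv + lam_gdl * gradient_difference_loss(y, x) \
               + lam_npc * normalized_product_correlation_loss(y, x)
```

Both auxiliary terms vanish when the generated frame equals the target, so they only steer the adversarial training rather than override it.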