Portrait stylization is a long-standing task enabling extensive applications. Although 2D-based methods have made great progress in recent years, real-world applications such as the metaverse and games often demand 3D content. However, the requirement for 3D data, which is costly to acquire, significantly impedes the development of 3D portrait stylization methods. In this paper, inspired by the success of 3D-aware GANs that bridge the 2D and 3D domains with 3D fields as the intermediate representation for rendering 2D images, we propose a novel method, dubbed HyperStyle3D, based on 3D-aware GANs for 3D portrait stylization. At the core of our method is a hyper-network learned to manipulate the parameters of the generator in a single forward pass. It not only offers a strong capacity to handle multiple styles with a single model, but also enables flexible fine-grained stylization that affects only the texture, shape, or a local part of the portrait. While the use of 3D-aware GANs bypasses the requirement for 3D data, we further alleviate the need for style images by using the CLIP model as style guidance. We conduct an extensive set of experiments across styles, attributes, and shapes, and also measure 3D consistency. These experiments demonstrate the superior capability of our HyperStyle3D model in rendering 3D-consistent images in diverse styles, deforming the face shape, and editing various attributes.
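To make the hyper-network idea concrete, here is a minimal PyTorch sketch, assuming a pre-trained generator and a style embedding (e.g., a CLIP feature). The module sizes, the 0.01 offset scaling, and the `stylize` helper are all illustrative assumptions, not the paper's actual architecture.

```python
# Sketch: a hyper-network maps a style embedding to per-layer weight offsets
# for a frozen generator, so one model serves many styles in a single pass.
import math
import torch
import torch.nn as nn

class HyperNetwork(nn.Module):
    def __init__(self, style_dim, target_layers):
        # target_layers: list of (param_name, param_shape) to modulate.
        super().__init__()
        self.names = [n for n, _ in target_layers]
        self.shapes = [s for _, s in target_layers]
        self.heads = nn.ModuleList(
            nn.Sequential(nn.Linear(style_dim, 128), nn.ReLU(),
                          nn.Linear(128, math.prod(s)))
            for s in self.shapes)

    def forward(self, style_emb):
        # style_emb: a single (style_dim,) vector; one offset per parameter,
        # lightly scaled so the pre-trained weights are only nudged.
        return {n: 0.01 * h(style_emb).view(s)
                for n, h, s in zip(self.names, self.heads, self.shapes)}

def stylize(generator, hyper, style_emb, z):
    """Add the predicted offsets to the generator's weights, then render."""
    deltas = hyper(style_emb)
    with torch.no_grad():
        for name, p in generator.named_parameters():
            if name in deltas:
                p.add_(deltas[name])   # W <- W + delta_W(style)
    return generator(z)
```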
Existing CNN-based style transfer frameworks suffer from inaccurate control of pixel-wise stylization. Since CNN operations are kernel-based, such a design unavoidably makes pixels affect each other. To mitigate this problem, we propose a controllable style transfer framework that leverages Implicit Neural Representation to encode each pixel individually and optimizes each style and content pair via test-time training. Unlike previous CNN-based style transfer frameworks, this formulation naturally enables accurate pixel-wise stylization control. In addition, to give explicit control over the degree of stylization, we define two vectors that represent the content and style, respectively, enabling control by interpolating between these vectors. We further demonstrate that, after being test-time trained once, our framework supports a wide range of applications, precisely controlling the stylized image pixel-wise and freely adjusting image resolution and the degree of stylization without further optimization or training.
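The following PyTorch sketch illustrates the core mechanism under stated assumptions: an implicit network maps a pixel coordinate plus a blend of two learned codes to RGB, so the stylization degree can be set per pixel and the resolution is just the density of the queried grid. The architecture and dimensions are hypothetical, not the paper's.

```python
import torch
import torch.nn as nn

class StyleINR(nn.Module):
    def __init__(self, code_dim=64, hidden=256):
        super().__init__()
        self.content = nn.Parameter(torch.randn(code_dim))  # content vector
        self.style = nn.Parameter(torch.randn(code_dim))    # style vector
        self.mlp = nn.Sequential(
            nn.Linear(2 + code_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid())

    def forward(self, coords, alpha):
        # coords: (N, 2) pixel positions in [0, 1]; alpha: (N, 1) per-pixel
        # degree. alpha = 0 -> pure content code, alpha = 1 -> pure style code.
        code = (1 - alpha) * self.content + alpha * self.style
        return self.mlp(torch.cat([coords, code], dim=-1))

# Resolution is free: query a denser coordinate grid after test-time training.
H = W = 512
ys, xs = torch.meshgrid(torch.linspace(0, 1, H), torch.linspace(0, 1, W),
                        indexing="ij")
coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)
alpha = torch.full((coords.shape[0], 1), 0.7)   # uniform degree here
rgb = StyleINR()(coords, alpha).reshape(H, W, 3)
```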
•A novel controllable style transfer framework based on Implicit Neural Representation that controls the stylized output pixel-wise via test-time training.
•An exponential reweighting technique for mitigating the undesirable entanglement of content and style.
•A demonstration of various applications, such as resolution control, pixel-wise control of the stylization degree, spatial control using masks, and gradation style transfer effects.
Style Transfer Based on VGG Network. Zhao, Zhe; Zhang, Shifang. International Journal of Advanced Network, Monitoring and Controls, Volume 7, Issue 1, 05/2023.
With the rapid development of computing power, deep learning, as an important method in artificial intelligence, has shown remarkable learning ability, especially on massive data, and has played an extraordinary role in image recognition, image classification, natural language processing, data mining, and autonomous driving. In earlier studies, style transfer algorithms did not develop well because computing power was limited, basic hardware configurations could not meet the minimum requirements, and the transferred images were of poor quality. With the development of computer hardware and the rapid growth of GPU computing power, however, style transfer networks based on deep learning have become a hot topic in style transfer research in recent years. Although traditional style transfer methods can capture the texture, color, and other information of the style image, the model must be re-learned every time a new target image is generated, which incurs a very high time cost; the trained model is not reusable, and the generated images are often highly random and unsatisfactory. Style transfer methods based on deep learning overcome these limitations of traditional methods: they are faster, and the models generalize better. Neural style transfer algorithms fall into two main categories: slow style transfer based on image iteration and fast style transfer based on model iteration. The VGG network model can combine a style image and a content image and greatly improve the efficiency of image style transfer.
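As background for the "slow" image-iteration category mentioned above, here is a compact sketch of the classic Gatys-style optimization with a VGG-19 backbone (PyTorch/torchvision). `content` and `style` are assumed to be ImageNet-normalized (1, 3, H, W) tensors; the layer indices and weights are common choices rather than this paper's exact settings.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19, VGG19_Weights

vgg = vgg19(weights=VGG19_Weights.DEFAULT).features.eval()
for p in vgg.parameters():
    p.requires_grad_(False)

STYLE_LAYERS, CONTENT_LAYER = {0, 5, 10, 19, 28}, 21  # conv*_1 layers, conv4_2

def features(x):
    feats = {}
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in STYLE_LAYERS or i == CONTENT_LAYER:
            feats[i] = x
    return feats

def gram(f):
    # Style statistic: channel-by-channel feature correlations (batch of 1).
    _, c, h, w = f.shape
    f = f.reshape(c, h * w)
    return f @ f.t() / (c * h * w)

def transfer(content, style, steps=300, style_weight=1e6):
    # Optimize the output image's pixels directly ("image iteration").
    target = content.clone().requires_grad_(True)
    opt = torch.optim.Adam([target], lr=0.02)
    c_feats, s_feats = features(content), features(style)
    for _ in range(steps):
        t_feats = features(target)
        loss = F.mse_loss(t_feats[CONTENT_LAYER], c_feats[CONTENT_LAYER])
        for i in STYLE_LAYERS:
            loss = loss + style_weight * F.mse_loss(gram(t_feats[i]),
                                                    gram(s_feats[i]))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return target.detach()
```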
Style transfer describes the rendering of an image's semantic content in different artistic styles. Recently, generative adversarial networks (GANs) have emerged as an effective approach to style transfer by adversarially training the generator to synthesize convincing counterfeits. However, traditional GANs suffer from mode collapse, which makes training unstable and style transfer quality difficult to guarantee. In addition, a GAN generator is only compatible with one style, so a series of GANs must be trained to offer users more than one kind of style. In this paper, we focus on tackling these challenges and limitations to improve style transfer. We propose adversarial gated networks (Gated-GAN) to transfer multiple styles in a single model. The generative networks have three modules: an encoder, a gated transformer, and a decoder. Different styles can be achieved by passing input images through different branches of the gated transformer. To stabilize training, the encoder and decoder are combined as an auto-encoder to reconstruct the input images. The discriminative networks are used to distinguish whether the input image is a stylized or genuine image. An auxiliary classifier recognizes the style categories of transferred images, thereby helping the generative networks generate images in multiple styles. In addition, Gated-GAN makes it possible to explore a new style by investigating styles learned from artists or genres. Our extensive experiments demonstrate the stability and effectiveness of the proposed model for multi-style transfer.
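A minimal PyTorch sketch of the gated-branch idea follows, assuming a shared encoder/decoder and one residual branch per style; all layer sizes and the branch depth are hypothetical, and the discriminator and auxiliary classifier are omitted.

```python
import torch
import torch.nn as nn

def conv_block(cin, cout, stride=1):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, stride, 1),
                         nn.InstanceNorm2d(cout), nn.ReLU(inplace=True))

class ResBlock(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.body = nn.Sequential(conv_block(c, c), nn.Conv2d(c, c, 3, 1, 1))
    def forward(self, x):
        return x + self.body(x)

class GatedGenerator(nn.Module):
    def __init__(self, n_styles, c=64):
        super().__init__()
        self.encoder = nn.Sequential(conv_block(3, c), conv_block(c, c, 2))
        # One branch of residual blocks per style category.
        self.branches = nn.ModuleList(
            nn.Sequential(*[ResBlock(c) for _ in range(4)])
            for _ in range(n_styles))
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2), conv_block(c, c),
            nn.Conv2d(c, 3, 3, 1, 1), nn.Tanh())

    def forward(self, x, style_id):
        h = self.encoder(x)
        h = self.branches[style_id](h)   # the gate: route through one branch
        return self.decoder(h)

g = GatedGenerator(n_styles=4)
out = g(torch.randn(1, 3, 256, 256), style_id=2)  # stylize with style #2
```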
Manually re-drawing an image in a certain artistic style takes a professional artist a long time. Doing this for a video sequence single-handedly is beyond imagination. We present two computational approaches that transfer the style from one image (for example, a painting) to a whole video sequence. In our first approach, we adapt the original image style transfer technique by Gatys et al., based on energy minimization, to videos. We introduce new ways of initialization and new loss functions to generate consistent and stable stylized video sequences even in cases with large motion and strong occlusion. Our second approach formulates video stylization as a learning problem. We propose a deep network architecture and training procedures that allow us to stylize videos of arbitrary length in a consistent and stable way, nearly in real time. We show that the proposed methods clearly outperform simpler baselines both qualitatively and quantitatively. Finally, we propose a way to adapt these approaches to 360° images and videos as they emerge with recent virtual reality hardware.
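The consistency objective used by such video approaches can be sketched generically: warp the previous stylized frame into the current one with optical flow and penalize differences where the flow is reliable. This PyTorch sketch assumes the flow and an occlusion/validity mask are given; it is a standard formulation, not these authors' exact loss.

```python
import torch
import torch.nn.functional as F

def warp(prev_stylized, flow):
    """Backward-warp prev_stylized (B,3,H,W) by flow (B,2,H,W) in pixels."""
    b, _, h, w = prev_stylized.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack([xs, ys], dim=0).float().to(flow)   # (2,H,W)
    coords = grid.unsqueeze(0) + flow                      # sampling points
    # Normalize coordinates to [-1, 1] as grid_sample expects.
    coords_x = 2 * coords[:, 0] / (w - 1) - 1
    coords_y = 2 * coords[:, 1] / (h - 1) - 1
    grid_n = torch.stack([coords_x, coords_y], dim=-1)     # (B,H,W,2)
    return F.grid_sample(prev_stylized, grid_n, align_corners=True)

def temporal_loss(cur_stylized, prev_stylized, flow, valid_mask):
    # valid_mask (B,1,H,W): 1 where the flow is reliable (no occlusion).
    warped = warp(prev_stylized, flow)
    return (valid_mask * (cur_stylized - warped) ** 2).mean()
```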
•We propose FontTransformer, a novel few-shot Chinese font synthesis model that uses stacked Transformers to synthesize high-resolution (e.g., 256×256 or 1024×1024) glyph images. To the best of our knowledge, this is the first work that effectively applies Transformers to the task of few-shot Chinese font synthesis.
•We design a novel chunked glyph image encoding scheme that encodes glyph images into token sequences. With this encoding scheme, our method can synthesize glyph images at arbitrarily high resolution while keeping the token sequence length constant.
•Extensive experiments demonstrate that our method synthesizes high-quality glyph images in the target font style from a few input samples, outperforming the state of the art both quantitatively and qualitatively.
Automatic generation of high-quality Chinese fonts from a few online training samples is a challenging task, especially when the number of samples is very small. Existing few-shot font generation methods can only synthesize low-resolution glyph images that often possess incorrect topological structures and/or incomplete strokes. To address this problem, this paper proposes FontTransformer, a novel few-shot learning model for high-resolution Chinese glyph image synthesis using stacked Transformers. The key idea is to apply a parallel Transformer to avoid the accumulation of prediction errors and a serial Transformer to enhance the quality of the synthesized strokes. Meanwhile, we also design a novel encoding scheme that feeds more glyph information and prior knowledge to our model, which further enables the generation of high-resolution and visually pleasing glyph images. Both qualitative and quantitative experimental results demonstrate the superiority of our method over other existing approaches on the few-shot Chinese font synthesis task.
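One plausible reading of the chunked encoding (the exact scheme is not detailed here, so treat this as an assumption-laden sketch) is to split the glyph into a fixed n × n grid of chunks and project each chunk to one token, so the sequence length stays n·n at any image resolution.

```python
import torch
import torch.nn as nn

class ChunkedGlyphEncoder(nn.Module):
    def __init__(self, grid=16, token_dim=256, max_chunk_px=64 * 64):
        super().__init__()
        self.grid = grid
        self.max_chunk_px = max_chunk_px           # supports up to 1024x1024
        self.proj = nn.Linear(max_chunk_px, token_dim)

    def forward(self, glyph):
        # glyph: (B, 1, H, W) glyph image; H and W divisible by `grid`.
        glyph = glyph.float()
        b, _, h, w = glyph.shape
        ch, cw = h // self.grid, w // self.grid
        chunks = glyph.unfold(2, ch, ch).unfold(3, cw, cw)  # (B,1,g,g,ch,cw)
        tokens = chunks.reshape(b, self.grid * self.grid, ch * cw)
        # Zero-pad each flattened chunk to a fixed width before projecting,
        # so higher resolutions only change chunk size, not sequence length.
        tokens = nn.functional.pad(tokens, (0, self.max_chunk_px - ch * cw))
        return self.proj(tokens)   # (B, grid*grid, token_dim): constant length

enc = ChunkedGlyphEncoder()
seq = enc((torch.rand(1, 1, 256, 256) > 0.5).float())  # 256 tokens
```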
Although existing steganography methods can successfully embed secret information into a carrier image without introducing visible distortion, the difference in distribution between the carrier image and the stego image still cannot resist detection by statistics-based steganalysis algorithms. To improve resistance to steganalysis, an image steganography scheme based on style transfer and quaternion exponent moments is proposed in this paper. First, the geometric invariance of quaternion exponent moments is exploited to embed the secret information. Next, style transfer is performed on the stego image embedded with the secret information, and the stylized image is transmitted over the common channel. Then, the receiver attempts to remove the style from the stylized image and restore it to its original appearance. For this purpose, a de-stylized network is designed to reconstruct the stego image from the stylized image. Finally, an extraction algorithm is used to extract the transmitted secret image from the reconstructed stego image. The main goal of the steganography process is that, even though the appearance and distribution of the carrier image have been changed, the transmission still looks like independent, normal behavior to an eavesdropper. Extensive experiments are conducted to verify the feasibility of the proposed scheme. Experimental and analysis results indicate that the proposed scheme can generate an independent and meaningful image, successfully transmit a secret image, and extract the secret image at a low bit error rate. In addition, the proposed scheme provides high security.
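The overall data flow can be traced with a toy, runnable NumPy sketch. Note the swap: the real scheme embeds in the quaternion exponent moment (QEM) domain and removes the style with a learned network, whereas here a trivial LSB embed and identity stand-ins replace those components purely to illustrate the send, stylize, de-stylize, extract pipeline.

```python
import numpy as np

def embed(carrier, bits):            # stand-in for QEM-domain embedding
    stego = carrier.copy()
    flat = stego.reshape(-1)
    flat[:len(bits)] = (flat[:len(bits)] & 0xFE) | bits  # write LSBs
    return stego

def extract(stego, n):               # stand-in for QEM-domain extraction
    return stego.reshape(-1)[:n] & 1

def stylize(img):                    # stand-in for the style transfer network
    return img                       # identity here; the real scheme restyles

def destylize(img):                  # stand-in for the de-stylized network
    return img                       # identity here; the real net inverts style

carrier = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
secret = np.random.randint(0, 2, 100, dtype=np.uint8)
received = destylize(stylize(embed(carrier, secret)))   # channel round trip
assert (extract(received, 100) == secret).all()
```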
•We perform style transformation on the cover image embedded with secret information and then use the stylized image as the stego image transmitted over the common channel.
•The main idea of this paper is that even if the appearance and distribution of the cover image have changed, the transmission still appears unremarkable to an eavesdropper.
•The geometric invariance of quaternion exponent moments is exploited to accomplish the task of embedding secret information.
•A de-stylized network is built to reconstruct the cover image embedded with secret information as faithfully as possible.
•Extensive experiments verify that the quality of the reconstructed images is higher than that of the attacked images on many quantitative indicators.
Various data hiding methods have been proposed to hide secret images within stego images. However, many of them can be easily detected by steganalytic tools because of the large amount of hidden information. In this paper, we enhance the undetectability of an image hiding network by mapping latent representations conditioned on the secret information. We extend the idea of image generation-based steganography and propose a transformer-based image hiding network that can hide a secret image of the same size as the target image. The proposed scheme uses style transfer to help map the latent representation. The hiding network consists of three modules: encoding, transfer, and synthesis. The encoding module extracts latent representations from the content and secret images, the transfer module stylizes the latent representation, and the synthesis module fuses the latent representations to synthesize a target image with the secret image hidden in it. A new synthesis module and a corresponding extraction network are developed to enhance recovery accuracy. The proposed scheme achieves high image quality on both target images and recovered secret images. Furthermore, it can resist steganalytic tools and thus provides good security.
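A rough PyTorch sketch of the three-module layout described above follows; every layer size, the token shape, and the fusion-by-concatenation are assumptions for illustration, not the paper's actual design.

```python
import torch
import torch.nn as nn

def enc():
    return nn.Sequential(nn.Conv2d(3, 64, 3, 2, 1), nn.ReLU(),
                         nn.Conv2d(64, 128, 3, 2, 1), nn.ReLU())

class HidingNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.content_enc, self.secret_enc = enc(), enc()    # encoding module
        self.transfer = nn.TransformerEncoder(              # transfer module
            nn.TransformerEncoderLayer(d_model=128, nhead=4,
                                       batch_first=True), num_layers=2)
        self.synth = nn.Sequential(                         # synthesis module
            nn.Conv2d(256, 128, 3, 1, 1), nn.ReLU(),
            nn.Upsample(scale_factor=4),
            nn.Conv2d(128, 3, 3, 1, 1), nn.Sigmoid())

    def forward(self, content, secret):
        c, s = self.content_enc(content), self.secret_enc(secret)
        b, ch, h, w = c.shape
        tokens = c.flatten(2).transpose(1, 2)               # (B, HW, 128)
        c = self.transfer(tokens).transpose(1, 2).reshape(b, ch, h, w)
        return self.synth(torch.cat([c, s], dim=1))         # fuse latents

net = HidingNetwork()
stego = net(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))
```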