We present a hybrid approach for generating a character by independently controlling its shape and texture using an input face and a styled face. To effectively produce the shape of a character, we propose an anthropometry-based approach that defines and extracts 37 explicit facial features. The shape of a character's face is generated by extracting these explicit facial features from both faces and matching their corresponding features, which enables the synthesis of the shape at different poses and scales. We control this shape-generation process by manipulating the features of the input and styled faces. For the style of the character, we devise a warping-field-based style transfer method that uses the features of the character's face. This method allows effective application of the style while maintaining the character's shape and minimizing artifacts. Our approach yields visually pleasing results for various combinations of input and styled faces.
The training of video fire detection models based on deep learning relies on a large number of positive and negative samples, namely fire videos and scenario videos containing disturbances similar to fire. Because ignition is prohibited in many indoor settings, fire video samples for those scenes are insufficient. In this paper, a method based on a generative adversarial network is proposed to generate flame images that are then migrated into specified scenes, thus increasing the number of fire video samples in those restricted situations. A flame kernel is pre-implanted into the specified scene to keep its characteristics intact. The flame and scene are blended together by adding styling information such as blurry edges and ground reflection. This method overcomes the background distortion caused by information loss in existing multimodal image translation and is able to guarantee the diversity of flames in specified scenes and produce perceptually realistic results. Compared with other multimodal image-to-image translation schemes, the images generated by our method achieve the best FID and LPIPS values, reaching 118.4 and 0.1322, respectively. In addition, Unet and SA-Unet, which incorporates a self-attention mechanism, are used as fire segmentation networks to evaluate how the augmented data improve segmentation accuracy. Their F1-scores reach 0.8905 and 0.9082, respectively, after Unet and SA-Unet are trained with the GAN-based augmented dataset generated by our model. These F1-scores are second only to the 0.9259 and 0.9291 obtained when Unet and SA-Unet are trained with real pictures serving as the augmented dataset.
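As background for the segmentation evaluation above, the pixel-wise F1-score can be sketched as follows. This is an illustrative re-implementation of the standard metric, not code from the paper; the binary mask format (flat lists with 1 = fire pixel, 0 = background) is an assumption.

```python
def f1_score(pred, truth):
    """Pixel-wise F1 = 2PR / (P + R) over flattened binary masks.

    pred, truth: equal-length sequences of 0/1 labels (1 = fire pixel).
    """
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    if tp == 0:
        return 0.0  # no true positives: both precision and recall are 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

A score of 0.9082 therefore means that, pixel for pixel, the predicted fire regions and the ground-truth regions agree in roughly 91% of the harmonic precision/recall sense.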
In this paper, we study the domain-adaptive person re-identification (re-ID) problem: train a re-ID model on a labeled source domain and test it on an unlabeled target domain. The problem is known to be challenging due to the feature distribution bias between the source and target domains. Previous methods directly reduce the bias by image-to-image style translation between the source and target domains in an unsupervised manner. However, these methods only consider the coarse bias between the source domain and the target domain and neglect the fine-grained bias between the source domain and the target camera domains (divided by camera views), which contains critical factors influencing the testing performance of the re-ID model. In this work, we focus in particular on the bias between the source domain and the target camera domains. To overcome this problem, a multi-domain image-to-image translation network, termed Identity Preserving Generative Adversarial Network (IPGAN), is proposed to learn the mapping between the source domain and the target camera domains. IPGAN can translate the styles of images from the source domain to the target camera domains and generate many images with the styles of the target camera domains. The re-ID model is then trained with the translated images generated by IPGAN. During the training of the re-ID model, we aim to learn discriminative features. We design and train a novel re-ID model, termed IBN-reID, into which Instance and Batch Normalization blocks (IBN-blocks) are introduced. Experimental results on Market-1501, DukeMTMC-reID and MSMT17 show that the images generated by IPGAN are more suitable for cross-domain re-ID. Very competitive re-ID accuracy is achieved by our method.
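For reference, the Instance-plus-Batch Normalization split behind IBN-blocks can be sketched in plain Python. The data layout (each sample as a list of channels, each channel a flat activation list), the half-and-half channel split, and the function names are assumptions for illustration, not the IBN-reID implementation; the learnable scale/shift parameters of real normalization layers are also omitted.

```python
def _norm(vals, eps=1e-5):
    """Zero-mean, unit-variance normalization of a flat activation list."""
    m = sum(vals) / len(vals)
    v = sum((x - m) ** 2 for x in vals) / len(vals)
    return [(x - m) / (v + eps) ** 0.5 for x in vals]

def ibn_block(batch):
    """Sketch of an IBN split: batch is a list of samples, each a list of
    C equal-length channels. The first C//2 channels get instance norm
    (statistics per sample, per channel); the rest get batch norm
    (statistics pooled across the whole batch)."""
    c = len(batch[0])
    half = c // 2
    out = [[None] * c for _ in batch]
    # Instance norm: normalize each channel within its own sample.
    for s, sample in enumerate(batch):
        for ch in range(half):
            out[s][ch] = _norm(sample[ch])
    # Batch norm: pool each channel's activations across all samples.
    for ch in range(half, c):
        pooled = [x for sample in batch for x in sample[ch]]
        normed = _norm(pooled)
        k = len(batch[0][ch])  # assumes equal channel length per sample
        for s in range(len(batch)):
            out[s][ch] = normed[s * k:(s + 1) * k]
    return out
```

The intuition, following the IBN literature, is that instance normalization removes per-image style statistics (useful across camera domains) while batch normalization preserves content-discriminative statistics.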
Creative product design is becoming critical to the success of many enterprises. However, the conventional product innovation process is hindered by two major challenges: the difficulty of capturing users' preferences and the lack of intuitive approaches to visually inspire the designer, which is especially true in fashion design and in the form design of many other types of products. In this paper, we propose a framework combining Kansei engineering and deep learning for product innovation (KENPI), which can automatically transfer the color, pattern, etc. of a style image to a product's shape in real time. To capture user preferences, we combine Kansei engineering with back-propagation neural networks to establish a mapping model between product properties and styles. To address the inspiration issue in product innovation, convolutional neural network-based neural style transfer is adopted to reconstruct and merge the color and pattern features of the style image, which are then migrated to the target product. The generated new product image can not only preserve the shape of the target product but also carry the features of the style image. The Kansei analysis shows that the semantics of the new product are enhanced relative to the target product, which means that the new product design can better meet the needs of users. Finally, the implementation of the proposed method is demonstrated in detail through a case study of female coat design.
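As background, the style statistics in convolutional neural network-based neural style transfer are conventionally Gram matrices of feature maps (after Gatys et al.). A minimal sketch follows, with each feature map flattened to a plain list per channel; this representation and the function names are illustrative assumptions, not the KENPI implementation, which operates on deep CNN features.

```python
def gram_matrix(features):
    """Gram matrix G[i][j] = <F_i, F_j> over spatial positions,
    normalized by the number of positions.

    features: list of channels, each a flat list of activations.
    """
    n = len(features[0])
    return [[sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
            for fi in features]

def style_loss(feat_style, feat_gen):
    """Sum of squared differences between the two Gram matrices."""
    gs, gg = gram_matrix(feat_style), gram_matrix(feat_gen)
    return sum((a - b) ** 2
               for ra, rb in zip(gs, gg)
               for a, b in zip(ra, rb))
```

Because the Gram matrix discards spatial layout and keeps only channel correlations, minimizing this loss transfers color and pattern statistics while a separate content loss preserves the target product's shape.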
Computer technology is developing rapidly today. In the computer field, computer animation has quickly grown from a novelty into a leading industry, and animation has entered the era of three-dimensional animation and computer graphics. This article studies the application of artificial intelligence-based style transfer algorithms in animation special-effects design. It proposes methods including an adaptive loss function and a style transfer process for animation special-effects design, and conducts related experiments on the application of the style transfer algorithm to animation special effects. The experimental results show that the AI-based style transfer algorithm can effectively improve animation special effects. In our survey, more than 80% of respondents were satisfied with the animation special-effects design based on the style transfer algorithm.
Intelligent matching of heterogeneous remote sensing images is a common basic problem in the field of intelligent remote sensing image processing. To address the difficulty of matching satellite-aerial remote sensing images, this article proposes an intelligent matching method for heterogeneous remote sensing images based on style transfer. First, based on the idea of image style transfer with generative adversarial networks, the method improves the model's conversion of heterogeneous images by constructing a new generative-network loss function and converts satellite images into aerial-style images. Then, the advanced deep learning-based matching algorithms D2-Net and LoFTR are used to match the generated aerial image with the original aerial image. Finally, this transformation relationship is mapped to the corresponding satellite-aerial image pair to obtain the final matching result. The image style transfer experiments and matching experiments we carry out on different test datasets show that the smooth cycle-consistent generative adversarial network proposed in this article can effectively reduce the complexity of the algorithm and improve the quality of image generation. In addition, combining it with deep learning-based feature-matching methods can effectively improve the accuracy and robustness of the matching algorithm. Our code and data can be found at: https://gitee.com/AZQZ/intelligent-matching .
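For context, the cycle-consistency term at the heart of cycle-consistent generative adversarial networks can be sketched as follows. The generators G (satellite → aerial) and F (aerial → satellite) are passed as plain functions over flat pixel lists here, an illustrative simplification rather than the article's smooth cycle-consistent network.

```python
def l1(a, b):
    """Mean absolute (L1) difference between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cycle_consistency_loss(x, y, G, F):
    """L_cyc = ||F(G(x)) - x||_1 + ||G(F(y)) - y||_1

    x: a sample from the satellite domain; y: from the aerial domain.
    G maps satellite -> aerial, F maps aerial -> satellite; translating
    a sample to the other domain and back should reproduce it.
    """
    return l1(F(G(x)), x) + l1(G(F(y)), y)
```

This term is what lets the translation be trained without paired satellite-aerial images: the adversarial losses push G's outputs toward the aerial style, while the cycle loss keeps the scene content recoverable.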
Retrograde intrarenal surgery (RIRS) is a widely utilized diagnostic and therapeutic tool for multiple upper urinary tract pathologies. An image-guided navigation system can assist the surgeon in performing precise surgery by providing the relative position between the lesion and the instrument once the intraoperative image is registered with the preoperative model. However, due to the structural complexity and diversity of multi-branched organs such as kidneys and bronchi, the consistency of the intensity distribution between virtual and real images is challenged, which makes classical pure-intensity registration methods prone to biased and random results over a wide search domain. In this paper, we propose a structural feature similarity-based method combined with a semantic style transfer network, which significantly improves registration accuracy when the initial state deviation is large. Furthermore, multi-view constraints are introduced to compensate for the collapse of spatial depth information and improve the robustness of the algorithm. Experimental studies were conducted on two models generated from patient data to evaluate the performance of the method and competing algorithms. The proposed method obtains mean target registration errors (mTRE) of 0.971 ± 0.585 mm and 1.266 ± 0.416 mm, respectively, with better accuracy and robustness overall. Experimental results demonstrate that the proposed method has the potential to be applied in RIRS and extended to other organs with similar structures.
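For reference, the mean target registration error (mTRE) reported above is the mean Euclidean distance between corresponding 3-D target points after registration. A minimal sketch, with the point format (tuples of millimeter coordinates) assumed for illustration:

```python
import math

def mtre(registered_pts, reference_pts):
    """Mean target registration error: the mean Euclidean distance
    between corresponding target points after registration.

    registered_pts, reference_pts: equal-length sequences of
    (x, y, z) coordinates, here taken to be in millimeters.
    """
    dists = [math.dist(p, q) for p, q in zip(registered_pts, reference_pts)]
    return sum(dists) / len(dists)
```

An mTRE of 0.971 mm thus means the registered target points land, on average, just under a millimeter from their ground-truth positions.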