It is a challenging task to create realistic 3D avatars that accurately replicate individuals' speech and unique talking styles for speech-driven facial animation. Existing techniques have made remarkable progress but still struggle to achieve lifelike mimicry. This paper proposes "TalkingStyle", a novel method that generates personalized talking avatars while retaining each person's talking style. Our approach uses a set of audio and animation samples from an individual to create new facial animations that closely resemble their specific talking style, synchronized with speech. We disentangle the style codes from the motion patterns, allowing our method to associate a distinct identifier with each person. To manage each aspect effectively, we employ three separate encoders for style, speech, and motion, ensuring the preservation of the original style while maintaining consistent motion in our stylized talking avatars. Additionally, we propose a new style-conditioned transformer decoder, offering greater flexibility and control over the facial avatar styles. We comprehensively evaluate TalkingStyle through qualitative and quantitative assessments, as well as user studies, demonstrating its superior realism and lip synchronization accuracy compared to current state-of-the-art methods. To promote transparency and further advancements in the field, we also make the source code publicly available at https://github.com/wangxuanx/TalkingStyle.
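The abstract describes associating a learned style code with each speaker identity and conditioning the decoder on it. The paper's actual architecture is not reproduced here; the following is only a minimal numpy sketch of the core idea of per-speaker style conditioning, with all names (`style_codes`, `condition_on_style`) being hypothetical illustrations rather than the authors' API.

```python
import numpy as np

rng = np.random.default_rng(0)
num_speakers, d = 4, 8

# Hypothetical learned style codebook: one d-dimensional style code per
# speaker identity (in the paper these would be trained, not random).
style_codes = rng.normal(size=(num_speakers, d))

def condition_on_style(motion_feats, speaker_id):
    """Add the chosen speaker's style code to every frame of the decoder
    input features -- a minimal stand-in for style conditioning."""
    return motion_feats + style_codes[speaker_id]

frames = rng.normal(size=(10, d))   # 10 frames of speech/motion features
out = condition_on_style(frames, speaker_id=2)
print(out.shape)
```

Because the style code is a separate lookup keyed by identity, swapping `speaker_id` restyles the same speech content, which is the flexibility the style-conditioned decoder is meant to provide.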
The purposes of this study are to classify the body types of Korean women in their twenties and thirties for the creation of 3D avatars, to propose representative body sizes for each body type by analyzing their body measurements, to propose a 3D avatar modeling process design that reflects their body shapes, and to present standard 3D avatars for each body type, verified for measurement suitability. The 3D anthropometric data of the Korean Anthropometric Survey (6th Size Korea), conducted in 2010, was used in this study; the subjects were 410 Korean women in their twenties and thirties. A 3D avatar modeling process using Maya 2013 was proposed to create representative 3D avatars that show superior measurement suitability. This process includes four steps: analyzing body size measurements, 2D image plane design, 3D avatar modeling, and 3D avatar evaluation. The 3D avatars created with this process showed an acceptable range of error. Factor analysis was performed on fifty-five body measurements chosen from the 6th Size Korea anthropometric survey, and seven factors were extracted. With these seven factors, the body shapes of 406 Korean women in their twenties and thirties were classified into four groups by cluster analysis. The classified groups were named Full & Short, Slim & Short, Full & Tall, and Slim & Tall.
• This study can be the start of developing various 3D avatars that represent Koreans.
• A 3D avatar modeling process using Maya 2013 was proposed to create representative 3D avatars that show superior measurement suitability.
• Body shapes of 406 Korean women in their twenties and thirties were classified into four groups by cluster analysis: Body Type 1 (Full & Short), Body Type 2 (Slim & Short), Body Type 3 (Full & Tall), and Body Type 4 (Slim & Tall).
Summary
Generating immersive virtual reality avatars is a challenging task in VR/AR applications, which maps physical human body poses to avatars in virtual scenes for an immersive user experience. ...However, most existing work is time‐consuming and limited by datasets, which does not satisfy immersive and real‐time requirements of VR systems. In this paper, we aim to generate 3D real‐time virtual reality avatars based on a monocular camera to solve these problems. Specifically, we first design a self‐attention distillation network (SADNet) for effective human pose estimation, which is guided by a pre‐trained teacher. Secondly, we propose a lightweight pose mapping method for human avatars that utilizes the camera model to map 2D poses to 3D avatar keypoints, generating real‐time human avatars with pose consistency. Finally, we integrate our framework into a VR system, displaying generated 3D pose‐driven avatars on Helmet‐Mounted Display devices for an immersive user experience. We evaluate SADNet on two publicly available datasets. Experimental results show that SADNet achieves a state‐of‐the‐art trade‐off between speed and accuracy. In addition, we conducted a user experience study on the performance and immersion of virtual reality avatars. Results show that pose‐driven 3D human avatars generated by our method are smooth and attractive.
Generating immersive virtual reality avatars is a challenging task in VR/AR applications, which maps physical human body poses to avatars in virtual scenes for an immersive user experience. However, most existing work is time‐consuming and limited by datasets, which does not satisfy immersive and real‐time requirements of VR systems. In this paper, we aim to generate 3D real‐time virtual reality avatars based on a monocular camera to solve these problems. Specifically, we first design a self‐attention distillation network (SADNet) for effective human pose estimation, which is guided by a pre‐trained teacher. Secondly, we propose a lightweight pose mapping method for human avatars that utilizes the camera model to map 2D poses to 3D avatar keypoints, generating real‐time human avatars with pose consistency. Finally, we integrate our framework into a VR system, displaying generated 3D pose‐driven avatars on Helmet‐Mounted Display devices for an immersive user experience. We evaluate SADNet on two publicly available datasets. Experimental results show that SADNet achieves a state‐of‐the‐art trade‐off between speed and accuracy. In addition, we conducted a user experience study on the performance and immersion of virtual reality avatars. Results show that pose‐driven 3D human avatars generated by our method are smooth and attractive.
Arabic sign language (ArSL) is the natural language of the deaf community in Arabic countries. Deaf people have a set of difficulties due to poor services available. They have problems accessing ...essential information or receiving an education, communicating with other communities, and engaging in activities. Thus, a machine translation system of Arabic to ArSL has been developed using avatar technologies. Firstly, a dictionary of ArSL was constructed using eSign editor Software. The constructed dictionary has three thousand signs. It can be adopted for the translation system in which written text can be transformed into sign language. The dictionary will be available as a free resource for researchers. It is complex and time-consuming, but it is an essential step in the machine translation of whole Arabic text to ArSL with 3D animations. Secondly, the translator has been developed. It performs syntactic and morphological analysis and then applies a set of rules to translate an Arabic text into ArSL text based on the structure and grammar of ArSL. The system is evaluated according to the parallel corpus that consists of 180 sentences using the metric for evaluation of translation with explicit ordering metric for evaluation of translation with explicit ordering (METEOR) our system achieves a relative score of (86%).
The “global village” predicted by Marshall McLuhan referred among others, to the individual’s sense of intimate belonging to a planetary collective identity. Compared to the informational content of ...writing on social media, the emotional component has at least the same importance, which is why completing the text with suggestive icons has become more than a trend. The transition from emoticon to
and then to emoji captures an important evolution in terms of the individual’s identity in online media. The widening of the iconic register with objective and corporeal details indicates an articulation of the hybrid language with a reminiscent figurative morphology of ancient glyphs. Further on, bitmoji foresees the need for individual identity in the virtual world of Metaverse, where the NFT avatar will represent, most likely, the unique and idealized version of the user.