We present a method to create personalized anatomical models ready for physics-based animation, using only a set of 3D surface scans. We start by building a template anatomical model of an average male which supports deformations due to both 1) subject-specific variations: shapes and sizes of bones, muscles, and adipose tissues, and 2) skeletal poses. Next, we capture a set of 3D scans of an actor in various poses. Our key contribution is formulating and solving a large-scale optimization problem in which we compute both subject-specific and pose-dependent parameters such that our resulting anatomical model explains the captured 3D scans as closely as possible. Compared to data-driven body modeling techniques that focus only on the surface, our approach has the advantage of creating physics-based models, which provide realistic 3D geometry of the bones and muscles and naturally support effects such as inertia, gravity, and collisions according to Newtonian dynamics.
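To make the joint fitting concrete, here is a minimal Python sketch of the idea of optimizing one shared subject vector together with a per-scan pose vector. The function `model_surface` is a hypothetical stand-in for the template's deformation model, and the correspondence between model vertices and scan points is assumed given; the real system solves a far larger problem with anatomical terms for bones, muscles, and fat.

```python
# A minimal sketch, assuming a hypothetical `model_surface(subject, pose)`
# that returns template surface vertices for given shape/pose parameters.
import numpy as np
from scipy.optimize import minimize

def model_surface(subject, pose):
    # Hypothetical stand-in: a trivial linear "shape + pose" deformation.
    base = np.zeros((100, 3))
    return base + subject.sum() * 0.01 + pose.sum() * 0.01

def fit(scans, n_subject=10, n_pose=20):
    """Jointly optimize one subject vector and one pose vector per scan."""
    def objective(theta):
        subject = theta[:n_subject]
        err = 0.0
        for i, scan in enumerate(scans):
            pose = theta[n_subject + i * n_pose : n_subject + (i + 1) * n_pose]
            err += np.sum((model_surface(subject, pose) - scan) ** 2)
        return err

    theta0 = np.zeros(n_subject + n_pose * len(scans))
    return minimize(objective, theta0, method="L-BFGS-B")
```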
Considerable research has been conducted in the areas of audio-driven virtual character gestures and facial animation, with some degree of success. However, few methods exist for generating full-body animations, and the portability of virtual character gestures and facial animations has not received sufficient attention.
Therefore, we propose a deep-learning-based audio-to-animation-and-blendshape (Audio2AB) network that generates gesture animations and ARKit's 52 facial expression blendshape weights from audio, audio-corresponding text, emotion labels, and semantic relevance labels, producing parametric data for full-body animations. This parameterization can be used to drive full-body animations of virtual characters and improve their portability. In the experiment, we first downsampled the gesture and facial data to give the input, output, and facial data the same temporal resolution. The Audio2AB network then encoded the audio, audio-corresponding text, emotion labels, and semantic relevance labels, and fused the text, emotion, and semantic relevance labels into the audio to obtain better audio features. Finally, we established links between the body, gesture, and facial decoders and generated the corresponding animation sequences through our proposed GAN-GF loss function.
By using audio, audio-corresponding text, and emotion and semantic relevance labels as input, the trained Audio2AB network can generate gesture animation data containing blendshape weights. Different 3D virtual character animations can therefore be created through parameterization.
The experimental results showed that the proposed method could generate expressive gestures and facial animations.
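As a rough illustration of the multi-head output described above (not the paper's actual architecture), the sketch below assumes PyTorch, a single GRU over already-fused audio features, and the 52 ARKit blendshape weights; all layer sizes and names are invented.

```python
# A toy sketch, assuming pre-fused audio features of shape (B, T, audio_dim).
import torch
import torch.nn as nn

class ToyAudio2AB(nn.Module):
    def __init__(self, audio_dim=128, hidden=256, n_joints=55):
        super().__init__()
        self.encoder = nn.GRU(audio_dim, hidden, batch_first=True)
        self.body_head = nn.Linear(hidden, n_joints * 3)   # joint rotations
        self.face_head = nn.Linear(hidden, 52)             # blendshape weights

    def forward(self, fused_audio_features):
        h, _ = self.encoder(fused_audio_features)          # (B, T, hidden)
        gestures = self.body_head(h)
        blendshapes = torch.sigmoid(self.face_head(h))     # weights in [0, 1]
        return gestures, blendshapes
```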
In this paper, we present a technique for generating animation from a variety of user-defined constraints. We pose constraint-based motion synthesis as a maximum a posteriori (MAP) problem and develop an optimization framework that generates natural motion satisfying user constraints. The system automatically learns a statistical dynamic model from motion capture data and then enforces it as a motion prior. This motion prior, together with user-defined constraints, comprises a trajectory optimization problem. Solving this problem in the low-dimensional space yields optimal natural motion that achieves the goals specified by the user. We demonstrate the effectiveness of this approach by generating whole-body and facial motion from a variety of spatial-temporal constraints.
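Concretely, MAP synthesis maximizes p(motion | constraints) ∝ p(constraints | motion) p(motion), i.e., it minimizes the prior's negative log-likelihood plus a constraint-violation term. The sketch below assumes a hypothetical Gaussian prior over a low-dimensional trajectory vector and simple per-frame equality constraints; the paper instead learns its motion prior from motion capture data.

```python
# A minimal sketch of MAP trajectory optimization under assumed inputs.
import numpy as np
from scipy.optimize import minimize

def map_synthesis(prior_mean, prior_prec, constraints, weight=100.0):
    """Minimize -log p(x) + weight * constraint violation over trajectory x.

    constraints: list of (frame_index, target_value) pairs.
    """
    def neg_log_posterior(x):
        d = x - prior_mean
        prior_term = 0.5 * d @ prior_prec @ d          # -log Gaussian prior
        cons_term = sum((x[t] - v) ** 2 for t, v in constraints)
        return prior_term + weight * cons_term

    return minimize(neg_log_posterior, prior_mean, method="L-BFGS-B").x

# Example: a 10-frame trajectory pinned at both ends.
# x = map_synthesis(np.zeros(10), np.eye(10), [(0, 0.0), (9, 1.0)])
```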
Nowadays there are many real-time applications, such as robotic motion, driverless vehicles, intelligent target shooters (bullets and missiles), and traffic routing, in which human intervention is avoided. This paper proposes a generalized approach for intelligent target hitting along an obstructed path using physical body animation and a genetic algorithm. The approach uses concepts from genetic algorithms to train the object to find the right path to the target, and concepts from physical body animation to provide the motion and to react to collisions with obstacles. Physical body animation gives a very natural feel of a real-time environment, since we account for external natural forces acting on the object, such as gravity and wind resistance. The proposed approach handles not only static targets but also targets that move during the simulation.
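A toy sketch of the GA-over-physics loop, with invented one-dimensional dynamics: each genome encodes a launch control, fitness is the miss distance after a simple physics rollout, and selection plus mutation evolve the population. None of the names or constants come from the paper.

```python
# A minimal sketch, assuming toy 1-D dynamics with drag and gravity.
import random

def simulate(genome, target):
    # Hypothetical stand-in for a physics rollout of the animated body.
    x, v = 0.0, genome[0]
    for _ in range(100):
        v += -0.01 * v - 0.098      # drag + gravity (toy constants)
        x += v * 0.1
    return abs(x - target)          # fitness: miss distance

def evolve(target, pop_size=50, gens=100):
    pop = [[random.uniform(0, 10)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda g: simulate(g, target))
        parents = pop[: pop_size // 2]                    # selection
        children = [[p[0] + random.gauss(0, 0.5)] for p in parents]  # mutation
        pop = parents + children
    return min(pop, key=lambda g: simulate(g, target))
```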
We present a system for finding and tracking a face and extracting global and local animation parameters from a video sequence. The system uses an initial colour-processing step for finding a rough estimate of the position, size, and in-plane rotation of the face, followed by a refinement step driven by an active model. The latter step refines the previous estimate and also extracts local animation parameters. The system is able to track the face and some facial features in near real-time, and can compress the result to a bitstream compliant with MPEG-4 face and body animation.
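The initial colour step might look roughly like the following OpenCV sketch; the HSV skin range is a generic assumption rather than the paper's values, and the active-model refinement is not shown.

```python
# A rough sketch of colour-based face localization, assuming OpenCV 4.x.
import cv2
import numpy as np

def rough_face_estimate(frame_bgr):
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # Generic (assumed) skin-tone range in HSV.
    mask = cv2.inRange(hsv, np.array([0, 40, 60]), np.array([25, 180, 255]))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
    return x, y, w, h   # rough position and size; in-plane rotation would
                        # come from, e.g., fitting an ellipse to the blob
```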
Amar, Islam; Elsana, Abdalla; El-Sana, Jihad. Live Free-View Video for Soccer Games. In Proceedings of the 28th International ACM Conference on 3D Web Technology, October 2023.
Free-view video lets viewers choose their camera parameters when watching a recorded or live event; they can interactively control the camera view and choose to focus on different parts of the scene. This paper presents a novel client-server architecture for free-view videos of sports. The clients obtain a detailed 3D representation of the players and the game field from the server or a shared repository. The server receives video streams from several cameras around the game field, detects the players, determines the camera with the best view, extracts the pose of each player, and encodes this data with a timestamp into a snapshot, which is streamed to the clients. A client receives a stream of snapshots, applies each pose to the appropriate player's 3D model (avatar), and renders the scene according to the user's virtual camera. We implemented our approach using VIBE [Kocabas et al. 2020] for pose extraction and obtained promising results. We transferred a soccer game into a 3D representation supporting free-view with low reconstruction error. Our unoptimized implementation is nearly real-time; it runs at about 30 frames per second.
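A minimal sketch of what a timestamped snapshot could look like on the wire; all field names are hypothetical, and the paper's actual encoding is not specified here.

```python
# A sketch of the snapshot stream, assuming JSON payloads and invented fields.
import json
import time
from dataclasses import dataclass, asdict

@dataclass
class PlayerPose:
    player_id: int
    position: tuple          # (x, y, z) on the field
    joints: list             # e.g. pose parameters from VIBE

@dataclass
class Snapshot:
    timestamp: float
    poses: list              # list of PlayerPose

def encode_snapshot(poses):
    # Server side: timestamp the per-player poses and serialize.
    return json.dumps(asdict(Snapshot(time.time(), poses)))

def decode_snapshot(payload):
    # Client side: decode, then apply each pose to the matching avatar.
    return json.loads(payload)
```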
Automatic control of conversational agents has applications ranging from animation, through human-computer interaction, to robotics. In interactive communication, an agent must move to express its own discourse and also react naturally to incoming speech. In this paper we propose a Flow Variational Autoencoder (Flow-VAE) deep learning architecture for transforming conversational speech into body gesture, during both speaking and listening. The model uses a normalising flow to perform variational inference in an autoencoder framework, yielding a more expressive approximate posterior than the Gaussian of conventional variational autoencoders. Our model is non-deterministic, so it can produce variations of plausible gestures for the same speech. Our evaluation demonstrates that our approach produces expressive body motion close to the ground truth using a fraction of the trainable parameters of previous state-of-the-art methods.
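For illustration, a single planar-flow step of the kind used to make a VAE posterior more expressive is sketched below in PyTorch; a Flow-VAE stacks such invertible transforms inside the encoder, and this is not the paper's exact architecture.

```python
# A minimal planar-flow transform, assuming PyTorch; f(z) = z + u*tanh(w.z+b)
# with the change-of-variables log-determinant accumulated for the ELBO.
import torch
import torch.nn as nn

class PlanarFlow(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.u = nn.Parameter(torch.randn(dim) * 0.01)
        self.w = nn.Parameter(torch.randn(dim) * 0.01)
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, z):                              # z: (B, dim)
        lin = z @ self.w + self.b                      # (B,)
        f = z + self.u * torch.tanh(lin).unsqueeze(-1)
        psi = (1 - torch.tanh(lin) ** 2).unsqueeze(-1) * self.w
        log_det = torch.log(torch.abs(1 + psi @ self.u) + 1e-8)
        return f, log_det
```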
This paper introduces a real-time model-based human motion tracking and analysis method for human-computer interface (HCI). The method tracks and analyzes human motion from two orthogonal views without using any markers. The motion parameters are estimated by pattern matching between the extracted human silhouette and the human model. First, the human silhouette is extracted and the body definition parameters (BDPs) are obtained. Second, the body animation parameters (BAPs) are estimated by a hierarchical tritree overlapping searching algorithm. To verify the performance of our method, we demonstrate different human posture sequences and use a hidden Markov model (HMM) for posture recognition testing.
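For the recognition test, an HMM decodes the most likely posture sequence from per-frame features. Below is a compact, self-contained Viterbi decoder in NumPy as a sketch of that step; the emission log-likelihoods are assumed to come from the extracted BAP features, which the paper does not detail here.

```python
# A compact log-space Viterbi decoder, assuming precomputed probabilities.
import numpy as np

def viterbi(log_init, log_trans, log_emit):
    """log_emit: (T, S) per-frame log-likelihood of each posture state."""
    T, S = log_emit.shape
    dp = np.full((T, S), -np.inf)
    back = np.zeros((T, S), dtype=int)
    dp[0] = log_init + log_emit[0]
    for t in range(1, T):
        scores = dp[t - 1][:, None] + log_trans        # (S, S) transition scores
        back[t] = scores.argmax(axis=0)
        dp[t] = scores.max(axis=0) + log_emit[t]
    path = [int(dp[-1].argmax())]
    for t in range(T - 1, 0, -1):                      # backtrace
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```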