Optical motion capture is commonly used in biomechanics to measure human kinematics. However, no studies have yet examined the accuracy of optical motion capture in a large capture volume (>100 m³), or how accuracy varies from the center to the extreme edges of the capture volume. This study measured the dynamic 3D errors of an optical motion capture system composed of 42 OptiTrack Prime 41 cameras (capture volume of 135 m³) by comparing the motion of a single marker to the motion reported by a Thorlabs linear motion stage. After spline-interpolating the data, it was found that 97% of the capture area had error below 200 μm. When the same analysis was performed using only half (21) of the cameras, 91% of the capture area was below 200 μm of error. The only locations that exceeded this threshold were at the extreme edges of the capture area, and no location had a mean error exceeding 1 mm. When measuring human kinematics with skin-mounted markers, uncertainty of marker placement relative to underlying skeletal features and soft tissue artifact produce errors that are orders of magnitude larger than the errors attributed to the camera system itself. Therefore, the accuracy of this OptiTrack optical motion capture system was found to be more than sufficient for measuring full-body human kinematics with skin-mounted markers in a large capture volume (>100 m³).
This paper addresses radar-based human gait recognition using a dual-channel deep convolutional neural network (DC-DCNN). To enrich the limited radar data set of human gaits and provide a benchmark for classifier training, evaluation, and comparison, it proposes an effective method for generating radar echoes from the infrared, publicly accessible motion capture (MOCAP) data set. Exploiting the different nonstationary characteristics of the micro-Doppler (m-D) signatures of the torso and limbs, it enhances their distinguishable joint time-frequency (JTF) features by applying short-time Fourier transforms (STFTs) with varying sliding-window lengths, and then designs the DC-DCNN structure to achieve refined human gait recognition through separate feature extraction and fusion. Experiments show that, compared with a traditional single-channel deep convolutional neural network (SC-DCNN), the proposed method achieves higher recognition accuracy in refined human gait classification without requiring additional radar resources and could readily be extended to refined recognition of other human activities.
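The varying-window STFT idea can be sketched as follows: a short analysis window favors time resolution for the fast limb micro-Doppler, while a long window favors frequency resolution for the slowly varying torso return, and the two spectrogram magnitudes form the two CNN input channels. This is a minimal illustration with a synthetic echo and assumed parameters (sampling rate, window lengths), not the paper's processing chain:

```python
import numpy as np
from scipy.signal import stft

fs = 1000  # Hz, assumed sampling rate of the baseband radar echo
t = np.arange(0, 2, 1 / fs)
# Synthetic echo: a torso Doppler line plus a sinusoidal limb micro-Doppler term
echo = np.exp(1j * 2 * np.pi * (50 * t + 10 * np.sin(2 * np.pi * 1.5 * t)))

# Short window -> better time resolution for fast limb motion;
# long window -> better frequency resolution for the slowly varying torso.
_, _, Z_short = stft(echo, fs=fs, nperseg=64, noverlap=48, return_onesided=False)
_, _, Z_long = stft(echo, fs=fs, nperseg=256, noverlap=192, return_onesided=False)

# The two spectrogram magnitudes would form the dual-channel CNN input
chan_short = np.abs(Z_short)   # 64 frequency bins, many time frames
chan_long = np.abs(Z_long)     # 256 frequency bins, fewer time frames
```

In a dual-channel design, each spectrogram would be resized and fed to its own convolutional branch before feature fusion; the branch names and sizes here are illustrative only.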
Due to recent advances in virtual reality (VR) technology, the development of immersive VR applications that track body motions and visualize a full-body avatar is attracting increasing research interest. This paper reviews related research to gather and critically analyze recent improvements regarding the potential of full-body motion reconstruction in VR applications. We conducted a systematic literature search, matching VR- and full-body-tracking-related keywords on IEEE Xplore, PubMed, ACM, and Scopus. Fifty-three publications were included and assigned to three groups: studies using markerless and marker-based motion tracking systems, as well as systems using inertial measurement units. All analyzed publications track the motions of a user wearing a head-mounted display and visualize a full-body avatar. The analysis confirmed that a full-body avatar can enhance the sense of embodiment and improve immersion within VR. The results indicated that the Kinect is still the most frequently used sensor (27 out of 53 studies). Furthermore, there is a trend toward tracking the movements of multiple users simultaneously. Many studies that enable a multiplayer mode in VR use marker-based systems (7 out of 17) because they are much more robust and can accurately track the full-body movements of multiple users in real time.
We propose a multi-sensor fusion method for capturing challenging 3D human motions with accurate consecutive local poses and global trajectories in large-scale scenarios, using only a single LiDAR and 4 IMUs, which are set up conveniently and worn lightly. Specifically, to fully utilize the global geometry information captured by the LiDAR and the local dynamic motions captured by the IMUs, we design a two-stage pose estimator in a coarse-to-fine manner, where point clouds provide the coarse body shape and IMU measurements optimize the local actions. Furthermore, considering the translation deviation caused by the view-dependent partial point cloud, we propose a pose-guided translation corrector. It predicts the offset between the captured points and the real root locations, which makes the consecutive movements and trajectories more precise and natural. Moreover, we collect a LiDAR-IMU multi-modal mocap dataset, LIPD, with diverse human actions in long-range scenarios. Extensive quantitative and qualitative experiments on LIPD and other open datasets demonstrate the capability of our approach for compelling motion capture in large-scale scenarios, outperforming other methods by a clear margin. We will release our code and captured dataset to stimulate future research.
Objective: To investigate the validity and test-retest reliability of a customized markerless motion capture (MMC) system that used iPad Pros with a Light Detection And Ranging (LiDAR) scanner at two different viewing angles to measure the active range of motion (AROM) and the angular waveform of the upper-limb joint angles of healthy adults performing functional tasks.
Methods: Participants were asked to perform shoulder and elbow actions for the investigator to take AROM measurements, followed by 4 tasks that simulated daily functioning. Each participant attended 2 experimental sessions, held at least 2 days and at most 14 days apart. A Vicon system and 2 iPad Pros installed with our MMC system were placed at 2 different angles to the participants and recorded their movements concurrently during each task.
Participants: Thirty healthy adults (mean age: 28.9 years; M/F ratio: 40/60).
Interventions: Not applicable.
Main Outcome Measures: The AROM and the angular waveform of the upper-limb joint angles.
Results: The iPad Pro MMC system underestimated the shoulder and elbow joint angles in all 4 simulated functional tasks. The MMC demonstrated good to excellent test-retest reliability for the shoulder joint AROM measurements in all 4 tasks.
Conclusions: The maximal AROM measurements calculated by the MMC system were consistently smaller than those measured by the goniometer. The iPad Pro-based MMC system might not be able to replace conventional goniometry for clinical ROM measurements, but it is still suggested for home-based and telerehabilitation training involving intra-subject measurements because of its good reliability, low cost, and portability. Further development to improve its motion capture and analysis performance in disease populations is warranted.
Character controllers using motion VAEs. Ling, Hung Yu; Zinno, Fabio; Cheng, George; … ACM Transactions on Graphics, 07/2020, Volume 39, Issue 4. Journal article, peer reviewed, open access.
A fundamental problem in computer animation is that of realizing purposeful and realistic human movement given a sufficiently rich set of motion capture clips. We learn data-driven generative models of human movement using autoregressive conditional variational autoencoders, or Motion VAEs. The latent variables of the learned autoencoder define the action space for the movement and thereby govern its evolution over time. Planning or control algorithms can then use this action space to generate desired motions. In particular, we use deep reinforcement learning to learn controllers that achieve goal-directed movements. We demonstrate the effectiveness of the approach on multiple tasks. We further evaluate system-design choices and describe the current limitations of Motion VAEs.
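The role of the latent variables as an action space can be sketched with a toy autoregressive decoder: each frame, a controller (or random sampling from the prior) picks a latent vector z, and the decoder maps the previous pose plus z to the next pose. The network below uses random, untrained weights and assumed dimensions purely to illustrate the interface, not the trained Motion VAE:

```python
import numpy as np

rng = np.random.default_rng(0)
POSE_DIM, LATENT_DIM, HID = 63, 32, 256  # assumed sizes, for illustration only

# Stand-in for a trained conditional decoder: maps (previous pose, latent
# action z) to the next pose. Random weights illustrate the interface only.
W1 = rng.normal(0, 0.01, (POSE_DIM + LATENT_DIM, HID))
W2 = rng.normal(0, 0.01, (HID, POSE_DIM))

def decode(prev_pose, z):
    h = np.tanh(np.concatenate([prev_pose, z]) @ W1)
    return prev_pose + h @ W2  # residual update of the pose

# A control policy (e.g. learned with RL) would choose z each step to reach
# a goal; here we simply sample the standard-normal prior.
pose = np.zeros(POSE_DIM)
trajectory = [pose]
for _ in range(30):
    z = rng.standard_normal(LATENT_DIM)  # the "action"
    pose = decode(pose, z)
    trajectory.append(pose)
```

The key design point the abstract describes is that the controller never outputs poses directly: it outputs z, and the decoder constrains the result to plausible motion.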
Optical marker-based motion capture is the dominant way for obtaining high-fidelity human body animation for special effects, movies, and video games. However, motion capture has seen limited ...application to the human hand due to the difficulty of automatically identifying (or labeling) identical markers on self-similar fingers. We propose a technique that frames the labeling problem as a keypoint regression problem conducive to a solution using convolutional neural networks. We demonstrate robustness of our labeling solution to occlusion, ghost markers, hand shape, and even motions involving two hands or handheld objects. Our technique is equally applicable to sparse or dense marker sets and can run in real-time to support interaction prototyping with high-fidelity hand tracking and hand presence in virtual reality.
MoGlow. Henter, Gustav Eje; Alexanderson, Simon; Beskow, Jonas. ACM Transactions on Graphics, 11/2020, Volume 39, Issue 6. Journal article, peer reviewed, open access.
Data-driven modelling and synthesis of motion is an active research area with applications that include animation, games, and social robotics. This paper introduces a new class of probabilistic, generative, and controllable motion-data models based on normalising flows. Models of this kind can describe highly complex distributions, yet can be trained efficiently using exact maximum likelihood, unlike GANs or VAEs. Our proposed model is autoregressive and uses LSTMs to enable arbitrarily long time dependencies. Importantly, it is also causal, meaning that each pose in the output sequence is generated without access to poses or control inputs from future time steps; this absence of algorithmic latency is important for interactive applications with real-time motion control. The approach can in principle be applied to any type of motion since it does not make restrictive, task-specific assumptions regarding the motion or the character morphology. We evaluate the models on motion-capture datasets of human and quadruped locomotion. Objective and subjective results show that randomly-sampled motion from the proposed method outperforms task-agnostic baselines and attains a motion quality close to recorded motion capture.
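The "exact maximum likelihood" property of normalising flows comes from invertible transforms with tractable Jacobians. A single affine coupling layer, sketched below in numpy with assumed toy dimensions, shows how the exact log-density of a data point is the base-distribution log-density of its transformed image plus the log-determinant of the transform; this is a generic flow sketch, not MoGlow's LSTM-conditioned architecture:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 4  # toy feature dimension (even, so it can be split for coupling)

# One affine coupling layer (data -> latent direction):
#   z1 = x1,  z2 = x2 * exp(s(x1)) + t(x1)
# s and t are small networks; here single linear/tanh maps with random weights.
Ws = rng.normal(0, 0.1, (D // 2, D // 2))
Wt = rng.normal(0, 0.1, (D // 2, D // 2))

def forward(x):
    x1, x2 = x[: D // 2], x[D // 2 :]
    s, t = np.tanh(x1 @ Ws), x1 @ Wt
    z = np.concatenate([x1, x2 * np.exp(s) + t])
    log_det = s.sum()  # Jacobian is triangular, so the log-det is exact
    return z, log_det

def log_likelihood(x):
    # Exact density under a standard-normal base distribution:
    # log p(x) = log N(z; 0, I) + log|det dz/dx|
    z, log_det = forward(x)
    log_base = -0.5 * (z @ z + D * np.log(2 * np.pi))
    return log_base + log_det
```

Because `log_likelihood` is exact and differentiable, such a model can be trained by directly maximizing it, which is the training advantage over GANs and VAEs that the abstract points to.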
New vision sensors, such as the dynamic and active-pixel vision sensor (DAVIS), incorporate a conventional global-shutter camera and an event-based sensor in the same pixel array. These sensors have ...great potential for high-speed robotics and computer vision because they allow us to combine the benefits of conventional cameras with those of event-based sensors: low latency, high temporal resolution, and very high dynamic range. However, new algorithms are required to exploit the sensor characteristics and cope with its unconventional output, which consists of a stream of asynchronous brightness changes (called “events”) and synchronous grayscale frames. For this purpose, we present and release a collection of datasets captured with a DAVIS in a variety of synthetic and real environments, which we hope will motivate research on new algorithms for high-speed and high-dynamic-range robotics and computer-vision applications. In addition to global-shutter intensity images and asynchronous events, we provide inertial measurements and ground-truth camera poses from a motion-capture system. The latter allows comparing the pose accuracy of ego-motion estimation algorithms quantitatively. All the data are released both as standard text files and binary files (i.e. rosbag). This paper provides an overview of the available data and describes a simulator that we release open-source to create synthetic event-camera data.
Using neural networks to learn motion controllers from motion capture data is becoming popular due to the natural and smooth motions they can produce, the wide range of movements they can learn, and their compactness once trained. Despite these advantages, such systems require large amounts of motion capture data for each new character or style of motion to be generated, and they have to undergo lengthy retraining, and often reengineering, to get acceptable results. This can make their use impractical for animators and designers, and solving this issue is an open and rather unexplored problem in computer graphics. In this paper we propose a transfer learning approach for adapting a learned neural network to characters that move in styles different from those on which the original network was trained. Given a pretrained character controller in the form of a Phase-Functioned Neural Network for locomotion, our system can quickly adapt the locomotion to novel styles using only a short motion clip as an example. We introduce a canonical polyadic tensor decomposition to reduce the number of parameters required for learning each new style, which both reduces the memory burden at runtime and facilitates learning from smaller quantities of data. We show that our system is suitable for learning stylized motions from a few clips of motion data and synthesizing smooth motions in real time.
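The parameter saving from a canonical polyadic (CP) decomposition can be illustrated by factorizing a per-style weight tensor into shared bases plus a small per-style coefficient vector; adapting to a new style then only requires learning the R per-style coefficients. The dimensions and rank below are assumptions for illustration, not the paper's values:

```python
import numpy as np

rng = np.random.default_rng(2)
S, IN, OUT, R = 10, 512, 512, 30  # styles, layer width, CP rank (all assumed)

# Full parameterization: one independent IN x OUT weight matrix per style.
full_params = S * IN * OUT

# CP decomposition of the style-weight tensor:
#   W[s] ~= sum_r A[s, r] * outer(B[:, r], C[:, r])
A = rng.normal(size=(S, R))    # per-style coefficients (learned per new style)
B = rng.normal(size=(IN, R))   # shared input basis
C = rng.normal(size=(OUT, R))  # shared output basis
cp_params = S * R + IN * R + OUT * R

def style_weight(s):
    # Reconstruct style s's weight matrix from the shared factors
    return (B * A[s]) @ C.T

W0 = style_weight(0)
```

With these toy numbers the factorized form stores roughly 31k parameters instead of 2.6M, and a new style adds only R coefficients on top of the shared bases, which is what makes learning from a single short clip plausible.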