This study sought to compare and validate baseball pitching mechanics, including joint angles and spatiotemporal parameters, from a single-camera markerless motion capture solution against a 3D optical marker-based system. Ten healthy pitchers threw 2-3 maximum-effort fastballs while concurrently using marker-based optical capture and pitchAI™ (markerless) motion capture. Time-series measures were compared using the coefficient of determination (r²) and root mean square error (RMSE). Discrete kinematic measures at foot plant, maximal shoulder external rotation (MER), and ball release, plus four spatiotemporal parameters, were evaluated using descriptive statistics, Bland-Altman analyses, Pearson's correlation coefficients, p-values, r², and RMSE. For time-series angles, r² ranged from 0.69 (glove-arm shoulder external rotation) to 0.98 (trunk and pelvis rotation), and RMSE ranged from 4.37° (trunk lateral tilt) to 20.78° (glove-arm shoulder external rotation). Bias for individual joint-angle and spatiotemporal parameters ranged from −11.31 (glove-arm shoulder horizontal abduction at MER) to 12.01 (ball visible). RMSE was 3.62 m/s for arm speed, 5.75% of height for stride length, and 21.75 ms for the ball-visible metric. pitchAI™ can be recommended as a markerless alternative to marker-based motion capture for quantifying pitching kinematics. A database of pitchAI™ ranges should be established for comparison between systems.
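As context for the agreement statistics used above, a minimal sketch of the RMSE and Bland-Altman computations (NumPy; the helper names are illustrative, not from the study) might look like:

```python
import numpy as np

def rmse(a, b):
    """Root mean square error between two paired series."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(np.sqrt(np.mean((a - b) ** 2)))

def bland_altman(a, b):
    """Bland-Altman bias and 95% limits of agreement.

    Returns (bias, lower_loa, upper_loa), where bias is the mean
    difference (a - b) and the limits are bias +/- 1.96 * SD of
    the differences (sample SD, ddof=1).
    """
    diff = np.asarray(a, float) - np.asarray(b, float)
    bias = float(np.mean(diff))
    sd = float(np.std(diff, ddof=1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

Here `a` and `b` would be the same discrete measure (e.g., a joint angle at foot plant) from the markerless and marker-based systems across trials.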
Marker-based motion capture (mocap) is a conventional method used in biomechanics research to precisely analyze human movement. However, the time-consuming marker placement process and extensive post-processing limit its wider adoption. Markerless mocap systems that use deep learning to estimate 2D keypoints from images have therefore emerged as a promising alternative, but annotation errors in the training datasets used by deep learning models can affect estimation accuracy. To improve the precision of 2D keypoint annotation, we present a method that uses anatomical landmarks derived from marker-based mocap. Specifically, we use multiple RGB cameras synchronized and calibrated with a marker-based mocap system to create a high-quality dataset (RRIS40) of images annotated with surface anatomical landmarks. A deep neural network is then trained to estimate these 2D anatomical landmarks, and a ray-distance-based triangulation is used to calculate the 3D marker positions. We conducted extensive evaluations on our RRIS40 test set, which consists of 10 subjects performing various movements. Compared against a marker-based system, our method achieves a mean Euclidean error of 13.23 mm in 3D marker position, which is comparable to the precision of marker placement itself. By learning to predict anatomical keypoints directly from images, our method outperforms OpenCap's augmentation of 3D anatomical landmarks from triangulated wild keypoints. This highlights the potential to facilitate wider integration of markerless mocap into biomechanics research. The RRIS40 test set is made publicly available for research purposes at koonyook.github.io/rris40 .
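The abstract does not give the exact form of its ray-distance-based triangulation; a common least-squares variant, which finds the 3D point minimizing the summed squared distances to the camera rays, can be sketched as follows (hypothetical function name, NumPy):

```python
import numpy as np

def triangulate_rays(centers, directions):
    """Least-squares 3D point minimizing summed squared distances to rays.

    centers:    (N, 3) camera centers
    directions: (N, 3) ray directions (normalized internally)

    For each ray, P = I - d d^T projects onto the plane orthogonal to
    the ray; the optimum solves (sum P_i) p = sum (P_i c_i).
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for c, d in zip(np.asarray(centers, float), np.asarray(directions, float)):
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)  # projector orthogonal to the ray
        A += P
        b += P @ c
    return np.linalg.solve(A, b)
```

When the rays come from a network estimating the same anatomical landmark in multiple calibrated views, this recovers the 3D marker position; if the rays intersect exactly, the residual is zero.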
Methods: Neuromusculoskeletal modelling was used to estimate gluteus maximus, medius, and minimus muscle forces in 14 female footballers during 8 hip-focused exercises (single-leg squat, split squat, single-leg Romanian deadlift (RDL), single-leg hip thrust, banded side-step, hip hike, body weight side plank, and side-lying leg raise) performed with a 12 repetition maximum (12RM) resistance. Body weight side planks most effectively target both gluteus medius and minimus and require no equipment, making them ideal for on-field or at-home injury prevention/rehabilitation. Characterising muscle forces as well as fibre and activation properties may help inform exercise selection to optimize mechanical loading of muscles, specific to performance, injury prevention, or rehabilitation training goals.
We present a real-time approach for multi-person 3D motion capture at over 30 fps using a single RGB camera. It operates successfully in generic scenes which may contain occlusions by objects and by other people. Our method operates in subsequent stages. The first stage is a convolutional neural network (CNN) that estimates 2D and 3D pose features along with identity assignments for all visible joints of all individuals. We contribute a new architecture for this CNN, called SelecSLS Net, that uses novel selective long and short range skip connections to improve the information flow, allowing for a drastically faster network without compromising accuracy. In the second stage, a fully connected neural network turns the possibly partial (on account of occlusion) 2D pose and 3D pose features for each subject into a complete 3D pose estimate per individual. The third stage applies space-time skeletal model fitting to the predicted 2D and 3D pose per subject to further reconcile the 2D and 3D pose, and enforce temporal coherence. Our method returns the full skeletal pose in joint angles for each subject. This is a further key distinction from previous work that does not produce joint angle results of a coherent skeleton in real time for multi-person scenes. The proposed system runs on consumer hardware at a previously unseen speed of more than 30 fps given 512x320 images as input while achieving state-of-the-art accuracy, which we demonstrate on a range of challenging real-world scenes.
Despite the exponential growth in using inertial measurement units (IMUs) for biomechanical studies, future growth in “inertial motion capture” is stymied by a fundamental challenge: how to estimate the orientation of underlying bony anatomy using skin-mounted IMUs. This challenge is of paramount importance given the need to deduce the orientation of the bony anatomy to estimate joint angles. This paper systematically surveys a large number (N = 112) of studies from 2000 to 2018 that employ four broad categories of methods to address this challenge across a range of body segments and joints. We categorize these methods as: (1) Assumed Alignment methods, (2) Functional Alignment methods, (3) Model Based methods, and (4) Augmented Data methods. Assumed Alignment methods, which are simple and commonly used, require the researcher to visually align the IMU sense axes with the underlying anatomical axes. Functional Alignment methods, also commonly used, relax the need for visual alignment but require the subject to complete prescribed movements. Model Based methods further relax the need for prescribed movements but instead assume a model for the joint. Finally, Augmented Data methods shed all of the above assumptions, but require data from additional sensors. Significantly different estimates of the underlying anatomical axes arise both across and within these categories, and to a degree that renders it difficult, if not impossible, to compare results across studies. Consequently, a significant future need remains for creating and adopting a standard for defining anatomical axes via inertial motion capture to fully realize this technology’s potential for biomechanical studies.
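To illustrate the kind of computation these alignment methods feed into: once each segment's orientation is estimated (here assuming the IMU axes coincide with the anatomical axes, i.e., the Assumed Alignment approach), a joint angle can be taken from the relative rotation of adjacent segments. A minimal quaternion sketch (hypothetical helper names; scalar-first convention):

```python
import numpy as np

def quat_conj(q):
    """Conjugate of a scalar-first quaternion [w, x, y, z]."""
    w, x, y, z = q
    return np.array([w, -x, -y, -z])

def quat_mul(q1, q2):
    """Hamilton product of two scalar-first quaternions."""
    w1, x1, y1, z1 = q1
    w2, x2, y2, z2 = q2
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def joint_angle_deg(q_proximal, q_distal):
    """Total rotation angle between two segment orientations, in degrees.

    q_rel = conj(q_proximal) * q_distal is the distal segment's
    orientation expressed in the proximal segment's frame; its
    rotation angle is 2*acos(|w|).
    """
    q_rel = quat_mul(quat_conj(q_proximal), q_distal)
    w = np.clip(abs(q_rel[0]), 0.0, 1.0)
    return float(np.degrees(2.0 * np.arccos(w)))
```

The survey's point is precisely that the quality of `q_proximal` and `q_distal` relative to the bony anatomy, and hence of this angle, depends on which alignment category is used.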
Previous studies often inferred the focus of a bird's attention from its head movements, because these movements provide important clues about avian perception and cognition. However, it remains challenging to do so accurately, as the details of how birds orient their visual field toward visual targets remain largely unclear. We thus examined visual field configurations and the visual field use of large-billed crows (Corvus macrorhynchos Wagler 1827). We used an established ophthalmoscopic reflex technique to identify the visual field configuration, including the binocular width and optical axes, as well as the degree of eye movement. A newly established motion capture system was then used to track the head movements of freely moving crows to examine how they oriented their reconstructed visual fields toward attention-getting objects. When visual targets were moving, the crows frequently used their binocular visual fields, particularly around the projection of the beak-tip. When the visual targets stopped moving, crows frequently used non-binocular visual fields, particularly around the regions where their optical axes were found. On such occasions, the crows slightly preferred the right eye. Overall, the visual field use of crows is clearly predictable. Thus, while the untracked eye movements could introduce some level of uncertainty (typically within 15 deg), we demonstrated the feasibility of inferring a crow's attentional focus by 3D tracking of their heads. Our system represents a promising initial step towards establishing gaze tracking methods for studying corvid behavior and cognition.
The EuRoC micro aerial vehicle datasets. Burri, Michael; Nikolic, Janosch; Gohl, Pascal; et al. The International Journal of Robotics Research, 09/2016, Volume 35, Issue 10. Journal article; peer reviewed.
This paper presents visual-inertial datasets collected on-board a micro aerial vehicle. The datasets contain synchronized stereo images, IMU measurements and accurate ground truth. The first batch of datasets facilitates the design and evaluation of visual-inertial localization algorithms on real flight data. It was collected in an industrial environment and contains millimeter accurate position ground truth from a laser tracking system. The second batch of datasets is aimed at precise 3D environment reconstruction and was recorded in a room equipped with a motion capture system. The datasets contain 6D pose ground truth and a detailed 3D scan of the environment. Eleven datasets are provided in total, ranging from slow flights under good visual conditions to dynamic flights with motion blur and poor illumination, enabling researchers to thoroughly test and evaluate their algorithms. All datasets contain raw sensor measurements, spatio-temporally aligned sensor data and ground truth, extrinsic and intrinsic calibrations and datasets for custom calibrations.
Recent advances in data analytics and computer-aided diagnostics stimulate the vision of patient-centric precision healthcare, where treatment plans are customized based on the health records and needs of every patient. In physical rehabilitation, the progress in machine learning and the advent of affordable and reliable motion capture sensors have been conducive to the development of approaches for automated assessment of patient performance and progress toward functional recovery. The presented study reviews computational approaches for evaluating patient performance in rehabilitation programs using motion capture systems. Such approaches will play an important role in supplementing traditional rehabilitation assessment performed by trained clinicians, and in assisting patients participating in home-based rehabilitation. The reviewed computational methods for exercise evaluation are grouped into three main categories: discrete movement score, rule-based, and template-based approaches. The review places an emphasis on the application of machine learning methods for movement evaluation in rehabilitation. Related work in the literature on data representation, feature engineering, movement segmentation, and scoring functions is presented. The study also reviews existing sensors for capturing rehabilitation movements and provides an informative listing of pertinent benchmark datasets. The significance of this paper is in being the first to provide a comprehensive review of computational methods for evaluation of patient performance in rehabilitation programs.
The emergence of pose estimation algorithms represents a potential paradigm shift in the study and assessment of human movement. Human pose estimation algorithms leverage advances in computer vision to track human movement automatically from simple videos recorded using common household devices with relatively low-cost cameras (e.g., smartphones, tablets, laptop computers). In our view, these technologies offer clear and exciting potential to make measurement of human movement substantially more accessible; for example, a clinician could perform a quantitative motor assessment directly in a patient's home, a researcher without access to expensive motion capture equipment could analyze movement kinematics using a smartphone video, and a coach could evaluate player performance with video recordings directly from the field. In this review, we combine expertise and perspectives from physical therapy, speech-language pathology, movement science, and engineering to provide insight into applications of pose estimation in human health and performance. We focus specifically on applications in areas of human development, performance optimization, injury prevention, and motor assessment of persons with neurologic damage or disease. We review relevant literature, share interdisciplinary viewpoints on future applications of these technologies to improve human health and performance, and discuss perceived limitations.
We present a system for real-time hand-tracking to drive virtual and augmented reality (VR/AR) experiences. Using four fisheye monochrome cameras, our system generates accurate and low-jitter 3D hand motion across a large working volume for a diverse set of users. We achieve this by proposing neural network architectures for detecting hands and estimating hand keypoint locations. Our hand detection network robustly handles a variety of real world environments. The keypoint estimation network leverages tracking history to produce spatially and temporally consistent poses. We design scalable, semi-automated mechanisms to collect a large and diverse set of ground truth data using a combination of manual annotation and automated tracking. Additionally, we introduce a detection-by-tracking method that increases smoothness while reducing the computational cost; the optimized system runs at 60Hz on PC and 30Hz on a mobile processor. Together, these contributions yield a practical system for capturing a user's hands that is the default feature on the Oculus Quest VR headset, powering input and social presence.