This paper addresses the problem of 3D human pose estimation from single images. While for a long time human skeletons were parameterized and fitted to the observation by satisfying a reprojection ...error, nowadays researchers directly use neural networks to infer the 3D pose from the observations. However, most of these approaches ignore the fact that a reprojection constraint has to be satisfied and are sensitive to overfitting. We tackle the overfitting problem by ignoring 2D to 3D correspondences. This efficiently avoids a simple memorization of the training data and allows for a weakly supervised training. One part of the proposed reprojection network (RepNet) learns a mapping from a distribution of 2D poses to a distribution of 3D poses using an adversarial training approach. Another part of the network estimates the camera. This allows for the definition of a network layer that performs the reprojection of the estimated 3D pose back to 2D which results in a reprojection loss function. Our experiments show that RepNet generalizes well to unknown data and outperforms state-of-the-art methods when applied to unseen data. Moreover, our implementation runs in real-time on a standard desktop PC.
In industrial manufacturing processes, errors frequently occur at unpredictable times and in unknown manifestations. We tackle the problem of automatic defect detection without requiring any image ...samples of defective parts. Recent works model the distribution of defect-free image data, using either strong statistical priors or overly simplified data representations. In contrast, our approach handles fine-grained representations incorporating the global and local image context while flexibly estimating the density. To this end, we propose a novel fully convolutional cross-scale normalizing flow (CS-Flow) that jointly processes multiple feature maps of different scales. Using normalizing flows to assign meaningful likelihoods to input samples allows for efficient defect detection on image-level. Moreover, due to the preserved spatial arrangement the latent space of the normalizing flow is interpretable which enables to localize defective regions in the image. Our work sets a new state-of-the-art in image-level defect detection on the benchmark datasets Magnetic Tile Defects and MVTec AD showing a 100% AUROC on 4 out of 15 classes.
In this work, learning-based inverse dynamics algorithms are proposed for the analysis of human motion. Immeasurable joint torques and exterior contact forces are directly estimated from motions by ...machine learning techniques including deep neural networks, random forests and Ridge regression. A multistage subclass approach is introduced. The method recovers occluded motion data and generates meaningful features, as well as gait phase labels to restrict and facilitate the regression of forces and moments. In contrast to the state-of-the-art inverse dynamics optimization, the learning-based methods are independent of ground reaction force measurements and the global position and orientation of the human body. These properties make the application to reconstructed poses from videos or inertial measurements possible, creating fast and simple access to the underlying dynamics of recorded human motions. The performance of the proposed methods is evaluated on a self-recorded data set including walking and running motions and on a publicly available gait data set by Fukuchi et al. (PeerJ 6:e4640, 2018). Furthermore, the applicability to reconstructed gait sequences taken from the well-known CMU database (Human motion capture database, 2014.
http://mocap.cs.cmu.edu/
) is investigated. Finally, the method is tested as a tool to detect abnormal torque distributions in gait, based on a reconstructed 3D motion of a limping subject.
Dynamic scene graph generation aims at generating a scene graph of the given video. Compared to the task of scene graph generation from images, it is more challenging because of the dynamic ...relationships between objects and the temporal dependencies between frames allowing for a richer semantic interpretation. In this paper, we propose Spatial-temporal Transformer (STTran), a neural network that consists of two core modules: (1) a spatial encoder that takes an input frame to extract spatial context and reason about the visual relationships within a frame, and (2) a temporal decoder which takes the output of the spatial encoder as input in order to capture the temporal dependencies between frames and infer the dynamic relationships. Furthermore, STTran is flexible to take varying lengths of videos as input without clipping, which is especially important for long videos. Our method is validated on the benchmark dataset Action Genome (AG). The experimental results demonstrate the superior performance of our method in terms of dynamic scene graphs. Moreover, a set of ablative studies is conducted and the effect of each proposed module is justified. Code available at: https://github.com/yrcong/STTran.
In this paper, we propose a novel feature type, namely Motion Binary Pattern (MBP) and different computation strategies for the well-known Volume Local Binary Pattern (VLBP). MBPs are a combination ...of VLBPs and Optical Flow. By combining the benefit of both methods, a simple and efficient descriptor is constructed. Motion Binary Patterns are computed in the spatio-temporal domain while the motion in consecutive frames is described. Finally, a feature descriptor is constructed by a histogram computation. Volume Local Binary Patterns are a feature type to describe object characteristics in the spatio-temporal domain. But apart from the computation of such a pattern further steps are required to create a discriminative feature. These steps are evaluated in detail and the best strategy is shown. For MBPs and VLBPs, a Random Forest classifier is learned and applied to the task of human action recognition. The proposed novel feature type and VLBPs are evaluated on the well-known, publicly available KTH dataset, Weizman dataset and on the IXMAS dataset for multi-view action recognition. The results demonstrate challenging accuracies in comparison to state-of-the-art methods.
The literature has shown how to optimize and analyze the parameters of different types of neural networks using mixed integer linear programs (MILP). Building on these developments, this work ...presents an approach to do so for a McCulloch/Pitts and Rosenblatt neurons. As the original formulation involves a step-function, it is not differentiable, but it is possible to optimize the parameters of neurons, and their concatenation as a shallow neural network, by using a mixed integer linear program. The main contribution of this paper is to additionally enforce sparsity constraints on the weights and activations as well as on the amount of used neurons. Several experiments demonstrate that such constraints effectively prevent overfitting in neural networks, and ensure resource optimized models.
The literature has shown how to optimize and analyze the parameters of different types of neural networks using mixed integer linear programs (MILP). Building on these developments, this work ...presents an approach to do so for a McCulloch/Pitts and Rosenblatt neurons. As the original formulation involves a step-function, it is not differentiable, but it is possible to optimize the parameters of neurons, and their concatenation as a shallow neural network, by using a mixed integer linear program. The main contribution of this paper is to additionally enforce sparsity constraints on the weights and activations as well as on the amount of used neurons. Several experiments demonstrate that such constraints effectively prevent overfitting in neural networks, and ensure resource optimized models.
Markerless Human Pose Estimation (HPE) proved its potential to support decision making and assessment in many fields of application. HPE is often preferred to traditional marker-based Motion Capture ...systems due to the ease of setup, portability, and affordable cost of the technology. However, the exploitation of HPE in biomedical applications is still under investigation. This review aims to provide an overview of current biomedical applications of HPE. In this paper, we examine the main features of HPE approaches and discuss whether or not those features are of interest to biomedical applications. We also identify those areas where HPE is already in use and present peculiarities and trends followed by researchers and practitioners. We include here 25 approaches to HPE and more than 40 studies of HPE applied to motor development assessment, neuromuscolar rehabilitation, and gait & posture analysis. We conclude that markerless HPE offers great potential for extending diagnosis and rehabilitation outside hospitals and clinics, toward the paradigm of
remote medical care
.
Shotgun metagenome analysis provides a robust and verifiable method for comprehensive microbiome analysis of fungal, viral, archaeal and bacterial taxonomy, particularly with regard to visualization ...of read mapping location, normalization options, growth dynamics and functional gene repertoires. Current read classification tools use non-standard output formats, or do not fully show information on mapping location. As reference datasets are not perfect, portrayal of mapping information is critical for judging results effectively. Our alignment-based pipeline, Wochenende, incorporates flexible quality control, trimming, mapping, various filters and normalization. Results are completely transparent and filters can be adjusted by the user. We observe stringent filtering of mismatches and use of mapping quality sharply reduces the number of false positives. Further modules allow genomic visualization and the calculation of growth rates, as well as integration and subsequent plotting of pipeline results as heatmaps or heat trees. Our novel normalization approach additionally allows calculation of absolute abundance profiles by comparison with reads assigned to the human host genome. Wochenende has the ability to find and filter alignments to all kingdoms of life using both short and long reads, and requires only good quality reference genomes. Wochenende automatically combines multiple available modules ranging from quality control and normalization to taxonomic visualization. Wochenende is available at https://github.com/MHH-RCUG/nf_wochenende.