Human trajectory prediction is challenging and critical in various applications (e.g., autonomous vehicles and social robots). Because pedestrian movement is continuous and anticipatory, pedestrians moving in crowded spaces consider both spatial and temporal interactions to avoid future collisions. However, most existing methods ignore the temporal correlations of interactions with the other pedestrians in a scene. In this work, we propose a Spatial-Temporal Graph Attention network (STGAT), based on a sequence-to-sequence architecture, to predict the future trajectories of pedestrians. Besides the spatial interactions captured by a graph attention mechanism at each time-step, we adopt an extra LSTM to encode the temporal correlations of interactions. Through comparisons with state-of-the-art methods, our model achieves superior performance on two publicly available crowd datasets (ETH and UCY) and produces more "socially" plausible trajectories for pedestrians.
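The spatial side of this idea can be illustrated with a minimal sketch: at one time-step, each pedestrian attends to all others with softmax-normalized scores and aggregates their hidden states. This is a simplified numpy stand-in with random weights, not the authors' STGAT implementation; the projection matrices `W_q` and `W_k` and the dot-product scoring are assumptions for illustration.

```python
import numpy as np

def graph_attention_step(h, W_q, W_k):
    """One spatial graph-attention step over pedestrian hidden states.

    h: (N, D) hidden states of N pedestrians at a single time-step.
    Returns (N, D) interaction features: each pedestrian's feature is a
    softmax-weighted aggregation of all pedestrians' states.
    """
    q = h @ W_q                                   # queries (N, D)
    k = h @ W_k                                   # keys    (N, D)
    scores = q @ k.T / np.sqrt(h.shape[1])        # (N, N) pairwise scores
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    alpha = np.exp(scores)
    alpha /= alpha.sum(axis=1, keepdims=True)     # attention weights per row
    return alpha @ h                              # weighted aggregation

rng = np.random.default_rng(0)
h = rng.standard_normal((4, 8))                   # 4 pedestrians, 8-dim states
W_q = rng.standard_normal((8, 8))
W_k = rng.standard_normal((8, 8))
out = graph_attention_step(h, W_q, W_k)
print(out.shape)                                  # (4, 8)
```

In the paper's full model, the per-time-step outputs of such a spatial module would then be fed to an extra LSTM that encodes how the interactions evolve over time.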
Predicting plausible and collisionless trajectories is critical in various applications, such as robotic navigation and autonomous driving. This is a challenging task due to two major factors. First, it is difficult for deep neural networks to understand how pedestrians move to avoid collisions and how they react to each other. Second, given observed trajectories, there are multiple possible and plausible future trajectories a pedestrian may follow. Although an increasing number of previous works have focused on modeling social interactions and multimodality, the trajectories generated by these methods still lead to many collisions. In this work, we propose CoL-GAN, a new attention-based generative adversarial network using a convolutional neural network as the discriminator, which is able to generate trajectories with fewer collisions. Through experimental comparisons with prior works on publicly available datasets, we demonstrate that CoL-GAN achieves state-of-the-art performance in terms of accuracy and collision avoidance.
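The collision-avoidance claim implies a metric: how many pedestrian pairs come closer than some threshold in the predicted trajectories. A minimal way to count such pairs is sketched below; the 0.2 m collision radius and the toy trajectories are hypothetical, and this is not the paper's exact evaluation protocol.

```python
import numpy as np

def count_collisions(traj, radius=0.2):
    """Count colliding pedestrian pairs in predicted trajectories.

    traj: (N, T, 2) predicted positions of N pedestrians over T steps.
    A pair counts as a collision if the two pedestrians come within
    `radius` (hypothetical threshold) at any time-step.
    """
    N = traj.shape[0]
    collisions = 0
    for i in range(N):
        for j in range(i + 1, N):
            d = np.linalg.norm(traj[i] - traj[j], axis=-1)  # (T,) distances
            if (d < radius).any():
                collisions += 1
    return collisions

traj = np.zeros((3, 4, 2))                 # pedestrian 0 stays at the origin
traj[1] += 10.0                            # pedestrian 1 stays far away
traj[2, :, 0] = np.linspace(1.0, 0.1, 4)   # pedestrian 2 drifts toward pedestrian 0
print(count_collisions(traj))              # pair (0, 2) collides -> 1
```

A generative model judged on such a count is pushed toward trajectories that keep pairwise distances above the threshold, which is the behavior the abstract describes.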
Virtualized traffic via various simulation models and real-world traffic data is a promising approach to reconstructing detailed traffic flows. A variety of applications can benefit from virtual traffic, including, but not limited to, video games, virtual reality, traffic engineering, and autonomous driving. In this survey, we provide a comprehensive review of the state-of-the-art techniques for traffic simulation and animation. We start with a discussion of three classes of traffic simulation models applied at different levels of detail. Then, we introduce various data-driven animation techniques, including existing data collection methods, and the validation and evaluation of simulated traffic flows. Next, we discuss how traffic simulations can benefit the training and testing of autonomous vehicles. Finally, we discuss the current state of traffic simulation and animation and suggest future research directions.
Most existing traffic simulation methods have focused on simulating vehicles on freeways or city-scale urban networks. However, relatively little research has been done to date on simulating intersectional traffic, despite its broad potential applications. In this paper, we propose a novel deep learning-based framework to simulate and edit intersectional traffic. Specifically, based on an in-house collected intersectional traffic dataset, we employ a combination of a convolutional neural network (CNN) and a recurrent neural network (RNN) to learn the patterns of vehicle trajectories in intersectional traffic. Besides simulating novel intersectional traffic, our method can be used to edit existing intersectional traffic. Through many experiments as well as comparative user studies, we demonstrate that the results produced by our method are visually indistinguishable from ground truth, and that our method outperforms existing methods.
Realistic 3D facial modeling and animation have been increasingly used in many graphics, animation, and virtual reality applications. However, generating realistic fine-scale wrinkles on 3D faces, in particular on animated 3D faces, is still a challenging problem that is far from resolved. In this article, we propose an end-to-end system to automatically augment coarse-scale 3D faces with synthesized fine-scale geometric wrinkles. By formulating wrinkle generation as a supervised generation task, we implicitly model the continuous space of face wrinkles via a compact generative model, such that plausible face wrinkles can be generated through effective sampling and interpolation in that space. We also introduce a complete pipeline to transfer the synthesized wrinkles between faces with different shapes and topologies. Through many experiments, we demonstrate that our method can robustly synthesize plausible fine-scale wrinkles on a variety of coarse-scale 3D faces with different shapes and expressions.
Trajectory prediction for objects is challenging and critical for various applications (e.g., autonomous driving and anomaly detection). Most existing methods focus on homogeneous pedestrian trajectory prediction, where pedestrians are treated as particles without size. However, they fall short of handling crowded vehicle-pedestrian-mixed scenes directly, since vehicles, which are subject to kinematic constraints in reality, should ideally be treated as rigid, non-particle objects. In this paper, we tackle this problem using separate LSTMs for heterogeneous vehicles and pedestrians. Specifically, we use an oriented bounding box, calculated from each vehicle's position and orientation, to represent the vehicle and denote its kinematic trajectory. We then propose a framework called VP-LSTM to predict the kinematic trajectories of both vehicles and pedestrians simultaneously. In order to evaluate our model, we specially built a large dataset containing the trajectories of both vehicles and pedestrians in vehicle-pedestrian-mixed scenes. Through comparisons between our method and state-of-the-art approaches, we show the effectiveness and advantages of our method for kinematic trajectory prediction in vehicle-pedestrian-mixed scenes.
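The oriented-bounding-box representation mentioned above can be sketched concretely: given a vehicle's center position, dimensions, and heading, the four box corners follow from a 2D rotation. This is an illustrative geometry helper, not the paper's code; the specific length and width values are made up for the example.

```python
import numpy as np

def oriented_bbox(cx, cy, length, width, theta):
    """Corners of a vehicle's oriented bounding box.

    (cx, cy) is the center position, theta the heading in radians.
    Replaces the point-particle representation used for pedestrians.
    """
    # Corner offsets along the vehicle's local (forward, lateral) axes
    local = np.array([[ length / 2,  width / 2],
                      [ length / 2, -width / 2],
                      [-length / 2, -width / 2],
                      [-length / 2,  width / 2]])
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s],
                  [s,  c]])          # rotation by the heading angle
    return local @ R.T + np.array([cx, cy])

# A 4 m x 2 m vehicle at the origin, heading along +x
corners = oriented_bbox(0.0, 0.0, 4.0, 2.0, 0.0)
print(corners)
```

Predicting the box's position and orientation over time, rather than a single point, is what lets a model respect the rigid, non-particle nature of vehicles.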
Synthesizing indoor scene layouts is challenging and critical, especially for digital design and gaming entertainment. Although there has been significant research on indoor layout synthesis for rectangular-shaped or L-shaped architecture, little is known about synthesizing plausible layouts for more complicated indoor architecture with both its geometric and semantic information fully considered. In this paper, we propose an effective and novel framework to synthesize plausible indoor layouts for varied and complicated architecture. The given indoor architecture is first encoded into our proposed representation, called InAiR, based on its geometric and semantic information. The indoor objects are grouped and then arranged by functional blocks, represented by oriented bounding boxes, using dynamic convolution networks based on their functionality and human activities. Through comparisons with other approaches as well as comparative user studies, we find that our generated indoor scene layouts for diverse, complicated indoor architecture are visually indistinguishable from real ones, reaching state-of-the-art performance.
Trajectory prediction with uncertainty is a critical and challenging task for autonomous driving. Nowadays, we can easily access sensor data represented in multiple views. However, cross-view consistency has not been evaluated by existing models, which can lead to divergences between the multimodal predictions from different views. Such inconsistency indicates that the network does not fully comprehend the 3D scene, which can leave downstream modules in a dilemma. Instead, we predict multimodal trajectories while maintaining cross-view consistency. We present a cross-view trajectory prediction method using shared 3D queries (XVTP3D). We employ a set of 3D queries shared across views to generate multi-goals that are cross-view consistent. We also propose a random mask method and coarse-to-fine cross-attention to capture robust cross-view features. To the best of our knowledge, this is the first work that introduces the top-down paradigm from the BEV detection field to the trajectory prediction problem. The results of experiments on two publicly available datasets show that XVTP3D achieves state-of-the-art performance with consistent cross-view predictions.
In this paper, we propose a new data-driven model to simulate the process of lane-changing in traffic simulation. Specifically, we first extract the features from surrounding vehicles that are relevant to the lane-changing of the subject vehicle. Then, we learn the lane-changing characteristics from ground-truth vehicle trajectory data using randomized forest and back-propagation neural network algorithms. Our method enables the subject vehicle to take into account more gap options on the target lane to cut in, as well as to achieve more realistic lane-changing trajectories for the subject vehicle and the follower vehicle. Through many experiments and comparisons with selected state-of-the-art methods, we demonstrate that our approach outperforms them in terms of the accuracy and quality of lane-changing simulation. Our model can be flexibly used together with a variety of existing car-following models to produce natural traffic animations in various virtual environments.
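One concrete example of the gap features such a lane-changing model would consume: the longitudinal distances to the nearest vehicle ahead of and behind the subject on the target lane. The sketch below is a hypothetical feature extractor using 1D longitudinal positions, not the paper's feature set.

```python
def lane_gaps(subject_x, target_lane_xs):
    """Front and rear gaps between the subject vehicle and its nearest
    neighbors on the target lane (hypothetical feature pair).

    subject_x: longitudinal position of the subject vehicle.
    target_lane_xs: longitudinal positions of vehicles on the target lane.
    Returns infinity for a gap when the lane is empty on that side.
    """
    ahead = [x for x in target_lane_xs if x > subject_x]
    behind = [x for x in target_lane_xs if x <= subject_x]
    front_gap = min(ahead) - subject_x if ahead else float("inf")
    rear_gap = subject_x - max(behind) if behind else float("inf")
    return front_gap, rear_gap

# Subject at x = 50 m; target lane has vehicles at 20 m, 65 m, 90 m
print(lane_gaps(50.0, [20.0, 65.0, 90.0]))  # (15.0, 30.0)
```

Feeding a learned model several such gap options, rather than only the adjacent one, is what allows the subject vehicle to choose among multiple places to cut in.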
Accurate vehicle trajectory prediction can benefit a variety of intelligent transportation system applications ranging from traffic simulations to driver assistance. The need for this ability is pronounced with the emergence of autonomous vehicles, as they require the prediction of nearby vehicles' trajectories to navigate safely and efficiently. Recent studies based on deep learning have greatly improved prediction accuracy. However, one prominent issue of these models is the lack of model explainability. We alleviate this issue by proposing spatiotemporal attention long short-term memory (STA-LSTM), an LSTM model with spatial-temporal attention mechanisms for explainability in vehicle trajectory prediction. STA-LSTM not only achieves comparable prediction performance against other state-of-the-art models but, more importantly, explains the influence of historical trajectories and neighboring vehicles on the target vehicle. We provide in-depth analyses of the learned spatial-temporal attention weights in various highway scenarios based on different vehicle and environment factors, including target vehicle class, target vehicle location, and traffic density. A demonstration illustrating that STA-LSTM can capture and explain fine-grained lane-changing behaviors is also provided. The data and implementation of STA-LSTM can be found at https://github.com/leilin-research/VTP.
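The explainability argument rests on the attention weights themselves being inspectable. A minimal temporal-attention sketch makes this concrete: the per-time-step weights sum to one and can be read directly as the influence of each historical step. This numpy toy with a random scoring vector `w` is an illustration of the mechanism, not the STA-LSTM implementation.

```python
import numpy as np

def temporal_attention(H, w):
    """Temporal attention over T historical hidden states.

    H: (T, D) hidden states of the encoder over T time-steps.
    w: (D,) scoring vector (a stand-in for learned parameters).
    Returns the context vector and the per-step attention weights;
    the weights are what make the influence of history inspectable.
    """
    scores = H @ w                     # (T,) one score per time-step
    scores -= scores.max()             # numerical stability
    alpha = np.exp(scores)
    alpha /= alpha.sum()               # weights form a distribution over steps
    context = alpha @ H                # (D,) attention-weighted summary
    return context, alpha

rng = np.random.default_rng(1)
H = rng.standard_normal((5, 8))        # 5 historical steps, 8-dim states
w = rng.standard_normal(8)
context, alpha = temporal_attention(H, w)
print(alpha.sum())                     # weights sum to 1.0
```

In the full model, an analogous spatial attention distributes weight over neighboring vehicles, and analyzing both weight sets is what yields the scenario-level explanations the abstract describes.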