This paper describes a method of single-shot global localization based on graph-theoretic matching of instances between a query and a prior map. The proposed framework employs correspondence matching based on the maximum clique problem (MCP). Thanks to the graph-based abstraction of the problem, the framework is potentially applicable to other map and/or query modalities, whereas many existing global localization methods require the query and the dataset to share the same modality. We implement it with a semantically labeled 3D point cloud map and a semantic segmentation image as a query. Leveraging the graph-theoretic framework, the proposed method realizes global localization exploiting only the map and the query. The method shows promising results on multiple large-scale simulated maps of urban scenes.
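The correspondence-matching idea can be illustrated with a minimal sketch. It is not the paper's implementation: the 2D instance positions, the distance-preservation test, the tolerance `tol`, and all function names are assumptions chosen for illustration. Candidate matches pair instances with the same label, two matches are linked when they preserve the pairwise distance, and the largest mutually consistent set is found with a Bron-Kerbosch search.

```python
from itertools import combinations
from math import dist

def candidate_matches(map_inst, query_inst):
    """Pair every map instance with every query instance of the same label."""
    return [(i, j)
            for i, (label_m, _) in enumerate(map_inst)
            for j, (label_q, _) in enumerate(query_inst)
            if label_m == label_q]

def consistency_graph(matches, map_inst, query_inst, tol=0.1):
    """Link two matches if they use distinct instances and preserve distance."""
    adj = {m: set() for m in matches}
    for a, b in combinations(matches, 2):
        (i1, j1), (i2, j2) = a, b
        if i1 == i2 or j1 == j2:
            continue  # one instance cannot be matched twice
        d_map = dist(map_inst[i1][1], map_inst[i2][1])
        d_query = dist(query_inst[j1][1], query_inst[j2][1])
        if abs(d_map - d_query) <= tol:
            adj[a].add(b)
            adj[b].add(a)
    return adj

def max_clique(adj):
    """Bron-Kerbosch with pivoting; returns one maximum clique as a list."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = list(r)
            return
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return best

# Hypothetical data: the query is the map shifted by (1, 1), minus one pole.
map_inst = [("tree", (0, 0)), ("pole", (4, 0)), ("tree", (0, 3)), ("pole", (10, 10))]
query_inst = [("tree", (1, 1)), ("pole", (5, 1)), ("tree", (1, 4))]

matches = candidate_matches(map_inst, query_inst)
graph = consistency_graph(matches, map_inst, query_inst)
inliers = sorted(max_clique(graph))
```

Because the search is exponential in the worst case, a sketch like this is only practical for small candidate sets.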
Multi-Person Tracking (MPT) is often addressed within the detection-to-association paradigm. In such approaches, human detections are first extracted in every frame and person trajectories are then recovered by a procedure of data association (usually offline). However, their performance usually degrades in the presence of detection errors, mutual interactions and occlusions. In this paper, we present a deep learning based MPT approach that learns instance-aware representations of tracked persons and robustly infers the states of the tracked persons online. Specifically, we design a multi-branch neural network (MBN), which predicts the classification confidences and locations of all targets by taking a batch of candidate regions as input. In our MBN architecture, each branch (instance-subnet) corresponds to an individual to be tracked and new branches can be dynamically created for handling newly appearing persons. Then, based on the output of MBN, we construct a joint association matrix that represents meaningful states of tracked persons (e.g., being tracked or disappearing from the scene) and solve it by using the efficient Hungarian algorithm. Moreover, we allow the instance-subnets to be updated during tracking by online mining of hard examples, accounting for person appearance variations over time. We comprehensively evaluate our framework on a popular MPT benchmark, demonstrating its excellent performance in comparison with recent online MPT methods.
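A small sketch may help make the joint association matrix concrete. The layout below is an assumption, not the paper's exact construction: each track gets one column per detection plus its own "disappeared" column with a hypothetical fixed cost `miss_cost`. To stay dependency-free, the optimum is found by exhaustive search, which returns the same minimum-cost assignment a Hungarian solver (for example SciPy's `linear_sum_assignment`) would, but only scales to a handful of tracks.

```python
from itertools import permutations

def associate(scores, miss_cost=0.5):
    """Assign each track to a detection index, or None if it disappeared.

    scores[t][d] is an assumed confidence in [0, 1] that detection d
    belongs to track t. Exhaustive search stands in for the Hungarian
    algorithm here.
    """
    n_tracks = len(scores)
    n_dets = len(scores[0]) if scores else 0
    inf = float("inf")

    # Cost matrix: detection columns first, then one private
    # "disappeared" column per track.
    cost = []
    for t in range(n_tracks):
        row = [1.0 - s for s in scores[t]]
        row += [miss_cost if k == t else inf for k in range(n_tracks)]
        cost.append(row)

    best_total, best_cols = inf, None
    for cols in permutations(range(n_dets + n_tracks), n_tracks):
        total = sum(cost[t][c] for t, c in enumerate(cols))
        if total < best_total:
            best_total, best_cols = total, cols

    return {t: (c if c < n_dets else None) for t, c in enumerate(best_cols)}

# Track 0 matches detection 0; track 1 has no convincing detection.
result = associate([[0.9, 0.2], [0.1, 0.3]])
```

With these hypothetical scores, leaving track 1 unmatched costs less than forcing it onto the weak remaining detection.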
In multi-object tracking, it is critical to explore the data associations by exploiting the temporal information from a sequence of frames rather than only the information from two adjacent frames. Since straightforwardly obtaining data associations from multiple frames is an NP-hard multi-dimensional assignment (MDA) problem, most existing methods solve this MDA problem by either developing complicated approximate algorithms, or simplifying MDA as a 2D assignment problem based upon the information extracted only from adjacent frames. In this paper, we show that the relation between associations of two observations is an equivalence relation in the data association problem, based on the spatial-temporal constraint that the trajectories of different objects must be disjoint. Therefore, the MDA problem can be equivalently divided into independent subproblems by equivalence partitioning. In contrast to existing works for solving the MDA problem, we develop a connected component model (CCM) by exploiting the constraints of the data association and the equivalence relation on the constraints. Based upon CCM, we can efficiently obtain the global solution of the MDA problem for multi-object tracking by optimizing a sequence of independent data association subproblems. Experiments on challenging public data sets demonstrate that our algorithm outperforms the state-of-the-art approaches.
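The partitioning step can be sketched with a generic union-find pass. This is an illustration of splitting a problem into connected components, not the paper's CCM: the rule that two candidate associations are coupled when they share an observation, and the observation identifiers, are assumptions made for the example.

```python
def partition_associations(associations):
    """Group candidate associations into independent subproblems.

    associations is a list of (obs_a, obs_b) candidate links. Two links
    fall into the same group if they are connected through shared
    observations. Returns sorted lists of association indices.
    """
    parent = list(range(len(associations)))

    def find(i):
        # Path-halving find.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    owner = {}  # observation -> index of the first association using it
    for idx, link in enumerate(associations):
        for obs in link:
            if obs in owner:
                union(owner[obs], idx)
            else:
                owner[obs] = idx

    groups = {}
    for idx in range(len(associations)):
        groups.setdefault(find(idx), []).append(idx)
    return sorted(groups.values())

# Hypothetical observations named frame_object.
links = [("f1_a", "f2_a"), ("f1_a", "f2_b"), ("f1_c", "f2_c"),
         ("f2_c", "f3_c"), ("f2_b", "f3_b")]
subproblems = partition_associations(links)
```

Each returned group can then be optimized on its own, since no observation is shared across groups.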
In multi-target tracking, object interactions and occlusions are two significant factors that affect tracking performance. To address them, we propose an identity association network (IANet) that integrates the geometry refinement network (GRNet) and the identity verification (IV) module to perform data association and reason about the mapping between the detections and tracklets. In our data association process, the object drifts caused by object interactions are suppressed effectively by encoding the direction and velocity of objects to refine the geometric position of tracklets. The tracklets with refined geometric information are further utilized in the IV module to achieve a sufficient encoding of multivariate spatial cues, including both appearance and geometry information, which substantially mitigates the misleading effects of interactions and occlusions in multi-object tracking. Extensive experiments and comparative evaluations demonstrate that the proposed method significantly outperforms many state-of-the-art methods on the 2D MOT2015, MOT16, MOT17, MOT20, and KITTI benchmarks using public detections and online settings.
Information-Based Compact Pose SLAM. Ila, V.; Porta, J.M.; Andrade-Cetto, J. IEEE Transactions on Robotics, 02/2010, Volume 26, Issue 1. Journal article, peer reviewed, open access.
Pose SLAM is the variant of simultaneous localization and map building (SLAM) in which only the robot trajectory is estimated and where landmarks are only used to produce relative constraints between robot poses. To reduce the computational cost of the information filter form of Pose SLAM and, at the same time, to delay inconsistency as much as possible, we introduce an approach that takes into account only highly informative loop-closure links and nonredundant poses. This approach includes constant time procedures to compute the distance between poses, the expected information gain for each potential link, and the exact marginal covariances while moving in open loop, as well as a procedure to recover the state after a loop closure that, in practical situations, scales linearly in terms of both time and memory. Using these procedures, the robot operates most of the time in open loop, and the cost of the loop closure is amortized over long trajectories. This way, the computational bottleneck shifts to data association, which is the search over the set of previously visited poses to determine good candidates for sensor registration. To speed up data association, we introduce a method to search for neighboring poses whose complexity ranges from logarithmic in the usual case to linear in degenerate situations. The method is based on organizing the pose information in a balanced tree whose internal levels are defined using interval arithmetic. The proposed Pose-SLAM approach is validated through simulations, real mapping sessions, and experiments using standard SLAM data sets.
In future 6G integrated sensing and communication (ISAC) cellular systems, networked sensing is a promising technique that can leverage the cooperation among the base stations (BSs) to perform high-resolution localization. However, a dense deployment of BSs to fully reap the networked sensing gain is not a cost-efficient solution in practice. Motivated by the advance in the intelligent reflecting surface (IRS) technology for 6G communication, this paper examines the feasibility of deploying low-cost IRSs to enhance the anchor density for networked sensing. Specifically, we propose a novel heterogeneous networked sensing architecture, which consists of both the active anchors, i.e., the BSs, and the passive anchors, i.e., the IRSs. Under this framework, the BSs emit orthogonal frequency division multiplexing (OFDM) communication signals in the downlink for localizing the targets based on their echoes, reflected either via or not via the IRSs. However, there are two challenges in using passive anchors for localization. First, it is impossible to utilize the round-trip signal between a passive IRS and a passive target for estimating their distance. Second, before localizing a target, we do not know which IRS is closest to it and serves as its anchor. In this paper, we show that the distance between a target and its associated IRS can be indirectly estimated based on the lengths of the BS-target-BS path and the BS-target-IRS-BS path. Moreover, we propose an efficient data association method to match each target to its associated IRS. Numerical results are given to validate the feasibility and effectiveness of our proposed heterogeneous networked sensing architecture with both active and passive anchors.
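The indirect distance estimate can be worked through under a simplifying assumption that is not stated in the abstract: the same BS transmits and receives, and the BS and IRS positions are known. Then the BS-target-BS path has length 2·d(BS, target), the BS-target-IRS-BS path has length d(BS, target) + d(target, IRS) + d(IRS, BS), and the target-IRS distance follows by subtraction. The function and variable names below are hypothetical.

```python
from math import dist

def target_irs_distance(len_direct, len_via_irs, bs_pos, irs_pos):
    """Recover d(target, IRS) from two measured path lengths.

    Assumes a single BS that both transmits and receives:
      len_direct  = 2 * d(BS, target)
      len_via_irs = d(BS, target) + d(target, IRS) + d(IRS, BS)
    """
    d_bs_target = len_direct / 2.0
    d_irs_bs = dist(irs_pos, bs_pos)  # known from the deployment geometry
    return len_via_irs - d_bs_target - d_irs_bs

# Hypothetical geometry in metres.
bs, irs, target = (0.0, 0.0), (6.0, 8.0), (6.0, 0.0)
len_direct = 2 * dist(bs, target)
len_via_irs = dist(bs, target) + dist(target, irs) + dist(irs, bs)
estimate = target_irs_distance(len_direct, len_via_irs, bs, irs)
```

In practice the path lengths would come from delay estimates on the OFDM echoes and would carry noise, which this sketch ignores.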
Long-term tracking is a commonly overlooked yet practical scenario in multi-object tracking. Handling occlusion and re-identifying long-lost targets are the main challenges for effective long-term tracking. In occlusion scenarios, both appearance and motion features can be unreliable, leading to association failure. For long-lost targets, predicting their long-term motion suffers from severe error accumulation, making the target re-identification challenging. In this paper, we propose a multi-object tracker called LTTrack for long-term tracking. For occlusion handling, we develop the Position-Based Association (PBA) module, which encodes relative and absolute positions as interaction and motion features for association. With interaction features, PBA can handle occlusion scenes where appearance and motion features are unreliable. For long-lost target re-identification, the Long-Term Motion (LTM) model is devised. By encoding long-term motion trends of targets for long-term motion prediction, LTM alleviates the error accumulation problem. Moreover, to prevent the erroneous deletion of long-lost tracks, we propose the Zombie Track Re-Match (ZTRM) strategy to re-identify long-lost targets so that they will neither be prematurely deleted nor disrupt the association of other tracks. Extensive experiments conducted on MOT17, MOT20, and DanceTrack demonstrate that LTTrack achieves performance comparable to state-of-the-art methods. The code and models are available at https://github.com/Lin-Jiaping/LTTrack.
Tracking multiple targets with unknown measurement-to-target association and uncertain target dynamics is a significant problem that arises in various applications such as surveillance monitoring and intelligent transportation systems. In this paper, we propose an enhanced multi-model multi-scan data association algorithm to address the problem of tracking multiple maneuvering targets. First, we use a probabilistic graphical model to represent the joint distribution of the dynamic model indices, target state, and multi-scan data association variables. This formulation transforms the inference of marginal distributions into a Bethe free energy (BFE) problem. Next, to transform the BFE problem into a convex one, we demonstrate that the BFE function can be made convex through re-weighting. Additionally, we decompose the re-weighted BFE function into a block-wise sum form. We prove that under certain regularization conditions, each block of the re-weighted BFE is convex, ensuring convergence of the primal–dual coordinate ascent algorithm to the minimum of the overall re-weighted BFE. Finally, we provide a particle implementation of the proposed algorithm, accompanied by an analysis of its complexity. Simulation results indicate that the proposed algorithm exhibits favorable performance when compared to both the single-model multi-scan algorithm and the multi-model single-scan algorithm.
• Transform the considered problem into a Bethe free energy formulation.
• Prove the re-weighted Bethe free energy problem is convex.
• Solve the problem using the primal–dual coordinate ascent algorithm.
• Provide a particle implementation with computational complexity analysis.
With recent advances in object detection, the tracking-by-detection method has become mainstream for multi-object tracking in computer vision. The tracking-by-detection scheme must resolve the data association problem between existing tracks and newly received detections at each frame. In this paper, we propose a new deep neural network (DNN) architecture that can solve the data association problem with a variable number of both tracks and detections, including false positives. The proposed network consists of two parts: an encoder and a decoder. The encoder is a fully connected network with several layers that takes the bounding boxes of both detections and track histories as inputs. The outputs of the encoder are sequentially fed into the decoder, which is composed of bi-directional Long Short-Term Memory (LSTM) networks with a projection layer. The final output of the proposed network is an association matrix that reflects matching scores between tracks and detections. To train the network, we generate training samples using the annotations of the Stanford Drone Dataset (SDD). The experimental results show that the proposed network achieves high recall and precision as a binary classifier for the assignment task. We apply our network to track multiple objects on real-world datasets and evaluate the tracking performance. Our tracker outperforms previous DNN-based works and is comparable to other state-of-the-art methods.