•A spatiotemporal matrix completion model for network-wide traffic flow estimation.
•The proposed model is formulated as a quadratic program and solved by ADMM.
•A spatial smoothing index based on ...the divergence is developed to measure the difficulty of estimation.
•Both real-world and synthetic datasets are used to evaluate algorithm performance and gain insights.
With the rapid development of urbanization and modernization, it is increasingly crucial to sense network-wide traffic. Network-wide traffic volume information is of great benefit for traffic planning, government management and vehicle emissions control. However, it is difficult to install detectors at every intersection because of the high deployment and maintenance costs, and the insufficient sensor coverage across the network limits the direct availability of network-wide traffic flow information. Meanwhile, crowdsourced floating car data with a high coverage rate are now available, which creates an opportunity to address this problem. In this paper, we propose a novel methodology to estimate network-wide traffic flow, which incorporates flow records and crowdsourced floating car data into a geometric matrix completion model. Furthermore, a spatial smoothing index based on the divergence is developed to measure the difficulty of volume estimation for each road segment. We conduct extensive experiments on both real-world and synthetic datasets. The results demonstrate that our approach consistently outperforms other benchmark models and that the proposed index is highly correlated with estimation accuracy.
In this paper, we introduce a powerful technique based on Leave-One-Out analysis to the study of low-rank matrix completion problems. Using this technique, we develop a general approach for obtaining ...fine-grained, entrywise bounds for iterative stochastic procedures in the presence of probabilistic dependency. We demonstrate the power of this approach in analyzing two of the most important algorithms for matrix completion: (i) the non-convex approach based on Projected Gradient Descent (PGD) for a rank-constrained formulation, also known as the Singular Value Projection algorithm, and (ii) the convex relaxation approach based on nuclear norm minimization (NNM). Using this approach, we establish the first convergence guarantee for the original form of PGD without regularization or sample splitting, and in particular show that it converges linearly in the infinity norm. For NNM, we use this approach to study a fictitious iterative procedure that arises in the dual analysis. Our results show that NNM recovers a $d$-by-$d$ rank-$r$ matrix with $\mathcal{O}(\mu r \log(\mu r) d \log d)$ observed entries. This bound has optimal dependence on the matrix dimension and is independent of the condition number. To the best of our knowledge, none of the previous sample complexity results for tractable matrix completion algorithms satisfies these two properties simultaneously.
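The Singular Value Projection iteration named in this abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code: the step size 1/p, the iteration count, the problem sizes, and the helper names `project_rank` and `svp_complete` are all assumptions chosen for illustration.

```python
import numpy as np

def project_rank(X, r):
    """Best rank-r approximation of X via a truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def svp_complete(M_obs, mask, r, p, n_iters=200):
    """Projected gradient descent on 0.5 * ||mask * (X - M_obs)||_F^2.

    Each iteration takes a gradient step of size 1/p (an assumed choice)
    and projects the result back onto the set of rank-r matrices.
    """
    X = np.zeros_like(M_obs)
    for _ in range(n_iters):
        residual = mask * (X - M_obs)      # gradient on observed entries
        X = project_rank(X - residual / p, r)
    return X

# Hypothetical test problem: a random rank-2 matrix with half its entries observed.
rng = np.random.default_rng(0)
d, r, p = 60, 2, 0.5
M = rng.standard_normal((d, r)) @ rng.standard_normal((r, d))
mask = (rng.random((d, d)) < p).astype(float)
X_hat = svp_complete(mask * M, mask, r, p)
rel_err = np.linalg.norm(X_hat - M) / np.linalg.norm(M)
print(f"relative Frobenius error: {rel_err:.2e}")
```

The sketch uses no regularization or sample splitting, which matches the "original form" of PGD that the abstract refers to; the sampling rate and sizes here are generous and say nothing about the sample complexity bounds in the paper.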
The matrix completion problem consists of finding or approximating a low-rank matrix based on a few samples of this matrix. We propose a new algorithm for matrix completion that minimizes the ...least-square distance on the sampling set over the Riemannian manifold of fixed-rank matrices. The algorithm is an adaptation of classical nonlinear conjugate gradients, developed within the framework of retraction-based optimization on manifolds. We describe all the objects from differential geometry necessary to perform optimization over this low-rank matrix manifold, seen as a submanifold embedded in the space of matrices. In particular, we describe how metric projection can be used as retraction and how vector transport lets us obtain the conjugate search directions. Finally, we prove convergence of a regularized version of our algorithm under the assumption that the restricted isometry property holds for incoherent matrices throughout the iterations. The numerical experiments indicate that our approach scales very well for large-scale problems and compares favorably with the state-of-the-art, while outperforming most existing solvers.
In this paper, we consider contractive real symmetric matrix completion problems, motivated in part by studies on sparse (or dense) matrices for weighted sparse recovery problems and rating ...matrices with rating density in recommender systems. Using graphs, we completely characterize the symmetric patterns P with the property (C) that every partially contractive real symmetric matrix with pattern P has a contractive real symmetric completion.
•This paper proposes a novel mobile crowdsensing framework for modal identification.
•It can be used for drive-by-based dense spatial-resolution mode shape identification.
•It converts mode shape ...identification into a physical-informed optimization problem.
•Numerical and experimental validations are conducted to verify the approach.
•Road roughness and measurement noise effects are considered.
Moving vehicles equipped with various types of sensors can efficiently monitor the health conditions of a population of transportation infrastructure such as bridges. This paper presents a mobile crowdsensing framework to identify dense spatial-resolution bridge mode shapes using sparse drive-by measurements. The proposed method converts mode shape identification into a physical-informed optimization problem with two objective function terms. The first objective minimises the mode shape identification error based on the fact that the ratio of a specific order mode shape value at any two locations is time-invariant. Since the bridge mode shape should be globally smooth even when the local stiffness is discontinuous, the smoothness of the identified mode shape is introduced as the second objective. The feasibility and advantages of the proposed model are verified numerically and through large-scale experimental studies. Numerical results demonstrate that the proposed method can efficiently identify bridge mode shapes with a desirable accuracy. The adverse effects of road roughness and measurement noise on the mode shape identification accuracy are substantially suppressed by introducing crowdsensing and making use of collected responses over multiple trips. The applicability of the proposed method for bridges having varying cross sections and multiple spans is also studied. A series of drive-by tests with different vehicle masses and speeds are conducted on a large-scale footbridge. The experimental results verify that the proposed method can accurately identify the bridge mode shapes and is robust to vehicle mass and speed variation. The identification accuracy of large-scale bridge mode shapes using crowdsensing drive-by measurements is demonstrated in this study.
1-Bit matrix completion
Davenport, Mark A; Plan, Yaniv; van den Berg, Ewout ...
Information and inference, 09/2014, Volume 3, Issue 3
Journal Article, Peer reviewed
In this paper, we develop a theory of matrix completion for the extreme case of noisy 1-bit observations. Instead of observing a subset of the real-valued entries of a matrix M, we obtain a small ...number of binary (1-bit) measurements generated according to a probability distribution determined by the real-valued entries of M. The central question we ask is whether or not it is possible to obtain an accurate estimate of M from this data. In general, this would seem impossible, but we show that the maximum likelihood estimate under a suitable constraint returns an accurate estimate of M when ∥M∥∞ ≤ α and rank(M) ≤ r. If the log-likelihood is a concave function (e.g. the logistic or probit observation models), then we can obtain this maximum likelihood estimate by optimizing a convex program. In addition, we also show that if instead of recovering M we simply wish to obtain an estimate of the distribution generating the 1-bit measurements, then we can eliminate the requirement that ∥M∥∞ ≤ α. For both cases, we provide lower bounds showing that these estimates are near-optimal. We conclude with a suite of experiments that both verify the implications of our theorems and illustrate some of the practical applications of 1-bit matrix completion. In particular, we compare our program to standard matrix completion methods on movie rating data in which users submit ratings from 1 to 5. In order to use our program, we quantize this data to a single bit, but we allow the standard matrix completion program to have access to the original ratings (from 1 to 5). Surprisingly, the approach based on binary data performs significantly better.
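The logistic observation model mentioned in this abstract can be made concrete with a short sketch. It covers only the measurement model and the negative log-likelihood with its gradient, not the constrained maximum likelihood estimator itself; the function names and the convention that 0 marks an unobserved entry are assumptions made for illustration.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_one_bit(M, mask, rng):
    """Draw Y_ij = +1 with probability logistic(M_ij), otherwise -1.

    Entries where mask is 0 are unobserved and are stored as 0.
    """
    signs = np.where(rng.random(M.shape) < logistic(M), 1.0, -1.0)
    return signs * mask

def neg_log_likelihood(X, Y, mask):
    """Sum over observed entries of log(1 + exp(-Y_ij * X_ij))."""
    return np.sum(mask * np.logaddexp(0.0, -Y * X))

def nll_gradient(X, Y, mask):
    """Entrywise gradient of neg_log_likelihood with respect to X."""
    return mask * (-Y * logistic(-Y * X))

# Hypothetical usage on a small random matrix.
rng = np.random.default_rng(0)
M_true = rng.standard_normal((4, 3))
obs_mask = (rng.random((4, 3)) < 0.7).astype(float)
Y_bits = sample_one_bit(M_true, obs_mask, rng)
print(neg_log_likelihood(M_true, Y_bits, obs_mask))
```

Because each term log(1 + exp(-y x)) is convex in x, the negative log-likelihood is convex in X, which is the property that lets the estimate be computed by a convex program when the constraint set is convex.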
Emerging evidence shows that microRNAs (miRNAs) play a critical role in diverse fundamental and important biological processes associated with human diseases. Inferring potential disease ...related miRNAs and employing them as biomarkers or drug targets could contribute to the prevention, diagnosis and treatment of complex human diseases. Because traditional biological experiments cost much time and resources, computational models can serve as a complementary means of uncovering potential miRNA–disease associations. In this study, we proposed a new computational model named Neighborhood Constraint Matrix Completion for MiRNA–Disease Association prediction (NCMCMDA) to predict potential miRNA–disease associations. The main task of NCMCMDA was to recover the missing miRNA–disease associations based on the known miRNA–disease associations and integrated disease (miRNA) similarity. In this model, we integrated a neighborhood constraint with matrix completion, which provided a novel way of using similarity information to assist the prediction. After the recovery task was transformed into an optimization problem, we solved it with a fast iterative shrinkage-thresholding algorithm. The AUCs of NCMCMDA in global and local leave-one-out cross validation were 0.9086 and 0.8453, respectively. In 5-fold cross validation, NCMCMDA achieved an average AUC of 0.8942 with a standard deviation of 0.0015, which demonstrated NCMCMDA’s superior performance over many previous computational methods. Furthermore, NCMCMDA was applied to three different types of case studies to further evaluate its prediction reliability and accuracy. As a result, 84% (colon neoplasms), 98% (esophageal neoplasms) and 98% (breast neoplasms) of the top 50 predicted miRNAs were verified by recent literature.
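To show what a fast iterative shrinkage-thresholding algorithm looks like in a matrix completion setting, the sketch below applies FISTA to a generic nuclear-norm regularized least-squares problem. It is not the NCMCMDA model: the neighborhood constraint and similarity information are omitted, and the objective, the weight `lam`, the unit step size, and all function names are assumptions.

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: the proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def objective(X, A_obs, mask, lam):
    """0.5 * ||mask * (X - A_obs)||_F^2 + lam * ||X||_*."""
    fit = 0.5 * np.sum((mask * (X - A_obs)) ** 2)
    return fit + lam * np.linalg.svd(X, compute_uv=False).sum()

def fista_complete(A_obs, mask, lam, n_iters=100):
    """FISTA with step size 1, valid because the masked loss has Lipschitz constant 1."""
    X_prev = np.zeros_like(A_obs)
    Y = X_prev.copy()
    t = 1.0
    for _ in range(n_iters):
        grad = mask * (Y - A_obs)                 # gradient of the smooth term at Y
        X = svt(Y - grad, lam)                    # shrinkage-thresholding step
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Y = X + ((t - 1.0) / t_next) * (X - X_prev)   # momentum extrapolation
        X_prev, t = X, t_next
    return X_prev

# Hypothetical data: a random rank-2 matrix with half its entries observed.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))
obs = (rng.random(A.shape) < 0.5).astype(float)
lam = 0.1
A_hat = fista_complete(obs * A, obs, lam)
print(objective(np.zeros_like(A), obs * A, obs, lam), objective(A_hat, obs * A, obs, lam))
```

The momentum step is what distinguishes FISTA from plain iterative shrinkage-thresholding; removing it (setting Y = X each iteration) gives the slower unaccelerated method.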
There is growing interest in multilabel image classification due to its critical role in web-based image analytics applications, such as large-scale image retrieval and browsing. Matrix ...completion (MC) has recently been introduced as a method for transductive (semisupervised) multilabel classification, and has several distinct advantages, including robustness to missing data and background noise in both feature and label space. However, it is limited by only considering data represented by a single-view feature, which cannot precisely characterize images containing several semantic concepts. To utilize multiple features taken from different views, we have to concatenate the different features as a long vector. However, this concatenation is prone to over-fitting and often leads to very high time complexity in MC-based image classification. Therefore, we propose to combine the MC outputs of different views using learned weights, and present the multiview MC (MVMC) framework for transductive multilabel image classification. To learn the view combination weights effectively, we apply a cross-validation strategy on the labeled set. In particular, MVMC splits the labeled set into two parts, and predicts the labels of one part using the known labels of the other part. The predicted labels are then used to learn the view combination coefficients. In the learning process, we adopt the average precision (AP) loss, which is particularly suitable for multilabel image classification, since ranking-based criteria are critical for evaluating a multilabel classification system. A least squares loss formulation is also presented for the sake of efficiency, and the robustness of the algorithm based on the AP loss compared with the other losses is investigated.
Experimental evaluation on two real-world data sets (PASCAL VOC' 07 and MIR Flickr) demonstrates the effectiveness of MVMC for transductive (semisupervised) multilabel image classification, and shows that MVMC can exploit complementary properties of different features and output-consistent labels for improved multilabel image classification.
Spectral Gap-Based Seismic Survey Design
Lopez, Oscar; Kumar, Rajiv; Moldoveanu, Nick ...
IEEE transactions on geoscience and remote sensing, 01/2023, Volume 61
Journal Article, Peer reviewed, Open access
Seismic imaging in challenging sedimentary basins and reservoirs requires acquiring, processing, and imaging very large volumes of data (tens of terabytes). To reduce the cost of acquisition and the ...time from acquiring the data to producing a subsurface image, novel acquisition systems based on compressive sensing, low-rank matrix recovery, and randomized sampling have been developed and implemented. These approaches allow practitioners to achieve dense wavefield reconstruction from a substantially reduced number of field samples. However, designing acquisition surveys suited for this new sampling paradigm remains a critical and challenging task in oil, gas, and geothermal exploration. Typical random designs studied in the low-rank matrix recovery and compressive sensing literature are difficult to achieve with standard industry hardware. For practical purposes, a compromise between stochastic and realizable samples is needed. In this paper, we propose a deterministic and computationally cheap tool to alleviate randomized acquisition design, prior to survey deployment and large-scale optimization. We consider universal and deterministic matrix completion results in the context of seismology, where a bipartite graph representation of the source-receiver layout allows for the respective spectral gap to act as a quality metric for wavefield reconstruction. We provide realistic scenarios to demonstrate the utility of the spectral gap as a flexible tool that can be incorporated into existing survey design workflows for successful seismic data acquisition via low-rank and sparse signal recovery.
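A rough sketch of how a spectral gap could be computed for a source-receiver layout is given below. The abstract does not state the exact definition used, so this assumes the layout is encoded as a binary mask (rows as sources, columns as receivers, 1 where a trace is recorded) and reports the ratio of the second-largest to the largest singular value, where a smaller ratio means a larger gap. The function name and both example layouts are hypothetical.

```python
import numpy as np

def spectral_gap_ratio(mask):
    """Return sigma_2 / sigma_1 of a binary sampling mask.

    The mask is the biadjacency matrix of the bipartite source-receiver
    graph. Values near 0 indicate a large spectral gap; values near 1
    indicate a small one (for example, a graph that splits into
    disconnected pieces).
    """
    s = np.linalg.svd(np.asarray(mask, dtype=float), compute_uv=False)
    return s[1] / s[0]

# Two hypothetical layouts with the same number of samples (50% of a 20 x 20 grid).
n = 20
rows, cols = np.indices((n, n))
checkerboard = ((rows + cols) % 2 == 0).astype(float)   # regular, periodic design
rng = np.random.default_rng(0)
random_mask = (rng.random((n, n)) < 0.5).astype(float)  # randomized design

print("checkerboard ratio:", spectral_gap_ratio(checkerboard))
print("random ratio:      ", spectral_gap_ratio(random_mask))
```

The appeal of such a metric is that it needs only one singular value decomposition of the mask, so candidate layouts can be screened before any wavefield reconstruction is run.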
Conventional methods of matrix completion are linear methods that are not effective in handling data with nonlinear structures. Recently, a few researchers have attempted to incorporate nonlinear techniques ...into matrix completion, but considerable limitations still exist. In this paper, a novel method called deep matrix factorization (DMF) is proposed for nonlinear matrix completion. Unlike conventional matrix completion methods, which are based on linear latent variable models, DMF is based on a nonlinear latent variable model. DMF is formulated as a deep-structure neural network, in which the inputs are the low-dimensional unknown latent variables and the outputs are the partially observed variables. In DMF, the inputs and the parameters of the multilayer neural network are simultaneously optimized to minimize the reconstruction errors for the observed entries. Then the missing entries can be readily recovered by propagating the latent variables to the output layer. DMF is compared with state-of-the-art methods of linear and nonlinear matrix completion in the tasks of toy matrix completion, image inpainting and collaborative filtering. The experimental results verify that DMF provides higher matrix completion accuracy than existing methods and is applicable to large matrices.