Substantial progress has been made recently on developing provably accurate and efficient algorithms for low-rank matrix factorization via nonconvex optimization. While conventional wisdom often ...takes a dim view of nonconvex optimization algorithms due to their susceptibility to spurious local minima, simple iterative methods such as gradient descent have been remarkably successful in practice. The theoretical footings, however, had been largely lacking until recently. In this tutorial-style overview, we highlight the important role of statistical models in enabling efficient nonconvex optimization with performance guarantees. We review two contrasting approaches: (1) two-stage algorithms, which consist of a tailored initialization step followed by successive refinement; and (2) global landscape analysis and initialization-free algorithms. Several canonical matrix factorization problems are discussed, including but not limited to matrix sensing, phase retrieval, matrix completion, blind deconvolution, and robust principal component analysis. Special care is taken to illustrate the key technical insights underlying their analyses. This article serves as a testament that the integrated consideration of optimization and statistics leads to fruitful research findings.
We present the optimal design of a spectral method widely used to initialize nonconvex optimization algorithms for solving phase retrieval and other signal recovery problems. This paper leverages ...recent results that provide an exact characterization of the performance of the spectral method in the high-dimensional limit. This characterization allows us to map the task of optimal design to a constrained optimization problem in a weighted <inline-formula><tex-math notation="LaTeX">L^2</tex-math></inline-formula> function space. The latter has a closed-form solution. Interestingly, under a mild technical condition, our results show that there exists a fixed design that is uniformly optimal over all sampling ratios. Numerical simulations demonstrate the performance improvement brought by the proposed optimal design over existing constructions in the literature. In a recent work, Mondelli and Montanari have shown the existence of a weak recovery threshold below which the spectral method cannot provide useful estimates. Our results serve to complement that work by deriving the fundamental limit of the spectral method beyond the aforementioned threshold.
For many modern applications in science and engineering, data are collected in a streaming fashion carrying time-varying information, and practitioners need to process them with a limited amount of ...memory and computational resources in a timely manner for decision making. This often is coupled with the missing data problem, such that only a small fraction of data attributes are observed. These complications impose significant, and unconventional, constraints on the problem of streaming principal component analysis (PCA) and subspace tracking, which is an essential building block for many inference tasks in signal processing and machine learning. This survey article reviews a variety of classical and recent algorithms for solving this problem with low computational and memory complexities, particularly those applicable in the big data regime with missing data. We illustrate that streaming PCA and subspace tracking algorithms can be understood through algebraic and geometric perspectives, and they need to be adjusted carefully to handle missing data. Both asymptotic and nonasymptotic convergence guarantees are reviewed. Finally, we benchmark the performance of several competitive algorithms in the presence of missing data for both well-conditioned and ill-conditioned systems.
Acoustic echoes reveal room shape Dokmanić, Ivan; Parhizkar, Reza; Walther, Andreas ...
Proceedings of the National Academy of Sciences - PNAS,
07/2013, Letnik:
110, Številka:
30
Journal Article
Recenzirano
Odprti dostop
Imagine that you are blindfolded inside an unknown room. You snap your fingers and listen to the room’s response. Can you hear the shape of the room? Some people can do it naturally, but can we ...design computer algorithms that hear rooms? We show how to compute the shape of a convex polyhedral room from its response to a known sound, recorded by a few microphones. Geometric relationships between the arrival times of echoes enable us to “blindfoldedly” estimate the room geometry. This is achieved by exploiting the properties of Euclidean distance matrices. Furthermore, we show that under mild conditions, first-order echoes provide a unique description of convex polyhedral rooms. Our algorithm starts from the recorded impulse responses and proceeds by learning the correct assignment of echoes to walls. In contrast to earlier methods, the proposed algorithm reconstructs the full 3D geometry of the room from a single sound emission, and with an arbitrary geometry of the microphone array. As long as the microphones can hear the echoes, we can position them as we want. Besides answering a basic question about the inverse problem of room acoustics, our results find applications in areas such as architectural acoustics, indoor localization, virtual reality, and audio forensics.
We propose a fully distributed Gauss-Newton algorithm for state estimation of electric power systems. At each Gauss-Newton iteration, matrix-splitting techniques are utilized to carry out the matrix ...inversion needed for calculating the Gauss-Newton step in a distributed fashion. In order to reduce the communication burden as well as increase robustness of state estimation, the proposed distributed scheme relies only on local information and a limited amount of information from neighboring areas. The matrix-splitting scheme is designed to calculate the Gauss-Newton step with exponential convergence speed. The effectiveness of the method is demonstrated in various numerical experiments.
We propose a randomized version of the nonlocal means (NLM) algorithm for large-scale image filtering. The new algorithm, called Monte Carlo nonlocal means (MCNLM), speeds up the classical NLM by ...computing a small subset of image patch distances, which are randomly selected according to a designed sampling pattern. We make two contributions. First, we analyze the performance of the MCNLM algorithm and show that, for large images or large external image databases, the random outcomes of MCNLM are tightly concentrated around the deterministic full NLM result. In particular, our error probability bounds show that, at any given sampling ratio, the probability for MCNLM to have a large deviation from the original NLM solution decays exponentially as the size of the image or database grows. Second, we derive explicit formulas for optimal sampling patterns that minimize the error probability bound by exploiting partial knowledge of the pairwise similarity weights. Numerical experiments show that MCNLM is competitive with other state-of-the-art fast NLM algorithms for single-image denoising. When applied to denoising images using an external database containing ten billion patches, MCNLM returns a randomized solution that is within 0.2 dB of the full NLM solution while reducing the runtime by three orders of magnitude.
Estimating low-rank positive-semidefinite (PSD) matrices from symmetric rank-one measurements is of great importance in many applications, such as high-dimensional data processing, quantum state ...tomography, and phase retrieval. When the rank is known a priori, this problem can be regarded as solving a system of quadratic equations of a low-dimensional subspace. The authors develop a fast iterative algorithm based on an adaptation of the Kaczmarz method, which is traditionally used for solving overdetermined linear systems. In particular, the authors characterize the dynamics of the algorithm when the measurement vectors are composed of standard Gaussian entries in the online setting. Numerical simulations demonstrate the compelling performance of the proposed algorithm.
The spectral theory of graphs provides a bridge between classical signal processing and the nascent field of graph signal processing. In this paper, a spectral graph analogy to Heisenberg's ...celebrated uncertainty principle is developed. Just as the classical result provides a tradeoff between signal localization in time and frequency, this result provides a fundamental tradeoff between a signal's localization on a graph and in its spectral domain. Using the eigenvectors of the graph Laplacian as a surrogate Fourier basis, quantitative definitions of graph and spectral "spreads" are given, and a complete characterization of the feasibility region of these two quantities is developed. In particular, the lower boundary of the region, referred to as the uncertainty curve, is shown to be achieved by eigenvectors associated with the smallest eigenvalues of an affine family of matrices. The convexity of the uncertainty curve allows it to be found to within ε by a fast approximation algorithm requiring O (ε -1/2 ) typically sparse eigenvalue evaluations. Closed-form expressions for the uncertainty curves for some special classes of graphs are derived, and an accurate analytical approximation for the expected uncertainty curve of Erd-s-Rényi random graphs is developed. These theoretical results are validated by numerical experiments, which also reveal an intriguing connection between diffusion processes on graphs and the uncertainty bounds.
Transfer learning seeks to improve the generalization performance of a target task by exploiting the knowledge learned from a related source task. Central questions include deciding what information ...one should transfer and when transfer can be beneficial. The latter question is related to the so-called negative transfer phenomenon, where the transferred source information actually reduces the generalization performance of the target task. This happens when the two tasks are sufficiently dissimilar. In this paper, we present a theoretical analysis of transfer learning by studying a pair of related perceptron learning tasks. Despite the simplicity of our model, it reproduces several key phenomena observed in practice. Specifically, our asymptotic analysis reveals a phase transition from negative transfer to positive transfer as the similarity of the two tasks moves past a well-defined threshold.