Many problems in the sciences and engineering can be rephrased as optimization problems on matrix search spaces endowed with a so-called manifold structure. This book shows how to exploit the special ...structure of such problems to develop efficient numerical algorithms. It places careful emphasis on both the numerical formulation of the algorithm and its differential geometric abstraction--illustrating how good algorithms draw equally from the insights of differential geometry, optimization, and numerical analysis. Two more theoretical chapters provide readers with the background in differential geometry necessary to algorithmic development. In the other chapters, several well-known optimization methods such as steepest descent and conjugate gradients are generalized to abstract manifolds. The book provides a generic development of each of these methods, building upon the material of the geometric chapters. It then guides readers through the calculations that turn these geometrically formulated methods into concrete numerical algorithms. The state-of-the-art algorithms given as examples are competitive with the best existing algorithms for a selection of eigenspace problems in numerical linear algebra.
This paper deals with constructing retractions, a key step when applying optimization algorithms on matrix manifolds. For submanifolds of Euclidean spaces, we show that the operation consisting of ...taking a tangent step in the embedding Euclidean space followed by a projection onto the submanifold is a retraction. We also show that the operation remains a retraction if the projection is generalized to a projection-like procedure that consists of coming back to the submanifold along "admissible" directions, and we give a sufficient condition on the admissible directions for the generated retraction to be second order. This theory offers a framework in which previously proposed retractions can be analyzed, as well as a toolbox for constructing new ones. Illustrations are given for projection-like procedures on some specific manifolds for which we have an explicit, easy-to-compute expression.
Abstract
We consider the minimization of a cost function f on a manifold $\mathcal{M}$ using Riemannian gradient descent and Riemannian trust regions (RTR). We focus on satisfying necessary ...optimality conditions within a tolerance ε. Specifically, we show that, under Lipschitz-type assumptions on the pullbacks of f to the tangent spaces of $\mathcal{M}$, both of these algorithms produce points with Riemannian gradient smaller than ε in $\mathcal{O}\big(1/\varepsilon ^{2}\big)$ iterations. Furthermore, RTR returns a point where also the Riemannian Hessian’s least eigenvalue is larger than −ε in $\mathcal{O} \big(1/\varepsilon ^{3}\big)$ iterations. There are no assumptions on initialization. The rates match their (sharp) unconstrained counterparts as a function of the accuracy ε (up to constants) and hence are sharp in that sense. These are the first deterministic results for global rates of convergence to approximate first- and second-order Karush-Kuhn-Tucker points on manifolds. They apply in particular for optimization constrained to compact submanifolds of ${\mathbb{R}^{n}}$, under simpler assumptions.
The low-rank matrix completion problem can be solved by Riemannian optimization on a fixed-rank manifold. However, a drawback of the known approaches is that the rank parameter has to be fixed a ...priori. In this paper, we consider the optimization problem on the set of bounded-rank matrices. We propose a Riemannian rank-adaptive method, which consists of fixed-rank optimization, rank increase step and rank reduction step. We explore its performance applied to the low-rank matrix completion problem. Numerical experiments on synthetic and real-world datasets illustrate that the proposed rank-adaptive method compares favorably with state-of-the-art algorithms. In addition, it shows that one can incorporate each aspect of this rank-adaptive framework separately into existing algorithms for the purpose of improving performance.
Context. Data processing constitutes a critical component of high-contrast exoplanet imaging. Its role is almost as important as the choice of a coronagraph or a wavefront control system, and it is ...intertwined with the chosen observing strategy. Among the data processing techniques for angular differential imaging (ADI), the most recent is the family of principal component analysis (PCA) based algorithms. It is a widely used statistical tool developed during the first half of the past century. PCA serves, in this case, as a subspace projection technique for constructing a reference point spread function (PSF) that can be subtracted from the science data for boosting the detectability of potential companions present in the data. Unfortunately, when building this reference PSF from the science data itself, PCA comes with certain limitations such as the sensitivity of the lower dimensional orthogonal subspace to non-Gaussian noise. Aims. Inspired by recent advances in machine learning algorithms such as robust PCA, we aim to propose a localized subspace projection technique that surpasses current PCA-based post-processing algorithms in terms of the detectability of companions at near real-time speed, a quality that will be useful for future direct imaging surveys. Methods. We used randomized low-rank approximation methods recently proposed in the machine learning literature, coupled with entry-wise thresholding to decompose an ADI image sequence locally into low-rank, sparse, and Gaussian noise components (LLSG). This local three-term decomposition separates the starlight and the associated speckle noise from the planetary signal, which mostly remains in the sparse term. We tested the performance of our new algorithm on a long ADI sequence obtained on beta Pictoris with VLT/NACO. Results. Compared to a standard PCA approach, LLSG decomposition reaches a higher signal-to-noise ratio and has an overall better performance in the receiver operating characteristic space. This three-term decomposition brings a detectability boost compared to the full-frame standard PCA approach, especially in the small inner working angle region where complex speckle noise prevents PCA from discerning true companions from noise.
Approximate matrix factorization techniques with both nonnegativity and orthogonality constraints, referred to as orthogonal nonnegative matrix factorization (ONMF), have been recently introduced and ...shown to work remarkably well for clustering tasks such as document classification. In this paper, we introduce two new methods to solve ONMF. First, we show mathematical equivalence between ONMF and a weighted variant of spherical k-means, from which we derive our first method, a simple EM-like algorithm. This also allows us to determine when ONMF should be preferred to k-means and spherical k-means. Our second method is based on an augmented Lagrangian approach. Standard ONMF algorithms typically enforce nonnegativity for their iterates while trying to achieve orthogonality at the limit (e.g., using a proper penalization term or a suitably chosen search direction). Our method works the opposite way: orthogonality is strictly imposed at each step while nonnegativity is asymptotically obtained, using a quadratic penalty. Finally, we show that the two proposed approaches compare favorably with standard ONMF algorithms on synthetic, text and image data sets.
We address the numerical problem of recovering large matrices of low rank when most of the entries are unknown. We exploit the geometry of the low-rank constraint to recast the problem as an ...unconstrained optimization problem on a single Grassmann manifold. We then apply second-order Riemannian trust-region methods (RTRMC 2) and Riemannian conjugate gradient methods (RCGMC) to solve it. A preconditioner for the Hessian is introduced that helps control the conditioning of the problem and we detail preconditioned versions of Riemannian optimization algorithms. The cost of each iteration is linear in the number of known entries. The proposed methods are competitive with state-of-the-art algorithms on a wide range of problem instances. In particular, they perform well on rectangular matrices. We further note that second-order and preconditioned methods are well suited to solve badly conditioned matrix completion tasks.
Affecting millions of individuals yearly, malaria is one of the most dangerous and deadly tropical diseases. It is a major global public health problem, with an alarming spread of parasite ...transmitted by mosquito (Anophele). Various studies have emerged that construct a mathematical and statistical model for malaria incidence forecasting. In this study, we formulate a generalized linear model based on Poisson and negative binomial regression models for forecasting malaria incidence, taking into account climatic variables (such as the monthly rainfall, average temperature, relative humidity), other predictor variables (the insecticide-treated bed-nets (ITNs) distribution and Artemisinin-based combination therapy (ACT)) and the history of malaria incidence in Dakar, Fatick and Kedougou, three different endemic regions of Senegal. A forecasting algorithm is developed by taking the meteorological explanatory variable Xj at time t-𝓁j, where
is the observation time and 𝓁j is the lag in Xj that maximizes its correlation with the malaria incidence. We saturated the rainfall in order to reduce over-forecasting. The results of this study show that the Poisson regression model is more adequate than the negative binomial regression model to forecast accurately the malaria incidence taking into account some explanatory variables. The application of the saturation where the over-forecasting was observed noticeably increases the quality of the forecasts.