This paper considers the noisy sparse phase retrieval problem: recovering a sparse signal x ϵ ℝp from noisy quadratic measurements yj = (a′jx)² + εj, j = 1 , . . . , m, with independent ...sub-exponential noise εj. The goals are to understand the effect of the sparsity of x on the estimation precision and to construct a computationally feasible estimator to achieve the optimal rates adaptively. Inspired by the Wirtinger Flow IEEE Trans. Inform. Theory 61 (2015) 1985-2007 proposed for non-sparse and noiseless phase retrieval, a novel thresholded gradient descent algorithm is proposed and it is shown to adaptively achieve the minimax optimal rates of convergence over a wide range of sparsity levels when the aj's are independent standard Gaussian random vectors, provided that the sample size is sufficiently large compared to the sparsity of x.
Full text
Available for:
BFBNIB, INZLJ, NMLJ, NUK, PNG, SAZU, UL, UM, UPUK, ZRSKP
We consider the orthogonal matching pursuit (OMP) algorithm for the recovery of a high-dimensional sparse signal based on a small number of noisy linear measurements. OMP is an iterative greedy ...algorithm that selects at each step the column, which is most correlated with the current residuals. In this paper, we present a fully data driven OMP algorithm with explicit stopping rules. It is shown that under conditions on the mutual incoherence and the minimum magnitude of the nonzero components of the signal, the support of the signal can be recovered exactly by the OMP algorithm with high probability. In addition, we also consider the problem of identifying significant components in the case where some of the nonzero components are possibly small. It is shown that in this case the OMP algorithm will still select all the significant components before possibly selecting incorrect ones. Moreover, with modified stopping rules, the OMP algorithm can ensure that no zero components are selected.
High-dimensional logistic regression is widely used in analyzing data with binary outcomes. In this article, global testing and large-scale multiple testing for the regression coefficients are ...considered in both single- and two-regression settings. A test statistic for testing the global null hypothesis is constructed using a generalized low-dimensional projection for bias correction and its asymptotic null distribution is derived. A lower bound for the global testing is established, which shows that the proposed test is asymptotically minimax optimal over some sparsity range. For testing the individual coefficients simultaneously, multiple testing procedures are proposed and shown to control the false discovery rate and falsely discovered variables asymptotically. Simulation studies are carried out to examine the numerical performance of the proposed tests and their superiority over existing methods. The testing procedures are also illustrated by analyzing a dataset of a metabolomics study that investigates the association between fecal metabolites and pediatric Crohn's disease and the effects of treatment on such associations.
Supplementary materials
for this article are available online.
Full text
Available for:
BFBNIB, GIS, IJS, KISLJ, NUK, PNG, UL, UM, UPUK
Perturbation bounds for singular spaces, in particular Wedin’s sin Θ theorem, are a fundamental tool in many fields including high-dimensional statistics, machine learning and applied mathematics. In ...this paper, we establish separate perturbation bounds, measured in both spectral and Frobenius sin Θ distances, for the left and right singular subspaces. Lower bounds, which show that the individual perturbation bounds are rate-optimal, are also given.
The new perturbation bounds are applicable to a wide range of problems. In this paper, we consider in detail applications to low-rank matrix denoising and singular space estimation, high-dimensional clustering and canonical correlation analysis (CCA). In particular, separate matching upper and lower bounds are obtained for estimating the left and right singular spaces. To the best of our knowledge, this is the first result that gives different optimal rates for the left and right singular spaces under the same perturbation.
Full text
Available for:
BFBNIB, INZLJ, NMLJ, NUK, PNG, SAZU, UL, UM, UPUK, ZRSKP
Principal component analysis (PCA) is one of the most commonly used statistical procedures with a wide range of applications. This paper considers both minimax and adaptive estimation of the ...principal subspace in the high dimensional setting. Under mild technical conditions, we first establish the optimal rates of convergence for estimating the principal subspace which are sharp with respect to all the parameters, thus providing a complete characterization of the difficulty of the estimation problem in term of the convergence rate. The lower bound is obtained by calculating the local metric entropy and an application of Fano's lemma. The rate optimal estimator is constructed using aggregation, which, however, might not be computationally feasible. We then introduce an adaptive procedure for estimating the principal subspace which is fully data driven and can be computed efficiently. It is shown that the estimator attains the optimal rates of convergence simultaneously over a large collection of the parameter spaces. A key idea in our construction is a reduction scheme which reduces the sparse PCA problem to a high-dimensional multivariate regression problem. This method is potentially also useful for other related problems.
Full text
Available for:
BFBNIB, INZLJ, NMLJ, NUK, PNG, SAZU, UL, UM, UPUK, ZRSKP
The paper considers in the high dimensional setting a canonical testing problem in multivariate analysis, namely testing the equality of two mean vectors. We introduce a new test statistic that is ...based on a linear transformation of the data by the precision matrix which incorporates the correlations between the variables. The limiting null distribution of the test statistic and the power of the test are analysed. It is shown that the test is particularly powerful against sparse alternatives and enjoys certain optimality. A simulation study is carried out to examine the numerical performance of the test and to compare it with other tests given in the literature. The results show that the test proposed significantly outperforms those tests in a range of settings.
Full text
Available for:
BFBNIB, FZAB, GIS, IJS, INZLJ, IZUM, KILJ, NLZOH, NMLJ, NUK, OILJ, PILJ, PNG, SAZU, SBCE, SBMB, UL, UM, UPUK, ZRSKP
This article considers minimax and adaptive prediction with functional predictors in the framework of functional linear model and reproducing kernel Hilbert space. Minimax rate of convergence for the ...excess prediction risk is established. It is shown that the optimal rate is determined jointly by the reproducing kernel and the covariance kernel. In particular, the alignment of these two kernels can significantly affect the difficulty of the prediction problem. In contrast, the existing literature has so far focused only on the setting where the two kernels are nearly perfectly aligned. This motivates us to propose an easily implementable data-driven roughness regularization predictor that is shown to attain the optimal rate of convergence adaptively without the need of knowing the covariance kernel. Simulation studies are carried out to illustrate the merits of the adaptive predictor and to demonstrate the theoretical results.
Full text
Available for:
BFBNIB, GIS, IJS, INZLJ, KISLJ, NMLJ, NUK, PNG, SAZU, UL, UM, UPUK, ZRSKP
Testing covariance structure is of significant interest in many areas of statistical analysis and construction of compressed sensing matrices is an important problem in signal processing. Motivated ...by these applications, we study in this paper the limiting laws of the coherence of an n × p random matrix in the high-dimensional setting where p can be much larger than n. Both the law of large numbers and the limiting distribution are derived. We then consider testing the bandedness of the covariance matrix of a high-dimensional Gaussian distribution which includes testing for independence as a special case. The limiting laws of the coherence of the data matrix play a critical role in the construction of the test. We also apply the asymptotic results to the construction of compressed sensing matrices.
Full text
Available for:
BFBNIB, INZLJ, NMLJ, NUK, PNG, SAZU, UL, UM, UPUK, ZRSKP
This paper considers compressed sensing and affine rank minimization in both noiseless and noisy cases and establishes sharp restricted isometry conditions for sparse signal and low-rank matrix ...recovery. The analysis relies on a key technical tool, which represents points in a polytope by convex combinations of sparse vectors. The technique is elementary while yielding sharp results. It is shown that for any given constant t ≥ 4/3, in compressed sensing, δ tk A <; √((t-1)/t) guarantees the exact recovery of all k sparse signals in the noiseless case through the constrained l 1 minimization, and similarly, in affine rank minimization, δ tr M <; √((t-1)/t) ensures the exact reconstruction of all matrices with rank at most r in the noiseless case via the constrained nuclear norm minimization. In addition, for any ε > 0, δ tk A <; √( t-1 / t ) + ε is not sufficient to guarantee the exact recovery of all k-sparse signals for large k. Similar results also hold for matrix recovery. In addition, the conditions δ tk A <; √((t-)1/t) and δ tr M <; √((t-1)/t) are also shown to be sufficient, respectively, for stable recovery of approximately sparse signals and low-rank matrices in the noisy case.
We study the asymptotic distributions of the spiked eigenvalues and the largest nonspiked eigenvalue of the sample covariance matrix under a general covariance model with divergent spiked ...eigenvalues, while the other eigenvalues are bounded but otherwise arbitrary. The limiting normal distribution for the spiked sample eigenvalues is established. It has distinct features that the asymptotic mean relies on not only the population spikes but also the nonspikes and that the asymptotic variance in general depends on the population eigenvectors. In addition, the limiting Tracy–Widom law for the largest nonspiked sample eigenvalue is obtained.
Estimation of the number of spikes and the convergence of the leading eigenvectors are also considered. The results hold even when the number of the spikes diverges. As a key technical tool, we develop a central limit theorem for a type of random quadratic forms where the random vectors and random matrices involved are dependent. This result can be of independent interest.
Full text
Available for:
BFBNIB, INZLJ, NMLJ, NUK, PNG, SAZU, UL, UM, UPUK, ZRSKP