Sample covariance matrices are widely used in multivariate statistical analysis. Central limit theorems (CLTs) for linear spectral statistics of high-dimensional noncentralized sample covariance matrices have received considerable attention in random matrix theory and have been applied to many high-dimensional statistical problems. However, noncentralized sample covariance matrices assume known population mean vectors, and some results even assume Gaussian-like moment conditions. In practice, two other sample covariance matrices are most frequently used, neither of which depends on the unknown population mean vector: the ME (moment estimator, constructed by subtracting the sample mean vector from each sample vector) and the unbiased sample covariance matrix (obtained by replacing the denominator n in the ME with N = n − 1). In this paper, we not only establish new CLTs for noncentralized sample covariance matrices when the Gaussian-like moment conditions do not hold, but also characterize the nonnegligible differences among the CLTs for the three classes of high-dimensional sample covariance matrices by establishing a substitution principle: substituting the adjusted sample size N = n − 1 for the actual sample size n in the centering term of the new CLTs yields the CLT for the unbiased sample covariance matrix. Moreover, we find that the difference between the CLTs for the ME and the unbiased sample covariance matrix is nonnegligible in the centering term, even though the two estimators differ only in their normalization by n and n − 1, respectively. The new results are applied to two testing problems for high-dimensional covariance matrices.
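The three estimators discussed above differ only in centering and normalization; a minimal sketch (illustrative only, assuming the population mean is known for the noncentralized version):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 5
# samples with known population mean 3 in every coordinate
X = rng.standard_normal((n, p)) @ np.diag([2.0, 1.5, 1.0, 1.0, 0.5]) + 3.0

mu = np.full(p, 3.0)      # known population mean vector
xbar = X.mean(axis=0)     # sample mean vector

# noncentralized: centers with the known population mean
S_nc = (X - mu).T @ (X - mu) / n
# ME: centers with the sample mean, normalizes by n
S_me = (X - xbar).T @ (X - xbar) / n
# unbiased: same centering, normalizes by N = n - 1
S_ub = (X - xbar).T @ (X - xbar) / (n - 1)

# the ME and unbiased estimators differ only by the factor n/(n - 1)
print(np.allclose(S_ub, S_me * n / (n - 1)))  # True
```

Despite this seemingly trivial rescaling, the abstract's point is that the centering terms of the corresponding CLTs differ nonnegligibly.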
Abstract
Studying the positive definiteness of the covariance matrix of discrete samples helps to determine whether the dimensionality of the samples can be reduced, which is beneficial for optimizing the number of samples and designing optimal plans for sampling surveys. This paper aims to provide a method to determine the number of variables for samples following a Poisson distribution.
Methods
The method is based on the theory of I-linear combination and its properties, which are the author's previous research results.
Results
The study shows that the covariance matrix of the multi-Poisson distribution is positive definite, and that the sample covariance matrix of the multi-Poisson distribution is positive definite with probability close to 1 when the sample size is very large.
Conclusion
The dimension p of the sample data matrix of the multi-Poisson distribution can be reduced when the sample size n is no more than the dimension p.
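A minimal simulation of the Results claim (an illustrative sketch, not the paper's method; it assumes independent Poisson coordinates and the usual unbiased sample covariance): for large n the sample covariance matrix is positive definite with overwhelming probability, while for n ≤ p it is necessarily singular, since its rank is at most n − 1.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_cov_pd(n, p, lam=3.0):
    """Draw n i.i.d. p-dimensional vectors with independent Poisson(lam)
    coordinates and report whether the sample covariance matrix is
    positive definite (all eigenvalues above a small tolerance)."""
    X = rng.poisson(lam, size=(n, p))
    S = np.cov(X, rowvar=False)
    return np.all(np.linalg.eigvalsh(S) > 1e-10)

# large sample size: positive definite with probability close to 1
print(sample_cov_pd(n=500, p=10))
# n <= p: rank(S) <= n - 1 < p, so S is necessarily singular
print(sample_cov_pd(n=8, p=10))   # False
```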
Frequency diverse array and multi-input and multi-output (FDA-MIMO) radar have been applied to many fields due to their angle-range-dependent beampattern. A number of the related array processing algorithms can be successfully used on the basis of a known covariance matrix (CM). The estimation accuracy of the CM directly influences algorithm performance. In particular, the estimation performance of the sample covariance matrix (SCM) degrades sharply once the sample size is less than the channel number. Aiming to improve the estimation accuracy of the SCM with FDA-MIMO radar, we propose a novel shrinkage-to-tapering (ST)-based method combined with block Toeplitz rectification (STT). Firstly, each block matrix in the CM is processed using the Toeplitz rectification method, and then three different estimation matrix structures based on the ST method are proposed. Next, by plugging unbiased estimators into the optimal shrinkage coefficient and the normalized mean square error (MSE), closed forms of the estimators can be given. The numerical simulation results demonstrate that the proposed three STT approaches are superior to a number of existing methods in terms of CM estimation performance, and that the estimation performance is related to the choice of tapering matrix.
In this paper, we study the asymptotic behavior of the extreme eigenvalues and eigenvectors of high-dimensional spiked sample covariance matrices, in the supercritical case when a reliable detection of spikes is possible. In particular, we derive the joint distribution of the extreme eigenvalues and the generalized components of the associated eigenvectors, that is, the projections of the eigenvectors onto an arbitrarily given direction, assuming that the dimension and sample size are comparably large. In general, the joint distribution is given in terms of linear combinations of finitely many Gaussian and Chi-square variables, with parameters depending on the projection direction and the spikes. Our assumption on the spikes is fully general. First, the strengths of the spikes are only required to be slightly above the critical threshold, and no upper bound on the strengths is needed. Second, multiple spikes, that is, spikes with the same strength, are allowed. Third, no structural assumption is imposed on the spikes. Thanks to this general setting, we can then apply the results to various high-dimensional statistical hypothesis testing problems involving both the eigenvalues and eigenvectors. Specifically, we propose accurate and powerful statistics to conduct hypothesis testing on the principal components. These statistics are data-dependent and adaptive to the underlying true spikes. Numerical simulations also confirm the accuracy and power of our proposed statistics and illustrate significantly better performance compared to the existing methods in the literature. In particular, our methods are accurate and powerful even when either the spikes are small or the dimension is large.
A highly popular regularized (shrinkage) covariance matrix estimator is the shrinkage sample covariance matrix (SCM), which shares the same set of eigenvectors as the SCM but shrinks its eigenvalues toward the grand mean of the eigenvalues of the SCM. In this paper, a more general approach is considered in which the SCM is replaced by an M-estimator of scatter matrix, and a fully automatic, data-adaptive method to compute the optimal shrinkage parameter with minimum mean squared error is proposed. Our approach permits the use of any weight function such as Gaussian, Huber's, Tyler's, or t weight functions, all of which are commonly used in the M-estimation framework. Our simulation examples illustrate that shrinkage M-estimators based on the proposed optimal tuning combined with a robust weight function do not lose performance relative to the shrinkage SCM estimator when the data is Gaussian, but provide significantly improved performance when the data is sampled from an unspecified heavy-tailed elliptically symmetric distribution. Also, real-world and synthetic stock market data validate the performance of the proposed method in practical applications.
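The shrinkage SCM described above can be sketched in a few lines. This is a generic illustration of eigenvalue shrinkage toward the grand mean (the name shrinkage_scm and the fixed coefficient alpha are illustrative; the paper's contribution is the M-estimator extension and the automatic choice of the coefficient):

```python
import numpy as np

def shrinkage_scm(X, alpha):
    """Linear shrinkage of the SCM toward eta*I, where eta = tr(S)/p is the
    grand mean of the SCM eigenvalues. The result keeps the SCM's
    eigenvectors and pulls each eigenvalue toward eta."""
    n, p = X.shape
    S = np.cov(X, rowvar=False)
    eta = np.trace(S) / p
    return (1 - alpha) * S + alpha * eta * np.eye(p)

rng = np.random.default_rng(2)
X = rng.standard_normal((20, 50))           # n < p: the plain SCM is singular
S_shr = shrinkage_scm(X, alpha=0.3)
# shrinkage restores positive definiteness: eigenvalues become
# (1 - alpha)*lambda_i + alpha*eta > 0
print(np.linalg.eigvalsh(S_shr).min() > 0)  # True
```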
We consider general high-dimensional spiked sample covariance models and show that their leading sample spiked eigenvalues and their linear spectral statistics are asymptotically independent when the sample size and dimension are proportional to each other. As a byproduct, we also establish the central limit theorem of the leading sample spiked eigenvalues by removing the block diagonal assumption on the population covariance matrix, which is commonly needed in the literature. Moreover, we propose consistent estimators of the L4 norm of the spiked population eigenvectors. Based on these results, we develop a new statistic to test the equality of two spiked population covariance matrices. Numerical studies show that the new test procedure is more powerful than some existing methods.
• Covariance matrices are used in various signal processing tasks.
• A cross-validation method for shrinkage covariance matrix estimation is proposed.
• The method provides closed-form solutions for general covariance matrix estimators.
• Applications to array signal processing are demonstrated.
Shrinkage can effectively improve the condition number and accuracy of covariance matrix estimation, especially for low-sample-support applications with the number of training samples smaller than the dimensionality. This paper investigates parameter choice for linear shrinkage estimators. We propose data-driven, leave-one-out cross-validation (LOOCV) methods for automatically choosing the shrinkage coefficients, aiming to minimize the Frobenius norm of the estimation error. A quadratic loss is used as the prediction error for LOOCV. The resulting solutions can be found analytically or by solving optimization problems of small sizes and thus have low complexities. Our proposed methods are compared with various existing techniques. We show that the LOOCV method achieves near-oracle performance for shrinkage designs using sample covariance matrix (SCM) and several typical shrinkage targets. Furthermore, the LOOCV method provides low-complexity solutions for estimators that use general shrinkage targets, multiple targets, and/or ordinary least squares (OLS)-based covariance matrix estimation. We also show applications of our proposed techniques to several different problems in array signal processing.
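The LOOCV criterion can be illustrated by a brute-force grid search (a sketch only: it assumes zero-mean samples, a scaled-identity shrinkage target, and exhaustive search over candidate coefficients, whereas the paper derives low-complexity analytic solutions):

```python
import numpy as np

def loocv_shrinkage_alpha(X, alphas):
    """For each candidate alpha, predict the held-out rank-one term
    x_i x_i^T with the shrinkage estimate built from the remaining
    samples, scoring with squared Frobenius (quadratic) loss, and
    return the alpha with the smallest average loss.
    Assumes zero-mean data (no centering)."""
    n, p = X.shape
    losses = []
    for alpha in alphas:
        loss = 0.0
        for i in range(n):
            Xi = np.delete(X, i, axis=0)
            S = Xi.T @ Xi / (n - 1)               # SCM without sample i
            T = (np.trace(S) / p) * np.eye(p)     # scaled-identity target
            Sigma = (1 - alpha) * S + alpha * T
            loss += np.sum((np.outer(X[i], X[i]) - Sigma) ** 2)
        losses.append(loss / n)
    return alphas[int(np.argmin(losses))]

rng = np.random.default_rng(3)
X = rng.standard_normal((15, 30))   # low sample support: n < p
alpha = loocv_shrinkage_alpha(X, alphas=np.linspace(0, 1, 21))
print(0.0 <= alpha <= 1.0)  # True
```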
The eigenvector empirical spectral distribution (VESD) is a useful tool in studying the limiting behavior of eigenvalues and eigenvectors of covariance matrices. In this paper, we study the convergence rate of the VESD of sample covariance matrices to the deformed Marčenko–Pastur (MP) distribution. Consider sample covariance matrices of the form Σ^{1/2}XX*Σ^{1/2}, where X = (x_ij) is an M × N random matrix whose entries are independent random variables with mean zero and variance N^{−1}, and Σ is a deterministic positive-definite matrix. We prove that the Kolmogorov distance between the expected VESD and the deformed MP distribution is bounded by N^{−1+ϵ} for any fixed ϵ > 0, provided that the entries √N x_ij have uniformly bounded 6th moments and |N/M − 1| ≥ τ for some constant τ > 0. This result improves the previous one obtained in (Ann. Statist. 41 (2013) 2572–2607), which gave the convergence rate O(N^{−1/2}) assuming i.i.d. X entries, bounded 10th moments, Σ = I and M < N. Moreover, we also prove that under a finite 8th moment assumption, the convergence rate of the VESD is O(N^{−1/2+ϵ}) almost surely for any fixed ϵ > 0, which improves the previous bound N^{−1/4+ϵ} in (Ann. Statist. 41 (2013) 2572–2607).
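For intuition, the VESD itself is easy to compute numerically. A minimal sketch with Σ = I (purely illustrative; the paper's results concern Kolmogorov-distance rates to the deformed MP law, which this snippet does not estimate):

```python
import numpy as np

rng = np.random.default_rng(4)
M, N = 300, 600
X = rng.standard_normal((M, N)) / np.sqrt(N)   # entries with variance 1/N
S = X @ X.T                                     # sample covariance, Sigma = I
evals, U = np.linalg.eigh(S)                    # ascending eigenvalues

v = np.zeros(M)
v[0] = 1.0                                      # a fixed projection direction
weights = (U.T @ v) ** 2                        # |<u_k, v>|^2, summing to 1

def vesd(x):
    """VESD at x for direction v: sum of |<u_k, v>|^2 over eigenvalues <= x,
    replacing the uniform 1/M weights of the ordinary ESD."""
    return weights[evals <= x].sum()

# the VESD is a genuine CDF: 0 left of the spectrum, 1 right of it
print(vesd(evals.min() - 1) == 0.0,
      abs(vesd(evals.max() + 1) - 1.0) < 1e-8)
```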
A blind multiband spectrum sensing approach in a wideband scenario is presented using eigenvalues of a reduced sample covariance matrix formed from a small number of samples. In this approach, the wideband is split into non-overlapping multiple subbands to determine the vacant subbands. The proposed detection scheme does not require a priori knowledge about the primary users or the noise signals and has lower computational complexity. Simulation results show a better probability of detection for the proposed method in comparison with existing methods.
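A common blind eigenvalue detector of this kind is the maximum-minimum eigenvalue (MME) test. The sketch below is generic, not the paper's exact scheme (the threshold here is chosen arbitrarily): a subband is flagged as occupied when the eigenvalue spread of its sample covariance matrix exceeds a threshold, with no knowledge of the signal or the noise power.

```python
import numpy as np

def mme_detect(Y, threshold):
    """Blind maximum-minimum eigenvalue (MME) detection: declare the subband
    occupied when the ratio of the largest to smallest eigenvalue of the
    sample covariance matrix of the received snapshots exceeds a threshold."""
    R = Y @ Y.conj().T / Y.shape[1]     # sample covariance from the snapshots
    ev = np.linalg.eigvalsh(R)          # ascending eigenvalues
    return ev[-1] / ev[0] > threshold

rng = np.random.default_rng(5)
p, L = 8, 100                            # channels, snapshots
noise = rng.standard_normal((p, L))      # noise-only subband
signal = np.outer(rng.standard_normal(p), rng.standard_normal(L))
occupied = noise + 3.0 * signal          # subband with a rank-one primary signal

# noise-only eigenvalues cluster, so the ratio stays small;
# the signal adds one dominant eigenvalue, inflating the ratio
print(mme_detect(noise, threshold=10.0),
      mme_detect(occupied, threshold=10.0))
```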