Analysis of networks via the sparse β‐model
Chen, Mingli; Kato, Kengo; Leng, Chenlei
Journal of the Royal Statistical Society, Series B (Statistical Methodology), November 2021, Volume 83, Issue 5
Journal Article
Peer reviewed
Open access
Data in the form of networks are increasingly available in a variety of areas, yet statistical models allowing for parameter estimates with desirable statistical properties for sparse networks remain scarce. To address this, we propose the Sparse β‐Model (SβM), a new network model that interpolates between the celebrated Erdős–Rényi model and the β‐model, which assigns a different parameter to each node. By a novel reparameterization of the β‐model to distinguish global and local parameters, the SβM can drastically reduce the dimensionality of the β‐model by requiring some of the local parameters to be zero. We derive the asymptotic distribution of the maximum likelihood estimator of the SβM when the support of the parameter vector is known. When the support is unknown, we formulate a penalized likelihood approach with the ℓ0‐penalty. Remarkably, we show via a monotonicity lemma that the seemingly combinatorial computational problem due to the ℓ0‐penalty can be overcome by assigning non‐zero parameters to the nodes with the largest degrees. We further show that a β‐min condition guarantees that our method identifies the true model, and we provide excess risk bounds for the estimated parameters. The estimation procedure enjoys good finite-sample properties, as shown by simulation studies. The usefulness of the SβM is further illustrated via the analysis of a microfinance take‐up example.
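The monotonicity result described in this abstract admits a very compact implementation: for a given support size, the ℓ0-penalized fit places the nonzero local parameters on the highest-degree nodes, so no combinatorial search over supports is needed. A minimal sketch (the function name and adjacency-matrix interface are illustrative, not from the paper):

```python
import numpy as np

def sbm_support(A, k):
    """Support selection for the l0-penalized SbetaM fit (sketch).
    By the monotonicity lemma, the k nonzero local parameters go to
    the k nodes with the largest degrees; A is an adjacency matrix."""
    degrees = A.sum(axis=1)
    return np.argsort(degrees)[::-1][:k]   # indices of top-k degree nodes

# toy example: a star graph, where node 0 is the obvious hub
A = np.zeros((5, 5), dtype=int)
A[0, 1:] = A[1:, 0] = 1
support = sbm_support(A, k=1)
```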
Separation is encountered in regression models with a discrete outcome (such as logistic regression) where the covariates perfectly predict the outcome. It is most frequent under the same conditions that lead to small-sample and sparse-data bias, such as presence of a rare outcome, rare exposures, highly correlated covariates, or covariates with strong effects. In theory, separation will produce infinite estimates for some coefficients. In practice, however, separation may be unnoticed or mishandled because of software limits in recognizing and handling the problem and in notifying the user. We discuss causes of separation in logistic regression and describe how common software packages deal with it. We then describe methods that remove separation, focusing on the same penalized-likelihood techniques used to address more general sparse-data problems. These methods improve accuracy, avoid software problems, and allow interpretation as Bayesian analyses with weakly informative priors. We discuss likelihood penalties, including some that can be implemented easily with any software package, and their relative advantages and disadvantages. We provide an illustration of ideas and methods using data from a case-control study of contraceptive practices and urinary tract infection.
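One of the penalized-likelihood fixes discussed for separation is Firth's Jeffreys-prior penalty, whose modified score for logistic regression adds a hat-value correction to the usual score. A rough NumPy sketch, assuming plain Newton iterations without step-halving (the function name and iteration scheme are illustrative): on perfectly separated data the unpenalized MLE slope is infinite, while this penalized fit stays finite.

```python
import numpy as np

def firth_logistic(X, y, n_iter=50):
    """Newton iterations on Firth's Jeffreys-prior penalized logistic
    log-likelihood (illustrative sketch, no step-halving)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1.0 - p)
        fisher = (X.T * W) @ X                       # X' W X
        finv = np.linalg.inv(fisher)
        # hat-matrix diagonals of W^(1/2) X (X'WX)^(-1) X' W^(1/2)
        h = W * np.einsum('ij,jk,ik->i', X, finv, X)
        # Firth-modified score: X'(y - p) + X'(h * (1/2 - p))
        score = X.T @ (y - p + h * (0.5 - p))
        beta = beta + finv @ score
    return beta

# perfectly separated toy data: x < 0 always gives y = 0, x > 0 gives y = 1
X = np.column_stack([np.ones(6), np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])])
y = np.array([0, 0, 0, 1, 1, 1])
beta_firth = firth_logistic(X, y)   # finite despite separation
```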
Motivated by the analysis of gene expression data measured over different tissues or over time, we consider matrix-valued random variables and the matrix-normal distribution, where the two precision matrices have a graphical interpretation for genes and tissues, respectively. We present an ℓ1‐penalized likelihood method and an efficient coordinate descent-based computational algorithm for model selection and estimation in such matrix normal graphical models (MNGMs). We provide theoretical results on the asymptotic distributions, the rates of convergence of the estimates, and the sparsistency, allowing both the numbers of genes and tissues to diverge as the sample size goes to infinity. Simulation results demonstrate that MNGMs can lead to better estimates of the precision matrices and better identification of the graph structures than the standard Gaussian graphical models. We illustrate the methods with an analysis of mouse gene expression data measured over ten different tissues.
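The alternating ("flip-flop") structure behind penalized matrix-normal estimation can be sketched by updating the row (gene) and column (tissue) precision matrices in turn, each via a graphical-lasso step. This is an assumed simplification built on scikit-learn's graphical_lasso, not the authors' coordinate-descent algorithm:

```python
import numpy as np
from sklearn.covariance import graphical_lasso

def mngm_flip_flop(X, alpha, n_iter=5):
    """Alternating graphical-lasso updates for the row and column
    precision matrices of matrix-valued data X of shape (n, p, q).
    Sketch only; scaling/identifiability details are glossed over."""
    n, p, q = X.shape
    omega_c = np.eye(q)                       # column (tissue) precision
    for _ in range(n_iter):
        # row sample covariance given the current column precision
        S_r = sum(Xk @ omega_c @ Xk.T for Xk in X) / (n * q)
        S_r = (S_r + S_r.T) / 2.0             # enforce exact symmetry
        _, omega_r = graphical_lasso(S_r, alpha)
        # column sample covariance given the current row precision
        S_c = sum(Xk.T @ omega_r @ Xk for Xk in X) / (n * p)
        S_c = (S_c + S_c.T) / 2.0
        _, omega_c = graphical_lasso(S_c, alpha)
    return omega_r, omega_c

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3, 4))           # 50 independent 3x4 matrices
omega_r, omega_c = mngm_flip_flop(X, alpha=0.1)
```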
Statistical models support medical research by facilitating individualized outcome prognostication conditional on independent variables or by estimating effects of risk factors adjusted for covariates. Theory of statistical models is well‐established if the set of independent variables to consider is fixed and small. Hence, we can assume that effect estimates are unbiased and the usual methods for confidence interval estimation are valid. In routine work, however, it is not known a priori which covariates should be included in a model, and often we are confronted with the number of candidate variables in the range 10–30. This number is often too large to be considered in a statistical model. We provide an overview of various available variable selection methods that are based on significance or information criteria, penalized likelihood, the change‐in‐estimate criterion, background knowledge, or combinations thereof. These methods were usually developed in the context of a linear regression model and then transferred to more generalized linear models or models for censored survival data. Variable selection, in particular if used in explanatory modeling where effect estimates are of central interest, can compromise stability of a final model, unbiasedness of regression coefficients, and validity of p‐values or confidence intervals. Therefore, we give pragmatic recommendations for the practicing statistician on application of variable selection methods in general (low‐dimensional) modeling problems and on performing stability investigations and inference. We also propose some quantities based on resampling the entire variable selection process to be routinely reported by software packages offering automated variable selection algorithms.
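As a concrete instance of the information-criterion route this abstract mentions, backward elimination by AIC in a linear model can be written in a few lines. The helper names and the no-intercept, least-squares setup are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def aic_linear(X, y):
    """AIC for a Gaussian linear model fit by least squares
    (up to an additive constant): n*log(RSS/n) + 2k."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss / n) + 2 * k

def backward_select(X, y):
    """Greedy backward elimination: repeatedly drop the column whose
    removal most improves AIC, stopping when no removal helps."""
    cols = list(range(X.shape[1]))
    while len(cols) > 1:
        current = aic_linear(X[:, cols], y)
        best, j = min((aic_linear(X[:, [c for c in cols if c != j]], y), j)
                      for j in cols)
        if best < current:
            cols.remove(j)
        else:
            break
    return cols

# toy data: only columns 0 and 1 carry signal
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.standard_normal(100)
selected = backward_select(X, y)
```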
Introduction
When a study sample includes a large proportion of long‐term survivors, mixture cure (MC) models that separately assess biomarker associations with long‐term recurrence‐free survival and time to disease recurrence are preferred to proportional‐hazards models. However, in samples with few recurrences, standard maximum likelihood can be biased.
Objective and Methods
We extend Firth‐type penalized likelihood (FT‐PL), developed for bias reduction in the exponential family, to the Weibull‐logistic MC model, using the Jeffreys invariant prior. Via simulation studies based on a motivating cohort study, we compare parameter estimates from the FT‐PL method with those from ML, as well as type 1 error (T1E) and power obtained using likelihood ratio statistics.
Results
In samples with relatively few events, the Firth‐type penalized likelihood estimates (FT‐PLEs) have mean bias closer to zero and smaller mean squared error than maximum likelihood estimates (MLEs), and can be obtained in samples where the MLEs are infinite. Under similar T1E rates, FT‐PL consistently exhibits higher statistical power than ML in samples with few events. In addition, we compare FT‐PL estimation with two other penalization methods (a log‐F prior method and a modified Firth‐type method) based on the same simulations.
Discussion
Consistent with findings for logistic and Cox regressions, FT‐PL under MC regression yields finite estimates under stringent conditions, and a better bias‐and‐variance balance than the other two penalizations. The practicality and strength of FT‐PL for MC analysis are illustrated in a cohort study of breast cancer prognosis with long‐term follow‐up for recurrence‐free survival.
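The Weibull-logistic mixture cure likelihood that the Firth-type penalty is applied to has a simple closed form: a logistic model for the probability of being susceptible (uncured) and a Weibull law for the susceptible failure times. A hedged sketch of the unpenalized log-likelihood, with an illustrative single-covariate parameterization and made-up names:

```python
import numpy as np

def mc_loglik(t, delta, x, gamma0, gamma1, shape, scale):
    """Weibull-logistic mixture cure log-likelihood (sketch).
    pi(x): probability of being susceptible, logistic in x;
    S_u, f_u: Weibull survival and density for susceptibles;
    delta: event indicator (1 = recurrence observed, 0 = censored)."""
    pi = 1.0 / (1.0 + np.exp(-(gamma0 + gamma1 * x)))
    S_u = np.exp(-(t / scale) ** shape)
    f_u = (shape / scale) * (t / scale) ** (shape - 1) * S_u
    # events contribute the susceptible density; censored subjects are
    # either cured (1 - pi) or susceptible but not yet failed (pi * S_u)
    return np.sum(delta * np.log(pi * f_u)
                  + (1 - delta) * np.log(1.0 - pi + pi * S_u))

t = np.array([1.2, 3.4, 0.7, 5.0])
delta = np.array([1, 0, 1, 0])
x = np.array([0.5, -1.0, 1.5, 0.0])
ll = mc_loglik(t, delta, x, gamma0=0.2, gamma1=0.5, shape=1.1, scale=2.0)
```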
The Spike-and-Slab LASSO
Ročková, Veronika; George, Edward I.
Journal of the American Statistical Association, January 2018, Volume 113, Issue 521
Journal Article
Peer reviewed
Despite the wide adoption of spike-and-slab methodology for Bayesian variable selection, its potential for penalized likelihood estimation has largely been overlooked. In this article, we bridge this gap by cross-fertilizing these two paradigms with the Spike-and-Slab LASSO procedure for variable selection and parameter estimation in linear regression. We introduce a new class of self-adaptive penalty functions that arise from a fully Bayes spike-and-slab formulation, ultimately moving beyond the separable penalty framework. A virtue of these nonseparable penalties is their ability to borrow strength across coordinates, adapt to ensemble sparsity information and exert multiplicity adjustment. The Spike-and-Slab LASSO procedure harvests efficient coordinate-wise implementations with a path-following scheme for dynamic posterior exploration. We show on simulated data that the fully Bayes penalty mimics oracle performance, providing a viable alternative to cross-validation. We develop theory for the separable and nonseparable variants of the penalty, showing rate-optimality of the global mode as well as optimal posterior concentration when p > n. Supplementary materials for this article are available online.
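The self-adaptive character of the Spike-and-Slab LASSO penalty comes from a weight that interpolates between the spike scale λ0 and the slab scale λ1 according to the conditional probability that a coefficient was drawn from the slab. A sketch of that weight plus a soft-threshold coordinate step; the scaling conventions here are an assumption, and the exact coordinate update in the paper differs in its thresholding rule:

```python
import numpy as np

def ssl_penalty_weight(beta, lam0, lam1, theta):
    """Adaptive SSL weight lam*(beta): Laplace(lam1) slab and
    Laplace(lam0) spike, mixed with prior slab probability theta."""
    psi1 = 0.5 * lam1 * np.exp(-lam1 * abs(beta))    # slab density
    psi0 = 0.5 * lam0 * np.exp(-lam0 * abs(beta))    # spike density
    p_star = theta * psi1 / (theta * psi1 + (1.0 - theta) * psi0)
    return lam1 * p_star + lam0 * (1.0 - p_star)

def ssl_coordinate_update(z, lam_star):
    # soft-thresholding with the adaptive weight (unit-scale design assumed)
    return np.sign(z) * max(abs(z) - lam_star, 0.0)

lam0, lam1, theta = 20.0, 1.0, 0.5
w_at_zero = ssl_penalty_weight(0.0, lam0, lam1, theta)   # near lam0: shrink hard
w_far_out = ssl_penalty_weight(2.0, lam0, lam1, theta)   # near lam1: shrink gently
b = ssl_coordinate_update(3.0, w_far_out)
```

Large coefficients are thus penalized almost like a light LASSO, while coefficients near zero feel nearly the full spike penalty, which is exactly the adaptivity the abstract describes.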
Modeling financial time series data poses a significant challenge in the realm of time series analysis. The Autoregressive Conditional Heteroskedasticity (ARCH) model stands out as a potent tool for capturing time-varying volatility and heteroskedasticity in financial data. However, conventional ARCH models display sensitivity to departures from normality, leading to the development of extensions employing more flexible distributions. In this context, we propose a robust enhancement to the mixture of ARCH (MoARCH) model by integrating normal mean–variance mixture (NMVM) distributions to model component errors. The stochastic representation of the proposed model allows for a straightforward implementation of an Expectation Conditional Maximization Either (ECME) algorithm for obtaining maximum penalized likelihood (MPL) estimates. To thoroughly evaluate the model, we conduct four simulation studies to explore finite-sample properties, assess the MPL estimators, scrutinize model robustness, and evaluate the accuracy of our proposal in fitting, clustering, and forecasting. Practical applications further highlight the effectiveness of our methodology, showcasing successful implementations across diverse real datasets.
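For readers less familiar with the base model being extended here, an ARCH(1) process makes today's conditional variance an affine function of yesterday's squared shock. A minimal Gaussian simulation (the NMVM component errors and mixture structure of the proposed MoARCH model are deliberately not reproduced):

```python
import numpy as np

def simulate_arch1(n, omega, alpha, rng):
    """Simulate an ARCH(1) series with Gaussian innovations:
    sigma_t^2 = omega + alpha * eps_{t-1}^2,  eps_t = sigma_t * z_t."""
    eps = np.zeros(n)
    sigma2 = np.zeros(n)
    sigma2[0] = omega / (1.0 - alpha)     # unconditional variance
    eps[0] = np.sqrt(sigma2[0]) * rng.standard_normal()
    for t in range(1, n):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2
        eps[t] = np.sqrt(sigma2[t]) * rng.standard_normal()
    return eps, sigma2

rng = np.random.default_rng(0)
eps, sigma2 = simulate_arch1(5000, omega=0.5, alpha=0.3, rng=rng)
```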
This paper is concerned with an important issue in finite mixture modeling, the selection of the number of mixing components. A new penalized likelihood method is proposed for finite multivariate Gaussian mixture models, and it is shown to be consistent in determining the number of components. A modified EM algorithm is developed to simultaneously select the number of components and estimate the mixing probabilities and the unknown parameters of the Gaussian distributions. Simulations and a data analysis are presented to illustrate the performance of the proposed method.
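The component-selection problem itself can be illustrated with a standard off-the-shelf baseline: scoring candidate component counts with BIC, a log(n)-penalized likelihood, under ordinary EM. This is a stand-in for, not an implementation of, the modified EM algorithm proposed in the paper:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# two well-separated Gaussian clusters in one dimension
rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(0.0, 1.0, 200),
                    rng.normal(10.0, 1.0, 200)]).reshape(-1, 1)

# fit mixtures with 1..5 components and keep the BIC-minimizing count
bic = {k: GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
       for k in range(1, 6)}
best_k = min(bic, key=bic.get)
```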
When the data are stored in a distributed manner, direct applications of traditional statistical inference procedures are often prohibitive due to communication costs and privacy concerns. This article develops and investigates two communication-efficient accurate statistical estimators (CEASE), implemented through iterative algorithms for distributed optimization. In each iteration, node machines carry out computation in parallel and communicate with the central processor, which then broadcasts aggregated information to node machines for new updates. The algorithms adapt to the similarity among loss functions on node machines, and converge rapidly when each node machine has a large enough sample size. Moreover, they do not require good initialization and enjoy linear convergence guarantees under general conditions. The contraction rate of optimization errors is presented explicitly, with dependence on the local sample size unveiled. In addition, the improved statistical accuracy per iteration is derived. By regarding the proposed method as a multistep statistical estimator, we show that statistical efficiency can be achieved in finite steps in typical statistical applications. In addition, we give the conditions under which the one-step CEASE estimator is statistically efficient. Extensive numerical experiments on both synthetic and real data validate the theoretical results and demonstrate the superior performance of our algorithms.
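The compute-in-parallel / aggregate / broadcast loop described above can be mimicked with a toy gradient-averaging scheme for distributed least squares. This is a much-simplified stand-in (names, step size, and equal machine sizes are assumptions), not the CEASE algorithm itself:

```python
import numpy as np

def distributed_gd(Xs, ys, lr=0.1, n_rounds=200):
    """One communication round per iteration: each node machine computes
    its local least-squares gradient, the center averages the gradients
    and broadcasts the updated parameter back to the nodes."""
    beta = np.zeros(Xs[0].shape[1])
    for _ in range(n_rounds):
        grads = [Xk.T @ (Xk @ beta - yk) / len(yk)   # local gradients
                 for Xk, yk in zip(Xs, ys)]
        beta -= lr * np.mean(grads, axis=0)           # aggregate + update
    return beta

# four "machines", each holding 100 observations of the same linear model
rng = np.random.default_rng(0)
true_beta = np.array([1.0, -2.0])
Xs = [rng.standard_normal((100, 2)) for _ in range(4)]
ys = [Xk @ true_beta + 0.1 * rng.standard_normal(100) for Xk in Xs]
beta_hat = distributed_gd(Xs, ys)
```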