• For q < 1, the q-Exponential log-likelihood may present monotone behavior.
• Two likelihood correction methods are evaluated.
• Guided use and validation of q-Exponential models in reliability data analysis.
• Correction based on Firth's penalization presented superior results.
• q-Exponential outperformed Weibull-based models in two real failure data sets.
Maintenance-related decisions are often based on the expected number of interventions during a specified period of time. The proper estimation of this quantity relies on the choice of the probabilistic model that best fits reliability-related data. In this context, the q-Exponential probability distribution has emerged as a promising alternative. It can model each of the three phases of the bathtub curve; however, for the wear-out phase, its usage may become difficult due to the “monotone likelihood problem”. Two correction methods (Firth's and resample-based) are considered and have their performances evaluated through numerical experiments. To aid the reliability analyst in applying the q-Exponential model, we devise a methodology involving original and corrected functions for point and interval estimates for the q-Exponential parameters and validation of the estimated models using the expected number of failures via Monte Carlo simulation and the bootstrapped Kolmogorov-Smirnov test. Two examples with failure data presenting increasing hazard rates are provided. The performances of the estimated q-Exponential, Weibull, q-Weibull and modified extended Weibull (MEW) models are compared. In both examples, the q-Exponential presented superior results, despite the increased flexibility of the q-Weibull and MEW distributions in modeling non-monotone hazard rates (e.g., bathtub-shaped).
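For reference, the q-Exponential density behind these models can be sketched as follows. This is a minimal illustration, not the paper's implementation: parameterizations vary across the literature, the common (2 − q)·λ·e_q(−λt) form is used here, and the function names are ours.

```python
import numpy as np

def q_exp(x, q):
    """Tsallis q-exponential e_q(x) = [1 + (1-q)x]_+^(1/(1-q)); e_1(x) = exp(x)."""
    if q == 1.0:
        return np.exp(x)
    return np.maximum(1.0 + (1.0 - q) * x, 0.0) ** (1.0 / (1.0 - q))

def qexp_pdf(t, q, rate):
    """q-Exponential density f(t) = (2-q)*rate*e_q(-rate*t), defined for q < 2.
    For q < 1 the support is bounded, [0, 1/(rate*(1-q))], which yields the
    increasing hazard rate relevant to the wear-out phase of the bathtub curve."""
    return (2.0 - q) * rate * q_exp(-rate * t, q)

# Sanity check: for q = 0.5 and rate = 1 the support is [0, 2] and the
# density should integrate to 1.
t = np.linspace(0.0, 2.0, 2001)
mass = np.sum(qexp_pdf(t, q=0.5, rate=1.0)) * (t[1] - t[0])
print(round(mass, 2))
```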
In logistic regression with nonignorable missing responses, Ibrahim and Lipsitz proposed a method for estimating regression parameters. It is known that the regression estimates obtained by using this method are biased when the sample size is small. Another complexity arises when the iterative estimation process encounters separation in estimating regression coefficients. In this article, we propose a method to improve the estimation of regression coefficients. In our likelihood-based method, we penalize the likelihood by multiplying it by a noninformative Jeffreys prior as a penalty term. The proposed method reduces bias and is able to handle the issue of separation. Simulation results show substantial bias reduction for the proposed method as compared to the existing method. Analyses using real-world data also support the simulation findings. An R package called brlrmr is developed implementing the proposed method and the Ibrahim and Lipsitz method.
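A Jeffreys-prior penalty of this kind can be sketched for plain logistic regression as follows. This is a minimal illustration only, not the missing-response method or the brlrmr implementation; function names and the toy data are ours.

```python
import numpy as np
from scipy.optimize import minimize

def firth_logistic_negloglik(beta, X, y):
    """Negative Firth-penalized log-likelihood for logistic regression:
    -[ l(beta) + 0.5 * log det I(beta) ], where I(beta) = X' W X is the
    Fisher information and W = diag(p * (1 - p))."""
    eta = X @ beta
    p = 1.0 / (1.0 + np.exp(-eta))
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
    fisher = X.T @ ((p * (1.0 - p))[:, None] * X)
    sign, logdet = np.linalg.slogdet(fisher)
    return -(loglik + 0.5 * logdet)

# Tiny completely separated data set: plain maximum likelihood would
# diverge here, but the penalized optimum stays finite.
X = np.column_stack([np.ones(6), np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])])
y = np.array([0, 0, 0, 1, 1, 1])
fit = minimize(firth_logistic_negloglik, x0=np.zeros(2), args=(X, y))
print(fit.x)  # finite intercept and slope estimates
```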
We consider the estimation of the prevalence of a rare disease, and of the log-odds ratio for two specified groups of individuals, from group testing data. For a low-prevalence disease, the maximum likelihood estimate of the log-odds ratio is severely biased. However, Firth correction to the score function leads to a considerable improvement of the estimator. Also, for a low-prevalence disease with an imperfect diagnostic test, group testing is found to yield a more precise estimate of the log-odds ratio than individual testing.
Summary
Presence-only data can be used to determine resource selection and estimate a species' distribution. Maximum likelihood is a common parameter estimation method used for species distribution models. Maximum likelihood estimates, however, do not always exist for a commonly used species distribution model – the Poisson point process.
We demonstrate the issue with conventional maximum likelihood mathematically, using a data example and a simulation experiment, and show alternative estimation methods.
We found that when habitat preferences are strong or the number of presence-only locations is small, maximum likelihood coefficient estimates for the Poisson point process model may, by chance, not exist. We found that several alternative estimation methods can produce reliable estimates, but results will depend on the chosen method.
It is important to identify conditions for which maximum likelihood estimates are unlikely to be identifiable from presence‐only data. In data sets where the maximum likelihood estimates do not exist, penalized likelihood and Bayesian methods will produce coefficient estimates, but these are sensitive to the choice of estimation procedure and prior or penalty term. When sample size is small or it is thought that habitat preferences are strong, we propose a suite of estimation procedures researchers can consider using.
We consider statistical methods for benchmarking clinical centers based on a dichotomous outcome indicator. Borrowing ideas from the causal inference literature, we aim to reveal how the entire study population would have fared under the current care level of each center. To this end, we evaluate direct standardization based on fixed versus random center effects outcome models that incorporate patient-specific baseline covariates to adjust for differential case-mix. We explore fixed effects (FE) regression with Firth correction and normal mixed effects (ME) regression to maintain convergence in the presence of very small centers. Moreover, we study doubly robust FE regression to avoid outcome model extrapolation. Simulation studies show that shrinkage following standard ME modeling can result in substantial power loss relative to the considered alternatives, especially for small centers. Results are consistent with findings in the analysis of 30-day mortality risk following acute stroke across 90 centers in the Swedish Stroke Register.
In recent years, numerous approaches for biomarker-based clinical trials have been developed. One of these developments is the multiple-biomarker trial, which aims to investigate multiple biomarkers simultaneously in independent subtrials. For low-prevalence biomarkers, small sample sizes within the subtrials have to be expected, as well as many biomarker-negative patients at the screening stage. The small sample sizes may make it unfeasible to analyze the subtrials individually. This imposes the need to develop new approaches for the analysis of such trials. With an expected large group of biomarker-negative patients, it seems reasonable to explore options to benefit from including them in such trials. We consider advantages and disadvantages of the inclusion of biomarker-negative patients in a multiple-biomarker trial with a survival endpoint. We discuss design options that include biomarker-negative patients in the study and address the issue of small sample size bias in such trials. We carry out a simulation study for a design where biomarker-negative patients are kept in the study and are treated with standard of care. We compare three different analysis approaches based on the Cox model to examine whether the inclusion of biomarker-negative patients can provide a benefit with respect to bias and variance of the treatment effect estimates. We apply the Firth correction to reduce the small sample size bias. The results of the simulation study suggest that for small sample situations, the Firth correction should be applied to adjust for the small sample size bias. In addition to the Firth penalty, the inclusion of biomarker-negative patients in the analysis can lead to further, but small, improvements in bias and standard deviation of the estimates.
When developing risk prediction models on datasets with limited sample size, shrinkage methods are recommended. Earlier studies showed that shrinkage results in better predictive performance on average. This simulation study aimed to investigate the variability of regression shrinkage on predictive performance for a binary outcome. We compared standard maximum likelihood with the following shrinkage methods: uniform shrinkage (likelihood-based and bootstrap-based), penalized maximum likelihood (ridge) methods, LASSO logistic regression, adaptive LASSO, and Firth's correction. In the simulation study, we varied the number of predictors and their strength, the correlation between predictors, the event rate of the outcome, and the events per variable. In terms of results, we focused on the calibration slope. The slope indicates whether risk predictions are too extreme (slope < 1) or not extreme enough (slope > 1). The results can be summarized into three main findings. First, shrinkage improved calibration slopes on average. Second, the between-sample variability of calibration slopes was often increased relative to maximum likelihood. In contrast to other shrinkage approaches, Firth's correction had a small shrinkage effect but showed low variability. Third, the correlation between the estimated shrinkage and the optimal shrinkage to remove overfitting was typically negative, with Firth's correction as the exception. We conclude that, despite improved performance on average, shrinkage often worked poorly in individual datasets, in particular when it was most needed. The results imply that shrinkage methods do not solve problems associated with small sample size or low number of events per variable.
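The calibration slope used as the main performance measure above can be estimated by refitting a logistic model with the validation-set linear predictor as the only covariate. A minimal sketch on simulated data (all names, settings, and the data-generating process are ours):

```python
import numpy as np
from scipy.optimize import minimize

def logit_fit(X, y):
    """Plain maximum-likelihood logistic regression via BFGS."""
    def nll(b):
        eta = X @ b
        return -np.sum(y * eta - np.logaddexp(0.0, eta))
    return minimize(nll, np.zeros(X.shape[1])).x

rng = np.random.default_rng(0)
n_dev, n_val, p = 80, 5000, 5
beta_true = np.r_[0.0, np.full(p, 0.5)]

def simulate(n):
    X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
    prob = 1.0 / (1.0 + np.exp(-X @ beta_true))
    return X, rng.binomial(1, prob)

# Fit on a small development sample, validate on a large one.
X_dev, y_dev = simulate(n_dev)
X_val, y_val = simulate(n_val)
beta_hat = logit_fit(X_dev, y_dev)

# Calibration slope: regress validation outcomes on the linear predictor.
lp = X_val @ beta_hat
slope = logit_fit(np.column_stack([np.ones(n_val), lp]), y_val)[1]
print(round(slope, 2))  # slope < 1 would indicate overfitting (too-extreme predictions)
```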
Abstract
Background
For finite samples with binary outcomes, penalized logistic regression such as ridge logistic regression has the potential of achieving smaller mean squared errors (MSE) of coefficients and predictions than maximum likelihood estimation. There is evidence, however, that ridge logistic regression can result in highly variable calibration slopes in small or sparse data situations.
Methods
In this paper, we elaborate this issue further by performing a comprehensive simulation study, investigating the performance of ridge logistic regression in terms of coefficients and predictions and comparing it to Firth’s correction that has been shown to perform well in low-dimensional settings. In addition to tuned ridge regression where the penalty strength is estimated from the data by minimizing some measure of the out-of-sample prediction error or information criterion, we also considered ridge regression with pre-specified degree of shrinkage. We included ‘oracle’ models in the simulation study in which the complexity parameter was chosen based on the true event probabilities (prediction oracle) or regression coefficients (explanation oracle) to demonstrate the capability of ridge regression if truth was known.
Results
Performance of ridge regression strongly depends on the choice of complexity parameter. As shown in our simulation and illustrated by a data example, values optimized in small or sparse datasets are negatively correlated with optimal values and suffer from substantial variability which translates into large MSE of coefficients and large variability of calibration slopes. In contrast, in our simulations pre-specifying the degree of shrinkage prior to fitting led to accurate coefficients and predictions even in non-ideal settings such as encountered in the context of rare outcomes or sparse predictors.
Conclusions
Applying tuned ridge regression in small or sparse datasets is problematic as it results in unstable coefficients and predictions. In contrast, determining the degree of shrinkage according to some meaningful prior assumptions about true effects has the potential to reduce bias and stabilize the estimates.
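The pre-specified-shrinkage alternative discussed above amounts to fixing the ridge penalty strength before fitting rather than tuning it from the data. A sketch on simulated data (names, settings, and the data-generating process are ours; stronger fixed penalties pull the coefficients further toward zero):

```python
import numpy as np
from scipy.optimize import minimize

def ridge_logistic(X, y, lam):
    """Logistic regression with a pre-specified L2 penalty lam on the
    non-intercept coefficients: minimize -loglik + 0.5*lam*||beta[1:]||^2."""
    def obj(b):
        eta = X @ b
        nll = -np.sum(y * eta - np.logaddexp(0.0, eta))
        return nll + 0.5 * lam * np.sum(b[1:] ** 2)
    return minimize(obj, np.zeros(X.shape[1])).x

rng = np.random.default_rng(1)
n, p = 40, 4                      # deliberately small sample
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
prob = 1.0 / (1.0 + np.exp(-(X @ np.r_[0.0, np.full(p, 0.8)])))
y = rng.binomial(1, prob)

# Larger pre-specified lam => more shrinkage of the coefficient vector.
norms = []
for lam in [0.0, 1.0, 10.0]:
    b = ridge_logistic(X, y, lam)
    norms.append(np.linalg.norm(b[1:]))
    print(lam, round(norms[-1], 3))
```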
The parameters of logistic regression models are usually obtained by the method of maximum likelihood (ML). However, in analyses of small data sets or data sets with unbalanced outcomes or exposures, ML parameter estimates may not exist. This situation has been termed 'separation' as the two outcome groups are separated by the values of a covariate or a linear combination of covariates. To overcome the problem of non-existing ML parameter estimates, applying Firth's correction (FC) was proposed. In practice, however, a principal investigator might be advised to 'bring more data' in order to solve a separation issue. We illustrate the problem by means of examples from colorectal cancer screening and ornithology. It is unclear if such an increasing sample size (ISS) strategy that keeps sampling new observations until separation is removed improves estimation compared to applying FC to the original data set. We performed an extensive simulation study where the main focus was to estimate the cost-adjusted relative efficiency of ML combined with ISS compared to FC. FC yielded reasonably small root mean squared errors and proved to be the more efficient estimator. Given our findings, we propose not to adapt the sample size when separation is encountered but to use FC as the default method of analysis whenever the number of observations or outcome events is critically low.
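The separation problem described above can be made concrete: under complete separation the logistic log-likelihood increases monotonically in the slope, so no finite maximum-likelihood estimate exists. A toy illustration (data and names are ours):

```python
import numpy as np

# Complete separation: every x < 0 has y = 0, every x > 0 has y = 1.
x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([0, 0, 0, 1, 1, 1])

def loglik(b):
    """Log-likelihood of a no-intercept logistic model P(y=1) = expit(b*x)."""
    eta = b * x
    return np.sum(y * eta - np.logaddexp(0.0, eta))

# The likelihood keeps increasing as the slope grows without bound, so the
# ML estimate is "at infinity" and iterative fitting never settles.
for b in [1, 5, 10, 50]:
    print(b, round(loglik(b), 4))
```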