Profiling or evaluation of health care providers, including hospitals and dialysis facilities, involves the application of hierarchical regression models to compare each provider's performance with respect to a patient outcome, such as unplanned 30-day hospital readmission. This is achieved by comparing a specific provider's estimate of the unplanned readmission rate, adjusted for patient case-mix, to a normative standard, typically defined as an "average" national readmission rate across all providers. Profiling is of national importance in the United States because Centers for Medicare and Medicaid Services (CMS) payment policy depends on provider performance, as part of a national strategy to improve the delivery and quality of patient care. Novel high-dimensional fixed effects (FE) models have been proposed for profiling dialysis facilities; they focus inference on the tail of the distribution of provider outcomes, which is well suited to the objective of identifying sub-standard ("extreme") performance. However, the extent to which estimation and inference procedures for FE profiling models remain effective when the outcome is sparse and/or there are relatively few patients within a provider, referred to here as the "low information" context, has not been examined. This scenario is common in practice when the patient outcome of interest is a cause-specific 30-day readmission, such as 30-day readmission due to infection in patients on dialysis, which occurs at a rate of only about 8%, compared with more than 30% for all-cause 30-day readmission. Thus, we examine the feasibility and effectiveness of profiling models in the low information context through simulation studies and propose a novel correction method for FE profiling models to better handle sparse outcome data.
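As a hedged illustration of the low-information issue (not the authors' method), the sketch below simulates within-provider binary outcomes at a roughly 8% event rate versus a roughly 30% rate and counts how often a provider has zero observed events; the provider count, panel size, and rates are assumed for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 100 providers with 25 patients each (assumed numbers).
n_providers, n_patients = 100, 25

def zero_event_providers(event_rate, n_sims=500):
    """Fraction of providers with no observed events, averaged over simulations."""
    events = rng.binomial(n_patients, event_rate, size=(n_sims, n_providers))
    return (events == 0).mean()

# Sparse, cause-specific outcome (~8%) vs. all-cause readmission (>30%).
for rate in (0.08, 0.30):
    print(f"event rate {rate:.0%}: "
          f"{zero_event_providers(rate):.1%} of providers have zero events")
```

Providers with zero observed events have provider-level fixed effects that are not finitely estimable by ordinary maximum likelihood (an analogue of separation), which is what motivates a correction in the sparse-outcome setting.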
• For q < 1, the q-Exponential log-likelihood may present monotone behavior.
• Two likelihood correction methods are evaluated.
• Guided use and validation of q-Exponential models in reliability data analysis.
• Correction based on Firth's penalization presented superior results.
• q-Exponential outperformed Weibull-based models in two real failure data sets.
Maintenance-related decisions are often based on the expected number of interventions during a specified period of time. The proper estimation of this quantity relies on the choice of the probabilistic model that best fits reliability-related data. In this context, the q-Exponential probability distribution has emerged as a promising alternative. It can model each of the three phases of the bathtub curve; however, for the wear-out phase, its usage may become difficult due to the “monotone likelihood problem”. Two correction methods (Firth's and resample-based) are considered and have their performances evaluated through numerical experiments. To aid the reliability analyst in applying the q-Exponential model, we devise a methodology involving original and corrected functions for point and interval estimates for the q-Exponential parameters and validation of the estimated models using the expected number of failures via Monte Carlo simulation and the bootstrapped Kolmogorov-Smirnov test. Two examples with failure data presenting increasing hazard rates are provided. The performances of the estimated q-Exponential, Weibull, q-Weibull and modified extended Weibull (MEW) models are compared. In both examples, the q-Exponential presented superior results, despite the increased flexibility of the q-Weibull and MEW distributions in modeling non-monotone hazard rates (e.g., bathtub-shaped).
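As a rough, hedged sketch (not the paper's implementation), the code below writes the uncorrected log-likelihood for one common parameterization of the q-Exponential density, f(t) = ((2 - q)/η) [1 - (1 - q) t/η]^(1/(1-q)), and maximizes it numerically; for q < 1 the support is bounded at η/(1 - q), the increasing-hazard (wear-out) regime in which the monotone likelihood problem can arise. The failure times and starting values are assumed for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical failure times (arbitrary units), assumed for illustration only.
t = np.array([12.0, 18.5, 22.1, 27.3, 30.0, 33.8, 36.2, 40.5])

def neg_loglik(params):
    """Negative log-likelihood of the q-Exponential:
    f(t) = ((2 - q) / eta) * [1 - (1 - q) * t / eta]^(1 / (1 - q)), q < 2, eta > 0.
    For q < 1 the support is [0, eta / (1 - q)]."""
    q, eta = params
    if not (q < 2 and eta > 0):
        return np.inf
    if abs(1.0 - q) < 1e-9:          # guard against division by zero at q = 1
        return np.inf
    core = 1.0 - (1.0 - q) * t / eta
    if np.any(core <= 0):            # an observation lies outside the support
        return np.inf
    loglik = len(t) * np.log((2.0 - q) / eta) + np.sum(np.log(core)) / (1.0 - q)
    return -loglik

res = minimize(neg_loglik, x0=[0.5, 50.0], method="Nelder-Mead")
print("uncorrected estimates (q, eta):", res.x)
```

Firth's general approach would add half the log-determinant of the Fisher information to this log-likelihood before maximizing; the resample-based correction evaluated in the paper is not sketched here.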
In logistic regression with nonignorable missing responses, Ibrahim and Lipsitz proposed a method for estimating regression parameters. It is known that the regression estimates obtained by using this method are biased when the sample size is small. Also, another complexity arises when the iterative estimation process encounters separation in estimating regression coefficients. In this article, we propose a method to improve the estimation of regression coefficients. In our likelihood-based method, we penalize the likelihood by multiplying it by a noninformative Jeffreys prior as a penalty term. The proposed method reduces bias and is able to handle the issue of separation. Simulation results show substantial bias reduction for the proposed method as compared to the existing method. Analyses using real world data also support the simulation findings. An R package called brlrmr is developed implementing the proposed method and the Ibrahim and Lipsitz method.
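The Jeffreys-prior penalty itself, without the machinery for nonignorable missing responses that the proposed method also requires, can be sketched as follows: the log-likelihood is augmented by ½ log|X'WX|, which is equivalent to multiplying the likelihood by the Jeffreys prior (Firth's penalization for logistic regression). The data and names below are illustrative assumptions; the actual implementation is in the brlrmr package.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def penalized_neg_loglik(beta, X, y):
    """Negative Firth-penalized log-likelihood for logistic regression:
    l*(beta) = sum[y*log(p) + (1-y)*log(1-p)] + 0.5 * log|X' W X|,
    where W = diag(p * (1 - p)); the penalty is the log of the Jeffreys prior."""
    p = expit(X @ beta)
    p = np.clip(p, 1e-10, 1 - 1e-10)              # numerical safety
    loglik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    W = p * (1 - p)
    _, logdet = np.linalg.slogdet(X.T @ (X * W[:, None]))
    return -(loglik + 0.5 * logdet)

# Hypothetical data exhibiting complete separation (assumed for illustration):
# ordinary maximum likelihood diverges here, while the penalized fit stays finite.
X = np.column_stack([np.ones(8), np.array([-3, -2, -2, -1, 1, 2, 2, 3], float)])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

fit = minimize(penalized_neg_loglik, x0=np.zeros(2), args=(X, y), method="BFGS")
print("Firth-penalized coefficients:", fit.x)
```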
We consider the estimation of the prevalence of a rare disease, and of the log‐odds ratio for two specified groups of individuals, from group testing data. For a low‐prevalence disease, the maximum likelihood estimate of the log‐odds ratio is severely biased. However, Firth correction to the score function leads to a considerable improvement of the estimator. Also, for a low‐prevalence disease, if the diagnostic test is imperfect, group testing is found to yield a more precise estimate of the log‐odds ratio than individual testing.
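A minimal sketch of the group-testing likelihood idea (prevalence estimation only, without the Firth correction or the two-group log-odds-ratio comparison): with pools of size k and a perfect assay, a pool tests positive with probability 1 - (1 - p)^k, and the maximum likelihood estimate of p has a closed form. The pool size and counts below are assumed.

```python
def prevalence_mle(n_pools: int, n_positive_pools: int, pool_size: int) -> float:
    """MLE of disease prevalence p from group testing with a perfect assay:
    each pool of `pool_size` individuals tests positive with probability
    1 - (1 - p)**pool_size, so p_hat = 1 - (1 - y/m)**(1/pool_size)."""
    prop_positive = n_positive_pools / n_pools
    return 1.0 - (1.0 - prop_positive) ** (1.0 / pool_size)

# Hypothetical example: 200 pools of 10 individuals each, 15 pools test positive.
print(round(prevalence_mle(200, 15, 10), 4))   # ~0.0078
```

With an imperfect assay, sensitivity (Se) and specificity (Sp) enter the pool-level positivity probability as P(positive) = Se·[1 - (1 - p)^k] + (1 - Sp)·(1 - p)^k.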
Summary
Presence‐only data can be used to determine resource selection and estimate a species’ distribution. Maximum likelihood is a common parameter estimation method used for species distribution models. Maximum likelihood estimates, however, do not always exist for a commonly used species distribution model – the Poisson point process.
We demonstrate the issue with conventional maximum likelihood mathematically, with a data example and a simulation experiment, and we show alternative estimation methods.
We found that when habitat preferences are strong or the number of presence‐only locations is small, maximum likelihood coefficient estimates for the Poisson point process model may, by chance, not exist (a minimal numerical illustration follows this summary). We found that several alternative estimation methods can produce reliable estimates, but results will depend on the chosen method.
It is important to identify conditions for which maximum likelihood estimates are unlikely to be identifiable from presence‐only data. In data sets where the maximum likelihood estimates do not exist, penalized likelihood and Bayesian methods will produce coefficient estimates, but these are sensitive to the choice of estimation procedure and prior or penalty term. When sample size is small or it is thought that habitat preferences are strong, we propose a suite of estimation procedures researchers can consider using.
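As a hedged numerical illustration (not the paper's data or code), the sketch below evaluates the Poisson point process log-likelihood on a one-dimensional region with a single covariate in the extreme case where every presence record sits at the covariate maximum; the profile log-likelihood keeps increasing with the slope coefficient, so no finite maximum likelihood estimate exists. All numbers are assumed.

```python
import numpy as np

# 1-D study region [0, 1] with a single covariate x(s) = s and
# log-intensity b0 + b1 * x(s)  (all numbers assumed for illustration).
grid = np.linspace(0.0, 1.0, 2001)
dx = grid[1] - grid[0]

# Extreme "strong habitat preference" case: every presence record sits at the
# covariate maximum.
presences = np.full(4, 1.0)
n = len(presences)

def profile_loglik(b1):
    """Poisson point process log-likelihood with the intercept b0 profiled out
    (b0 chosen so the expected point count equals the observed count n); the
    spatial integral is approximated by a Riemann sum over the grid."""
    integral = np.sum(np.exp(b1 * grid)) * dx     # integral of exp(b1 * x(s)) ds
    b0 = np.log(n) - np.log(integral)
    return n * b0 + b1 * presences.sum() - n

for b1 in (0.0, 1.0, 10.0, 50.0, 200.0):
    print(f"b1 = {b1:6.1f}   profile log-likelihood = {profile_loglik(b1):8.3f}")
# The profile log-likelihood increases without bound in b1, so no finite MLE exists.
```

Adding a penalty or a proper prior on b1 bounds this profile, which is why penalized likelihood and Bayesian methods still return coefficient estimates in such data sets.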
We consider statistical methods for benchmarking clinical centers based on a dichotomous outcome indicator. Borrowing ideas from the causal inference literature, we aim to reveal how the entire study population would have fared under the current care level of each center. To this end, we evaluate direct standardization based on fixed versus random center effects outcome models that incorporate patient-specific baseline covariates to adjust for differential case-mix. We explore fixed effects (FE) regression with Firth correction and normal mixed effects (ME) regression to maintain convergence in the presence of very small centers. Moreover, we study doubly robust FE regression to avoid outcome model extrapolation. Simulation studies show that shrinkage following standard ME modeling can result in substantial power loss relative to the considered alternatives, especially for small centers. Results are consistent with findings in the analysis of 30-day mortality risk following acute stroke across 90 centers in the Swedish Stroke Register.
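The direct standardization step can be sketched as follows (a minimal version, assuming a fixed effects logistic outcome model has already been fitted): for each center, every patient in the entire study population is assigned that center's effect while keeping the observed case-mix covariates, and the predicted risks are averaged. The names and numbers below are illustrative assumptions; in the paper the fitted quantities would come from Firth-corrected FE regression or its doubly robust extension rather than ordinary maximum likelihood.

```python
import numpy as np
from scipy.special import expit

def directly_standardized_risk(center_effect, X, beta):
    """Average predicted risk if the entire study population were treated at a
    center with fixed effect `center_effect`, given case-mix covariates X and
    fitted covariate coefficients beta from a logistic outcome model."""
    return expit(center_effect + X @ beta).mean()

# Hypothetical fitted quantities: 2 case-mix covariates, 3 centers.
rng = np.random.default_rng(1)
X_all = rng.normal(size=(5000, 2))                   # pooled covariates, all patients
beta_hat = np.array([0.8, -0.5])                     # assumed fitted covariate effects
center_effects = {"A": -2.2, "B": -1.9, "C": -1.4}   # assumed fitted center effects

for center, alpha in center_effects.items():
    print(center, round(directly_standardized_risk(alpha, X_all, beta_hat), 3))
```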
In recent years, numerous approaches for biomarker‐based clinical trials have been developed. One of these developments is the multiple‐biomarker trial, which aims to investigate multiple biomarkers simultaneously in independent subtrials. For low‐prevalence biomarkers, small sample sizes within the subtrials have to be expected, as well as many biomarker‐negative patients at the screening stage. The small sample sizes may make it unfeasible to analyze the subtrials individually. This imposes the need to develop new approaches for the analysis of such trials. With an expected large group of biomarker‐negative patients, it seems reasonable to explore options to benefit from including them in such trials. We consider advantages and disadvantages of the inclusion of biomarker‐negative patients in a multiple‐biomarker trial with a survival endpoint. We discuss design options that include biomarker‐negative patients in the study and address the issue of small sample size bias in such trials. We carry out a simulation study for a design where biomarker‐negative patients are kept in the study and are treated with standard of care. We compare three different analysis approaches based on the Cox model to examine whether the inclusion of biomarker‐negative patients can provide a benefit with respect to bias and variance of the treatment effect estimates. We apply the Firth correction to reduce the small sample size bias. The results of the simulation study suggest that for small sample situations, the Firth correction should be applied to adjust for the small sample size bias. In addition to the Firth penalty, the inclusion of biomarker‐negative patients in the analysis can lead to further, but small, improvements in bias and standard deviation of the estimates.
When developing risk prediction models on datasets with limited sample size, shrinkage methods are recommended. Earlier studies showed that shrinkage results in better predictive performance on average. This simulation study aimed to investigate the variability of regression shrinkage on predictive performance for a binary outcome. We compared standard maximum likelihood with the following shrinkage methods: uniform shrinkage (likelihood-based and bootstrap-based), penalized maximum likelihood (ridge) methods, LASSO logistic regression, adaptive LASSO, and Firth’s correction. In the simulation study, we varied the number of predictors and their strength, the correlation between predictors, the event rate of the outcome, and the events per variable. In terms of results, we focused on the calibration slope. The slope indicates whether risk predictions are too extreme (slope < 1) or not extreme enough (slope > 1). The results can be summarized into three main findings. First, shrinkage improved calibration slopes on average. Second, the between-sample variability of calibration slopes was often increased relative to maximum likelihood. In contrast to other shrinkage approaches, Firth’s correction had a small shrinkage effect but showed low variability. Third, the correlation between the estimated shrinkage and the optimal shrinkage to remove overfitting was typically negative, with Firth’s correction as the exception. We conclude that, despite improved performance on average, shrinkage often worked poorly in individual datasets, in particular when it was most needed. The results imply that shrinkage methods do not solve problems associated with small sample size or low number of events per variable.
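As a small, hedged illustration of the performance measure used in this study (not of the shrinkage methods themselves), the calibration slope can be computed by regressing the observed binary outcome on the logit of the predicted risks in new data; a slope below 1 indicates predictions that are too extreme. The data below are simulated assumptions.

```python
import numpy as np
import statsmodels.api as sm
from scipy.special import logit, expit

rng = np.random.default_rng(2)

# Hypothetical validation data: outcomes generated from a true linear predictor,
# and predictions from an "overfitted" model whose linear predictor is 1.5x too
# extreme, so the calibration slope should fall below 1.
true_lp = rng.normal(-1.5, 1.0, size=2000)
y = rng.binomial(1, expit(true_lp))
pred_prob = expit(1.5 * true_lp)

# Calibration slope: logistic regression of the outcome on the logit of the predictions.
lp = logit(np.clip(pred_prob, 1e-10, 1 - 1e-10))
slope = sm.Logit(y, sm.add_constant(lp)).fit(disp=0).params[1]
print(f"calibration slope: {slope:.2f}")   # expected to be below 1 here
```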