Abstract
To extend previous simulations on the performance of propensity score (PS) weighting and trimming methods to settings with and without unmeasured confounding, Poisson outcomes, and various strengths of treatment prediction (PS c statistic), we simulated studies with a binary intended treatment T as a function of 4 measured covariates. We mimicked withheld treatment and last-resort treatment by adding 2 “unmeasured” dichotomous factors that directed treatment to change for some patients in both tails of the PS distribution. The number of outcomes Y was simulated as a Poisson function of T and confounders. We estimated the PS as a function of the measured covariates and trimmed the tails of the PS distribution using 3 strategies (“Crump,” “Stürmer,” and “Walker”). After trimming and re-estimation, we used alternative PS weights to estimate the treatment effect (rate ratio): inverse probability of treatment weighting (IPTW), standardized mortality ratio (SMR)-treated, SMR-untreated, average treatment effect in the overlap population (ATO), matching, and entropy. With no unmeasured confounding, ATO weighting (123%) and “Crump” trimming (112%) improved relative efficiency compared with untrimmed IPTW. With unmeasured confounding, untrimmed estimates were biased irrespective of weighting method, and only Stürmer and Walker trimming consistently reduced bias. In settings where unmeasured confounding (e.g., frailty) may lead physicians to withhold treatment, Stürmer and Walker trimming should be considered before the primary analysis.
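The simulation design above lends itself to a compact sketch. The Python code below is illustrative only, not the authors' code: it simulates one dataset with measured confounding, applies Stürmer-style asymmetric trimming with PS re-estimation, and contrasts IPT and ATO weighting in a weighted Poisson model. The sample size, all coefficients, and the true rate ratio are assumptions.

```python
# Illustrative sketch only: simulate measured confounding, apply
# Stürmer-style asymmetric PS trimming with re-estimation, and compare
# IPT vs. ATO weighting in a weighted Poisson outcome model.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 20_000
X = rng.normal(size=(n, 4))                            # 4 measured covariates
T = rng.binomial(1, 1 / (1 + np.exp(-0.4 * X.sum(axis=1))))
rate = np.exp(-2.0 - 0.3 * T + 0.3 * X.sum(axis=1))    # true rate ratio = exp(-0.3)
Y = rng.poisson(rate)

def fit_ps(X, T):
    """Estimate the PS by logistic regression on the measured covariates."""
    design = sm.add_constant(X)
    return sm.Logit(T, design).fit(disp=0).predict(design)

ps = fit_ps(X, T)
# Stürmer trimming: drop subjects below the 5th PS percentile of the
# treated or above the 95th PS percentile of the untreated, then re-estimate.
lo, hi = np.percentile(ps[T == 1], 5), np.percentile(ps[T == 0], 95)
keep = (ps >= lo) & (ps <= hi)
X, T, Y = X[keep], T[keep], Y[keep]
ps = fit_ps(X, T)

weights = {
    "IPTW": np.where(T == 1, 1 / ps, 1 / (1 - ps)),    # inverse probability of treatment
    "ATO": np.where(T == 1, 1 - ps, ps),               # overlap weights
}
for name, w in weights.items():
    glm = sm.GLM(Y, sm.add_constant(T), family=sm.families.Poisson(),
                 freq_weights=w).fit()
    print(name, "rate ratio:", np.exp(glm.params[1]))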
The high-dimensional propensity score is a semiautomated variable selection algorithm that can supplement expert knowledge to improve confounding control in nonexperimental medical studies utilizing electronic healthcare databases. Although the algorithm can be used to generate hundreds of patient-level variables and rank them by their potential confounding impact, it remains unclear how to select the optimal number of variables for adjustment. We used plasmode simulations based on empirical data to discuss and evaluate data-adaptive approaches for variable selection and prediction modeling that can be combined with the high-dimensional propensity score to improve confounding control in large healthcare databases. We considered approaches that combine the high-dimensional propensity score with Super Learner prediction modeling, a scalable version of collaborative targeted maximum-likelihood estimation, and penalized regression. We evaluated performance using bias and mean squared error (MSE) in effect estimates. Results showed that the high-dimensional propensity score can be sensitive to the number of variables included for adjustment and that severe overfitting of the propensity score model can negatively impact the properties of effect estimates. Combining the high-dimensional propensity score with Super Learner was the most consistent strategy, in terms of reducing bias and MSE in the effect estimates, and may be promising for semiautomated data-adaptive propensity score estimation in high-dimensional covariate datasets.
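As a concrete illustration of the high-dimensional propensity score plus Super Learner strategy, the sketch below uses scikit-learn's StackingClassifier as a stand-in for a Super Learner to estimate the PS from the top-ranked hdPS covariates. The candidate learners, the number of retained variables k, and all variable names are assumptions; the internal cross-fitting (cv=5) is what guards against the PS overfitting highlighted above.

```python
# Hedged sketch: stacked ensemble PS estimation on hdPS-ranked covariates.
# StackingClassifier is used here as a stand-in for Super Learner.
import numpy as np
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression

def superlearner_ps(X_ranked, T, k=200):
    """Estimate the PS from the top-k hdPS-ranked covariates (columns
    assumed sorted by hdPS confounding-impact rank)."""
    Xk = X_ranked[:, :k]
    sl = StackingClassifier(
        estimators=[
            ("logit", LogisticRegression(max_iter=1000)),
            ("rf", RandomForestClassifier(n_estimators=200, min_samples_leaf=50)),
        ],
        final_estimator=LogisticRegression(),
        stack_method="predict_proba",
        cv=5,  # cross-fitted level-one predictions limit overfitting
    )
    sl.fit(Xk, T)
    return sl.predict_proba(Xk)[:, 1]
```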
To determine the impact of electronic health record (EHR)-discontinuity on the performance of prediction models.
The study population consisted of patients with a history of cardiovascular (CV) comorbidities identified in US Medicare claims data from 2007 to 2017, linked to EHRs from two networks (used as the model training and validation sets, respectively). We built models predicting the one-year risk of mortality, major CV events, and major bleeding events, stratified by high vs. low algorithm-predicted EHR-continuity. The best-performing model for each outcome was chosen from among 5 commonly used machine-learning models. We compared model performance using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC).
Based on 180,950 patients in the training set and 103,061 in the validation set, we found that the EHR captured only 21.0-28.1% of all non-fatal outcomes in the low EHR-continuity cohort but 55.4-66.1% of those in the high EHR-continuity cohort. In the validation set, the best-performing model developed among high EHR-continuity patients had consistently higher AUROC than the model based on low-continuity patients: 0.849 vs. 0.743 when predicting mortality, 0.802 vs. 0.659 when predicting major CV events, and 0.635 vs. 0.567 when predicting major bleeding. We observed a similar pattern when using AUPRC as the performance metric.
Among patients with CV comorbidities, prediction models for mortality, major CV events, and bleeding outcomes developed in datasets with low EHR-continuity consistently performed worse than models developed in datasets with high EHR-continuity.
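For reference, the two discrimination metrics compared in this study can be computed as below (a minimal sketch using scikit-learn; y_true and y_score are placeholders). AUPRC can be more informative than AUROC when outcomes are rare, which is relevant given the incomplete outcome capture reported above.

```python
# Minimal sketch of the two discrimination metrics used in the study.
from sklearn.metrics import roc_auc_score, average_precision_score

def discrimination(y_true, y_score):
    """y_true: binary outcomes; y_score: predicted one-year risks."""
    return {
        "AUROC": roc_auc_score(y_true, y_score),
        # average precision summarizes the precision-recall curve (AUPRC)
        "AUPRC": average_precision_score(y_true, y_score),
    }
```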
The covariate-balancing propensity score (CBPS) extends logistic regression to simultaneously optimize covariate balance and treatment prediction. Although the CBPS has been shown to perform well in certain settings, its performance has not been evaluated in settings specific to pharmacoepidemiology and large database research. In this study, we use both simulations and empirical data to compare the performance of the CBPS with logistic regression and boosted classification and regression trees. We simulated various degrees of model misspecification to evaluate the robustness of each propensity score (PS) estimation method. We then applied these methods to compare the effect of initiating glucagon-like peptide-1 agonists versus sulfonylureas on cardiovascular events and all-cause mortality in the US Medicare population in 2007-2009. In simulations, the CBPS was generally more robust in balancing covariates and reducing bias than misspecified logistic PS models and boosted classification and regression trees. All PS estimation methods performed similarly in the empirical example. For settings common to pharmacoepidemiology, logistic regression with balance checks to assess model specification is a valid method for PS estimation, but it can require refitting multiple models until covariate balance is achieved. The CBPS is a promising method to improve the robustness of PS models.
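For readers unfamiliar with the method, the defining feature of the CBPS (in the formulation of Imai and Ratkovic) is that the logistic model parameters β are estimated by generalized method of moments so that, in addition to the usual score equations, the inverse-probability-weighted covariate means balance between treatment groups:

$$\mathbb{E}\!\left[\frac{T_i\,\tilde{X}_i}{\pi_\beta(X_i)} - \frac{(1-T_i)\,\tilde{X}_i}{1-\pi_\beta(X_i)}\right] = 0,$$

where $\pi_\beta(X_i)$ is the logistic PS and $\tilde{X}_i$ denotes the covariate functions chosen for balancing. Misspecification of $\pi_\beta$ is thus penalized directly through imbalance, which is one intuition for the robustness observed in the simulations.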
Abstract
To examine methodologies that address imbalanced treatment switching and censoring, 6 analytical approaches were evaluated under a comparative effectiveness framework: intention-to-treat, as-treated, intention-to-treat with censor-weighting, as-treated with censor-weighting, time-varying exposure, and time-varying exposure with censor-weighting. Marginal structural models were employed to address time-varying exposure, confounding, and possibly informative censoring in an administrative data set of adult patients who were hospitalized with acute coronary syndrome and treated with either clopidogrel or ticagrelor. The effectiveness endpoint was the first occurrence of death, myocardial infarction, or stroke. These methodologies were then applied across simulated data sets with varying frequencies of treatment switching and censoring to compare the effect estimates of each analysis. The findings suggest that the choice of analytical approach affects the point estimate and interpretation of the analysis, especially when censoring is highly unbalanced.
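To make the censor-weighting concrete: in marginal structural model analyses, informative censoring is commonly handled with stabilized inverse-probability-of-censoring weights of the form below (one standard formulation; the exact covariate sets used in this study are not specified here):

$$SW^{C}_i(t) = \prod_{k=0}^{t} \frac{\Pr\left(C_k = 0 \mid \bar{C}_{k-1} = 0,\ \bar{A}_{k-1},\ V_i\right)}{\Pr\left(C_k = 0 \mid \bar{C}_{k-1} = 0,\ \bar{A}_{k-1},\ \bar{L}_{k}\right)},$$

where $C_k$ indicates censoring by interval $k$, $\bar{A}_{k-1}$ is treatment history, $V_i$ are baseline covariates, and $\bar{L}_k$ is the time-varying covariate history. Weighting uncensored person-time by $SW^{C}_i(t)$ reconstructs the population that would have been observed absent informative censoring.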
Introduction
When a new drug is to be developed, its desired properties are described in a target product profile.
Objective
We propose a framework for using real-world data to measure the disease-specific costs of the current standard of care and then to project the costs of the proposed new product for early data-driven portfolio decisions to select drug candidates for development.
Methods
We sampled from a cohort of patients representing the current standard of care to generate a hypothetical cohort of patients that fits a given target product profile for a new (hypothetical) treatment. The healthcare costs were determined and compared between standard of care and the new treatment. The approach differed according to the number of outcomes defined in the target product profile, and the cases for one, two, and three outcome variables are described.
Results
Based on an assumed hypothetical treatment effect, absolute risk and cost reductions were estimated in a worked example. The median costs per day for one patient were estimated to be $10.37 and $8.39 in the original and hypothetical cohorts, respectively. This means that the assumed target product profile would result in cost savings of $1.98 per patient per day, not accounting for any additional drug costs.
Conclusions
We present a simple approach to assess the potential absolute clinical and economic benefit of a new drug based on real-world data and its target product profile. The approach allows for early data-driven portfolio decisions to select drug candidates based on their expected cost savings.
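The sampling step described in the Methods can be sketched as follows. This is a hedged illustration, not the authors' code: it forms a hypothetical cohort matching a target-product-profile (TPP) assumed relative risk reduction by preferentially resampling outcome-free patients, then compares median daily costs. The 30% reduction and all variable names are assumptions.

```python
# Hedged sketch of the TPP resampling idea: outcome patients are retained
# with probability (1 - rrr), mimicking the lower event risk promised by
# the target product profile; cost savings are the difference in medians.
import numpy as np

def tpp_cohort_saving(cost_per_day, had_outcome, rrr=0.30, seed=0):
    rng = np.random.default_rng(seed)
    keep = (had_outcome == 0) | (rng.random(len(had_outcome)) < (1 - rrr))
    return np.median(cost_per_day) - np.median(cost_per_day[keep])

# In the worked example above, median costs of $10.37 (standard of care)
# vs. $8.39 (hypothetical cohort) imply a saving of $1.98 per patient-day.
```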
Abstract
Increasing emphasis on the use of real-world evidence (RWE) to support clinical policy and regulatory decision-making has led to a proliferation of guidance, advice, and frameworks from regulatory agencies, academia, professional societies, and industry. A broad spectrum of studies use real-world data (RWD) to produce RWE, ranging from randomized trials with outcomes assessed using RWD to fully observational studies. Yet, many proposals for generating RWE lack sufficient detail, and many analyses of RWD suffer from implausible assumptions, other methodological flaws, or inappropriate interpretations. The Causal Roadmap is an explicit, itemized, iterative process that guides investigators to prespecify study design and analysis plans; it addresses a wide range of guidance within a single framework. By supporting the transparent evaluation of causal assumptions and facilitating objective comparisons of design and analysis choices based on prespecified criteria, the Roadmap can help investigators to evaluate the quality of evidence that a given study is likely to produce, specify a study to generate high-quality RWE, and communicate effectively with regulatory agencies and other stakeholders. This paper aims to disseminate and extend the Causal Roadmap framework for use by clinical and translational researchers; three companion papers demonstrate applications of the Causal Roadmap for specific use cases.
Real-world data, such as administrative claims and electronic health records, are increasingly used for safety monitoring and to help guide regulatory decision-making. In these settings, it is important to document analytic decisions transparently and objectively to assess and ensure that analyses meet their intended goals.
The Causal Roadmap is an established framework that can guide and document analytic decisions through each step of the analytic pipeline, which will help investigators generate high-quality real-world evidence.
In this paper, we illustrate the utility of the Causal Roadmap using two case studies previously led by workgroups sponsored by the Sentinel Initiative, a program for actively monitoring the safety of regulated medical products. Each case study focuses on a different aspect of the analytic pipeline for drug safety monitoring. The first case study shows how the Causal Roadmap encourages transparency, reproducibility, and objective decision-making for causal analyses. The second case study highlights how the framework can guide analytic decisions beyond inference on causal parameters, improving outcome ascertainment in clinical phenotyping.
These examples provide a structured approach to implementing the Causal Roadmap in safety surveillance and support transparent, reproducible, and objective analyses.