Randomized clinical trials (RCTs) are conducted to guide clinicians' selection of therapies for individual patients. Currently, RCTs in critical care often report an overall mean effect and selected ...individual subgroups. Yet work in other fields suggests that such reporting practices can be improved. Specifically, this Critical Care Perspective reviews recent work on so-called "heterogeneity of treatment effect" (HTE) by baseline risk and extends that work to examine its applicability to trials of acute respiratory failure and severe sepsis. Because patients in RCTs in critical care medicine (and patients in intensive care units) have wide variability in their risk of death, these patients will have wide variability in the absolute benefit that they can derive from a given therapy. If the side effects of the therapy are not perfectly collinear with the treatment benefits, this will result in HTE, where different patients experience quite different expected benefits of a therapy. We use simulations of RCTs to demonstrate that such HTE could result in apparent paradoxes, including: (1) positive trials of therapies that are beneficial overall but consistently harm or have little benefit to low-risk patients who met enrollment criteria, and (2) overall negative trials of therapies that still consistently benefit high-risk patients. We further show that these results persist even in the presence of causes of death unmodified by the treatment under study. These results have implications for reporting and analyzing RCT data, both to better understand how our therapies work and to improve the bedside applicability of RCTs. We suggest a plan for measurement in future RCTs in the critically ill.
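The arithmetic behind HTE by baseline risk can be sketched in a few lines. This is a minimal illustration, not the simulation used in this Perspective: the relative risk reduction (0.25), the fixed treatment-related harm (0.02), and the three baseline risks are hypothetical values chosen only to show how a constant relative effect plus a constant harm yields different net effects at different baseline risks.

```python
def net_absolute_benefit(baseline_risk, relative_risk_reduction, absolute_harm):
    """Net absolute reduction in risk of death; positive means net benefit."""
    # Absolute benefit scales with baseline risk when the relative effect is constant.
    absolute_benefit = baseline_risk * relative_risk_reduction
    # Harm is assumed (hypothetically) to be the same for every patient.
    return absolute_benefit - absolute_harm


# Hypothetical inputs, not taken from any trial.
RRR = 0.25
HARM = 0.02
for baseline_risk in (0.05, 0.20, 0.60):
    net = net_absolute_benefit(baseline_risk, RRR, HARM)
    label = "net benefit" if net > 0 else "net harm"
    print(f"baseline risk {baseline_risk:.2f}: net change {net:+.4f} ({label})")
```

Under these assumed values, the lowest-risk patient is harmed on net while the higher-risk patients benefit, even though every patient receives the same therapy with the same relative effect.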
Intensive blood pressure (BP) treatment can avert cardiovascular disease (CVD) events but can cause some serious adverse events. We sought to develop and validate risk models for predicting absolute ...risk difference (increased risk or decreased risk) for CVD events and serious adverse events from intensive BP therapy. A secondary aim was to test if the statistical method of elastic net regularization would improve the estimation of risk models for predicting absolute risk difference, as compared to a traditional backwards variable selection approach.
Cox models were derived from SPRINT trial data and validated on ACCORD-BP trial data to estimate risk of CVD events and serious adverse events; the models included terms for intensive BP treatment and heterogeneous response to intensive treatment. The Cox models were then used to estimate the absolute reduction in probability of CVD events (benefit) and absolute increase in probability of serious adverse events (harm) for each individual from intensive treatment. We compared the method of elastic net regularization, which uses repeated internal cross-validation to select variables and estimate coefficients in the presence of collinearity, to a traditional backwards variable selection approach. Data from 9,069 SPRINT participants with complete data on covariates were utilized for model development, and data from 4,498 ACCORD-BP participants with complete data were utilized for model validation. Participants were exposed to intensive (goal systolic pressure < 120 mm Hg) versus standard (<140 mm Hg) treatment. Two composite primary outcome measures were evaluated: (i) CVD events/deaths (myocardial infarction, acute coronary syndrome, stroke, congestive heart failure, or CVD death), and (ii) serious adverse events (hypotension, syncope, electrolyte abnormalities, bradycardia, or acute kidney injury/failure). The model for CVD chosen through elastic net regularization included interaction terms suggesting that older age, black race, higher diastolic BP, and higher lipids were associated with greater CVD risk reduction benefits from intensive treatment, while current smoking was associated with fewer benefits. The model for serious adverse events chosen through elastic net regularization suggested that male sex, current smoking, statin use, elevated creatinine, and higher lipids were associated with greater risk of serious adverse events from intensive treatment. 
SPRINT participants in the highest predicted benefit subgroup had a number needed to treat (NNT) of 24 to prevent 1 CVD event/death over 5 years (absolute risk reduction [ARR] = 0.042, 95% CI: 0.018, 0.066; P = 0.001), those in the middle predicted benefit subgroup had a NNT of 76 (ARR = 0.013, 95% CI: -0.0001, 0.026; P = 0.053), and those in the lowest subgroup had no significant risk reduction (ARR = 0.006, 95% CI: -0.007, 0.018; P = 0.71). Those in the highest predicted harm subgroup had a number needed to harm (NNH) of 27 to induce 1 serious adverse event (absolute risk increase [ARI] = 0.038, 95% CI: 0.014, 0.061; P = 0.002), those in the middle predicted harm subgroup had a NNH of 41 (ARI = 0.025, 95% CI: 0.012, 0.038; P < 0.001), and those in the lowest subgroup had no significant risk increase (ARI = -0.007, 95% CI: -0.043, 0.030; P = 0.72). In ACCORD-BP, participants in the highest subgroup of predicted benefit had significant absolute CVD risk reduction, but the overall ACCORD-BP participant sample was skewed towards participants with less predicted benefit and more predicted risk than in SPRINT. The models chosen through traditional backwards selection had similar ability to identify absolute risk difference for CVD as the elastic net models, but poorer ability to correctly identify absolute risk difference for serious adverse events. A key limitation of the analysis is the limited sample size of the ACCORD-BP trial, which expanded confidence intervals for ARI among persons with type 2 diabetes. Additionally, it is not possible to mechanistically explain the physiological relationships explaining the heterogeneous treatment effects captured by the models, since the study was an observational secondary data analysis.
We found that predictive models could help identify subgroups of participants in both SPRINT and ACCORD-BP who had lower versus higher ARRs in CVD events/deaths with intensive BP treatment, and participants who had lower versus higher ARIs in serious adverse events.
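The NNT and NNH figures above are reciprocals of the corresponding absolute risk differences. The sketch below shows that relationship using the rounded ARR and ARI values quoted in the abstract; the function name `number_needed` is an illustrative choice, and because the published NNT and NNH values were presumably computed from unrounded estimates, the reciprocals of the rounded inputs may differ from them by about 1.

```python
def number_needed(absolute_risk_difference):
    """Reciprocal of an absolute risk difference (NNT for ARR, NNH for ARI)."""
    if absolute_risk_difference <= 0:
        # A zero or negative difference has no meaningful NNT/NNH.
        raise ValueError("absolute risk difference must be positive")
    return 1.0 / absolute_risk_difference


# Rounded values as reported in the abstract.
reported = {
    "highest predicted benefit (ARR)": 0.042,
    "middle predicted benefit (ARR)": 0.013,
    "highest predicted harm (ARI)": 0.038,
    "middle predicted harm (ARI)": 0.025,
}
for label, difference in reported.items():
    print(f"{label}: 1 / {difference} = {number_needed(difference):.1f}")
```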
Type 2 diabetes mellitus is common, and treatment to correct blood glucose levels is standard. However, treatment burden starts years before treatment benefits accrue. Because guidelines often ignore ...treatment burden, many patients with diabetes may be overtreated.
To examine how treatment burden affects the benefits of intensive vs moderate glycemic control in patients with type 2 diabetes.
We estimated the effects of hemoglobin A1c (HbA1c) reduction on diabetes outcomes and overall quality-adjusted life years (QALYs) using a Markov simulation model. Model probabilities were based on estimates from randomized trials and observational studies. Simulated patients were based on adult patients with type 2 diabetes drawn from the National Health and Nutrition Examination Survey.
Glucose lowering with oral agents or insulin in type 2 diabetes.
Main outcomes were QALYs and reduction in risk of microvascular and cardiovascular diabetes complications.
Assuming a low treatment burden (0.001, or 0.4 lost days per year), treatment that lowered HbA1c level by 1 percentage point provided benefits ranging from 0.77 to 0.91 QALYs for simulated patients who received a diagnosis at age 45 years to 0.08 to 0.10 QALYs for those who received a diagnosis at age 75 years. An increase in treatment burden (0.01, or 3.7 days lost per year) resulted in HbA1c level lowering being associated with more harm than benefit in those aged 75 years. Across all ages, patients who viewed treatment as more burdensome (0.025-0.05 disutility) experienced a net loss in QALYs from treatments to lower HbA1c level.
Improving glycemic control can provide substantial benefits, especially for younger patients; however, for most patients older than 50 years with an HbA1c level less than 9% receiving metformin therapy, additional glycemic treatment usually offers at most modest benefits. Furthermore, the magnitude of benefit is sensitive to patients' views of the treatment burden, and even small treatment adverse effects result in net harm in older patients. The current approach of broadly advocating intensive glycemic control should be reconsidered; instead, treating patients with HbA1c levels less than 9% should be individualized on the basis of estimates of benefit weighed against the patient's views of the burdens of treatment.
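The correspondence between a treatment-burden disutility and "lost days per year" in the results above is a simple scaling by the length of a year. The sketch below shows that conversion, plus a deliberately simplified net-QALY calculation. It is not the Markov model used in the study: the assumption that burden accrues at a constant, undiscounted rate, and the `years_on_treatment` value of 10, are hypothetical and for illustration only.

```python
DAYS_PER_YEAR = 365.25


def lost_days_per_year(disutility):
    """Convert an annual treatment-burden disutility into equivalent lost days."""
    return disutility * DAYS_PER_YEAR


def net_qalys(gross_benefit_qalys, disutility, years_on_treatment):
    """Simplified net QALYs: gross benefit minus undiscounted cumulative burden."""
    return gross_benefit_qalys - disutility * years_on_treatment


# Disutilities quoted in the abstract.
for disutility in (0.001, 0.01):
    print(f"disutility {disutility}: {lost_days_per_year(disutility):.2f} lost days/year")

# Hypothetical example: the upper gross benefit quoted for diagnosis at age 75
# (0.10 QALYs), a disutility of 0.01, and an assumed 10 years on treatment.
print(f"net QALYs: {net_qalys(0.10, 0.01, 10):+.3f}")
```

Even in this crude form, the calculation shows why a small annual burden can cancel a modest lifetime benefit when treatment continues for many years.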
The limitations of subgroup analyses are well established—false positives due to multiple comparisons, false negatives due to inadequate power, and limited ability to inform individual treatment ...decisions because patients have multiple characteristics that vary simultaneously. In this article, we apply Bayes’s rule to determine the probability that a positive subgroup analysis is a true positive. From this framework, we derive simple rules to determine when subgroup analyses can be performed as hypothesis testing analyses and thus inform when subgroup analyses should influence how we practice medicine.
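As a rough illustration of the Bayesian framing described above, the sketch below computes the probability that a statistically significant subgroup finding is a true positive from a prior probability, the power of the test, and its false-positive rate. This is the standard form of Bayes's rule for a positive test result, not necessarily the exact formulation or rules derived in the article; the function name and all numeric inputs are hypothetical.

```python
def prob_true_positive(prior, power, alpha):
    """P(subgroup effect is real | subgroup analysis is positive), by Bayes's rule."""
    true_positive_mass = power * prior            # real effect and detected
    false_positive_mass = alpha * (1.0 - prior)   # no effect but test positive
    return true_positive_mass / (true_positive_mass + false_positive_mass)


# Hypothetical scenarios: an exploratory subgroup (low prior, low power)
# versus a prespecified, well-powered hypothesis (higher prior and power).
exploratory = prob_true_positive(prior=0.05, power=0.20, alpha=0.05)
prespecified = prob_true_positive(prior=0.50, power=0.80, alpha=0.05)
print(f"exploratory subgroup:  {exploratory:.3f}")
print(f"prespecified subgroup: {prespecified:.3f}")
```

With these assumed inputs, a positive exploratory subgroup analysis is more likely to be a false positive than a true one, whereas a positive result for a plausible, adequately powered hypothesis is probably real.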
ABSTRACT
BACKGROUND
Due to a shortage of studies focusing on older adults, clinicians and policy makers frequently rely on clinical trials of the general population to provide supportive evidence for ...treating complex, older patients.
OBJECTIVES
To examine the inclusion and analysis of complex, older adults in randomized controlled trials.
REVIEW METHODS
A PubMed search identified phase III or IV randomized controlled trials published in 2007 in JAMA, NEJM, Lancet, Circulation, and BMJ. Therapeutic interventions that assessed major morbidity or mortality in adults were included. For each study, age eligibility, average age of study population, primary and secondary outcomes, exclusion criteria, and the frequency, characteristics, and methodology of age-specific subgroup analyses were reviewed.
RESULTS
Of the 109 clinical trials reviewed in full, 22 (20.2%) excluded patients above a specified age. Almost half (45.6%) of the remaining trials excluded individuals using criteria that could disproportionately impact older adults. Only one in four trials (26.6%) examined outcomes that are considered highly relevant to older adults, such as health status or quality of life. Of the 42 (38.5%) trials that performed an age-specific subgroup analysis, fewer than half examined potential confounders of differential treatment effects by age, such as comorbidities or risk of primary outcome. Trials with age-specific subgroup analyses were more likely than those without to be multicenter trials (97.6% vs. 79.1%, p < 0.01) and funded by industry (83.3% vs. 62.7%, p < 0.05). Differential benefit by age was found in seven trials (16.7%).
CONCLUSION
Clinical trial evidence guiding treatment of complex, older adults could be improved by eliminating upper age limits for study inclusion, by reducing the use of eligibility criteria that disproportionately affect multimorbid older patients, by evaluating outcomes that are highly relevant to older individuals, and by encouraging adherence to recommended analytic methods for evaluating differential treatment effects by age.
The 2013 pooled cohort equations (PCEs) are central in prevention guidelines for cardiovascular disease (CVD) but can misestimate CVD risk.
To improve the clinical accuracy of CVD risk prediction by ...revising the 2013 PCEs using newer data and statistical methods.
Derivation and validation of risk equations.
Population-based.
26 689 adults aged 40 to 79 years without prior CVD from 6 U.S. cohorts.
Nonfatal myocardial infarction, death from coronary heart disease, or fatal or nonfatal stroke.
The 2013 PCEs overestimated 10-year risk for atherosclerotic CVD by an average of 20% across risk groups. Misestimation of risk was particularly prominent among black adults, of whom 3.9 million (33% of eligible black persons) had extreme risk estimates (<70% or >250% those of white adults with otherwise-identical risk factor values). Updating these equations improved accuracy among all race and sex subgroups. Approximately 11.8 million U.S. adults previously labeled high-risk (10-year risk ≥7.5%) by the 2013 PCEs would be relabeled lower-risk by the updated equations.
Updating the 2013 PCEs with data from modern cohorts reduced the number of persons considered to be at high risk. Clinicians and patients should consider the potential benefits and harms of reducing the number of persons recommended aspirin, blood pressure, or statin therapy. Our findings also indicate that risk equations will generally become outdated over time and require routine updating.
Revised PCEs can improve the accuracy of CVD risk estimates.
National Institutes of Health.
Background: Comparative effectiveness data pertaining to competing colorectal cancer (CRC) screening tests do not exist but are necessary to guide clinical decision making and policy. Objective: To ...perform a comparative synthesis of clinical outcomes studies evaluating the effects of competing tests on CRC-related mortality. Design: Traditional and network meta-analyses. Two reviewers identified studies evaluating the effect of guaiac-based fecal occult blood testing (gFOBT), flexible sigmoidoscopy (FS), or colonoscopy on CRC-related mortality. Interventions: gFOBT, FS, colonoscopy. Main Outcome Measurements: Traditional meta-analysis was performed to produce pooled estimates of the effect of each modality on CRC mortality. Bayesian network meta-analysis (NMA) was performed to indirectly compare the effectiveness of screening modalities. Multiple sensitivity analyses were performed. Results: Traditional meta-analysis revealed that, compared with no intervention, colonoscopy reduced CRC-related mortality by 57% (relative risk [RR] 0.43; 95% confidence interval [CI], 0.33-0.58), whereas FS reduced CRC-related mortality by 40% (RR 0.60; 95% CI, 0.45-0.78), and gFOBT reduced CRC-related mortality by 18% (RR 0.82; 95% CI, 0.76-0.88). NMA demonstrated nonsignificant trends favoring colonoscopy over FS (RR 0.71; 95% CI, 0.45-1.11) and FS over gFOBT (RR 0.74; 95% CI, 0.51-1.09) for reducing CRC-related deaths. NMA-based simulations, however, revealed that colonoscopy has a 94% probability of being the most effective test for reducing CRC mortality and a 99% probability of being most effective when the analysis is restricted to screening studies. Limitations: Randomized trials and observational studies were combined within the same analysis. Conclusion: Clinical outcomes studies demonstrate that gFOBT, FS, and colonoscopy are all effective in reducing CRC-related mortality. Network meta-analysis suggests that colonoscopy is the most effective test.
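The percentage reductions quoted above follow directly from the relative risks (reduction = 1 - RR), and the "nonsignificant" label for the indirect comparisons corresponds to confidence intervals that include 1. The short sketch below reproduces both checks from the numbers in the abstract; the function names are illustrative only.

```python
def percent_reduction(relative_risk):
    """Relative reduction in mortality implied by a relative risk, in percent."""
    return (1.0 - relative_risk) * 100.0


def ci_excludes_one(lower, upper):
    """True when a confidence interval for a ratio does not contain 1."""
    return upper < 1.0 or lower > 1.0


# Pooled estimates versus no intervention, as reported in the abstract.
for test_name, rr in (("colonoscopy", 0.43), ("FS", 0.60), ("gFOBT", 0.82)):
    print(f"{test_name}: RR {rr} -> {percent_reduction(rr):.0f}% reduction")

# Indirect (network) comparisons, as reported in the abstract.
print("colonoscopy vs FS significant:", ci_excludes_one(0.45, 1.11))
print("FS vs gFOBT significant:", ci_excludes_one(0.51, 1.09))
```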
Sex differences in dementia risk are unclear, but some studies have found greater risk for women.
To determine associations between sex and cognitive decline in order to better understand sex ...differences in dementia risk.
This cohort study used pooled analysis of individual participant data from 5 cohort studies for years 1971 to 2017: Atherosclerosis Risk in Communities Study, Coronary Artery Risk Development in Young Adults Study, Cardiovascular Health Study, Framingham Offspring Study, and Northern Manhattan Study. Linear mixed-effects models were used to estimate changes in each continuous cognitive outcome over time by sex. Data analysis was completed from March 2019 to October 2020.
Sex.
The primary outcome was change in global cognition. Secondary outcomes were change in memory and executive function. Outcomes were standardized as t scores (mean [SD], 50 [10]); a 1-point difference represents a 0.1-SD difference in cognition.
Among 34 349 participants, 26 088 who self-reported Black or White race, were free of stroke and dementia, and had covariate data at or before the first cognitive assessment were included for analysis. Median (interquartile range) follow-up was 7.9 (5.3-20.5) years. There were 11 775 (44.7%) men (median [interquartile range] age, 58 [51-66] years at first cognitive assessment; 2229 [18.9%] Black) and 14 313 women (median [interquartile range] age, 58 [51-67] years at first cognitive assessment; 3636 [25.4%] Black). Women had significantly higher baseline performance than men in global cognition (2.20 points higher; 95% CI, 2.04 to 2.35 points; P < .001), executive function (2.13 points higher; 95% CI, 1.98 to 2.29 points; P < .001), and memory (1.89 points higher; 95% CI, 1.72 to 2.06 points; P < .001). Compared with men, women had significantly faster declines in global cognition (-0.07 points/y faster; 95% CI, -0.08 to -0.05 points/y; P < .001) and executive function (-0.06 points/y faster; 95% CI, -0.07 to -0.05 points/y; P < .001). Men and women had similar declines in memory (-0.004 points/y faster; 95% CI, -0.023 to 0.014; P = .61).
The results of this cohort study suggest that women may have greater cognitive reserve but faster cognitive decline than men, which could contribute to sex differences in late-life dementia.
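The t-score scale used for the cognitive outcomes above (mean 50, SD 10) maps to standard-deviation units by a fixed linear rule, which is why a 1-point difference equals 0.1 SD. The sketch below shows the conversion in both directions and applies it to the baseline difference in global cognition quoted in the abstract; the function names are illustrative and are not taken from the study's analysis code.

```python
T_MEAN = 50.0
T_SD = 10.0


def z_to_t_score(z):
    """Convert a z score (SD units) to a t score with mean 50 and SD 10."""
    return T_MEAN + T_SD * z


def t_points_to_sd(points):
    """Convert a difference in t-score points to a difference in SD units."""
    return points / T_SD


# Baseline difference in global cognition reported in the abstract: 2.20 points.
print(f"2.20 t-score points = {t_points_to_sd(2.20):.2f} SD")
# Annual difference in decline in global cognition: -0.07 points/y.
print(f"-0.07 points/y = {t_points_to_sd(-0.07):.3f} SD/y")
```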
Two recent randomized trials produced discordant results when testing the benefits and harms of treatment to reduce blood pressure (BP) in patients with cardiovascular disease (CVD).
To perform a ...theoretical modeling study to identify whether large, clinically important differences in benefit and harm among patients (heterogeneous treatment effects [HTEs]) can be hidden in, and explain discordant results between, treat-to-target BP trials.
Microsimulation.
Results of 2 trials comparing standard (systolic BP target <140 mm Hg) with intensive (systolic BP target <120 mm Hg) BP treatment and data from the National Health and Nutrition Examination Survey (2013 to 2014).
U.S. adults.
5 years.
Societal.
BP treatment.
CVD events and mortality.
Clinically important HTEs could explain differences in outcomes between 2 trials of intensive BP treatment, particularly diminishing benefit with each additional BP agent (for example, adding a second agent reduces CVD risk [hazard ratio, 0.61], but adding a fourth agent to a third has no benefit) and increasing harm at low diastolic BP.
Conventional treat-to-target trial designs had poor (<5%) statistical power to detect the HTEs, despite large samples (n > 20 000), and produced biased effect estimates. In contrast, a trial with sequential randomization to more intensive therapy achieved greater than 80% power and unbiased HTE estimates, despite small samples (n = 3500).
Only HTEs as a function of the number of BP agents were explored. Simulated aggregate data from the trials were used as model inputs because individual-participant data were not available.
Clinically important heterogeneity in intensive BP treatment effects remains undetectable in conventional trial designs but can be detected in sequential randomization trial designs.
National Institutes of Health and U.S. Department of Veterans Affairs.
In view of substantial mis-estimation of risks of diabetes complications using existing equations, we sought to develop updated Risk Equations for Complications Of type 2 Diabetes (RECODe).
To ...develop and validate these risk equations, we used data from the Action to Control Cardiovascular Risk in Diabetes study (ACCORD, n=9635; 2001-09) and validated the equations for microvascular events using data from the Diabetes Prevention Program Outcomes Study (DPPOS, n=1018; 1996-2001), and for cardiovascular events using data from the Action for Health in Diabetes (Look AHEAD, n=4760; 2001-12). Microvascular outcomes were nephropathy, retinopathy, and neuropathy. Cardiovascular outcomes were myocardial infarction, stroke, congestive heart failure, and cardiovascular mortality. We also included all-cause mortality as an outcome. We used a cross-validating machine learning method to select predictor variables from demographic characteristics, clinical variables, comorbidities, medications, and biomarkers into Cox proportional hazards models for each outcome. The new equations were compared to older risk equations by assessing model discrimination, calibration, and the net reclassification index.
All equations had moderate internal and external discrimination (C-statistics 0·55-0·84 internally, 0·57-0·79 externally) and high internal and external calibration (slopes 0·71-1·31 between observed and estimated risk). Our equations had better discrimination and calibration than the UK Prospective Diabetes Study Outcomes Model 2 (for microvascular and cardiovascular outcomes, C-statistics 0·54-0·62, slopes 0·06-1·12) and the American College of Cardiology/American Heart Association Pooled Cohort Equations (for fatal or non-fatal myocardial infarction or stroke, C-statistics 0·61-0·66, slopes 0·30-0·39).
RECODe might improve estimation of risk of complications for patients with type 2 diabetes.
National Institute of Diabetes and Digestive and Kidney Diseases, National Heart, Lung, and Blood Institute, and National Institute on Minority Health and Health Disparities, National Institutes of Health, and US Department of Veterans Affairs.
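For readers unfamiliar with the discrimination metric reported for RECODe, the sketch below computes a C-statistic for a binary outcome as the share of event/non-event pairs in which the patient with the event received the higher predicted risk. This is a simplification: the study fitted Cox proportional hazards models, for which a survival-time version of the C-statistic that accounts for censoring would normally be used. The predicted risks and outcomes here are made-up toy values.

```python
def c_statistic(predicted_risks, outcomes):
    """Concordance for binary outcomes; ties in predicted risk count as 0.5."""
    events = [r for r, y in zip(predicted_risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(predicted_risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for event_risk in events:
        for non_event_risk in non_events:
            if event_risk > non_event_risk:
                concordant += 1.0
            elif event_risk == non_event_risk:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Toy data (hypothetical): four patients, two of whom had the outcome.
toy_risks = [0.10, 0.40, 0.35, 0.80]
toy_outcomes = [0, 0, 1, 1]
print(f"C-statistic: {c_statistic(toy_risks, toy_outcomes):.2f}")
```

A value of 0.5 corresponds to discrimination no better than chance and 1.0 to perfect ranking, which gives context for the 0·55-0·84 range reported above.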