The decision curve is a graphical summary recently proposed for assessing the potential clinical impact of risk prediction biomarkers or risk models for recommending treatment or intervention. It was applied recently in an article in the Journal of Clinical Oncology to measure the impact of using a genomic risk model for deciding on adjuvant radiation therapy for prostate cancer treated with radical prostatectomy. We illustrate the use of decision curves for evaluating clinical- and biomarker-based models for predicting a man's risk of prostate cancer, which could be used to guide the decision to biopsy. Decision curves are grounded in a decision-theoretical framework that accounts for both the benefits of intervention and the costs of intervention to a patient who cannot benefit. Decision curves are thus an improvement over purely mathematical measures of performance such as the area under the receiver operating characteristic curve. However, there are challenges in using and interpreting decision curves appropriately. We caution that decision curves cannot be used to identify the optimal risk threshold for recommending intervention. We discuss the use of decision curves for miscalibrated risk models. Finally, we emphasize that a decision curve shows the performance of a risk model in a population in which every patient has the same expected benefit and cost of intervention. If every patient has a personal benefit and cost, then the curves are not useful. If subpopulations have different benefits and costs, subpopulation-specific decision curves should be used. As a companion to this article, we released an R software package called DecisionCurve for making decision curves and related graphics.
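The quantity plotted in a decision curve can be sketched in a few lines. The Python below is a minimal illustration and not the DecisionCurve R package: the function names and the toy data are assumptions made for this example. It uses the standard definition of net benefit at risk threshold t, namely TP/n - (FP/n) × t/(1 - t), and compares a model against the treat-all policy.

```python
from typing import List, Sequence, Tuple


def net_benefit(risks: Sequence[float], outcomes: Sequence[int], threshold: float) -> float:
    """Net benefit of intervening on patients whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    # Patients flagged for intervention who do (TP) or do not (FP) have the outcome.
    true_pos = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    false_pos = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    # The threshold odds weight the cost of a false positive against a true positive.
    odds = threshold / (1.0 - threshold)
    return true_pos / n - (false_pos / n) * odds


def net_benefit_treat_all(outcomes: Sequence[int], threshold: float) -> float:
    """Net benefit of intervening on every patient, regardless of predicted risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


def decision_curve(
    risks: Sequence[float], outcomes: Sequence[int], thresholds: Sequence[float]
) -> List[Tuple[float, float, float]]:
    """Return (threshold, model net benefit, treat-all net benefit) for each threshold."""
    return [
        (t, net_benefit(risks, outcomes, t), net_benefit_treat_all(outcomes, t))
        for t in thresholds
    ]


if __name__ == "__main__":
    # Hypothetical toy data: four patients with predicted risks and observed outcomes.
    toy_risks = [0.1, 0.4, 0.6, 0.8]
    toy_outcomes = [0, 1, 0, 1]
    for t, nb_model, nb_all in decision_curve(toy_risks, toy_outcomes, [0.2, 0.5]):
        print(f"threshold={t:.2f} model={nb_model:.4f} treat_all={nb_all:.4f}")
```

The treat-none policy has net benefit zero at every threshold, so a model adds value at a given threshold only where its curve lies above both zero and the treat-all line. The threshold is an input that encodes an assumed cost-benefit ratio shared by all patients; the curve does not select it.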
When evaluating a new risk factor for disease (eg, a measurement from imaging studies), many investigators examine its value above and beyond existing biomarkers and risk factors. They compare the performance of an "old" risk model using established predictors and a "new" risk model that adds the new factor. Net reclassification index (NRI) statistics are a family of metrics for comparing two risk models. NRI statistics became popular in some medical fields and have appeared in high-impact journals. This article reviews NRI statistics and describes several issues with them. Problems include unacceptable statistical behavior, incorrect statistical inferences, and lack of interpretability. NRI statistics are unhelpful (at best) and misleading (at worst).
• HCC risk varies dramatically in patients with cirrhosis.
• We developed models estimating HCC risk in patients with NAFLD-cirrhosis or ALD-cirrhosis.
• The models use simple, readily available predictors.
• The models are available as web-based tools at www.hccrisk.com.
Hepatocellular carcinoma (HCC) risk varies dramatically in patients with cirrhosis according to well-described, readily available predictors. We aimed to develop simple models estimating HCC risk in patients with alcohol-related liver disease (ALD)-cirrhosis or non-alcoholic fatty liver disease (NAFLD)-cirrhosis and calculate the net benefit that would be derived by implementing HCC surveillance strategies based on HCC risk as predicted by our models.
We identified 7,068 patients with NAFLD-cirrhosis and 16,175 with ALD-cirrhosis who received care in the Veterans Affairs (VA) healthcare system in 2012. We retrospectively followed them for the development of incident HCC until January 2018. We used Cox proportional hazards regression to develop and internally validate models predicting HCC risk using baseline characteristics at entry into the cohort in 2012. We plotted decision curves of net benefit against HCC screening thresholds.
We identified 1,278 incident cases of HCC during a mean follow-up period of 3.7 years. Mean annualized HCC incidence was 1.56% in NAFLD-cirrhosis and 1.44% in ALD-cirrhosis. The final models estimating HCC were developed separately for NAFLD-cirrhosis and ALD-cirrhosis and included 7 predictors: age, gender, diabetes, body mass index, platelet count, serum albumin and aspartate aminotransferase to √alanine aminotransferase ratio. The models exhibited very good measures of discrimination and calibration and an area under the receiver operating characteristic curve of 0.75 for NAFLD-cirrhosis and 0.76 for ALD-cirrhosis. Decision curves showed higher standardized net benefit of risk-based screening using our prediction models compared to the screen-all approach.
We developed simple models estimating HCC risk in patients with NAFLD-cirrhosis or ALD-cirrhosis, which are available as web-based tools (www.hccrisk.com). Risk stratification can be used to inform risk-based HCC surveillance strategies in individual patients or healthcare systems or to identify high-risk patients for clinical trials.
Patients with cirrhosis of the liver are at risk of developing hepatocellular carcinoma (HCC or liver cancer), and it is therefore recommended that they undergo surveillance for HCC. However, the risk of HCC varies dramatically in patients with cirrhosis, which has implications for whether and how patients get surveillance, how providers counsel patients about the need for surveillance, and how healthcare systems approach and prioritize surveillance. We used readily available predictors to develop models estimating HCC risk in patients with cirrhosis, which are available as web-based tools at www.hccrisk.com.
Adaptive gain theory proposes that the dynamic shifts between exploration and exploitation control states are modulated by the locus coeruleus-norepinephrine system and reflected in tonic and phasic pupil diameter. This study tested predictions of this theory in the context of a societally important visual search task: the review and interpretation of digital whole slide images of breast biopsies by physicians (pathologists). As these medical images are searched, pathologists encounter difficult visual features and intermittently zoom in to examine features of interest. We propose that tonic and phasic pupil diameter changes during image review may correspond to perceived difficulty and dynamic shifts between exploration and exploitation control states. To examine this possibility, we monitored visual search behavior and tonic and phasic pupil diameter while pathologists (N = 89) interpreted 14 digital images of breast biopsy tissue (1,246 total images reviewed). After viewing the images, pathologists provided a diagnosis and rated the level of difficulty of the image. Analyses of tonic pupil diameter examined whether pupil dilation was associated with pathologists' difficulty ratings, diagnostic accuracy, and experience level. To examine phasic pupil diameter, we parsed continuous visual search data into discrete zoom-in and zoom-out events, including shifts from low to high magnification (e.g., 1× to 10×) and the reverse. Analyses examined whether zoom-in and zoom-out events were associated with phasic pupil diameter change. Results demonstrated that tonic pupil diameter was associated with image difficulty ratings and zoom level, and that phasic pupil diameter showed constriction upon zoom-in events and dilation immediately preceding zoom-out events. Results are interpreted in the context of adaptive gain theory, information gain theory, and the monitoring and assessment of physicians' diagnostic interpretive processes.
• We developed and validated models to estimate HCC risk after antiviral treatment for HCV.
• Using these models may improve HCC screening strategies.
• Models are available as web-based tools.
Most patients with hepatitis C virus (HCV) infection will undergo antiviral treatment with direct-acting antivirals (DAAs) and achieve sustained virologic response (SVR). We aimed to develop models estimating hepatocellular carcinoma (HCC) risk after antiviral treatment.
We identified 45,810 patients who initiated antiviral treatment in the Veterans Affairs (VA) national healthcare system from 1/1/2009 to 12/31/2015, including 29,309 (64%) DAA-only regimens and 16,501 (36%) interferon ± DAA regimens. We retrospectively followed patients until 6/15/2017 to identify incident cases of HCC. We used Cox proportional hazards regression to develop and internally validate models predicting HCC risk using baseline characteristics at the time of antiviral treatment.
We identified 1,412 incident cases of HCC diagnosed at least 180 days after initiation of antiviral treatment during a mean follow-up of 2.5 years (range 1.0–7.5 years). Models predicting HCC risk after antiviral treatment were developed and validated separately for four subgroups of patients: cirrhosis/SVR, cirrhosis/no SVR, no cirrhosis/SVR, no cirrhosis/no SVR. Four predictors (age, platelet count, serum aspartate aminotransferase/√alanine aminotransferase ratio and albumin) accounted for most of the models’ predictive value, with smaller contributions from sex, race-ethnicity, HCV genotype, body mass index, hemoglobin and serum alpha-fetoprotein. Fitted models were well-calibrated with very good measures of discrimination. Decision curves demonstrated higher net benefit of using model-based HCC risk estimates to determine whether to recommend screening or not compared to the screen-all or screen-none strategies.
We developed and internally validated models that estimate HCC risk following antiviral treatment. These models are available as web-based tools that can be used to inform risk-based HCC surveillance strategies in individual patients.
Most patients with hepatitis C virus have been treated or will be treated with direct-acting antivirals. It is important that we can model the risk of hepatocellular carcinoma in these patients, so that we develop the optimum screening strategy that avoids unnecessary screening, while adequately screening those at increased risk. Herein, we have developed and validated models that are available as web-based tools that can be used to guide screening strategies.
The FDA-approved drug rapamycin increases lifespan in rodents and delays age-related dysfunction in rodents and humans. Nevertheless, important questions remain regarding the optimal dose, duration, and mechanisms of action in the context of healthy aging. Here we show that 3 months of rapamycin treatment is sufficient to increase life expectancy by up to 60% and improve measures of healthspan in middle-aged mice. This transient treatment is also associated with a remodeling of the microbiome, including dramatically increased prevalence of segmented filamentous bacteria in the small intestine. We also define a dose in female mice that does not extend lifespan, but is associated with a striking shift in cancer prevalence toward aggressive hematopoietic cancers and away from non-hematopoietic malignancies. These data suggest that a short-term rapamycin treatment late in life has persistent effects that can robustly delay aging, influence cancer prevalence, and modulate the microbiome.
Net reclassification indices have recently become popular statistics for measuring the prediction increment of new biomarkers. We review the various types of net reclassification indices and their correct interpretations. We evaluate the advantages and disadvantages of quantifying the prediction increment with these indices. For predefined risk categories, we relate net reclassification indices to existing measures of the prediction increment. We also consider statistical methodology for constructing confidence intervals for net reclassification indices and evaluate the merits of hypothesis testing based on such indices. We recommend that investigators using net reclassification indices should report them separately for events (cases) and nonevents (controls). When there are two risk categories, the components of net reclassification indices are the same as the changes in the true- and false-positive rates. We advocate the use of true- and false-positive rates and suggest it is more useful for investigators to retain the existing, descriptive terms. When there are three or more risk categories, we recommend against net reclassification indices because they do not adequately account for clinically important differences in shifts among risk categories. The category-free net reclassification index is a new descriptive device designed to avoid predefined risk categories. However, it experiences many of the same problems as other measures such as the area under the receiver operating characteristic curve. In addition, the category-free index can mislead investigators by overstating the incremental value of a biomarker, even in independent validation data. When investigators want to test a null hypothesis of no prediction increment, the well-established tests for coefficients in the regression model are superior to the net reclassification index.
If investigators want to use net reclassification indices, confidence intervals should be calculated using bootstrap methods rather than published variance formulas. The preferred single-number summary of the prediction increment is the improvement in net benefit.
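The two-category case described above can be checked numerically. The Python below is a small sketch with hypothetical function names and toy data (neither comes from the article): with a single risk threshold, the event NRI equals the change in the true-positive rate, and the nonevent NRI equals the decrease in the false-positive rate.

```python
from typing import Sequence, Tuple


def two_category_nri(
    old_risks: Sequence[float],
    new_risks: Sequence[float],
    outcomes: Sequence[int],
    threshold: float,
) -> Tuple[float, float]:
    """Return (event NRI, nonevent NRI) for high/low risk categories split at threshold."""
    up_e = down_e = n_e = 0  # reclassification counts among events
    up_n = down_n = n_n = 0  # reclassification counts among nonevents
    for old, new, y in zip(old_risks, new_risks, outcomes):
        moved_up = old < threshold <= new
        moved_down = new < threshold <= old
        if y == 1:
            n_e += 1
            up_e += moved_up
            down_e += moved_down
        else:
            n_n += 1
            up_n += moved_up
            down_n += moved_down
    # Events should move up; nonevents should move down.
    return (up_e - down_e) / n_e, (down_n - up_n) / n_n


def tpr_fpr(risks: Sequence[float], outcomes: Sequence[int], threshold: float) -> Tuple[float, float]:
    """True- and false-positive rates when risk >= threshold is called positive."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    nonevents = [r for r, y in zip(risks, outcomes) if y == 0]
    tpr = sum(r >= threshold for r in events) / len(events)
    fpr = sum(r >= threshold for r in nonevents) / len(nonevents)
    return tpr, fpr


if __name__ == "__main__":
    # Hypothetical toy data: three events followed by three nonevents.
    old = [0.1, 0.6, 0.3, 0.7, 0.2, 0.6]
    new = [0.6, 0.7, 0.2, 0.8, 0.1, 0.3]
    y = [1, 1, 1, 0, 0, 0]
    event_nri, nonevent_nri = two_category_nri(old, new, y, 0.5)
    tpr_old, fpr_old = tpr_fpr(old, y, 0.5)
    tpr_new, fpr_new = tpr_fpr(new, y, 0.5)
    print(event_nri, tpr_new - tpr_old)
    print(nonevent_nri, fpr_old - fpr_new)
```

Because the two components carry exactly the information in the true- and false-positive rates, reporting those rates directly loses nothing and keeps the established terminology.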
Biomarkers can be used to enrich a clinical trial for patients at higher risk for an outcome, a strategy termed "prognostic enrichment." Methodology is needed to evaluate biomarkers for prognostic enrichment of trials with time-to-event endpoints such as survival. Key considerations for prognostic enrichment include: clinical trial sample size; the number of patients one must screen to enroll the trial; and total patient screening costs and total per-patient trial costs. The Biomarker Prognostic Enrichment Tool for Survival Outcomes (BioPETsurv) is a suite of methods for estimating these elements to evaluate a prognostic enrichment biomarker and/or plan a prognostically enriched clinical trial with a time-to-event primary endpoint. BioPETsurv allows investigators to analyze data on a candidate biomarker and potentially censored survival times. Alternatively, BioPETsurv can simulate data to match a particular clinical setting. BioPETsurv's data simulator enables investigators to explore the potential utility of a prognostic enrichment biomarker for their clinical setting. Results demonstrate that both modestly prognostic and strongly prognostic biomarkers can improve trial metrics such as reducing sample size or trial costs. In addition to the quantitative analysis provided by BioPETsurv, investigators should consider the generalizability of trial results and evaluate the ethics of trial eligibility criteria. BioPETsurv is freely available as a package for the R statistical computing platform, and as a webtool at www.prognosticenrichment.com/surv.
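The trade-off among sample size, screening burden, and cost can be illustrated with a rough calculation. The Python below is not BioPETsurv and does not reproduce its methods; it is a sketch under stated assumptions: 1:1 randomization, Schoenfeld's approximation for the number of events needed by a log-rank test, and a user-supplied probability that an enrolled patient has the event during the trial. All function names, parameters, and numbers are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Dict


def required_events(hazard_ratio: float, alpha: float = 0.05, power: float = 0.9) -> int:
    """Events needed for a two-sided log-rank test (Schoenfeld approximation, 1:1 allocation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_beta = z.inv_cdf(power)
    return math.ceil(4.0 * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2)


def trial_metrics(
    hazard_ratio: float,
    event_prob: float,         # assumed probability an enrolled patient has the event on trial
    enrolled_fraction: float,  # assumed fraction of screened patients who pass the biomarker cutoff
    cost_screen: float,        # assumed cost of measuring the biomarker in one patient
    cost_patient: float,       # assumed cost of running one enrolled patient through the trial
) -> Dict[str, float]:
    """Sample size, number screened, and total cost for one enrichment strategy."""
    events = required_events(hazard_ratio)
    n_enrolled = math.ceil(events / event_prob)
    n_screened = math.ceil(n_enrolled / enrolled_fraction)
    total_cost = n_screened * cost_screen + n_enrolled * cost_patient
    return {
        "events": events,
        "n_enrolled": n_enrolled,
        "n_screened": n_screened,
        "total_cost": total_cost,
    }


if __name__ == "__main__":
    # Unenriched trial: everyone is enrolled and no biomarker is measured.
    print(trial_metrics(0.7, event_prob=0.2, enrolled_fraction=1.0, cost_screen=0, cost_patient=1000))
    # Enriched trial: the top half by biomarker is enrolled, with a higher assumed event rate.
    print(trial_metrics(0.7, event_prob=0.4, enrolled_fraction=0.5, cost_screen=50, cost_patient=1000))
```

In this toy setting enrichment roughly halves the enrolled sample size while leaving the number screened about the same, so whether total cost falls depends on the relative size of the screening and per-patient costs.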