The graft-versus-leukemia (GVL) effect after allogeneic hematopoietic cell transplant (HCT) can prevent relapse, but the risk of severe graft-vs-host disease (GVHD) leads to prolonged intensive immunosuppression and possible blunting of the GVL effect. Strategies to reduce immunosuppression in order to prevent relapse have been offset by increases in severe GVHD and non-relapse mortality (NRM). We recently validated the MAGIC algorithm probability (MAP) that predicts the risk for severe GVHD and NRM in asymptomatic patients using serum biomarkers. In this study we tested whether the MAP could identify patients whose risk for relapse is higher than their risk for severe GVHD and NRM. The multicenter study population (n=1604) was divided into two cohorts: historical (2006–2015, n=702) and current (2015–2017, n=902) with similar non-relapse mortality, relapse, and survival. On day 28 post-HCT, patients who had not developed GVHD (75% of the population) and who possessed a low MAP were at much higher risk for relapse (24%) than severe GVHD and NRM (16% and 9%); this difference was even more pronounced in patients with a high disease risk index (relapse 33%, NRM 9%). Such patients are good candidates to test relapse prevention strategies that might enhance GVL.
Systemic glucocorticoids are the principal treatment for acute graft-versus-host disease (GVHD), which remains the major cause of non-relapse mortality (NRM) after allogeneic hematopoietic cell transplantation (HCT). However, there are no validated biomarkers that measure a patient's response to glucocorticoid therapy, and thus response is evaluated by the change in clinical symptom severity. A major weakness in the predictive power of clinical responses is that changes to all organs are weighted equally even though the major driver of NRM is irreversible damage to the crypts of the GI tract. Recent studies from the Mount Sinai Acute GVHD International Consortium (MAGIC) have validated an algorithm probability (MAP) that combines serum concentrations of two biomarkers of GVHD (REG3α and ST2) to generate an estimated probability of 6 month NRM for individual patients. The MAP has been considered a “liquid biopsy” that estimates the damage caused by GVHD to crypts throughout the lower GI tract (Hartwell et al., JCI Insight, 2017; Major-Monfried et al., Blood, 2018). We hypothesized that the change in MAP between start of treatment and 28 days later could serve as a response biomarker for GVHD and might compare favorably to the change in clinical symptoms that measures response to GVHD treatment, which is widely used as a surrogate for long term survival and is the primary endpoint in most GVHD treatment trials (Martin et al., BBMT, 2009; MacMillan et al., Blood, 2010).
We prospectively collected serum samples and clinical staging from 368 sequential HCT patients who received systemic treatment for acute GVHD in one of 20 MAGIC centers between January 2016 and February 2018. We measured the serum concentrations of REG3α and ST2 before and after systemic therapy for acute GVHD and computed MAPs, the changes in MAPs, and clinical responses for each patient.
MAPs of patients who experienced 6 month NRM showed significantly greater increases than MAPs of patients who survived (p=0.0004). In patients whose MAPs at the start of treatment were low (Ann Arbor 1, MAP < 0.141) or intermediate (Ann Arbor 2, 0.141 ≤ MAP ≤ 0.290), 6 month NRM clustered among those who had the greatest increases in MAP after 28 days (Fig 1A,B). In patients with high MAPs at the start of treatment (Ann Arbor 3, MAP > 0.290), those who survived tended to have the largest decreases in MAP (Fig 1C). These changes in MAP suggested that crossing a single threshold could predict risk of mortality. We found that patients whose MAPs rose above a threshold MAP of 0.290 (5% of Ann Arbor 1, 27% of Ann Arbor 2) had significantly worse survival compared to those who remained below it, whereas the large number of patients with initially high MAPs that remained above the threshold (66% of Ann Arbor 3) had a large increase in mortality (Fig 2).
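The Ann Arbor grouping and the threshold-crossing comparison described above can be sketched in a few lines of Python. The cutoffs 0.141 and 0.290 are taken from the text; the function names and the example MAP values are hypothetical and are not the consortium's code, and the sketch does not compute the MAP itself.

```python
# Illustrative sketch only: function names and example values are hypothetical.
LOW_CUTOFF = 0.141   # from the text: Ann Arbor 1 if MAP < 0.141
HIGH_CUTOFF = 0.290  # from the text: Ann Arbor 3 if MAP > 0.290


def ann_arbor_score(map_value: float) -> int:
    """Map a MAGIC algorithm probability to an Ann Arbor score (1, 2, or 3)."""
    if not 0.0 <= map_value <= 1.0:
        raise ValueError("MAP is a probability and must lie in [0, 1]")
    if map_value < LOW_CUTOFF:
        return 1
    if map_value <= HIGH_CUTOFF:
        return 2
    return 3


def threshold_status(map_start: float, map_day28: float) -> str:
    """Describe how a patient's MAP moved relative to the 0.290 threshold."""
    started_high = map_start > HIGH_CUTOFF
    ended_high = map_day28 > HIGH_CUTOFF
    if started_high and ended_high:
        return "remained above"
    if started_high:
        return "fell below"
    if ended_high:
        return "rose above"
    return "remained below"


# Hypothetical patient: low MAP at treatment start, high MAP at day 28.
print(ann_arbor_score(0.10), threshold_status(0.10, 0.35))
```

Keeping the two cutoffs as named constants makes it easy to see that the same 0.290 value defines both the Ann Arbor 3 boundary and the day-28 risk threshold.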
When measured at day 28, the MAP was significantly more accurate in predicting NRM than the gold standard of the clinical response, with areas under the receiver operating characteristic curve (AUC) of 0.86 and 0.70, respectively (p<0.0001). An algorithm that combined clinical response with biomarkers generated an AUC that did not differ significantly from that of the MAP alone (0.83 v 0.86, p = NS). We next tested whether the same MAP threshold of 0.290 could predict risk within clinical response subsets. A significant minority (10%) of clinical responders had high MAPs and experienced three-fold greater NRM than those with low MAPs (40% v 12%, p<0.0001), whereas the majority (57%) of non-responders had low MAPs and experienced almost three-fold lower NRM than those with high MAPs (24% v 65%, p<0.0001) (Fig 3). Thus the MAP provides important prognostic information over and above the change in clinical symptoms, further stratifying both responders and non-responders at four weeks of treatment. The MAP threshold classified patients both with and without significant lower GI symptoms because the MAP is a more specific measure of irreversible crypt damage in patients with copious diarrhea and more sensitive in patients with less than 0.5 liters of daily diarrhea (Fig 4).
We conclude that the MAP is, to our knowledge, the first validated laboratory test to serve as a response biomarker for the treatment of acute GVHD and a more accurate predictor of survival than clinical response after four weeks of treatment. The MAP may serve as a novel endpoint and an important complement to changes in clinical symptom severity in future trials of GVHD treatment.
Srinagesh:National Institutes of Health: Research Funding. Ozbek:Viracor: Patents & Royalties: Biomarker Patent. Ayuk:Novartis: Honoraria, Other: Advisory Board, Research Funding. Aziz:Doris Duke Charitable Foundation: Research Funding. Defilipp:Incyte: Research Funding. Grupp:Novartis: Consultancy, Research Funding; Roche: Consultancy; GSK: Consultancy; CBMG: Consultancy; Novartis: Research Funding; Kite: Research Funding; Servier: Research Funding; Jazz: Other: study steering committees or scientific advisory boards; Adaptimmune: Other: study steering committees or scientific advisory boards; Cure Genetics: Consultancy; Humanigen: Consultancy. Hexner:novartis: Research Funding. Kitko:Mallinckrodt: Honoraria; Novartis: Consultancy, Honoraria. Mielke:EBMT/EHA: Other: Travel support; ISCT: Other: Travel support; Miltenyi: Consultancy, Honoraria, Other: Travel and speakers fee (via institution), Speakers Bureau; Jazz Pharma: Honoraria, Other: Travel support, Speakers Bureau; IACH: Other: Travel support; Kiadis Pharma: Consultancy, Honoraria, Other: Travel support (via institution), Speakers Bureau; DGHO: Other: Travel support; Bellicum: Consultancy, Honoraria, Other: Travel (via institution); GILEAD: Consultancy, Honoraria, Other: travel (via institution), Speakers Bureau; Celgene: Honoraria, Other: Travel support (via institution), Speakers Bureau. Merli:Sobi: Consultancy; Amgen: Honoraria; Novartis: Honoraria; Bellicum: Consultancy. Pulsipher:Amgen: Other: Lecture; Miltenyi: Research Funding; Bellicum: Consultancy; Novartis: Consultancy, Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; Jazz: Other: Education for employees; CSL Behring: Membership on an entity's Board of Directors or advisory committees; Adaptive: Membership on an entity's Board of Directors or advisory committees, Research Funding; Medac: Honoraria. Qayed:Bristol-Myers Squibb: Honoraria. 
Reshef:Pfizer: Consultancy; Magenta: Consultancy; Kite: Consultancy, Research Funding; Atara: Consultancy, Research Funding; BMS: Consultancy; Pharmacyclics: Consultancy, Research Funding; Incyte: Consultancy, Research Funding; Celgene: Research Funding; Shire: Research Funding. Levine:Incyte: Consultancy, Research Funding; Biogen: Other: non-financial support; Viracor: Patents & Royalties: biomarker patent; Ironwood: Honoraria; bluebird bio: Consultancy; National Cancer Institute: Research Funding; Novartis: Honoraria; Kamada: Research Funding. Ferrara:National Institutes of Health: Research Funding; ViraCor: Consultancy; Incyte: Consultancy; Kamada: Consultancy; Mallinckrodt: Consultancy; Enlivex: Consultancy; Xenikos: Consultancy; CSL Behring: Consultancy.
Relapse of malignancy and lethal graft versus host disease (GVHD) are the principal causes of failure of allogeneic hematopoietic cell transplant (HCT). Recently we have shown that at seven days after HCT an algorithm using two serum biomarkers (ST2 and REG3α) can predict severe GVHD (Hartwell et al. JCI Insight 2017). We determined whether serial testing (in the first month following HCT) of patients with low probability biomarkers would improve the predictive accuracy of the algorithm and identify patients with different risks of relapse and lethal GVHD. Patients who received an HCT at 18 centers in the Mount Sinai Acute GVHD International Consortium (MAGIC) for hematologic malignancy and who supplied three blood samples were divided into a training set and validation set with equal numbers of lethal GVHD events, which was defined as death from GVHD or infection during treatment for GVHD. Patients in the training set (n=702) underwent HCT from January 1, 2006 until June 30, 2015, whereas patients in the validation set (n=906) underwent HCT from July 1, 2015 to May 1, 2017. Serum samples were analyzed using the previously published algorithm of two biomarkers up to three times (day 7, day 14, day 28 or GVHD onset, if onset occurred within the first 28 days). The algorithm generates a predicted probability of lethal GVHD between 0 and 1 for each patient. Patients were categorized as either low probability (LP) or high probability (HP) for lethal GVHD. HP thresholds of 0.20 and 0.16 were used to classify patients with and without GVHD symptoms, respectively (once categorized as HP, patients remained in that category and were not retested). All results were similar between training and validation sets, and we present here the validation set results.
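The serial categorization rule described above (a patient becomes HP at the first sample that reaches the applicable threshold and is not retested afterwards) can be illustrated with a short Python sketch. The thresholds 0.20 and 0.16 come from the text; the function name, the input layout, and the use of "greater than or equal to" at the threshold are assumptions made for illustration.

```python
# Illustrative sketch only: names and the ">=" comparison are assumptions.
from typing import Iterable, Tuple

HP_THRESHOLD_WITH_GVHD = 0.20     # from the text: patients with GVHD symptoms
HP_THRESHOLD_WITHOUT_GVHD = 0.16  # from the text: patients without GVHD symptoms


def classify_serial(samples: Iterable[Tuple[float, bool]]) -> str:
    """Return "HP" or "LP" from time-ordered (probability, has_gvhd_symptoms) samples.

    Samples are evaluated in order (for example day 7, day 14, then day 28 or
    GVHD onset). The first sample at or above its threshold makes the patient
    HP, and later samples are ignored because HP patients are not retested.
    """
    for probability, has_symptoms in samples:
        if has_symptoms:
            threshold = HP_THRESHOLD_WITH_GVHD
        else:
            threshold = HP_THRESHOLD_WITHOUT_GVHD
        if probability >= threshold:
            return "HP"
    return "LP"


# Hypothetical patients with three samples each.
print(classify_serial([(0.05, False), (0.17, False), (0.02, False)]))
print(classify_serial([(0.05, False), (0.17, True), (0.10, False)]))
```

In the second hypothetical patient the same probability of 0.17 does not trigger HP because the higher threshold for symptomatic patients applies at that timepoint.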
Serial testing identified 28% of patients as HP with a three-fold greater cumulative incidence of lethal GVHD at one year (13% vs 4%, p<0.001, Figure 1). Relapse rates were the same in both probability groups, and thus LP patients experienced significantly better relapse free survival (RFS) (69% vs 53%, p<0.001). As expected, significantly fewer LP patients experienced severe GVHD at onset (as measured by Minnesota risk), maximum grade III/IV GVHD, or steroid resistant GVHD by day 180 after HCT (Figure 2, p<0.001 for all three parameters). To measure the accuracy of prediction, at each timepoint we calculated the area under the curve (AUC) of receiver operating characteristic curves; the AUC increased significantly with each subsequent evaluation, from 0.59 at one timepoint (dotted line) to 0.74 at the third timepoint (solid line) (p<0.001), with a final sensitivity of 65% and specificity of 74% (Figure 3A).
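For readers less familiar with these accuracy measures, the following Python sketch shows how an AUC, a sensitivity, and a specificity can be computed from predicted probabilities and observed outcomes. It is a simplified illustration on made-up data: it treats the outcome as a plain yes/no label and ignores censoring and competing risks, which the study's own analysis may have handled differently. All names and values are hypothetical.

```python
# Illustrative sketch only: toy data, binary outcomes, no censoring.
from typing import Sequence, Tuple


def auc(scores: Sequence[float], events: Sequence[bool]) -> float:
    """AUC as the chance that a random event case outscores a random non-event case.

    Tied scores count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    scores: Sequence[float], events: Sequence[bool], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called high risk."""
    tp = sum(1 for s, e in zip(scores, events) if e and s >= threshold)
    fn = sum(1 for s, e in zip(scores, events) if e and s < threshold)
    tn = sum(1 for s, e in zip(scores, events) if not e and s < threshold)
    fp = sum(1 for s, e in zip(scores, events) if not e and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one event and one non-event")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up probabilities and outcomes for six hypothetical patients.
toy_scores = [0.05, 0.10, 0.18, 0.30, 0.25, 0.40]
toy_events = [False, False, True, False, True, True]
print(auc(toy_scores, toy_events))
print(sensitivity_specificity(toy_scores, toy_events, 0.16))
```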
Early development of GVHD (by day 28) is a risk factor for lethal GVHD. Therefore, we next plotted RFS (dashed line), relapse (solid line), and lethal GVHD (dotted line) rates in patients who developed GVHD by day 28. 25% of patients with GVHD were categorized as HP and had a cumulative incidence of lethal GVHD more than four times higher (28%) than that of relapse (6%); however the risks were reversed for the 75% of patients who were LP, where relapse (15%) occurred twice as often as lethal GVHD (7%) (Figure 3B). In patients who did not develop GVHD in the first month, this reversal of risks was even more dramatic. Approximately half (53%) of the entire validation cohort did not develop GVHD by day 28 and was LP at all three evaluations. These patients had an exceptionally low risk of lethal GVHD and thus they relapsed (25%) much more often than they died from GVHD (3%). When malignancies were classified according to risk for relapse by the disease risk index (DRI) (Figure 3C), the probability of relapse was three fold higher than lethal GVHD in malignancies with a low DRI (12%), six fold higher for intermediate DRI (20%), and eleven fold higher for high/very high DRI (33%).
We conclude that a serial monitoring strategy using GVHD biomarkers for one month after HCT is able to identify two groups of patients with very different risks of lethal GVHD and relapse. For these patients, the intensity of immunosuppression after day 28 could be tailored according to the probabilities of developing lethal GVHD and relapse in the context of clinical trials.
Aziz:Doris Duke Charitable Foundation: Research Funding. Ayuk:Therakos (Mallinckrodt): Honoraria; Novartis: Honoraria; Celgene: Consultancy; Gilead: Consultancy. Chen:REGiMMUNE: Consultancy; Incyte: Consultancy, Membership on an entity's Board of Directors or advisory committees; Magenta Therapeutics: Consultancy; Takeda Pharmaceuticals: Consultancy. Merli:Neovii Biotech: Honoraria; AMGEN: Honoraria. Roesler:Sanofi: Other: Travel, Accommodations, Expenses; Amgen: Equity Ownership; Jazz Pharmaceuticals: Other: Travel, Accommodations, Expenses; Immunomedics: Equity Ownership; Biogen: Equity Ownership; Merck: Consultancy; Pfizer: Consultancy. Kitko:Novartis: Consultancy, Honoraria; Mallinckrodt: Honoraria, Other: Travel, Accommodations, Expenses. Qayed:Novartis: Consultancy. Wölfl:Bristol-myers Squibb: Equity Ownership; Novartis: Equity Ownership; Taheda: Equity Ownership; Juno: Equity Ownership; Neovii: Other: Travel, Accommodations, Expenses. Mielke:Celgene: Speakers Bureau; DGHO: Speakers Bureau; EHA: Speakers Bureau; Kiadis Pharma: Speakers Bureau; Miltenyi: Speakers Bureau. Wudhikarn:Takeda Oncology: Other: Travel, Accommodations, Expenses. Nakamura:Celgene: Honoraria; Molmed: Honoraria; Merck: Consultancy; Pharmacyclics: Consultancy; Atara: Consultancy; Jazz Pharmaceuticals: Consultancy. Pulsipher:CSL Behring: Consultancy; Novartis: Consultancy, Honoraria, Speakers Bureau; Adaptive Biotech: Consultancy, Research Funding; Amgen: Honoraria. Reshef:Pfizer: Consultancy; Atara Biotherapeutics: Consultancy; Kite Pharma: Consultancy; Takeda Pharmaceuticals: Consultancy; Bristol-Myers Squibb: Consultancy; Incyte: Consultancy. Levine:Therakos: Consultancy; Novartis: Consultancy; Bluebird: Consultancy; Incyte: Consultancy; Kamada: Research Funding; Viracor: Patents & Royalties. 
Ferrara:Incyte: Consultancy, Honoraria, Other: Travel, Accommodations, Expenses; Xenikos: Consultancy, Other: Travel, Accommodations, Expenses; Kamada: Consultancy, Research Funding; Viracor: Consultancy, Patents & Royalties.
• Biomarker scores predicted risk of NRM better than clinical severity at the onset of GVHD treatment in children.
• A combined biomarker/clinical model was highly sensitive and specific for NRM in children with GVHD.
Acute graft versus host disease (GVHD) is a common and serious complication of allogeneic hematopoietic cell transplantation (HCT) in children but overall clinical grade at onset only modestly predicts response to treatment and survival outcomes. Two tools to assess risk at initiation of treatment were recently developed. The Minnesota risk system stratifies children for risk of nonrelapse mortality (NRM) according to the pattern of GVHD target organ severity. The Mount Sinai Acute GVHD International Consortium (MAGIC) algorithm of 2 serum biomarkers (ST2 and REG3α) predicts NRM in adult patients but has not been validated in a pediatric population. We aimed to develop and validate a system that stratifies children at the onset of GVHD for risk of 6-month NRM. We determined the MAGIC algorithm probabilities (MAPs) and Minnesota risk for a multicenter cohort of 315 pediatric patients who developed GVHD requiring treatment with systemic corticosteroids. MAPs created 3 risk groups with distinct outcomes at the start of treatment and were more accurate than Minnesota risk stratification for prediction of NRM (area under the receiver operating curve (AUC), .79 versus .62, P = .001). A novel model that combined Minnesota risk and biomarker scores created from a training cohort was more accurate than either biomarkers or clinical systems in a validation cohort (AUC .87) and stratified patients into 2 groups with highly different 6-month NRM (5% versus 38%, P < .001). In summary, we validated the MAP as a prognostic biomarker in pediatric patients with GVHD, and a novel risk stratification that combines Minnesota risk and biomarker risk performed best. Biomarker-based risk stratification can be used in clinical trials to develop more tailored approaches for children who require treatment for GVHD.
We used a rigorous PRoBE study design to compare the ability of biomarkers of systemic inflammation and biomarkers of GI tissue damage to predict response to corticosteroid treatment, the incidence of clinically severe disease, 6-month nonrelapse mortality (NRM), and overall survival in patients with acute graft vs. host disease (GVHD). We prospectively collected serum samples of newly diagnosed GVHD patients (n=730) from 19 centers, divided them into training (n=352) and validation cohorts (n=378), and measured TNFR1, TIM3, IL6, ST2, and REG3α via ELISA. Performances of the 4 strongest algorithms from the training cohort (TNFR1+TIM3, TNFR1+ST2, TNFR1+REG3α, ST2+REG3α) were evaluated in the validation cohort. The algorithm that included only biomarkers of systemic inflammation (TNFR1+TIM3) had a significantly smaller area under the curve (AUC, 0.57) than the AUCs of algorithms that contained at least 1 GI damage biomarker (TNFR1+ST2, 0.70; TNFR1+REG3α, 0.73; ST2+REG3α, 0.79; all p<0.001). All 4 algorithms were able to predict short-term outcomes such as response to systemic corticosteroids and severe GVHD, but inclusion of a GI damage biomarker was needed to predict long-term outcomes such as 6-month NRM and survival. The algorithm that included 2 GI damage biomarkers was the most accurate of the 4 algorithms for all endpoints.
Introduction: A subset of patients with borderline hypoplastic left-sided structures (BHLSS) may be candidates for single to biventricular conversion (BiVC), but significant long-term morbidity and mortality persist. Prior studies have shown conflicting results regarding the association of preoperative diastolic dysfunction and outcome, and patient selection for BiVC remains challenging.
Hypothesis: Preoperative parameters consistent with diastolic dysfunction such as elevated LV end diastolic pressure (EDP) and history of LV endocardial fibroelastosis (EFE) are associated with adverse outcome after BiVC.
Methods: Patients with BHLSS and single ventricle physiology undergoing BiVC from 2005-2017 were included. Cox regression was used to identify preoperative parameters associated with a composite outcome after BiVC of time to the earliest occurrence of mortality, transplant, takedown to single ventricle circulation, or hemodynamic failure (defined as LV EDP > 20 mmHg, mean pulmonary artery pressure > 35 mmHg, or PVR > 6 iWU).
Results: Among 43 patients, 20 (47%) met the primary outcome, with a median time to outcome of 5.2 years. On univariate analysis, history of EFE, lower MRI LV end-diastolic volume/BSA (EDVi, when < 50 ml), lower MRI LV stroke volume/BSA (SVi, when < 32 ml/m²), and lower left:right ventricular SV ratio (when < 0.7) were associated with increased risk of the outcome; higher preoperative LV EDP was not. Multivariable analysis demonstrated that presence of EFE (hazard ratio 5.1, 95% CI 1.5-22.7, p = 0.033) and LV SVi < 28 ml/m² (lowest tertile, hazard ratio 4.3, 95% CI 1.5-12.3, p = 0.006) were independently associated with a higher hazard of the outcome. Nearly all (86%) patients with EFE and LV SVi < 28 ml/m² met the outcome in comparison to only 10% of those without EFE and with higher LV SVi.
Conclusion: History of EFE and smaller LV SVi are independent predictors of a composite adverse outcome among patients with BHLSS undergoing BiVC.
Relapse is the most common cause of death after HCT. However, given the heterogeneity of diseases and disease states among patients (pts), risk for relapse is often difficult to manage in clinical trials when survival is the endpoint. The Disease Risk Index (DRI) predicts overall survival (OS) based on pre-HCT disease characteristics such as type of disease, remission status, and cytogenetic abnormalities and can be used to group heterogeneous diseases into 4 risk strata: very high risk (VHR, 1% of pts), high risk (HR, 23%), intermediate risk (IR, 64%), and low risk (LR, 12%) (Armand, Blood, 2014). OS for each of these strata is distinctly different. The DRI has been validated for adults, but not for children. Furthermore, the DRI cannot be calculated for some pediatric diseases like juvenile myelomonocytic leukemia (JMML). To determine its validity for children we calculated the DRI for 280 pediatric pts transplanted between 2008 and 2018 at Mount Sinai Acute GVHD International Consortium centers. The median age was 8.9 years (range, 0.4 - 17). The most common indications for HCT were ALL (51%), AML (31%), MDS (10%), and JMML (4%). Donors were adult unrelated donors (48%), unrelated cord blood (20%), haploidentical donors (17%), and matched siblings (15%). We first determined 2y survival by DRI for the 270 pts without JMML. Survival for JMML was closest to that of the VHR stratum, and we grouped JMML with VHR for all further analyses. The sizes of the risk strata differed for children compared to adults: VHR (n=32, 11%), HR (n=108, 39%), IR (n=128, 46%), and LR (n=12, 4%). Given the small number of LR pts, we excluded them from further analyses. We then determined that DRI effectively stratified pts into distinct strata for relapse (FIG 1A) and disease-free survival (FIG 1B).
Because transplant practices are different for children compared to adults (e.g., more myeloablative conditioning), we examined whether DRI remained predictive of outcomes after adjustment for age, donor type, stem cell source, and conditioning intensity. We used IR as the reference group for these multivariate analyses. Children with HR DRI were significantly more likely to relapse (hazard ratio 2.1, 95% CI 1.2-3.7) and have worse disease-free survival (DFS) (hazard ratio 1.6, 95% CI 1.0-2.5). The risks for relapse and worse DFS were even greater for children with VHR DRI (relapse, hazard ratio 3.4, 95% CI 1.6-7.3; DFS, hazard ratio 3.59, 95% CI 2.0-6.5). There were no significant differences in 2y non-relapse mortality by DRI. VHR pts had significantly worse 2y OS (hazard ratio 3.2, 95% CI 1.6-6.4), but 2y OS for HR pts was not significantly different from that of IR pts (hazard ratio 1.3, 95% CI 0.7-2.2). OS for HR and IR pts may diverge with longer follow-up. In summary, the LR stratum is small and more pts are needed to determine their outcomes. However, for >95% of children the DRI creates 3 distinct risk strata. The DRI can be used in pediatric clinical trials to stratify for risk of relapse.
There are no validated biomarkers that measure a patient's response to therapy for acute graft-versus-host disease (GVHD), the leading cause of non-relapse mortality (NRM) after allogeneic hematopoietic cell transplant (HCT). Recent studies from the Mount Sinai Acute GVHD International Consortium (MAGIC) have validated an algorithm probability (MAP) that combines serum concentrations of two biomarkers of GVHD (REG3α and ST2) to generate an estimated probability of 6 month NRM for individual patients. The MAP estimates GVHD-mediated damage to crypts throughout the lower GI tract at single time points (Hartwell et al., JCI Insight, 2017; Major-Monfried et al., Blood, 2018). We hypothesized that the change in MAP between start of treatment and 28 days later could serve as a response biomarker and would compare favorably to clinical response, the gold standard which is widely used as a surrogate for long term survival and is the primary endpoint in most GVHD treatment trials (Martin et al., BBMT, 2009; MacMillan et al., Blood, 2010).
We prospectively collected serum samples and clinical staging from 368 sequential HCT patients who received systemic treatment for acute GVHD in one of 20 MAGIC centers between January 2016 and February 2018. We computed MAPs and clinical responses for each patient.
MAPs of patients who experienced 6 month NRM increased significantly compared to MAPs of patients who survived (p=0.0004). In patients whose initial MAPs were low (Ann Arbor 1, MAP < 0.141) or intermediate (Ann Arbor 2, 0.141 ≤ MAP ≤ 0.290), 6 month NRM clustered among those with the greatest increases in MAP after 28 days (Fig 1A,B). In patients with high initial MAPs (Ann Arbor 3, MAP > 0.290), those who survived tended to have the largest decreases in MAP (Fig 1C). We found that patients whose MAPs rose above the previously determined high-risk threshold MAP of 0.290 had significantly worse survival compared to those who remained below it, whereas the large number of patients with initially high MAPs that remained above the threshold had a large increase in mortality (Fig 2).
When measured at day 28, MAPs predicted NRM more accurately than clinical responses, with areas under the receiver operating characteristic curve (AUC) of 0.86 and 0.70, respectively (p<0.0001). An algorithm that combined clinical response with biomarkers generated an AUC that did not differ significantly from that of the MAP alone (0.83 v 0.86, p = NS). The same MAP threshold predicted risk of NRM within both clinical responders and non-responders after four weeks of treatment (Fig 3), and in those with and without significant lower GI symptoms (Fig 4).
We conclude that the MAP is, to our knowledge, the first laboratory test validated as a response biomarker for acute GVHD treatment and more accurately predicts survival than clinical response after 28 days of treatment. The MAP may serve as a novel endpoint in future trials of GVHD treatment.
Acute GVHD biomarkers predict long-term outcomes after GVHD diagnosis. Our group has validated the MAGIC algorithm, which uses the concentrations of 2 GVHD biomarkers, ST2+REG3a, to predict lethal GVHD (defined as death from GVHD without relapse; Hartwell 2017). Other published biomarker combinations that predict GVHD outcomes include ST2+REG3a+TNFR1 (Levine 2015), ST2+TIM3 (Abu Zaid 2017), ST2+TNFR1 (McDonald 2015), TIM3+TNFR1+IL6 (McDonald 2015), as well as AREG alone (Holtan 2018). It is not clear which biomarker combination best predicts lethal GVHD because these algorithms were developed using different patient cohorts with different endpoints. To answer this question, we compared the predictive accuracy of different biomarker combinations as well as these six specific combinations in the same patient cohort. We studied 522 patients with serum samples at GVHD diagnosis who were transplanted at 19 Mount Sinai Acute GVHD International Consortium (MAGIC) centers between January 1, 2004 and April 30, 2017. Patients were divided into training (n = 253) and validation (n = 269) sets; validation patients were transplanted after November 1, 2015 and had not previously been used to generate an algorithm. Biomarkers were measured by ELISA and log-transformed values were used to predict 1-year lethal GVHD by competing risk regression. Four of the 6 biomarkers (ST2, REG3a, TNFR1, and TIM3) independently predicted lethal GVHD in the training set. We then developed algorithms that predicted lethal GVHD using all possible combinations of these 4 biomarkers as well as the published combination of TIM3+TNFR1+IL6 and AREG alone. The best algorithms of 1-4 biomarkers for predicting lethal GVHD were REG3a alone, ST2+REG3a, ST2+REG3a+TNFR1, and ST2+REG3a+TNFR1+TIM3. While ST2+REG3a was the most accurate combination based on the lowest Akaike Information Criterion (AIC), the other published biomarker combinations produced similar AICs.
Therefore, we used the independent validation set to compare all six published algorithms. We generated the area under the receiver operating characteristic curve (AUC) for each algorithm (FIG 1). We next determined the threshold that maximized sensitivity and specificity for each algorithm and calculated the cumulative incidence of lethal GVHD in the resulting high and low risk strata (FIG 2). Highly similar results were obtained when the cumulative incidence of non-relapse mortality was used as the endpoint. In a validation set of previously unanalyzed patients, several biomarker combinations reproducibly stratified patients with GVHD for risk of death. The best 4 algorithms, all of which included ST2, produced comparable outcomes. Adding TNFR1 to ST2+REG3a did not improve accuracy. The ST2+REG3a algorithm best identifies patients at onset of GVHD for high risk of lethal GVHD and NRM, and thus remains a standard for biomarker-based risk prediction.
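One common way to choose "the threshold that maximized sensitivity and specificity" is to maximize Youden's J statistic (sensitivity + specificity - 1) over candidate cutoffs. The abstract does not state which criterion was used, so the Python sketch below is an assumption for illustration, run on made-up data with hypothetical names, and it treats outcomes as simple yes/no labels without censoring or competing risks.

```python
# Illustrative sketch only: Youden's J is an assumed criterion; data are made up.
from typing import Sequence, Tuple


def best_threshold(scores: Sequence[float], events: Sequence[bool]) -> Tuple[float, float]:
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    Each distinct observed score is tried as a cutoff, and scores at or above
    the cutoff are called high risk. Ties in J keep the lowest cutoff.
    """
    n_pos = sum(1 for e in events if e)
    n_neg = len(events) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one event and one non-event")
    best_t, best_j = float("nan"), -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, e in zip(scores, events) if e and s >= t)
        tn = sum(1 for s, e in zip(scores, events) if not e and s < t)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Made-up algorithm outputs and outcomes for six hypothetical patients.
toy_scores = [0.05, 0.10, 0.18, 0.30, 0.25, 0.40]
toy_events = [False, False, True, False, True, True]
threshold, youden_j = best_threshold(toy_scores, toy_events)
print(threshold, round(youden_j, 3))
```

Patients at or above the returned cutoff would form the high risk stratum and the rest the low risk stratum, after which the cumulative incidence of the endpoint can be compared between the two groups.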