Systemic lupus erythematosus (SLE) is known to be clinically heterogeneous. Previous efforts to characterize subsets of SLE patients based on gene expression analysis have not been reproduced because of small sample sizes or technical problems. The aim of this study was to develop a robust patient stratification system using gene expression profiling to characterize individual lupus patients.
We employed gene set variation analysis (GSVA) of informative gene modules to identify molecular endotypes of SLE patients, machine learning (ML) to classify individual patients into molecular subsets, and logistic regression to develop a composite metric estimating the scope of immunologic perturbations. SHapley Additive ExPlanations (SHAP) revealed the impact of specific features on patient sub-setting.
Using five datasets comprising 2183 patients, eight SLE endotypes were identified. Expanded analysis of 3166 samples in 17 datasets revealed that each endotype had unique gene enrichment patterns, but not all endotypes were observed in all datasets. ML algorithms trained on 2183 patients and tested on 983 patients not used to develop the model demonstrated effective classification into one of eight endotypes. SHAP indicated a unique array of features influential in sorting individual samples into each of the endotypes. A composite molecular score was calculated for each patient and significantly correlated with standard laboratory measures. Significant differences in clinical characteristics were associated with different endotypes, with those with the least perturbed transcriptional profile manifesting lower disease severity. The more abnormal endotypes were significantly more likely to experience a severe flare over the subsequent 52 weeks while on standard-of-care medication and specific endotypes were more likely to be clinical responders to the investigational product tested in one clinical trial analyzed (tabalumab).
Transcriptomic profiling and ML reproducibly separated lupus patients into molecular endotypes with significant differences in clinical features, outcomes, and responsiveness to therapy. Our classification approach using a composite scoring system based on underlying molecular abnormalities has both staging and prognostic relevance.
Background: SLE patients exhibit considerable clinical and molecular heterogeneity. A robust patient stratification approach can help to characterize individual lupus patients more effectively and aid patient care.
Methods: We employed gene set variation analysis (GSVA) of informative gene modules and k-means clustering to identify molecular endotypes of SLE patients based on dysregulation of specific biologic pathways and interrogated them for clinical utility. We utilized machine learning (ML) of these molecular profiles to classify individual lupus patients into singular molecular subsets and used logistic regression with ridge penalization to develop a novel, composite metric estimating the severity of disease based on lupus-related immunologic activity. Shapley Additive Explanation (SHAP) was employed to understand the impact of specific molecular features on the patient sub-setting.
Results: Six molecular endotypes were identified in a proof-of-concept cohort from the Illuminate trials (GSE88884) using baseline gene expression profiles. Significant differences in clinical characteristics were associated with different endotypes, with the least perturbed transcriptional profile manifesting the lowest disease activity, and endotypes with more perturbed transcriptional profiles exhibiting more severe disease activity. The more abnormal endotypes were also identified as more likely to have a severe flare over the 52 weeks of the trial and specific endotypes were more likely to be clinical responders to the investigational product (tabalumab). GSVA and k-means clustering of 3166 samples in 17 datasets revealed a total of eight SLE molecular endotypes, each with unique gene enrichment patterns, but not all endotypes were observed in all datasets.
ML algorithms were trained and validated on 2183 patients from GSE88884 (ILLUMINATE-1 and ILLUMINATE-2) and three additional datasets (GSE116006, GSE65391, and GSE45291) and demonstrated high degrees of accuracy (98%), precision (94%), sensitivity, and specificity in classifying patients into one of the eight endotypes. A composite molecular score, which comprised aggregate molecular scores of each GSVA gene module, was calculated for each lupus patient. A subset of patients was identified whose molecular scores were not different than those found in normal subjects, whereas other subsets of lupus patients had progressively higher scores indicative of the aggregation of molecular abnormalities. The composite molecular scores were significantly correlated with both anti-DNA titers and SLEDAI. Finally, SHAP analysis of the impact of input GSVA scores indicated that a unique array of features was influential in sorting individual samples into each of the molecular endotypes.
Conclusions: Transcriptomic profiling and ML allowed for reproducible separation of lupus patients into molecular endotypes with significant differences in clinical outcomes and responsiveness to therapy. Gene expression profiles were reduced to a score to assess lupus-related immune activity that correlated with clinical features, the implementation of which may provide a means to categorize lupus patients numerically based on the nature of each individual’s underlying molecular abnormalities.
Lay Summary: Lupus patients present with arrays of symptoms that are highly variable, which we describe as heterogeneity. This heterogeneity is also present at a molecular level which means the biological mechanisms underlying disease differ from patient to patient at a given moment in time. We have addressed the clinical challenges presented by this heterogeneity by developing a new way to identify endotypes, or subsets of patients with commonalities in these underlying mechanisms.
We used data from thousands of patients in multiple datasets to ensure we are representing the likely universe of lupus patients and used computational algorithms to not only subset the patients but also develop machine learning models that can accurately predict subset (endotype) membership. Finally, the underlying molecular commonalities among these subsets were simplified to the calculation of a single score reflecting an individual patient’s current status of immunologic perturbation. Together, these analyses should provide a new way to categorize lupus patients based on information not currently captured in clinical settings.
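The idea of reducing several module-level enrichment scores to one composite score with ridge-penalized logistic regression can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the toy module scores, the penalty strength `lam`, the learning rate, the use of plain gradient descent, and the choice of the log-odds as the composite score. The abstract does not specify how the published score was fitted or scaled.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def composite_score(x, w, b):
    """Linear predictor (log-odds), used here as a hypothetical composite score."""
    return b + sum(wj * xj for wj, xj in zip(w, x))


def fit_ridge_logistic(X, y, lam=0.1, lr=0.5, epochs=2000):
    """Batch gradient descent on mean log-loss + (lam / 2) * ||w||^2.

    The intercept b is left unpenalized.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(composite_score(xi, w, b)) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        # Ridge term lam * wj shrinks each weight toward zero.
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, grad_w)]
    return w, b


def make_toy_data(n_per_group=40, n_modules=3, seed=0):
    """Hypothetical module scores: controls centered at -0.3, patients at +0.3."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, -0.3), (1, 0.3)):
        for _ in range(n_per_group):
            X.append([rng.gauss(center, 0.2) for _ in range(n_modules)])
            y.append(label)
    return X, y


X, y = make_toy_data()
w, b = fit_ridge_logistic(X, y)
scores = [composite_score(xi, w, b) for xi in X]
mean_control = sum(s for s, yi in zip(scores, y) if yi == 0) / y.count(0)
mean_patient = sum(s for s, yi in zip(scores, y) if yi == 1) / y.count(1)
print(f"mean composite score, controls: {mean_control:.2f}")
print(f"mean composite score, patients: {mean_patient:.2f}")
```

On this toy data the patient group should receive a higher mean score than the control group. The ridge penalty keeps the weights small when module scores are correlated, which is one plausible reason to prefer it for gene-module inputs.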
Genetic variants in human microRNA (miRNA) genes may alter mature miRNA processing and/or target selection, and likely contribute to cancer susceptibility and disease progression. Previous studies have suggested that miR-101 may play important roles in the development of cancer by regulating key tumor-associated genes. However, the role of single nucleotide polymorphisms (SNPs) of miR-101 in breast cancer susceptibility remains unclear. In this study, we genotyped 11 SNPs of the miR-101 genes (including miR-101-1 and miR-101-2) in a case-control study of 1064 breast cancer cases and 1073 cancer-free controls. The results revealed that rs462480 and rs1053872 in the flanking regions of pre-miR-101-2 were significantly associated with increased risk of breast cancer (rs462480 AC/CC vs AA: adjusted OR = 1.182, 95% CI: 1.030-1.357, P = 0.017; rs1053872 CG/GG vs CC: adjusted OR = 1.179, 95% CI: 1.040-1.337, P = 0.010). However, the remaining 9 SNPs were not significantly associated with risk of breast cancer. Additionally, combined analysis of the two high-risk SNPs revealed that subjects carrying the variant genotypes of rs462480 and rs1053872 had increased risk of breast cancer in a dose-response manner (P(trend) = 0.002). Compared with individuals with "0-1" risk alleles, those carrying "2-4" risk alleles had a 1.29-fold risk of breast cancer. In conclusion, these findings suggested that the SNPs rs462480 and rs1053872 residing in the miR-101-2 gene may have a solid impact on genetic susceptibility to breast cancer, which may improve our understanding of the potential contribution of miRNA SNPs to cancer pathogenesis.
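The reported odds ratios, confidence intervals, and P values can be checked against each other with a short calculation. The sketch below assumes the intervals are Wald intervals on the log-odds scale and that the P values come from a two-sided z test; the abstract does not state either, so this is a consistency check rather than a reconstruction of the authors' analysis. The function name is hypothetical.

```python
import math


def wald_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover SE, z, and a two-sided P value from an OR and its 95% CI.

    Assumes a symmetric Wald interval on the log-odds scale.
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values as reported in the abstract above.
reported = {
    "rs462480": (1.182, 1.030, 1.357),
    "rs1053872": (1.179, 1.040, 1.337),
}
for snp, (or_, lo, hi) in reported.items():
    se, z, p = wald_p_from_or_ci(or_, lo, hi)
    print(f"{snp}: SE(log OR) = {se:.3f}, z = {z:.2f}, P = {p:.3f}")
```

Under these assumptions the recovered P values should land close to the reported 0.017 and 0.010; small differences are expected because the published figures are rounded.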
Single-nucleotide polymorphisms (SNPs) at 6q25.1 that are associated with breast cancer susceptibility have been identified in several genome-wide association studies (GWASs). However, the exact causal variants in this region have not been clarified.
In the present study, we genotyped six potentially functional SNPs within the CCDC170 and ESR1 gene regions at 6q25.1 and assessed their associations with risk of breast cancer in a study of 1,064 cases and 1,073 cancer-free controls in Chinese women. The biological function of the risk variant was further evaluated by performing laboratory experiments.
Breast cancer risk was significantly associated with three SNPs located at 6q25.1 (rs9383935 in CCDC170, and rs2228480 and rs3798758 in ESR1), with variant-allele odds ratios (ORs) of 1.38 (95% confidence interval (CI): 1.20 to 1.57, P = 2.21 × 10^-6), 0.84 (95% CI: 0.72 to 0.98, P = 0.025) and 1.19 (95% CI: 1.04 to 1.37, P = 0.013), respectively. The functional variant rs9383935 is in high linkage disequilibrium (LD) with the GWAS-reported top-hit SNP (rs2046210), but only rs9383935 showed a strong independent effect in conditional regression analysis. The rs9383935 risk allele A showed decreased reporter gene activity in both the MCF-7 and BT-474 breast cancer cell lines, which might be due to an altered binding capacity of miR-27a to the 3' untranslated region (3' UTR) sequence of CCDC170. Real-time quantitative reverse transcription PCR confirmed the correlation between rs9383935 genotypes and CCDC170 expression levels.
The results of this study suggest that the functional variant rs9383935, located in the 3' UTR of CCDC170, may be one of the candidate causal variants at 6q25.1 that modulate the risk of breast cancer.
Esophageal cancer and gastric cancer have shared risk factors and inherited susceptibility. Recent genome‐wide association studies have identified multiple genetic loci associated with gastric cancer risk, which may also be involved in the development of esophageal cancer. Herein, we evaluated the relationship of gastric cancer risk‐related variants at 1q22, 3q13.3, 5p13.1, and 8q24 with the risk of esophageal squamous cell carcinoma (ESCC) in a Chinese population with a case–control study (2139 cases and 2273 controls). We found that the T allele of rs2294008, an intronic variant of the PSCA gene at 8q24 that was previously associated with an increased risk of gastric cancer, was associated with a decreased risk of ESCC (odds ratio = 0.90; 95% confidence interval, 0.81–0.99; P = 0.034). Of interest, the association of rs2294008 with ESCC was consistent with that observed in esophageal adenocarcinoma and ESCC in Caucasian populations. However, no significant associations were observed for the other three variants at 1q22 (rs4072037), 3q13.31 (rs9841504), and 5p13.1 (rs13361707). Our findings suggest that the susceptibility locus of PSCA at 8q24 may be a double‐edged sword, acting as a modulator between the carcinogenesis processes of the stomach and esophagus.
We evaluated the relationship of gastric cancer (GC) risk‐related variants at 1q22, 3q13.3, 5p13.1 and 8q24 with the risk of esophageal squamous cell carcinoma (ESCC) in a Chinese population with 2139 cases and 2273 controls. We found that the T allele of rs2294008, an intronic variant of PSCA at 8q24 that was associated with an increased risk of GC, was associated with a decreased risk of ESCC (OR = 0.90, 95% CI: 0.81–0.99, P = 0.034).
A recent genome-wide association study (GWAS) has identified three new breast cancer susceptibility loci at 12p11, 12q24 and 21q21 in populations of European descent. However, because of genetic heterogeneity, the role of these loci in breast cancer susceptibility in populations of non-European descent is largely unknown.
Here, we genotyped three variants (rs10771399 at 12p11, rs1292011 at 12q24 and rs2823093 at 21q21) in an independent case-control study with a total of 1792 breast cancer cases and 1867 cancer-free controls in a Chinese population. We found that rs10771399 and rs1292011 were significantly associated with risk of breast cancer with per-allele odds ratios (ORs) of 0.85 (95% confidence interval (CI): 0.76-0.96; P = 0.010) and 0.84 (95% CI: 0.76-0.95; P = 4.50 × 10^-3), respectively, which was consistent with those reported in populations of European descent. Similar effects were observed between ER/PR positive and negative breast cancer for both loci. However, we did not find a significant association between rs2823093 and breast cancer risk (OR = 0.97, 95% CI: 0.76-1.24; P = 0.795).
Our results indicate that genetic variants at 12p11 and 12q24 may also play an important role in breast cancer development in Chinese women.
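A per-allele odds ratio is usually read under a log-additive (multiplicative) model, in which each additional copy of the effect allele multiplies the odds by the same factor. The sketch below shows what the two reported per-allele ORs would imply for carriers of one and two copies under that assumption. The abstract does not report genotype-specific estimates, so these are implied values only, and the function name is hypothetical.

```python
def genotype_odds_ratios(per_allele_or):
    """Implied ORs for 0, 1, and 2 copies of the effect allele.

    Assumes a log-additive model: OR(k copies) = per_allele_or ** k.
    """
    return {copies: per_allele_or ** copies for copies in (0, 1, 2)}


# Per-allele ORs as reported in the abstract above.
for snp, per_allele in {"rs10771399": 0.85, "rs1292011": 0.84}.items():
    implied = genotype_odds_ratios(per_allele)
    print(f"{snp}: 1 copy = {implied[1]:.2f}, 2 copies = {implied[2]:.4f}")
```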
Chronic hepatitis B virus (HBV) infection is a challenging global health problem. To identify genetic loci involved in chronic HBV infection, we designed a three-phase genome-wide association study in Han Chinese populations. The discovery phase included 951 HBV carriers (cases) and 937 individuals who had naturally cleared HBV infection (controls) and was followed by independent replications with a total of 2,248 cases and 3,051 controls and additional replications with 1,982 HBV carriers and 2,622 controls from the general population. We identified two new loci associated with chronic HBV infection: rs3130542 at 6p21.33 (near HLA-C, odds ratio (OR) = 1.33, P = 9.49 × 10^-14) and rs4821116 at 22q11.21 (in UBE2L3, OR = 0.82, P = 1.71 × 10^-12). Additionally, we replicated the previously identified associations of HLA-DP and HLA-DQ variants at 6p21.32 with chronic HBV infection. These findings highlight the importance of HLA-C and UBE2L3 in the clearance of HBV infection in addition to HLA-DP and HLA-DQ.