Single-cell RNA-sequencing (scRNA-seq) has made it possible to profile gene expression in tissues at high resolution. An important preprocessing step prior to performing downstream analyses is to identify and remove cells with poor or degraded sample quality using quality control (QC) metrics. Two widely used QC metrics to identify a ‘low-quality’ cell are (i) if the cell includes a high proportion of reads that map to mitochondrial DNA (mtDNA) encoded genes and (ii) if a small number of genes are detected. Current best practices use these QC metrics independently with either arbitrary, uniform thresholds (e.g. 5%) or biological context-dependent (e.g. species) thresholds, and fail to jointly model these metrics in a data-driven manner. Current practices are often overly stringent and especially untenable on certain types of tissues, such as archived tumor tissues, or tissues associated with mitochondrial function, such as kidney tissue [1]. We propose a data-driven QC metric (miQC) that jointly models both the proportion of reads mapping to mtDNA genes and the number of detected genes with mixture models in a probabilistic framework to predict the low-quality cells in a given dataset. We demonstrate how our QC metric easily adapts to different types of single-cell datasets to remove low-quality cells while preserving high-quality cells that can be used for downstream analyses. Our software package is available at https://bioconductor.org/packages/miQC.
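The probabilistic filtering idea can be illustrated with a minimal Python sketch. It is not the miQC implementation: it assumes a two-component mixture in which each component is a linear trend of percent mitochondrial reads on the number of detected genes with Gaussian noise, and every parameter value, the function names, and the 0.75 posterior cutoff are hypothetical choices made only for illustration. In practice the parameters would be estimated from the dataset rather than fixed by hand.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

# Hypothetical, hand-picked mixture parameters (illustrative only, not from the paper).
# Each component models percent mitochondrial reads as a line in the number of genes.
INTACT = {"weight": 0.8, "intercept": 6.0, "slope": -0.001, "sd": 2.0}
COMPROMISED = {"weight": 0.2, "intercept": 45.0, "slope": -0.010, "sd": 8.0}

def posterior_compromised(n_genes, pct_mito):
    """Posterior probability that a cell belongs to the compromised component."""
    def weighted_density(component):
        mean = component["intercept"] + component["slope"] * n_genes
        return component["weight"] * normal_pdf(pct_mito, mean, component["sd"])
    intact = weighted_density(INTACT)
    compromised = weighted_density(COMPROMISED)
    return compromised / (intact + compromised)

def keep_cell(n_genes, pct_mito, cutoff=0.75):
    """Keep a cell unless its posterior of being compromised exceeds the cutoff."""
    return posterior_compromised(n_genes, pct_mito) <= cutoff

# A cell with many detected genes and few mitochondrial reads is kept;
# a cell with few detected genes and many mitochondrial reads is removed.
print(keep_cell(3000, 4.0), keep_cell(500, 35.0))
```

Because the decision depends on both metrics at once, a cell with a moderately high mitochondrial fraction can still be retained if that fraction is typical for its number of detected genes, which a single fixed threshold cannot express.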
Type 1 diabetes mellitus (T1DM) is a rare but serious immune-related adverse event (irAE) of immune checkpoint inhibitors (ICIs). Our goal was to characterize treatment outcomes associated with ICI-induced T1DM through analysis of clinical, immunological and proteomic data.
This was a single-center case series of patients with solid tumors who received ICIs and subsequently had a new diagnosis of T1DM. ICD codes and C-peptide levels were used to identify patients for chart review to confirm ICI-induced T1DM. Baseline blood specimens were studied for proteomic and immunophenotypic changes.
Between 2011 and 2023, 18 of 3744 patients treated at Huntsman Cancer Institute with ICIs were confirmed to have ICI-induced T1DM (0.48%). Eleven of the 18 patients received anti-PD1 monotherapy, 4 received anti-PD1 plus chemotherapy or targeted therapy, and 3 received ipilimumab plus nivolumab. The mean time to onset was 218 days (range 22-418 days). Patients had a sudden elevation in serum glucose within 2-3 weeks prior to diagnosis. Sixteen (89%) presented with diabetic ketoacidosis. Three of 12 patients had positive T1DM-associated autoantibodies. All patients with T1DM remained insulin-dependent through follow-up. At a median follow-up of 21.9 months (range 8.4-82.4), no patients in the melanoma group had progressed or died from disease. In the melanoma group, best responses were 2 complete responses and 2 partial responses while on active treatment; none in the adjuvant group had disease recurrence. Proteomic analysis of baseline blood suggested low inflammatory (IL-6, OSMR) markers and high metabolic (GLO1, DXCR) markers in the ICI-induced T1DM cohort.
Our case series demonstrates rapid onset and irreversibility of ICI-induced T1DM. Melanoma patients with ICI-induced T1DM display excellent clinical response and survival. Limited proteomic data also suggested a unique proteomic profile. Our study helps clinicians to understand the unique clinical presentation and long-term outcomes of this rare irAE for best clinical management.
Abstract
Background
Pooling cells from multiple biological samples prior to library preparation within the same single-cell RNA sequencing experiment provides several advantages, including lower library preparation costs and reduced unwanted technological variation, such as batch effects. Computational demultiplexing tools based on natural genetic variation between individuals provide a simple approach to demultiplex samples, which does not require complex additional experimental procedures. However, to our knowledge these tools have not been evaluated in cancer, where somatic variants, which could differ between cells from the same sample, may obscure the signal in natural genetic variation.
Results
Here, we performed in silico benchmark evaluations by combining raw sequencing reads from multiple single-cell samples in high-grade serous ovarian cancer, which has a high copy number burden, and lung adenocarcinoma, which has a high tumor mutational burden. Our results confirm that genetic demultiplexing tools can be effectively deployed on cancer tissue using a pooled experimental design, although high proportions of ambient RNA from cell debris reduce performance.
Conclusions
This strategy provides significant cost savings through pooled library preparation. To facilitate similar analyses at the experimental design phase, we provide freely accessible code and a reproducible Snakemake workflow built around the best-performing tools found in our in silico benchmark evaluations, available at https://github.com/lmweber/snp-dmx-cancer.
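The principle behind genetic demultiplexing, as described in the Background, can be sketched in a few lines of Python. This is a simplified illustration and not the workflow in the linked repository: it assumes donor genotypes are already known, treats reads as independent, and uses a hypothetical fixed sequencing error rate; the function names and toy data are invented for the example. Real tools use more elaborate models and also handle doublets and ambient RNA.

```python
import math

def alt_read_probability(dosage, error_rate=0.01):
    """Chance that a read shows the alternate allele, given a donor genotype
    with 0, 1, or 2 alternate-allele copies and a per-read error rate."""
    alt_fraction = dosage / 2
    return alt_fraction * (1 - error_rate) + (1 - alt_fraction) * error_rate

def assign_donor(cell_reads, donor_genotypes, error_rate=0.01):
    """Return the donor whose genotypes best explain one cell's reads.

    cell_reads: list of (snp_index, is_alt) pairs, one per read.
    donor_genotypes: dict mapping donor name to a list of dosages per SNP.
    """
    log_likelihoods = {}
    for donor, genotypes in donor_genotypes.items():
        total = 0.0
        for snp_index, is_alt in cell_reads:
            p_alt = alt_read_probability(genotypes[snp_index], error_rate)
            total += math.log(p_alt if is_alt else 1 - p_alt)
        log_likelihoods[donor] = total
    best = max(log_likelihoods, key=log_likelihoods.get)
    return best, log_likelihoods

# Toy data: two donors genotyped at four SNPs, and the reads from one cell.
donors = {"donor_A": [0, 2, 1, 0], "donor_B": [2, 0, 1, 2]}
reads = [(0, False), (1, True), (3, False), (2, True)]
best_donor, scores = assign_donor(reads, donors)
print(best_donor)
```

Under this model a single somatic variant changes only one term of the sum, so the assignment stays stable as long as enough germline SNPs are covered, which is consistent with the concern the abstract raises about somatic variation obscuring the signal.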
The aldehyde dehydrogenase 2 (ALDH2) polymorphism rs671 (Glu504Lys) causes ALDH2 inactivation and adverse acetaldehyde exposure among Asians, but little is known of the association between alcohol consumption and rs671 and ovarian cancer (OvCa) in Asians. We conducted a pooled analysis of Asian ancestry participants in the Ovarian Cancer Association Consortium. We included seven case‐control studies and one cohort study comprising 460 invasive OvCa cases, 37 borderline mucinous OvCa and 1274 controls of Asian descent with information on recent alcohol consumption. Pooled odds ratios (OR) with 95% confidence intervals (CI) for OvCa risk associated with alcohol consumption, rs671 and their interaction were estimated using logistic regression models adjusted for potential confounders. No significant association was observed for daily alcohol intake with invasive OvCa (OR comparing any consumption to none = 0.83; 95% CI = 0.58‐1.18) or with individual histotypes. A significantly decreased risk was seen for carriers of one or both Lys alleles of rs671 for invasive mucinous OvCa (OR = 0.44; 95% CI = 0.20‐0.97) and for invasive and borderline mucinous tumors combined (OR = 0.48; 95% CI = 0.26‐0.89). No significant interaction was observed between alcohol consumption and rs671 genotypes. In conclusion, self‐reported alcohol consumption at the quantities estimated was not associated with OvCa risk among Asians. Because the rs671 Lys allele causes ALDH2 inactivation leading to increased acetaldehyde exposure, the observed inverse genetic association with mucinous ovarian cancer is inferred to mean that alcohol intake may be a risk factor for this histotype. This association will require replication in a larger sample.
We observed a significantly decreased risk among carriers of one or both Lys alleles of rs671 for invasive mucinous ovarian cancer and for invasive and borderline mucinous tumors. Because the rs671 Lys allele causes ALDH2 inactivation leading to increased acetaldehyde exposure, the observed inverse association with mucinous ovarian cancer is inferred to mean that alcohol intake may be a risk factor for this histotype.
• Class imbalance and class distribution degrade model performance at deployment time.
• Class-specialized ensembles improve classification performance of rare cancer types.
• Performance gain obtained with traditional ensembles comes from the majority classes.
• Test accuracy improvements gained from two-phase learning can be misleading.
In the last decade, the widespread adoption of electronic health record documentation has created huge opportunities for information mining. Natural language processing (NLP) techniques using machine and deep learning are becoming increasingly widespread for information extraction tasks from unstructured clinical notes. Disparities in performance when deploying machine learning models in the real world have recently received considerable attention. In the clinical NLP domain, the robustness of convolutional neural networks (CNNs) for classifying cancer pathology reports under natural distribution shifts remains understudied. In this research, we aim to quantify and improve the performance of the CNN for text classification on out-of-distribution (OOD) datasets resulting from the natural evolution of clinical text in pathology reports. We identified class imbalance due to different prevalence of cancer types as one of the sources of performance drop and analyzed the impact of previous methods for addressing class imbalance when deploying models in real-world domains. Our results show that our novel class-specialized ensemble technique outperforms other methods for the classification of rare cancer types in terms of macro F1 scores. We also found that traditional ensemble methods perform better in top classes, leading to higher micro F1 scores. Based on our findings, we formulate a series of recommendations for other ML practitioners on how to build robust models with extremely imbalanced datasets in biomedical NLP applications.
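The contrast between macro and micro F1 under class imbalance can be made concrete with a small Python sketch. The labels and predictions below are invented toy data, not results from the study, and the helper names are hypothetical. Micro F1 pools counts over all documents, so it is dominated by the common class; macro F1 averages per-class scores, so a poorly classified rare class pulls it down.

```python
def per_class_counts(y_true, y_pred):
    """Count true positives, false positives, and false negatives per class."""
    labels = sorted(set(y_true) | set(y_pred))
    counts = {label: {"tp": 0, "fp": 0, "fn": 0} for label in labels}
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            counts[truth]["tp"] += 1
        else:
            counts[pred]["fp"] += 1
            counts[truth]["fn"] += 1
    return counts

def f1(tp, fp, fn):
    """F1 score from raw counts; defined as 0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    counts = per_class_counts(y_true, y_pred)
    return sum(f1(**c) for c in counts.values()) / len(counts)

def micro_f1(y_true, y_pred):
    """F1 computed from counts pooled over all classes."""
    counts = per_class_counts(y_true, y_pred)
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())
    return f1(tp, fp, fn)

# Toy data: eight reports of a common cancer type and two of a rare one;
# the classifier misses one of the two rare cases.
y_true = ["common"] * 8 + ["rare"] * 2
y_pred = ["common"] * 9 + ["rare"]
print(round(micro_f1(y_true, y_pred), 3), round(macro_f1(y_true, y_pred), 3))
```

Here one missed rare case leaves micro F1 at 0.9 while macro F1 falls to about 0.80, which is why gains concentrated in majority classes show up in micro F1 and gains on rare classes show up in macro F1.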
Mammographic density (MD) is strongly associated with breast cancer risk. We examined whether body mass index (BMI) partially explains racial and ethnic variation in MD.
We used multivariable Poisson regression to estimate associations between BMI and binary MD (Breast Imaging Reporting and Data System [BI-RADS] A&B versus BI-RADS C&D) among 160,804 women in the Utah mammography cohort. We estimated associations overall and within racial and ethnic subgroups and calculated population attributable risk percents (PAR%).
We observed the lowest BMI and highest MD among Asian women, the highest BMI among Native Hawaiian and Pacific Islander women, and the lowest MD among American Indian and Alaska Native (AIAN) and Black women. BMI was inversely associated with MD (RR for BMI ≥30 vs. BMI <25 = 0.43; 95% confidence interval [CI], 0.42-0.44) in the full cohort, and estimates in all racial and ethnic subgroups were consistent with this strong inverse association. For women less than 45 years of age, although there was statistical evidence of heterogeneity in associations between BMI and MD by race and ethnicity (P = 0.009), magnitudes of association were similar across groups. PAR%s for BMI and MD among women less than 45 years were considerably higher in White women (PAR% = 29.2; 95% CI, 28.4-29.9) compared with all other groups, with estimates ranging from 17.2% (95% CI, 8.5-25.8) among Asian women to 21.5% (95% CI, 19.4-23.6) among Hispanic women. For women ≥55 years, PAR%s for BMI and MD were highest among AIAN women (PAR% = 37.5; 95% CI, 28.1-46.9).
While we observed substantial differences in the distributions of BMI and MD by race and ethnicity, associations between BMI and MD were generally similar across groups.
Distributions of BMI and MD may be important contributors to breast cancer disparities.
The degree to which uterine cancer metastatic to the ovary is misdiagnosed as synchronous stage I uterine and ovarian cancers is unclear. We sought to determine whether patients with synchronous cancers had mortality patterns similar to either stage IIIA uterine, stage I uterine, or stage I ovarian cancers alone.
The Surveillance, Epidemiology, and End Results database was used to compare mortality of patients with synchronous stage I uterine and stage I ovarian cancers versus those with stage IIIA uterine, stage I uterine, or stage I ovarian cancers alone. We calculated age-adjusted mortality hazard ratios (HR) and 95% confidence intervals (CI) accounting for calendar year and grade, adjuvant treatment, grade 1 endometrioid cancers, grade 3 endometrioid cancers, and stage IA cancers.
Among the 9,321 patients, we observed lower age-adjusted mortality in patients with stage I synchronous cancers (n = 937) compared to those with stage IIIA uterine (n = 531; HR, 0.45; 95% CI, 0.35-0.58), stage I uterine (n = 6,919; HR, 0.74; 95% CI, 0.60-0.91), and stage I ovarian cancers (n = 934; HR, 0.52; 95% CI, 0.41-0.67). Results were similar after taking into account diagnosis year and grade, and limiting to those receiving adjuvant therapy, grade 1 or grade 3 endometrioid cancers, or stage IA cancers.
We observed lower mortality for synchronous stage I uterine and ovarian cancers, which was not explained by younger age, earlier stage, lower grade, histology type, or adjuvant therapy.
The possible misdiagnosis associated with clinicopathologic evaluation of synchronous uterine and ovarian cancers does not appear to worsen survival on a population level.
Circadian disruption has been linked to carcinogenesis in animal models, but the evidence in humans is inconclusive. Genetic variation in circadian rhythm genes provides a tool to investigate such associations. We examined associations of genetic variation in nine core circadian rhythm genes and six melatonin pathway genes with risk of colorectal, lung, ovarian and prostate cancers using data from the Genetic Associations and Mechanisms in Oncology (GAME‐ON) network. The major results for prostate cancer were replicated in the Prostate, Lung, Colorectal and Ovarian (PLCO) cancer screening trial, and for colorectal cancer in the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO). The total number of cancer cases and controls was 15,838/18,159 for colorectal, 14,818/14,227 for prostate, 12,537/17,285 for lung and 4,369/9,123 for ovary. For each cancer site, we conducted gene‐based and pathway‐based analyses by applying the summary‐based Adaptive Rank Truncated Product method (sARTP) on the summary association statistics for each SNP within the candidate gene regions. Aggregate genetic variation in the circadian rhythm and melatonin pathways was significantly associated with the risk of prostate cancer in data combining GAME‐ON and PLCO, after Bonferroni correction (pathway-level p < 0.00625). The two most significant genes were NPAS2 (gene-level p = 0.0062) and AANAT (gene-level p = 0.00078), the latter being significant after Bonferroni correction. For colorectal cancer, we observed a suggestive association with the circadian rhythm pathway in GAME‐ON (pathway-level p = 0.021); this association was not confirmed in GECCO (pathway-level p = 0.76) or the combined data (pathway-level p = 0.17). No significant association was observed for ovarian and lung cancer. These findings support a potential role for circadian rhythm and melatonin pathways in prostate carcinogenesis. Further functional studies are needed to better understand the underlying biologic mechanisms.
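The Bonferroni thresholds mentioned in the abstract can be checked with a short Python sketch. The test counts used here are assumptions, not stated in the abstract: 8 pathway-level tests (2 pathways across 4 cancer sites) reproduces the reported threshold of 0.00625, and 60 gene-level tests (15 genes across 4 cancer sites) is a guess that happens to agree with the statement that AANAT, but not NPAS2, remained significant.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that controls the family-wise error rate."""
    return alpha / n_tests

# Assumed test counts (not given in the abstract).
pathway_threshold = bonferroni_threshold(0.05, 2 * 4)   # 2 pathways x 4 cancer sites
gene_threshold = bonferroni_threshold(0.05, 15 * 4)     # 15 genes x 4 cancer sites

# Gene-level p-values reported in the abstract for prostate cancer.
gene_p_values = {"NPAS2": 0.0062, "AANAT": 0.00078}
significant = {gene: p < gene_threshold for gene, p in gene_p_values.items()}

print(pathway_threshold, significant)
```

If the authors counted tests differently, the gene-level threshold would change, so the second calculation should be read as a plausibility check and not as the study's actual correction.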
What's new?
Circadian disruption has been linked to carcinogenesis in animal models, but the evidence in humans is inconclusive. In this large SNP study, the authors found a significant association between both circadian‐rhythm and melatonin‐pathway gene variants and prostate‐cancer risk. These results support a role for circadian‐rhythm and melatonin pathways in prostate carcinogenesis.
Purpose
National Cancer Institute (NCI)-Designated Cancer Centers are required to assess and address the needs of their catchments. In rural regions, catchment areas are vast, populations small, and infrastructure for data capture limited, making analyses of cancer patterns challenging.
Methods
The four NCI-Designated Comprehensive Cancer Centers in the southern Rocky Mountain region formed the Four Corners Collaboration (4C2) to address these challenges. Colorectal cancer (CRC) was identified as a disease site where disparities exist. The 4C2 leaders examined how geographic and sociodemographic characteristics were correlated with stage at diagnosis and survival in the region and compared those relationships to a sample from the Surveillance, Epidemiology, and End Results (SEER) program.
Results
In 4C2, Hispanics were more likely to live in socioeconomically disadvantaged areas relative to their counterparts in the SEER program. These residency patterns were positively correlated with later-stage diagnosis and higher mortality. Living in an area with high income inequality was positively associated with mortality for Non-Hispanic whites in 4C2. In SEER, Hispanics had a slightly higher likelihood of distant-stage disease, and disadvantaged socioeconomic status was associated with poor survival.
Conclusion
CRC interventions in 4C2 will target socioeconomically disadvantaged areas, especially those with higher income inequality, to improve outcomes among Hispanics and Non-Hispanic whites. The collaboration demonstrates how bringing NCI-Designated Cancer Centers together to identify and address common population catchment issues provides an opportunity for pooled analyses of small but important populations, and thus capitalizes on synergies among researchers to reduce cancer disparities.