Abstract
The accuracy of current deconvolution methods largely relies on the quality of cell-type expression references. However, the single-cell (sc) and single-nucleus (sn) RNA-seq data used for building the reference are usually generated in independent studies that are distinct from the bulk RNA-seq data to be deconvolved. This study design inherently introduces technical confounding factors as unwanted variation, which is not fully addressed by current methods. To evaluate the impact of this variation on deconvolution accuracy, we generated a benchmark dataset in which bulk and snRNA-seq profiling were performed on the same aliquot of single nuclei extracted from 24 healthy retina samples. All donor eye samples were collected within six hours post-mortem and were free of any disease. This study design guarantees that the matched sequencing data share the same cell-type composition, so that cross-platform technical artifacts are the remaining confounding factor. We used the benchmark dataset to evaluate the performance of seven current deconvolution methods and found that they performed much worse on matched real bulk data than on matched pseudo-bulks, which were summations of the single-cell data. This finding suggests that none of these methods has fully addressed the major technical artifacts between bulk and single-cell sequencing platforms. We therefore propose DeMix.SC, a new deconvolution framework that optimizes deconvolution parameters using a small set of matched bulk and sc/snRNA-seq data from the same tissue type. DeMix.SC includes two major steps. First, we measure the technical variation across genes and across platforms using the benchmark data. Second, we introduce a new weight function for each gene that produces a ranking that accounts for both platform-specific technical variation and cell-type-specific expression at the gene level.
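As an illustration of the pseudo-bulks described above (summations of the single-cell data), the following minimal Python sketch sums per-cell counts into one profile and records the known cell-type proportions. It is not the DeMix.SC implementation: the function name `pseudo_bulk`, the toy counts, and the choice to define proportions as cell-count fractions are all assumptions made for this example.

```python
from collections import Counter


def pseudo_bulk(counts, cell_types):
    """Sum per-cell gene counts into a single pseudo-bulk profile.

    counts: list of per-cell count vectors (one list of gene counts per cell).
    cell_types: cell-type label of each cell, in the same order as counts.
    Returns (pseudo-bulk vector, dict mapping cell type to its fraction).
    """
    if not counts or len(counts) != len(cell_types):
        raise ValueError("need at least one cell and one label per cell")
    n_genes = len(counts[0])
    bulk = [0] * n_genes
    for cell in counts:
        if len(cell) != n_genes:
            raise ValueError("all cells must have the same number of genes")
        for gene_index, count in enumerate(cell):
            bulk[gene_index] += count
    # Assumption: proportions are fractions of cells, not of total RNA content.
    n_cells = len(cell_types)
    proportions = {t: k / n_cells for t, k in Counter(cell_types).items()}
    return bulk, proportions


# Toy data (made-up numbers): 4 cells x 3 genes.
toy_counts = [[5, 0, 1], [4, 1, 0], [0, 7, 2], [6, 0, 1]]
toy_types = ["rod", "rod", "bipolar", "rod"]

bulk, props = pseudo_bulk(toy_counts, toy_types)
print(bulk)   # [15, 8, 4]
print(props)  # {'rod': 0.75, 'bipolar': 0.25}
```

Because a pseudo-bulk is built from the same cells that define the reference, it carries no bulk-versus-single-cell platform difference, which is one way to read why methods can score well on pseudo-bulks and still perform poorly on matched real bulk data.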
Using the benchmark data for retina, we applied DeMix.SC to previously published human retinal RNA-seq data from 523 individuals at different stages of age-related macular degeneration (AMD). We observed that DeMix.SC can accurately capture the cell-type composition shifts in the AMD retina. DeMix.SC revealed a significant decrease in rod cells as well as increases in astrocytes, bipolar cells, and Müller cells in the AMD retina compared with the non-AMD group. The proportion changes of the latter three minor cell types were not identified by other methods, whereas DeMix.SC revealed this tendency. In summary, DeMix.SC integrates benchmark data to improve deconvolution accuracy in retina samples. Our method is generic and can be applied to other disease conditions, for example to decipher cell-type heterogeneity in cancer. We expect that DeMix.SC will help revolutionize downstream cell-type-specific analysis of bulk RNA-seq data and identify cellular targets of human diseases.
Citation Format: Shuai Guo, Xuesen Cheng, Andrew Koval, Shuangxi Ji, Qingnan Liang, Yumei Li, Leah A. Owen, Ivana K. Kim, John Weinstein, Scott Kopetz, John Paul Shen, Margaret M. DeAngelis, Rui Chen, Wenyi Wang. Integration with benchmark data of paired bulk and single-cell RNA sequencing data substantially improves the accuracy of bulk tissue deconvolution [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 4273.
Ewing's sarcoma is a highly aggressive pediatric malignancy of the bone and soft tissues. Current therapies are extensive and toxic, yet not highly efficacious. It is thought that more successful, targeted therapies will come from an improved knowledge of the biology of Ewing's sarcoma. Ewing's sarcomas characteristically harbor a reciprocal translocation, t(11;22)(q24;q12). The resultant fusion product is known to be an aberrant transcription factor, EWS/FLI, which is required for the oncogenic phenotype of the disease. Thus, identification of downstream EWS/FLI targets may allow for greater understanding of the pathogenesis of Ewing's sarcoma. We analyzed the EWS/FLI transcriptome in Ewing's sarcoma cells through the combined use of RNA interference (RNAi) and microarray analysis, and identified NKX2.2 as a target gene of EWS/FLI. NKX2.2 is a homeobox transcription factor with a defined role in the development of the central nervous system and endocrine pancreas. We found that NKX2.2 is expressed in a highly sensitive and specific manner in primary human Ewing's tumor samples. Furthermore, RNAi-mediated knockdown of NKX2.2 in patient-derived Ewing's cell lines results in a loss of transformation in vitro and decreased tumor formation in vivo. Thus, NKX2.2 is an EWS/FLI target that is relevant to the human disease and required for oncogenic transformation. To understand the mechanism(s) by which NKX2.2 participates in oncogenic transformation in Ewing's sarcoma, we performed microarray experiments coupled with a structure/function analysis of the protein. These studies indicate that NKX2.2-mediated transcriptional repression is both necessary and sufficient for NKX2.2's role in Ewing's sarcoma tumorigenesis. Furthermore, blockade of TLE or HDAC function, two protein families known to mediate the repressive function of NKX2.2 during development, inhibits the transformed phenotype and reverses the NKX2.2 transcriptional profile in Ewing's sarcoma cells.
Thus, our model for characterization of the EWS/FLI transcriptome has identified NKX2.2 as a critical EWS/FLI target that is necessary for the transformed phenotype. In addition, we have shown that NKX2.2 behaves as a transcriptional repressor to mediate Ewing's cell tumorigenesis. As such, our analysis of NKX2.2 function in Ewing's cell oncogenesis suggests a novel therapeutic approach for treatment of this disease.