Homeobox (HOX) transcription factors, encoded by a subset of homeodomain superfamily genes, play pivotal roles in many aspects of cellular physiology, embryonic development, and tissue homeostasis. Findings over the past decade have revealed that mutations in HOX genes can lead to increased cancer predisposition, and HOX genes might mediate the effect of many other cancer susceptibility factors by recognizing or executing altered genetic information. Remarkably, several lines of evidence highlight the interplay between HOX transcription factors and cancer risk loci discovered by genome-wide association studies, providing molecular and biological insight into cancer etiology. In addition, deregulated HOX gene expression impacts various aspects of cancer progression, including tumor angiogenesis, cell autophagy, proliferation, apoptosis, tumor cell migration, and metabolism. In this review, we discuss the fundamental roles of HOX genes in cancer susceptibility and progression, highlighting multiple molecular mechanisms of HOX-involved gene misregulation, as well as their potential implications in clinical practice.
With the development of advanced genomic methods, a large number of long non-coding RNAs (lncRNAs) have been found to be important for cancer initiation and progression. Given that most of the genome-wide association study (GWAS)-identified cancer risk SNPs are located in noncoding regions, the expression and function of lncRNAs are likely to be affected by these SNPs. The SNPs may affect the expression of lncRNAs directly by disrupting the binding of transcription factors or indirectly by affecting the expression of regulatory factors. Moreover, SNPs may disrupt the interaction between lncRNAs and other RNAs or proteins. Unveiling the relationships among lncRNAs, protein-coding genes, transcription factors and miRNAs from a genomic perspective will improve the accuracy of disease prediction and help find new therapeutic targets.
Prostate cancer (PCa) is the second most common cancer in men and is a highly heritable disease that affects millions of individuals worldwide. Genome-wide association studies have to date discovered nearly 270 genetic loci harboring hundreds of single nucleotide polymorphisms (SNPs) that are associated with PCa susceptibility. In contrast, the functional characterization of the mechanisms underlying PCa risk association is still growing. Given that PCa risk-associated SNPs are highly enriched in noncoding cis-regulatory genomic regions, accumulating evidence suggests a widespread modulation of transcription factor chromatin binding and allelic enhancer activity by these noncoding SNPs, thereby dysregulating gene expression. Emerging studies have shown that a proportion of noncoding variants can modulate the formation of transcription factor complexes at enhancers and CTCF-mediated 3D genome architecture. Interestingly, DNA methylation-regulated CTCF binding could orchestrate a long-range chromatin interaction between a PCa risk enhancer and causative genes. Additionally, "one causal variant, two risk genes" and "multiple risk variants, multiple genes" patterns are prevalent in some PCa risk-associated loci. In this review, we discuss the current understanding of the general principles of SNP-mediated gene regulation, experimental advances, and functional evidence supporting the mechanistic roles of several PCa genetic loci with potential clinical impact on disease prevention and treatment.
• GWASs have identified nearly 270 prostate cancer risk-associated loci with SNPs highly enriched in noncoding genomic regions.
• Noncoding SNPs frequently alter transcription factor DNA-binding affinity at risk gene regulatory enhancers.
• Noncoding SNPs can modulate the formation of transcription factor complexes at enhancers and long-range 3D genome architecture via epigenetic control.
• "One causal variant, two risk genes" and "multiple risk variants, multiple genes" patterns are often observed at prostate cancer risk loci.
Most cancer risk-associated single nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) are noncoding, and it is challenging to assess their functional impacts. To systematically identify the SNPs that affect gene expression by modulating activities of distal regulatory elements, we adapt the self-transcribing active regulatory region sequencing (STARR-seq) strategy, a high-throughput technique to functionally quantify enhancer activities.
From 10,673 SNPs linked with 996 cancer risk-associated SNPs identified in previous GWAS studies, we identify 575 SNPs in the fragments that positively regulate gene expression, and 758 SNPs in the fragments with negative regulatory activities. Among them, 70 variants are regulatory variants for which the two alleles confer different regulatory activities. We analyze in depth two regulatory variants, the breast cancer risk SNP rs11055880 and the leukemia risk-associated SNP rs12142375, and demonstrate their endogenous regulatory activities on expression of the ATF7IP and PDE4B genes, respectively, using a CRISPR-Cas9 approach.
By identifying regulatory variants associated with cancer susceptibility and studying their molecular functions, we hope to help the interpretation of GWAS results and provide improved information for cancer risk assessment.
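The idea of a regulatory variant whose two alleles confer different regulatory activities can be illustrated with a small calculation. The following is a minimal sketch, not the authors' pipeline: it assumes that fragment activity is summarized as the log2 ratio of output (RNA) reads to input (plasmid DNA) reads, which is a common convention for STARR-seq-style assays. The function names, the pseudocount, and the read counts are all hypothetical.

```python
import math


def enhancer_activity(rna_count, dna_count, pseudocount=1.0):
    """Return log2(RNA / DNA) for one fragment.

    Assumption: activity is the ratio of transcribed output reads to
    input library reads; the pseudocount avoids division by zero.
    """
    return math.log2((rna_count + pseudocount) / (dna_count + pseudocount))


def allelic_difference(ref, alt, pseudocount=1.0):
    """Return alt-minus-ref log2 activity for one SNP-containing fragment.

    ref and alt are (rna_count, dna_count) tuples.
    """
    return enhancer_activity(*alt, pseudocount) - enhancer_activity(*ref, pseudocount)


# Hypothetical read counts, for illustration only.
ref_counts = (399, 99)  # (RNA, DNA) reads for the reference allele
alt_counts = (99, 99)   # (RNA, DNA) reads for the alternate allele

delta = allelic_difference(ref_counts, alt_counts)
print(f"alt - ref log2 activity: {delta:.2f}")
```

A negative value means the alternate allele drives less reporter output than the reference allele. A real analysis would also need replicates and a statistical test before calling a variant regulatory.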
Expression quantitative trait locus (eQTL) analysis has emerged as an important tool in elucidating the link between genetic variants and gene expression, thereby bridging the gap between risk SNPs and associated diseases. We recently identified and validated a specific case where the methylation of a CpG site influences the relationship between the genetic variant and gene expression.
Here, to systematically evaluate this regulatory mechanism, we develop an extended eQTL mapping method, termed DNA methylation modulated eQTL (memo-eQTL). Applying this memo-eQTL mapping method to 128 normal prostate samples enables identification of 1063 memo-eQTLs, the majority of which are not recognized as conventional eQTLs in the same cohort. We observe that the methylation of the memo-eQTL CpG sites can either enhance or insulate the interaction between SNP and gene expression by altering CTCF-based chromatin 3D structure.
This study demonstrates the prevalence of memo-eQTLs, paving the way to identify novel causal genes for traits or diseases associated with genetic variations.
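To make concrete the notion that CpG methylation can modulate the relationship between a SNP and gene expression, here is a minimal sketch under stated assumptions. It is not the memo-eQTL method itself, whose statistical model is not described in this abstract. Instead it splits samples at a hypothetical methylation cutoff and compares the genotype-expression slope in each group. The function names, the cutoff, and the toy data are invented for illustration.

```python
def slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def stratified_eqtl_slopes(genotype, methylation, expression, cutoff=0.5):
    """Return (slope_low, slope_high): the genotype-expression slope in
    samples below and at-or-above the methylation cutoff.

    Assumption: a simple two-group split stands in for a formal
    SNP x methylation interaction test.
    """
    samples = list(zip(genotype, methylation, expression))
    low = [(g, e) for g, m, e in samples if m < cutoff]
    high = [(g, e) for g, m, e in samples if m >= cutoff]
    slope_low = slope([g for g, _ in low], [e for _, e in low])
    slope_high = slope([g for g, _ in high], [e for _, e in high])
    return slope_low, slope_high


# Toy data: genotype coded as alternate-allele dosage (0, 1, 2).
genotype = [0, 1, 2, 0, 1, 2]
methylation = [0.1, 0.2, 0.1, 0.9, 0.8, 0.9]
expression = [1.0, 3.0, 5.0, 2.0, 2.0, 2.0]

slope_low, slope_high = stratified_eqtl_slopes(genotype, methylation, expression)
print(slope_low, slope_high)
```

In this toy data the SNP is associated with expression only when the CpG is unmethylated, which is the kind of pattern a conventional eQTL scan over all samples could dilute. A regression with an explicit interaction term would be the more usual formalization of the same idea.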
Cellular senescence (CS), a state of permanent growth arrest, is intertwined with tumorigenesis. Due to the absence of specific markers, the characterization of senescence levels and senescence-related phenotypes across cancer types remains unexplored. Here, we defined computational metrics of senescence levels as CS scores to delineate the CS landscape across 33 cancer types and 29 normal tissues and explored CS-associated phenotypes by integrating multiplatform data from ~20 000 patients and ~212 000 single-cell profiles. CS scores showed cancer type-specific associations with genomic and immune characteristics and significantly predicted immunotherapy responses and patient prognosis in multiple cancers. Single-cell CS quantification revealed intra-tumor heterogeneity and an activated immune microenvironment in senescent prostate cancer. Using machine learning algorithms, we identified three CS genes as potential prognostic predictors in prostate cancer and verified them by immunohistochemical assays in 72 patients. Our study provides a comprehensive framework for evaluating senescence levels and clinical relevance, offering insights into CS roles in cancer- and senescence-related biomarker discovery.
Members of the large ETS family of transcription factors (TFs) have highly similar DNA‐binding domains (DBDs), yet they have diverse functions and activities in physiology and oncogenesis. Some differences in DNA‐binding preferences within this family have been described, but they have not been analysed systematically, and their contributions to targeting remain largely uncharacterized. We report here the DNA‐binding profiles for all human and mouse ETS factors, which we generated using two different methods: a high‐throughput microwell‐based TF DNA‐binding specificity assay, and protein‐binding microarrays (PBMs). Both approaches reveal that the ETS‐binding profiles cluster into four distinct classes, and that all ETS factors linked to cancer (ERG, ETV1, ETV4 and FLI1) fall into just one of these classes. We identify amino‐acid residues that are critical for the differences in specificity between all the classes, and confirm the specificities in vivo using chromatin immunoprecipitation followed by sequencing (ChIP‐seq) for a member of each class. The results indicate that even relatively small differences in in vitro binding specificity of a TF contribute to site selectivity in vivo.
Genome-wide association studies along with expression quantitative trait locus (eQTL) mapping have identified hundreds of single-nucleotide polymorphisms (SNPs) and their target genes in prostate cancer (PCa), yet functional characterization of these risk loci remains challenging. To screen for potential regulatory SNPs, we designed a CRISPRi library containing 9,133 guide RNAs (gRNAs) to cover 2,166 candidate SNP loci implicated in PCa and identified 117 SNPs that could regulate 90 genes for PCa cell growth advantage. Among these, rs60464856 was covered by multiple gRNAs significantly depleted in screening (FDR < 0.05). Pooled SNP association analysis in the PRACTICAL and FinnGen cohorts showed significantly higher PCa risk for the rs60464856 G allele (p value = 1.2 × 10^-16 and 3.2 × 10^-7, respectively). Subsequent eQTL analysis revealed that the G allele is associated with increased RUVBL1 expression in multiple datasets. Further CRISPRi and xCas9 base editing confirmed that the rs60464856 G allele leads to elevated RUVBL1 expression. Furthermore, SILAC-based proteomic analysis demonstrated allelic binding of cohesin subunits at the rs60464856 region, where the HiC dataset showed consistent chromatin interactions in prostate cell lines. RUVBL1 depletion inhibited PCa cell proliferation and tumor growth in a xenograft mouse model. Gene-set enrichment analysis suggested an association of RUVBL1 expression with cell-cycle-related pathways. Increased expression of RUVBL1 and activation of cell-cycle pathways were correlated with poor PCa survival in TCGA datasets. Our CRISPRi screening prioritized about one hundred regulatory SNPs essential for prostate cell proliferation. In combination with proteomics and functional studies, we characterized the mechanistic role of rs60464856 and RUVBL1 in PCa progression.
A major goal of post-GWAS studies is to functionally characterize causal SNPs that confer increased risk to disease phenotypes. Here, we applied CRISPRi and proteomics screening to identify regulatory SNPs at prostate-cancer risk loci and functionally characterized the impact of the rs60464856-RUVBL1 locus via in vitro and in vivo approaches.
Functional characterization of disease-causing variants at risk loci has been a significant challenge. Here we report a high-throughput single-nucleotide polymorphisms sequencing (SNPs-seq) technology to simultaneously screen hundreds to thousands of SNPs for their allele-dependent protein-binding differences. This technology takes advantage of the higher retention rate of protein-bound DNA oligos in a protein purification column to quantitatively sequence these SNP-containing oligos. We apply this technology to test prostate cancer-risk loci and observe differential allelic protein binding in a significant number of selected SNPs. We also test a unique application of self-transcribing active regulatory region sequencing (STARR-seq) in characterizing allele-dependent transcriptional regulation and provide detailed functional analysis at two risk loci (RGS17 and ASCL2). Together, we introduce a powerful high-throughput pipeline for large-scale screening of functional SNPs at disease risk loci.
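Allele-dependent protein binding of the kind described above can be summarized with a simple read-count ratio. The sketch below is a hedged illustration, not the published SNPs-seq analysis: it assumes that each SNP has read counts for the reference and alternate oligos in both the protein-bound fraction and the input pool, and it ranks SNPs by the magnitude of the normalized allelic bias. The SNP identifiers, counts, and function name are hypothetical.

```python
import math


def allelic_binding_bias(bound, input_pool, pseudocount=1.0):
    """Return log2 of the alt/ref read ratio in the protein-bound
    fraction, normalized by the same ratio in the input oligo pool.

    bound and input_pool are (ref_reads, alt_reads) tuples.
    Assumption: normalizing to the input corrects for unequal oligo
    representation; the pseudocount avoids division by zero.
    """
    bound_ratio = (bound[1] + pseudocount) / (bound[0] + pseudocount)
    input_ratio = (input_pool[1] + pseudocount) / (input_pool[0] + pseudocount)
    return math.log2(bound_ratio / input_ratio)


# Hypothetical counts for two placeholder SNPs, for illustration only.
counts = {
    "snp_A": {"bound": (799, 199), "input": (499, 499)},
    "snp_B": {"bound": (299, 299), "input": (499, 499)},
}

bias = {snp: allelic_binding_bias(c["bound"], c["input"]) for snp, c in counts.items()}

# Rank SNPs by the strength of allelic bias, regardless of direction.
ranked = sorted(bias, key=lambda snp: abs(bias[snp]), reverse=True)
print(ranked, bias)
```

Here the placeholder `snp_A` shows fourfold weaker retention of the alternate allele, while `snp_B` shows no allelic preference. An actual screen would require replicate experiments and multiple-testing correction before declaring differential binding.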