Abstract
Transposable elements (TEs) are repetitive genomic elements that harbor binding sites for human transcription factors (TFs). A regulatory role for TEs has been suggested in embryonal development and in diseases such as cancer, but systematic investigation of their functions has been limited by their widespread silencing in the genome. Here, we utilize unbiased massively parallel reporter assay data using a whole human genome library to identify TEs with functional enhancer activity in two human cancer types of endodermal lineage, colorectal and liver cancers. We show that the identified TE enhancers are characterized by genomic features associated with active enhancers, such as epigenetic marks and TF binding. Importantly, we identify distinct TE subfamilies that function as tissue-specific enhancers, namely MER11 and LTR12 elements in colon and liver cancers, respectively. These elements are bound by distinct TFs in each cell type, and they have predicted associations with differentially expressed genes. In conclusion, these data demonstrate how different cancer types can utilize distinct TEs as tissue-specific enhancers, paving the way for a comprehensive understanding of the role of TEs as bona fide enhancers in cancer genomes.
Inflammatory bowel disease (IBD) is a chronic, relapsing inflammatory disorder associated with an elevated risk of colorectal cancer (CRC). IBD-associated CRC (IBD-CRC) may represent a distinct pathway of tumorigenesis compared to sporadic CRC (sCRC). Our aim was to comprehensively characterize IBD-associated tumorigenesis by integrating multiple high-throughput approaches, and to compare the results with in-house data sets from sCRCs.
Whole-genome sequencing, single nucleotide polymorphism arrays, RNA sequencing, genome-wide methylation analysis, and immunohistochemistry were performed using fresh-frozen and formalin-fixed tissue samples of tumor and corresponding normal tissues from 31 patients with IBD-CRC.
Transcriptome-based tumor subtyping revealed the complete absence of the canonical epithelial tumor subtype associated with WNT signaling in IBD-CRCs, which were dominated instead by the mesenchymal stroma-rich subtype. Negative WNT regulators AXIN2 and RNF43 were strongly down-regulated in IBD-CRCs, and chromosomal gains at HNF4A, a negative regulator of WNT-induced epithelial–mesenchymal transition (EMT), were less frequent compared to sCRCs. Enrichment of hypomethylation at HNF4α binding sites was detected solely in sCRC genomes. PIGR and OSMR, which are involved in mucosal immunity, were dysregulated via epigenetic modifications in IBD-CRCs. Genome-wide analysis showed significant enrichment of noncoding mutations in the 5′ untranslated region of TP53 in IBD-CRCs. As reported previously, somatic mutations in APC and KRAS were less frequent in IBD-CRCs compared to sCRCs.
Distinct mechanisms of WNT pathway dysregulation skew IBD-CRCs toward the mesenchymal tumor subtype, which may affect prognosis and treatment options. Increased OSMR signaling may favor the establishment of mesenchymal tumors in patients with IBD.
Although the proteins that read the gene regulatory code, transcription factors (TFs), have been largely identified, it is not well known which sequences TFs can recognize. We have analyzed the sequence-specific binding of human TFs using high-throughput SELEX and ChIP sequencing. A total of 830 binding profiles were obtained, describing 239 distinctly different binding specificities. The models represent the majority of human TFs, approximately doubling the coverage compared to existing systematic studies. Our results reveal additional specificity determinants for a large number of factors for which a partial specificity was known, including a commonly observed A- or T-rich stretch that flanks the core motifs. Global analysis of the data revealed that homodimer orientation and spacing preferences, and base-stacking interactions, have a larger role in TF-DNA binding than previously appreciated. We further describe a binding model incorporating these features that is required to understand binding of TFs to DNA.
► High-resolution binding profiles representing most human transcription factors
► High-throughput SELEX can identify long and dimeric sites
► Full-length protein and DNA-binding domain specificities are similar
► Adjacent bases affect TF-DNA binding more than previously thought
High-throughput SELEX is used to determine high-resolution binding profiles representing most human transcription factors. Base-stacking interactions, and dimer orientation and spacing preferences, have a larger role in TF-DNA binding than previously appreciated.
To further investigate susceptibility loci identified by genome-wide association studies, we genotyped 5,500 SNPs across 14 associated regions in 8,000 samples from a control group and 3 diseases: type 2 diabetes (T2D), coronary artery disease (CAD) and Graves' disease. Using Bayes' theorem, we defined credible sets of SNPs that were 95% likely, based on posterior probability, to contain the causal disease-associated SNPs. In 3 of the 14 regions, TCF7L2 (T2D), CTLA4 (Graves' disease) and CDKN2A-CDKN2B (T2D), much of the posterior probability rested on a single SNP, and, in 4 other regions (CDKN2A-CDKN2B (CAD) and CDKAL1, FTO and HHEX (T2D)), the 95% sets were small, thereby excluding most SNPs as potentially causal. Very few SNPs in our credible sets had annotated functions, illustrating the limitations in understanding the mechanisms underlying susceptibility to common diseases. Our results also show the value of more detailed mapping to target sequences for functional studies.
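The credible-set construction described above can be illustrated with a minimal sketch. It is not the study's actual pipeline: it assumes exactly one causal SNP per region, equal prior probability for every SNP, and per-SNP log Bayes factors as input. The function name `credible_set` and the SNP labels are hypothetical.

```python
import math


def credible_set(log_bayes_factors, coverage=0.95):
    """Return (snps, posteriors): the smallest set of SNPs whose summed
    posterior probability reaches `coverage`.

    Assumptions (illustrative, not taken from the study): a single causal
    SNP per region and a flat prior, so each posterior is proportional to
    the SNP's Bayes factor.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    max_lbf = max(log_bayes_factors.values())
    weights = {snp: math.exp(lbf - max_lbf)
               for snp, lbf in log_bayes_factors.items()}
    total = sum(weights.values())
    posteriors = {snp: w / total for snp, w in weights.items()}

    # Add SNPs in order of decreasing posterior until coverage is reached.
    ranked = sorted(posteriors, key=posteriors.get, reverse=True)
    chosen, cumulative = [], 0.0
    for snp in ranked:
        chosen.append(snp)
        cumulative += posteriors[snp]
        if cumulative >= coverage:
            break
    return chosen, posteriors


# Hypothetical region with four SNPs; Bayes factors of 90, 6, 3 and 1.
example = {
    "snp_a": math.log(90),
    "snp_b": math.log(6),
    "snp_c": math.log(3),
    "snp_d": math.log(1),
}
snps, post = credible_set(example)
print(snps, round(post["snp_a"], 3))
```

In this toy region most of the posterior probability rests on one SNP, so the 95% set is small and the remaining SNPs are excluded as likely causal variants, which is the situation the abstract reports for regions such as TCF7L2.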
Uterine leiomyomas (ULs) are benign tumors that are a major burden to women's health. A genome-wide association study on 15,453 UL cases and 392,628 controls was performed, followed by replication of the genomic risk in six cohorts. Effects of the risk alleles were evaluated in view of molecular and clinical characteristics. 22 loci displayed a genome-wide significant association. The likely predisposition genes could be grouped to two biological processes. Genes involved in genome stability were represented by [gene names missing] - highlighting the role of telomere maintenance - and [gene names missing]. Genes involved in genitourinary development, [gene names missing] and uterine stem cell marker antigen [gene name missing] formed another strong subgroup. The combined risk contributed by the 22 loci was associated with [gene name missing] mutation-positive tumors. The findings link genes for uterine development and genetic stability to leiomyomagenesis, and in part explain the more frequent occurrence of UL in women of African origin.
Therapies targeting somatic bystander genetic events represent a new avenue for cancer treatment. We recently identified a subset of colorectal cancer (CRC) patients who are heterozygous for a wild-type and a low-activity allele (NAT2*6) but lack the wild-type allele in their tumors due to loss of heterozygosity (LOH) at 8p22. These tumors were sensitive to treatment with a cytotoxic substrate of NAT2 (6-(4-aminophenyl)-N-(3,4,5-trimethoxyphenyl)pyrazin-2-amine, APA), which pointed to NAT2 loss being a therapeutically exploitable vulnerability of CRC tumors. To better estimate the total number of treatable CRC patients, we here determined whether tumor cells that retain other NAT2 low-activity variants after LOH also respond to APA treatment. The prevalent low-activity alleles NAT2*5 and NAT2*14, but not NAT2*7, were found to be low metabolizers with high sensitivity to APA. By analysis of two different CRC patient cohorts, we detected heterozygosity for NAT2 alleles targetable by APA, along with allelic imbalances pointing to LOH, in ~24% of tumors. Finally, to haplotype the NAT2 locus in tumor and patient-matched normal samples in a clinical setting, we developed and demonstrated a long-read sequencing-based assay. In total, more than 79,000 CRC patients per year fulfil the genetic criteria for high sensitivity to a NAT2 LOH therapy, and their eligibility can be assessed by clinical sequencing.
Mechanical forces in a constrained cellular environment were recently established as a facilitator of chromosomal damage. Whether this could contribute to tumorigenesis is not known. Uterine leiomyomas are common neoplasms that display relatively few chromosomal aberrations. We hypothesized that if mechanical forces contribute to chromosomal damage, signs of this could be seen in uterine leiomyomas from parous women. We examined the karyotypes of 1946 tumors, and found a striking overrepresentation of chromosomal damage associated with parity. We then subjected myometrial cells to physiological forces similar to those encountered during pregnancy, and found this to cause DNA breaks and a DNA repair response. While mechanical forces acting in constrained cellular environments may thus contribute to neoplastic degeneration, and genesis of uterine leiomyoma, further studies are needed to prove possible causality of the observed association. No evidence for progression to malignancy was found.
Members of the large ETS family of transcription factors (TFs) have highly similar DNA‐binding domains (DBDs), yet they have diverse functions and activities in physiology and oncogenesis. Some differences in DNA‐binding preferences within this family have been described, but they have not been analysed systematically, and their contributions to targeting remain largely uncharacterized. We report here the DNA‐binding profiles for all human and mouse ETS factors, which we generated using two different methods: a high‐throughput microwell‐based TF DNA‐binding specificity assay, and protein‐binding microarrays (PBMs). Both approaches reveal that the ETS‐binding profiles cluster into four distinct classes, and that all ETS factors linked to cancer, ERG, ETV1, ETV4 and FLI1, fall into just one of these classes. We identify amino‐acid residues that are critical for the differences in specificity between all the classes, and confirm the specificities in vivo using chromatin immunoprecipitation followed by sequencing (ChIP‐seq) for a member of each class. The results indicate that even relatively small differences in in vitro binding specificity of a TF contribute to site selectivity in vivo.
The genetic code, the binding specificity of all transfer RNAs, defines how protein primary structure is determined by DNA sequence. DNA also dictates when and where proteins are expressed, and this information is encoded in a pattern of specific sequence motifs that are recognized by transcription factors. However, the DNA-binding specificity is only known for a small fraction of the approximately 1400 human transcription factors (TFs). We describe here a high-throughput method for analyzing transcription factor binding specificity that is based on systematic evolution of ligands by exponential enrichment (SELEX) and massively parallel sequencing. The method is optimized for analysis of large numbers of TFs in parallel through the use of affinity-tagged proteins, barcoded selection oligonucleotides, and multiplexed sequencing. Data are analyzed by a new bioinformatic platform that uses the hundreds of thousands of sequencing reads obtained to control the quality of the experiments and to generate binding motifs for the TFs. The described technology allows higher throughput and identification of much longer binding profiles than current microarray-based methods. In addition, as our method is based on proteins expressed in mammalian cells, it can also be used to characterize DNA-binding preferences of full-length proteins or proteins requiring post-translational modifications. We validate the method by determining binding specificities of 14 different classes of TFs and by confirming the specificities for NFATC1 and RFX3 using ChIP-seq. Our results reveal unexpected dimeric modes of binding for several factors that were thought to preferentially bind DNA as monomers.
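As a rough illustration of how a binding motif can be summarized from enriched sequences, the sketch below builds a position frequency matrix from a handful of pre-aligned sites. It is a generic textbook construction, not the bioinformatic platform described in the abstract; the function names, the pseudocount value and the example sites are all hypothetical.

```python
from collections import Counter

BASES = "ACGT"


def position_frequency_matrix(sites, pseudocount=1.0):
    """Build a per-position base frequency matrix from aligned sites.

    `sites` is a list of equal-length DNA strings that are assumed to be
    already aligned; read alignment and enrichment scoring are out of
    scope for this sketch.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")

    # The pseudocount keeps unobserved bases from getting zero frequency.
    total = len(sites) + pseudocount * len(BASES)
    matrix = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        matrix.append({b: (counts[b] + pseudocount) / total for b in BASES})
    return matrix


def consensus(matrix):
    """Return the most frequent base at each position."""
    return "".join(max(BASES, key=column.get) for column in matrix)


# Hypothetical aligned sites, for illustration only.
sites = ["ACGT", "ACGA", "ACGT", "TCGT"]
pfm = position_frequency_matrix(sites)
print(consensus(pfm))
```

A matrix of this kind treats positions as independent, which is why it cannot by itself capture the dimer spacing or adjacent-base effects that motivate richer binding models.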
Small bowel adenocarcinoma (SBA) is an aggressive disease with limited treatment options. Despite previous studies, its molecular genetic background has remained somewhat elusive. To comprehensively characterize the mutational landscape of this tumor type, and to identify possible targets of treatment, we conducted the first large exome sequencing study on a population-based set of SBA samples from all three small bowel segments. Archival tissue from 106 primary tumors with appropriate clinical information was available for exome sequencing from a patient series consisting of a majority of confirmed SBA cases diagnosed in Finland between 2003 and 2011. Paired-end exome sequencing was performed using Illumina HiSeq 4000, and OncodriveFML was used to identify driver genes from the exome data. We also defined frequently affected cancer signalling pathways and performed the first extensive allelic imbalance (AI) analysis in SBA. Exome data analysis revealed significantly mutated genes previously linked to SBA (TP53, KRAS, APC, SMAD4, and BRAF), recently reported potential driver genes (SOX9, ATM, and ARID2), as well as novel candidate driver genes, such as ACVR2A, ACVR1B, BRCA2, and SMARCA4. We also identified clear mutation hotspot patterns in ERBB2 and BRAF. No BRAF V600E mutations were observed. Additionally, we present a comprehensive mutation signature analysis of SBA, highlighting established signatures 1A, 6, and 17, as well as U2, which is a previously unvalidated signature. Finally, comparison of the three small bowel segments revealed differences in tumor characteristics. This comprehensive work unveils the mutational landscape and most frequently affected genes and pathways in SBA, providing potential therapeutic targets, and novel and more thorough insights into the genetic background of this tumor type.