The precision-recall curve (PRC) and the area under the precision-recall curve (AUPRC) are useful for quantifying classification performance. They are commonly used in situations with imbalanced classes, such as cancer diagnosis and cell type annotation. We evaluate 10 popular tools for plotting PRC and computing AUPRC, which were collectively used in more than 3000 published studies. We find that the AUPRC values computed by the tools rank classifiers differently and that some tools produce overly optimistic results.
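As a minimal illustration of why AUPRC values can depend on the tool, the sketch below computes the area for one toy set of predictions under two common conventions: a step-wise sum (average precision) and trapezoidal integration with linear interpolation. The function names, the toy labels and scores, and the choice to anchor the trapezoidal curve at recall 0 with the first observed precision are all assumptions made for this example; they are not taken from any of the evaluated tools.

```python
def pr_points(labels, scores):
    """Return (recall, precision) points, one per distinct score threshold."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_pos = sum(labels)
    points = []
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every item tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((tp / total_pos, tp / (tp + fp)))
    return points


def auprc_step(points):
    """Step-wise area: each recall increase is weighted by the precision there."""
    area = 0.0
    prev_recall = 0.0
    for recall, precision in points:
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


def auprc_trapezoid(points):
    """Trapezoidal area with linear interpolation between adjacent points.

    Assumption: the curve starts at recall 0 with the first observed precision.
    """
    area = 0.0
    prev_recall, prev_precision = 0.0, points[0][1]
    for recall, precision in points:
        area += (recall - prev_recall) * (precision + prev_precision) / 2
        prev_recall, prev_precision = recall, precision
    return area


# Hypothetical toy data: 3 positives and 3 negatives, no tied scores.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]

points = pr_points(labels, scores)
step_area = auprc_step(points)
trap_area = auprc_trapezoid(points)
print(f"step-wise AUPRC:   {step_area:.4f}")
print(f"trapezoidal AUPRC: {trap_area:.4f}")
```

On this toy input the two conventions give different areas for the same predictions, which is the kind of discrepancy that can change how classifiers are ranked.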
Identification of noncoding drivers from thousands of somatic alterations in a typical tumor is a difficult and unsolved problem. We report a computational framework, FunSeq2, to annotate and prioritize these mutations. The framework combines an adjustable data context integrating large-scale genomics and cancer resources with a streamlined variant-prioritization pipeline. The pipeline has a weighted scoring system combining: inter- and intra-species conservation; loss- and gain-of-function events for transcription-factor binding; enhancer-gene linkages and network centrality; and per-element recurrence across samples. We further highlight putative drivers with information specific to a particular sample, such as differential expression. FunSeq2 is available from funseq2.gersteinlab.org.
Nasopharyngeal carcinoma (NPC) is an aggressive head and neck cancer characterized by Epstein-Barr virus (EBV) infection and dense lymphocyte infiltration. The scarcity of NPC genomic data hinders the understanding of NPC biology, disease progression and rational therapy design. Here we performed whole-exome sequencing (WES) on 111 micro-dissected EBV-positive NPCs, with 15 cases subjected to further whole-genome sequencing (WGS), to determine its mutational landscape. We identified enrichment for genomic aberrations of multiple negative regulators of the NF-κB pathway, including CYLD, TRAF3, NFKBIA and NLRC5, in a total of 41% of cases. Functional analysis confirmed inactivating CYLD mutations as drivers for NPC cell growth. The EBV oncoprotein latent membrane protein 1 (LMP1) functions to constitutively activate NF-κB signalling, and we observed mutual exclusivity among tumours with somatic NF-κB pathway aberrations and LMP1-overexpression, suggesting that NF-κB activation is selected for by both somatic and viral events during NPC pathogenesis.
The lack of representative nasopharyngeal carcinoma (NPC) models has seriously hampered research on EBV carcinogenesis and preclinical studies in NPC. Here we report the successful growth of five NPC patient-derived xenografts (PDXs) from fifty-eight attempts of transplantation of NPC specimens into NOD/SCID mice. The take rates for primary and recurrent NPC are 4.9% and 17.6%, respectively. Successful establishment of a new EBV-positive NPC cell line, NPC43, is achieved directly from patient NPC tissues by including Rho-associated coiled-coil containing kinases inhibitor (Y-27632) in culture medium. Spontaneous lytic reactivation of EBV can be observed in NPC43 upon withdrawal of Y-27632. Whole-exome sequencing (WES) reveals a close similarity in mutational profiles of these NPC PDXs with their corresponding patient NPC. Whole-genome sequencing (WGS) further delineates the genomic landscape and sequences of EBV genomes in these newly established NPC models, which supports their potential use in future studies of NPC.
Large structural variants (SVs) in the human genome are difficult to detect and study by conventional sequencing technologies. With long-range genome analysis platforms, such as optical mapping, one can identify large SVs (>2 kb) across the genome in one experiment. Analyzing optical genome maps of 154 individuals from the 26 populations sequenced in the 1000 Genomes Project, we find that phylogenetic population patterns of large SVs are similar to those of single nucleotide variations in 86% of the human genome, while ~2% of the genome has high structural complexity. We are able to characterize SVs in many intractable regions of the genome, including segmental duplications and subtelomeric, pericentromeric, and acrocentric areas. In addition, we discover ~60 Mb of non-redundant genome content missing in the reference genome sequence assembly. Our results highlight the need for a comprehensive set of alternate haplotypes from different populations to represent SV patterns in the genome.
We propose a new method for determining the target genes of transcriptional enhancers in specific cells and tissues. It combines global trends across many samples and sample-specific information, and considers the joint effect of multiple enhancers. Our method outperforms existing methods when predicting the target genes of enhancers in unseen samples, as evaluated by independent experimental data. Requiring few types of input data, we are able to apply our method to reconstruct the enhancer-target networks in 935 samples of human primary cells, tissues and cell lines, which constitute by far the largest set of enhancer-target networks. The similarity of these networks from different samples closely follows their cell and tissue lineages. We discover three major co-regulation modes of enhancers and find defense-related genes often simultaneously regulated by multiple enhancers bound by different transcription factors. We also identify differentially methylated enhancers in hepatocellular carcinoma (HCC) and experimentally confirm their altered regulation of HCC-related genes.
The phylum Cnidaria represents a close outgroup to Bilateria and includes familiar animals such as sea anemones, corals, hydroids, and jellyfish. Here we report genome sequencing and assembly for true jellyfish Sanderia malayensis and Rhopilema esculentum. The homeobox gene clusters are characterised by interdigitation of Hox, NK, and Hox-like genes revealing an alternate pathway of ANTP class gene dispersal and an intact three gene ParaHox cluster. The mitochondrial genomes are linear but, unlike in Hydra, we do not detect nuclear copies, suggesting that linear plastid genomes are not necessarily prone to integration. Genes for sesquiterpenoid hormone production, typical for arthropods, are also now found in cnidarians. Somatic and germline cells both express piwi-interacting RNAs in jellyfish revealing a conserved cnidarian feature, and evidence for tissue-specific microRNA arm switching as found in Bilateria is detected. Jellyfish genomes reveal a mosaic of conserved and divergent genomic characters evolved from a shared ancestral genetic architecture.
Transcription factors bind in a combinatorial fashion to specify the on-and-off states of genes; the ensemble of these binding events forms a regulatory network, constituting the wiring diagram for a cell. To examine the principles of the human transcriptional regulatory network, we determined the genomic binding information of 119 transcription-related factors in over 450 distinct experiments. We found the combinatorial, co-association of transcription factors to be highly context specific: distinct combinations of factors bind at specific genomic locations. In particular, there are significant differences in the binding proximal and distal to genes. We organized all the transcription factor binding into a hierarchy and integrated it with other genomic information (for example, microRNA regulation), forming a dense meta-network. Factors at different levels have different properties; for instance, top-level transcription factors more strongly influence expression and middle-level ones co-regulate targets to mitigate information-flow bottlenecks. Moreover, these co-regulations give rise to many enriched network motifs (for example, noise-buffering feed-forward loops). Finally, more connected network components are under stronger selection and exhibit a greater degree of allele-specific activity (that is, differential binding to the two parental alleles). The regulatory information obtained in this study will be crucial for interpreting personal genome sequences and understanding basic principles of human biology and disease.
Interplay between EBV infection and acquired genetic alterations during nasopharyngeal carcinoma (NPC) development remains vague. Here we report a comprehensive genomic analysis of 70 NPCs, combining whole-genome sequencing (WGS) of microdissected tumor cells with EBV oncogene expression to reveal multiple aspects of cellular-viral co-operation in tumorigenesis. Genomic aberrations along with EBV-encoded LMP1 expression underpin constitutive NF-κB activation in 90% of NPCs. A similar spectrum of somatic aberrations and viral gene expression undermine innate immunity in 79% of cases and adaptive immunity in 47% of cases; mechanisms by which NPC may evade immune surveillance despite its pro-inflammatory phenotype. Additionally, genomic changes impairing TGFBR2 promote oncogenesis and stabilize EBV infection in tumor cells. Fine-mapping of CDKN2A/CDKN2B deletion breakpoints reveals homozygous MTAP deletions in 32-34% of NPCs that confer marked sensitivity to MAT2A inhibition. Our work concludes that NPC is a homogeneously NF-κB-driven and immune-protected, yet potentially druggable, cancer.
G9a is a lysine methyltransferase that regulates epigenetic modifications, transcription, and genome organization. However, whether these properties are dependent on one another or represent distinct functions of G9a remains unclear. In this study, we observe widespread DNA methylation loss in G9a depleted and catalytic mutant embryonic stem cells. Furthermore, we define how G9a regulates chromatin accessibility, epigenetic modifications, and transcriptional silencing in both catalytic-dependent and -independent manners. Reactivated retrotransposons provide alternative promoters and splice sites leading to the upregulation of neighboring genes and the production of chimeric transcripts. Moreover, while topologically associated domains and compartment A/B definitions are largely unaffected, the loss of G9a leads to altered chromatin states, aberrant CTCF and cohesin binding, and differential chromatin looping, especially at retrotransposons. Taken together, our findings reveal how G9a regulates the epigenome, transcriptome, and higher-order chromatin structures in distinct mechanisms.
• Catalytic activity of G9a but not H3K9me2 functions in maintaining DNA methylation
• G9a regulates chromatin states and transcription through distinct mechanisms
• G9a maintains chromatin loops and TADs boundary strengths
• H3K9me2 prevents aberrant CTCF and cohesin binding, particularly at retrotransposons
G9a is an epigenetic modifier with essential roles in development. Jiang et al. show the regulation of epigenetic modifications, transcription, and chromatin structures by G9a. This protein functions through both catalytic-dependent and -independent mechanisms to repress regulatory elements and retrotransposons. Moreover, G9a affects chromatin looping by preventing aberrant CTCF/cohesin binding.