Nuclear localization signals (NLSs) are amino acid sequences that target cargo proteins into the nucleus. Rigorous characterization of NLS motifs is essential to understanding and predicting pathways for nuclear import. The best-characterized NLS is the classical NLS (cNLS), which is recognized by the cNLS receptor, importin-α. cNLSs are conventionally defined as having one (monopartite) or two clusters of basic amino acids separated by a 9-12 aa linker (bipartite). Motivated by the finding that Ty1 integrase, which contains an unconventional putative bipartite cNLS with a 29 aa linker, exploits the classical nuclear import machinery, we assessed the functional boundaries for linker length within a bipartite cNLS. We confirmed that the integrase cNLS is a bona fide bipartite cNLS, then carried out a systematic analysis of linker length in an obligate bipartite cNLS cargo, which revealed that some linkers longer than conventionally defined can function in nuclear import. Linker function is dependent on the sequence and likely the inherent flexibility of the linker. Subsequently, we interrogated the Saccharomyces cerevisiae proteome to identify cellular proteins containing putative long bipartite cNLSs. We experimentally confirmed that Rrp4 contains a bipartite cNLS with a 25 aa linker. Our studies show that the traditional definition of bipartite cNLSs is too restrictive and linker length can vary depending on amino acid composition.
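As a rough illustration of what a proteome scan for long bipartite cNLS candidates might look like, the sketch below searches a protein sequence for two basic clusters separated by a linker of configurable length. It is not the authors' search procedure. The function name `find_bipartite_cnls`, the upstream rule (two consecutive K/R residues), the downstream rule (at least 3 K/R in a 5-residue window), and the default linker bounds are all assumptions chosen for illustration.

```python
import re

BASIC = "KR"  # lysine and arginine


def find_bipartite_cnls(seq, min_linker=9, max_linker=30):
    """Return (upstream_start, linker_length, matched_text) tuples.

    Assumed motif (illustrative only): two consecutive basic residues,
    then a linker of min_linker..max_linker residues, then a 5-residue
    window containing at least 3 basic residues.
    """
    seq = seq.upper()
    hits = []
    # Zero-width lookahead so overlapping upstream clusters are all visited.
    for m in re.finditer(r"(?=[KR]{2})", seq):
        up_start = m.start()
        linker_start = up_start + 2
        for linker_len in range(min_linker, max_linker + 1):
            down_start = linker_start + linker_len
            window = seq[down_start:down_start + 5]
            if len(window) < 5:
                break  # ran off the end of the sequence
            if sum(c in BASIC for c in window) >= 3:
                hits.append((up_start, linker_len, seq[up_start:down_start + 5]))
    return hits
```

Because the downstream window is allowed to slide, one motif can be reported at several adjacent linker lengths; a real scan would likely merge such overlapping hits. Calling the function with `max_linker=12` restricts it to the conventional definition, which makes it easy to see which candidates appear only when longer linkers are permitted.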
Although numerous algorithms have been developed to identify structural variations (SVs) in genomic sequences, there is a dearth of approaches that can be used to evaluate their results. This is significant as the accurate identification of structural variation is still an outstanding but important problem in genomics. The emergence of new sequencing technologies that generate longer sequence reads can, in theory, provide direct evidence for all types of SVs regardless of the length of the region that they span. However, current efforts to use these data in this manner require the use of large computational resources to assemble these sequences as well as visual inspection of each region. Here we present VaPoR, a highly efficient algorithm that autonomously validates large SV sets using long-read sequencing data. We assessed the performance of VaPoR on SVs in both simulated and real genomes and report a high-fidelity rate for overall accuracy across different levels of sequence depths. We show that VaPoR can interrogate a much larger range of SVs while still matching existing methods in terms of false positive validations and providing additional features considering breakpoint precision and predicted genotype. We further show that VaPoR can run quickly and efficiently without requiring a large processing or assembly pipeline. VaPoR provides a long read–based validation approach for genomic SVs that requires relatively low read depth and computing resources and thus will provide utility with targeted or low-pass sequencing coverage for accurate SV assessment. The VaPoR software is available at: https://github.com/mills-lab/vapor.
Complex chromosomal rearrangements are structural genomic alterations involving multiple instances of deletions, duplications, inversions, or translocations that co-occur either on the same chromosome or represent different overlapping events on homologous chromosomes. We present SVelter, an algorithm that identifies regions of the genome suspected to harbor a complex event and then resolves the structure by iteratively rearranging the local genome structure, in a randomized fashion, with each structure scored against characteristics of the observed sequencing data. SVelter is able to accurately reconstruct complex chromosomal rearrangements when compared to well-characterized genomes that have been deeply sequenced with both short and long reads.
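The general idea of iteratively rearranging a local structure at random and keeping the arrangement that scores best can be sketched as a simple randomized hill climb. This is a toy under stated assumptions, not SVelter's implementation: the block representation, the three moves in `propose`, and the `score` function (copy-number mismatch penalty plus supported adjacencies) are hypothetical stand-ins for the sequencing-data characteristics mentioned in the abstract.

```python
import random


def propose(structure, rng):
    """Apply one random edit (delete, duplicate, or invert) to a block list."""
    s = list(structure)
    if not s:
        return s
    i = rng.randrange(len(s))
    move = rng.choice(["delete", "duplicate", "invert"])
    if move == "delete":
        del s[i]
    elif move == "duplicate":
        s.insert(i, s[i])
    else:
        name, strand = s[i]
        s[i] = (name, "-" if strand == "+" else "+")
    return s


def score(structure, observed_depth, observed_junctions):
    """Toy score (assumption): reward adjacencies seen in the data and
    penalize blocks whose copy count disagrees with observed depth."""
    penalty = 0
    for block, depth in observed_depth.items():
        copies = sum(1 for name, _ in structure if name == block)
        penalty += abs(copies - depth)
    adjacencies = set(zip(structure, structure[1:]))
    return len(adjacencies & observed_junctions) - penalty


def resolve(reference, observed_depth, observed_junctions,
            iterations=2000, seed=0):
    """Keep a candidate only when it scores strictly better than the best so far."""
    rng = random.Random(seed)
    best = list(reference)
    best_score = score(best, observed_depth, observed_junctions)
    for _ in range(iterations):
        cand = propose(best, rng)
        cand_score = score(cand, observed_depth, observed_junctions)
        if cand_score > best_score:
            best, best_score = cand, cand_score
    return best, best_score
```

Here a structure is a list of `(block_name, strand)` tuples, `observed_depth` maps each block to an expected copy number, and `observed_junctions` is a set of adjacent block pairs supported by reads. A greedy acceptance rule like this can stall in local optima; it is used here only to keep the sketch short.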
Background
Human papillomavirus (HPV) is a well‐established driver of malignant transformation at a number of sites, including head and neck, cervical, vulvar, anorectal, and penile squamous cell carcinomas; however, the impact of HPV integration into the host human genome on this process remains largely unresolved. This is due to the technical challenge of identifying HPV integration sites, which includes limitations of existing informatics approaches to discovering viral‐host breakpoints from low‐read‐coverage sequencing data.
Methods
To overcome this limitation, the authors developed SearcHPV, a new HPV detection pipeline based on targeted capture technology, and applied the algorithm to targeted capture data. They performed an integrated analysis of SearcHPV‐defined breakpoints with genome‐wide linked‐read sequencing to identify potential HPV‐related structural variations.
Results
Through an analysis of HPV+ models, the authors showed that SearcHPV detected HPV‐host integration sites with a higher sensitivity and specificity than 2 other commonly used HPV detection callers. SearcHPV uncovered HPV integration sites adjacent to known cancer‐related genes, including TP63, MYC, and TRAF2, and near regions of large structural variation. The authors further validated the junction contig assembly feature of SearcHPV, which helped to accurately identify viral‐host junction breakpoint sequences. They found that viral integration occurred through a variety of DNA repair mechanisms, including nonhomologous end joining, alternative end joining, and microhomology‐mediated repair.
Conclusions
In summary, SearcHPV is a new optimized tool for the accurate detection of HPV‐human integration sites from targeted capture DNA sequencing data.
To overcome technical challenges of detecting viral integrations in human papillomavirus (HPV)–related cancers, a new pipeline called SearcHPV has been optimized. Using this tool, the authors have found frequent integration near genes and areas of large structural rearrangements in HPV–positive models.
Policy elites use rhetoric in speeches and press releases to provide framing that is intended to influence public opinion. These rhetorical events can be treated as instances in which speech usefully promotes particular discourses. Indeed, elected officials are able to influence how individuals think about problems and solutions through speeches and press releases. Two important rhetorical events in which political elites advance frames for social issues are annual state of the state addresses (SoSA) given by U.S. governors and gubernatorial press releases that inform media reporting about state policy. This study employed policy discourse and rhetorical analyses to examine SoSAs and press releases as rhetorical events within the context of educational policy. Our findings show that governors framed the roles of state government, governors, and educational stakeholders within a discourse that perpetuates a neoliberal version of education. In this framing, governors situated education's purpose as being workforce and economic development, ignoring its role in addressing social issues and preparing informed, engaged participants for democratic society. Given that individuals make decisions about how to address social issues and understand public institutions based on framing provided by political elites, these findings raise implications for state educational policies and the public purposes of education.
Although nucleotide resolution maps of genomic structural variants (SVs) have provided insights into the origin and impact of phenotypic diversity in humans, comparable maps in nonhuman primates have thus far been lacking. Using massively parallel DNA sequencing, we constructed fine-resolution genomic structural variation maps in five chimpanzees, five orang-utans, and five rhesus macaques. The SV maps, which comprise thousands of deletions, duplications, and mobile element insertions, revealed a high activity of retrotransposition in macaques compared with great apes. By comparison, nonallelic homologous recombination is specifically active in the great apes, which is correlated with architectural differences between the genomes of great apes and macaques. Transcriptome analyses across nonhuman primates and humans revealed effects of species-specific whole-gene duplication on gene expression. We identified 13 gene duplications coinciding with the species-specific gain of tissue-specific gene expression in keeping with a role of gene duplication in the promotion of diversification and the acquisition of unique functions. Differences in the present-day activity of SV formation mechanisms that our study revealed may contribute to ongoing diversification and adaptation of great ape and Old World monkey lineages.
Abstract
Background
Multiple myeloma (MM) is a hematological cancer caused by abnormal accumulation of monoclonal plasma cells in bone marrow. With the increase in treatment options, risk-adapted therapy is becoming more and more important. Survival analysis is commonly applied to study progression or other events of interest and stratify the risk of patients.
Results
In this study, we present the current state-of-the-art model for MM prognosis and the molecular biomarker set for stratification: the winning algorithm in the 2017 Multiple Myeloma DREAM Challenge, Sub-Challenge 3. Specifically, we built a non-parametric complete hazard ranking model to map the right-censored data into a linear space, where commonplace machine learning techniques, such as Gaussian process regression and random forests, can play their roles. Our model integrated both the gene expression profile and clinical features to predict the progression of MM. Compared with conventional models, such as the Cox model and random survival forests, our model achieved higher accuracy in 3 within-cohort predictions. In addition, it showed robust predictive power in cross-cohort validations. Key molecular signatures related to MM progression were identified from our model, which may function as the core determinants of MM progression and provide important guidance for future research and clinical practice. Functional enrichment analysis and a mammalian gene-gene interaction network revealed crucial biological processes and pathways involved in MM progression. The model is dockerized and publicly available at https://www.synapse.org/#!Synapse:syn11459638. Both data and reproducible code are included in the docker.
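To make the idea of mapping right-censored data onto a single numeric scale more concrete, the sketch below assigns each patient a risk score from pairwise comparisons whose ordering is known despite censoring. It is a simplified stand-in, not the published complete hazard ranking algorithm: the function name `hazard_rank`, the comparison rule, and the min-max scaling to [0, 1] are assumptions made for illustration.

```python
def hazard_rank(times, events):
    """Map right-censored outcomes to risk scores in [0, 1].

    times  : follow-up time per patient
    events : 1 if progression was observed, 0 if the patient was censored

    Assumed rule: patient i is known to progress before patient j only
    when i's event was observed and times[i] < times[j]. Each such pair
    raises i's score and lowers j's score by one.
    """
    n = len(times)
    raw = [0] * n
    for i in range(n):
        for j in range(n):
            if i != j and events[i] and times[i] < times[j]:
                raw[i] += 1  # i is higher risk than j
                raw[j] -= 1  # j is lower risk than i
    lo, hi = min(raw), max(raw)
    if hi == lo:
        return [0.5] * n  # no informative comparisons
    return [(r - lo) / (hi - lo) for r in raw]
```

The resulting scores could then serve as an ordinary regression target for models such as Gaussian process regression or random forests. Note that a censored patient contributes only to comparisons against earlier observed events, so their score reflects partial information.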
Conclusions
We present the current state-of-the-art prognostic model for MM integrating gene expression and clinical features validated in an independent test set.
Summary
Caspases are a group of proteolytic enzymes involved in the co‐ordination of cellular processes, including cellular homeostasis, inflammation and apoptosis. Altered activity of caspases, particularly caspase‐1, has been implicated in the development of intestinal diseases, such as inflammatory bowel disease (IBD) and colorectal cancer (CRC). However, the involvement of two related inflammatory caspase members, caspases‐4 and ‐5, during intestinal homeostasis and disease has not yet been established. This study demonstrates that caspases‐4 and ‐5 are involved in IBD‐associated intestinal inflammation. Furthermore, we found a clear correlation between stromal caspase‐4 and ‐5 expression levels, inflammation and disease activity in ulcerative colitis patients. Deregulated intestinal inflammation in IBD patients is associated with an increased risk of developing CRC. We found robust expression of caspases‐4 and ‐5 within intestinal epithelial cells, exclusively within neoplastic tissue, of colorectal tumours. An examination of adjacent normal, inflamed and tumour tissue from patients with colitis‐associated CRC confirmed that stromal expression of caspases‐4 and ‐5 is increased in inflamed and dysplastic tissue, while epithelial expression is restricted to neoplastic tissue. In addition to identifying caspases‐4 and ‐5 as potential targets for limiting intestinal inflammation, this study has identified epithelial‐expressed caspases‐4 and ‐5 as biomarkers with diagnostic and therapeutic potential in CRC.
Abstract
The transfer and integration of whole and partial mitochondrial genomes into the nuclear genomes of eukaryotes is an ongoing process that has facilitated the transfer of genes and contributed to the evolution of various cellular pathways. Many previous studies have explored the impact of these insertions, referred to as NumtS, but have focused primarily on older events that have become fixed and are therefore present in all individual genomes for a given species. We previously developed an approach to identify novel Numt polymorphisms from next-generation sequence data and applied it to thousands of human genomes. Here, we extend this analysis to 79 individuals of other great ape species, including chimpanzee, bonobo, gorilla, and orang-utan, and also an Old World monkey, the macaque. We show that recent Numt insertions are prevalent in each species though at different apparent rates, with chimpanzees exhibiting a significant increase in both polymorphic and fixed Numt sequences as compared to other great apes. We further assessed positional effects in each species in terms of evolutionary time and rate of insertion and identified putative hotspots on chromosome 5 for Numt integration, providing insight into both recent polymorphic and older fixed reference NumtS in great apes in comparison to human events.
African American and European American individuals have a similar prevalence of gastroesophageal reflux disease (GERD), yet esophageal adenocarcinoma (EAC) disproportionately affects European American individuals. We investigated whether the esophageal squamous mucosa of African American individuals has features that protect against GERD-induced damage, compared with European American individuals.
We performed transcriptional profile analysis of esophageal squamous mucosa tissues from 20 African American and 20 European American individuals (24 with no disease and 16 with Barrett’s esophagus and/or EAC). We confirmed our findings in a cohort of 56 patients and analyzed DNA samples from patients to identify associated variants. Observations were validated using matched genomic sequence and expression data from lymphoblasts from the 1000 Genomes Project. A panel of esophageal samples from African American and European American subjects was used to confirm allele-related differences in protein levels. The esophageal squamous-derived cell line Het-1A and a rat esophagogastroduodenal anastomosis model for reflux-generated esophageal damage were used to investigate the effects of the DNA-damaging agent cumene-hydroperoxide (cum-OOH) and a chemopreventive cranberry proanthocyanidin (C-PAC) extract, respectively, on levels of protein and messenger RNA (mRNA).
We found significantly higher levels of glutathione S-transferase theta 2 (GSTT2) mRNA in squamous mucosa from African American compared with European American individuals and associated these with variants within the GSTT2 locus in African American individuals. We confirmed that 2 previously identified genomic variants at the GSTT2 locus, a 37-kb deletion and a 17-bp promoter duplication, reduce expression of GSTT2 in tissues from European American individuals. The nonduplicated 17-bp promoter was more common in tissue samples from populations of African descent. GSTT2 protected Het-1A esophageal squamous cells from cum-OOH–induced DNA damage. Addition of C-PAC increased GSTT2 expression in Het-1A cells incubated with cum-OOH and in rats with reflux-induced esophageal damage. C-PAC also reduced levels of DNA damage in reflux-exposed rat esophagi, as observed by reduced levels of phospho-H2A histone family member X.
We found GSTT2 to protect esophageal squamous cells against DNA damage from genotoxic stress and that GSTT2 expression can be induced by C-PAC. Increased levels of GSTT2 in esophageal tissues of African American individuals might protect them from GERD-induced damage and contribute to the low incidence of EAC in this population.