Whole genome sequencing of viruses directly from clinical samples is integral to understanding the genetics of host-virus interactions. Here, we report the use of sample-sparing target enrichment (by hybridisation) for viral nucleic acid separation and deep-sequencing of herpesvirus genomes directly from a range of clinical samples including saliva, blood, virus vesicles, cerebrospinal fluid, and tumour cell lines. We demonstrate the effectiveness of the method by deep-sequencing 13 highly cell-associated human herpesvirus genomes and generating full-length genome alignments at high read depth. Moreover, we show that the specificity of the method enables the study of viral population structures and their diversity within a range of clinical sample types.
Characterizing complex viral transcriptomes by conventional RNA sequencing approaches is complicated by high gene density, overlapping reading frames, and complex splicing patterns. Direct RNA sequencing (direct RNA-seq) using nanopore arrays offers an exciting alternative whereby individual polyadenylated RNAs are sequenced directly, without the recoding and amplification biases inherent to other sequencing methodologies. Here we use direct RNA-seq to profile the herpes simplex virus type 1 (HSV-1) transcriptome during productive infection of primary cells. We show how direct RNA-seq data can be used to define transcription initiation and RNA cleavage sites associated with all polyadenylated viral RNAs and demonstrate that low level read-through transcription produces a novel class of chimeric HSV-1 transcripts, including a functional mRNA encoding a fusion of the viral E3 ubiquitin ligase ICP0 and viral membrane glycoprotein L. Thus, direct RNA-seq offers a powerful method to characterize the changing transcriptional landscape of viruses with complex genomes.
Primary infection with varicella-zoster virus (VZV) causes varicella and the establishment of lifelong latency in sensory ganglion neurons. In one-third of infected individuals VZV reactivates from latency to cause herpes zoster, often complicated by difficult-to-treat chronic pain. Experimental infection of non-human primates with simian varicella virus (SVV) recapitulates most features of human VZV disease, thereby providing the opportunity to study the pathogenesis of varicella and herpes zoster in vivo. However, compared to VZV, the transcriptome and the full coding potential of SVV remain incompletely understood. Here, we performed nanopore direct RNA sequencing to annotate the SVV transcriptome in lytically SVV-infected African green monkey (AGM) and rhesus macaque (RM) kidney epithelial cells. We refined structures of canonical SVV transcripts and uncovered numerous RNA isoforms, splicing events, fusion transcripts and non-coding RNAs, mostly unique to SVV. We verified the expression of canonical and newly identified SVV transcripts in vivo, using lung samples from acutely SVV-infected cynomolgus macaques. Expression of selected transcript isoforms, including those located in the unique left-end of the SVV genome, was confirmed by reverse transcription PCR. Finally, we performed detailed characterization of the SVV homologue of the VZV latency-associated transcript (VLT), located antisense to ORF61. Analogous to VZV VLT, SVV VLT is multiply spliced and numerous isoforms are generated using alternative transcription start sites and extensive splicing. Conversely, low level expression of a single spliced SVV VLT isoform defines in vivo latency. Notably, the genomic location of VLT core exons is highly conserved between SVV and VZV. This work thus highlights the complexity of lytic SVV gene expression and provides new insights into the molecular biology underlying lytic and latent SVV infection.
The identification of the SVV VLT homolog further underlines the value of the SVV non-human primate model to develop new strategies for prevention of herpes zoster.
Transcriptome profiling has become routine in studies of many biological processes. However, the favored approaches such as short-read Illumina RNA sequencing are giving way to long-read sequencing platforms better suited to interrogating the complex transcriptomes typical of many RNA and DNA viruses. Here, we provide a guide, tailored to molecular virologists, to the ins and outs of viral transcriptome sequencing and discuss the strengths and weaknesses of the major RNA sequencing technologies as tools to analyze the abundance and diversity of the viral transcripts made during infection.
The chemical modification of ribonucleotides plays an integral role in the biology of diverse viruses and their eukaryotic host cells. Mapping the precise identity, location, and abundance of modified ribonucleotides remains a key goal of many studies aimed at characterizing the function and importance of a given modification. While mapping of specific RNA modifications through short-read sequencing approaches has powered a wealth of new discoveries in the past decade, this approach is limited by inherent biases and an absence of linkage information. Moreover, in viral contexts, the challenge is increased due to the compact nature of viral genomes giving rise to many overlapping transcript isoforms that cannot be adequately resolved using short-read sequencing approaches. The recent emergence of nanopore sequencing, specifically the ability to directly sequence native RNAs from virus-infected host cells, provides not just a new methodology for mapping modified ribonucleotides but also a new conceptual framework for what can be derived from the resulting sequencing data. In this minireview, we provide a detailed overview of how nanopore direct RNA sequencing works, the computational approaches applied to identify modified ribonucleotides, and the core concepts underlying both. We further highlight recent studies that have applied this approach to interrogating viral biology and finish by discussing key experimental considerations and how we predict that these methodologies will continue to evolve.
Abstract
Motivation
The chemical modification of ribonucleotides regulates the structure, stability and interactions of RNAs. Profiling of these modifications using short-read (Illumina) sequencing techniques provides high sensitivity but low-to-medium resolution, i.e. modifications cannot be assigned to specific transcript isoforms in regions of sequence overlap. An alternative strategy uses current fluctuations in nanopore-based long-read direct RNA sequencing (DRS) to infer the location and identity of nucleotides that differ between two experimental conditions. While highly sensitive, these signal-level analyses require high-quality transcriptome annotations and thus are best suited to the study of model organisms. By contrast, the detection of RNA modifications in microbial organisms, which typically have no or low-quality annotations, requires an alternative strategy. Here, we demonstrate that signal fluctuations directly influence error rates during base-calling and thus provide an alternative approach for identifying modified nucleotides.
Results
DRUMMER (Detection of Ribonucleic acid Modifications Manifested in Error Rates) (i) utilizes a range of statistical tests and background noise correction to identify modified nucleotides with high confidence, (ii) operates with similar sensitivity to signal-level analysis approaches and (iii) correlates very well with orthogonal approaches. Using well-characterized DRS datasets supported by independent meRIP-Seq and miCLIP-Seq datasets we demonstrate that DRUMMER operates with high sensitivity and specificity.
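The idea of detecting modifications through base-calling error rates can be illustrated with a minimal sketch. It assumes that, at a single transcript position, reads from two conditions have already been tallied into matching and mismatching base calls, and it compares them with a G-test of independence. The function name `g_test_2x2` and the 2x2 match/error layout are hypothetical choices for illustration and are not taken from the DRUMMER code base.

```python
import math


def g_test_2x2(match_a, error_a, match_b, error_b):
    """G-test of independence on a 2x2 table of (match, error) counts.

    Rows are the two conditions (hypothetically: modification present
    vs. absent); columns are base calls that match the reference vs.
    base-calling errors at one position. Returns (G statistic, p-value).
    """
    observed = [[match_a, error_a], [match_b, error_b]]
    total = match_a + error_a + match_b + error_b
    if total == 0:
        return 0.0, 1.0  # no reads: nothing to test

    row_sums = [match_a + error_a, match_b + error_b]
    col_sums = [match_a + match_b, error_a + error_b]

    g = 0.0
    for i in range(2):
        for j in range(2):
            obs = observed[i][j]
            if obs > 0:  # empty cells contribute nothing to G
                expected = row_sums[i] * col_sums[j] / total
                g += 2.0 * obs * math.log(obs / expected)

    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(g, 0.0) / 2.0))
    return g, p_value


# Example: 5% errors in one condition vs. 30% in the other.
g_stat, p = g_test_2x2(950, 50, 700, 300)
print(f"G = {g_stat:.1f}, p = {p:.2e}")
```

In practice a per-position test like this would need multiple-testing correction and some form of background noise filtering before any position is called as modified.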
Availability and implementation
DRUMMER is written in Python 3 and is available as open source in the GitHub repository: https://github.com/DepledgeLab/DRUMMER.
Supplementary information
Supplementary data are available at Bioinformatics online.
Leishmania parasites cause a spectrum of clinical pathology in humans ranging from disfiguring cutaneous lesions to fatal visceral leishmaniasis. We have generated a reference genome for Leishmania mexicana and refined the reference genomes for Leishmania major, Leishmania infantum, and Leishmania braziliensis. This has allowed the identification of a remarkably low number of genes or paralog groups (2, 14, 19, and 67, respectively) unique to one species. These were found to be conserved in additional isolates of the same species. We have predicted allelic variation and find that in these isolates, L. major and L. infantum have a surprisingly low number of predicted heterozygous SNPs compared with L. braziliensis and L. mexicana. We used short read coverage to infer ploidy and gene copy numbers, identifying large copy number variations between species, with 200 tandem gene arrays in L. major and 132 in L. mexicana. Chromosome copy number also varied significantly between species, with nine supernumerary chromosomes in L. infantum, four in L. mexicana, two in L. braziliensis, and one in L. major. A significant bias against gene arrays on supernumerary chromosomes was shown to exist, indicating that duplication events occur more frequently on disomic chromosomes. Taken together, our data demonstrate that there is little variation in unique gene content across Leishmania species, but large-scale genetic heterogeneity can result through gene amplification on disomic chromosomes and variation in chromosome number. Increased gene copy number due to chromosome amplification may contribute to alterations in gene expression in response to environmental conditions in the host, providing a genetic basis for disease tropism.
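Inferring chromosome copy number from short read coverage can be sketched roughly as follows. The sketch assumes that a per-chromosome median read depth has already been computed and that most chromosomes are disomic, so that the median across chromosomes represents two copies. The function name, the input layout and the rounding step are illustrative assumptions, not the procedure used in the study.

```python
from statistics import median


def estimate_chromosome_copy_number(chrom_depths, baseline_ploidy=2):
    """Estimate integer copy number per chromosome from read depth.

    chrom_depths maps chromosome name -> median read depth.
    Assumption: the median depth across all chromosomes corresponds
    to `baseline_ploidy` copies (i.e. most chromosomes are disomic).
    """
    baseline_depth = median(chrom_depths.values())
    if baseline_depth <= 0:
        raise ValueError("baseline depth must be positive")
    return {
        chrom: round(baseline_ploidy * depth / baseline_depth)
        for chrom, depth in chrom_depths.items()
    }


# Hypothetical depths: chr4 and chr5 are sequenced more deeply than the rest.
depths = {"chr1": 40, "chr2": 41, "chr3": 39, "chr4": 60, "chr5": 80}
copies = estimate_chromosome_copy_number(depths)
supernumerary = sorted(c for c, n in copies.items() if n > 2)
print(copies, supernumerary)
```

Rounding to the nearest integer hides mosaic aneuploidy within a cell population, so keeping the unrounded ratio may be preferable when intermediate values are of interest.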
The rapid identification of antimicrobial resistance is essential for effective treatment of highly resistant Mycobacterium tuberculosis. Whole-genome sequencing provides comprehensive data on resistance mutations and strain typing for monitoring transmission, but unlike for conventional molecular tests, this has previously been achievable only from cultures of M. tuberculosis. Here we describe a method utilizing biotinylated RNA baits designed specifically for M. tuberculosis DNA to capture full M. tuberculosis genomes directly from infected sputum samples, allowing whole-genome sequencing without the requirement of culture. This was carried out on 24 smear-positive sputum samples, collected from the United Kingdom and Lithuania where a matched culture sample was available, and 2 samples that had failed to grow in culture. M. tuberculosis sequencing data were obtained directly from all 24 smear-positive culture-positive sputa, of which 20 were of high quality (>20× depth and >90% of the genome covered). Results were compared with those of conventional molecular and culture-based methods, and high levels of concordance between phenotypical resistance and predicted resistance based on genotype were observed. High-quality sequence data were obtained from one smear-positive culture-negative case. This study demonstrated for the first time the successful and accurate sequencing of M. tuberculosis genomes directly from uncultured sputa. Identification of known resistance mutations within a week of sample receipt offers the prospect for personalized rather than empirical treatment of drug-resistant tuberculosis, including the use of antimicrobial-sparing regimens, leading to improved outcomes.
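The high-quality threshold quoted above (>20× depth and >90% of the genome covered) can be expressed as a small check over per-position read depths. This is a sketch under stated assumptions: "depth" is taken to mean the mean depth across the genome, and a position counts as covered when at least one read overlaps it. The abstract does not specify either definition, and the function name is hypothetical.

```python
def summarize_coverage(depths, min_mean_depth=20.0, min_fraction_covered=0.90):
    """Summarize per-position depths and apply the quality thresholds.

    depths: sequence of read depths, one value per genome position.
    Assumptions: depth means mean depth; covered means depth >= 1.
    Returns (mean_depth, fraction_covered, is_high_quality).
    """
    if not depths:
        raise ValueError("depths must not be empty")
    genome_length = len(depths)
    mean_depth = sum(depths) / genome_length
    fraction_covered = sum(1 for d in depths if d > 0) / genome_length
    is_high_quality = (
        mean_depth > min_mean_depth and fraction_covered > min_fraction_covered
    )
    return mean_depth, fraction_covered, is_high_quality


# Toy genome of 100 positions: 95 covered at 30x, 5 with no reads.
print(summarize_coverage([30] * 95 + [0] * 5))
```

Per-position depths of this kind are typically exported from an alignment file by a depth-reporting tool before being passed to a check like this one.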
During latent infections with herpes simplex virus 1 (HSV-1), viral transcription is restricted and the genomes are mostly maintained in silenced chromatin, whereas in lytically infected cells all viral genes are transcribed and the genomes are dynamically chromatinized. Histones in the viral chromatin bear markers of silenced chromatin at early times in lytic infection or of active transcription at later times. The virion protein VP16 activates transcription of the immediate-early (IE) genes by recruiting transcription activators and chromatin remodelers to their promoters. Two IE proteins, ICP0 and ICP4, which modulate chromatin epigenetics, then activate transcription of early and late genes. Although chromatin is involved in the mechanism of activation of HSV-1 transcription, its precise role is not entirely understood. In the cellular genome, chromatin dynamics often modulate transcription competence whereas promoter-specific transcription factors determine transcription activity. Here, biophysical fractionation of serially digested HSV-1 chromatin followed by short-read deep sequencing indicates that nuclear HSV-1 DNA has different biophysical properties than protein-free or encapsidated HSV-1 DNA. The entire HSV-1 genomes in infected cells were equally accessible. The accessibility of transcribed or non-transcribed genes under any given condition did not differ, and each gene was entirely sampled in both the most and least accessible chromatin. However, HSV-1 genomes fractionated differently under conditions of generalized or restricted transcription. Approximately 1/3 of the HSV-1 DNA including fully sampled genes resolved to the most accessible chromatin when HSV-1 transcription was active, but such enrichment was reduced to only 3% under conditions of restricted HSV-1 transcription. Short sequences of restricted accessibility separated genes with different transcription levels.
Chromatin dynamics thus provide a first level of regulation on HSV-1 transcription, dictating the transcriptional competency of the genomes during lytic infections, whereas the transcription of individual genes is then most likely activated by specific transcription factors. Moreover, genes transcribed to different levels are separated by short sequences with limited accessibility.
Varicella-zoster virus (VZV) establishes latency in human sensory and cranial nerve ganglia during primary infection (varicella), and the virus can reactivate and cause zoster after primary infection. The mechanism of how the virus establishes and maintains latency and how it reactivates is poorly understood, largely due to the lack of robust models. We found that axonal infection of neurons derived from hESCs in a microfluidic device with cell-free parental Oka (POka) VZV resulted in latent infection with inability to detect several viral mRNAs by reverse transcriptase-quantitative PCR, no production of infectious virus, and maintenance of the viral DNA genome in endless configuration, consistent with an episome configuration. With deep sequencing, however, multiple viral mRNAs were detected. Treatment of the latently infected neurons with Ab to NGF resulted in production of infectious virus in about 25% of the latently infected cultures. Axonal infection of neurons with vaccine Oka (VOka) VZV resulted in a latent infection similar to infection with POka; however, in contrast to POka, VOka-infected neurons were markedly impaired for reactivation after treatment with Ab to NGF. In addition, viral transcription was markedly reduced in neurons latently infected with VOka compared with POka. Our in vitro system recapitulates both VZV latency and reactivation in vivo and may be used to study viral vaccines for their ability to establish latency and reactivate.