High-Throughput Sequencing Technologies Reuter, Jason A.; Spacek, Damek V.; Snyder, Michael P.
Molecular Cell, 05/2015, Volume 58, Issue 4
Journal Article
Peer reviewed
Open access
The human genome sequence has profoundly altered our understanding of biology, human diversity, and disease. The path from the first draft sequence to our nascent era of personal genomes and genomic medicine has been made possible only because of the extraordinary advancements in DNA sequencing technologies over the past 10 years. Here, we discuss commonly used high-throughput sequencing platforms, the growing array of sequencing assays developed around them, as well as the challenges facing current sequencing platforms and their clinical application.
Reuter et al. discuss commonly used high-throughput sequencing platforms, challenges, and applications.
A new wave of portable biosensors allows frequent measurement of health-related physiology. We investigated the use of these devices to monitor human physiological changes during various activities and their role in managing health and diagnosing and analyzing disease. By recording over 250,000 daily measurements for up to 43 individuals, we found personalized circadian differences in physiological parameters, replicating previous physiological findings. Interestingly, we found striking changes in particular environments, such as airline flights (decreased peripheral capillary oxygen saturation [SpO2] and increased radiation exposure). These events are associated with physiological macro-phenotypes such as fatigue, providing a strong association between reduced pressure/oxygen and fatigue on high-altitude flights. Importantly, we combined biosensor information with frequent medical measurements and made two important observations: First, wearable devices were useful in identification of early signs of Lyme disease and inflammatory responses; we used this information to develop a personalized, activity-based normalization framework to identify abnormal physiological signals from longitudinal data for facile disease detection. Second, wearables distinguish physiological differences between insulin-sensitive and -resistant individuals. Overall, these results indicate that portable biosensors provide useful information for monitoring personal activities and physiology and are likely to play an important role in managing health and enabling affordable health care access to groups traditionally limited by socioeconomic class or remote geography.
Advances in genome sequencing have progressed at a rapid pace, with increased throughput accompanied by plunging costs. But these advances go far beyond faster and cheaper. High-throughput sequencing technologies are now routinely being applied to a wide range of important topics in biology and medicine, often allowing researchers to address important biological questions that were not possible before. In this review, we discuss these innovative new approaches (including ever finer analyses of transcriptome dynamics, genome structure and genomic variation) and provide an overview of the new insights into complex biological systems catalyzed by these technologies. We also assess the impact of genotyping, genome sequencing and personal omics profiling on medical applications, including diagnosis and disease monitoring. Finally, we review recent developments in single-cell sequencing, and conclude with a discussion of possible future advances and obstacles for sequencing in biology and health.
Genome sequencing technologies have advanced rapidly, dramatically decreasing cost and increasing throughput. But beyond faster and cheaper, these advances have also stimulated the development of innovative new experimental approaches, and are opening new doors in human medicine and health.
Personal transcriptomes in which all of an individual's genetic variants (e.g., single nucleotide variants) and transcript isoforms (transcription start sites, splice sites, and polyA sites) are defined and quantified for full-length transcripts are expected to be important for understanding individual biology and disease, but have not been described previously. To obtain such transcriptomes, we sequenced the lymphoblastoid transcriptomes of three family members (GM12878 and the parents GM12891 and GM12892) by using a Pacific Biosciences long-read approach complemented with Illumina 101-bp sequencing and made the following observations. First, we found that reads representing all splice sites of a transcript are evident for most sufficiently expressed genes ≤3 kb and often for genes longer than that. Second, we added and quantified previously unidentified splicing isoforms to an existing annotation, thus creating the first personalized annotation to our knowledge. Third, we determined SNVs in a de novo manner and connected them to RNA haplotypes, including HLA haplotypes, thereby assigning single full-length RNA molecules to their transcribed allele, and demonstrated Mendelian inheritance of RNA molecules. Fourth, we show how RNA molecules can be linked to personal variants on a one-by-one basis, which allows us to assess differential allelic expression (DAE) and differential allelic isoforms (DAI) from the phased full-length isoform reads. The DAI method is largely independent of the distance between exon and SNV, in contrast to fragmentation-based methods. Overall, in addition to improving eukaryotic transcriptome annotation, these results describe, to our knowledge, the first large-scale and full-length personal transcriptome.
Transposable elements (TEs) have been shown to contain functional binding sites for certain transcription factors (TFs). However, the extent to which TEs contribute to the evolution of TF binding sites is not well known. We comprehensively mapped binding sites for 26 pairs of orthologous TFs in two pairs of human and mouse cell lines (representing two cell lineages), along with epigenomic profiles, including DNA methylation and six histone modifications. Overall, we found that 20% of binding sites were embedded within TEs. This number varied across different TFs, ranging from 2% to 40%. We further identified 710 TF-TE relationships in which genomic copies of a TE subfamily contributed a significant number of binding peaks for a TF, and we found that LTR elements dominated these relationships in human. Importantly, TE-derived binding peaks were strongly associated with open and active chromatin signatures, including reduced DNA methylation and increased enhancer-associated histone marks. On average, 66% of TE-derived binding events were cell type-specific with a cell type-specific epigenetic landscape. Most of the binding sites contributed by TEs were species-specific, but we also identified binding sites conserved between human and mouse, the functional relevance of which was supported by a signature of purifying selection on DNA sequences of these TEs. Interestingly, several TFs had significantly expanded binding site landscapes only in one species, which were linked to species-specific gene functions, suggesting that TEs are an important driving force for regulatory innovation. Taken together, our data suggest that TEs have significantly and continuously shaped gene regulatory networks during mammalian evolution.
Attention is the process that selects which sensory information is preferentially processed and ultimately reaches our awareness. Attention, however, is not a unitary process; it can be captured by unexpected or salient events (stimulus driven) or it can be deployed under voluntary control (goal directed), and these two forms of attention are implemented by largely distinct ventral and dorsal parieto-frontal networks. For coherent behavior and awareness to emerge, stimulus-driven and goal-directed behavior must ultimately interact. We found that the ventral, but not dorsal, network can account for stimulus-driven attentional limits to conscious perception, and that stimulus-driven and goal-directed attention converge in the lateral prefrontal component of that network. Although these results do not rule out dorsal network involvement in awareness when goal-directed task demands are present, they point to a general role for the lateral prefrontal cortex in the control of attention and awareness.
We define the chromatin accessibility and transcriptional landscapes in 13 human primary blood cell types that span the hematopoietic hierarchy. Exploiting the finding that the enhancer landscape better reflects cell identity than mRNA levels, we enable 'enhancer cytometry' for enumeration of pure cell types from complex populations. We identify regulators governing hematopoietic differentiation and further show the lineage ontogeny of genetic elements linked to diverse human diseases. In acute myeloid leukemia (AML), chromatin accessibility uncovers unique regulatory evolution in cancer cells with a progressively increasing mutation burden. Single AML cells exhibit distinctive mixed regulome profiles corresponding to disparate developmental stages. A method to account for this regulatory heterogeneity identified cancer-specific deviations and implicated HOX factors as key regulators of preleukemic hematopoietic stem cell characteristics. Thus, regulome dynamics can provide diverse insights into hematopoietic development and disease.
Trimethylation of histone H3 at lysine 4 (H3K4me3) is a chromatin modification known to mark the transcription start sites of active genes. Here, we show that H3K4me3 domains that spread more broadly over genes in a given cell type preferentially mark genes that are essential for the identity and function of that cell type. Using the broadest H3K4me3 domains as a discovery tool in neural progenitor cells, we identify novel regulators of these cells. Machine learning models reveal that the broadest H3K4me3 domains represent a distinct entity, characterized by increased marks of elongation. The broadest H3K4me3 domains also have more paused polymerase at their promoters, suggesting a unique transcriptional output. Indeed, genes marked by the broadest H3K4me3 domains exhibit enhanced transcriptional consistency rather than increased transcriptional levels, and perturbation of H3K4me3 breadth leads to changes in transcriptional consistency. Thus, H3K4me3 breadth contains information that could ensure transcriptional precision at key cell identity/function genes.
•Broad H3K4me3 domains mark cell identity genes and can be used as a discovery tool
•Broad H3K4me3 domains are a distinct entity defined by specific Pol II regulation
•Genes marked by broad H3K4me3 domains have increased transcriptional consistency
•Perturbation of H3K4me3 breadth leads to changes in transcriptional consistency
Genes marked by broad H3K4me3 domains have increased transcriptional consistency.
The coronavirus disease 2019 (COVID-19) global pandemic continues to spread worldwide with approximately 216 million confirmed cases and 4.49 million deaths to date. Intensive efforts are ongoing to combat this disease by suppressing viral transmission, understanding its pathogenesis, developing vaccination strategies, and identifying effective therapeutic targets. Individuals with preexisting diabetes also show higher incidence of COVID-19 illness and poorer prognosis upon infection. Likewise, an increased frequency of diabetes onset and diabetes complications has been reported in patients following COVID-19 diagnosis. COVID-19 may elevate the risk of hyperglycemia and other complications in patients with and without prior diabetes history. It is unclear whether the virus induces type 1 or type 2 diabetes or instead causes a novel atypical form of diabetes. Moreover, it remains unknown if recovering COVID-19 patients exhibit a higher risk of developing new-onset diabetes or its complications going forward. The aim of this review is to summarize what is currently known about the epidemiology and mechanisms of this bidirectional relationship between COVID-19 and diabetes. We highlight major challenges that hinder the study of COVID-19-induced new-onset of diabetes and propose a potential framework for overcoming these obstacles. We also review state-of-the-art wearables and microsampling technologies that can further study diabetes management and progression in new-onset diabetes cases. We conclude by outlining current research initiatives investigating the bidirectional relationship between COVID-19 and diabetes, some with emphasis on wearable technology.
In parallel to the genetic code for protein synthesis, a second layer of information is embedded in all RNA transcripts in the form of RNA structure. RNA structure influences practically every step in the gene expression program. However, the nature of most RNA structures or effects of sequence variation on structure are not known. Here we report the initial landscape and variation of RNA secondary structures (RSSs) in a human family trio (mother, father and their child). This provides a comprehensive RSS map of human coding and non-coding RNAs. We identify unique RSS signatures that demarcate open reading frames and splicing junctions, and define authentic microRNA-binding sites. Comparison of native deproteinized RNA isolated from cells versus refolded purified RNA suggests that the majority of the RSS information is encoded within RNA sequence. Over 1,900 transcribed single nucleotide variants (approximately 15% of all transcribed single nucleotide variants) alter local RNA structure. We discover simple sequence and spacing rules that determine the ability of point mutations to impact RSSs. Selective depletion of 'riboSNitches' versus structurally synonymous variants at precise locations suggests selection for specific RNA shapes at thousands of sites, including 3' untranslated regions, binding sites of microRNAs and RNA-binding proteins genome-wide. These results highlight the potentially broad contribution of RNA structure and its variation to gene regulation.