  • Hematopoietic cell populati...
    Deslattes Mays, Anne

    01/2015
    Dissertation

    "Progress in science results from new technologies, new discoveries and new ideas, probably in that order." Nobel Laureate Sydney Brenner (1927 - )

    Sequencing the human genome was a critical first step in laying the groundwork for understanding the molecular programming involved in transforming a cell from a healthy to a cancerous state. The complexity of the cellular transcriptome has become increasingly apparent as technological advances have exposed its diversity. Full-length RNA sequencing is crucial for an unbiased analysis of this complexity, which arises from posttranscriptional processing of primary transcripts that generates a variety of isoforms from the same genomic loci. Distinct cell lineages are defined by their transcript isoform expression profiles, and cells can be annotated by the expression of transcript isoforms that can result in functionally different proteins. Alternative splice site utilization provides cells with a powerful mechanism for regulating gene expression: it can alter the composition of the protein product and influence the rate of translation of transcripts from multi-exon genes.

    The overall goal of this project was to delineate the hematopoietic transcriptome revealed by full-length sequencing and to assess the shortcomings of transcriptome reconstruction using fragmented-read sequencing. The aims were to (a) evaluate the complexity of the hematopoietic transcriptome using full-length RNA sequencing, (b) compare the full-length RNA-sequencing transcriptome with the transcriptome reconstructed from fragmented-read sequencing, and (c) evaluate whether hematopoietic cell subpopulations show distinct transcriptome patterns. Reconstructing transcripts from fragmented-read sequencing has advanced our understanding of the transcriptome.
Here we show that full-length transcriptome sequencing is necessary to faithfully expose the transcriptome and understand its complexities. Abundance information and pathway analysis support this. In addition, full-length sequencing reveals open reading frames that code for contiguous canonical or fusion proteins, which can be validated with peptides. This transcriptome diversity is consistent with the distinct phenotypes of cell subpopulations present in tissues. Accurate transcriptome measurement builds a foundation that can be relied upon to ensure higher success rates for therapeutics and lower false discovery rates for biomarkers of disease. The analysis of transcripts from a set of selected genes, together with the potential for posttranscriptional processing, predicts a highly complex transcriptome and an abundance of hitherto unknown protein isoforms. Classic approaches have not allowed full testing of this hypothesis because of limitations in sequencing read lengths. Full-length sequencing technology gives us an opportunity to uncover transcripts that cannot be obtained through traditional transcript reconstruction techniques.