Massively parallel sequencing of cDNA has enabled deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here we present the Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available. By efficiently constructing and analyzing sets of de Bruijn graphs, Trinity fully reconstructs a large fraction of transcripts, including alternatively spliced isoforms and transcripts from recently duplicated genes. Compared with other de novo transcriptome assemblers, Trinity recovers more full-length transcripts across a broad range of expression levels, with a sensitivity similar to methods that rely on genome alignments. Our approach provides a unified solution for transcriptome reconstruction in any sample, especially in the absence of a reference genome.
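To make the de Bruijn graph idea concrete, the sketch below builds a graph from a few toy reads and extends a contig along unbranched edges. It is a minimal illustration of the general technique, not Trinity's implementation: the function names, the toy reads and the choice of k = 4 are assumptions made for this example, and real assemblers also handle sequencing errors, coverage and branching paths such as splice isoforms.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])  # edge: prefix -> suffix
    return graph


def extend_unambiguous(graph, start):
    """Walk from `start` while each node has exactly one successor.

    Stops at a branch, a dead end, or a node that was already visited.
    """
    path, node, seen = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in seen:  # avoid looping forever on a cycle
            break
        path += nxt[-1]
        seen.add(nxt)
        node = nxt
    return path


# Hypothetical overlapping reads drawn from one short transcript.
reads = ["ATGGCGT", "GGCGTGC", "GTGCAAT"]
graph = build_de_bruijn(reads, k=4)
contig = extend_unambiguous(graph, "ATG")
print(contig)  # ATGGCGTGCAAT
```

The three reads overlap end to end, so the walk recovers the full 12-base sequence; a node with two successors, as an alternative splice junction would produce, is where this simple walk stops.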
Acute infection is known to induce rapid expansion of hematopoietic stem cells (HSCs), but the mechanisms supporting this expansion remain incompletely understood. Using mouse models, we show that inducible CD36 is required for free fatty acid (FFA) uptake by HSCs during acute infection, allowing the metabolic transition from glycolysis towards β-oxidation. Mechanistically, high CD36 levels promote FFA uptake, which enables CPT1A to transport fatty acyl chains from the cytosol into the mitochondria. Without CD36-mediated FFA uptake, the HSCs are unable to enter the cell cycle, subsequently increasing mortality in response to bacterial infection. These findings enhance our understanding of HSC metabolism in the bone marrow microenvironment, which supports the expansion of HSCs during pathogenic challenge.
High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species. To address this issue, the international Genome 10K (G10K) consortium has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate lineages. We confirm that long-read sequencing technologies are essential for maximizing genome quality, and that unresolved complex repeats and haplotype heterozygosity are major sources of assembly error when not handled correctly. Our assemblies correct substantial errors, add missing sequence in some of the best historical reference genomes, and reveal biological discoveries. These include the identification of many false gene duplications, increases in gene sizes, chromosome rearrangements that are specific to lineages, a repeated independent chromosome breakpoint in bat genomes, and a canonical GC-rich pattern in protein-coding genes and their regulatory regions. Adopting these lessons, we have embarked on the Vertebrate Genomes Project (VGP), an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help to enable a new era of discovery across the life sciences.
Hematopoietic stem cells (HSCs) undergo rapid expansion in response to stress stimuli. Here we investigate the bioenergetic processes that facilitate HSC expansion in response to infection. We find that infection by Gram-negative bacteria drives an increase in mitochondrial mass in mammalian HSCs, which results in a metabolic transition from glycolysis toward oxidative phosphorylation. The initial increase in mitochondrial mass occurs as a result of mitochondrial transfer from the bone marrow stromal cells (BMSCs) to HSCs through a reactive oxygen species (ROS)-dependent mechanism. Mechanistically, ROS-induced oxidative stress regulates the opening of connexin channels in a system mediated by phosphoinositide 3-kinase (PI3K) activation, which allows the mitochondria to transfer from BMSCs into HSCs. Moreover, mitochondrial transfer from BMSCs into HSCs in response to bacterial infection occurs before the HSCs activate their own transcriptional program for mitochondrial biogenesis. Our discovery demonstrates that mitochondrial transfer from the bone marrow microenvironment to HSCs is an early physiologic event in the mammalian response to acute bacterial infection and results in bioenergetic changes which underpin emergency granulopoiesis.
The evolutionary history of a gene helps predict its function and relationship to phenotypic traits. Although sequence conservation is commonly used to decipher gene function and assess medical relevance, methods for functional inference from comparative expression data are lacking. Here, we use RNA-seq across seven tissues from 17 mammalian species to show that expression evolution across mammals is accurately modeled by the Ornstein-Uhlenbeck process, a commonly proposed model of continuous trait evolution. We apply this model to identify expression pathways under neutral, stabilizing, and directional selection. We further demonstrate novel applications of this model to quantify the extent of stabilizing selection on a gene's expression, parameterize the distribution of each gene's optimal expression level, and detect deleterious expression levels in expression data from individual patients. Our work provides a statistical framework for interpreting expression data across species and in disease.
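For readers unfamiliar with the Ornstein-Uhlenbeck process, its standard textbook form is given below. The symbols are assumptions chosen for this note rather than the paper's notation: X_t is a gene's expression level at time t, θ the optimal level, α the strength of the pull toward that optimum, σ the intensity of random drift, and W_t a Wiener process.

```latex
% Ornstein-Uhlenbeck stochastic differential equation
dX_t = \alpha\,(\theta - X_t)\,dt + \sigma\,dW_t, \qquad \alpha > 0

% Mean and variance at time t, starting from X_0 = x_0
\mathbb{E}[X_t] = \theta + (x_0 - \theta)\,e^{-\alpha t}, \qquad
\operatorname{Var}[X_t] = \frac{\sigma^2}{2\alpha}\left(1 - e^{-2\alpha t}\right)

% Stationary limit as t -> infinity
X_\infty \sim \mathcal{N}\!\left(\theta,\ \frac{\sigma^2}{2\alpha}\right)
```

Under this reading, a larger α relative to σ² gives a narrower stationary distribution, which is one way to interpret stronger stabilizing selection on expression, while α → 0 approaches Brownian motion, the usual model of neutral drift.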
We used 20 de novo genome assemblies to probe the speciation history and architecture of gene flow in rapidly radiating butterflies. Our tests to distinguish incomplete lineage sorting from introgression indicate that gene flow has obscured several ancient phylogenetic relationships in this group over large swathes of the genome. Introgressed loci are underrepresented in low-recombination and gene-rich regions, consistent with the purging of foreign alleles more tightly linked to incompatibility loci. Here, we identify a hitherto unknown inversion that traps a color pattern switch locus. We infer that this inversion was transferred between lineages by introgression and is convergent with a similar rearrangement in another part of the genus. These multiple de novo genome sequences enable improved understanding of the importance of introgression and selective processes in adaptive radiation.
Marine stickleback fish have colonized and adapted to thousands of streams and lakes formed since the last ice age, providing an exceptional opportunity to characterize genomic mechanisms underlying repeated ecological adaptation in nature. Here we develop a high-quality reference genome assembly for threespine sticklebacks. By sequencing the genomes of twenty additional individuals from a global set of marine and freshwater populations, we identify a genome-wide set of loci that are consistently associated with marine-freshwater divergence. Our results indicate that reuse of globally shared standing genetic variation, including chromosomal inversions, has an important role in repeated evolution of distinct marine and freshwater sticklebacks, and in the maintenance of divergent ecotypes during early stages of reproductive isolation. Both coding and regulatory changes occur in the set of loci underlying marine-freshwater evolution, but regulatory changes appear to predominate in this well-known example of repeated adaptive evolution in nature.
Lymphoma is the most common hematological malignancy in developed countries. Outcome is strongly determined by molecular subtype, reflecting a need for new and improved treatment options. Dogs spontaneously develop lymphoma, and the predisposition of certain breeds indicates genetic risk factors. Using the dog breed structure, we selected three lymphoma-predisposed breeds that develop primarily T-cell lymphoma (boxer), primarily B-cell lymphoma (cocker spaniel), or B- and T-cell lymphoma in equal proportions (golden retriever). We investigated the somatic mutations in B- and T-cell lymphomas from these breeds by exome sequencing of tumor and normal pairs. Strong similarities were evident between B-cell lymphomas from golden retrievers and cocker spaniels, with recurrent mutations in TRAF3-MAP3K14 (28% of all cases), FBXW7 (25%), and POT1 (17%). The FBXW7 mutations recurrently occur in a specific codon; the corresponding codon is recurrently mutated in human cancer. In contrast, T-cell lymphomas from the predisposed breeds, boxers and golden retrievers, show little overlap in their mutation pattern, sharing only one of their 15 most recurrently mutated genes. Boxers, which develop aggressive T-cell lymphomas, are typically mutated in the PTEN-mTOR pathway. T-cell lymphomas in golden retrievers are often less aggressive, and their tumors typically showed mutations in genes involved in cellular metabolism. We identify genes with known involvement in human lymphoma and leukemia, genes implicated in other human cancers, as well as novel genes that could allow new therapeutic options.
Elephantids are the world’s most iconic megafaunal family, yet there is no comprehensive genomic assessment of their relationships. We report a total of 14 genomes, including 2 from the American mastodon, which is an extinct elephantid relative, and 12 spanning all three extant and three extinct elephantid species including an ∼120,000-y-old straight-tusked elephant, a Columbian mammoth, and woolly mammoths. Earlier genetic studies modeled elephantid evolution via simple bifurcating trees, but here we show that interspecies hybridization has been a recurrent feature of elephantid evolution. We found that the genetic makeup of the straight-tusked elephant, previously placed as a sister group to African forest elephants based on lower coverage data, in fact comprises three major components. Most of the straight-tusked elephant’s ancestry derives from a lineage related to the ancestor of African elephants while its remaining ancestry consists of a large contribution from a lineage related to forest elephants and another related to mammoths. Columbian and woolly mammoths also showed evidence of interbreeding, likely following a latitudinal cline across North America. While hybridization events have shaped elephantid history in profound ways, isolation also appears to have played an important role. Our data reveal nearly complete isolation between the ancestors of the African forest and savanna elephants for ∼500,000 y, providing compelling justification for the conservation of forest and savanna elephants as separate species.
The genetic changes underlying the initial steps of animal domestication are still poorly understood. We generated a high-quality reference genome for the rabbit and compared it to resequencing data from populations of wild and domestic rabbits. We identified more than 100 selective sweeps specific to domestic rabbits but only a relatively small number of fixed (or nearly fixed) single-nucleotide polymorphisms (SNPs) for derived alleles. SNPs with marked allele frequency differences between wild and domestic rabbits were enriched for conserved noncoding sites. Enrichment analyses suggest that genes affecting brain and neuronal development have often been targeted during domestication. We propose that because of a truly complex genetic background, tame behavior in rabbits and other domestic animals evolved by shifts in allele frequencies at many loci, rather than by critical changes at only a few domestication loci.