Tapeworms (Cestoda) cause neglected diseases that can be fatal and are difficult to treat, owing to inefficient drugs. Here we present an analysis of tapeworm genome sequences using the human-infective species Echinococcus multilocularis, E. granulosus, Taenia solium and the laboratory model Hymenolepis microstoma as examples. The 115- to 141-megabase genomes offer insights into the evolution of parasitism. Synteny is maintained with distantly related blood flukes but we find extreme losses of genes and pathways that are ubiquitous in other animals, including 34 homeobox families and several determinants of stem cell fate. Tapeworms have specialized detoxification pathways, metabolism that is finely tuned to rely on nutrients scavenged from their hosts, and species-specific expansions of non-canonical heat shock proteins and families of known antigens. We identify new potential drug targets, including some on which existing pharmaceuticals may act. The genomes provide a rich resource to underpin the development of urgently needed treatments and control.
Genes of the Hox cluster are restricted to the animal kingdom and play a central role in axial patterning in divergent animal phyla. Despite its evolutionary and developmental significance, the origin of the Hox gene cluster is obscure. The consensus is that a primordial Hox cluster arose by tandem gene duplication close to animal origins. Several homeobox genes with high sequence identity to Hox genes are found outside the Hox cluster and are known as 'dispersed' Hox-like genes; these genes may have been transposed away from an expanding cluster. Here we show that three of these dispersed homeobox genes form a novel gene cluster in the cephalochordate amphioxus. We argue that this 'ParaHox' gene cluster is an ancient paralogue (evolutionary sister) of the Hox gene cluster; the two gene clusters arose by duplication of a ProtoHox gene cluster. Furthermore, we show that amphioxus ParaHox genes have co-linear developmental expression patterns in anterior, middle and posterior tissues. We propose that the origin of distinct Hox and ParaHox genes by gene-cluster duplication facilitated an increase in body complexity during the Cambrian explosion.
Biosilicification is widespread across the eukaryotes and requires concentration of silicon in intracellular vesicles. Knowledge of the molecular mechanisms underlying this process remains limited, with unrelated silicon-transporting proteins found in the eukaryotic clades previously studied. Here, we report the identification of silicon transporter (SIT)-type genes from the siliceous loricate choanoflagellates Stephanoeca diplocostata and Diaphanoeca grandis. Until now, the SIT gene family has been identified only in diatoms and other siliceous stramenopiles, which are distantly related to choanoflagellates among the eukaryotes. This is the first evidence of similarity between SITs from different eukaryotic supergroups. Phylogenetic analysis indicates that choanoflagellate and stramenopile SITs form distinct monophyletic groups. The absence of putative SIT genes in any other eukaryotic groups, including non-siliceous choanoflagellates, leads us to propose that SIT genes underwent a lateral gene transfer event between stramenopiles and loricate choanoflagellates. We suggest that the incorporation of a foreign SIT gene into the stramenopile or choanoflagellate genome resulted in a major metabolic change: the acquisition of biomineralized silica structures. This hypothesis implies that biosilicification has evolved multiple times independently in the eukaryotes, and paves the way for a better understanding of the biochemical basis of silicon transport through identification of conserved sequence motifs.
We present a genome assembly from an individual male Mythimna albipuncta (the White-point; Arthropoda; Insecta; Lepidoptera; Noctuidae). The genome sequence is 698.6 megabases in span. Most of the assembly is scaffolded into 31 chromosomal pseudomolecules, including the Z sex chromosome. The mitochondrial genome has also been assembled and is 15.38 kilobases in length. Gene annotation of this assembly on Ensembl identified 13,679 protein coding genes.
Homeobox genes encode transcription factors with essential roles in patterning and cell fate in developing animal embryos. Many homeobox genes, including Hox and NK genes, are arranged in gene clusters, a feature likely related to transcriptional control. Sparse taxon sampling and fragmentary genome assemblies mean that little is known about the dynamics of homeobox gene evolution across Lepidoptera or about how changes in homeobox gene number and organization relate to diversity in this large order of insects. Here we analyze an extensive data set of high-quality genomes to characterize the number and organization of all homeobox genes in 123 species of Lepidoptera from 23 taxonomic families. We find most Lepidoptera have around 100 homeobox loci, including an unusual Hox gene cluster in which the
gene is repositioned and the
gene is next to
A topologically associating domain spans much of the gene cluster, suggesting deep regulatory conservation of the Hox cluster arrangement in this insect order. Most Lepidoptera have four Shx genes, divergent
-derived loci, but these loci underwent dramatic duplication in several lineages, with some moths having over 165 homeobox loci in the Hox gene cluster; this expansion is associated with local LINE element density. In contrast, the NK gene cluster content is more stable, although there are differences in organization compared with other insects, as well as major rearrangements within butterflies. Our analysis represents the first description of homeobox gene content across the order Lepidoptera, exemplifying the potential of newly generated genome assemblies for understanding genome and gene family evolution.
MicroRNAs (miRNAs) are involved in posttranscriptional regulation of gene expression. Because several miRNAs are known to affect the stability or translation of developmental regulatory genes, the origin of novel miRNAs may have contributed to the evolution of developmental processes and morphology. Lepidoptera (butterflies and moths) is a species-rich clade with a well-established phylogeny and abundant genomic resources, thereby representing an ideal system in which to study miRNA evolution. We sequenced small RNA libraries from developmental stages of two divergent lepidopterans, Cameraria ohridella (Horse chestnut Leafminer) and Pararge aegeria (Speckled Wood butterfly), discovering 90 and 81 conserved miRNAs, respectively, and many species-specific miRNA sequences. Mapping miRNAs onto the lepidopteran phylogeny reveals rapid miRNA turnover and an episode of miRNA fixation early in lepidopteran evolution, implying that miRNA acquisition accompanied the early radiation of the Lepidoptera. One lepidopteran-specific miRNA gene, miR-2768, is located within an intron of the homeobox gene invected, involved in insect segmental and wing patterning. We identified cubitus interruptus (ci) as a likely direct target of miR-2768, and validated this suppression using a luciferase assay system. We propose a model by which miR-2768 modulates expression of ci in the segmentation pathway and in patterning of lepidopteran wing primordia.
The Pacific oyster Crassostrea gigas belongs to one of the most species-rich but genomically poorly explored phyla, the Mollusca. Here we report the sequencing and assembly of the oyster genome using short reads and a fosmid-pooling strategy, along with transcriptomes of development and stress response and the proteome of the shell. The oyster genome is highly polymorphic and rich in repetitive sequences, with some transposable elements still actively shaping variation. Transcriptome studies reveal an extensive set of genes responding to environmental stress. The expansion of genes coding for heat shock protein 70 and inhibitors of apoptosis is probably central to the oyster's adaptation to sessile life in the highly stressful intertidal zone. Our analyses also show that shell formation in molluscs is more complex than currently understood and involves extensive participation of cells and their exosomes. The oyster genome sequence fills a void in our understanding of the Lophotrochozoa.
We present a genome assembly from an individual female
(the White-barred Gold; Arthropoda; Insecta; Lepidoptera; Micropterigidae). The genome sequence is 1,079 megabases in span. Most of the assembly is scaffolded into 31 chromosomal pseudomolecules, including the assembled Z sex chromosome. The mitochondrial genome has also been assembled and is 15.0 kilobases in length.
The emergence of multicellular organisms from single-celled ancestors – which occurred several times, independently in different branches of the eukaryotic tree – is one of the most profound evolutionary transitions in the history of life. These events not only radically changed the course of life on Earth but also created new challenges, including the need for cooperation and communication between cells, and the division of labor among different cell types. However, the genetic changes that accompanied the several origins of multicellularity remain elusive. Recently, the National Human Genome Research Institute (NHGRI) endorsed a multi-taxon genome-sequencing initiative that aims to gain insights into how multicellularity first evolved. This initiative (which we have termed UNICORN) will generate extensive genomic data from some of the closest extant unicellular relatives of both animals and fungi. Here, we introduce this initiative and the biological questions that underpin it, summarize the rationale guiding the choice of organisms and discuss the anticipated benefits to the broader scientific community.
Homeobox genes are a large and diverse group of genes, many of which play important roles in transcriptional regulation during embryonic development. Comparison of homeobox genes between species may provide insights into the evolution of developmental mechanisms.
Here we report an extensive survey of human and mouse homeobox genes based on their most recent genome assemblies, providing the first comprehensive analysis of mouse homeobox genes and updating an earlier survey of human homeobox genes. In total we recognize 333 human homeobox loci comprising 255 probable genes and 78 probable pseudogenes, and 324 mouse homeobox loci comprising 279 probable genes and 45 probable pseudogenes (accessible at http://homeodb.zoo.ox.ac.uk). Comparison to partial genome sequences from other species allows us to resolve which differences are due to gain of genes and which are due to gene losses.
We find there has been much more homeobox gene loss in the rodent evolutionary lineage than in the primate lineage. While humans have lost only the Msx3 gene, mice have lost Ventx, Argfx, Dprx, Shox, Rax2, LOC647589, Tprx1 and Nanognb. This analysis provides insight into the patterns of homeobox gene evolution in the mammals, and a step towards relating genomic evolution to phenotypic evolution.