Abstract
Centromeres are functionally conserved chromosomal loci essential for proper chromosome segregation during cell division, yet they show high sequence diversity across species. Despite their ...variation, a near universal feature of centromeres is the presence of repetitive sequences, such as DNA satellites and transposable elements (TEs). Because of their rapidly evolving karyotypes, gibbons represent a compelling model to investigate divergence of functional centromere sequences across short evolutionary timescales. In this study, we use ChIP-seq, RNA-seq, and fluorescence in situ hybridization to comprehensively investigate the centromeric repeat content of the four extant gibbon genera (Hoolock, Hylobates, Nomascus, and Siamang). In all gibbon genera, we find that CENP-A nucleosomes and the DNA-proteins that interface with the inner kinetochore preferentially bind retroelements of broad classes rather than satellite DNA. A previously identified gibbon-specific composite retrotransposon, LAVA, known to be expanded within the centromere regions of one gibbon genus (Hoolock), displays centromere- and species-specific sequence differences, potentially as a result of its co-option to a centromeric function. When dissecting centromere satellite composition, we discovered the presence of the retroelement-derived macrosatellite SST1 in multiple centromeres of Hoolock, whereas alpha-satellites represent the predominate satellite in the other genera, further suggesting an independent evolutionary trajectory for Hoolock centromeres. Finally, using de novo assembly of centromere sequences, we determined that transcripts originating from gibbon centromeres recapitulate the species-specific TE composition. Combined, our data reveal dynamic shifts in the repeat content that define gibbon centromeres and coincide with the extensive karyotypic diversity within this lineage.
Mobile elements and repetitive genomic regions are sources of lineage-specific genomic innovation and uniquely fingerprint individual genomes. Comprehensive analyses of such repeat elements, ...including those found in more complex regions of the genome, require a complete, linear genome assembly. We present a de novo repeat discovery and annotation of the T2T-CHM13 human reference genome. We identified previously unknown satellite arrays, expanded the catalog of variants and families for repeats and mobile elements, characterized classes of complex composite repeats, and located retroelement transduction events. We detected nascent transcription and delineated CpG methylation profiles to define the structure of transcriptionally active retroelements in humans, including those in centromeres. These data expand our insight into the diversity, distribution, and evolution of repetitive regions that have shaped the human genome.
Chromosome reshuffling events are often a foundational mechanism by which speciation can occur, giving rise to highly derivative karyotypes even amongst closely related species. Yet, the features ...that distinguish lineages prone to such rapid chromosome evolution from those that maintain stable karyotypes across evolutionary time are still to be defined. In this review, we summarize lineages prone to rapid karyotypic evolution in the context of Simpson's rates of evolution-tachytelic, horotelic, and bradytelic-and outline the mechanisms proposed to contribute to chromosome rearrangements, their fixation, and their potential impact on speciation events. Furthermore, we discuss relevant genomic features that underpin chromosome variation, including patterns of fusions/fissions, centromere positioning, and epigenetic marks such as DNA methylation. Finally, in the era of telomere-to-telomere genomics, we discuss the value of gapless genome resources to the future of research focused on the plasticity of highly rearranged karyotypes.
The Javan gibbon, Hylobates moloch, is an endangered gibbon species restricted to the forest remnants of western and central Java, Indonesia, and one of the rarest of the Hylobatidae family. ...Hylobatids consist of 4 genera (Holoock, Hylobates, Symphalangus, and Nomascus) that are characterized by different numbers of chromosomes, ranging from 38 to 52. The underlying cause of this karyotype plasticity is not entirely understood, at least in part, due to the limited availability of genomic data. Here we present the first scaffold-level assembly for H. moloch using a combination of whole-genome Illumina short reads, 10X Chromium linked reads, PacBio, and Oxford Nanopore long reads and proximity-ligation data. This Hylobates genome represents a valuable new resource for comparative genomics studies in primates.
Current genome sequencing technologies have made it possible to generate highly contiguous genome assemblies for non-model animal species. Despite advances in genome assembly methods, there is still ...room for improvement in the delineation of specific gene features in the genomes. Here we present genome visualization and annotation tools to support seven livestock species (bovine, chicken, goat, horse, pig, sheep, and water buffalo), available in a new resource called AgAnimalGenomes. In addition to supporting the manual refinement of gene models, these browsers provide visualization tracks for hundreds of RNAseq experiments, as well as data generated by the Functional Annotation of Animal Genomes (FAANG) Consortium. For species with predicted gene sets from both Ensembl and RefSeq, the browsers provide special tracks showing the thousands of protein-coding genes that disagree across the two gene sources, serving as a valuable resource to alert researchers to gene model issues that may affect data interpretation. We describe the data and search methods available in the new genome browsers and how to use the provided tools to edit and create new gene models.
The eastern quoll (Dasyurus viverrinus) is an endangered marsupial native to Australia. Since the extirpation of its mainland populations in the 20th century, wild eastern quolls have been restricted ...to two islands at the southern end of their historical range. Eastern quolls are the subject of captive breeding programs and attempts have been made to re-establish a population in mainland Australia. However, few resources currently exist to guide the genetic management of this species. Here, we generated a reference genome for the eastern quoll with gene annotations supported by multi-tissue transcriptomes. Our assembly is among the most complete marsupial genomes currently available. Using this assembly, we infer the species' demographic history, identifying potential evidence of a long-term decline beginning in the late Pleistocene. Finally, we identify a deletion at the ASIP locus that likely underpins pelage color differences between the eastern quoll and the closely related Tasmanian devil (Sarcophilus harrisii).
The complete sequence of a human genome Nurk, Sergey; Koren, Sergey; Rhie, Arang ...
Science (American Association for the Advancement of Science),
04/2022, Volume:
376, Issue:
6588
Journal Article
Peer reviewed
Open access
Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining ...8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion-base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be protein coding. The completed regions include all centromeric satellite arrays, recent segmental duplications, and the short arms of all five acrocentric chromosomes, unlocking these complex regions of the genome to variational and functional studies.
Complete genomic and epigenetic maps of human centromeres Altemose, Nicolas; Logsdon, Glennis A; Bzikadze, Andrey V ...
Science (American Association for the Advancement of Science),
04/2022, Volume:
376, Issue:
6588
Journal Article
Peer reviewed
Open access
Existing human genome assemblies have almost entirely excluded repetitive sequences within and near centromeres, limiting our understanding of their organization, evolution, and functions, which ...include facilitating proper chromosome segregation. Now, a complete, telomere-to-telomere human genome assembly (T2T-CHM13) has enabled us to comprehensively characterize pericentromeric and centromeric repeats, which constitute 6.2% of the genome (189.9 megabases). Detailed maps of these regions revealed multimegabase structural rearrangements, including in active centromeric repeat arrays. Analysis of centromere-associated sequences uncovered a strong relationship between the position of the centromere and the evolution of the surrounding DNA through layered repeat expansions. Furthermore, comparisons of chromosome X centromeres across a diverse panel of individuals illuminated high degrees of structural, epigenetic, and sequence variation in these complex and rapidly evolving regions.
The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications
. As a ...result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished
. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a previous assembly of the CHM13 genome
and mapped available population variation, clinical variants and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.