The complete sequence of a human genome
Nurk, Sergey; Koren, Sergey; Rhie, Arang ...
Science (American Association for the Advancement of Science), 04/2022, Volume: 376, Issue: 6588
Journal Article, Peer reviewed, Open access
Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining ...8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion-base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be protein coding. The completed regions include all centromeric satellite arrays, recent segmental duplications, and the short arms of all five acrocentric chromosomes, unlocking these complex regions of the genome to variational and functional studies.
Public health officials have raised concerns that plasmid transfer between Enterobacteriaceae species may spread resistance to carbapenems, an antibiotic class of last resort, thereby rendering ...common health care-associated infections nearly impossible to treat. To determine the diversity of carbapenemase-encoding plasmids and assess their mobility among bacterial species, we performed comprehensive surveillance and genomic sequencing of carbapenem-resistant Enterobacteriaceae in the National Institutes of Health (NIH) Clinical Center patient population and hospital environment. We isolated a repertoire of carbapenemase-encoding Enterobacteriaceae, including multiple strains of Klebsiella pneumoniae, Klebsiella oxytoca, Escherichia coli, Enterobacter cloacae, Citrobacter freundii, and Pantoea species. Long-read genome sequencing with full end-to-end assembly revealed that these organisms carry the carbapenem resistance genes on a wide array of plasmids. K. pneumoniae and E. cloacae isolated simultaneously from a single patient harbored two different carbapenemase-encoding plasmids, indicating that plasmid transfer between organisms was unlikely within this patient. We did, however, find evidence of horizontal transfer of carbapenemase-encoding plasmids between K. pneumoniae, E. cloacae, and C. freundii in the hospital environment. Our data, including full plasmid identification, challenge assumptions about horizontal gene transfer events within patients and identify possible connections between patients and the hospital environment. In addition, we identified a new carbapenemase-encoding plasmid of potentially high clinical impact carried by K. pneumoniae, E. coli, E. cloacae, and Pantoea species, in unrelated patients and in the hospital environment.
Calcium-activated photoproteins are luciferase variants found in photocyte cells of bioluminescent jellyfish (Phylum Cnidaria) and comb jellies (Phylum Ctenophora). The complete genomic sequence from ...the ctenophore Mnemiopsis leidyi, a representative of the earliest branch of animals that emit light, provided an opportunity to examine the genome of an organism that uses this class of luciferase for bioluminescence and to look for genes involved in light reception. To determine when photoprotein genes first arose, we examined the genomic sequence from other early-branching taxa. We combined our genomic survey with gene trees, developmental expression patterns, and functional protein assays of photoproteins and opsins to provide a comprehensive view of light production and light reception in Mnemiopsis.
The Mnemiopsis genome has 10 full-length photoprotein genes situated within two genomic clusters with high sequence conservation that are maintained due to strong purifying selection and concerted evolution. Photoprotein-like genes were also identified in the genomes of the non-luminescent sponge Amphimedon queenslandica and the non-luminescent cnidarian Nematostella vectensis, and phylogenomic analysis demonstrated that photoprotein genes arose at the base of all animals. Photoprotein gene expression in Mnemiopsis embryos begins during gastrulation in migrating precursors to photocytes and persists throughout development in the canals where photocytes reside. We identified three putative opsin genes in the Mnemiopsis genome and show that they do not group with well-known bilaterian opsin subfamilies. Interestingly, photoprotein transcripts are co-expressed with two of the putative opsins in developing photocytes. Opsin expression is also seen in the apical sensory organ. We present evidence that one opsin functions as a photopigment in vitro, absorbing light at wavelengths that overlap with peak photoprotein light emission, raising the hypothesis that light production and light reception may be functionally connected in ctenophore photocytes. We also present genomic evidence of a complete ciliary phototransduction cascade in Mnemiopsis.
This study elucidates the genomic organization, evolutionary history, and developmental expression of photoprotein and opsin genes in the ctenophore Mnemiopsis leidyi, introduces a novel dual role for ctenophore photocytes in both bioluminescence and phototransduction, and raises the possibility that light production and light reception are linked in this early-branching non-bilaterian animal.
Genome-wide association studies (GWAS) have identified >100 independent SNPs that modulate the risk of type 2 diabetes (T2D) and related traits. However, the pathogenic mechanisms of most of these ...SNPs remain elusive. Here, we examined genomic, epigenomic, and transcriptomic profiles in human pancreatic islets to understand the links between genetic variation, chromatin landscape, and gene expression in the context of T2D. We first integrated genome and transcriptome variation across 112 islet samples to produce dense cis-expression quantitative trait loci (cis-eQTL) maps. Additional integration with chromatin-state maps for islets and other diverse tissue types revealed that cis-eQTLs for islet-specific genes are specifically and significantly enriched in islet stretch enhancers. High-resolution chromatin accessibility profiling using assay for transposase-accessible chromatin sequencing (ATAC-seq) in two islet samples enabled us to identify specific transcription factor (TF) footprints embedded in active regulatory elements, which are highly enriched for islet cis-eQTL. Aggregate allelic bias signatures in TF footprints enabled us de novo to reconstruct TF binding affinities genetically, which support the high-quality nature of the TF footprint predictions. Interestingly, we found that T2D GWAS loci were strikingly and specifically enriched in islet Regulatory Factor X (RFX) footprints. Remarkably, within and across independent loci, T2D risk alleles that overlap with RFX footprints uniformly disrupt the RFX motifs at high-information content positions. Together, these results suggest that common regulatory variations have shaped islet TF footprints and the transcriptome and that a confluent RFX regulatory grammar plays a significant role in the genetic component of T2D predisposition.
Human microbiome studies have revealed the intricate interplay of host immunity and bacterial communities to achieve homeostatic balance. Healthy skin microbial communities are dominated by bacteria ...with low viral representation, mainly bacteriophage. Specific eukaryotic viruses have been implicated in both common and rare skin diseases, but cataloging skin viral communities has been limited. Alterations in host immunity provide an opportunity to expand our understanding of microbial-host interactions. Primary immunodeficient patients manifest with various viral, bacterial, fungal, and parasitic infections, including skin infections. Dedicator of cytokinesis 8 (DOCK8) deficiency is a rare primary human immunodeficiency characterized by recurrent cutaneous and systemic infections, as well as atopy and cancer susceptibility. DOCK8, encoding a guanine nucleotide exchange factor highly expressed in lymphocytes, regulates actin cytoskeleton, which is critical for migration through collagen-dense tissues such as skin. Analyzing deep metagenomic sequencing data from DOCK8-deficient skin samples demonstrated a notable increase in eukaryotic viral representation and diversity compared with healthy volunteers. De novo assembly approaches identified hundreds of novel human papillomavirus genomes, illuminating microbial dark matter. Expansion of the skin virome in DOCK8-deficient patients underscores the importance of immune surveillance in controlling eukaryotic viral colonization and infection.
DNA methylation plays a key role in X-chromosome inactivation (XCI), a process that achieves dosage compensation for X-encoded gene products between mammalian female and male cells. However, ...differential sex chromosome dosage complicates genome-wide epigenomic assessments, and the X chromosome is frequently excluded from female-to-male comparative analyses. Using the X chromosome in the sexually dimorphic mouse liver as a model, we provide a general framework for comparing base-resolution DNA methylation patterns across samples that have different chromosome numbers and ask at a systematic level if predictions by historical analyses of X-linked DNA methylation hold true at a base-resolution chromosome-wide level. We demonstrate that sex-specific methylation patterns on the X chromosome largely reflect the effects of XCI. While our observations concur with longstanding observations of XCI at promoter-proximal CpG islands, we provide evidence that sex-specific DNA methylation differences are not limited to CpG island boundaries. Moreover, these data support a model in which maintenance of CpG islands in the inactive state does not require complete regional methylation. Further, we validate an intragenic non-CpG methylation signature in genes escaping XCI in mouse liver. Our analyses provide insight into underlying methylation patterns that should be considered when assessing sex differences in genome-wide methylation analyses.
DNA methylation is an essential epigenetic process in mammals, intimately involved in gene regulation. Here we address the extent to which genetics, sex, and pregnancy influence genomic DNA ...methylation by intercrossing 2 inbred mouse strains, C57BL/6N and C3H/HeN, and analyzing DNA methylation in parents and offspring using whole-genome bisulfite sequencing. Differential methylation across genotype is detected at thousands of loci and is preserved on parental alleles in offspring. In comparison of autosomal DNA methylation patterns across sex, hundreds of differentially methylated regions are detected. Comparison of animals with different histories of pregnancy within our study reveals a CpG methylation pattern that is restricted to female animals that had borne offspring. Collectively, our results demonstrate the stability of CpG methylation across generations, clarify the interplay of epigenetics with genetics and sex, and suggest that CpG methylation may serve as an epigenetic record of life events in somatic tissues at loci whose expression is linked to the relevant biology.
Three types of DNA methyl modifications have been detected in bacterial genomes, and mechanistic studies have demonstrated roles for DNA methylation in physiological functions ranging from phage ...defense to transcriptional control of virulence and host-pathogen interactions. Despite the ubiquity of methyltransferases and the immense variety of possible methylation patterns, epigenomic diversity remains unexplored for most bacterial species. Members of the Bacteroides fragilis group (BFG) reside in the human gastrointestinal tract as key players in symbiotic communities but also can establish anaerobic infections that are increasingly multi-drug resistant. In this work, we utilize long-read sequencing technologies to perform pangenomic (n = 383) and panepigenomic (n = 268) analysis of clinical BFG isolates cultured from infections seen at the NIH Clinical Center over four decades. Our analysis reveals that single BFG species harbor hundreds of DNA methylation motifs, with most individual motif combinations occurring uniquely in single isolates, implying immense unsampled methylation diversity within BFG epigenomes. Mining of BFG genomes identified more than 6000 methyltransferase genes, approximately 1000 of which were associated with intact prophages. Network analysis revealed substantial gene flow among disparate phage genomes, implying a role for genetic exchange between BFG phages as one of the ultimate sources driving BFG epigenome diversity.
Spinal muscular atrophy (SMA) is the most common genetic disease in children. SMA is generally caused by mutations in the gene SMN1. The survival of motor neurons (SMN) complex consists of SMN1, Gemins ...(2-8), and Strap/Unrip. We previously demonstrated [...] and [...] inhibited tissue regeneration in zebrafish. Here we investigated each individual SMN complex member and identified [...] as another regeneration-essential gene. These three genes are likely pan-regenerative, since they affect the regeneration of hair cells, liver, and caudal fin. RNA-Seq analysis reveals that [...], [...], and [...] are linked to a common set of genetic pathways, including the tp53 and ErbB pathways. Additional studies indicated all three genes facilitate regeneration by inhibiting the ErbB pathway, thereby allowing cell proliferation in the injured neuromasts. This study provides a new understanding of the SMN complex and a potential etiology for SMA and potentially other rare unidentified genetic diseases with similar symptoms.