Genetic changes causing brain size expansion in human evolution have remained elusive. Notch signaling is essential for radial glia stem cell proliferation and is a determinant of neuronal number in ...the mammalian cortex. We find that three paralogs of human-specific NOTCH2NL are highly expressed in radial glia. Functional analysis reveals that different alleles of NOTCH2NL have varying potencies to enhance Notch signaling by interacting directly with NOTCH receptors. Consistent with a role in Notch signaling, NOTCH2NL ectopic expression delays differentiation of neuronal progenitors, while deletion accelerates differentiation into cortical neurons. Furthermore, NOTCH2NL genes provide the breakpoints in 1q21.1 distal deletion/duplication syndrome, where duplications are associated with macrocephaly and autism and deletions with microcephaly and schizophrenia. Thus, the emergence of human-specific NOTCH2NL genes may have contributed to the rapid evolution of the larger human neocortex, accompanied by loss of genomic stability at the 1q21.1 locus and resulting recurrent neurodevelopmental disorders.
Display omitted
•NOTCH2NLA,B,C expressed in human fetal brain radial glia stem cells, arose 0.5–4 mya•These genes encode secreted, Notch-related proteins that enhance Notch signaling•Overexpression of NOTCH2NL delays neuronal differentiation, while deletion accelerates it•NOTCH2NLA and NOTCH2NLB serve as breakpoints in 1q21.1 deletion/duplication syndrome
Human-specific Notch paralogs are expressed in radial glia, enhance Notch signaling, and impact neuronal differentiation.
Large-scale population analyses coupled with advances in technology have demonstrated that the human genome is more diverse than originally thought. To date, this diversity has largely been uncovered ...using short-read whole-genome sequencing. However, these short-read approaches fail to give a complete picture of a genome. They struggle to identify structural events, cannot access repetitive regions, and fail to resolve the human genome into haplotypes. Here, we describe an approach that retains long range information while maintaining the advantages of short reads. Starting from ∼1 ng of high molecular weight DNA, we produce barcoded short-read libraries. Novel informatic approaches allow for the barcoded short reads to be associated with their original long molecules producing a novel data type known as "Linked-Reads". This approach allows for simultaneous detection of small and large variants from a single library. In this manuscript, we show the advantages of Linked-Reads over standard short-read approaches for reference-based analysis. Linked-Reads allow mapping to 38 Mb of sequence not accessible to short reads, adding sequence in 423 difficult-to-sequence genes including disease-relevant genes
,
, and
Both Linked-Read whole-genome and whole-exome sequencing identify complex structural variations, including balanced events and single exon deletions and duplications. Further, Linked-Reads extend the region of high-confidence calls by 68.9 Mb. The data presented here show that Linked-Reads provide a scalable approach for comprehensive genome analysis that is not possible using short reads alone.
Abstract
Summary
Bulk RNA sequencing studies have demonstrated that human leukocyte antigen (HLA) genes may be expressed in a cell type-specific and allele-specific fashion. Single-cell gene ...expression assays have the potential to further resolve these expression patterns, but currently available methods do not perform allele-specific quantification at the molecule level. Here, we present scHLAcount, a post-processing workflow for single-cell RNA-seq data that computes allele-specific molecule counts of the HLA genes based on a personalized reference constructed from the sample’s HLA genotypes.
Availability and implementation
scHLAcount is available under the MIT license at https://github.com/10XGenomics/scHLAcount.
Supplementary information
Supplementary data are available at Bioinformatics online.
The cerebral cortex has expanded in size and complexity in primates, yet the molecular innovations that enabled primate-specific brain attributes remain obscure. We generated cerebral cortex ...organoids from human, chimpanzee, orangutan, and rhesus pluripotent stem cells and sequenced their transcriptomes at weekly time points for comparative analysis. We used transcript structure and expression conservation to discover gene regulatory long non-coding RNAs (lncRNAs). Of 2,975 human, multi-exonic lncRNAs, 2,472 were structurally conserved in at least one other species and 920 were conserved in all. Three hundred eighty-six human lncRNAs were transiently expressed (TrEx) and many were also TrEx in great apes (46%) and rhesus (31%). Many TrEx lncRNAs are expressed in specific cell types by single-cell RNA sequencing. Four TrEx lncRNAs selected based on cell-type specificity, gene structure, and expression pattern conservation were ectopically expressed in HEK293 cells by CRISPRa. All induced trans gene expression changes were consistent with neural gene regulatory activity.
•New orangutan and chimpanzee iPSC lines enable pan-primate gene expression analysis•Transiently expressed (TrEx) lncRNAs identified in primate cerebral cortex organoids•Conserved TrEx pattern correlates with cell-type specificity by single-cell RNA-seq•Four primate-conserved TrEx lncRNAs influence neural genes when ectopically expressed
In this article, Salama and colleagues identified transiently expressed (TrEx) lncRNAs from human, chimpanzee, orangutan, and rhesus pluripotent stem cell-derived cerebral cortex organoids and assessed their structural and expression conservation. Transient expression correlated with cell-type specificity as measured by single-cell RNA-seq. Ectopic expression of TrEx lncRNAs by CRISPRa modulated expression of neural genes in trans, suggesting regulatory function.
The divergence of chimpanzee and bonobo provides one of the few examples of recent hominid speciation
. Here we describe a fully annotated, high-quality bonobo genome assembly, which was constructed ...without guidance from reference genomes by applying a multiplatform genomics approach. We generate a bonobo genome assembly in which more than 98% of genes are completely annotated and 99% of the gaps are closed, including the resolution of about half of the segmental duplications and almost all of the full-length mobile elements. We compare the bonobo genome to those of other great apes
and identify more than 5,569 fixed structural variants that specifically distinguish the bonobo and chimpanzee lineages. We focus on genes that have been lost, changed in structure or expanded in the last few million years of bonobo evolution. We produce a high-resolution map of incomplete lineage sorting and estimate that around 5.1% of the human genome is genetically closer to chimpanzee or bonobo and that more than 36.5% of the genome shows incomplete lineage sorting if we consider a deeper phylogeny including gorilla and orangutan. We also show that 26% of the segments of incomplete lineage sorting between human and chimpanzee or human and bonobo are non-randomly distributed and that genes within these clustered segments show significant excess of amino acid replacement compared to the rest of the genome.
Genome in a Bottle benchmarks are widely used to help validate clinical sequencing pipelines and develop variant calling and sequencing methods. Here we use accurate linked and long reads to expand ...benchmarks in 7 samples to include difficult-to-map regions and segmental duplications that are challenging for short reads. These benchmarks add more than 300,000 SNVs and 50,000 insertions or deletions (indels) and include 16% more exonic variants, many in challenging, clinically relevant genes not covered previously, such as PMS2. For HG002, we include 92% of the autosomal GRCh38 assembly while excluding regions problematic for benchmarking small variants, such as copy number variants, that should not have been in the previous version, which included 85% of GRCh38. It identifies eight times more false negatives in a short read variant call set relative to our previous benchmark. We demonstrate that this benchmark reliably identifies false positives and false negatives across technologies, enabling ongoing methods development.
Display omitted
•The Genome in a Bottle Consortium presents an expanded benchmark for 7 genomes•Long and linked reads expanded the benchmark to regions challenging for short reads•The expanded regions include challenging medically relevant genes like PMS2•This enables development of new technologies and bioinformatics methods
Analogous to placing puzzle pieces that look similar, mapping sequences to regions of the genome that look similar is challenging. Wagner et al. describe a new Genome in a Bottle Consortium resource for benchmarking accuracy of human genome sequencing in more challenging regions of the genome.
Abstract
Reptiles exhibit a variety of modes of sex determination, including both temperature-dependent and genetic mechanisms. Among those species with genetic sex determination, sex chromosomes of ...varying heterogamety (XX/XY and ZZ/ZW) have been observed with different degrees of differentiation. Karyotype studies have demonstrated that Gila monsters (Heloderma suspectum) have ZZ/ZW sex determination and this system is likely homologous to the ZZ/ZW system in the Komodo dragon (Varanus komodoensis), but little else is known about their sex chromosomes. Here, we report the assembly and analysis of the Gila monster genome. We generated a de novo draft genome assembly for a male using 10X Genomics technology. We further generated and analyzed short-read whole genome sequencing and whole transcriptome sequencing data for three males and three females. By comparing female and male genomic data, we identified four putative Z chromosome scaffolds. These putative Z chromosome scaffolds are homologous to Z-linked scaffolds identified in the Komodo dragon. Further, by analyzing RNAseq data, we observed evidence of incomplete dosage compensation between the Gila monster Z chromosome and autosomes and a lack of balance in Z-linked expression between the sexes. In particular, we observe lower expression of the Z in females (ZW) than males (ZZ) on a global basis, though we find evidence suggesting local gene-by-gene compensation. This pattern has been observed in most other ZZ/ZW systems studied to date and may represent a general pattern for female heterogamety in vertebrates.
Graphical Abstract
Recent advances in genomic sequencing technology and computational assembly methods have allowed scientists to improve reference genome assemblies in terms of contiguity and composition. EquCab2, a ...reference genome for the domestic horse, was released in 2007. Although of equal or better quality compared to other first-generation Sanger assemblies, it had many of the shortcomings common to them. In 2014, the equine genomics research community began a project to improve the reference sequence for the horse, building upon the solid foundation of EquCab2 and incorporating new short-read data, long-read data, and proximity ligation data. Here, we present EquCab3. The count of non-N bases in the incorporated chromosomes is improved from 2.33 Gb in EquCab2 to 2.41 Gb in EquCab3. Contiguity has also been improved nearly 40-fold with a contig N50 of 4.5 Mb and scaffold contiguity enhanced to where all but one of the 32 chromosomes is comprised of a single scaffold.
We performed shallow single-cell sequencing of genomic DNA across 1475 cells from a cell-line, COLO829, to resolve overall complexity and clonality. This melanoma tumor-line has been previously ...characterized by multiple technologies and is a benchmark for evaluating somatic alterations. In some of these studies, COLO829 has shown conflicting and/or indeterminate copy number and, thus, single-cell sequencing provides a tool for gaining insight. Following shallow single-cell sequencing, we first identified at least four major sub-clones by discriminant analysis of principal components of single-cell copy number data. Based on clustering, break-point and loss of heterozygosity analysis of aggregated data from sub-clones, we identified distinct hallmark events that were validated within bulk sequencing and spectral karyotyping. In summary, COLO829 exhibits a classical Dutrillaux's monosomic/trisomic pattern of karyotype evolution with endoreduplication, where consistent sub-clones emerge from the loss/gain of abnormal chromosomes. Overall, our results demonstrate how shallow copy number profiling can uncover hidden biological insights.
The Zoonomia Project is investigating the genomics of shared and specialized traits in eutherian mammals. Here we provide genome assemblies for 131 species, of which all but 9 are previously ...uncharacterized, and describe a whole-genome alignment of 240 species of considerable phylogenetic diversity, comprising representatives from more than 80% of mammalian families. We find that regions of reduced genetic diversity are more abundant in species at a high risk of extinction, discern signals of evolutionary selection at high resolution and provide insights from individual reference genomes. By prioritizing phylogenetic diversity and making data available quickly and without restriction, the Zoonomia Project aims to support biological discovery, medical research and the conservation of biodiversity.