Current clinical approaches for mutation discovery are based on short sequence reads (100-300 bp) of exons and flanking splice sites targeted by multigene panels or whole exomes. Short-read sequencing is highly accurate for detection of single nucleotide variants, small indels and simple copy number differences but is of limited use for identifying complex insertions and deletions and other structural rearrangements. We used CRISPR-Cas9 to excise complete
and
genomic regions from lymphoblast cells of patients with breast cancer, then sequenced these regions with long reads (>10 000 bp) to fully characterise all non-coding regions for structural variation. In a family severely affected with early-onset bilateral breast cancer and with negative (normal) results by gene panel and exome sequencing, we identified an intronic SINE-VNTR-Alu retrotransposon insertion that led to the creation of a pseudoexon in the
message and introduced a premature truncation. This combination of CRISPR-Cas9 excision and long-read sequencing reveals a class of complex, damaging and otherwise cryptic mutations that may be particularly frequent in tumour suppressor genes replete with intronic repeats.
Genetic studies of human evolution require high-quality contiguous ape genome assemblies that are not guided by the human reference. We coupled long-read sequence assembly and full-length complementary DNA sequencing with a multiplatform scaffolding approach to produce ab initio chimpanzee and orangutan genome assemblies. By comparing these with two long-read de novo human genome assemblies and a gorilla genome assembly, we characterized lineage-specific and shared great ape genetic variation ranging from single- to mega-base pair-sized variants. We identified ~17,000 fixed human-specific structural variants identifying genic and putative regulatory changes that have emerged in humans since divergence from nonhuman apes. Interestingly, these variants are enriched near genes that are down-regulated in human compared to chimpanzee cerebral organoids, particularly in cells analogous to radial glial neural progenitors.
X-linked Dystonia-Parkinsonism (XDP) is a Mendelian neurodegenerative disease that is endemic to the Philippines and is associated with a founder haplotype. We integrated multiple genome and transcriptome assembly technologies to narrow the causal mutation to the TAF1 locus, which included a SINE-VNTR-Alu (SVA) retrotransposition into intron 32 of the gene. Transcriptome analyses identified decreased expression of the canonical cTAF1 transcript among XDP probands, and de novo assembly across multiple pluripotent stem-cell-derived neuronal lineages discovered aberrant TAF1 transcription that involved alternative splicing and intron retention (IR) in proximity to the SVA that was anti-correlated with overall TAF1 expression. CRISPR/Cas9 excision of the SVA rescued this XDP-specific transcriptional signature and normalized TAF1 expression in probands. These data suggest an SVA-mediated aberrant transcriptional mechanism associated with XDP and may provide a roadmap for layered technologies and integrated assembly-based analyses for other unsolved Mendelian disorders.
• Genome assembly narrows the causal XDP locus to TAF1
• An XDP-specific SVA insertion causes intron retention and reduced expression of TAF1
• CRISPR/Cas9 excision of SVA rescues aberrant splicing and TAF1 expression in XDP
• Expression profiling implicates neurodevelopment and dystonia pathways in XDP
A Mendelian form of dystonia arises from altered splicing and intron retention within a general transcription factor.
Drylands are Earth's largest terrestrial biome and support one‐third of the global population. However, they are also highly vulnerable to land degradation. Despite widespread demand for dryland restoration and rehabilitation, little information is available to help land managers effectively re‐establish native perennial vegetation across drylands.
RestoreNet is an emerging dryland restoration network that systematically tests revegetation techniques across environmental gradients. Using the RestoreNet experimental framework, we tested the effectiveness of restoration treatments (i.e. ConMod nurse plant structures, mulch, pits) that increase soil moisture and seed mixes with different climatic niches to achieve revegetation goals.
Across sites, seedling recruitment was consistently influenced by treatment and seed mix type. Pit and mulch treatments increased total seedling density, with pits promoting the highest seeded species recruitment while limiting non‐native species establishment. Seeding increased total seedling density regardless of seed mix type, but cooler‐adapted seed mixes promoted greater seeded species density and resulted in lower density of unseeded (non‐native) species relative to warmer‐adapted mixes.
Seedling recruitment was also controlled by the temporal and environmental context of restoration with the positive effect of high precipitation greatest in the weeks immediately following seeding. Above‐average precipitation during the study period across most of the sites may partially explain why the highest seeded species recruitment occurred in pit treatments and seed mixes with cooler, wetter niche requirements.
Synthesis and applications. Results from the dryland restoration network, RestoreNet, help to better understand variation in seeding and restoration treatment success across space and time in drylands. Relationships between restoration practices and environmental conditions in our study suggest the importance of anticipatory restoration strategies that forecast seasonal and sub‐seasonal weather conditions and select plant species with climate niche requirements appropriate for current and future climate conditions. This information is critical to land managers tasked with improving ecosystem conditions across degraded dryland regions.
In many repeat diseases, such as Huntington's disease (HD), ongoing repeat expansions in affected tissues contribute to disease onset, progression and severity. Inducing contractions of expanded repeats by exogenous agents is not yet possible. Traditional approaches would target proteins driving repeat mutations. Here we report a compound, naphthyridine-azaquinolone (NA), that specifically binds slipped-CAG DNA intermediates of expansion mutations, a previously unsuspected target. NA efficiently induces repeat contractions in HD patient cells as well as en masse contractions in medium spiny neurons of HD mouse striatum. Contractions are specific for the expanded allele, independent of DNA replication, require transcription across the coding CTG strand and arise by blocking repair of CAG slip-outs. NA-induced contractions depend on active expansions driven by MutSβ. NA injections in HD mouse striatum reduce mutant HTT protein aggregates, a biomarker of HD pathogenesis and severity. Repeat-structure-specific DNA ligands are a novel avenue to contract expanded repeats.
Drylands are highly vulnerable to land degradation, and despite increasing efforts, restoration success remains low. Although often ignored in the design and deployment of management strategies, soil microbial communities might be critical for dryland restoration due to their central role in promoting soil stability, nutrient cycling and plant establishment.
We collected soil samples from eight dryland restoration sites within RestoreNet, a restoration field trial network, and determined their soil microbiome using 16S rRNA (bacteria and archaea) and ITS (fungi) amplicon sequencing. Each previously degraded site was treated with monoculture (single species) and polyculture (multiple species) seedling plantings.
Contrary to our initial expectations, we found that these different revegetation interventions did not trigger changes in microbial diversity, composition or relative abundance of functional groups across sites after 1 year of revegetation.
Synthesis and applications. Considering the crucial role of soil micro‐organisms in dryland ecosystem functions, our results suggest that site‐specific targeted microbiome restoration should be considered to accelerate the establishment of desired microbial communities. Plant community‐based restoration practices such as revegetation have a limited impact on soil micro‐organisms in the short term.
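Does restoring plant diversity in dryland ecosystems trigger a concomitant recovery of the soil microbiome?
Dryland ecosystems are highly prone to soil erosion, and their ecological restoration is costly and difficult. Soil micro-organisms have often been overlooked in the planning and implementation of restoration, yet their central role in soil stability, nutrient cycling and plant establishment suggests that they may be critical to dryland ecosystem restoration.
We collected soil samples from eight sites undergoing restoration within RestoreNet, a dryland restoration field trial network led by the U.S. Geological Survey (USGS), then extracted, amplified and sequenced 16S ribosomal RNA (rRNA) from bacteria and archaea and the internal transcribed spacer (ITS) region from fungi to compare soil microbial communities between monoculture and polyculture plots.
Contrary to our initial predictions, after 1 year of restoration there were no significant differences in microbial diversity, community composition or the relative abundance of microbial functional groups between monoculture and polyculture plots.
Synthesis and applications. Soil micro-organisms play an extremely important role in ecosystem functioning, and our results indicate that restoration of soil microbial communities should be targeted and site-specific. Conventional plant community-based restoration measures such as revegetation have a very limited effect on soil micro-organisms in the short term.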
Abstract
TRP channel-associated factor 1/2 (TCAF1/TCAF2) proteins antagonistically regulate the cold-sensor protein TRPM8 in multiple human tissues. Understanding their significance has been complicated given the locus spans a gap-ridden region with complex segmental duplications in GRCh38. Using long-read sequencing, we sequence-resolve the locus, annotate full-length TCAF models in primate genomes, and show substantial human-specific TCAF copy number variation. We identify two human super haplogroups, H4 and H5, and establish that TCAF duplications originated ~1.7 million years ago but diversified only in Homo sapiens by recurrent structural mutations. Conversely, in all archaic-hominin samples the fixation for a specific H4 haplotype without duplication is likely due to positive selection. Here, our results of TCAF copy number expansion, selection signals in hominins, and differential TCAF2 expression between haplogroups and high TCAF2 and TRPM8 expression in liver and prostate in modern-day humans imply TCAF diversification among hominins potentially in response to cold or dietary adaptations.
The complex interspersed pattern of segmental duplications in humans is responsible for rearrangements associated with neurodevelopmental disease, including the emergence of novel genes important in human brain evolution. We investigate the evolution of LCR16a, a putative driver of this phenomenon that encodes one of the most rapidly evolving human-ape gene families, nuclear pore interacting protein (NPIP).
Comparative analysis shows that LCR16a has independently expanded in five primate lineages over the last 35 million years of primate evolution. The expansions are associated with independent lineage-specific segmental duplications flanking LCR16a leading to the emergence of large interspersed duplication blocks at non-orthologous chromosomal locations in each primate lineage. The intron-exon structure of the NPIP gene family has changed dramatically throughout primate evolution with different branches showing characteristic gene models yet maintaining an open reading frame. In the African ape lineage, we detect signatures of positive selection that occurred after a transition to more ubiquitous expression among great ape tissues when compared to Old World and New World monkeys. Mouse transgenic experiments from baboon and human genomic loci confirm these expression differences and suggest that the broader ape expression pattern arose due to mutational changes that emerged in cis.
LCR16a promotes serial interspersed duplications and creates hotspots of genomic instability that appear to be an ancient property of primate genomes. Dramatic changes to NPIP gene structure and altered tissue expression preceded major bouts of positive selection in the African ape lineage, suggestive of a gene undergoing strong adaptive evolution.
Structural variation and single-nucleotide variation of the complement factor H (CFH) gene family underlie several complex genetic diseases, including age-related macular degeneration (AMD) and atypical hemolytic uremic syndrome (AHUS). To understand its diversity and evolution, we performed high-quality sequencing of this ∼360-kbp locus in six primate lineages, including multiple human haplotypes. Comparative sequence analyses reveal two distinct periods of gene duplication leading to the emergence of four CFH-related (CFHR) gene paralogs (CFHR2 and CFHR4 ∼25–35 Mya and CFHR1 and CFHR3 ∼7–13 Mya). Remarkably, all evolutionary breakpoints share a common ∼4.8-kbp segment corresponding to an ancestral CFHR gene promoter that has expanded independently throughout primate evolution. This segment is recurrently reused and juxtaposed with a donor duplication containing exons 8 and 9 from ancestral CFH, creating four CFHR fusion genes that include lineage-specific members of the gene family. Combined analysis of >5,000 AMD cases and controls identifies a significant burden of a rare missense mutation that clusters at the N terminus of CFH (P = 5.81 × 10⁻⁸; odds ratio (OR) = 9.8 [3.67–infinity]). A bipolar clustering pattern of rare nonsynonymous mutations in patients with AMD (P < 10⁻³) and AHUS (P = 0.0079) maps to functional domains that show evidence of positive selection during primate evolution. Our structural variation analysis in >2,400 individuals reveals five recurrent rearrangement breakpoints that show variable frequency among AMD cases and controls. These data suggest a dynamic and recurrent pattern of mutation critical to the emergence of new CFHR genes but also in the predisposition to complex human genetic disease phenotypes.
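The burden statistic reported above (a P value with an odds ratio) can be illustrated with a small sketch. The abstract does not state which test or which carrier counts were used, so the one-sided Fisher's exact test below is an assumption, and every count is hypothetical rather than taken from the study.

```python
from math import comb


def odds_ratio(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """Sample odds ratio of a 2x2 carrier table; infinite if a denominator cell is zero."""
    if case_noncarriers == 0 or control_carriers == 0:
        return float("inf")
    return (case_carriers * control_noncarriers) / (case_noncarriers * control_carriers)


def fisher_one_sided(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """One-sided Fisher's exact P value: probability of seeing at least this many
    carriers among cases, given fixed row and column totals (hypergeometric tail)."""
    n_cases = case_carriers + case_noncarriers
    n_controls = control_carriers + control_noncarriers
    n_carriers = case_carriers + control_carriers
    denominator = comb(n_cases + n_controls, n_carriers)
    upper = min(n_cases, n_carriers)
    tail = sum(
        comb(n_cases, k) * comb(n_controls, n_carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / denominator


# Hypothetical counts, NOT from the study: 10 carriers in 2,500 cases, 1 in 2,500 controls.
table = (10, 2490, 1, 2499)
print(f"OR = {odds_ratio(*table):.2f}, one-sided P = {fisher_one_sided(*table):.3g}")
```

Exact integer arithmetic with `math.comb` keeps the sketch dependency-free; a real analysis would more likely rely on an established statistics package and report a confidence interval alongside the point estimate.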
Studies of de novo mutation (DNM) have typically excluded some of the most repetitive and complex regions of the genome because these regions cannot be unambiguously mapped with short-read sequencing data. To better understand the genome-wide pattern of DNM, we generated long-read sequence data from an autism parent-child quad with an affected female where no pathogenic variant had been discovered in short-read Illumina sequence data. We deeply sequenced all four individuals by using three sequencing platforms (Illumina, Oxford Nanopore, and Pacific Biosciences) and three complementary technologies (Strand-seq, optical mapping, and 10X Genomics). Using long-read sequencing, we initially discovered and validated 171 DNMs across two children—a 20% increase in the number of de novo single-nucleotide variants (SNVs) and indels when compared to short-read callsets. The number of DNMs further increased by 5% when considering a more complete human reference (T2T-CHM13) because of the recovery of events in regions absent from GRCh38 (e.g., three DNMs in heterochromatic satellites). In total, we validated 195 de novo germline mutations and 23 potential post-zygotic mosaic mutations across both children; the overall true substitution rate based on this integrated callset is at least 1.41 × 10⁻⁸ substitutions per nucleotide per generation. We also identified six de novo insertions and deletions in tandem repeats, two of which represent structural variants. We demonstrate that long-read sequencing and assembly, especially when combined with a more complete reference genome, increases the number of DNMs by >25% compared to previous studies, providing a more complete catalog of DNM compared to short-read data alone.
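A per-generation substitution rate of the kind quoted above is commonly estimated as the number of de novo substitutions divided by the number of haploid bases surveyed. The sketch below shows that arithmetic only; the function name, the substitution count and the callable genome size are all hypothetical, and the abstract does not say how the reported figure was actually derived.

```python
def per_generation_rate(n_de_novo_snvs: int, callable_bp: int, n_children: int) -> float:
    """De novo substitutions per nucleotide per generation.

    Each child inherits two haploid genomes (one from each parent), so the
    number of haploid bases surveyed is 2 * callable_bp * n_children.
    """
    if callable_bp <= 0 or n_children <= 0:
        raise ValueError("callable_bp and n_children must be positive")
    return n_de_novo_snvs / (2 * callable_bp * n_children)


# Hypothetical inputs, NOT from the study: 150 substitutions across two children,
# with 2.7 Gbp of the haploid genome considered callable.
rate = per_generation_rate(n_de_novo_snvs=150, callable_bp=2_700_000_000, n_children=2)
print(f"{rate:.2e} substitutions per nucleotide per generation")
```

Because the denominator depends on how much of the genome is callable, recovering events in previously inaccessible regions changes both the numerator and the denominator of such an estimate.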