Advances in long-read sequencing (LRS) technologies continue to make whole-genome sequencing more complete, affordable, and accurate. LRS provides significant advantages over short-read sequencing approaches, including phased de novo genome assembly, access to previously excluded genomic regions, and discovery of more complex structural variants (SVs) associated with disease. Limitations remain with respect to cost, scalability, and platform-dependent read accuracy, and the tradeoffs between sequence coverage and sensitivity of variant discovery are important experimental considerations for the application of LRS. We compare the genetic variant-calling precision and recall of Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio) HiFi platforms over a range of sequence coverages. For read-based applications, LRS sensitivity begins to plateau around 12-fold coverage with a majority of variants called with reasonable accuracy (F1 score above 0.5), and both platforms perform well for SV detection. Genome assembly increases variant-calling precision and recall of SVs and indels in HiFi data sets, with HiFi outperforming ONT in quality as measured by the F1 score of assembly-based variant call sets. While both technologies continue to evolve, our work offers guidance to design cost-effective experimental strategies that do not compromise on discovering novel biology.
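As a point of reference, and assuming the F score reported above is the standard F1 score, it is the harmonic mean of precision and recall. The sketch below shows that calculation; the function name and the call counts are hypothetical and chosen only for illustration, not taken from the study.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the harmonic mean of precision and recall.

    tp, fp, fn are counts of true-positive, false-positive, and
    false-negative variant calls against a truth set.
    """
    if tp == 0:
        # No correct calls: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not from the study).
example = f1_score(tp=80, fp=20, fn=20)
```

With these illustrative counts, precision and recall are both 0.8, so the F1 score is also 0.8, above the 0.5 threshold the abstract describes as reasonable accuracy.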
The divergence of chimpanzee and bonobo provides one of the few examples of recent hominid speciation. Here we describe a fully annotated, high-quality bonobo genome assembly, which was constructed without guidance from reference genomes by applying a multiplatform genomics approach. We generate a bonobo genome assembly in which more than 98% of genes are completely annotated and 99% of the gaps are closed, including the resolution of about half of the segmental duplications and almost all of the full-length mobile elements. We compare the bonobo genome to those of other great apes and identify more than 5,569 fixed structural variants that specifically distinguish the bonobo and chimpanzee lineages. We focus on genes that have been lost, changed in structure or expanded in the last few million years of bonobo evolution. We produce a high-resolution map of incomplete lineage sorting and estimate that around 5.1% of the human genome is genetically closer to chimpanzee or bonobo and that more than 36.5% of the genome shows incomplete lineage sorting if we consider a deeper phylogeny including gorilla and orangutan. We also show that 26% of the segments of incomplete lineage sorting between human and chimpanzee or human and bonobo are non-randomly distributed and that genes within these clustered segments show significant excess of amino acid replacement compared to the rest of the genome.
Restoration in dryland ecosystems often has poor success due to low and variable water availability, degraded soil conditions, and slow plant community recovery rates. Restoration treatments can mitigate these constraints but, because treatments and subsequent monitoring are typically limited in space and time, our understanding of their applicability across broader environmental gradients remains limited. To address this limitation, we implemented and monitored a standardized set of seeding and soil surface treatments (pits, mulch, and ConMod artificial nurse plants) designed to enhance soil moisture and seedling establishment across RestoreNet, a growing network of 21 diverse dryland restoration sites in the southwestern USA over 3 years. Generally, we found that the timing of precipitation relative to seeding and the use of soil surface treatments were more important in determining seeded species emergence, survival, and growth than site‐specific characteristics. Using soil surface treatments in tandem with seeding promoted up to 3× greater seedling emergence densities compared with seeding alone. The positive effect of soil surface treatments became more prominent with increased cumulative precipitation since seeding. The seed mix type with species currently found within or near a site and adapted to the historical climate promoted greater seedling emergence densities compared with the seed mix type with species from warmer, drier conditions expected to perform well under climate change. Seed mix and soil surface treatments had a diminishing effect as plants developed beyond the first season of establishment. However, we found strong effects of the initial period seeded and of the precipitation leading up to each monitoring date on seedling survival over time, especially for annual and perennial forbs. The presence of exotic species exerted a negative influence on seedling survival and growth, but not initial emergence.
Our findings suggest that seeded species recruitment across drylands can generally be promoted, regardless of location, by (1) incorporation of soil surface treatments, (2) employment of near‐term seasonal climate forecasts, (3) suppression of exotic species, and (4) seeding at multiple times. Taken together, these results point to a multifaceted approach to ameliorate harsh environmental conditions for improved seeding success in drylands, both now and under expected aridification.
The ecosystems along the border between the United States and Mexico are at increasing risk of wildfire due to interactions among climate, land use, and fuel loads. A wide range of fuel treatments have been implemented to mitigate wildfire and its threats to valued resources, yet we have little information about treatment effectiveness. To fill critical knowledge gaps, we reviewed wildfire risk and fuel treatment studies that were conducted near the US-Mexico border and published in the peer-reviewed literature between 1986 and 2019. The number of studies has grown during this time in warm desert to forest ecosystems on primarily federal lands. The most common study topics included fire effects on native species, the role of invasive species and woody encroachment on wildfire risk, historical fire regimes, and remote sensing and modeling to study wildfire risk across the landscape. A majority of fuel treatment studies focused on prescribed burns, and fuel treatments collectively had mixed effects on mitigating future wildfire risk and threats to ecosystems depending on vegetation and fire characteristics. The diversity of ecosystems and land ownership along the US-Mexico border presents unique challenges for understanding and managing wildfire risk, and also creates opportunities for collaboration and cross-site studies to promote knowledge across broad environmental gradients.
Human centromeres have been traditionally very difficult to sequence and assemble owing to their repetitive nature and large size. As a result, patterns of human centromeric variation and models for their evolution and function remain incomplete, despite centromeres being among the most rapidly mutating regions. Here, using long-read sequencing, we completely sequenced and assembled all centromeres from a second human genome and compared it to the finished reference genome. We find that the two sets of centromeres show at least a 4.1-fold increase in single-nucleotide variation when compared with their unique flanks and vary up to 3-fold in size. Moreover, we find that 45.8% of centromeric sequence cannot be reliably aligned using standard methods owing to the emergence of new α-satellite higher-order repeats (HORs). DNA methylation and CENP-A chromatin immunoprecipitation experiments show that 26% of the centromeres differ in their kinetochore position by >500 kb. To understand evolutionary change, we selected six chromosomes and sequenced and assembled 31 orthologous centromeres from the common chimpanzee, orangutan and macaque genomes. Comparative analyses reveal a nearly complete turnover of α-satellite HORs, with characteristic idiosyncratic changes in α-satellite HORs for each species. Phylogenetic reconstruction of human haplotypes supports limited to no recombination between the short (p) and long (q) arms across centromeres and reveals that novel α-satellite HORs share a monophyletic origin, providing a strategy to estimate the rate of saltatory amplification and mutation of human centromeric DNA.
Variable number tandem repeats (VNTRs) are composed of consecutive repetitive DNA with hypervariable repeat count and composition. They include protein-coding sequences and are associated with clinical disorders. It has been difficult to incorporate VNTR analysis in disease studies that use short-read sequencing because the traditional approach of mapping to the human reference is less effective for repetitive and divergent sequences. In this work, we solve VNTR mapping for short reads with a repeat-pangenome graph (RPGG), a data structure that encodes both the population diversity and repeat structure of VNTR loci from multiple haplotype-resolved assemblies. We develop software to build an RPGG, and use the RPGG to estimate VNTR composition with short reads. We use this to discover VNTRs with length stratified by continental population, and expression quantitative trait loci, indicating that RPGG analysis of VNTRs will be critical for future studies of diversity and disease.
The prevalence of highly repetitive sequences within the human Y chromosome has prevented its complete assembly to date and led to its systematic omission from genomic analyses. Here we present de novo assemblies of 43 Y chromosomes spanning 182,900 years of human evolution and report considerable diversity in size and structure. Half of the male-specific euchromatic region is subject to large inversions with a greater than twofold higher recurrence rate compared with all other chromosomes. Ampliconic sequences associated with these inversions show differing mutation rates that are sequence context dependent, and some ampliconic genes exhibit evidence for concerted evolution with the acquisition and purging of lineage-specific pseudogenes. The largest heterochromatic region in the human genome, Yq12, is composed of alternating repeat arrays that show extensive variation in the number, size and distribution, but retain a 1:1 copy-number ratio. Finally, our data suggest that the boundary between the recombining pseudoautosomal region 1 and the non-recombining portions of the X and Y chromosomes lies 500 kb away from the currently established boundary. The availability of fully sequence-resolved Y chromosomes from multiple individuals provides a unique opportunity for identifying new associations of traits with specific Y-chromosomal variants and garnering insights into the evolution and function of complex regions of the human genome.
Direct observation of transcription factor action in the living cell nucleus can provide important insights into gene regulatory mechanisms. Live-cell imaging techniques have enabled the visualization of a variety of intranuclear activities, from chromosome dynamics to gene expression. However, progress in studying transcription regulation of specific native genes has been limited, primarily as a result of difficulties in resolving individual gene loci and in detecting the small number of protein molecules functioning within active transcription units. Here we report that multiphoton microscopy imaging of polytene nuclei in living Drosophila salivary glands allows real-time analysis of transcription factor recruitment and exchange on specific native genes. After heat shock, we have visualized the recruitment of RNA polymerase II (Pol II) to native hsp70 gene loci 87A and 87C in real time. We show that heat shock factor (HSF), the transcription activator of hsp70, is localized to the nucleus before heat shock and translocates from nucleoplasm to chromosomal loci after heat shock. Assays based on fluorescence recovery after photobleaching show a rapid exchange of HSF at chromosomal loci under non-heat-shock conditions but a very slow exchange after heat shock. However, this is not a consequence of a change of HSF diffusibility, as shown here directly by fluorescence correlation spectroscopy. Our results provide strong evidence that activated HSF is stably bound to DNA in vivo and that turnover or disassembly of transcription activator is not required for rounds of hsp70 transcription. This and previous studies indicate that transcription activators display diverse dynamic behaviours in their associations with targeted loci in living cells. Our method can be applied to study the dynamics of many factors involved in transcription and RNA processing, and in their regulation at native heat shock genes in vivo.
Human centromeres are mainly composed of alpha satellite DNA hierarchically organized as higher-order repeats (HORs). Alpha satellite dynamics is shown by sequence homogenization in centromeric arrays and by its transfer to other centromeric locations, for example, during the maturation of new centromeres. We identified during prenatal aneuploidy diagnosis by fluorescent in situ hybridization a de novo insertion of alpha satellite DNA from the centromere of chromosome 18 (D18Z1) into cytoband 15q26. Although bound by CENP-B, this locus did not acquire centromeric functionality as demonstrated by the lack of constriction and the absence of CENP-A binding. The insertion was associated with a 2.8-kbp deletion and likely occurred in the paternal germline. The site was enriched in long terminal repeats and located ∼10 Mbp from the location where a centromere was ancestrally seeded and became inactive in the common ancestor of humans and apes 20–25 million years ago. Long-read mapping to the T2T-CHM13 human genome assembly revealed that the insertion derives from a specific region of chromosome 18 centromeric 12-mer HOR array in which the monomer size follows a regular pattern. The rearrangement did not directly disrupt any gene or predicted regulatory element and did not alter the methylation status of the surrounding region, consistent with the absence of phenotypic consequences in the carrier. This case demonstrates a likely rare but new class of structural variation that we name “alpha satellite insertion.” It also expands our knowledge on alphoid DNA dynamics and conveys the possibility that alphoid arrays can relocate near vestigial centromeric sites.
We sequenced and assembled using multiple long-read sequencing technologies the genomes of chimpanzee, bonobo, gorilla, orangutan, gibbon, macaque, owl monkey, and marmoset. We identified 1,338,997 lineage-specific fixed structural variants (SVs) disrupting 1,561 protein-coding genes and 136,932 regulatory elements, including the most complete set of human-specific fixed differences. We estimate that 819.47 Mbp or ∼27% of the genome has been affected by SVs across primate evolution. We identify 1,607 structurally divergent regions wherein recurrent structural variation contributes to creating SV hotspots where genes are recurrently lost (e.g., CARD, C4, and OLAH gene families) and additional lineage-specific genes are generated (e.g., CKAP2, VPS36, ACBD7, and NEK5 paralogs), becoming targets of rapid chromosomal diversification and positive selection (e.g., RGPD gene family). High-fidelity long-read sequencing has made these dynamic regions of the genome accessible for sequence-level analyses within and between primate species.
• Long-read sequence assembly of eight primate genomes
• Atlas of lineage-specific and recurrent structural variation
• Structurally divergent regions (SDRs) associate with lineage-specific genes
• Recurrent duplications diversify primate genes and predispose to human disease
Analysis of high-quality, haplotype-resolved primate genomes provides a more complete understanding of lineage-specific, recurrent mutations and structurally divergent regions associated with primate adaptive evolution and human diseases.