Human-specific duplications at chromosome 16p11.2 mediate recurrent pathogenic 600 kbp BP4–BP5 copy-number variations, which are among the most common genetic causes of autism. These copy-number polymorphic duplications are under positive selection and include three to eight copies of BOLA2, a gene involved in the maturation of cytosolic iron-sulfur proteins. To investigate the potential advantage provided by the rapid expansion of BOLA2, we assessed hematological traits and anemia prevalence in 379,385 controls and individuals who have lost or gained copies of BOLA2: 89 chromosome 16p11.2 BP4–BP5 deletion carriers and 56 reciprocal duplication carriers in the UK Biobank. We found that the 16p11.2 deletion is associated with anemia (18/89 carriers, 20%, p = 4e−7, OR = 5), particularly iron-deficiency anemia. We observed similar enrichments in two clinical 16p11.2 deletion cohorts, which included 6/63 (10%) and 7/20 (35%) unrelated individuals with anemia, microcytosis, low serum iron, or low blood hemoglobin. Upon stratification by BOLA2 copy number, our data showed an association between low BOLA2 dosage and the above phenotypes (8/15 individuals with three copies, 53%, p = 1e−4). In parallel, we analyzed hematological traits in mice carrying the 16p11.2 orthologous deletion or duplication, as well as Bola2+/− and Bola2−/− animals. The Bola2-deficient mice and the mice carrying the deletion showed early evidence of iron deficiency, including a mild decrease in hemoglobin, lower plasma iron, microcytosis, and an increased red blood cell zinc-protoporphyrin-to-heme ratio. Our results indicate that BOLA2 participates in iron homeostasis in vivo, and its expansion has a potential adaptive role in protecting against iron deficiency.
Recurrent rearrangements of Chromosome 8p23.1 are associated with congenital heart defects and developmental delay. The complexity of this region has led to inconsistencies in the current reference assembly, confounding studies of genetic variation. Using comparative sequence-based approaches, we generated a high-quality 6.3-Mbp alternate reference assembly of an inverted Chromosome 8p23.1 haplotype. Comparison with nonhuman primates reveals a 746-kbp duplicative transposition and two separate inversion events that arose in the last million years of human evolution. The breakpoints associated with these rearrangements map to an ape-specific interchromosomal core duplicon that clusters at sites of evolutionary inversion (P = 7.8 × 10^[exponent lost in extraction]). Refinement of microdeletion breakpoints identifies a subgroup of patients that map to the same interchromosomal core involved in the evolutionary formation of the duplication blocks. Our results define a higher-order genomic instability element that has shaped the structure of specific chromosomes during primate evolution, contributing to rearrangements associated with inversion and disease.
Background
Wildland fire in arid and semi-arid (dryland) regions can intensify when climatic, biophysical, and land-use factors increase fuel load and continuity. To inform wildland fire management under these conditions, we developed high-resolution (10-m) estimates of fine fuel across the Altar Valley in southern Arizona, USA, which spans dryland, grass-dominated ecosystems that are administered by multiple land managers and owners. We coupled field measurements at the end of the 2021 growing season with Sentinel-2 satellite imagery and vegetation indices acquired during and after the growing season to develop predictions of fine fuel across the entire valley. We then assessed how climate, soil, vegetation, and land-use factors influenced the amount and distribution of fine fuels. We connected fine fuels to fire management points, past ignition history, and socio-economic vulnerability to evaluate wildfire exposure and assessed how fuel related to habitat of the endangered masked bobwhite quail (Colinus virginianus ridgwayi).
Results
The high amount of fine fuel (400–3600 kg/ha; mean = 1392 kg/ha) predicted by our remote sensing model (R² = 0.63) for 2021 compared to previous years in the valley was stimulated by near-record high growing season precipitation that was 177% of the 1990–2020 mean. Fine fuel increased across the valley if it was contained within the wildlife refuge boundary and had lower temperature and vapor pressure deficit, higher soil organic content, and abundant annual plants and an invasive perennial grass (R² = 0.24). The index of potential exposure to wildfire showed a clustering of high exposure centered around roads and low-density housing development distant from fire management points and extending into the upper elevations flanking the valley. Within the Buenos Aires National Wildlife Refuge, fine fuel increased with habitat suitability for the masked bobwhite quail within and adjacent to core habitat areas, representing a natural resource value at risk, accompanied by higher overall mean fine fuel (1672 kg/ha) in relation to 2015 (1347 kg/ha) and 2020 (1363 kg/ha) means.
Conclusions
By connecting high-resolution estimates of fine fuel to climatic, biophysical and land-use factors, wildfire exposure, and a natural resource value at risk, we provide a pro-active and adaptive framework for fire risk management within highly variable and rapidly changing dryland landscapes.
Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci. We identified 107,590 structural variants (SVs), of which 68% were not discovered with short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence). We characterized 130 of the most active mobile element source elements and found that 63% of all SVs arise through homology-mediated mechanisms. This resource enables reliable graph-based genotyping from short reads of up to 50,340 SVs, resulting in the identification of 1526 expression quantitative trait loci as well as SV candidates for adaptive selection within the human population.
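The contiguity figure quoted above (the minimum contig length needed to cover 50% of the sequence) is commonly reported as N50. The following is a minimal sketch of how such a value can be computed, assuming it is measured against the total assembled length; measured against an estimated genome size it would instead be the related NG50. The function name `n50` and the toy contig lengths are hypothetical and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L of the contig at which, going from the longest
    contig downwards, the running total first reaches at least half of
    the summed length of all contigs.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in Mbp), not data from the study.
toy_contigs = [40, 30, 26, 10, 5, 3]
toy_n50 = n50(toy_contigs)
print(toy_n50)
```

For the toy list the total is 114, so the running sum passes the halfway mark at the second-longest contig. A per-haplotype average, as reported in the abstract, would simply be the mean of this value over all assembled haplotypes.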
A draft human pangenome reference. Liao, Wen-Wei; Asri, Mobin; Ebler, Jana ...
Nature (London), 05/2023, Volume 617, Issue 7960
Journal Article, peer reviewed, open access
Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.
The rhesus macaque (Macaca mulatta) is the most widely studied nonhuman primate (NHP) in biomedical research. We present an updated reference genome assembly (Mmul_10, contig N50 = 46 Mbp) that increases the sequence contiguity 120-fold and annotate it using 6.5 million full-length transcripts, thus improving our understanding of gene content, isoform diversity, and repeat organization. With the improved assembly of segmental duplications, we discovered new lineage-specific genes and expanded gene families that are potentially informative in studies of evolution and disease susceptibility. Whole-genome sequencing (WGS) data from 853 rhesus macaques identified 85.7 million single-nucleotide variants (SNVs) and 10.5 million indel variants, including potentially damaging variants in genes associated with human autism and developmental delay, providing a framework for developing noninvasive NHP models of human disease.
Despite widespread clinical genetic testing, many individuals with suspected genetic conditions lack a precise diagnosis, limiting their opportunity to take advantage of state-of-the-art treatments. In some cases, testing reveals difficult-to-evaluate structural differences, candidate variants that do not fully explain the phenotype, single pathogenic variants in recessive disorders, or no variants in genes of interest. Thus, there is a need for better tools to identify a precise genetic diagnosis in individuals when conventional testing approaches have been exhausted. We performed targeted long-read sequencing (T-LRS) using adaptive sampling on the Oxford Nanopore platform on 40 individuals, 10 of whom lacked a complete molecular diagnosis. We computationally targeted up to 151 Mbp of sequence per individual and searched for pathogenic substitutions, structural variants, and methylation differences using a single data source. We detected all genomic aberrations (including single-nucleotide variants, copy number changes, repeat expansions, and methylation differences) identified by prior clinical testing. In 8/8 individuals with complex structural rearrangements, T-LRS enabled more precise resolution of the mutation, leading to changes in clinical management in one case. In ten individuals with suspected Mendelian conditions lacking a precise genetic diagnosis, T-LRS identified pathogenic or likely pathogenic variants in six and variants of uncertain significance in two others. T-LRS accurately identifies pathogenic structural variants, resolves complex rearrangements, and identifies Mendelian variants not detected by other technologies. T-LRS represents an efficient and cost-effective strategy to evaluate high-priority genes and regions or complex clinical testing results.
Here we present a finished sequence of human chromosome 15, together with a high-quality gene catalogue. As chromosome 15 is one of seven human chromosomes with a high rate of segmental duplication, we have carried out a detailed analysis of the duplication structure of the chromosome. Segmental duplications in chromosome 15 are largely clustered in two regions, on proximal and distal 15q; the proximal region is notable because recombination among the segmental duplications can result in deletions causing Prader-Willi and Angelman syndromes. Sequence analysis shows that the proximal and distal regions of 15q share extensive ancient similarity. Using a simple approach, we have been able to reconstruct many of the events by which the current duplication structure arose. We find that most of the intrachromosomal duplications seem to share a common ancestry. Finally, we demonstrate that some remaining gaps in the genome sequence are probably due to structural polymorphisms between haplotypes; this may explain a significant fraction of the gaps remaining in the human genome.
Crosslinking proteins to the nucleic acids they bind affords stable access to otherwise transient regulatory interactions. Photochemical crosslinking provides an attractive alternative to formaldehyde‐based protocols, but irradiation with conventional UV sources typically yields inadequate product amounts. Crosslinking with pulsed UV lasers has been heralded as a revolutionary technique to increase photochemical yield, but this method had only been tested on a few protein–nucleic acid complexes. To test the generality of the yield enhancement, we have investigated the benefits of using ∼150 fs UV pulses to crosslink TATA‐binding protein, glucocorticoid receptor and heat shock factor to oligonucleotides in vitro. For these proteins, we find that the quantum yields (and saturating yields) for forming crosslinks using the high‐peak intensity femtosecond laser do not improve on those obtained with low‐intensity continuous wave (CW) UV sources. The photodamage to the oligonucleotides and proteins also has comparable quantum yields. Measurements of the photochemical reaction yields of several small molecules selected to model the crosslinking reactions also exhibit nearly linear dependences on UV intensity instead of the previously predicted quadratic dependence. Unfortunately, these results disprove earlier assertions that femtosecond pulsed laser sources provide significant advantages over CW radiation for protein–nucleic acid crosslinking.
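The distinction between a linear and a quadratic dependence of yield on UV intensity can be illustrated by fitting the exponent of a power law, yield ∝ intensity^k, as the slope of a log-log regression: k near 1 indicates a linear (one-photon-like) dependence, k near 2 a quadratic one. The sketch below is only an illustration under that assumption. The function name `loglog_slope` and the synthetic data are hypothetical, and nothing here reproduces the authors' measurements or analysis.

```python
import math


def loglog_slope(intensities, yields):
    """Least-squares slope of log(yield) against log(intensity).

    For data following yield = c * intensity**k, the slope equals k.
    """
    if len(intensities) != len(yields) or len(intensities) < 2:
        raise ValueError("need two or more paired measurements")
    xs = [math.log(i) for i in intensities]
    ys = [math.log(y) for y in yields]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("intensities must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic, noise-free series in arbitrary units (not experimental data).
intensities = [1.0, 2.0, 4.0, 8.0]
linear_yields = [0.5 * i for i in intensities]          # k = 1
quadratic_yields = [0.5 * i ** 2 for i in intensities]  # k = 2

print(round(loglog_slope(intensities, linear_yields), 6))
print(round(loglog_slope(intensities, quadratic_yields), 6))
```

With real measurements the fitted exponent would carry noise, so a confidence interval on the slope would be needed before calling a dependence "nearly linear".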