Whole-genome bisulfite sequencing (WGBS) has become the standard method for interrogating plant methylomes at base resolution. However, deep WGBS measurements remain cost-prohibitive for large, complex genomes and for population-level studies. As a result, most published plant methylomes are sequenced far below saturation, with a large proportion of cytosines having either missing data or insufficient coverage.
Here we present METHimpute, a hidden Markov model (HMM)-based imputation algorithm for the analysis of WGBS data. Unlike existing methods, METHimpute enables the construction of complete methylomes by inferring the methylation status and level of all cytosines in the genome regardless of coverage. Application of METHimpute to maize, rice and Arabidopsis shows that the algorithm infers cytosine-resolution methylomes with high accuracy from data with coverage as low as 6X, compared to data with 60X, thus making it a cost-effective solution for large-scale studies.
METHimpute provides methylation status calls and levels for all cytosines in the genome regardless of coverage, thus yielding complete methylomes even with low-coverage WGBS datasets. The method has been extensively tested in plants, but should also be applicable to other species. An implementation is available on Bioconductor.
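The idea of imputing methylation at uncovered cytosines with an HMM can be illustrated with a minimal sketch. It is not the METHimpute implementation (which is distributed as a Bioconductor package). It assumes only two states, binomial emissions over read counts, and fixed rather than fitted parameters. The function name `posterior_methylated` and all parameter values are hypothetical choices made for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of k methylated reads out of n, given per-read rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def posterior_methylated(counts, p_unmeth=0.05, p_meth=0.9, stay=0.9):
    """Posterior probability of the 'methylated' state for each cytosine.

    counts: list of (methylated_reads, total_reads) in genomic order.
    p_unmeth, p_meth, stay: illustrative fixed parameters (assumptions),
    not values estimated from data.
    """
    rates = (p_unmeth, p_meth)
    trans = ((stay, 1 - stay), (1 - stay, stay))
    n_sites = len(counts)
    # A site with zero coverage has emission probability 1 in both states,
    # so its posterior is determined entirely by its neighbours.
    emis = [[binom_pmf(m, n, r) for r in rates] for m, n in counts]

    # Forward pass, normalised at every step to avoid underflow.
    fwd = []
    for t in range(n_sites):
        if t == 0:
            cur = [0.5 * emis[0][s] for s in (0, 1)]
        else:
            prev = fwd[-1]
            cur = [
                sum(prev[r] * trans[r][s] for r in (0, 1)) * emis[t][s]
                for s in (0, 1)
            ]
        z = sum(cur)
        fwd.append([c / z for c in cur])

    # Backward pass, also normalised.
    bwd = [[1.0, 1.0] for _ in range(n_sites)]
    for t in range(n_sites - 2, -1, -1):
        nxt = [
            sum(trans[s][r] * emis[t + 1][r] * bwd[t + 1][r] for r in (0, 1))
            for s in (0, 1)
        ]
        z = sum(nxt)
        bwd[t] = [x / z for x in nxt]

    # Combine both passes into per-site posteriors.
    post = []
    for t in range(n_sites):
        u = fwd[t][0] * bwd[t][0]
        m = fwd[t][1] * bwd[t][1]
        post.append(m / (u + m))
    return post
```

In this sketch, a cytosine with no reads between two well-covered, highly methylated neighbours receives a high posterior, which is the sense in which an HMM can call sites "regardless of coverage". A practical method would estimate the parameters from the data and handle sequence contexts separately.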
The accumulation and removal of transposable elements (TEs) is a major driver of genome size evolution in eukaryotes. In plants, long terminal repeat (LTR) retrotransposons (LTR-RTs) represent the majority of TEs and form most of the nuclear DNA in large genomes. Unequal recombination (UR) between LTRs leads to removal of intervening sequence and formation of solo-LTRs. UR is a major mechanism of LTR-RT removal in many angiosperms, but our understanding of LTR-RT-associated recombination within the large, LTR-RT-rich genomes of conifers is quite limited. We employ a novel read-based methodology to estimate the relative rates of LTR-RT-associated UR within the genomes of four conifer and seven angiosperm species. We found the lowest rates of UR in the largest genomes studied, conifers and the angiosperm maize. Recombination may also resolve as gene conversion, which does not remove sequence, so we analyzed LTR-RT-associated gene conversion events (GCEs) in Norway spruce and six angiosperms. Opposite the trend for UR, we found the highest rates of GCEs in Norway spruce and maize. Unlike previous work in angiosperms, we found no evidence that rates of UR correlate with retroelement structural features in the conifers, suggesting that another process is suppressing UR in these species. Recent results from diverse eukaryotes indicate that heterochromatin affects the resolution of recombination, by favoring gene conversion over crossing-over, similar to our observation of opposed rates of UR and GCEs. Control of LTR-RT proliferation via formation of heterochromatin would be a likely step toward large genomes in eukaryotes carrying high LTR-RT content.
Norway spruce (Picea abies (L.) Karst.) is a conifer species of substantial economic and ecological importance. In common with most conifers, the genome is very large (∼20 Gbp) and contains a high fraction of repetitive DNA. The current genome assembly (v1.0) covers approximately 60% of the total genome size but is highly fragmented, consisting of >10 million scaffolds. The genome annotation contains 66,632 gene models that are at least partially validated (www.congenie.org); however, the fragmented nature of the assembly means that there is currently little information available on how these genes are physically distributed over the 12 chromosomes. By creating an ultra-dense genetic linkage map, we anchored and ordered scaffolds into linkage groups, which complements the fine-scale information available in assembly contigs. Our ultra-dense haploid consensus genetic map consists of 21,056 markers derived from 14,336 scaffolds that contain 17,079 gene models (25.6% of the validated gene models) that we have anchored to the 12 linkage groups. We used data from three independent component maps, as well as comparisons with previously published maps, to evaluate the accuracy and marker ordering of the linkage groups. We demonstrate that approximately 3.8% of the anchored scaffolds and 1.6% of the gene models covered by the consensus map have likely assembly errors, as they contain genetic markers that map to different regions within or between linkage groups. We further evaluate the utility of the genetic map for the conifer research community by using an independent data set of unrelated individuals to assess genome-wide variation in genetic diversity using the genomic regions anchored to linkage groups. The results show that our map is sufficiently dense to enable detailed evolutionary analyses across the genome.
The size and shape of tree leaves and their variation within the canopy are the result of both physiological plasticity and an overall adaptive strategy against unfavourable environmental conditions. In this study, diversity patterns in leaf morphological traits are described within and among populations of trees with different phylogenetic backgrounds. Beech (Fagus sp.) is a widespread tree in Eurasia, represented by two species: F. sylvatica in Europe and F. orientalis in eastern Europe and Asia. Both species occur in the Rodopi mountains, in the southeast Balkans. Five beech populations were sampled on the southern slopes of Rodopi along a west–east gradient representing an established transitional zone between the two beech species. The diversity of six leaf traits was examined in shade leaves and in leaves exposed to direct irradiation. Significant differences appeared among populations and between the two shading classes. Western beech populations consisted of trees with smaller leaves and fewer veins and were morphologically closer to F. sylvatica, while eastern populations seemed to be closer to F. orientalis. Shade leaves were consistently larger and less round than light leaves, probably due to different light harvesting strategies. The differences between populations were larger for shade leaves than for light leaves and presented a clear east–west trend, consistent with the differentiation pattern provided by previous genetic studies in the same region. Our results indicate that shade leaves probably maintain their size and shape independently of light irradiation and therefore may better express genetic differences among populations.
Methylome evolution in plants. Vidalis, Amaryllis; Živković, Daniel; Wardenaar, René ...
Genome Biology, 12/2016, Volume 17, Issue 1
Journal article, peer reviewed, open access
Despite major progress in dissecting the molecular pathways that control DNA methylation patterns in plants, little is known about the mechanisms that shape plant methylomes over evolutionary time. Drawing on recent intra- and interspecific epigenomic studies, we show that methylome evolution over long timescales is largely a byproduct of genomic changes. By contrast, methylome evolution over short timescales appears to be driven mainly by spontaneous epimutational events. We argue that novel methods based on analyses of the methylation site frequency spectrum (mSFS) of natural populations can provide deeper insights into the evolutionary forces that act at each timescale.
Summary
Norway spruce is a boreal forest tree species of significant ecological and economic importance. Hence there is a strong imperative to dissect the genetics underlying important wood quality traits in the species. We performed a functional genome‐wide association study (GWAS) of 17 wood traits in Norway spruce using 178 101 single nucleotide polymorphisms (SNPs) generated from exome genotyping of 517 mother trees. The wood traits were defined using functional modelling of wood properties across annual growth rings. We applied a Least Absolute Shrinkage and Selection Operator (LASSO)‐based association mapping method using a functional multilocus mapping approach that utilizes latent traits, with a stability selection probability method as the hypothesis testing approach to determine a significant quantitative trait locus. The analysis provided 52 significant SNPs from 39 candidate genes, including genes previously implicated in wood formation and tree growth in spruce and other species. Our study represents a multilocus GWAS for complex wood traits in Norway spruce. The results advance our understanding of the genetics influencing wood traits and identify candidate genes for future functional studies.
Significance Statement
Wood provides both structural support and a transport route for water and solutes in trees. Our work provides a framework to dissect the genetic nature of wood formation and adds to our understanding of tree growth and development. With the current research focus on wood cell wall biosynthesis in general, and lignocellulose feedstock for biorefineries, we believe that this contribution will be of wide interest for the plant science community.
Erratum to: Methylome evolution in plants. Vidalis, Amaryllis; Živković, Daniel; Wardenaar, René ...
Genome Biology, 02/2017, Volume 18, Issue 1
Journal article, peer reviewed, open access
A schematic representation of the five chromosomes is shown above (circle, centromere; dark gray, pericentromeric region; light gray, arm). (b) Annotation-specific CG epimutations produce distinct methylome diversity (CG meth. div.) patterns among mutation accumulation lines (MA-lines) that have diverged for merely 30 generations (average diversity was calculated in 1 Mb sliding windows, step size 100 kb). The theoretical model (see Box 1) provides an accurate fit to the observed genic CG methylation diversity patterns, suggesting that CG epimutations are a major factor in shaping methylome diversity in natural populations of A. thaliana over evolutionary timescales.
Notes: The online version of the original article can be found under doi:10.1186/s13059-016-1127-5.
Declarations: Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
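The sliding-window averaging mentioned in the caption (1 Mb windows, 100 kb step) can be sketched as follows. This is a simple illustration rather than the authors' code. The function name `sliding_window_means`, the half-open window convention, and the use of `None` for empty windows are assumptions.

```python
def sliding_window_means(sites, chrom_length, window=1_000_000, step=100_000):
    """Average a per-site value in overlapping windows along one chromosome.

    sites: iterable of (position, value) pairs, e.g. per-cytosine diversity.
    Returns a list of (window_start, mean) tuples; mean is None when a
    window contains no sites. Windows are half-open: [start, start + window).
    """
    sites = sorted(sites)
    out = []
    start = 0
    while start < chrom_length:
        end = start + window
        # Linear scan per window: fine for a sketch, slow for whole genomes.
        vals = [v for pos, v in sites if start <= pos < end]
        out.append((start, sum(vals) / len(vals) if vals else None))
        start += step
    return out
```

Because the step is smaller than the window, each site contributes to several consecutive windows, which smooths the resulting profile along the chromosome.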
Closely related Quercus species generally exhibit low levels of genetic differentiation despite their ecological and morphological differences. However, at a few so‐called ‘outlier’ loci they seem to remain genetically distinct. Isocitrate dehydrogenases (IDH) are key enzymes involved in the metabolic pathway of the citrate cycle. IDH has also been characterised as an ‘outlier’ marker, significantly differentiating the closely related Q. robur and Q. petraea with the isozyme technique. This ability to differentiate the species was tested here at the molecular level: 13 single nucleotide polymorphism (SNP) markers were identified and developed within a NADP+‐specific IDH gene in Quercus spp. and applied as molecular markers in a four‐species mixed oak forest in eastern Europe, where Q. robur, Q. petraea, Q. pubescens and Q. frainetto naturally co‐exist. From the 13 developed SNPs, three groups were formed: non‐synonymous, synonymous and non‐coding SNPs. The levels of total gene diversity were moderate for all species investigated. The non‐synonymous SNPs showed lower levels of gene diversity. Overall, the four closely related Quercus spp. were significantly differentiated (except Q. petraea with Q. frainetto). Analysis of non‐random association of alleles revealed no clear physical clustering of the SNP sites in significant linkage disequilibrium (LD). However, separate LD analysis for each species showed a lower number of sites in significant LD for Q. robur than for the other species, possibly reflecting the history of the species in this specific geographical site and less efficient recombination effect due to the larger effective population size of Q. robur. Eleven statistically significant associations were found between seven SNPs and morphological traits that are commonly used to differentiate oak species.
Patterns of fine-scale spatial distribution of multilocus genotypes can provide valuable insights into the biology of forest tree species. Here we tested for the existence of spatial genetic structure (SGS) in a four-oak-species forest with contrasting species abundances and hybridization rates. A total of 483 adult trees were mapped over 8.6 ha and genotyped using 10 highly polymorphic genomic regions. A weak but significant SGS was observed in each of the four oak species, with Quercus frainetto, the species with the lowest density in the sampling plot, exhibiting the strongest SGS. The values of the Sp statistic were 0.0033, 0.0035, 0.0042, and 0.0098 for Q. petraea, Q. robur, Q. pubescens, and Q. frainetto, respectively. The spatial correlogram of the total population was significantly different when hybrids were removed from the analysis, which suggests that hybridization influenced the SGS. Interspecific SGSs were significantly correlated with the rates of hybridization. Implications of the obtained results for the conservation and management of forest genetic resources are discussed.
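The abstract does not define the Sp statistic. As commonly defined (Vekemans and Hardy, 2004), Sp = -b / (1 - F1), where b is the slope of the regression of pairwise kinship on the natural logarithm of spatial distance and F1 is the mean kinship between individuals in the first distance class. The sketch below computes it from pre-computed pairwise values. The function name `sp_statistic`, the input layout, and the `first_class_max` threshold are illustrative assumptions, and estimating the kinship coefficients themselves is outside its scope.

```python
from math import log


def sp_statistic(pairs, first_class_max):
    """Sp = -b / (1 - F1) from pairwise (distance, kinship) values.

    pairs: list of (distance, kinship) tuples with distance > 0.
    first_class_max: upper bound of the first distance class (assumption:
    the caller chooses it; at least one pair must fall within it).
    """
    xs = [log(d) for d, _ in pairs]
    ys = [f for _, f in pairs]
    n = len(pairs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope of kinship on ln(distance).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    # Mean kinship among the closest pairs.
    first = [f for d, f in pairs if d <= first_class_max]
    f1 = sum(first) / len(first)
    return -b / (1 - f1)
```

A steeper decline of kinship with distance gives a larger Sp, which matches the abstract's reading of the highest value (0.0098, Q. frainetto) as the strongest SGS.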