While several studies have investigated general properties of the genetic architecture of natural variation in gene expression, few of these have considered natural, outbreeding populations. In parallel, systems biology has established that a general feature of biological networks is that they are scale-free, rendering them buffered against random mutations. To date, few studies have attempted to examine the relationship between the selective processes acting to maintain natural variation of gene expression and the associated co-expression network structure. Here we utilised RNA-Sequencing to assay gene expression in winter buds undergoing bud flush in a natural population of Populus tremula, an outbreeding forest tree species. We performed expression Quantitative Trait Locus (eQTL) mapping and identified 164,290 significant eQTLs associating 6,241 unique genes (eGenes) with 147,419 unique SNPs (eSNPs). We found approximately four times as many local as distant eQTLs, with local eQTLs having significantly higher effect sizes. eQTLs were primarily located in regulatory regions of genes (UTRs or flanking regions), regardless of whether they were local or distant. We used the gene expression data to infer a co-expression network and investigated the relationship between network topology, the genetic architecture of gene expression and signatures of selection. Within the co-expression network, eGenes were underrepresented in network module cores (hubs) and overrepresented in the periphery of the network, with a negative correlation between eQTL effect size and network connectivity. We additionally found that module core genes have experienced stronger selective constraint on coding and non-coding sequence, with connectivity associated with signatures of selection.
Our integrated genetics and genomics results suggest that purifying selection is the primary mechanism underlying the genetic architecture of natural variation in gene expression assayed in flushing leaf buds of P. tremula and that connectivity within the co-expression network is linked to the strength of purifying selection.
I have studied nucleotide polymorphism and linkage disequilibrium using multilocus data from 77 fragments, with an average fragment length of 550 bp, in the deciduous tree Populus tremula (Salicaceae). The frequency spectrum across loci showed a modest excess of mutations segregating at low frequency and a marked excess of high-frequency derived mutations at silent sites, relative to neutral expectations. These excesses were also seen at replacement sites, but were not so pronounced for high-frequency derived mutations. There was a marked excess of low-frequency mutations at replacement sites, likely indicating deleterious amino acid-changing mutations that segregate at low frequencies in P. tremula. I used approximate Bayesian computation (ABC) to evaluate a number of different demographic scenarios and to estimate parameters for the best-fitting model. The data were found to be consistent with a historical reduction in the effective population size of P. tremula through a bottleneck. The timing inferred for this bottleneck is largely consistent with geological data and with data from several other long-lived plant species. The results show that P. tremula harbors substantial levels of nucleotide polymorphism, with the posterior mode of the scaled mutation rate, theta = 0.0177, across loci. The ABC analyses also provided an estimate of the scaled recombination rate that indicates that recombination rates in P. tremula are likely to be 2-10 times higher than the mutation rate. This study reinforces the notion that linkage disequilibrium is low and decays to negligible levels within a few hundred base pairs in P. tremula.
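The ABC step described above can be illustrated with a minimal rejection sampler. This is a sketch under strong simplifying assumptions, not the analysis used in the study: it assumes a single non-recombining fragment, a constant-size neutral population (no bottleneck), one summary statistic (the number of segregating sites) and a uniform prior. The function names, the prior bounds, the tolerance and the example inputs (20 sampled sequences, 40 segregating sites) are hypothetical; only the 550 bp fragment length is taken from the abstract.

```python
import random


def simulate_segregating_sites(theta_site, n_samples, n_sites, rng):
    """Draw the number of segregating sites under a constant-size neutral
    coalescent without recombination (a deliberately simple model)."""
    # Total tree length: while k lineages remain, the waiting time is
    # exponential with rate k(k-1)/2, in units of 2N generations.
    tree_length = sum(
        k * rng.expovariate(k * (k - 1) / 2) for k in range(2, n_samples + 1)
    )
    mean_mutations = theta_site * n_sites * tree_length / 2
    # Poisson draw: count unit-rate exponential arrivals before the mean.
    count, t = 0, rng.expovariate(1.0)
    while t < mean_mutations:
        count += 1
        t += rng.expovariate(1.0)
    return count


def abc_rejection(observed_s, n_samples, n_sites, prior=(0.0, 0.05),
                  tolerance=5, n_sims=5000, seed=0):
    """Keep prior draws of theta whose simulated summary statistic falls
    within `tolerance` of the observed one (hypothetical settings)."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_sims):
        theta = rng.uniform(*prior)
        s = simulate_segregating_sites(theta, n_samples, n_sites, rng)
        if abs(s - observed_s) <= tolerance:
            accepted.append(theta)
    return accepted


if __name__ == "__main__":
    # Invented example data: 40 segregating sites in 20 sequences of 550 bp.
    posterior = abc_rejection(observed_s=40, n_samples=20, n_sites=550)
    print(len(posterior), sum(posterior) / len(posterior))
```

The accepted values approximate the posterior of the per-site scaled mutation rate under this toy model. Comparing demographic scenarios, as the abstract describes, would require simulating each scenario and using a richer set of summary statistics.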
Hybridization and resulting introgression are important processes shaping the tree of life and appear to be far more common than previously thought. However, how genome evolution is shaped by various genetic and evolutionary forces after hybridization remains unresolved. Here we used whole-genome resequencing data of 227 individuals from multiple widespread Populus species to characterize their contemporary patterns of hybridization and to quantify genomic signatures of past introgression. We observe a high frequency of contemporary hybridization and confirm that multiple previously ambiguous species are in fact F1 hybrids. Seven species were identified, which experienced different demographic histories that resulted in strikingly varied efficacy of selection and burdens of deleterious mutations. Frequent past introgression has been found to be a pervasive feature throughout the speciation of these Populus species. The retained introgressed regions, more generally, tend to contain reduced genetic load and to be located in regions of high recombination. We also find that in pairs of species with substantial differences in effective population size, introgressed regions are inferred to have undergone selective sweeps at greater than expected frequencies in the species with lower effective population size, suggesting that introgression likely has a higher potential to provide beneficial variation for species with small populations. Our results, therefore, illustrate that demography and recombination have interacted with both positive and negative selection in determining genomic evolution after hybridization.
CONTENTS: Summary; I. Introduction; II. Genotyping; III. Phenotyping; IV. Study designs; V. The genetics of the 'omics'; VI. Missing heritability: the dark matter of the genome; VII. Gene interactions; VIII. Many rare alleles; IX. Looking in the wrong place; X. Looking but not seeing; XI. Needles in a haystack; XII. Confounding effects; XIII. Replicating and verifying associations; XIV. The genetic architecture of quantitative traits in plants; XV. Outlook; Acknowledgements; References.
SUMMARY: Association mapping is rapidly becoming the main method for dissecting the genetic architecture of complex traits in plants. Currently most association mapping studies in plants are performed using sets of genes selected to be putative candidates for the trait of interest, but rapid developments in genomics will allow for genome-wide mapping in virtually any plant species in the near future. As the costs for genotyping are decreasing, the focus has shifted towards phenotyping. In plants, clonal replication and/or inbred lines allow for replicated phenotyping under many different environmental conditions. Reduced sequencing costs will increase the number of studies that use RNA sequencing data to perform expression quantitative trait locus (eQTL) mapping, which will increase our knowledge of how gene expression variation contributes to phenotypic variation. Population sizes currently used in association mapping studies are modest and need to be greatly increased if mutations explaining less than a few per cent of the phenotypic variation are to be detected. Association mapping has started to yield insights into the genetic architecture of complex traits in plants, and future studies with greater genome coverage will help to elucidate how plants have managed to adapt to a wide variety of environmental conditions.
Populus is an important model organism in forest biology, but levels of nucleotide polymorphisms and linkage disequilibrium have never been investigated in natural populations. Here I present a study on levels of nucleotide polymorphism, haplotype structure, and population subdivision in five nuclear genes in the European aspen Populus tremula. Results show substantial levels of genetic variation. Levels of silent site polymorphisms, pi(s), averaged 0.016 across the five genes. Linkage disequilibrium was generally low, extending only a few hundred base pairs, suggesting that rates of recombination are high in this obligate outcrossing species. Significant genetic differentiation was found at all five genes, with an average estimate of F(ST) = 0.116. Levels of polymorphism in P. tremula are 2- to 10-fold higher than those in other woody, long-lived perennial plants, such as Pinus and Cryptomeria. The high levels of nucleotide polymorphism and low linkage disequilibrium suggest that it may be possible to map functional variation to very fine scales in P. tremula using association-mapping approaches.
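The two summary statistics reported above, nucleotide diversity (pi) and F(ST), can be sketched as follows. The abstract does not state which F(ST) estimator was used, so this sketch assumes the sequence-based form F(ST) = 1 - pi_within / pi_between, and it counts differences over all aligned sites rather than silent sites only. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations, product


def differences(a, b):
    """Number of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def pi_within(seqs):
    """Average pairwise differences per site within one sample."""
    pairs = list(combinations(seqs, 2))
    return sum(differences(a, b) for a, b in pairs) / (len(pairs) * len(seqs[0]))


def pi_between(pop1, pop2):
    """Average pairwise differences per site between two samples."""
    pairs = list(product(pop1, pop2))
    return sum(differences(a, b) for a, b in pairs) / (len(pairs) * len(pop1[0]))


def fst(pop1, pop2):
    """Sequence-based F_ST, assumed here as 1 - pi_within / pi_between."""
    within = (pi_within(pop1) + pi_within(pop2)) / 2
    return 1 - within / pi_between(pop1, pop2)


if __name__ == "__main__":
    # Invented toy alignments for two populations.
    north = ["AAAA", "AAAT"]
    south = ["TTAA", "TTAT"]
    print(pi_within(north), pi_between(north, south), fst(north, south))
```

In the toy data each population has one pairwise difference over four sites, while sequences from different populations differ at 2.5 sites on average, so most of the variation lies between populations.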
The majority of variation in rates of molecular evolution among seed plants remains both unexplored and unexplained. Although some attention has been given to flowering plants, reports of molecular evolutionary rates for their sister plant clade (gymnosperms) are scarce, and to our knowledge differences in molecular evolution among seed plant clades have never been tested in a phylogenetic framework. Angiosperms and gymnosperms differ in a number of features, of which contrasting reproductive biology, life spans, and population sizes are the most prominent. The highly conserved morphology of gymnosperms, evidenced by the similarity of extant species to fossil records, and the high levels of macrosynteny at the genomic level have led scientists to believe that gymnosperms are slow-evolving plants, although some studies have offered contradictory results. Here, we used 31,968 nucleotide sites obtained from orthologous genes across a wide taxonomic sampling that includes representatives of most conifers, cycads, ginkgo, and many angiosperms with a sequenced genome. Our results suggest that angiosperms and gymnosperms differ considerably in their rates of molecular evolution per unit time, with gymnosperm rates being, on average, seven times lower than those of angiosperms. Longer generation times and larger genome sizes are some of the factors explaining the slow rates of molecular evolution found in gymnosperms. In contrast to their slow rates of molecular evolution, gymnosperms possess higher substitution rate ratios than angiosperm taxa. Finally, our study suggests stronger and more efficient purifying and diversifying selection in gymnosperm than in angiosperm species, probably in relation to larger effective population sizes.
We investigated the utility of association mapping to dissect the genetic basis of naturally occurring variation in bud phenology in European aspen (Populus tremula). With this aim, we surveyed nucleotide polymorphism in 13 fragments spanning an 80-kb region surrounding the phytochrome B2 (phyB2) locus. Although polymorphism varies substantially across the phyB2 region, we detected no signs of deviation from neutral expectations. We also identified a total of 41 single nucleotide polymorphisms (SNPs) that were subsequently scored in a mapping population consisting of 120 trees. We identified two nonsynonymous SNPs in the phytochrome B2 gene that were independently associated with variation in the timing of bud set and that explained between 1.5 and 5% of the observed phenotypic variation in bud set. Earlier studies have shown that the frequencies of both these SNPs vary clinally with latitude. Linkage disequilibrium across the region was low, suggesting that the SNPs we identified are strong candidates for being causally linked to variation in bud set in our mapping population. One of the SNPs (T608N) is located in the "hinge region," close to the chromophore binding site of the phyB2 protein. The other SNP (L1078P) is located in a region supposed to mediate downstream signaling from the phyB2 locus. The lack of population structure, combined with low levels of linkage disequilibrium, suggests that association mapping is a fruitful method for dissecting naturally occurring variation in Populus tremula.
High-throughput DNA sequencing and genotyping technologies have enabled a new generation of research in plant genetics where combined quantitative and population genetic approaches can be used to better understand the relationship between naturally occurring genotypic and phenotypic diversity. Forest trees are highly amenable to such studies because of their combined undomesticated and partially domesticated state. Forest geneticists are using association genetics to dissect complex adaptive traits and discover the underlying genes. In parallel, they are using resequencing of candidate genes and modern population genetics methods to discover genes under natural selection. This combined approach is identifying the most important genes that determine patterns of complex trait adaptation observed in many tree populations.
A genome-wide association study (GWAS) was used to identify loci associated with early vigor under simulated water deficit and grain yield under field drought in a diverse collection of Iranian bread wheat landraces. In addition, a meta-quantitative trait loci (MQTL) analysis was used to further expand our approach by retrieving already published quantitative trait loci (QTL) from recombinant inbred lines, doubled haploids, back-crosses, and F2 mapping populations. In the current study, around 16%, 14%, and 16% of SNPs were in significant linkage disequilibrium (LD) in the A, B, and D genomes, respectively, and varied between 5.44% (4A) and 21.85% (6A). Three main subgroups were identified among the landraces with different degrees of admixture, and population structure was further explored through principal component analysis. Our GWAS identified 54 marker-trait associations (MTAs) that were located across the wheat genome but with the highest number found in the B sub-genome. The gene ontology (GO) analysis of MTAs revealed that around 75% were located within or close to protein-coding genes. In the MQTL analysis, 23 MQTLs, from a total of 215 QTLs, were identified and successfully projected onto the reference map. MQT-YLD4, MQT-YLD9, MQT-YLD13, MQT-YLD17, MQT-YLD18, MQT-YLD19, and MQTL-RL1 contributed to the highest number of projected QTLs and were therefore regarded as the most reliable and stable QTLs under water deficit conditions. These MQTLs greatly facilitate the identification of putative candidate genes underlying each MQTL interval, owing to the reduced confidence intervals associated with MQTLs. These findings provide important information on the genetic basis of early vigor traits and grain yield under water deficit conditions and set the foundation for future investigations into adaptation to water deficit in bread wheat.
The presupposition of genomic selection (GS) is that predictive accuracies should be based on population-wide linkage disequilibrium (LD). However, in species with large, highly complex genomes the limitation of marker density may preclude the ability to resolve LD accurately enough for GS. Here we investigate such an effect in two conifer species with ~20 Gbp genomes, Douglas-fir (Pseudotsuga menziesii (Mirb.) Franco) and Interior spruce (Picea glauca (Moench) Voss x Picea engelmannii Parry ex Engelm.). Random sampling of markers was performed to obtain SNP sets with totals in the range of 200-50,000; this was replicated 10 times. Ridge Regression Best Linear Unbiased Predictor (RR-BLUP) was deployed as the GS method to test these SNP sets, and 10-fold cross-validation was performed on 1,321 Douglas-fir trees, representing 37 full-sib F1 families, and on 1,126 Interior spruce trees, representing 25 open-pollinated (half-sib) families. Both trials are located on 3 sites in British Columbia, Canada. As marker number increased, so did GS predictive accuracy for both conifer species. However, a plateau in the gain of accuracy became apparent around 10,000-15,000 markers for both Douglas-fir and Interior spruce. Despite random marker selection, little variation in predictive accuracy was observed across replications. On average, Douglas-fir prediction accuracies were higher than those of Interior spruce, reflecting the difference between full- and half-sib families for Douglas-fir and Interior spruce populations, respectively, as well as their respective effective population sizes. Although possibly advantageous within an advanced breeding population, reducing marker density cannot be recommended for carrying out GS in conifers. Significant LD between markers and putative causal variants was not detected using 50,000 SNPs, and GS was enabled only through the tracking of relatedness in the populations studied.
Dramatically increasing marker density would enable the markers to better track LD with causal variants in these large, genetically diverse genomes, and would provide a model that could be used across populations, breeding programs, and traits.
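The procedure outlined above (random marker subsets scored by cross-validated ridge regression) can be sketched in a few lines. This is an illustrative toy, not the pipeline used in the study: it assumes a fixed, hypothetical shrinkage parameter instead of one estimated from variance components, it solves the ridge equations directly with a small hand-written linear solver, and it uses simulated unrelated individuals with independent markers, so it cannot reproduce the relatedness effect the abstract reports. All function names and simulation settings are assumptions.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ridge_fit(z, y, lam):
    """Marker effects u from (Z'Z + lam*I) u = Z'(y - mean(y))."""
    p, mu = len(z[0]), sum(y) / len(y)
    ztz = [[sum(row[i] * row[j] for row in z) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    zty = [sum(row[i] * (yi - mu) for row, yi in zip(z, y)) for i in range(p)]
    return mu, solve(ztz, zty)


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def cv_accuracy(z, y, markers, lam=1.0, folds=10, seed=0):
    """Correlation of observed and k-fold cross-validated predictions,
    using only the marker columns listed in `markers`."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    sub = [[row[j] for j in markers] for row in z]
    pred = [0.0] * len(y)
    for f in range(folds):
        test = idx[f::folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        mu, u = ridge_fit([sub[i] for i in train], [y[i] for i in train], lam)
        for i in test:
            pred[i] = mu + sum(g * e for g, e in zip(sub[i], u))
    return pearson(y, pred)


def simulate(n=100, p=40, noise=0.5, seed=1):
    """Invented data: independent 0/1/2 genotypes with additive effects."""
    rng = random.Random(seed)
    z = [[float(rng.choice((0, 1, 2))) for _ in range(p)] for _ in range(n)]
    effects = [rng.gauss(0, 1) for _ in range(p)]
    y = [sum(g * e for g, e in zip(row, effects)) + rng.gauss(0, noise)
         for row in z]
    return z, y


if __name__ == "__main__":
    z, y = simulate()
    rng = random.Random(2)
    for size in (5, 20, 40):
        # Replicate the random marker draw, as in the abstract's design.
        accs = [cv_accuracy(z, y, rng.sample(range(40), size)) for _ in range(3)]
        print(size, round(sum(accs) / len(accs), 3))
```

Because every simulated marker is causal and independent, accuracy here rises simply because more causal loci are included. In the conifer data the abstract attributes the gain to tracking relatedness, which this toy does not model.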