Classically, the functional consequences of natural selection over genomes have been analyzed as the compound effects of individual genes. The current paradigm for large-scale analysis of adaptation is based on the observed significant deviations of rates of individual genes from neutral evolutionary expectation. This approach, which assumes independence among genes, has not been able to identify biological functions significantly enriched in positively selected genes in individual species. Alternatively, pooling related species has enhanced the search for signatures of selection. However, grouping signatures does not allow testing for adaptive differences between species. Here we introduce the Gene-Set Selection Analysis (GSSA), a new genome-wide approach to test for evidence of natural selection on functional modules. GSSA is able to detect lineage-specific evolutionary rate changes in a notable number of functional modules. For example, in nine mammal and Drosophila genomes GSSA identifies hundreds of functional modules with significant associations to high and low rates of evolution. Many of the detected functional modules with high evolutionary rates have been previously identified as biological functions under positive selection. Notably, GSSA identifies conserved functional modules with many positively selected genes, which questions whether they are exclusively selected for fitting genomes to environmental changes. Our results agree with previous studies suggesting that adaptation requires positive selection, but not every mutation under positive selection contributes to the adaptive dynamical process of the evolution of species.
Previous comparative genomic studies of genes involved in olfactory behavior in Drosophila focused only on particular gene families such as odorant receptors and/or odorant-binding proteins. However, olfactory behavior has a complex genetic architecture that is orchestrated by many interacting genes. In this paper, we present a comparative genomic study of olfactory behavior in Drosophila including an extended set of genes known to affect olfactory behavior. We took advantage of the recent burst of whole genome sequences and the development of powerful statistical tools to analyze genomic data and test evolutionary and functional hypotheses of olfactory genes in the six species of the Drosophila melanogaster species group for which whole genome sequences are available. Our study reveals widespread purifying selection and limited incidence of positive selection on olfactory genes. We show that the pace of evolution of olfactory genes is mostly independent of the life cycle stage, and of the number of life cycle stages, in which they participate in olfaction. However, we detected a relationship between evolutionary rates and the position that the gene products occupy in the olfactory system: genes occupying central positions tend to be more constrained than peripheral genes. Finally, we demonstrate that specialization to one host does not seem to be associated with bursts of adaptive evolution in olfactory genes in D. sechellia and D. erecta, the two specialist species analyzed; rather, different lineages have idiosyncratic evolutionary histories in which both historical and ecological factors have been involved.
For decades, it has been hypothesized that gene regulation has had a central role in human evolution, yet much remains unknown about the genome-wide impact of regulatory mutations. Here we use whole-genome sequences and genome-wide chromatin immunoprecipitation and sequencing data to demonstrate that natural selection has profoundly influenced human transcription factor binding sites since the divergence of humans from chimpanzees 4-6 million years ago. Our analysis uses a new probabilistic method, called INSIGHT, for measuring the influence of selection on collections of short, interspersed noncoding elements. We find that, on average, transcription factor binding sites have experienced somewhat weaker selection than protein-coding genes. However, the binding sites of several transcription factors show clear evidence of adaptation. Several measures of selection are strongly correlated with predicted binding affinity. Overall, regulatory elements seem to contribute substantially to both adaptive substitutions and deleterious polymorphisms with key implications for human evolution and disease.
Abstract
DNA methylation is a genetic occurrence that plays a principal role in the epigenetic regulation of gene expression. Genome-wide association studies (GWAS) have linked differential methylation patterns, especially at cytosines in CpG context, to disease states like cancer and obesity. Well-known chemical and enzymatic techniques are available to detect the methylation state of cytosines throughout the genome. With Next-Generation Sequencing (NGS) becoming increasingly affordable and high-throughput, methylation sequencing has become a popular and accessible approach for disease detection. Correspondingly, there is an increasing demand for targeted methylation sequencing techniques that enrich for CpG-rich genomic regions of interest. Such techniques have historically been inefficient due to the reduced sequence complexity caused by bisulfite or enzymatic library preparation methods. These methods involve the conversion of unmethylated cytosines to uracils, subsequently resulting in thymines after PCR amplification, and inherently increase similarities between library sequences in a given population. This effect causes issues with specificity during hybrid capture since baits and targets with repetitive sub-sequences can incidentally hybridize to non-targeted molecules during capture.
In this poster, we present the Twist Targeted Methylation Sequencing Product, an end-to-end solution for preparing methyl-seq libraries and performing in-solution hybrid capture. We demonstrate benefits of using the NEBNext Enzymatic Methyl-seq Methylation Library Preparation protocol in our system, compared to traditional bisulfite treatment, and discuss approaches that were taken to maximize target enrichment specificity and uniformity. First, we introduce Twist's custom panel design approach, which leverages Twist's oligo manufacturing capabilities and uses predictive methods to remove probes most likely to cause poor capture performance. Second, we present methylation-specific Twist Fast Hybridization Target Enrichment protocol tunability, showing how FastHyb Wash Buffer 1 temperature and hybridization time can be used to manipulate performance metrics. Third, we introduce Twist's Methylation Enhancer, a novel blocker set that has been shown to alleviate capture specificity issues across a wide range of methylation panels. Together, the data in this study showcase the Twist Targeted Methylation Sequencing Product as an efficient and flexible solution for custom methylation detection.
Citation Format: Brenton I. Graham, Leonardo Arbiza, Kristin Butcher, Siyuan Chen, Richard Gantt, Sabina Gude, Rebecca Nugent, Christina Thompson, Ramsey Zeitoun. Twist Fast Hybridization targeted methylation sequencing: a tunable target enrichment solution for methylation detection [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 2098.
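The loss of sequence complexity described in the abstract above (unmethylated cytosines read as thymines after conversion and PCR) can be illustrated with a small in-silico sketch. The function name `in_silico_convert` and the example sequences are hypothetical, and the sketch assumes complete conversion on a single strand; it is not part of the Twist or NEBNext protocols.

```python
def in_silico_convert(seq, methylated=frozenset()):
    """Simulate bisulfite/enzymatic conversion followed by PCR on one strand.

    Unmethylated C is read as T; a C whose 0-based index is listed in
    `methylated` is protected and stays C. Assumes 100% conversion
    efficiency (an illustrative simplification).
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


# Hypothetical fragment: only the C at index 1 (first CpG) is methylated.
converted = in_silico_convert("ACGTCCGA", methylated={1})
print(converted)

# Two fragments that differ before conversion become identical afterwards,
# which is why baits can hybridize to non-targeted molecules.
a = in_silico_convert("ACCT")
b = in_silico_convert("ATCT")
print(a, b, a == b)
```

After conversion the library is dominated by three bases, so fragments that were distinguishable in the genome can collapse onto the same sequence, which is the specificity problem the abstract refers to.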
The ratio of genetic diversity on chromosome X to that on the autosomes is sensitive to both natural selection and demography. On the basis of whole-genome sequences of 69 females, we report that whereas this ratio increases with genetic distance from genes across populations, it is lower in Europeans than in West Africans independent of proximity to genes. This relative reduction is most parsimoniously explained by differences in demographic history without the need to invoke natural selection.
In eutherian mammals, X-linked gene expression is normalized between XX females and XY males through the process of X chromosome inactivation (XCI). XCI results in silencing of transcription from one ChrX homolog per female cell. However, approximately 25% of human ChrX genes escape XCI to some extent and exhibit biallelic expression in females. The evolutionary basis of this phenomenon is not entirely clear, but high sequence conservation of XCI escapers suggests that purifying selection may directly or indirectly drive XCI escape at these loci. One hypothesis is that this signal results from contributions to developmental and physiological sex differences, but presently there is limited evidence supporting this model in humans. Another potential driver of this signal is selection for high and/or broad gene expression in both sexes, which are strong predictors of reduced nucleotide substitution rates in mammalian genes. Here, we compared purifying selection and gene expression patterns of human XCI escapers with those of X-inactivated genes in both sexes. When we accounted for the functional status of each ChrX gene's Y-linked homolog (or "gametolog"), we observed that XCI escapers exhibit greater degrees of purifying selection in the human lineage than X-inactivated genes, as well as higher and broader gene expression than X-inactivated genes across tissues in both sexes. These results highlight a significant role for gene expression in both sexes in driving purifying selection on XCI escapers, and emphasize these genes' potential importance in human disease.
The gradual accumulation of mutations by any of a number of mutational processes is a major driving force of divergence and evolution. Here, we investigate a potentially novel mutational process that is based on the activity of members of the AID/APOBEC family of deaminases. This gene family has recently been shown to introduce, in multiple types of cancer, enzyme-induced clusters of co-occurring somatic mutations caused by cytosine deamination. Going beyond somatic mutations, we hypothesized that APOBEC3, following its rapid expansion in primates, can introduce unique germline mutation clusters that can play a role in primate evolution. In this study, we tested this hypothesis by performing a comprehensive comparative genomic screen for APOBEC3-induced mutagenesis patterns across different hominids. We detected thousands of mutation clusters introduced along primate evolution which exhibit features that strongly fit the known patterns of APOBEC3G mutagenesis. These results suggest that APOBEC3G-induced mutations have contributed to the evolution of all genomes we studied. This is the first indication of site-directed, enzyme-induced genome evolution, which played a role in the evolution of both modern and archaic humans. This novel mutational mechanism exhibits several unique features, such as its higher tendency to mutate transcribed regions and regulatory elements and its ability to generate clusters of concurrent point mutations that all occur in a single generation. Our discovery demonstrates the exaptation of an anti-viral mechanism as a new source of genomic variation in hominids with a strong potential for functional consequences.
Abstract
Library construction for Next-Generation Sequencing (NGS) using formalin-fixed paraffin-embedded (FFPE) samples offers unique challenges in acquiring high-quality sequencing data due to the wide distribution of sample quality. Specifically, differences in formalin fixation methods lead to crosslinked and/or degraded nucleic acid and inconsistent extraction yields. Hence, FFPE extraction and library construction methods must be carefully considered for target enrichment applications. In collaboration, Covaris and Twist Bioscience demonstrate a complete library preparation and target enrichment solution that generates ready-to-sequence multiplexed libraries directly from FFPE tissue. This workflow leverages the Covaris truXTRAC FFPE total Nucleic Acid Plus Kit and oneTUBE-10 shearing with the world-class performance of Twist Bioscience’s Target Enrichment Solutions. Covaris, the Gold Standard for DNA shearing in NGS, also offers pre-analytical products that leverage Adaptive Focused Acoustics® (AFA®) technology. In this FFPE-specific application, the Covaris truXTRAC FFPE total Nucleic Acid Plus Kit and oneTUBE-10 shearing enable full emulsification of paraffin and disaggregation of tissue for highly efficient nucleic acid extraction and generation of size-specific DNA libraries. With the Twist Bioscience Human Core Exome kit, the resulting libraries are indexed, pooled, and target enriched with uniquely optimized DNA probes to generate ready-to-sequence high-quality multiplexed libraries. Using the aforementioned workflow, results from processing numerous FFPE tissue types are presented. Sequencing results demonstrate large improvements in general Picard metrics that include uniformity (Fold_80 < 1.8), sequencing depth (20X coverage >85% with 100X downsampling), and duplication rates (<6%) when compared to similar published studies.
These results demonstrate a validated solution for library preparation and targeted exome sequencing of FFPE samples that can be integrated into automated workflows. The truXTRAC kit and AFA® technology from Covaris generate size-specific DNA libraries from FFPE samples which, when paired with Twist Bioscience’s superior target enrichment workflow, deliver multiplexed libraries for high-performance targeted sequencing.
Citation Format: Mark Consugar, Leonardo Arbiza, Kristin Butcher, Siyuan Chen, Hutson Chilton, Richard Gantt, Yehudit Hasin-Brumshtein, Jim Laugharn, Jayne Simon, Ulrich Thomann, Christina Thompson, Ramsey Zeitoun. High performance multiplexed targeted enrichment sequencing from FFPE tissues [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2019; 2019 Mar 29-Apr 3; Atlanta, GA. Philadelphia (PA): AACR; Cancer Res 2019;79(13 Suppl):Abstract nr 3544.
We have developed a web tool, PupaSuite, for the selection of single nucleotide polymorphisms (SNPs) with potential phenotypic effect, specifically oriented to help in the design of large-scale genotyping projects. PupaSuite uses a collection of data on SNPs from heterogeneous sources and a large number of pre-calculated predictions to offer a flexible and intuitive interface for selecting an optimal set of SNPs. It improves the functionality of the PupaSNP and PupasView programs and implements new facilities such as the analysis of user's data to derive haplotypes with functional information. A new estimator of the putative effect of polymorphisms, which uses evolutionary information, has been included. SNPeffect database predictions have also been included. The PupaSuite web interface is accessible through http://pupasuite.bioinfo.cipf.es and through http://www.pupasnp.org.
At a genomic scale, the patterns that have shaped molecular evolution are believed to be largely heterogeneous. Consequently, comparative analyses should use appropriate probabilistic substitution models that capture the main features under which different genomic regions have evolved. While efforts have concentrated on the development and understanding of model selection techniques, no descriptions of overall relative substitution model fit at the genome level have been reported. Here, we provide a characterization of best-fit substitution models across three genomic data sets including coding regions from mammals, vertebrates, and Drosophila (24,000 alignments). According to the Akaike Information Criterion (AIC), 82 of 88 models considered were selected as best-fit models on at least one occasion, although with very different frequencies. Most parameter estimates also varied broadly among genes. Patterns found for vertebrates and Drosophila were quite similar and often more complex than those found in mammals. Phylogenetic trees derived from models in the 95% confidence interval set showed much less variance and were significantly closer to the tree estimated under the best-fit model than trees derived from models outside this interval. Although alternative criteria selected simpler models than the AIC, they suggested similar patterns. Altogether, our results show that at a genomic scale, different gene alignments for the same set of taxa are best explained by a large variety of different substitution models and that model choice has implications for different parameter estimates including the inferred phylogenetic trees. After taking into account the differences related to sample size, our results suggest a noticeable diversity in the underlying evolutionary process. We conclude that the use of model selection techniques is important to obtain consistent phylogenetic estimates from real data at a genomic scale.
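As an illustration of the AIC-based selection mentioned in the abstract above, the following minimal sketch ranks candidate substitution models by AIC = 2k - 2 ln L and builds a 95% confidence set from cumulative Akaike weights. This is one common construction and is assumed here, not confirmed as the authors' exact procedure; the model names, log-likelihoods, and parameter counts are hypothetical values chosen for the example.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_scores):
    """Convert AIC scores into normalized Akaike weights."""
    best = min(aic_scores.values())
    rel = {m: math.exp(-0.5 * (s - best)) for m, s in aic_scores.items()}
    total = sum(rel.values())
    return {m: r / total for m, r in rel.items()}


def confidence_set(weights, level=0.95):
    """Smallest set of top-weighted models whose weights sum to >= level."""
    chosen, cumulative = [], 0.0
    for model, w in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        chosen.append(model)
        cumulative += w
        if cumulative >= level:
            break
    return chosen


# Hypothetical fits for one alignment: model -> (log-likelihood, free parameters)
fits = {"JC": (-5230.4, 0), "HKY": (-5105.9, 4), "GTR+G": (-5098.2, 9)}

scores = {m: aic(lnl, k) for m, (lnl, k) in fits.items()}
best_model = min(scores, key=scores.get)
weights = akaike_weights(scores)
models_95 = confidence_set(weights)

print(best_model)
print(models_95)
```

In a genome-scale analysis this would be repeated per alignment, tallying how often each model is selected as best-fit.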