Genetic adaptation to external stimuli occurs through the combined action of mutation and selection. A central problem in genetics is to identify loci responsive to specific selective constraints. Many tests have been proposed to identify the genomic signatures of natural selection by quantifying the skew in the site frequency spectrum (SFS) under selection relative to neutrality. We build upon recent work that connects many of these tests under a common framework, by describing how selective sweeps affect the scaled SFS. We show that the specific skew depends on many attributes of the sweep, including the selection coefficient and the time under selection. Using supervised learning on extensive simulated data, we characterize the features of the scaled SFS that best separate different types of selective sweeps from neutrality. We develop a test, SFselect, that consistently outperforms many existing tests over a wide range of selective sweeps. We apply SFselect to polymorphism data from a laboratory evolution experiment of Drosophila melanogaster adapted to hypoxia and identify loci that strengthen the role of the Notch pathway in hypoxia tolerance, but were missed by previous approaches. We further apply our test to human data and identify regions that are in agreement with earlier studies, as well as many novel regions.
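As a minimal sketch of the quantity this abstract refers to (not the SFselect implementation), the Python below computes an unfolded SFS from a 0/1 haplotype matrix and scales each frequency class by its derived-allele count. Under the standard neutral model, the expected number of sites with i derived copies is proportional to 1/i, so the scaled entries i·ξ_i share one expectation and a sweep shows up as a departure from a flat profile. The function names and the input encoding (rows are chromosomes, 1 marks the derived allele) are assumptions made for illustration.

```python
from collections import Counter


def site_frequency_spectrum(haplotypes):
    """Unfolded SFS from a list of n equal-length 0/1 rows (1 = derived allele).

    Returns [xi_1, ..., xi_(n-1)], where xi_i is the number of sites at which
    exactly i of the n chromosomes carry the derived allele. Sites with 0 or n
    derived copies are not segregating and are left out.
    """
    n = len(haplotypes)
    # zip(*rows) walks the matrix column by column, i.e. site by site.
    derived_counts = [sum(site) for site in zip(*haplotypes)]
    tally = Counter(derived_counts)
    return [tally.get(i, 0) for i in range(1, n)]


def scaled_sfs(xi):
    """Scale each class by its derived-allele count: entry i becomes i * xi_i."""
    return [i * count for i, count in enumerate(xi, start=1)]


# Toy data (hypothetical): 4 chromosomes, 4 sites; the last site is fixed.
haplotypes = [
    [1, 0, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
xi = site_frequency_spectrum(haplotypes)
scaled = scaled_sfs(xi)
```

On real data the spectrum would be computed per genomic window, and the windowed scaled SFS vectors would serve as the input features for a classifier or a summary statistic.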
The translation of "next-generation" sequencing directly to the clinic is still being assessed, but for genetic diseases it has the potential to reduce costs, improve accuracy, and point to unsuspected yet treatable conditions. To study its capability in the clinic, we performed whole-exome sequencing in 118 probands with a diagnosis of a pediatric-onset neurodevelopmental disease in which most known causes had been excluded. Twenty-two genes not previously identified as disease-causing were identified in this study (19% of cohort), further establishing exome sequencing as a useful tool for gene discovery. New genes identified included EXOC8 in Joubert syndrome and GFM2 in a patient with microcephaly, simplified gyral pattern, and insulin-dependent diabetes. Exome sequencing uncovered 10 probands (8% of cohort) with mutations in genes known to cause a disease different from the initial diagnosis. Upon further medical evaluation, these mutations were found to account for each proband's disease, leading to a change in diagnosis and, in some cases, to changes in patient management. Our data provide proof of principle that genomic strategies are useful in clarifying diagnosis in a proportion of patients with neurodevelopmental disorders.
Through long-term laboratory selection (over 200 generations), we have generated Drosophila melanogaster populations that tolerate severe, normally lethal, levels of hypoxia. Because initial experiments suggested that genetic mechanisms underlie this adaptation, we compared the genomes of the hypoxia-selected flies with those of controls using deep resequencing. By applying unique computing and analytical methods we identified a number of DNA regions under selection, mostly on the X chromosome. Several of the hypoxia-selected regions contained genes encoding or regulating the Notch pathway. In addition, previous expression profiling revealed an activation of the Notch pathway in the hypoxia-selected flies. We confirmed the contribution of Notch activation to hypoxia tolerance using a specific γ-secretase inhibitor, N-[N-(3,5-Difluorophenacetyl)-L-alanyl]-S-phenylglycine t-butyl ester (DAPT), which significantly reduced adult survival and life span in the hypoxia-selected flies. We also demonstrated that flies with loss-of-function Notch mutations or RNAi-mediated Notch knockdown had a significant reduction in hypoxia tolerance, whereas those with a gain-of-function mutation showed a dramatic effect in the opposite direction. Using the UAS-Gal4 system, we also showed that specific overexpression of the Notch intracellular domain in glial cells was critical for conferring hypoxia tolerance. Unique analytical tools and genetic and bioinformatic strategies allowed us to discover that Notch activation plays a major role in this hypoxia tolerance in Drosophila melanogaster.
Although it has long been proposed that genetic factors contribute to adaptation to high altitude, such factors remain largely unverified. Recent advances in high-throughput sequencing have made it feasible to analyze genome-wide patterns of genetic variation in human populations. Since traditionally such studies surveyed only a small fraction of the genome, interpretation of the results was limited.
We report here the results of the first whole genome resequencing-based analysis identifying genes that likely modulate high altitude adaptation in native Ethiopians residing at 3,500 m above sea level on Bale Plateau or Chennek field in Ethiopia. Using cross-population tests of selection, we identify regions with a significant loss of diversity, indicative of a selective sweep. We focus on a 208 kbp gene-rich region on chromosome 19, which is significant in both of the Ethiopian subpopulations sampled. This region contains eight protein-coding genes and spans 135 SNPs. To elucidate its potential role in hypoxia tolerance, we experimentally tested whether individual genes from the region affect hypoxia tolerance in Drosophila. Three genes significantly impact survival rates in low oxygen: cic, an ortholog of human CIC, Hsl, an ortholog of human LIPE, and Paf-AHα, an ortholog of human PAFAH1B3.
Our study reveals evolutionarily conserved genes that modulate hypoxia tolerance. In addition, we show that many of our results would likely be unattainable using data from exome sequencing or microarray studies. This highlights the importance of whole genome sequencing for investigating adaptation by natural selection.
The hypoxic conditions at high altitudes present a challenge for survival, causing pressure for adaptation. Interestingly, many high-altitude denizens (particularly in the Andes) are maladapted, with a condition known as chronic mountain sickness (CMS) or Monge disease. To decode the genetic basis of this disease, we sequenced and compared the whole genomes of 20 Andean subjects (10 with CMS and 10 without). We discovered 11 regions genome-wide with significant differences in haplotype frequencies consistent with selective sweeps. In these regions, two genes (an erythropoiesis regulator, SENP1, and an oncogene, ANP32D) had a higher transcriptional response to hypoxia in individuals with CMS relative to those without. We further found that downregulating the orthologs of these genes in flies dramatically enhanced survival rates under hypoxia, demonstrating that suppression of SENP1 and ANP32D plays an essential role in hypoxia tolerance. Our study provides an unbiased framework to identify and validate the genetic basis of adaptation to high altitudes and identifies potentially targetable mechanisms for CMS treatment.
Research into hypoxia (or low oxygen levels) has been a hot topic for a number of decades, because many harmful diseases, such as heart attacks and cancer, cause much of their damage by inducing hypoxia. It has been suspected for years that the ability of a cell (or an organism) to cope with a hypoxic environment is, at least in part, influenced by genetic factors. However, for financial reasons, virtually all studies that have attempted to find these factors have been constrained to a subset of variant sites (targeted genes, exons, or genotyping arrays). As the costs of sequencing drop, though, whole-genome sequencing will be used increasingly. The primary goal of this dissertation is to build computational tools that use the power of whole-genome sequencing to identify genetic variants that can confer tolerance to hypoxia. Even though the basic computational problem is one of correlation, the experimental design plays a huge role in determining the best way to measure this correlation. First, we discuss ways to identify correlated sites in a typical association study. While the single-locus case is trivial to solve, extending this to multiple loci is intractable using a naive approach. We discuss existing randomized algorithms that solve this problem quickly, and extend these algorithms to handle quantitative phenotypes. We then apply one of these approaches to identify interacting sites correlating with survival rate under acute hypoxia. We then focus on a different problem: detecting natural selection. As the signatures of natural selection are dependent on several parameters, such as selection pressure and time under selection, which are largely unknown, we compare the performance of a number of tests over a wide range of parameters and identify optimal regimes for each of them.
We then select a statistic appropriate for strong, laboratory selection and use it to identify elements of the Notch repression mechanism in flies that have adapted to extreme hypoxia (4% O2). Finally, we apply a number of these statistics to two different populations of humans adapting to mild hypoxia, identifying novel and distinct mechanisms in both cases.
For smaller organisms with faster breeding cycles, artificial selection can be used to create sub-populations with different phenotypic traits. Genetic tests can be employed to identify the causal markers for the phenotypes, as a precursor to engineering strains with a combination of traits. Traditional approaches involve analyzing crosses of inbred strains to test for co-segregation with genetic markers. Here we take advantage of cheaper next-generation sequencing techniques to identify genetic signatures of adaptation to the selection constraints. Obtaining individual sequencing data is often unrealistic due to cost and sample issues, so we focus on pooled genomic data. We explore a series of statistical tests for selection using pooled case (under selection) and control populations. The tests generally capture skews in the scaled frequency spectrum of alleles in a region, which are indicative of a selective sweep. Extensive simulations are used to show that these approaches work well for a wide range of population divergence times and strong selective pressures. Control vs control simulations are used to determine an empirical False Positive Rate (FPR), and regions under selection are determined using a 1% FPR level. We show that pooling does not have a significant impact on statistical power. The tests are also robust to reasonable variations in several different parameters, including window size, base-calling error rate, and sequencing coverage. We then demonstrate the viability (and the challenges) of one of these methods in two independent Drosophila melanogaster populations bred under selection for hypoxia and accelerated development, respectively. Testing for extreme hypoxia tolerance showed clear signals of selection, pointing to loci that are important for hypoxia adaptation. Overall, we outline a strategy for finding regions under selection using pooled sequences, then devise optimal tests for that strategy.
The approaches show promise for detecting selection, even several generations after fixation of the beneficial allele has occurred.
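The empirical-FPR step described in this abstract can be illustrated with a short Python sketch. It assumes that each genomic window already has a test-statistic score, that larger scores indicate stronger evidence of selection, and that control-vs-control comparisons supply the null scores. The function names and the toy numbers are hypothetical, and this is one simple way to set such a cutoff rather than the authors' procedure.

```python
import math


def empirical_threshold(null_scores, fpr=0.01):
    """Cutoff such that at most a fraction `fpr` of null scores strictly exceed it.

    `null_scores` are statistic values from control-vs-control comparisons,
    where no selection is present by construction.
    """
    if not null_scores:
        raise ValueError("need at least one null score")
    ordered = sorted(null_scores)
    # Number of null windows that may exceed the cutoff.
    allowed = math.floor(fpr * len(ordered))
    return ordered[len(ordered) - allowed - 1]


def call_selected(case_scores, threshold):
    """Indices of case-vs-control windows whose score exceeds the cutoff."""
    return [idx for idx, score in enumerate(case_scores) if score > threshold]


# Toy example (hypothetical values): 100 null windows scored 0..99.
null_scores = list(range(100))
threshold = empirical_threshold(null_scores, fpr=0.01)
selected = call_selected([50, 98, 99.5, 120], threshold)
```

With few null windows the achievable FPR is coarse (for fewer than 100 windows at a 1% level, the cutoff becomes the maximum null score), so the number of control-vs-control simulations limits how finely the level can be set.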
The quality of life of breast cancer survivors is maintained by minimizing adverse effects on their physical appearance. In this study, we present an automated method for computing a common measure of breast symmetry, the normalized Breast Retraction Assessment (pBRA), from routine clinical photographs taken to document breast reconstruction procedures.