Human populations feature both discrete and continuous patterns of variation. Current analysis approaches struggle to jointly identify these patterns because of modelling assumptions, mathematical constraints, or numerical challenges. Here we apply uniform manifold approximation and projection (UMAP), a non-linear dimension reduction tool, to three well-studied genotype datasets and discover overlooked subpopulations within the American Hispanic population, fine-scale relationships between geography, genotypes, and phenotypes in the UK population, and cryptic structure in the Thousand Genomes Project data. This approach is well-suited to the influx of large and diverse data and opens new lines of inquiry in population-scale datasets.
A review of UMAP in population genetics
Diaz-Papkovich, Alex; Anderson-Trocmé, Luke; Gravel, Simon
Journal of human genetics, 01/2021, Volume 66, Issue 1
Journal Article
Peer reviewed
Open access
Uniform manifold approximation and projection (UMAP) has been rapidly adopted by the population genetics community to study population structure. It has become common in visualizing the ancestral composition of human genetic datasets, as well as searching for unique clusters of data, and for identifying geographic patterns. Here we give an overview of applications of UMAP in population genetics, provide recommendations for best practices, and offer insights on optimal uses for the technique.
Abstract
Recent reports have identified differences in the mutational spectra across human populations. Although some of these reports have been replicated in other cohorts, most have been reported only in the 1000 Genomes Project (1kGP) data. While investigating an intriguing putative population stratification within the Japanese population, we identified a previously unreported batch effect leading to spurious mutation calls in the 1kGP data and to the apparent population stratification. Because the 1kGP data are used extensively, we find that the batch effects also lead to incorrect imputation by leading imputation servers and a small number of suspicious GWAS associations. Lower quality data from the early phases of the 1kGP thus continue to contaminate modern studies in hidden ways. It may be time to retire or upgrade such legacy sequencing data.
On the genes, genealogies, and geographies of Quebec
Anderson-Trocmé, Luke; Nelson, Dominic; Zabad, Shadi ...
Science (American Association for the Advancement of Science), 05/2023, Volume 380, Issue 6647
Journal Article
Peer reviewed
Open access
Population genetic models only provide coarse representations of real-world ancestry. We used a pedigree compiled from 4 million parish records and genotype data from 2276 French and 20,451 French Canadian individuals to finely model and trace French Canadian ancestry through space and time. The loss of ancestral French population structure and the appearance of spatial and regional structure highlight a wide range of population expansion models. Geographic features shaped migrations, and we find enrichments for migration, genetic, and genealogical relatedness patterns within river networks across regions of Quebec. Finally, we provide a freely accessible simulated whole-genome sequence dataset with spatiotemporal metadata for 1,426,749 individuals reflecting intricate French Canadian population structure. Such realistic population-scale simulations provide opportunities to investigate population genetics at an unprecedented resolution.
Dissimilatory sulfate reduction is a microbial catabolic pathway that preferentially processes less massive sulfur isotopes relative to their heavier counterparts. This sulfur isotope fractionation is recorded in ancient sedimentary rocks and generally is considered to reflect a phenotypic response to environmental variations rather than to evolutionary adaptation. Modern sulfate-reducing microorganisms isolated from similar environments can exhibit a wide range of sulfur isotope fractionations, suggesting that adaptive processes influence the sulfur isotope phenotype. To date, the relationship between evolutionary adaptation and isotopic phenotypes has not been explored. We addressed this by studying the covariation of fitness, sulfur isotope fractionation, and growth characteristics in Desulfovibrio vulgaris Hildenborough in a microbial evolution experiment. After 560 generations, the mean fitness of the evolved lineages relative to the starting isogenic population had increased by ∼ 17%. After 927 generations, the mean fitness relative to the initial ancestral population had increased by ∼ 20%. Growth rate in exponential phase increased during the course of the experiment, suggesting that this was a primary influence behind the fitness increases. Consistent changes were observed within different selection intervals between fractionation and fitness. Fitness changes were associated with changes in exponential growth rate but changes in fractionation were not. Instead, they appeared to be a response to changes in the parameters that govern growth rate: yield and cell-specific sulfate respiration rate. We hypothesize that cell-specific sulfate respiration rate, in particular, provides a bridge that allows physiological controls on fractionation to cross over to the adaptive realm.
The genome sequencing revolution over the past few decades has generated data from increasingly large cohorts of individuals. The analysis of these data has allowed researchers to identify patterns in genetic variation within and between human populations. Differences in allele frequencies across diverse groups of individuals are commonly accounted for in genome-wide association studies to avoid spurious associations. As a result, continental population structure observed in diverse cohorts has been well studied, and has led to many advances in our understanding of deep human history. However, the study of fine-scale structure within populations has only recently become possible as sample sizes of individuals belonging to the same population continue to increase. To this point, genomic data from founder populations have played an important role in investigating demographic factors that can lead to the formation of population structure. The work presented here investigates genetic signatures observed in founder populations as case studies to identify factors that can lead to the formation of fine-scale structure. First we consider a mutational signature observed in the Japanese cohort of the 1000 Genomes Project. Differences in mutational signatures across continental populations have been reported in multiple cohorts. These differences between populations are measured as an enrichment in certain types of mutations. Over time, these mutational signatures can lead to the observation of genetic population structure. The source of these mutational signatures has been hypothesized to be environmental factors or mutator phenotypes. However, we determined that the signature observed in the Japanese population of the 1000 Genomes Project was a technical artefact resulting from sequencing technology batch effects.
We developed new statistical methods that enabled us to identify suspicious variants in the Japanese cohort as well as the rest of the 1000 Genomes Project cohort. We also identified a number of publications whose results will have to be revisited in the light of our findings. Moving beyond technical artefacts, we turn our attention to another well-studied founder population: the French-Canadian (FC) population of Quebec. First, by comparing the genomes of 2,276 French and 20,451 FC individuals, we find the structure observed in the FC population is independent of ancestral French population structure. Then, we generalized the msprime software to perform genome-wide coalescent simulations conditioning on the known pedigree of the FC population and provide a freely accessible simulated whole-genome sequence dataset with spatiotemporal metadata for 1,426,749 individuals reflecting intricate FC population structure. Furthermore, we detail how topography and historical events shaped the present-day FC population. We find enrichments for migration rates, genetic and genealogical relatedness within river networks across Quebec. We expect this high-resolution model of human populations will provide new opportunities to investigate population genetics at an unprecedented scale.
In this thesis, I apply newly developed sequencing methods and analytical techniques to study the source and nature of adaptive variation in the context of a long-term evolution experiment with Chlamydomonas reinhardtii as a model. I focus on two distinct measures of evolutionary dynamics and use a variety of bioinformatic methods to process and analyze the genomic data. The first chapter addresses the question of whether standing genetic variation or novel mutations are the source of adaptive variation. By comparing the local ancestry of each genomic region for each sample, I have been able to determine that standing variation is still present in each population, which indicates that a "hard sweep" has not occurred. The second chapter attempts to describe the function of the genes that have been impacted by biologically significant mutations that have been actively under selection over the course of the experiment. I performed a variant annotation which categorizes the impact of each variant, followed by an estimation of selection coefficient via allele frequency change over the course of the experiment. The combination of these two steps allowed me to limit my enrichment analysis to genes that have been actively under selection. My results suggest that the genes under selection in this experiment are enriched in signaling pathways and gene regulation. Together, these two analyses begin to unveil how sexual populations of moderate size adapt to novel environments.
Melanoma is an immunogenic cancer with a high response rate to immune checkpoint inhibitors (ICIs). It harbors a high mutation burden compared with other cancers and, as a result, has abundant tumor-infiltrating lymphocytes (TILs) within its microenvironment. However, understanding the complex interplay between the stroma, tumor cells, and distinct TIL subsets remains a substantial challenge in immune oncology. To properly study this interplay, quantifying spatial relationships of multiple cell types within the tumor microenvironment is crucial. To address this, we used cytometry time-of-flight (CyTOF) imaging mass cytometry (IMC) to simultaneously quantify the expression of 35 protein markers, characterizing the microenvironment of 5 benign nevi and 67 melanomas. We profiled more than 220,000 individual cells to identify melanoma, lymphocyte subsets, macrophage/monocyte, and stromal cell populations, allowing for in-depth spatial quantification of the melanoma microenvironment. We found that within pretreatment melanomas, the abundance of proliferating antigen-experienced cytotoxic T cells (CD8+CD45RO+Ki67+) and the proximity of antigen-experienced cytotoxic T cells to melanoma cells were associated with positive response to ICIs. Our study highlights the potential of multiplexed single-cell technology to quantify spatial cell-cell interactions within the tumor microenvironment to understand immune therapy responses.