Local-ancestry inference is an important step in the genetic analysis of fully sequenced human genomes. Current methods can only detect continental-level ancestry (i.e., European versus African versus Asian) accurately even when using millions of markers. Here, we present RFMix, a powerful discriminative modeling approach that is faster (∼30×) and more accurate than existing methods. We accomplish this by using a conditional random field parameterized by random forests trained on reference panels. RFMix is capable of learning from the admixed samples themselves to boost performance and autocorrect phasing errors. RFMix shows high sensitivity and specificity in simulated Hispanics/Latinos and African Americans and admixed Europeans, Africans, and Asians. Finally, we demonstrate that African Americans in HapMap contain modest (but nonzero) levels of Native American ancestry (∼0.4%).
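As a rough illustration of the chain-smoothing idea behind such a model (not RFMix's actual implementation), suppose some classifier, for instance a random forest, has already assigned each genomic window a probability for each ancestry. A linear-chain model can then choose the most probable ancestry path by penalizing switches between adjacent windows. The sketch below uses plain Viterbi decoding; the function name `smooth_ancestry`, the `switch_prob` parameter, and the toy probabilities are all assumptions made for illustration.

```python
import math


def smooth_ancestry(window_probs, switch_prob=0.05):
    """Viterbi decoding over per-window ancestry probabilities.

    window_probs[w][k] is a hypothetical classifier probability that
    window w has ancestry k. switch_prob is an assumed probability of an
    ancestry switch between adjacent windows, split evenly among the
    other ancestries. Returns the most probable ancestry index per window.
    """
    n_anc = len(window_probs[0])
    log_stay = math.log(1.0 - switch_prob)
    log_switch = math.log(switch_prob / (n_anc - 1))
    eps = 1e-12  # guards against log(0)

    def trans(j, k):
        return log_stay if j == k else log_switch

    # Best log-score of any path ending in each ancestry at window 0.
    score = [math.log(p + eps) for p in window_probs[0]]
    backptrs = []
    for probs in window_probs[1:]:
        new_score, ptr = [], []
        for k in range(n_anc):
            best_prev = max(range(n_anc), key=lambda j: score[j] + trans(j, k))
            ptr.append(best_prev)
            new_score.append(
                score[best_prev] + trans(best_prev, k) + math.log(probs[k] + eps)
            )
        score = new_score
        backptrs.append(ptr)

    # Trace the best path back from the final window.
    state = max(range(n_anc), key=lambda k: score[k])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Toy input (made up): two ancestries, seven windows, one noisy window.
toy_probs = [
    [0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1],
    [0.1, 0.9], [0.2, 0.8], [0.1, 0.9],
]
per_window_argmax = [max(range(2), key=lambda k: p[k]) for p in toy_probs]
smoothed = smooth_ancestry(toy_probs)
```

On this toy input, the per-window argmax calls an isolated switch at the third window, whereas the smoothed path keeps a single ancestry change. A conditional random field of the kind described above is more elaborate than this sketch, which only shows why modeling neighboring windows jointly suppresses spurious switches.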
There is great scientific and popular interest in understanding the genetic history of populations in the Americas. We wish to understand when different regions of the continent were inhabited, where settlers came from, and how current inhabitants relate genetically to earlier populations. Recent studies unraveled parts of the genetic history of the continent using genotyping arrays and uniparental markers. The 1000 Genomes Project provides a unique opportunity for improving our understanding of population genetic history by providing over a hundred sequenced low-coverage genomes and exomes from Colombian (CLM), Mexican-American (MXL), and Puerto Rican (PUR) populations. Here, we explore the genomic contributions of African, European, and especially Native American ancestry to these populations. Estimated Native American ancestry is 48% in MXL, 25% in CLM, and 13% in PUR. Native American ancestry in PUR is most closely related to populations surrounding the Orinoco River basin, confirming the South American ancestry of the Taíno people of the Caribbean. We present new methods to estimate the allele frequencies in the Native American fraction of the populations, and model their distribution using a demographic model for three ancestral Native American populations. These ancestral populations likely split in close succession: the most likely scenario, based on a peopling of the Americas 16 thousand years ago (kya), supports that the MXL ancestors split 12.2 kya, with a subsequent split of the ancestors of CLM and PUR 11.7 kya. The model also features effective population sizes of 62,000 in Mexico, 8,700 in Colombia, and 1,900 in Puerto Rico. Modeling identity-by-descent (IBD) and ancestry tract length, we show that post-contact populations also differ markedly in their effective sizes and migration patterns, with Puerto Rico showing the smallest effective size and the earlier migration from Europe. Finally, we compare IBD and ancestry assignments to find evidence for relatedness among European founders of the three populations.
The Out-of-Africa (OOA) dispersal ∼50,000 years ago is characterized by a series of founder events as modern humans expanded into multiple continents. Population genetics theory predicts an increase of mutational load in populations undergoing serial founder effects during range expansions. To test this hypothesis, we have sequenced full genomes and high-coverage exomes from seven geographically divergent human populations from Namibia, Congo, Algeria, Pakistan, Cambodia, Siberia, and Mexico. We find that individual genomes vary modestly in the overall number of predicted deleterious alleles. We show via spatially explicit simulations that the observed distribution of deleterious allele frequencies is consistent with the OOA dispersal, particularly under a model where deleterious mutations are recessive. We conclude that there is a strong signal of purifying selection at conserved genomic positions within Africa, but that many predicted deleterious mutations have evolved as if they were neutral during the expansion out of Africa. Under a model where selection is inversely related to dominance, we show that OOA populations are likely to have a higher mutation load due to increased allele frequencies of nearly neutral variants that are recessive or partially recessive.
Human genetic diversity in southern Europe is higher than in other regions of the continent. This difference has been attributed to postglacial expansions, the demic diffusion of agriculture from the Near East, and gene flow from Africa. Using SNP data from 2,099 individuals in 43 populations, we show that estimates of recent shared ancestry between Europe and Africa are substantially increased when gene flow from North Africans, rather than Sub-Saharan Africans, is considered. The gradient of North African ancestry accounts for previous observations of low levels of sharing with Sub-Saharan Africa and is independent of recent gene flow from the Near East. The source of genetic diversity in southern Europe has important biomedical implications; we find that most disease risk alleles from genome-wide association studies follow expected patterns of divergence between Europe and North Africa, with the principal exception of multiple sclerosis.
Local ancestry inference is an important step in both medical genetics studies and demographic studies. This is because many human populations are the result of admixture, or the interbreeding of distinct ancestral populations. The recent drastic increase in sample sizes and marker densities of population genetic data, particularly from whole-genome sequencing, provides an opportunity for computational methods to harness these data to accurately infer fine-scale local ancestry. However, current approaches to inferring local ancestry can only detect continental-level ancestry accurately and are too computationally complex to handle fully sequenced human genomes. Thus there is a need for methods that can utilize massive population genetics data sets to infer fine-scale ancestry in a computationally rapid and robust manner. In this thesis, I describe my contributions toward this goal. First, I describe a method I developed called RFMix, which uses conditional random fields parameterized by random forests to rapidly train on massive data sets, infer fine-scale local ancestry, and correct phase. Second, I evaluate RFMix using simulated and real data sets and compare it to other methods. I also apply RFMix to real data sets to infer demographic histories. Finally, I develop a pipeline for generating reference panels from large databases containing mislabeled and unlabeled samples, and apply it to the massive AncestryDNA genetic database to show that using local ancestry inference as an intermediate analysis step gives better global ancestry estimates than traditional direct approaches.
Abstract
Background
Studies estimate that 30%–50% of antibiotics prescribed for hospitalized patients are inappropriate, but pediatric data are limited. Characterization of inappropriate prescribing practices for children is needed to guide pediatric antimicrobial stewardship.
Methods
Cross-sectional analysis of antibiotic prescribing at 32 children’s hospitals in the United States. Subjects included hospitalized children with ≥ 1 antibiotic order at 8:00 am on 1 day per calendar quarter, over 6 quarters (quarter 3 2016–quarter 4 2017). Antimicrobial stewardship program (ASP) physicians and/or pharmacists used a standardized survey to collect data on antibiotic orders and evaluate appropriateness. The primary outcome was the percentage of antibiotics prescribed for infectious use that were classified as suboptimal, defined as inappropriate or needing modification.
Results
Of 34 927 children hospitalized on survey days, 12 213 (35.0%) had ≥ 1 active antibiotic order. Among 11 784 patients receiving antibiotics for infectious use, 25.9% were prescribed ≥ 1 suboptimal antibiotic. Of the 17 110 antibiotic orders prescribed for infectious use, 21.0% were considered suboptimal. Most common reasons for inappropriate use were bug–drug mismatch (27.7%), surgical prophylaxis > 24 hours (17.7%), overly broad empiric therapy (11.2%), and unnecessary treatment (11.0%). The majority of recommended modifications were to stop (44.7%) or narrow (19.7%) the drug. ASPs would not have routinely reviewed 46.1% of suboptimal orders.
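The headline percentage can be reproduced from the reported counts. The short sketch below recomputes the point prevalence of antibiotic use and converts the two reported suboptimal percentages into approximate counts. These implied counts are not stated in the abstract and are only as precise as the rounded percentages allow; the variable names are illustrative.

```python
# Counts reported in the Results paragraph.
hospitalized = 34_927
with_antibiotic_order = 12_213
patients_infectious_use = 11_784
orders_infectious_use = 17_110


def pct(part, whole):
    """Percentage of whole represented by part, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Point prevalence of children with at least one active antibiotic order.
prevalence = pct(with_antibiotic_order, hospitalized)

# Approximate counts implied by the reported percentages (25.9% of patients,
# 21.0% of orders). These are back-calculated estimates, not reported data.
approx_patients_suboptimal = round(patients_infectious_use * 0.259)
approx_orders_suboptimal = round(orders_infectious_use * 0.210)

print(f"Prevalence of antibiotic use: {prevalence}%")
print(f"Patients with a suboptimal antibiotic (approx.): {approx_patients_suboptimal}")
print(f"Suboptimal orders (approx.): {approx_orders_suboptimal}")
```

The recomputed prevalence matches the reported 35.0%; the implied counts are roughly 3,050 patients and 3,590 orders.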
Conclusions
Across 32 children’s hospitals, approximately 1 in 3 hospitalized children are receiving 1 or more antibiotics at any given time. One-quarter of these children are receiving suboptimal therapy, and nearly half of suboptimal use is not captured by current ASP practices.
At US children’s hospitals, 35% of children are receiving 1 or more antibiotics at any given time, and 26% of these children are receiving suboptimal antibiotics. Nearly half of suboptimal antibiotics are not reviewed by antimicrobial stewardship programs.
Although many children's hospitals have established antimicrobial stewardship programs (ASPs), data-driven benchmarks for optimizing antimicrobial use across centers are lacking. We developed a multicenter quality improvement collaborative focused on sharing data reports and benchmarking antimicrobial use to improve antimicrobial prescribing among hospitalized children.
A national antimicrobial stewardship collaborative among children's hospitals, Sharing Antimicrobial Reports for Pediatric Stewardship (SHARPS), was established in 2013. Characteristics of the hospitals and their ASPs were obtained through a standardized survey. Antimicrobial-use data reports were developed on the basis of input from the participating hospitals. Collaborative learning opportunities were provided through monthly webinars and annual meetings.
Since 2013, 36 US hospitals have participated in the SHARPS collaborative. The median full-time equivalent (pharmacist and physician) dedicated to 30 of these ASPs was 0.75 (interquartile range, 0.45-1.4). To date, the collaborative has developed 26 data reports that include benchmarking reports according to specific antimicrobial agents, indications, and clinical service lines. The collaborative has conducted 27 webinars and 3 in-person meetings to highlight the stewardship work being conducted in the hospitals. The data reports and learning opportunities have resulted in approximately 36 distinct stewardship interventions.
A pediatric antimicrobial stewardship collaborative has been successful in promoting the development of and innovation among pediatric ASPs. Additional research is needed to determine the impact of these efforts.