Patterns of genetic diversity within populations of human pathogens, shaped by the ecology of host-microbe interactions, contain important information about the epidemiological history of infectious disease. Exploiting this information, however, requires a systematic approach that distinguishes the genetic signal generated by epidemiological processes from the effects of other forces, such as recombination, mutation, and population history. Here, a variety of quantitative techniques were employed to investigate multilocus sequence information from isolate collections of Neisseria meningitidis, a major cause of meningitis and septicemia worldwide. This allowed quantitative evaluation of alternative explanations for the observed population structure. A coalescent-based approach was employed to estimate the rate of mutation, the rate of recombination, and the size distribution of recombination fragments from samples of disease-associated and carried meningococci obtained in the Czech Republic in 1993 and from a global collection of disease-associated isolates obtained from 1937 to 1996. The parameter estimates were used to reject a model in which genetic structure arose by chance in small populations, and analysis of molecular variation showed that geographically restricted gene flow was unlikely to be the cause of the genetic structure. The genetic differentiation between disease and carriage isolate collections indicated that, whereas certain genotypes were overrepresented among the disease-isolate collections (the "hyperinvasive" lineages), disease-associated and carried meningococci exhibited remarkably little differentiation at the level of individual nucleotide polymorphisms. In combination, these results indicated the repeated action of natural selection on meningococcal populations, possibly arising from the coevolutionary dynamic of host-pathogen interactions.
Gene conversion plays an important part in shaping genetic diversity in populations, yet estimating the rate at which it occurs is difficult because of the short lengths of DNA involved. We have developed a new statistical approach to estimating gene conversion rates from genetic variation, by extending an existing model for haplotype data in the presence of crossover events. We show, by simulation, that when the rate of gene conversion events is at least comparable to the rate of crossover events, the method provides a powerful approach to the detection of gene conversion and estimation of its rate. Application of the method to data from the telomeric X chromosome of Drosophila melanogaster, in which crossover activity is suppressed, indicates that gene conversion occurs approximately 400 times more often than crossover events. We also extend the method to estimating variable crossover and gene conversion rates and estimate the rate of gene conversion to be approximately 1.5 times higher than the crossover rate in a region of human chromosome 1 with known recombination hotspots.
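To illustrate why short conversion tracts make the rate hard to estimate, the sketch below uses a common textbook approximation, which is an assumption here and is not taken from the abstract: crossovers separate two sites at a rate that grows linearly with their distance, whereas gene conversion tracts with exponentially distributed lengths contribute a term that saturates once the distance exceeds the mean tract length. The function name and parameter names are hypothetical.

```python
import math


def effective_recombination_rate(distance_bp, crossover_rate, conversion_rate,
                                 mean_tract_length_bp):
    """Per-generation rate at which two sites are separated by recombination.

    Assumed model (not from the abstract): crossovers occur at
    `crossover_rate` per bp, conversion tracts initiate at `conversion_rate`
    per bp, and tract lengths are exponentially distributed with mean
    `mean_tract_length_bp`.
    """
    # Crossover contribution: linear in the distance between the sites.
    crossover_term = crossover_rate * distance_bp
    # Conversion contribution: a tract separates the sites only if it covers
    # exactly one of them, which saturates at 2 * rate * mean tract length.
    conversion_term = (2.0 * conversion_rate * mean_tract_length_bp
                       * (1.0 - math.exp(-distance_bp / mean_tract_length_bp)))
    return crossover_term + conversion_term
```

Under this approximation, gene conversion is distinguishable from crossover mainly through pairs of sites that lie within a few tract lengths of each other; at larger distances its contribution is a constant that is easily absorbed into the crossover term.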
Lapatinib is associated with a low incidence of serious liver injury. Previous investigations have identified and confirmed the Class II allele HLA-DRB1*07:01 to be strongly associated with lapatinib-induced liver injury; however, the moderate positive predictive value limits its clinical utility. To assess whether additional genetic variants located within the major histocompatibility complex locus or elsewhere in the genome may influence lapatinib-induced liver injury risk, and potentially lead to a genetic association with improved predictive qualities, we have taken two approaches: a genome-wide association study and a whole-genome sequencing study. This evaluation did not reveal additional associations other than the previously identified association for HLA-DRB1*07:01. The present study represents the most comprehensive genetic evaluation of drug-induced liver injury (DILI) or hypersensitivity, and suggests that investigation of possible human leukocyte antigen associations with DILI and other hypersensitivities represents an important first step in understanding the mechanism of these events.
The congenital dyserythropoietic anemias are a heterogeneous group of rare disorders primarily affecting erythropoiesis with characteristic morphological abnormalities and a block in erythroid maturation. Mutations in the CDAN1 gene, which encodes Codanin-1, underlie the majority of congenital dyserythropoietic anemia type I cases. However, no likely pathogenic CDAN1 mutation has been detected in approximately 20% of cases, suggesting the presence of at least one other locus. We used whole genome sequencing and segregation analysis to identify a homozygous T to A transversion (c.533T>A), predicted to lead to a p.L178Q missense substitution in C15ORF41, a gene of unknown function, in a consanguineous pedigree of Middle-Eastern origin. Sequencing C15ORF41 in other CDAN1 mutation-negative congenital dyserythropoietic anemia type I pedigrees identified a homozygous transition (c.281A>G), predicted to lead to a p.Y94C substitution, in two further pedigrees of Southeast Asian origin. The haplotype surrounding the c.281A>G change suggests a founder effect for this mutation in Pakistan. Detailed sequence similarity searches indicate that C15ORF41 encodes a novel restriction endonuclease that is a member of the Holliday junction resolvase family of proteins.
The completion of the International HapMap Project marks the start of a new phase in human genetics. The aim of the project was to provide a resource that facilitates the design of efficient genome-wide association studies, through characterising patterns of genetic variation and linkage disequilibrium in a sample of 270 individuals across four geographical populations. In total, over one million SNPs have been typed across these genomes, providing an unprecedented view of human genetic diversity. In this review we focus on what the HapMap Project has taught us about the structure of human genetic variation and the fundamental molecular and evolutionary processes that shape it.
Using the statistical analysis of genetic variation, we have developed a high-resolution genetic map of recombination hotspots and recombination rate variation across the human genome. This map, which has a resolution several orders of magnitude greater than previous studies, identifies over 25,000 recombination hotspots and gives new insights into the distribution and determination of recombination. Wavelet-based analysis demonstrates scale-specific influences of base composition, coding context and DNA repeats on recombination rates, though, in contrast with other species, no association with DNase I hypersensitivity. We have also identified specific DNA motifs that are strongly associated with recombination hotspots and whose activity is influenced by local context. Comparative analysis of recombination rates in humans and chimpanzees demonstrates very high rates of evolution of the fine-scale structure of the recombination landscape. In the light of these observations, we suggest possible resolutions of the hotspot paradox.
Genetic maps, which document the way in which recombination rates vary over a genome, are an essential tool for many genetic analyses. We present a high-resolution genetic map of the human genome, based on statistical analyses of genetic variation data, and identify more than 25,000 recombination hotspots, together with motifs and sequence contexts that play a role in hotspot activity. Differences between the behavior of recombination rates over large (megabase) and small (kilobase) scales lead us to suggest a two-stage model for recombination in which hotspots are stochastic features, within a framework in which large-scale rates are constrained.
Instances in which natural selection maintains genetic variation in a population over millions of years are thought to be extremely rare. We conducted a genome-wide scan for long-lived balancing selection by looking for combinations of SNPs shared between humans and chimpanzees. In addition to the major histocompatibility complex, we identified 125 regions in which the same haplotypes are segregating in the two species, all but two of which are noncoding. In six cases, there is evidence for an ancestral polymorphism that persisted to the present in humans and chimpanzees. Regions with shared haplotypes are significantly enriched for membrane glycoproteins, and a similar trend is seen among shared coding polymorphisms. These findings indicate that ancient balancing selection has shaped human variation and point to genes involved in host-pathogen interactions as common targets.
The equilibrium per-genome mutation rate in sexual species is thought to result from a trade-off between the benefits of reducing the deleterious mutation rate and the costs of increasing fidelity. We propose that selection will often favour a lower mutation rate on the X chromosome than on autosomes, owing to the exposure of deleterious recessive mutations on hemizygous chromosomes. We tested this hypothesis by examining 33 X-linked genes that have been sequenced in both mouse and rat, and compared their rate of evolution against 238 autosomal genes. The X-linked genes were found to have a significantly lower rate of synonymous substitution than the autosomal genes. Neither the supposed higher mutation rate in males nor stronger purifying selection against slightly deleterious mutations on the X chromosome can account for the low value. The most parsimonious explanation is that rodents have a lower mutation rate on the X chromosome than on autosomes. It is therefore likely that previous indirect estimates of the excess male mutation rate are inaccurate. Indeed, after correction we find no evidence for a male-biased mutation rate in rodents. Furthermore, the rate of synonymous substitution in Y-linked genes is not significantly different from that in autosomal ones. The extent to which enhanced male mutation rates are problematic for the mutational deterministic model of the evolution of sex must, in turn, be questioned.
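As a rough illustration of how an indirect estimate of male mutation bias can be misled, the sketch below assumes the standard argument that an X chromosome spends two thirds of its time in females and an autosome half, so that the X-to-autosome substitution ratio R depends on the male-to-female mutation ratio α as R = 2(2 + α) / (3(1 + α)). This formula and the function names are assumptions introduced for illustration; the abstract does not state which estimator or correction was used.

```python
def x_to_autosome_ratio(alpha):
    """Expected X/autosome substitution-rate ratio for male/female ratio alpha.

    Assumes the only difference between chromosomes is the share of time
    spent in the male germline (X: 1/3, autosomes: 1/2).
    """
    return 2.0 * (2.0 + alpha) / (3.0 * (1.0 + alpha))


def alpha_from_ratio(ratio):
    """Invert the relationship above to estimate alpha from an observed ratio.

    Only defined for 2/3 < ratio < 4/3; outside that range no non-negative
    finite alpha reproduces the ratio under this model.
    """
    if not (2.0 / 3.0 < ratio < 4.0 / 3.0):
        raise ValueError("ratio must lie strictly between 2/3 and 4/3")
    return (4.0 - 3.0 * ratio) / (3.0 * ratio - 2.0)
```

Under this model, an X chromosome whose intrinsic mutation rate is lower than that of the autosomes pushes the observed ratio down, and a naive inversion then reads the deficit as male bias. For example, with no male bias (α = 1) the expected ratio is 1, but an observed ratio of 8/9 would be interpreted as α = 2.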
Genetic variation at classical HLA alleles is a crucial determinant of transplant success and susceptibility to a large number of infectious and autoimmune diseases. However, large-scale studies involving classical type I and type II HLA alleles might be limited by the cost of allele-typing technologies. Although recent studies have shown that some common HLA alleles can be tagged with small numbers of markers [1,2], SNP-based tagging does not offer a complete solution to predicting HLA alleles. We have developed a new statistical methodology to use SNP variation within the region to predict alleles at key class I (HLA-A, HLA-B, and HLA-C) and class II (HLA-DRB1, HLA-DQA1, and HLA-DQB1) loci. Our results indicate that a single panel of ∼100 SNPs typed across the region is sufficient for predicting both rare and common HLA alleles with up to 95% accuracy in both African and non-African populations. Furthermore, we show that HLA alleles can be successfully predicted by using previously genotyped SNPs that are within the MHC and that had not been chosen for their ability to predict HLA alleles, such as those included on genome-wide products. These results indicate that our methodology, combined with an extended database of reference haplotypes, will facilitate large-scale experiments, including disease-association studies and vaccine trials, in which detailed information about HLA type is valuable.