• Compilation of spatial prevalence information on eye and hair pigmentation.
• Combination of recent and older sources, together with an uncertainty assessment.
• Spatial interpolation of trait prevalence.
DNA-based prediction of externally visible characteristics has become an established approach in forensic genetics. Its aim is to trace individuals who are potentially unknown to the investigating authorities, without using the prediction as evidence in court. While a number of prediction models have been proposed, prior probabilities have largely been absent from those models. Here, we aimed to compile information on the spatial distribution of eye and hair coloration in order to use it as prior knowledge to improve prediction accuracy. To this end, we conducted a detailed literature review and created maps showing the prevalence of eye and hair pigmentation, both by country where information was available and by interpolation, in order to obtain prior estimates for populations without available data. Furthermore, we assessed the association between these two traits in a very large data set. A strong limitation was the scarcity of available data, especially outside Europe. We hope that our results will facilitate the improvement of existing prediction methods for pigmentation traits, support the development of novel ones, and induce further studies on the spatial distribution of these traits.
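The role of prior probabilities can be illustrated with a minimal sketch of Bayes' rule. The abstract does not state how priors would enter a given prediction model, so everything here is an assumption: the function name `posterior`, the three eye-colour categories, and all numbers are hypothetical and chosen only to show the arithmetic.

```python
def posterior(prior, likelihood):
    """Combine a population prior with per-category likelihoods via Bayes' rule.

    prior      -- dict: category -> prevalence in the population of interest
    likelihood -- dict: category -> probability of the observed genotype
                  given that category (hypothetical model output)
    """
    unnormalised = {c: prior[c] * likelihood[c] for c in prior}
    total = sum(unnormalised.values())
    if total == 0:
        raise ValueError("prior and likelihood have no overlapping support")
    return {c: v / total for c, v in unnormalised.items()}


# Hypothetical values, for illustration only (not taken from the study).
prior = {"blue": 0.2, "intermediate": 0.1, "brown": 0.7}
likelihood = {"blue": 0.6, "intermediate": 0.2, "brown": 0.2}

for category, p in posterior(prior, likelihood).items():
    print(f"{category}: {p:.3f}")
```

With these made-up inputs, the genotype favours blue eyes, but the high prior prevalence of brown eyes still makes brown the most probable category after updating.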
So far, the annotation of translation initiation sites (TISs) has been based mostly upon bioinformatics rather than experimental evidence. We adapted ribosomal footprinting to puromycin-treated cells to generate a transcriptome-wide map of TISs in a human monocytic cell line. A neural network was trained on the ribosomal footprints observed at previously annotated AUG translation initiation codons (TICs), and used for the ab initio prediction of TISs in 5062 transcripts with sufficient sequence coverage. Functional interpretation suggested 2994 novel upstream open reading frames (uORFs) in the 5' UTR, 1406 uORFs overlapping with the coding sequence, and 546 N-terminal protein extensions. The TIS detection method was validated on the basis of previously published alternative TISs and uORFs. Among primates, TICs in newly annotated TISs were significantly more conserved than control codons, both for AUGs and near-cognate codons. The transcriptome-wide map of novel candidate TISs derived as part of the study will shed further light on the way in which human proteome diversity is influenced by alternative translation initiation and regulation.
The use of next-generation sequencing approaches in clinical diagnostics has led to a tremendous increase in data and a vast number of variants of uncertain significance that require interpretation. Therefore, prediction of the effects of missense mutations using in silico tools has become a frequently used approach. The aim of this study was to assess the reliability of in silico prediction as a basis for clinical decision making in the context of hereditary breast and/or ovarian cancer.
We tested the performance of four prediction tools (Align-GVGD, SIFT, PolyPhen-2, MutationTaster2) using a set of 236 BRCA1/2 missense variants that had previously been classified by expert committees. However, a major pitfall in the creation of a reliable evaluation set for our purpose is the generally accepted classification of BRCA1/2 missense variants using the multifactorial likelihood model, which is partially based on Align-GVGD results. To overcome this drawback we identified 161 variants whose classification is independent of any previous in silico prediction. In addition to the performance as stand-alone tools we examined the sensitivity, specificity, accuracy and Matthews correlation coefficient (MCC) of combined approaches.
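For reference, the four performance measures named above can be computed from a 2×2 confusion matrix as in the following minimal sketch. It uses the standard textbook definitions, with "positive" meaning "predicted pathogenic"; the function name and the example counts are hypothetical and are not the counts from this study.

```python
from math import sqrt


def classification_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, accuracy and MCC for a binary classifier.

    tp/fn -- pathogenic variants predicted pathogenic / neutral
    tn/fp -- neutral variants predicted neutral / pathogenic
    """
    def ratio(num, den):
        # Avoid division by zero when a class is empty.
        return num / den if den else float("nan")

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only.
print(classification_metrics(tp=9, fp=2, tn=8, fn=1))
```

Unlike accuracy, the MCC accounts for all four cells of the confusion matrix, which makes it more informative when pathogenic and neutral variants are unequally represented in the evaluation set.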
PolyPhen-2 achieved the lowest sensitivity (0.67), specificity (0.67), accuracy (0.67) and MCC (0.39). Align-GVGD achieved the highest values of specificity (0.92), accuracy (0.92) and MCC (0.73), but was outperformed regarding its sensitivity (0.90) by SIFT (1.00) and MutationTaster2 (1.00). All tools suffered from poor specificities, resulting in an unacceptable proportion of false positive results in a clinical setting. This shortcoming could not be bypassed by combination of these tools. In the best case scenario, 138 families would be affected by the misclassification of neutral variants within the cohort of patients of the German Consortium for Hereditary Breast and Ovarian Cancer.
We show that, owing to their low specificities, state-of-the-art in silico prediction tools are not suitable for predicting the pathogenicity of variants of uncertain significance in BRCA1/2. Thus, clinical consequences should never be based solely on in silico forecasts. However, our data suggest that SIFT and MutationTaster2 could be suitable for predicting benignity, as neither tool produced false negative predictions in our analysis.
The availability of high-density panels of genetic polymorphisms has led to the discovery of extended regions of apparent autozygosity in the human genome. At the genotype level, these regions present as sizeable stretches, or ‘runs’, of homozygosity (ROH). Here, we investigated both the genomic and the geographic distribution of ROHs in a large European sample of individuals originating from 23 subpopulations. The genomic ROH distribution was found to be characterized by a pattern of highly significant non-uniformity that was virtually identical in all subpopulations studied. Some 77 chromosomal regions contained ROHs at considerable frequency, thereby forming ‘ROH islands’ that were not explicable by high linkage disequilibrium alone. At the geographic level, the number and cumulative length of ROHs followed a prominent South to North gradient in agreement with expectations from European population history. The individual ROH length, in contrast, showed only minor and unsystematic geographic variation. While our findings are thus consistent with a larger effective population size in Southern than in Northern Europe, combined with a higher historic population density and mobility, they also indicate that the patterns of meiotic recombination in humans must have been very similar throughout the continent. Extending previous reports of a strong correlation between geography and identity-by-state, our data show that the genomic identity-by-descent patterns of Europeans are also clinal. As a consequence, the planning, design and interpretation of ROH-based genetic studies must take sample origin into account in order for such studies to be sensible and valid.
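To make the notion of a "run" concrete, the sketch below scans one individual's genotypes along a chromosome and reports maximal stretches of consecutive homozygous SNPs. It is a deliberately simplified illustration and not the detection procedure used in the study: the 0/1/2 genotype coding, the function name `find_roh` and the `min_snps` threshold are assumptions, and practical ROH callers additionally consider physical length, SNP density, missing calls and occasional heterozygous calls.

```python
def find_roh(genotypes, min_snps):
    """Return (start, end) index pairs of homozygous runs of at least min_snps SNPs.

    genotypes -- sequence of minor-allele counts per SNP in map order:
                 0 or 2 = homozygous, 1 = heterozygous (assumed coding)
    min_snps  -- minimum number of consecutive homozygous SNPs to report
    """
    runs = []
    start = None  # index where the current homozygous stretch began
    for i, g in enumerate(genotypes):
        if g in (0, 2):
            if start is None:
                start = i
        else:
            # A heterozygous call ends the current stretch.
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    # Close a stretch that extends to the end of the chromosome.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes) - 1))
    return runs


# Hypothetical toy genotype vector, for illustration only.
print(find_roh([0, 0, 2, 1, 2, 2, 2, 2, 1, 0], min_snps=3))
```

The number of runs and their cumulative length per individual, as discussed above, would then follow from counting the returned intervals and summing their extents.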
Individuals with Dupuytren disease (DD) are commonly seen by physicians and surgeons across multiple specialties. It is an increasingly common and disabling fibroproliferative disorder of the palmar fascia, which leads to flexion contractures of the digits, and is associated with other tissue-specific fibroses. DD affects between 5% and 25% of people of European descent and is the most common inherited disease of connective tissue. We undertook the largest GWAS to date in individuals with a surgically validated diagnosis of DD from the UK, with replication in British, Dutch, and German individuals. We validated association at all nine previously described signals and discovered 17 additional variants with p ≤ 5 × 10⁻⁸. As a proof of principle, we demonstrated correlation of the high-risk genotype at the statistically most strongly associated variant with decreased secretion of the soluble WNT-antagonist SFRP4, in surgical specimen-derived DD myofibroblasts. These results highlight important pathways involved in the pathogenesis of fibrosis, including WNT signaling, extracellular matrix modulation, and inflammation. In addition, many associated loci contain genes that were hitherto unrecognized as playing a role in fibrosis, opening up new avenues of research that may lead to novel treatments for DD and fibrosis more generally. DD represents an ideal human model disease for fibrosis research.
Leprosy, a chronic infectious disease caused by Mycobacterium leprae (M. leprae), was very common in Europe till the 16th century. Here, we perform an ancient DNA study on medieval skeletons from Denmark that show lesions specific for lepromatous leprosy (LL). First, we test the remains for M. leprae DNA to confirm the infection status of the individuals and to assess the bacterial diversity. We assemble 10 complete M. leprae genomes that all differ from each other. Second, we evaluate whether the human leukocyte antigen allele DRB1*15:01, a strong LL susceptibility factor in modern populations, also predisposed medieval Europeans to the disease. The comparison of genotype data from 69 M. leprae DNA-positive LL cases with those from contemporary and medieval controls reveals a statistically significant association in both instances. In addition, we observe that DRB1*15:01 co-occurs with DQB1*06:02 on a haplotype that is a strong risk factor for inflammatory diseases today.
Dupuytren's disease (DD) is a highly heritable fibrotic disorder of the hand with incompletely understood etiology. A number of genetic loci, including Wnt signaling members, have been previously identified. Our overall aim was to identify novel genetic loci, to prioritize genes within the loci for functional studies, and to assess genetic correlation with associated disorders. We performed a meta-analysis of six DD genome-wide association studies from three European countries and extensive bioinformatic follow-up analyses. Leveraging 11,320 cases and 47,023 controls, we identified 85 genome-wide significant single nucleotide polymorphisms in 56 loci, of which 11 were novel, explaining 13.3-38.1% of disease variance. Gene prioritization implicated the Hedgehog and Notch signaling pathways. We also identified a significant genetic correlation with frozen shoulder. The pathways identified highlight the potential for new therapeutic targets and provide a basis for additional mechanistic studies for a common disorder that can severely impact hand function.
• Genome-wide association study (GWAS) analyzing 763 long-lived individuals and 1085 controls replicates apolipoprotein E (APOE) as the major susceptibility factor for human longevity.
• GWAS fails to detect further susceptibility genes.
• Association testing of 33 markers previously identified in a US-American GWAS replicates only APOE in the German study sample.
We conducted a case–control genome-wide association study (GWAS) of human longevity, comparing 664,472 autosomal SNPs in 763 long-lived individuals (LLI; mean age: 99.7 years) and 1085 controls (mean age: 60.2 years) from Germany. Only one association, namely that of SNP rs4420638 near the APOC1 gene, achieved genome-wide significance (allele-based P = 1.8 × 10⁻¹⁰). However, logistic regression analysis revealed that this association, which was replicated in an independent German sample, is fully explicable by linkage disequilibrium with the APOE allele ɛ4, the only variant hitherto established as a major genetic determinant of survival into old age. Our GWAS failed to identify any additional autosomal susceptibility genes. One explanation for this lack of success in our study would be that GWAS provide only limited statistical power for a polygenic phenotype with loci of small effect such as human longevity. A recent GWAS in Dutch LLI independently confirmed the APOE–longevity association, thus strengthening the conclusion that this locus is a very, if not the most, important genetic factor influencing longevity.