ABSTRACT
The purpose of the dbNSFP is to provide a one‐stop resource for functional predictions and annotations for human nonsynonymous single‐nucleotide variants (nsSNVs) and splice‐site variants (ssSNVs), and to facilitate the steps of filtering and prioritizing SNVs from a large list of SNVs discovered in an exome‐sequencing study. A list of all potential nsSNVs and ssSNVs based on the human reference sequence was created, and functional predictions and annotations were curated and compiled for each SNV. Here, we report a recent major update of the database to version 3.0. The SNV list has been rebuilt based on GENCODE 22, and the database currently includes 82,832,027 nsSNVs and ssSNVs. An attached database, dbscSNV, which compiles all potential human SNVs within splicing consensus regions and their deleteriousness predictions, adds another 15,030,459 potentially functional SNVs. Eleven prediction scores (MetaSVM, MetaLR, CADD, VEST3, PROVEAN, 4× fitCons, fathmm‐MKL, and DANN) and allele frequencies from the UK10K cohorts and the Exome Aggregation Consortium (ExAC), among others, have been added. The original seven prediction scores in v2.0 (SIFT, 2× Polyphen2, LRT, MutationTaster, MutationAssessor, and FATHMM) as well as many SNV and gene functional annotations have been updated. dbNSFP v3.0 is freely available at http://sites.google.com/site/jpopgen/dbNSFP.
The purpose of the dbNSFP is to provide a one‐stop resource for functional predictions and annotations for human non‐synonymous single‐nucleotide variants (nsSNVs) and splice site variants (ssSNVs), and to facilitate the steps of filtering and prioritizing SNVs from a large list of SNVs discovered in an exome‐sequencing study. Here we report a recent major update of the database to version 3.0 and some preliminary analyses comparing the 24 functional prediction scores and conservation scores in dbNSFP v3.0.
In silico tools have been developed to predict variants that may have an impact on pre-mRNA splicing. The major limitation of the application of these tools to basic research and clinical practice is the difficulty in interpreting the output. Most tools only predict potential splice sites given a DNA sequence without measuring splicing signal changes caused by a variant. Another limitation is the lack of large-scale evaluation studies of these tools. We compared eight in silico tools on 2959 single nucleotide variants within splicing consensus regions (scSNVs) using receiver operating characteristic analysis. The Position Weight Matrix model and MaxEntScan outperformed other methods. Two ensemble learning methods, adaptive boosting and random forests, were used to construct models that take advantage of individual methods. Both models further improved prediction, with outputs of directly interpretable prediction scores. We applied our ensemble scores to scSNVs from the Catalogue of Somatic Mutations in Cancer database. Analysis showed that predicted splice-altering scSNVs are enriched in recurrent scSNVs and known cancer genes. We pre-computed our ensemble scores for all potential scSNVs across the human genome, providing a whole genome level resource for identifying splice-altering scSNVs discovered from large-scale sequencing studies.
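The receiver operating characteristic comparison mentioned in this abstract can be summarized per tool by the area under the ROC curve (AUC). The following is a minimal sketch of that summary statistic, not the authors' evaluation pipeline: the function name `roc_auc` and the toy scores are hypothetical, and the AUC is computed through its rank interpretation (the probability that a splice-altering variant scores higher than a neutral one, with ties counted as one half).

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    scores: prediction scores, higher meaning "more likely splice-altering".
    labels: 1 for a splice-altering variant, 0 for a neutral one.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy, made-up scores for two hypothetical tools on the same four variants.
labels = [1, 1, 0, 0]
tool_a = [0.9, 0.8, 0.3, 0.1]   # separates the classes perfectly
tool_b = [0.9, 0.2, 0.3, 0.1]   # misranks one positive/negative pair
print(roc_auc(tool_a, labels), roc_auc(tool_b, labels))
```

The quadratic pairwise loop is chosen for readability; for thousands of variants a sort-based rank computation would be the more practical choice.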
Set-based analysis that jointly tests the association of variants in a group has emerged as a popular tool for analyzing rare and low-frequency variants in sequencing studies. The existing set-based tests can suffer significant power loss when only a small proportion of variants are causal, and their powers can be sensitive to the number, effect sizes, and effect directions of the causal variants and the choices of weights. Here we propose an aggregated Cauchy association test (ACAT), a general, powerful, and computationally efficient p value combination method for boosting power in sequencing studies. First, by combining variant-level p values, we use ACAT to construct a set-based test (ACAT-V) that is particularly powerful in the presence of only a small number of causal variants in a variant set. Second, by combining different variant-set-level p values, we use ACAT to construct an omnibus test (ACAT-O) that combines the strength of multiple complementary set-based tests, including the burden test, sequence kernel association test (SKAT), and ACAT-V. Through analysis of extensively simulated data and the whole-genome sequencing data from the Atherosclerosis Risk in Communities (ARIC) study, we demonstrate that ACAT-V complements the SKAT and the burden test, and that ACAT-O has substantially more robust and higher power than the alternative tests.
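The p value combination step at the heart of ACAT can be sketched in a few lines. Each p value is transformed to a Cauchy variate with tan((0.5 − p)π), the variates are averaged with nonnegative weights, and the combined p value is read from the standard Cauchy tail. This is a minimal illustration of the combination rule only: the function name `acat_combine` and the default of equal weights are assumptions made here, and the variant weighting and handling of very rare variants that a full ACAT-V or ACAT-O analysis needs are not covered.

```python
import math


def acat_combine(p_values, weights=None):
    """Combine p values with the Cauchy combination rule.

    p_values: iterable of p values, each strictly between 0 and 1.
    weights:  optional nonnegative weights (assumed equal if omitted).
    """
    p_values = list(p_values)
    if not p_values:
        raise ValueError("at least one p value is required")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * len(p_values)
    weights = list(weights)
    if len(weights) != len(p_values) or any(w < 0 for w in weights):
        raise ValueError("weights must be nonnegative and match p_values")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must not all be zero")
    # Weighted average of the Cauchy-transformed p values.
    statistic = sum(
        w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values)
    ) / total_weight
    # Upper tail probability of the standard Cauchy distribution.
    return 0.5 - math.atan(statistic) / math.pi


# One small p value among otherwise unremarkable ones dominates the result.
print(acat_combine([0.01, 0.5, 0.9]))
```

Because the tangent transform grows very quickly as p approaches 0, the smallest p values dominate the statistic, which is consistent with the abstract's point that the test is powerful when only a few variants in a set are causal.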
ARIC (Atherosclerosis Risk In Communities) initiated community-based surveillance in 1987 for myocardial infarction and coronary heart disease (CHD) incidence and mortality and created a prospective cohort of 15,792 Black and White adults ages 45 to 64 years. The primary aims were to improve understanding of the decline in CHD mortality and identify determinants of subclinical atherosclerosis and CHD in Black and White middle-age adults. ARIC has examined areas including health disparities, genomics, heart failure, and prevention, producing more than 2,300 publications. Results have had strong clinical impact and demonstrate the importance of population-based research in the spectrum of biomedical research to improve health.
Nonalcoholic fatty liver disease (NAFLD) is a burgeoning health problem of unknown etiology that varies in prevalence among ancestry groups. To identify genetic variants contributing to differences in hepatic fat content, we carried out a genome-wide association scan of nonsynonymous sequence variations (n = 9,229) in a population comprising Hispanic, African American and European American individuals. An allele in PNPLA3 (rs738409G, encoding I148M) was strongly associated with increased hepatic fat levels (P = 5.9 × 10⁻¹⁰) and with hepatic inflammation (P = 3.7 × 10⁻⁴). The allele was most common in Hispanics, the group most susceptible to NAFLD; hepatic fat content was more than twofold higher in PNPLA3 rs738409G homozygotes than in noncarriers. Resequencing revealed another allele of PNPLA3 (rs6006460T, encoding S453I) that was associated with lower hepatic fat content in African Americans, the group at lowest risk of NAFLD. Thus, variation in PNPLA3 contributes to ancestry-related and inter-individual differences in hepatic fat content and susceptibility to NAFLD.
RNA splicing is the process during which introns are excised and exons are spliced. The precise recognition of splicing signals is critical to this process, and mutations affecting splicing comprise a considerable proportion of genetic disease etiology. Analysis of RNA samples from the patient is the most straightforward and reliable method to detect splicing defects. However, technical limitations currently prohibit its use in routine clinical practice. In silico tools that predict potential consequences of splicing mutations may be useful in daily diagnostic activities. In this review, we provide medical geneticists with some basic insights into some of the most popular in silico tools for splicing defect prediction, from the viewpoint of end users. Bioinformaticians in relevant areas who are working on huge data sets may also benefit from this review. Specifically, we focus on those tools whose primary goal is to predict the impact of mutations within the 5' and 3' splicing consensus regions: the algorithms used by different tools as well as their major advantages and disadvantages are briefly introduced; the formats of their input and output are summarized; and their interpretation, evaluation, and prospects are also discussed.
The relative activity of lipoprotein lipase (LPL) in different tissues controls the partitioning of lipoprotein-derived fatty acids between sites of fat storage (adipose tissue) and oxidation (heart and skeletal muscle). Here we used a reverse genetic strategy to test the hypothesis that 4 angiopoietin-like proteins (ANGPTL3, -4, -5, and -6) play key roles in triglyceride (TG) metabolism in humans. We re-sequenced the coding regions of the genes encoding these proteins and identified multiple rare nonsynonymous (NS) sequence variations that were associated with low plasma TG levels but not with other metabolic phenotypes. Functional studies revealed that all mutant alleles of ANGPTL3 and ANGPTL4 that were associated with low plasma TG levels interfered either with the synthesis or secretion of the protein or with the ability of the ANGPTL protein to inhibit LPL. A total of 1% of the Dallas Heart Study population and 4% of those participants with a plasma TG in the lowest quartile had a rare loss-of-function mutation in ANGPTL3, ANGPTL4, or ANGPTL5. Thus, ANGPTL3, ANGPTL4, and ANGPTL5, but not ANGPTL6, play nonredundant roles in TG metabolism, and multiple alleles at these loci cumulatively contribute to variability in plasma TG levels in humans.
Using a polygenic score of DNA sequence polymorphisms, the authors of this study quantified genetic risk and assessed four healthy lifestyle factors. Among participants at high genetic risk, a healthy lifestyle was associated with a reduced risk of coronary disease.
Both genetic and lifestyle factors are key drivers of coronary artery disease, a complex disorder that is the leading cause of death worldwide.[1] A familial pattern in the risk of coronary artery disease was first described in 1938 and was subsequently confirmed in large studies involving twins and prospective cohorts.[2–6] Since 2007, genomewide association analyses have identified more than 50 independent loci associated with the risk of coronary artery disease.[7–15] These risk alleles, when aggregated into a polygenic risk score, are predictive of incident coronary events and provide a continuous and quantitative measure of genetic susceptibility.[16–24] Much . . .
Nomograms to predict normal aortic root diameter for body surface area (BSA) in broad ranges of age have been widely used but are limited by lack of consideration of gender effects, jumps in upper limits of aortic diameter among age strata, and data from older teenagers. Sinus of Valsalva diameter was measured by American Society of Echocardiography convention in normal-weight, nonhypertensive, nondiabetic subjects ≥15 years old without aortic valve disease from clinical or population-based samples. Analyses of covariance and linear regression with assessment of residuals identified determinants and developed predictive models for normal aortic root diameter. In 1,207 apparently normal subjects ≥15 years old (54% women), aortic root diameter was 2.1 to 4.3 cm. Aortic root diameter was strongly related to BSA and height (r = 0.48 for the 2 comparisons), age (r = 0.36), and male gender (+2.7 mm adjusted for BSA and age, p < 0.001 for all comparisons). Multivariable equations using age, gender, and BSA or height predicted aortic diameter strongly (R = 0.674 for the 2 comparisons, p < 0.001) with minimal relation of residuals to age or body size: for BSA, 2.423 + (age in years × 0.009) + (BSA in square meters × 0.461) − (gender [1 = man, 2 = woman] × 0.267), SEE 0.261 cm; for height, 1.519 + (age in years × 0.010) + (height in centimeters × 0.010) − (gender [1 = man, 2 = woman] × 0.247), SEE 0.215 cm. In conclusion, aortic root diameter is larger in men and increases with body size and age. Regression models incorporating body size, age, and gender are applicable to adolescents and adults without limitations of previous nomograms.
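The two regression equations quoted in this abstract translate directly into code. The sketch below uses the published coefficients and gender coding (1 = man, 2 = woman); the function names are hypothetical, and the z-score helper, which divides the residual by the reported standard error of the estimate (SEE), is an assumption about how one might use the equations rather than something the abstract specifies.

```python
# Standard errors of the estimate reported for each equation (cm).
SEE_BSA_CM = 0.261
SEE_HEIGHT_CM = 0.215


def predicted_root_cm_from_bsa(age_years, bsa_m2, gender_code):
    """Predicted aortic root diameter (cm) from age, BSA and gender.

    gender_code follows the abstract: 1 = man, 2 = woman.
    """
    return 2.423 + age_years * 0.009 + bsa_m2 * 0.461 - gender_code * 0.267


def predicted_root_cm_from_height(age_years, height_cm, gender_code):
    """Predicted aortic root diameter (cm) from age, height and gender."""
    return 1.519 + age_years * 0.010 + height_cm * 0.010 - gender_code * 0.247


def z_score(observed_cm, predicted_cm, see_cm):
    """Residual in SEE units (an assumed, illustrative use of the SEE)."""
    return (observed_cm - predicted_cm) / see_cm


# Hypothetical subject: 40-year-old man, BSA 2.0 m^2, height 180 cm.
by_bsa = predicted_root_cm_from_bsa(40, 2.0, 1)
by_height = predicted_root_cm_from_height(40, 180, 1)
print(round(by_bsa, 3), round(by_height, 3))
print(round(z_score(3.9, by_bsa, SEE_BSA_CM), 2))
```

Holding age and BSA fixed, the BSA equation gives men a predicted diameter 0.267 cm larger than women, which matches the adjusted difference of +2.7 mm reported in the abstract.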
Blood pressure and kidney function have a bidirectional relation. Hypertension has long been considered a risk factor for kidney function decline. However, whether intensive blood pressure control could promote kidney health has been uncertain. The kidney is known to have a major role in affecting blood pressure through sodium extraction and regulating electrolyte balance. This bidirectional relation makes causal inference between these two traits difficult. Therefore, to examine the causal relations between these two traits, we performed two-sample Mendelian randomization analyses using summary statistics of large-scale genome-wide association studies. We selected genetic instruments more likely to be specific for kidney function using meta-analyses of complementary kidney function biomarkers (glomerular filtration rate estimated from serum creatinine, eGFRcr, and blood urea nitrogen, from the CKDGen Consortium). Systolic and diastolic blood pressure summary statistics were from the International Consortium for Blood Pressure and UK Biobank. Significant evidence supported the causal effects of higher kidney function on lower blood pressure. Based on the mode-based Mendelian randomization method, the effect estimate for a one standard deviation (SD) increase in log-transformed eGFRcr was -0.17 SD units (95% confidence interval: -0.09 to -0.24) in systolic blood pressure and -0.15 SD units (95% confidence interval: -0.07 to -0.22) in diastolic blood pressure. In contrast, the causal effects of blood pressure on kidney function were not statistically significant. Thus, our results support causal effects of higher kidney function on lower blood pressure and suggest that preventing kidney function decline can reduce the public health burden of hypertension.