Alzheimer's disease (AD) is a complex disorder influenced by environmental and genetic factors. Recent work has identified 11 AD markers in 10 loci. We used Genome-wide Complex Trait Analysis to analyze >2 million SNPs for 10,922 individuals from the Alzheimer's Disease Genetics Consortium to assess the phenotypic variance explained first by known late-onset AD loci, and then by all SNPs in the Alzheimer's Disease Genetics Consortium dataset. In all, 33% of total phenotypic variance is explained by all common SNPs. APOE alone explained 6% and other known markers 2%, meaning more than 25% of phenotypic variance remains unexplained by known markers, but is tagged by common SNPs included on genotyping arrays or imputed with HapMap genotypes. Novel AD markers that explain large amounts of phenotypic variance are likely to be rare and unidentifiable using genome-wide association studies. Based on our findings and the current direction of human genetics research, we suggest specific study designs for future studies to identify the remaining heritability of Alzheimer's disease.
The human genome contains "dark" gene regions that cannot be adequately assembled or aligned using standard short-read sequencing technologies, preventing researchers from identifying mutations within these gene regions that may be relevant to human disease. Here, we identify regions with few mappable reads that we call dark by depth, and others that have ambiguous alignment, called camouflaged. We assess how well long-read or linked-read technologies resolve these regions.
Based on standard whole-genome Illumina sequencing data, we identify 36,794 dark regions in 6054 gene bodies from pathways important to human health, development, and reproduction. Of these gene bodies, 8.7% are completely dark and 35.2% are ≥ 5% dark. We identify dark regions that are present in protein-coding exons across 748 genes. Linked-read or long-read sequencing technologies from 10x Genomics, PacBio, and Oxford Nanopore Technologies reduce dark protein-coding regions to approximately 50.5%, 35.6%, and 9.6%, respectively. We present an algorithm to resolve most camouflaged regions and apply it to the Alzheimer's Disease Sequencing Project. We rescue a rare ten-nucleotide frameshift deletion in CR1, a top Alzheimer's disease gene, found in disease cases but not in controls.
While we could not formally assess the association of the CR1 frameshift mutation with Alzheimer's disease due to insufficient sample size, we believe it merits investigation in a larger cohort. There remain thousands of potentially important genomic regions overlooked by short-read sequencing that are largely resolved by long-read technologies.
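The "dark by depth" idea above can be illustrated with a minimal sketch that scans per-base read depth and reports runs of poorly covered positions. The function name `dark_by_depth`, the `max_depth=5` cutoff, and the `min_length` parameter are assumptions made for illustration; the abstract does not state the thresholds the authors used, and camouflaged regions (ambiguous alignment) would need mapping-quality information that is not modeled here.

```python
def dark_by_depth(depths, max_depth=5, min_length=1):
    """Return half-open (start, end) intervals where depth <= max_depth.

    depths: per-base read depth along one contig (0-based positions).
    max_depth and min_length are illustrative cutoffs, not the paper's.
    """
    regions = []
    start = None
    for pos, depth in enumerate(depths):
        if depth <= max_depth:
            if start is None:
                start = pos  # open a new low-depth run
        elif start is not None:
            if pos - start >= min_length:
                regions.append((start, pos))
            start = None  # close the run
    # Close a run that extends to the end of the contig.
    if start is not None and len(depths) - start >= min_length:
        regions.append((start, len(depths)))
    return regions


print(dark_by_depth([30, 2, 0, 1, 40, 40, 3]))  # [(1, 4), (6, 7)]
```

Raising `min_length` drops short dips in coverage, which may be preferable when only sustained gaps are of interest.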
TREM2 variants in Alzheimer's disease
Guerreiro, Rita; Wojtas, Aleksandra; Bras, Jose ...
The New England Journal of Medicine, 01/2013, Volume 368, Issue 2
Journal Article, Peer reviewed, Open access
Homozygous loss-of-function mutations in TREM2, encoding the triggering receptor expressed on myeloid cells 2 protein, have previously been associated with an autosomal recessive form of early-onset dementia.
We used genome, exome, and Sanger sequencing to analyze the genetic variability in TREM2 in a series of 1092 patients with Alzheimer's disease and 1107 controls (the discovery set). We then performed a meta-analysis on imputed data for the TREM2 variant rs75932628 (predicted to cause a R47H substitution) from three genomewide association studies of Alzheimer's disease and tested for the association of the variant with disease. We genotyped the R47H variant in an additional 1887 cases and 4061 controls. We then assayed the expression of TREM2 across different regions of the human brain and identified genes that are differentially expressed in a mouse model of Alzheimer's disease and in control mice.
We found significantly more variants in exon 2 of TREM2 in patients with Alzheimer's disease than in controls in the discovery set (P=0.02). There were 22 variant alleles in 1092 patients with Alzheimer's disease and 5 variant alleles in 1107 controls (P<0.001). The most commonly associated variant, rs75932628 (encoding R47H), showed highly significant association with Alzheimer's disease (P<0.001). Meta-analysis of rs75932628 genotypes imputed from genomewide association studies confirmed this association (P=0.002), as did direct genotyping of an additional series of 1887 patients with Alzheimer's disease and 4061 controls (P<0.001). Trem2 expression differed between control mice and a mouse model of Alzheimer's disease.
Heterozygous rare variants in TREM2 are associated with a significant increase in the risk of Alzheimer's disease. (Funded by Alzheimer's Research UK and others.)
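As a rough illustration of how a case-control comparison such as 22 variant alleles among 1092 patients versus 5 among 1107 controls might be tested, the sketch below implements a two-sided Fisher's exact test using only the standard library. This is not necessarily the test the authors ran: the abstract does not name it, and the 2x2 table here assumes each variant allele belongs to a different individual, which is a simplification made for the example.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1 = a + b          # e.g. number of cases
    col1 = a + c          # e.g. number of variant carriers
    n = a + b + c + d
    denom = comb(n, row1)

    def prob(x):
        # Probability that x of the col1 carriers fall in row 1.
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    p_obs = prob(a)
    total = sum(p for p in map(prob, range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Assumed table: carriers vs. non-carriers in cases and controls.
p_value = fisher_exact_two_sided(22, 1092 - 22, 5, 1107 - 5)
print(f"two-sided p = {p_value:.2g}")
```

A per-chromosome table (twice as many alleles per group) would be an equally defensible assumption and gives a similar result for counts this small.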
Identical codon pairing and co-tRNA codon pairing increase translational efficiency within genes when two codons that encode the same amino acid are translated by the same tRNA before it diffuses from the ribosome. We examine the phylogenetic signal in both identical and co-tRNA codon pairing across 23,428 species using alignment-free and parsimony methods. We determined that conserved codon pairing typically has a smaller window size than the length of a ribosome, and codon pairing tracks phylogenies across various taxonomic groups. We report a comprehensive analysis of codon pairing, including the extent to which each codon pairs. Our parsimony method generally recovers phylogenies that are more congruent with the established phylogenies than our alignment-free method. However, four of the ten taxonomic groups did not have sufficient orthologous codon pairings and were therefore analyzed using only the alignment-free methods. Since the recovered phylogenies using only codon pairing largely match phylogenies from the Open Tree of Life and the NCBI taxonomy, and are comparable to trees recovered by other algorithms, we propose that codon pairing biases are phylogenetically conserved and should be considered in conjunction with other phylogenomic techniques.
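Identical codon pairing, as described above, can be sketched as a search for the same codon recurring within a short window downstream. The function name `identical_codon_pairs`, the default `window=5`, and the choice to report only the nearest reuse of each codon are assumptions for illustration; the abstract does not give the authors' exact definition. Co-tRNA pairing would additionally need a codon-to-tRNA mapping, which is omitted here.

```python
def identical_codon_pairs(cds, window=5):
    """Find codons that recur within `window` codons downstream.

    cds: in-frame coding sequence (trailing partial codon is ignored).
    Returns (codon, first_index, second_index) tuples, codon-indexed.
    """
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3].upper() for i in range(0, usable, 3)]
    pairs = []
    for i, codon in enumerate(codons):
        # Look ahead at most `window` codons for the same codon.
        for j in range(i + 1, min(i + 1 + window, len(codons))):
            if codons[j] == codon:
                pairs.append((codon, i, j))
                break  # report only the nearest reuse (an assumption)
    return pairs


# Codons: ATG GCT AAA GCT TTT GCC
print(identical_codon_pairs("ATGGCTAAAGCTTTTGCC"))  # [('GCT', 1, 3)]
```

Varying `window` is one way to explore the abstract's observation that conserved pairing tends to occur over distances shorter than a ribosome's footprint.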
Analyzing next-generation sequencing data is difficult because datasets are large, second-generation sequencing platforms have high error rates, and each position in the target genome (exome, transcriptome, etc.) is sequenced multiple times. Given these challenges, numerous bioinformatic algorithms have been developed to analyze these data. These algorithms aim to find an appropriate balance between data loss, errors, analysis time, and memory footprint. Typical analysis pipelines require multiple steps. If one or more of these steps is unnecessary, removing it would significantly decrease compute time and data manipulation. One step in many pipelines is PCR duplicate removal, where PCR duplicates arise from multiple PCR products from the same template molecule binding on the flowcell. These are often removed because there is concern they can lead to false positive variant calls. Picard (MarkDuplicates) and SAMTools (rmdup) are the two main software tools used for PCR duplicate removal.
Approximately 92% of the 17+ million variants were called regardless of whether we removed duplicates with Picard or SAMTools or left the PCR duplicates in the dataset. There were no significant differences between the unique variant sets when comparing the transition/transversion ratios (p = 1.0), percentage of novel variants (p = 0.99), average population frequencies (p = 0.99), and the percentage of protein-changing variants (p = 1.0). Results were similar for variants in the American College of Medical Genetics genes. Genotype concordance between NGS and SNP chips was above 99% for all genotype groups (e.g., homozygous reference).
Our results suggest that PCR duplicate removal has minimal effect on the accuracy of subsequent variant calls.
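One of the quality metrics compared above, the transition/transversion ratio, is simple to compute from a list of single-nucleotide variants. The sketch below is a generic illustration, not the authors' pipeline; the function name `ti_tv_ratio` and the (ref, alt) tuple input format are assumptions.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def ti_tv_ratio(snvs):
    """Compute the transition/transversion ratio for (ref, alt) pairs.

    Transitions stay within a base class (A<->G or C<->T);
    transversions switch between purine and pyrimidine.
    Non-SNV records (indels, identical alleles, ambiguous bases) are skipped.
    """
    transitions = transversions = 0
    for ref, alt in snvs:
        ref, alt = ref.upper(), alt.upper()
        if ref == alt or ref not in BASES or alt not in BASES:
            continue
        pair = {ref, alt}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions / transversions if transversions else float("inf")


calls = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]
print(ti_tv_ratio(calls))  # 1.0
```

Computing this ratio separately for variant sets called with and without duplicate removal gives one of the comparisons the abstract reports.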
Alzheimer disease (AD) is a progressive disorder that affects cognitive function. There is increasing support for the role of neuroinflammation and aberrant immune regulation in the pathophysiology of AD. The immunoregulatory human leukocyte antigen (HLA) complex has been linked to susceptibility for a number of neurodegenerative diseases, including AD; however, studies to date have failed to consistently identify a risk HLA haplotype for AD. Contributing to this difficulty are the complex genetic organization of the HLA region, differences in sequencing and allelic imputation methods, and diversity across ethnic populations.
Building on prior work linking the HLA to AD, we used a robust imputation method on two separate case-control cohorts to examine the relationship between HLA haplotypes and AD risk in 309 individuals (191 AD, 118 cognitively normal [CN] controls) from the San Francisco-based University of California, San Francisco (UCSF) Memory and Aging Center (collected between 1999-2015) and 11,381 individuals (5,728 AD, 5,653 CN controls) from the Alzheimer's Disease Genetics Consortium (ADGC), a National Institute on Aging (NIA)-funded national data repository (reflecting samples collected between 1984-2012). We also examined cerebrospinal fluid (CSF) biomarker measures for patients seen between 2005-2007 and longitudinal cognitive data from the Alzheimer's Disease Neuroimaging Initiative (n = 346, mean follow-up 3.15 ± 2.04 y in AD individuals) to assess the clinical relevance of identified risk haplotypes. The strongest association with AD risk occurred with major histocompatibility complex (MHC) haplotype A*03:01~B*07:02~DRB1*15:01~DQA1*01:02~DQB1*06:02 (p = 9.6 × 10^-4, odds ratio [OR] [95% confidence interval] = 1.21 [1.08-1.37]) in the combined UCSF + ADGC cohort. Secondary analysis suggested that this effect may be driven primarily by individuals who are negative for the established AD genetic risk factor, apolipoprotein E (APOE) ɛ4. Separate analyses of class I and II haplotypes further supported the role of class I haplotype A*03:01~B*07:02 (p = 0.03, OR = 1.11 [1.01-1.23]) and class II haplotype DRB1*15:01~DQA1*01:02~DQB1*06:02 (DR15) (p = 0.03, OR = 1.08 [1.01-1.15]) as risk factors for AD. We followed up these findings in the clinical dataset representing the spectrum of cognitively normal controls, individuals with mild cognitive impairment, and individuals with AD to assess their relevance to disease. Carrying A*03:01~B*07:02 was associated with higher CSF amyloid levels (p = 0.03, β ± standard error = 47.19 ± 21.78).
We also found a dose-dependent association between the DR15 haplotype and greater rates of cognitive decline (greater impairment on the 11-item Alzheimer's Disease Assessment Scale cognitive subscale [ADAS11] over time, p = 0.03, β ± standard error = 0.7 ± 0.3; worse forgetting score on the Rey Auditory Verbal Learning Test [RAVLT] over time, p = 0.02, β ± standard error = -0.2 ± 0.06). In a subset of the same cohort, dose of DR15 was also associated with higher baseline levels of chemokine CC-4, a biomarker of inflammation (p = 0.005, β ± standard error = 0.08 ± 0.03). The main study limitations are that the results represent only individuals of European ancestry and clinically diagnosed individuals, and that our study used imputed genotypes for a subset of HLA genes.
We provide evidence that variation in the HLA locus, including risk haplotype DR15, contributes to AD risk. DR15 has also been associated with multiple sclerosis, and its component alleles have been implicated in Parkinson disease and narcolepsy. Our findings thus raise the possibility that DR15-associated mechanisms may contribute to pan-neuronal disease vulnerability.
Introduction: Genetic data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) have been crucial in advancing the understanding of Alzheimer's disease (AD) pathophysiology. Here, we provide an update on sample collection, scientific progress and opportunities, conceptual issues, and future plans.
Methods: Lymphoblastoid cell lines and DNA and RNA samples from blood have been collected and banked, and data and biosamples have been widely disseminated. To date, APOE genotyping, genome-wide association study (GWAS), and whole exome and whole genome sequencing data have been obtained and disseminated.
Results: ADNI genetic data have been downloaded thousands of times, and >300 publications have resulted, including reports of large-scale GWAS by consortia to which ADNI contributed. Many of the first applications of quantitative endophenotype association studies used ADNI data, including some of the earliest GWAS and pathway-based studies of biospecimen and imaging biomarkers, as well as memory and other clinical/cognitive variables. Other contributions include some of the first whole exome and whole genome sequencing data sets and reports in healthy controls, mild cognitive impairment, and AD.
Discussion: Numerous genetic susceptibility and protective markers for AD and disease biomarkers have been identified and replicated using ADNI data and have heavily implicated immune, mitochondrial, cell cycle/fate, and other biological processes. Early sequencing studies suggest that rare and structural variants are likely to account for significant additional phenotypic variation. Longitudinal analyses of transcriptomic, proteomic, metabolomic, and epigenomic changes will also further elucidate dynamic processes underlying preclinical and prodromal stages of disease. Integration of this unique collection of multiomics data within a systems biology framework will help to separate truly informative markers of early disease mechanisms and potential novel therapeutic targets from the vast background of less relevant biological processes. Fortunately, a broad swath of the scientific community has accepted this grand challenge.
Cerebrospinal fluid (CSF) 42 amino acid species of amyloid beta (Aβ42) and tau levels are strongly correlated with the presence of Alzheimer's disease (AD) neuropathology including amyloid plaques and neurodegeneration and have been successfully used as endophenotypes for genetic studies of AD. Additional CSF analytes may also serve as useful endophenotypes that capture other aspects of AD pathophysiology. Here we have conducted a genome-wide association study of CSF levels of 59 AD-related analytes. All analytes were measured using the Rules Based Medicine Human DiscoveryMAP Panel, which includes analytes relevant to several disease-related processes. Data from two independently collected and measured datasets, the Knight Alzheimer's Disease Research Center (ADRC) and Alzheimer's Disease Neuroimaging Initiative (ADNI), were analyzed separately, and combined results were obtained using meta-analysis. We identified genetic associations with CSF levels of 5 proteins (Angiotensin-converting enzyme (ACE), Chemokine (C-C motif) ligand 2 (CCL2), Chemokine (C-C motif) ligand 4 (CCL4), Interleukin 6 receptor (IL6R) and Matrix metalloproteinase-3 (MMP3)) with study-wide significant p-values (p < 1.46 × 10^-10) and significant, consistent evidence for association in both the Knight ADRC and the ADNI samples. These proteins are involved in amyloid processing and pro-inflammatory signaling. SNPs associated with ACE, IL6R and MMP3 protein levels are located within the coding regions of the corresponding structural gene. The SNPs associated with CSF levels of CCL4 and CCL2 are located in known chemokine binding proteins. The genetic associations reported here are novel and suggest mechanisms for genetic control of CSF and plasma levels of these disease-related proteins. Significant SNPs in ACE and MMP3 also showed association with AD risk. Our findings suggest that these proteins/pathways may be valuable therapeutic targets for AD.
Robust associations in cognitively normal individuals suggest that these SNPs also influence regulation of these proteins more generally and may therefore be relevant to other diseases.
Ramp sequences increase translational speed and accuracy when rare, slowly translated codons are found at the beginnings of genes. Here, the results of the first analysis of ramp sequences in a phylogenetic construct are presented. Ramp sequences were compared from 247 vertebrates (114 mammalian and 133 non-mammalian), where the presence and absence of ramp sequences was analyzed as a binary character in a parsimony and maximum likelihood framework. Additionally, ramp sequences were mapped to the Open Tree of Life synthetic tree to determine the number of parallelisms and reversals that occurred, and those results were compared to random permutations. Parsimony and maximum likelihood analyses of the presence and absence of ramp sequences recovered phylogenies that are highly congruent with established phylogenies. Additionally, 81% of vertebrate mammalian ramps and 81.2% of other vertebrate ramps had fewer parallelisms and reversals than the mean from 1000 randomly permuted trees. A chi-square analysis of completely orthologous ramp sequences resulted in a p-value < 0.001 as compared to random chance. Ramp sequences recover phylogenies comparable to those from other phylogenomic methods. Although not all ramp sequences appear to have a phylogenetic signal, more ramp sequences track speciation than expected by random chance. Therefore, ramp sequences may be used in conjunction with other phylogenomic approaches if many orthologs are taken into account. However, phylogenomic methods utilizing few orthologs should be cautious in incorporating ramp sequences because individual ramp sequences may provide conflicting signals.
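Treating ramp presence or absence as a binary character on a fixed tree, the minimum number of state changes (a count that includes parallelisms and reversals) can be obtained with Fitch's small-parsimony algorithm. The sketch below is a generic illustration under stated assumptions: the nested-tuple tree encoding, the function name `fitch_changes`, and the five example taxa and their states are hypothetical, and the abstract does not say which parsimony implementation the authors used.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes for one character on a rooted binary tree.

    tree: nested 2-tuples whose leaves are taxon names (strings).
    states: dict mapping taxon name -> 0 (ramp absent) or 1 (ramp present).
    """
    def visit(node):
        if isinstance(node, str):
            return {states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        common = left_set & right_set
        if common:
            # Children agree on at least one state: no change needed here.
            return common, left_cost + right_cost
        # Children disagree: one change is required on a descending branch.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


# Hypothetical toy tree and ramp calls, for illustration only.
tree = ((("human", "mouse"), "chicken"), ("frog", "zebrafish"))
clean = {"human": 1, "mouse": 1, "chicken": 0, "frog": 0, "zebrafish": 0}
noisy = {"human": 1, "mouse": 0, "chicken": 1, "frog": 0, "zebrafish": 0}
print(fitch_changes(tree, clean))  # 1
print(fitch_changes(tree, noisy))  # 2
```

A character that needs only one change fits the tree without homoplasy; comparing the observed count with counts obtained after shuffling states among the leaves is one way to approximate the permutation comparison described in the abstract.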
The Genetics Core of the Alzheimer’s Disease Neuroimaging Initiative (ADNI), formally established in 2009, aims to provide resources and facilitate research related to genetic predictors of multidimensional Alzheimer’s disease (AD)-related phenotypes. Here, we provide a systematic review of genetic studies published between 2009 and 2012 where either ADNI APOE genotype or genome-wide association study (GWAS) data were used. We review and synthesize ADNI genetic associations with disease status or quantitative disease endophenotypes including structural and functional neuroimaging, fluid biomarker assays, and cognitive performance. We also discuss the diverse analytical strategies used in these studies, including univariate and multivariate analysis, meta-analysis, pathway analysis, and interaction and network analysis. Finally, we perform pathway and network enrichment analyses of these ADNI genetic associations to highlight key mechanisms that may drive disease onset and trajectory. Major ADNI findings included all the top 10 AD genes, and several of these (e.g., APOE, BIN1, CLU, CR1, and PICALM) were corroborated by ADNI imaging, fluid and cognitive phenotypes. ADNI imaging genetics studies discovered novel findings (e.g., FRMD6) that were later replicated on different data sets. Several other genes (e.g., APOC1, FTO, GRIN2B, MAGI2, and TOMM40) were associated with multiple ADNI phenotypes, warranting further investigation on other data sets. The broad availability and wide scope of ADNI genetic and phenotypic data have advanced our understanding of the genetic basis of AD and have nominated novel targets for future studies employing next-generation sequencing and convergent multi-omics approaches, and for clinical drug and biomarker development.