Many existing cohorts contain a range of relatedness between genotyped individuals, either by design or by chance. Haplotype estimation in such cohorts is a central step in many downstream analyses. Using genotypes from six cohorts from isolated populations and two cohorts from non-isolated populations, we have investigated the performance of different phasing methods designed for nominally 'unrelated' individuals. We find that SHAPEIT2 produces much lower switch error rates in all cohorts compared to other methods, including those designed specifically for isolated populations. In particular, when large amounts of IBD sharing are present, SHAPEIT2 infers close to perfect haplotypes. Based on these results we have developed a general strategy for phasing cohorts with any level of implicit or explicit relatedness between individuals. First, SHAPEIT2 is run ignoring all explicit family information. We then apply a novel HMM method (duoHMM) to combine the SHAPEIT2 haplotypes with any family information to infer the inheritance pattern of each meiosis at all sites across each chromosome. This allows the correction of switch errors and the detection of recombination events and genotyping errors. We show that the method detects numbers of recombination events that align very well with expectations based on genetic maps, and that it infers far fewer spurious recombination events than Merlin. The method can also detect genotyping errors and infer recombination events in otherwise uninformative families, such as trios and duos. The detected recombination events can be used in association scans for recombination phenotypes. The method provides a simple and unified approach to haplotype estimation that will be of interest to researchers in the fields of human, animal and plant genetics.
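The switch error rate used above to compare phasing methods can be illustrated with a minimal sketch. It assumes a common definition: the proportion of consecutive heterozygous-site pairs at which the inferred phase flips relative to the true phase. The abstract does not give the exact evaluation procedure, so the function name and input format here are hypothetical.

```python
def switch_error_rate(true_hap, inferred_hap):
    """Switch error rate between a true and an inferred haplotype.

    Both inputs list the allele (0 or 1) carried by one haplotype at each
    heterozygous site, in genomic order. This is an illustrative definition,
    not necessarily the one used in the study.
    """
    if len(true_hap) != len(inferred_hap):
        raise ValueError("haplotypes must cover the same sites")
    if len(true_hap) < 2:
        return 0.0  # no consecutive pair, so no switch can occur
    # At a heterozygous site the inferred allele either matches the true
    # haplotype or matches its complement.
    matches = [t == i for t, i in zip(true_hap, inferred_hap)]
    # A switch error is a change of match status between neighbouring sites.
    switches = sum(a != b for a, b in zip(matches, matches[1:]))
    return switches / (len(matches) - 1)


# Hypothetical example: the phase flips after site 2 and flips back before site 6.
example_rate = switch_error_rate([0, 1, 0, 1, 1, 0], [0, 1, 1, 0, 0, 0])
```

In this example two of the five consecutive pairs show a phase change, so the rate is 0.4.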
Socioeconomic position (SEP) is a multi-dimensional construct reflecting (and influencing) multiple socio-cultural, physical, and environmental factors. In a sample of 286,301 participants from UK Biobank, we identify 30 (29 previously unreported) independent loci associated with income. Using a method to meta-analyze data from genetically correlated traits, we identify an additional 120 income-associated loci. These loci show clear evidence of functionality, with transcriptional differences identified across multiple cortical tissues, and links to GABAergic and serotonergic neurotransmission. By combining our genome-wide association study on income with data from eQTL studies and chromatin interactions, 24 genes are prioritized for follow-up, 18 of which were previously associated with intelligence. We identify intelligence as one of the likely causal, partly heritable phenotypes that might bridge the gap between molecular genetic inheritance and phenotypic consequence in terms of income differences. These results indicate that, in modern-era Great Britain, genetic effects contribute towards some of the observed socioeconomic inequalities.
Pedigree-based analyses of intelligence have reported that genetic differences account for 50-80% of the phenotypic variation. For personality traits these effects are smaller, with 34-48% of the variance being explained by genetic differences. However, molecular genetic studies using unrelated individuals typically report a heritability estimate of around 30% for intelligence and between 0 and 15% for personality variables. Pedigree-based estimates and molecular genetic estimates may differ because current genotyping platforms are poor at tagging causal variants, variants with low minor allele frequency, copy number variants, and structural variants. Using ~20,000 individuals in the Generation Scotland family cohort genotyped for ~700,000 single-nucleotide polymorphisms (SNPs), we exploit the high levels of linkage disequilibrium (LD) found in members of the same family to quantify the total effect of genetic variants that are not tagged in GWAS of unrelated individuals. In our models, genetic variants in low LD with genotyped SNPs explain over half of the genetic variance in intelligence, education, and neuroticism. By capturing these additional genetic effects our models closely approximate the heritability estimates from twin studies for intelligence and education, but not for neuroticism and extraversion. We then replicated our finding using imputed molecular genetic data from unrelated individuals to show that ~50% of differences in intelligence, and ~40% of the differences in education, can be explained by genetic effects when a larger number of rare SNPs are included. From an evolutionary genetic perspective, a substantial contribution of rare genetic variants to individual differences in intelligence and education is consistent with mutation-selection balance.
With the exponential growth in available biomedical data, there is a need for data integration methods that can extract information about relationships between the data sets. However, these data sets might have very different characteristics. For interpretable results, data-specific variation needs to be quantified. For this task, Two-way Orthogonal Partial Least Squares (O2PLS) has been proposed. To facilitate application and development of the methodology, free and open-source software is required. However, no such software has been available for O2PLS.
We introduce OmicsPLS, an open-source implementation of the O2PLS method in R. It can handle both low- and high-dimensional datasets efficiently. Generic methods for inspecting and visualizing results are implemented. Both a standard and a faster alternative cross-validation method are available to determine the number of components. A simulation study shows good performance of OmicsPLS compared to alternatives, in terms of accuracy and CPU runtime. We demonstrate OmicsPLS by integrating genetic and glycomic data.
We propose the OmicsPLS R package: a free and open-source implementation of O2PLS for statistical data integration. OmicsPLS is available at https://cran.r-project.org/package=OmicsPLS and can be installed in R via install.packages("OmicsPLS").
Objective: Death by suicide is a highly preventable yet growing worldwide health crisis. To date, there has been a lack of adequately powered genomic studies of suicide, with no sizable suicide death cohorts available for analysis. To address this limitation, the authors conducted the first comprehensive genomic analysis of suicide death using previously unpublished genotype data from a large population-ascertained cohort. Methods: The analysis sample comprised 3,413 population-ascertained case subjects of European ancestry and 14,810 ancestrally matched control subjects. Analytical methods included principal component analysis for ancestral matching and adjusting for population stratification, linear mixed model genome-wide association testing (conditional on genetic-relatedness matrix), gene and gene set-enrichment testing, and polygenic score analyses, as well as single-nucleotide polymorphism (SNP) heritability and genetic correlation estimation using linkage disequilibrium score regression. Results: Genome-wide association analysis identified two genome-wide significant loci (involving six SNPs: rs34399104, rs35518298, rs34053895, rs66828456, rs35502061, and rs35256367). Gene-based analyses implicated 22 genes on chromosomes 13, 15, 16, 17, and 19 (q<0.05). Suicide death heritability was estimated at an h²_SNP value of 0.25 (SE=0.04) and a value of 0.16 (SE=0.02) when converted to a liability scale. Notably, suicide polygenic scores were significantly predictive across training and test sets. Polygenic scores for several other psychiatric disorders and psychological traits were also predictive, particularly scores for behavioral disinhibition and major depressive disorder. Conclusions: Multiple genome-wide significant loci and genes were identified and polygenic score prediction of suicide death case-control status was demonstrated, adjusting for ancestry, in independent training and test sets. Additionally, the suicide death sample was found to have increased genetic risk for behavioral disinhibition, major depressive disorder, depressive symptoms, autism spectrum disorder, psychosis, and alcohol use disorder compared with the control sample.
Joint modeling of a number of phenotypes using multivariate methods has often been neglected in genome-wide association studies and, if used, replication has not been sought. Modern omics technologies allow characterization of functional phenomena using a large number of related phenotype measures, which can benefit from such joint analysis. Here, we report a multivariate genome-wide association study of 23 immunoglobulin G (IgG) N-glycosylation phenotypes. In the discovery cohort, our multi-phenotype method uncovers ten genome-wide significant loci, of which five are novel (IGH, ELL2, HLA-B-C, AZI1, FUT6-FUT3). We convincingly replicate all novel loci via multivariate tests. We show that IgG N-glycosylation loci are strongly enriched for genes expressed in the immune system, in particular antibody-producing cells and B lymphocytes. We empirically demonstrate the efficacy of multivariate methods to discover novel, reproducible pleiotropic effects. Multivariate analysis methods can uncover the relationship between phenotypic measures characterised by modern omic techniques. Here the authors conduct a multivariate GWAS on IgG N-glycosylation phenotypes and identify 5 novel loci enriched in immune system genes.
'Epigenetic age acceleration' is a valuable biomarker of ageing, predictive of morbidity and mortality, but for which the underlying biological mechanisms are not well established. Two commonly used measures, derived from DNA methylation, are Horvath-based (Horvath-EAA) and Hannum-based (Hannum-EAA) epigenetic age acceleration. We conducted genome-wide association studies of Horvath-EAA and Hannum-EAA in 13,493 unrelated individuals of European ancestry, to elucidate genetic determinants of differential epigenetic ageing. We identified ten independent SNPs associated with Horvath-EAA, five of which are novel. We also report 21 Horvath-EAA-associated genes including several involved in metabolism (NHLRC, TPMT) and immune system pathways (TRIM59, EDARADD). GWAS of Hannum-EAA identified one associated variant (rs1005277), and implicated 12 genes including several involved in innate immune system pathways (UBE2D3, MANBA, TRIM46), with metabolic functions (UBE2D3, MANBA), or linked to lifespan regulation (CISD2). Both measures had nominal inverse genetic correlations with father's age at death, a rough proxy for lifespan. Nominally significant genetic correlations between Hannum-EAA and lifestyle factors including smoking behaviours and education support the hypothesis that Hannum-based epigenetic ageing is sensitive to variations in environment, whereas Horvath-EAA is a more stable cellular ageing process. We identified novel SNPs and genes associated with epigenetic age acceleration, and highlighted differences in the genetic architecture of Horvath-based and Hannum-based epigenetic ageing measures. Understanding the biological mechanisms underlying individual differences in the rate of epigenetic ageing could help explain different trajectories of age-related decline.
Infectious diseases still threaten global human health, and host genetic factors have been indicated as determining risk factors for observed variations in disease susceptibility, severity, and outcome. We performed a genome-wide meta-analysis on 4624 subjects from the 10,001 Dalmatians cohort, with 14 infection-related traits. Despite a rather small number of cases in some instances, we detected 29 infection-related genetic associations, mostly involving rare variants. Notably, the list included the genes CD28, INPP5D, ITPKB, MACROD2, and RSF1, all of which have known roles in the immune response. Expanding our knowledge of rare variants could contribute to the development of genetic panels that could assist in predicting an individual's life-long susceptibility to major infectious diseases. In addition, longitudinal biobanks are an interesting source of information for identifying the host genetic variants involved in infectious disease susceptibility and severity. Since infectious diseases continue to act as a selective pressure on our genomes, there is a constant need for a large consortium of biobanks with access to genetic and environmental data to further elucidate the complex mechanisms behind host-pathogen interactions and infectious disease susceptibility.
Runs of Homozygosity in European Populations
McQuillan, Ruth; Leutenegger, Anne-Louise; Abdel-Rahman, Rehab ...
American Journal of Human Genetics, volume 83, issue 3. Journal article, peer reviewed, open access.
Estimating individual genome-wide autozygosity is important both in the identification of recessive disease variants via homozygosity mapping and in the investigation of the effects of genome-wide homozygosity on traits of biomedical importance. Approaches have tended to involve either single-point estimates or rather complex multipoint methods of inferring individual autozygosity, all on the basis of limited marker data. Now, with the availability of high-density genome scans, a multipoint, observational method of estimating individual autozygosity is possible. Using data from a 300,000 SNP panel in 2618 individuals from two isolated and two more-cosmopolitan populations of European origin, we explore the potential of estimating individual autozygosity from data on runs of homozygosity (ROHs). Termed F_roh, this is defined as the proportion of the autosomal genome in runs of homozygosity above a specified length. Mean F_roh distinguishes clearly between subpopulations classified in terms of grandparental endogamy and population size. With the use of good pedigree data for one of the populations (Orkney), F_roh was found to correlate strongly with the inbreeding coefficient estimated from pedigrees (r = 0.86). Using pedigrees to identify individuals with no shared maternal and paternal ancestors in five, and probably at least ten, generations, we show that ROHs measuring up to 4 Mb are common in demonstrably outbred individuals. Given the stochastic variation in ROH number, length, and location and the fact that ROHs are important whether ancient or recent in origin, approaches such as this will provide a more useful description of genomic autozygosity than has hitherto been possible.
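The definition above (the proportion of the autosomal genome in runs of homozygosity above a specified length) can be sketched in a few lines. This is only an illustration under stated assumptions: ROH segments are taken to be already called, non-overlapping, and given as (start, end) base-pair coordinates. The function name, the default length threshold, and the genome length in the example are hypothetical and not taken from the article.

```python
def f_roh(roh_segments, autosome_length_bp, min_length_bp=1_500_000):
    """Proportion of the autosomal genome covered by sufficiently long ROHs.

    roh_segments: iterable of (start_bp, end_bp) pairs, assumed non-overlapping.
    autosome_length_bp: total autosomal length covered by the marker panel.
    min_length_bp: hypothetical length threshold; shorter ROHs are ignored.
    """
    if autosome_length_bp <= 0:
        raise ValueError("autosome_length_bp must be positive")
    # Sum the lengths of the segments that reach the threshold.
    covered = sum(
        end - start
        for start, end in roh_segments
        if end - start >= min_length_bp
    )
    return covered / autosome_length_bp


# Hypothetical individual: three ROHs on a toy 100 Mb genome, 1 Mb threshold.
segments = [(0, 2_000_000), (5_000_000, 5_500_000), (10_000_000, 14_000_000)]
example_f_roh = f_roh(segments, 100_000_000, min_length_bp=1_000_000)
```

Here the 0.5 Mb segment falls below the threshold, so only 6 Mb of the 100 Mb counts and the result is 0.06. Raising the threshold excludes shorter ROHs, which the abstract notes are common even in demonstrably outbred individuals.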
The association between APOE genotype and cognitive function suggests a positive role for the e2 allele and a negative role for the e4 allele. Both alleles have relatively low frequencies in the general population; hence, meta-analyses have been based on many small, heterogeneous studies. Here, we report the APOE-cognition associations in the largest single analysis to date. APOE status and cognitive ability were measured in 18 337 participants from the Generation Scotland study between 2006 and 2011. The age range was 18-94 years with a mean of 47 (SD 15). Four cognitive domains were assessed: verbal declarative memory (paragraph recall), processing speed (digit symbol substitution), verbal fluency (phonemic verbal fluency), and vocabulary (Mill Hill synonyms). Linear regression was used to assess the associations between APOE genetic status and cognition. Possession of the e4 allele was associated with lower scores on the measures of memory and processing speed in subjects aged >60. Across all age ranges, the e4 allele was linked to better verbal fluency scores. In younger subjects (≤60 years) the e4 allele was linked to higher vocabulary scores. There were no associations between the e2 allele and cognitive ability. As seen in previous meta-analyses, the APOE e4 allele is linked to poorer cognitive performance in the domains of memory and processing speed. By contrast, positive associations were seen between the e4 allele and measures of verbal fluency and vocabulary. All associations were relatively small and, in many cases, nominally significant despite the very large sample size.