Cells from the same individual share common genetic and environmental backgrounds and are not statistically independent; therefore, they are subsamples or pseudoreplicates. Thus, single-cell data have a hierarchical structure that many current single-cell methods do not address, leading to biased inference, highly inflated type 1 error rates, and reduced robustness and reproducibility. This includes methods that use a batch effect correction for individual as a means of accounting for within-sample correlation. Here, we document this dependence across a range of cell types and show that pseudo-bulk aggregation methods are conservative and underpowered relative to mixed models. To compute differential expression within a specific cell type across treatment groups, we propose applying generalized linear mixed models with a random effect for individual, to properly account for both zero inflation and the correlation structure among measures from cells within an individual. Finally, we provide power estimates across a range of experimental conditions to assist researchers in designing appropriately powered studies.
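The cost of treating cells as independent can be made concrete with the standard design-effect formula for clustered data. The sketch below assumes n individuals, m cells per individual, a common within-individual correlation ρ, and a generic random-intercept model with link function g; these symbols are illustrative, do not appear in the abstract, and are not the authors' exact specification (which also addresses zero inflation).

```latex
% Generic random-intercept GLMM: cell j from individual i, treatment indicator x_i
\[
g\bigl(\mathbb{E}[y_{ij} \mid b_i]\bigr) = \beta_0 + \beta_1 x_i + b_i,
\qquad b_i \sim \mathcal{N}(0, \sigma_b^2)
\]

% Design effect for equally sized clusters with intra-individual correlation \rho
\[
\operatorname{Var}(\bar{y}) = \frac{\sigma^2}{nm}\,\bigl[1 + (m-1)\rho\bigr],
\qquad
n_{\mathrm{eff}} = \frac{nm}{1 + (m-1)\rho}
\]
```

For example, with ρ = 0.5 and m = 50 the design effect is 25.5, so 500 cells from 10 individuals carry roughly the information of 20 independent observations.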
Human genetic diversity is the result of population genetic forces. This genetic variation influences disease risk and contributes to health disparities. Autoimmune diseases (ADs) are a family of complex heterogeneous disorders with similar underlying mechanisms characterized by immune responses against self. Collectively, ADs are common, exhibit gender and ethnic disparities, and are increasing in incidence. As natural selection is an important influence on human genetic variation, and immune function genes are enriched for signals of positive selection, it is thought that the prevalence of AD risk alleles seen in different populations is partially the result of differing selective pressures (for example, due to pathogens). With the advent of high-throughput technologies, new analytical methodologies and large-scale projects, evidence for the role of natural selection in contributing to the heritable component of ADs keeps growing. This review summarizes the genetic regions associated with susceptibility to different ADs and concomitant evidence for selection, including known agents of selection exerting selective pressure in these regions. Examples of specific adaptive variants with phenotypic effects are included as evidence of natural selection increasing AD susceptibility. Many of the complexities of gene effects in different ADs can be explained by population genetics phenomena. Integrating AD susceptibility studies with population genetics to investigate how natural selection has contributed to genetic variation that influences disease risk will help to identify functional variants and elucidate biological mechanisms. As such, the study of population genetics in human populations holds untapped potential for elucidating the genetic causes of human disease and moving more rapidly toward personalized medicine.
South America has a complex demographic history shaped by multiple migration and admixture events in pre- and post-colonial times. Settled over 14,000 years ago by Native Americans, South America has experienced migrations of European and African individuals, similar to other regions in the Americas. However, the timing and magnitude of these events resulted in markedly different patterns of admixture throughout Latin America. We use genome-wide SNP data for 437 admixed individuals from 5 countries (Colombia, Ecuador, Peru, Chile, and Argentina) to explore the population structure and demographic history of South American Latinos. We combined these data with population reference panels from Africa, Asia, Europe and the Americas to perform global ancestry analysis and infer the subcontinental origin of the European and Native American ancestry components of the admixed individuals. By applying ancestry-specific PCA analyses we find that most of the European ancestry in South American Latinos is from the Iberian Peninsula; however, many individuals trace their ancestry back to Italy, especially within Argentina. We find a strong gradient in the Native American ancestry component of South American Latinos associated with country of origin and the geography of local indigenous populations. For example, Native American genomic segments in Peruvians show greater affinities with Andean indigenous peoples like Quechua and Aymara, whereas Native American haplotypes from Colombians tend to cluster with Amazonian and coastal tribes from northern South America. Using ancestry tract length analysis we modeled post-colonial South American migration history as the youngest in Latin America during European colonization (9-14 generations ago), with an additional strong pulse of European migration occurring between 3 and 9 generations ago.
These genetic footprints can impact our understanding of population-level differences in biomedical traits and, thus, inform future medical genetic studies in the region.
Full text
Available for:
DOBA, IZUM, KILJ, NUK, PILJ, PNG, SAZU, SIK, UILJ, UKNU, UL, UM, UPUK
Study design is a critical aspect of any experiment, and sample size calculations for statistical power that are consistent with that study design are central to robust and reproducible results. However, the existing power calculators for tests of differential expression in single-cell RNA-seq data focus on the total number of cells and not the number of independent experimental units, the true unit of interest for power. Thus, current methods grossly overestimate the power.
Hierarchicell is the first single-cell power calculator to explicitly simulate and account for the hierarchical correlation structure (i.e., within-sample correlation) that exists in single-cell RNA-seq data. Hierarchicell, an R package available on GitHub, estimates the within-sample correlation structure from real data to simulate hierarchical single-cell RNA-seq data and estimate power for tests of differential expression. This multi-stage approach models gene dropout rates, intra-individual dispersion, inter-individual variation, variable or fixed number of cells per individual, and the correlation among cells within an individual. We demonstrate that estimates of power are falsely inflated when the within-sample correlation structure is not modeled and the correlation is not properly accounted for in downstream analysis. Hierarchicell can be used to estimate power for binary and continuous phenotypes based on a user-specified number of independent experimental units (e.g., individuals) and cells within the experimental unit.
Hierarchicell is a user-friendly R package that provides accurate estimates of power for testing hypotheses of differential expression in single-cell RNA-seq data. This R package represents an important addition to single-cell RNA analytic tools and will help researchers design experiments with appropriate and accurate power, increasing discovery and improving robustness and reproducibility.
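The inflation described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not Hierarchicell's algorithm or API: it uses Gaussian values instead of count data, a made-up design (10 individuals per group, 50 cells each, equal between- and within-individual standard deviations), and hard-coded t critical values for that design. All function names are hypothetical. No group effect is simulated, so every rejection is a false positive.

```python
import math
import random


def mean(xs):
    return sum(xs) / len(xs)


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    pooled_var = ss / (na + nb - 2)
    return (ma - mb) / math.sqrt(pooled_var * (1 / na + 1 / nb))


def simulate_group(rng, n_ind, n_cells, sd_between, sd_within):
    """Return one list of cell values per individual (no treatment effect)."""
    group = []
    for _ in range(n_ind):
        shift = rng.gauss(0.0, sd_between)  # shared by all cells of this individual
        group.append([shift + rng.gauss(0.0, sd_within) for _ in range(n_cells)])
    return group


def false_positive_rates(n_sim=400, seed=1):
    # Assumed design: 10 individuals per group, 50 cells each, ICC = 0.5.
    n_ind, n_cells, sd_between, sd_within = 10, 50, 1.0, 1.0
    # Two-sided 5% critical values of Student's t, valid for this design only.
    crit_cells = 1.962        # df = 2 * 10 * 50 - 2 = 998
    crit_individuals = 2.101  # df = 2 * 10 - 2 = 18
    rng = random.Random(seed)
    hits_cells = 0
    hits_individuals = 0
    for _ in range(n_sim):
        g1 = simulate_group(rng, n_ind, n_cells, sd_between, sd_within)
        g2 = simulate_group(rng, n_ind, n_cells, sd_between, sd_within)
        # Cell-level test: treats every cell as an independent observation.
        cells1 = [x for ind in g1 for x in ind]
        cells2 = [x for ind in g2 for x in ind]
        if abs(pooled_t(cells1, cells2)) > crit_cells:
            hits_cells += 1
        # Individual-level test: one mean per individual (pseudo-bulk style).
        means1 = [mean(ind) for ind in g1]
        means2 = [mean(ind) for ind in g2]
        if abs(pooled_t(means1, means2)) > crit_individuals:
            hits_individuals += 1
    return hits_cells / n_sim, hits_individuals / n_sim


rate_cells, rate_individuals = false_positive_rates()
print(f"cell-level false positive rate:       {rate_cells:.2f}")
print(f"individual-level false positive rate: {rate_individuals:.2f}")
```

Under these assumptions the cell-level test rejects far more often than the nominal 5%, while the individual-level test stays close to it. A power calculation that counts cells rather than individuals inherits the same optimism.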
Intracerebral haemorrhage and small vessel ischaemic stroke (SVS) are the most acute manifestations of cerebral small vessel disease, with no established preventive approaches beyond hypertension management. Combined genome-wide association study (GWAS) of these two correlated diseases may improve statistical power to detect novel genetic factors for cerebral small vessel disease, elucidating underlying disease mechanisms that may form the basis for future treatments. Because intracerebral haemorrhage location is an adequate surrogate for distinct histopathological variants of cerebral small vessel disease (lobar for cerebral amyloid angiopathy and non-lobar for arteriolosclerosis), we performed GWAS of intracerebral haemorrhage by location in 1813 subjects (755 lobar and 1005 non-lobar) and 1711 stroke-free control subjects. Intracerebral haemorrhage GWAS results by location were meta-analysed with GWAS results for SVS from MEGASTROKE, using 'Multi-Trait Analysis of GWAS' (MTAG) to integrate summary data across traits and generate combined effect estimates. After combining intracerebral haemorrhage and SVS datasets, our sample size included 241 024 participants (6255 intracerebral haemorrhage or SVS cases and 233 058 control subjects). Genome-wide significant associations were observed for non-lobar intracerebral haemorrhage enhanced by SVS with rs2758605 MTAG P-value (P) = 2.6 × 10⁻⁸ at 1q22; rs72932727 (P = 1.7 × 10⁻⁸) at 2q33; and rs9515201 (P = 5.3 × 10⁻¹⁰) at 13q34. In the GTEx gene expression library, rs2758605 (1q22), rs72932727 (2q33) and rs9515201 (13q34) are significant cis-eQTLs for PMF1 (P = 1 × 10⁻⁴ in tibial nerve), NBEAL1, FAM117B and CARF (P < 2.1 × 10⁻⁷ in arteries) and COL4A2 and COL4A1 (P < 0.01 in brain putamen), respectively.
Leveraging S-PrediXcan for gene-based association testing with the predicted expression models in tissues related to nerve, artery, and non-lobar brain, we found experiment-wide significant (P < 8.5 × 10⁻⁷) associations at three genes at 2q33, including NBEAL1, FAM117B and WDR12, and genome-wide significant associations at two genes, including ICA1L at 2q33 and ZCCHC14 at 16q24. Brain cell-type specific expression profiling libraries reveal that SEMA4A, SLC25A44 and PMF1 at 1q22 and COL4A1 and COL4A2 at 13q34 were mainly expressed in endothelial cells, while the genes at 2q33 (FAM117B, CARF and NBEAL1) were expressed in various cell types including astrocytes, oligodendrocytes and neurons. Our cross-phenotype genetic study of intracerebral haemorrhage and SVS demonstrates novel genome-wide associations for non-lobar intracerebral haemorrhage at 2q33 and 13q34. Our replication of the 1q22 locus previously seen in traditional GWAS of intracerebral haemorrhage, as well as the rediscovery of 13q34, which had previously been reported in candidate gene studies with other cerebral small vessel disease-related traits, strengthens the credibility of applying this novel genome-wide approach across intracerebral haemorrhage and SVS.
Type 2 diabetes (T2D)-associated end-stage kidney disease (ESKD) is a complex disorder resulting from the combined influence of genetic and environmental factors. This study contains a comprehensive genetic analysis of putative nephropathy loci in 965 African American (AA) cases with T2D-ESKD and 1029 AA population-based controls extending prior findings. Analysis was based on 4,341 directly genotyped and imputed single nucleotide polymorphisms (SNPs) in 22 nephropathy candidate genes. After admixture adjustment and correction for multiple comparisons, 37 SNPs across eight loci were significantly associated (1.6E-05<P(emp)<0.049). Among these, variants in MYH9 were the most significant (1.6E-05<P(emp)<0.049), followed by additional chromosome 22 loci (APOL1, SFI1, and LIMK2). Nominal signals were observed in AGTR1, RPS12, CHN2 and CNDP1. Additional adjustment for APOL1 G1/G2 risk variants attenuated association at MYH9 (P(emp) = 0.00026-0.043) while marginally improving significance of other APOL1 SNPs (rs136161, rs713753, and rs767855; P(emp) = 0.0060-0.037); association at other loci was markedly reduced except for CHN2 (chimerin; rs17157914, P(emp) = 0.029). In addition, SNPs in other candidate loci (FRMD3 and TRPC6) trended toward association with T2D-ESKD (P(emp)<0.05). These results suggest that risk contributed by putative nephropathy genes is shared across populations of African and European ancestry.
Genome-wide association studies show there is a genetic component to severe COVID-19. We find evidence that the genome-wide genetic association signal with severe COVID-19 is correlated with that of systemic lupus erythematosus (SLE), having formally tested this using genetic correlation analysis by LD score regression. To identify the shared associated loci and gain insight into the shared genetic effects, we used summary-level data to perform meta-analyses, a local genetic correlation analysis, and fine-mapping using stepwise regression and functional annotation. This identified multiple loci shared between the two traits, some of which exert opposing effects. The locus with most evidence of shared association is TYK2, a gene critical to the type I interferon pathway, where the local genetic correlation is negative. Another shared locus, where the direction of effects is aligned, is CLEC1A, which encodes a lectin involved in cell signaling and the anti-fungal immune response. Our analyses suggest that several loci with reciprocal effects between the two traits have a role in the defense response pathway, adding to the evidence that SLE risk alleles are protective against infection.
Systemic lupus erythematosus (SLE) is a clinically heterogeneous, systemic autoimmune disease characterized by autoantibody formation. Previously published genome-wide association studies (GWAS) have investigated SLE as a single phenotype. Therefore, we conducted a GWAS to identify genetic factors associated with anti-dsDNA autoantibody production, an SLE-related autoantibody with diagnostic and clinical importance. Using two independent datasets, over 400,000 single nucleotide polymorphisms (SNPs) were studied in a total of 1,717 SLE cases and 4,813 healthy controls. Anti-dsDNA autoantibody positive (anti-dsDNA +, n = 811) and anti-dsDNA autoantibody negative (anti-dsDNA -, n = 906) SLE cases were compared to healthy controls and to each other to identify SNPs associated specifically with these SLE subtypes. SNPs in the previously identified SLE susceptibility loci STAT4, IRF5, ITGAM, and the major histocompatibility complex were strongly associated with anti-dsDNA + SLE. Far fewer and weaker associations were observed for anti-dsDNA - SLE. For example, rs7574865 in STAT4 had an OR for anti-dsDNA + SLE of 1.77 (95% CI 1.57-1.99, p = 2.0E-20) compared to an OR for anti-dsDNA - SLE of 1.26 (95% CI 1.12-1.41, p = 2.4E-04), with p(heterogeneity)<0.0005. SNPs in the SLE susceptibility loci BANK1, KIAA1542, and UBE2L3 showed evidence of association with anti-dsDNA + SLE and were not associated with anti-dsDNA - SLE. In conclusion, we identified differential genetic associations with SLE based on anti-dsDNA autoantibody production. Many previously identified SLE susceptibility loci may confer disease risk through their role in autoantibody production and be more accurately described as autoantibody propensity loci. Lack of strong SNP associations may suggest that other types of genetic variation or non-genetic factors such as environmental exposures have a greater impact on susceptibility to anti-dsDNA - SLE.
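The STAT4 example can be checked approximately from the reported odds ratios and confidence intervals alone. The sketch below is not the authors' heterogeneity test. It assumes the intervals are Wald intervals on the log scale and treats the two estimates as independent, which they are not strictly, since both subtypes were compared against the same controls. The function names are hypothetical.

```python
import math


def log_or_and_se(or_value, ci_low, ci_high, z_crit=1.96):
    """Recover log(OR) and its standard error from a 95% Wald interval."""
    log_or = math.log(or_value)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return log_or, se


def difference_test(est1, est2):
    """z-test for a difference between two log odds ratios (independence assumed)."""
    (b1, se1), (b2, se2) = est1, est2
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p


# Values reported in the abstract for rs7574865 in STAT4.
positive = log_or_and_se(1.77, 1.57, 1.99)  # anti-dsDNA + SLE vs. controls
negative = log_or_and_se(1.26, 1.12, 1.41)  # anti-dsDNA - SLE vs. controls
z, p = difference_test(positive, negative)
print(f"z = {z:.2f}, two-sided p = {p:.1e}")
```

Under these assumptions the approximate p-value falls below 0.0005, in line with the reported p(heterogeneity).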
In spite of the well-known clustering of multiple autoimmune disorders in families, analyses of specific shared genes and polymorphisms between systemic lupus erythematosus (SLE) and other autoimmune diseases (ADs) have been limited. Therefore, we comprehensively tested autoimmune variants for association with SLE, aiming to identify pleiotropic genetic associations between these diseases. We compiled a list of 446 non-Major Histocompatibility Complex (MHC) variants identified in genome-wide association studies (GWAS) of populations of European ancestry across 17 ADs. We then tested these variants in our combined Caucasian SLE cohorts of 1,500 cases and 5,706 controls. We tested a subset of these polymorphisms in an independent Caucasian replication cohort of 2,085 SLE cases and 2,854 controls, allowing the computation of a meta-analysis between all cohorts. We have uncovered novel shared SLE loci that passed multiple comparisons adjustment, including the VTCN1 (rs12046117, P = 2.02 × 10⁻⁶) region. We observed that the loci shared among the most ADs include IL23R, OLIG3/TNFAIP3, and IL2RA. Given the lack of a universal autoimmune risk locus outside of the MHC and variable specificities for different diseases, our data suggest partial pleiotropy among ADs. Hierarchical clustering of ADs suggested that the most genetically related ADs appear to be type 1 diabetes with rheumatoid arthritis and Crohn's disease with ulcerative colitis. These findings support a relatively distinct genetic susceptibility for SLE. For many of the shared GWAS autoimmune loci, we found no evidence for association with SLE, including IL23R. Also, several established SLE loci are apparently not associated with other ADs, including the ITGAM-ITGAX and TNFSF4 regions.
This study represents the most comprehensive evaluation of shared autoimmune loci to date, supports a relatively distinct non-MHC genetic susceptibility for SLE, provides further evidence for previously and newly identified shared genes in SLE, and highlights the value of studies of potentially pleiotropic genes in autoimmune diseases.