In the statistical analysis of genome-wide association data, it is challenging to precisely localize the variants that affect complex traits, due to linkage disequilibrium, and to maximize power while limiting spurious findings. Here we report on KnockoffZoom: a flexible method that localizes causal variants at multiple resolutions by testing the conditional associations of genetic segments of decreasing width, while provably controlling the false discovery rate. Our method utilizes artificial genotypes as negative controls and is equally valid for quantitative and binary phenotypes, without requiring any assumptions about their genetic architectures. Instead, we rely on well-established genetic models of linkage disequilibrium. We demonstrate that our method can detect more associations than mixed effects models and achieve fine-mapping precision, at comparable computational cost. Lastly, we apply KnockoffZoom to data from 350k subjects in the UK Biobank and report many new findings.
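As a rough illustration of how artificial negative controls can yield false discovery rate control, the sketch below applies the standard knockoff+ selection rule from the knockoff filter literature. It assumes each tested segment already has a statistic W_j that tends to be large and positive when the real genotypes are more strongly associated than their knockoffs and is symmetric around zero otherwise. The function name and inputs are hypothetical; this is not KnockoffZoom's own code, and how the statistics are computed is left out.

```python
def knockoff_plus_select(w, q=0.1):
    """Return the indices selected by the knockoff+ rule at target FDR q.

    w is a list of knockoff statistics, one per tested hypothesis
    (hypothetical input; computing it is outside this sketch).
    """
    # Candidate thresholds are the distinct nonzero magnitudes of the statistics.
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        # Negative statistics estimate how many false positives exceed t.
        negatives = sum(1 for x in w if x <= -t)
        positives = sum(1 for x in w if x >= t)
        # The "+1" makes the estimate conservative (knockoff+ variant).
        if (1 + negatives) / max(1, positives) <= q:
            return [j for j, x in enumerate(w) if x >= t]
    # No threshold satisfies the bound, so nothing is selected.
    return []
```

The smallest qualifying threshold is used so that as many hypotheses as possible are reported while the estimated proportion of false discoveries stays at or below q.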
In vitro cancer cultures, including three-dimensional organoids, typically contain exclusively neoplastic epithelium but require artificial reconstitution to recapitulate the tumor microenvironment (TME). The co-culture of primary tumor epithelia with endogenous, syngeneic tumor-infiltrating lymphocytes (TILs) as a cohesive unit has been particularly elusive. Here, an air-liquid interface (ALI) method propagated patient-derived organoids (PDOs) from >100 human biopsies or mouse tumors in syngeneic immunocompetent hosts as tumor epithelia with native embedded immune cells (T, B, NK, macrophages). Robust droplet-based, single-cell simultaneous determination of gene expression and immune repertoire indicated that PDO TILs accurately preserved the original tumor T cell receptor (TCR) spectrum. Crucially, human and murine PDOs successfully modeled immune checkpoint blockade (ICB) with anti-PD-1- and/or anti-PD-L1 expanding and activating tumor antigen-specific TILs and eliciting tumor cytotoxicity. Organoid-based propagation of primary tumor epithelium en bloc with endogenous immune stroma should enable immuno-oncology investigations within the TME and facilitate personalized immunotherapy testing.
•Air-liquid interface (ALI) patient-derived tumor organoids (PDO) retain immune cells
•5′ V(D)J and RNA-seq from the same single cells allows robust immune characterization
•T cell receptor repertoire is highly conserved between tumor and PDO
•ALI PDOs functionally recapitulate the PD-1/PD-L1-dependent immune checkpoint
The tumor-immune microenvironment is modeled using a patient-derived organoid approach that preserves the original tumor T cell receptor spectrum and successfully models immune checkpoint blockade.
Although genome-wide association studies (GWASs) have identified numerous loci associated with complex traits, imprecise modeling of the genetic relatedness within study samples may cause substantial inflation of test statistics and possibly spurious associations. Variance component approaches, such as efficient mixed-model association (EMMA), can correct for a wide range of sample structures by explicitly accounting for pairwise relatedness between individuals, using high-density markers to model the phenotype distribution; but such approaches are computationally impractical. We report here a variance component approach implemented in publicly available software, EMMA eXpedited (EMMAX), that reduces the computational time for analyzing large GWAS data sets from years to hours. We apply this method to two human GWAS data sets, performing association analysis for ten quantitative traits from the Northern Finland Birth Cohort and seven common diseases from the Wellcome Trust Case Control Consortium. We find that EMMAX outperforms both principal component analysis and genomic control in correcting for sample structure.
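For context on the inflation of test statistics mentioned above, the following minimal sketch shows how the genomic control baseline is commonly computed: the inflation factor is the median of the observed 1-degree-of-freedom chi-square statistics divided by the theoretical median (about 0.4549), and the statistics are then divided by that factor. This illustrates only the comparison method, not EMMAX itself; the function names are hypothetical.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Estimate the genomic inflation factor (lambda) from 1-df chi-square statistics."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control_adjust(chi2_stats):
    """Deflate the statistics by lambda; lambda below 1 is treated as 1 (no inflation)."""
    lam = max(1.0, genomic_inflation(chi2_stats))
    return [x / lam for x in chi2_stats]
```

A factor close to 1 suggests little inflation. Because genomic control rescales every marker by the same constant, it cannot account for markers that are affected by sample structure to different degrees, which is one motivation for variance component models.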
The distal lung contains terminal bronchioles and alveoli that facilitate gas exchange. Three-dimensional in vitro human distal lung culture systems would strongly facilitate the investigation of pathologies such as interstitial lung disease, cancer and coronavirus disease 2019 (COVID-19) pneumonia caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Here we describe the development of a long-term feeder-free, chemically defined culture system for distal lung progenitors as organoids derived from single adult human alveolar epithelial type II (AT2) or KRT5+ basal cells. AT2 organoids were able to differentiate into AT1 cells, and basal cell organoids developed lumens lined with differentiated club and ciliated cells. Single-cell analysis of KRT5+ cells in basal organoids revealed a distinct population of ITGA6+ITGB4+ mitotic cells, whose offspring further segregated into a TNFRSF12A^hi subfraction that comprised about ten per cent of KRT5+ basal cells. This subpopulation formed clusters within terminal bronchioles and exhibited enriched clonogenic organoid growth activity. We created distal lung organoids with apical-out polarity to present ACE2 on the exposed external surface, facilitating infection of AT2 and basal cultures with SARS-CoV-2 and identifying club cells as a target population. This long-term, feeder-free culture of human distal lung organoids, coupled with single-cell analysis, identifies functional heterogeneity among basal cells and establishes a facile in vitro organoid model of human distal lung infections, including COVID-19-associated pneumonia.
With the rise of both the number and the complexity of traits of interest, control of the false discovery rate (FDR) in genetic association studies has become an increasingly appealing and accepted target for multiple comparison adjustment. While a number of robust FDR-controlling strategies exist, the nature of this error rate is intimately tied to the precise way in which discoveries are counted, and the performance of FDR-controlling procedures is satisfactory only if there is a one-to-one correspondence between what scientists describe as unique discoveries and the number of rejected hypotheses. The presence of linkage disequilibrium between markers in genome-wide association studies (GWAS) often leads researchers to consider the signal associated with multiple neighboring SNPs as indicating the existence of a single genomic locus with possible influence on the phenotype. This a posteriori aggregation of rejected hypotheses results in inflation of the relevant FDR. We propose a novel approach to FDR control that is based on prescreening to identify the level of resolution of distinct hypotheses. We show how FDR-controlling strategies can be adapted to account for this initial selection both with theoretical results and simulations that mimic the dependence structure to be expected in GWAS. We demonstrate that our approach is versatile and useful when the data are analyzed using both tests based on single markers and multiple regression. We provide an R package that allows practitioners to apply our procedure on standard GWAS format data, and illustrate its performance on lipid traits in the Northern Finland Birth Cohort 1966 (NFBC1966) study.
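To make concrete what an FDR-controlling strategy counts, here is a minimal sketch of the classical Benjamini-Hochberg step-up procedure, one of the standard strategies this kind of work builds on. It treats each p-value as one discovery, which is exactly the counting convention that breaks down when neighboring SNPs are later merged into a single locus. It is not the prescreening procedure proposed above, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return the sorted indices of hypotheses rejected by Benjamini-Hochberg at level q."""
    m = len(pvalues)
    # Indices of the hypotheses ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        # Step-up rule: remember the largest rank whose p-value is under its threshold.
        if pvalues[i] <= q * rank / m:
            k = rank
    # Reject the k hypotheses with the smallest p-values.
    return sorted(order[:k])
```

Because the rule is step-up, a p-value that misses its own threshold can still be rejected when a larger p-value further down the ordering passes its threshold.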
The Gene Ontology (GO) is a central resource for functional-genomics research. Scientists rely on the functional annotations in the GO for hypothesis generation and couple it with high-throughput biological data to enhance interpretation of results. At the same time, the sheer number of concepts (>30,000) and relationships (>70,000) presents a challenge: it can be difficult to draw a comprehensive picture of how certain concepts of interest might relate to the rest of the ontology structure. Here we present new visualization strategies to facilitate the exploration and use of the information in the GO. We rely on novel graphical display and software architecture that allow significant interaction. To illustrate the potential of our strategies, we provide examples from high-throughput genomic analyses, including chromatin immunoprecipitation experiments and genome-wide association studies. The scientist can also use our visualizations to identify gene sets that likely experience coordinated changes in their expression and use them to simulate biologically-grounded single cell RNA sequencing data, or conduct power studies for differential gene expression studies using our built-in pipeline. Our software and documentation are available at http://aegis.stanford.edu.
Genome-wide association studies (GWAS) of longitudinal birth cohorts enable joint investigation of environmental and genetic influences on complex traits. We report GWAS results for nine quantitative metabolic traits (triglycerides, high-density lipoprotein, low-density lipoprotein, glucose, insulin, C-reactive protein, body mass index, and systolic and diastolic blood pressure) in the Northern Finland Birth Cohort 1966 (NFBC1966), drawn from the most genetically isolated Finnish regions. We replicate most previously reported associations for these traits and identify nine new associations, several of which highlight genes with metabolic functions: high-density lipoprotein with NR1H3 (LXRA), low-density lipoprotein with AR and FADS1-FADS2, glucose with MTNR1B, and insulin with PANK1. Two of these new associations emerged after adjustment of results for body mass index. Gene-environment interaction analyses suggested additional associations, which will require validation in larger samples. The currently identified loci, together with quantified environmental exposures, explain little of the trait variation in NFBC1966. The association observed between low-density lipoprotein and an infrequent variant in AR suggests the potential of such a cohort for identifying associations with both common, low-impact and rarer, high-impact quantitative trait loci.
Recent advances in genome sequencing and imputation technologies provide an exciting opportunity to comprehensively study the contribution of genetic variants to complex phenotypes. However, our ability to translate genetic discoveries into mechanistic insights remains limited at this point. In this paper, we propose an efficient knockoff-based method, GhostKnockoff, for genome-wide association studies (GWAS) that leads to improved power and ability to prioritize putative causal variants relative to conventional GWAS approaches. The method requires only Z-scores from conventional GWAS and hence can be easily applied to enhance existing and future studies. The method can also be applied to meta-analysis of multiple GWAS allowing for arbitrary sample overlap. We demonstrate its performance using empirical simulations and two applications: (1) a meta-analysis for Alzheimer's disease comprising nine overlapping large-scale GWAS, whole-exome and whole-genome sequencing studies and (2) analysis of 1403 binary phenotypes from the UK Biobank data in 408,961 samples of European ancestry. Our results demonstrate that GhostKnockoff can identify putatively functional variants with weaker statistical effects that are missed by conventional association tests.
Molecular and cellular changes are intrinsic to aging and age-related diseases. Prior cross-sectional studies have investigated the combined effects of age and genetics on gene expression and alternative splicing; however, there has been no long-term, longitudinal characterization of these molecular changes, especially in older age.
We perform RNA sequencing in whole blood from the same individuals at ages 70 and 80 to quantify how gene expression, alternative splicing, and their genetic regulation are altered during this 10-year period of advanced aging at a population and individual level. We observe that individuals are more similar to their own expression profiles later in life than to the profiles of other individuals of the same age. We identify 1291 genes differentially expressed and 294 genes alternatively spliced with age, as well as 529 genes with outlying individual trajectories. Further, we observe a strong correlation of genetic effects on expression and splicing between the two ages, with a small subset of tested genes showing a reduction in genetic associations with expression and splicing in older age.
These findings demonstrate that, although the transcriptome and its genetic regulation are mostly stable late in life, a small subset of genes is dynamic and is characterized by a reduction in genetic regulation, most likely due to increasing environmental variance with age.