Neural demyelination and brain damage accumulated in white matter appear as hyperintense areas on T2-weighted MRI scans in the form of lesions. Modeling binary images at the population level, where each voxel represents the existence of a lesion, plays an important role in understanding aging and inflammatory diseases. We propose a scalable hierarchical Bayesian spatial model, called BLESS, capable of handling binary responses by placing continuous spike-and-slab mixture priors on spatially varying parameters and enforcing spatial dependency on the parameter dictating the amount of sparsity within the probability of inclusion. The use of mean-field variational inference with dynamic posterior exploration, which is an annealing-like strategy that improves optimization, allows our method to scale to large sample sizes. Our method also accounts for underestimation of posterior variance due to variational inference by providing an approximate posterior sampling approach based on Bayesian bootstrap ideas and spike-and-slab priors with random shrinkage targets. Besides accurate uncertainty quantification, this approach is capable of producing novel cluster-size-based imaging statistics, such as credible intervals of cluster size, and measures of reliability of cluster occurrence. Lastly, we validate our results via simulation studies and an application to the UK Biobank, a large-scale lesion mapping study with a sample size of 40,000 subjects. Supplementary materials for this article are available online.
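As a rough illustration of the continuous spike-and-slab idea mentioned in this abstract, the sketch below computes the conditional probability that a coefficient belongs to the wide "slab" component rather than the narrow "spike" component of a two-component normal mixture. It is a generic textbook form, not the BLESS implementation: the function name, the variances, and the fixed prior inclusion probability `theta` are all assumptions made for illustration (the abstract describes the sparsity parameter as spatially dependent rather than fixed).

```python
import math

def inclusion_probability(beta, theta, spike_var, slab_var):
    """P(slab | beta) for the mixture theta*N(0, slab_var) + (1-theta)*N(0, spike_var).

    Hypothetical helper for illustration; assumes 0 < theta < 1 and
    0 < spike_var < slab_var. Works on the log-odds scale for stability.
    """
    log_odds = (
        math.log(theta / (1.0 - theta))                       # prior odds of inclusion
        + 0.5 * math.log(spike_var / slab_var)                # ratio of normalising constants
        + 0.5 * beta * beta * (1.0 / spike_var - 1.0 / slab_var)  # ratio of exponents
    )
    return 1.0 / (1.0 + math.exp(-log_odds))

# A coefficient at zero is attributed mostly to the spike; a large one to the slab.
p_zero = inclusion_probability(0.0, theta=0.5, spike_var=0.01, slab_var=1.0)
p_large = inclusion_probability(2.0, theta=0.5, spike_var=0.01, slab_var=1.0)
print(p_zero, p_large)
```

Because both components are continuous densities, the inclusion probability changes smoothly with the coefficient value, which is one reason this prior family pairs naturally with optimisation-based inference.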
The function of the majority of genes in the human and mouse genomes is unknown. Investigating and illuminating this dark genome is a major challenge for the biomedical sciences. The International Mouse Phenotyping Consortium (IMPC) is addressing this through the generation and broad-based phenotyping of a knockout (KO) mouse line for every protein-coding gene, producing a multidimensional data set that underlies a genome-wide annotation map from genes to phenotypes. Here, we develop a multivariate (MV) statistical approach and apply it to IMPC data comprising 148 phenotypes measured across 4,548 KO lines. There are 4,256 (1.4% of 302,997 observed data measurements) hits called by the univariate (UV) model analysing each phenotype separately, compared to 31,843 (10.5%) hits in the observed data results of the MV model, corresponding to an estimated 7.5-fold increase in power of the MV model relative to the UV model. One key property of the data set is its 55.0% rate of missingness, resulting from quality control filters and incomplete measurement of some KO lines. This raises the question of whether it is possible to infer perturbations at phenotype-gene pairs at which data are not available, i.e., to infer some in vivo effects using statistical analysis rather than experimentation. We demonstrate that, even at missing phenotypes, the MV model can detect perturbations with power comparable to the single-phenotype analysis, thereby filling in the complete gene-phenotype map with good sensitivity. A factor analysis of the MV model's fitted covariance structure identifies 20 clusters of phenotypes, with each cluster tending to be perturbed collectively. These factors cumulatively explain 75% of the KO-induced variation in the data and facilitate biological interpretation of perturbations. We also demonstrate that the MV approach strengthens the correspondence between IMPC phenotypes and existing gene annotation databases.
Analysis of a subset of KO lines measured in replicate across multiple laboratories confirms that the MV model increases power with high replicability.
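The hit rates quoted in this abstract can be reproduced from the reported counts. The short check below uses only the numbers given above; the variable names are illustrative. Note that it computes the ratio of raw hit counts, which is consistent with the quoted 7.5-fold figure but is not necessarily how the authors estimated the increase in power.

```python
# Counts taken from the abstract above.
total_measurements = 302_997
uv_hits = 4_256    # univariate (UV) model
mv_hits = 31_843   # multivariate (MV) model

uv_rate = uv_hits / total_measurements
mv_rate = mv_hits / total_measurements
hit_ratio = mv_hits / uv_hits

print(f"UV hit rate: {uv_rate:.1%}")
print(f"MV hit rate: {mv_rate:.1%}")
print(f"MV/UV hit ratio: {hit_ratio:.1f}")
```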
Background:
In the phase III ASCLEPIOS I and II trials, participants with relapsing multiple sclerosis receiving ofatumumab had significantly better clinical and magnetic resonance imaging (MRI) outcomes than those receiving teriflunomide.
Objectives:
To assess the efficacy and safety of ofatumumab versus teriflunomide in recently diagnosed, treatment-naive (RDTN) participants from ASCLEPIOS.
Methods:
Participants were randomized to receive ofatumumab (20 mg subcutaneously every 4 weeks) or teriflunomide (14 mg orally once daily) for up to 30 months. Endpoints analysed post hoc in the protocol-defined RDTN population included annualized relapse rate (ARR), confirmed disability worsening (CDW), progression independent of relapse activity (PIRA) and adverse events.
Results:
Data were analysed from 615 RDTN participants (ofatumumab: n = 314; teriflunomide: n = 301). Compared with teriflunomide, ofatumumab reduced ARR by 50% (rate ratio (95% confidence interval (CI)): 0.50 (0.33, 0.74); p < 0.001), and delayed 6-month CDW by 46% (hazard ratio (HR; 95% CI): 0.54 (0.30, 0.98); p = 0.044) and 6-month PIRA by 56% (HR: 0.44 (0.20, 1.00); p = 0.049). Safety findings were manageable and consistent with those of the overall ASCLEPIOS population.
Conclusion:
The favourable benefit–risk profile of ofatumumab versus teriflunomide supports its consideration as a first-line therapy in RDTN patients.
ASCLEPIOS I and II are registered at ClinicalTrials.gov (NCT02792218 and NCT02792231).
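The percentage reductions in the Results of this abstract follow from the reported rate and hazard ratios through the usual conversion, relative reduction = 1 minus the ratio. The snippet below applies that conversion to the three point estimates quoted above; the helper name is an illustrative choice, and the calculation says nothing about the confidence intervals.

```python
def percent_reduction(ratio: float) -> float:
    """Relative reduction (in percent) implied by a rate ratio or hazard ratio."""
    return (1.0 - ratio) * 100.0

# Point estimates as reported in the abstract above.
reported = {
    "ARR rate ratio": 0.50,
    "6-month CDW hazard ratio": 0.54,
    "6-month PIRA hazard ratio": 0.44,
}

for name, ratio in reported.items():
    print(f"{name} = {ratio:.2f}: {percent_reduction(ratio):.0f}% reduction")
```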
Background:
The Oxford Big Data Institute, multiple sclerosis (MS) physicians and Novartis aim to address unresolved questions in MS with a novel comprehensive clinical trial data set.
Objective:
The objective of this study is to describe the Novartis–Oxford MS (NO.MS) data set and to explore the relationships between age, disease activity and disease worsening across MS phenotypes.
Methods:
We report key characteristics of NO.MS. We modelled MS lesion formation, relapse frequency, brain volume change and disability worsening cross-sectionally, as a function of patients’ baseline age, using phase III study data (≈8000 patients).
Results:
NO.MS contains data of ≈35,000 patients (>200,000 brain images from ≈10,000 patients), with >10 years follow-up. (1) Focal disease activity is highest in paediatric patients and decreases with age, (2) brain volume loss is similar across age and phenotypes and (3) the youngest patients have the lowest likelihood (<25%) of disability worsening over 2 years while risk is higher (25%–75%) in older, disabled or progressive MS patients. Young patients benefit most from treatment.
Conclusion:
NO.MS will illuminate questions related to MS characterisation, progression and prognosis. Age modulates relapse frequency and, thus, the phenotypic presentation of MS. Disease worsening across all phenotypes is mediated by age and appears to be, to some extent, independent of new focal inflammatory activity.
Obesity is a heritable disease, characterised by excess adiposity that is measured by body mass index (BMI). While over 1,000 genetic loci are associated with BMI, less is known about the genetic contribution to adiposity trajectories over adulthood. We derive adiposity-change phenotypes from 24.5 million primary-care health records in over 740,000 individuals in the UK Biobank, Million Veteran Program USA, and Estonian Biobank, to discover and validate the genetic architecture of adiposity trajectories. Using multiple BMI measurements over time increases power to identify genetic factors affecting baseline BMI by 14%. In the largest reported genome-wide study of adiposity-change in adulthood, we identify novel associations with BMI-change at six independent loci, including rs429358 (APOE missense variant). The SNP-based heritability of BMI-change (1.98%) is 9-fold lower than that of BMI. The modest genetic correlation between BMI-change and BMI (45.2%) indicates that genetic studies of longitudinal trajectories could uncover novel biology of quantitative traits in adulthood.
Imaging genetics analyses use neuroimaging traits as intermediate phenotypes to infer the degree of genetic contribution to brain structure and function in health and/or illness. Coefficients of relatedness (CR) summarize the degree of genetic similarity among subjects and are used to estimate the heritability – the proportion of phenotypic variance explained by genetic factors. The CR can be inferred directly from genome-wide genotype data to explain the degree of shared variation in common genetic polymorphisms (SNP-heritability) among related or unrelated subjects. We developed a central processing and graphics processing unit (CPU and GPU) accelerated Fast and Powerful Heritability Inference (FPHI) approach that linearizes likelihood calculations to overcome the ∼N^(2–3) dependency of computational effort on sample size in classical likelihood approaches. We calculated heritability for 60 regional and 1.3 × 10^5 voxel-wise traits in N = 1,206 twin and sibling participants from the Human Connectome Project (HCP) (550 M/656 F, age = 28.8 ± 3.7 years) and N = 37,432 (17,531 M/19,901 F; age = 63.7 ± 7.5 years) participants from the UK Biobank (UKBB). The FPHI estimates were in excellent agreement with heritability values calculated using Genome-wide Complex Trait Analysis software (r = 0.96 and 0.98 in the HCP and UKBB samples) while significantly reducing computational effort (by 10^2 to 10^4 times). The regional and voxel-wise trait heritability estimates for the HCP and UKBB were likewise in excellent agreement (r = 0.63–0.76, p < 10^−10). In summary, the hardware-accelerated FPHI made it practical to calculate heritability values for voxel-wise neuroimaging traits, even in very large samples such as the UKBB. The patterns of additive genetic variance in neuroimaging traits measured in a large sample of related and unrelated individuals showed excellent agreement regardless of the estimation method. The code and instructions to execute these analyses are available at www.solar-eclipse-genetics.org.
Imaging genetic analyses quantify genetic control over quantitative measurements of brain structure and function using coefficients of relationship (CR) that code the degree of shared genetics between subjects. CR can be inferred through self‐reported relatedness or calculated empirically using genome‐wide SNP scans. We hypothesized that empirical CR provides a more accurate assessment of shared genetics than self‐reported relatedness. We tested this in 1,046 participants of the Human Connectome Project (HCP) (480 M/566 F) recruited from the Missouri twin registry. We calculated the heritability for 17 quantitative traits drawn from four categories (brain diffusion and structure, cognition, and body physiology) documented by the HCP. We compared the heritability and genetic correlation estimates calculated using self‐reported CR and two empirical CR methods, Kinship‐based INference for GWAS (KING) and weighted allelic correlation (WAC). The polygenic nature of traits was assessed by calculating the empirical CR from chromosomal SNP sets. The heritability estimates based on whole‐genome empirical CR were higher but remained significantly correlated (r ∼0.9) with those obtained using self‐reported values. Population stratification in the HCP sample has likely influenced the empirical CR calculations and biased heritability estimates. Heritability values calculated using empirical CR for chromosomal SNP sets were significantly correlated with the chromosomal length (r ∼0.7), suggesting a polygenic nature for these traits. The chromosomal heritability patterns were correlated among traits from the same knowledge domains; among traits with significant genetic correlations; and among traits sharing biological processes, without being genetically related. The pedigree structures generated in our analyses are available online as a web‐based calculator (www.solar-eclipse-genetics.org/HCP).
Abstract
Background
Novartis and the University of Oxford’s Big Data Institute (BDI) have established a research alliance with the aim of improving health care and drug development by making it more efficient and targeted. Using a combination of the latest statistical machine learning technology and an innovative IT platform developed to manage large volumes of anonymised data from numerous data sources and types, we plan to identify novel patterns with clinical relevance that cannot be detected by humans alone, in order to identify phenotypes and early predictors of patient disease activity and progression.
Method
The collaboration focuses on highly complex autoimmune diseases and develops a computational framework to assemble a research-ready dataset across numerous modalities. For the Multiple Sclerosis (MS) project, the collaboration has anonymised and integrated phase II to phase IV clinical and imaging trial data from ≈35,000 patients across all clinical phenotypes and collected in more than 2200 centres worldwide. For the “IL-17” project, the collaboration has anonymised and integrated clinical and imaging data from over 30 phase II and III Cosentyx clinical trials including more than 15,000 patients, suffering from four autoimmune disorders (Psoriasis, Axial Spondyloarthritis, Psoriatic arthritis (PsA) and Rheumatoid arthritis (RA)).
Results
A fundamental component of successful data analysis and the collaborative development of novel machine learning methods on these rich data sets has been the construction of a research informatics framework that can capture the data at regular intervals where images could be anonymised and integrated with the de-identified clinical data, quality controlled and compiled into a research-ready relational database which would then be available to multi-disciplinary analysts. The collaborative development from a group of software developers, data wranglers, statisticians, clinicians, and domain scientists across both organisations has been key. This framework is innovative, as it facilitates collaborative data management and makes a complicated clinical trial data set from a pharmaceutical company available to academic researchers who become associated with the project.
Conclusions
An informatics framework has been developed to capture clinical trial data into a pipeline of anonymisation, quality control, data exploration, and subsequent integration into a database. Establishing this framework has been integral to the development of analytical tools.
Genome-wide association (GWA) analysis of brain imaging phenotypes can advance our understanding of the genetic basis of normal and disorder-related variation in the brain. GWA approaches typically use linear mixed effect models to account for non-independence amongst subjects due to factors such as family relatedness and population structure. The use of these models with high-dimensional imaging phenotypes presents enormous challenges in terms of computational intensity and the need to account for multiple testing in both the imaging and genetic domains. Here we present a method that makes mixed models practical with high-dimensional traits by a combination of a transformation applied to the data and model, and the use of a non-iterative variance component estimator. With such speed enhancements, permutation tests are feasible, which allows inference on powerful spatial tests like the cluster size statistic.
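To make the idea of "a transformation plus a non-iterative variance component estimator" more concrete, here is a minimal sketch of one way such a scheme can look. It is not taken from the paper and may differ from the authors' estimator. It assumes the standard model in which the trait covariance is sigma_g^2 * K + sigma_e^2 * I for a known kinship matrix K, rotates the data by the eigenvectors of K so that the covariance becomes diagonal, and then regresses squared ordinary-least-squares residuals on the eigenvalues in a single step. The function name `one_step_h2` and its arguments are hypothetical.

```python
import numpy as np

def one_step_h2(y, X, K):
    """Sketch of a non-iterative heritability estimate (hypothetical helper).

    y: (n,) trait vector, X: (n, p) fixed-effect design, K: (n, n) kinship matrix.
    Assumes Cov(y) = s2_g * K + s2_e * I.
    """
    lam, U = np.linalg.eigh(K)                 # K = U diag(lam) U^T
    y_r, X_r = U.T @ y, U.T @ X                # rotated data have diagonal covariance
    beta, *_ = np.linalg.lstsq(X_r, y_r, rcond=None)   # OLS fit of fixed effects
    resid2 = (y_r - X_r @ beta) ** 2           # E[resid2_i] is roughly s2_e + s2_g * lam_i
    Z = np.column_stack([np.ones_like(lam), lam])
    (s2_e, s2_g), *_ = np.linalg.lstsq(Z, resid2, rcond=None)  # one moment regression
    s2_e, s2_g = max(s2_e, 0.0), max(s2_g, 0.0)        # truncate negative estimates at zero
    total = s2_e + s2_g
    return s2_g / total if total > 0 else 0.0
```

In a sketch like this the eigendecomposition of K is computed once and can be reused for every imaging trait and every permutation, so the per-trait cost reduces to a few matrix-vector products. That kind of reuse is what makes permutation-based inference over many voxels plausible.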