DNA methylation is an important epigenetic mark that modulates gene expression by inhibiting the binding of transcriptional proteins to DNA. As in many other omics experiments, missing values are a major issue, and appropriate imputation techniques are needed to avoid an unnecessary reduction in sample size and to make optimal use of the information collected. We consider the case where relatively few samples are processed via an expensive high-density whole genome bisulfite sequencing (WGBS) strategy and a larger number of samples is processed using more affordable low-density, array-based technologies. In such cases, one can impute the low-coverage (array-based) methylation data using the high-density information provided by the WGBS samples. In this paper, we propose an efficient Linear Model of Coregionalisation with informative Covariates (LMCC) to predict missing values based on observed values and covariates. Our model assumes that at each site, the methylation vector of all samples is linked to a set of fixed factors (covariates) and a set of latent factors. Furthermore, we exploit the functional nature of the data and the spatial correlation across sites by assuming Gaussian processes on the fixed and latent coefficient vectors, respectively. Our simulations show that the use of covariates can significantly improve the accuracy of imputed values, especially in cases where missing data contain some relevant information about the explanatory variable. We also show that our proposed model is particularly efficient when the number of columns is much greater than the number of rows, which is usually the case in methylation data analysis. Finally, we apply our proposed method to two real methylation datasets and compare it with alternative approaches, showing how covariates such as cell type, tissue type or age can enhance the accuracy of imputed values.
Several reports have described cortical thickness (CTh) developmental trajectories, with conflicting results. Some studies have reported inverted-U-shaped curves with peaks of CTh in late childhood to adolescence, while others suggested a predominantly monotonic decline after age 6. In this study, we reviewed CTh developmental trajectories in the NIH MRI Study of Normal Brain Development and, in a second step, evaluated the impact of post-processing quality control (QC) procedures on identified trajectories. The quality-controlled sample included 384 individual subjects with repeated scanning (1–3 scans per subject, total scans n=753) from 4.9 to 22.3 years of age. The best-fit model (cubic, quadratic, or first-order linear) was identified at each vertex using mixed-effects models. The majority of brain regions showed a linear monotonic decline of CTh. There were few areas with cubic trajectories, mostly in bilateral temporo-parietal areas and the right prefrontal cortex, in which CTh peaks were at, or prior to, age 8. When controlling for total brain volume, CTh trajectories were even more uniformly linear. The only sex difference was faster thinning of occipital areas in boys compared to girls. The best-fit model for whole brain mean thickness was a monotonic decline of 0.027 mm per year. QC procedures had a significant impact on identified trajectories, with a clear shift toward more complex trajectories (i.e., quadratic or cubic) when including all scans without QC (n=954). Trajectories were almost exclusively linear when using only scans that passed the most stringent QC (n=598). The impact of QC probably relates to decreasing the inclusion of scans with CTh underestimation secondary to movement artifacts, which are more common in younger subjects. In summary, our results suggest that CTh follows a simple linear decline in most cortical areas by age 5, and in all areas by age 8.
This study further supports the crucial importance of implementing post-processing QC in CTh studies of development, aging, and neuropsychiatric disorders.
• Cortical thickness follows mostly a monotonic linear decline after 5 years of age.
• Areas with cubic developmental trajectories have peaks of cortical thickness prior to age 8.
• The only sex difference in maturation is faster occipital thinning in males.
• Mean cortical thickness follows a monotonic linear decline of 0.027 mm per year.
• Quality control processes have a significant impact on identified trajectories.
• A post-processing quality control should be applied in all cortical thickness studies.
Motivated by a DNA methylation application, this article addresses the problem of fitting and inferring a multivariate binomial regression model for outcomes that are contaminated by errors and exhibit extra‐parametric variations, also known as dispersion. While dispersion in univariate binomial regression has been extensively studied, addressing dispersion in the context of multivariate outcomes remains a complex and relatively unexplored task. The complexity arises from a noteworthy data characteristic observed in our motivating dataset: non‐constant yet correlated dispersion across outcomes. To address this challenge and account for possible measurement error, we propose a novel hierarchical quasi‐binomial varying coefficient mixed model, which enables flexible dispersion patterns through a combination of additive and multiplicative dispersion components. To maximize the Laplace‐approximated quasi‐likelihood of our model, we further develop a specialized two‐stage expectation‐maximization (EM) algorithm, where a plug‐in estimate for the multiplicative scale parameter enhances the speed and stability of the EM iterations. Simulations demonstrated that our approach yields accurate inference for smooth covariate effects and exhibits excellent power in detecting non‐zero effects. Additionally, we applied our proposed method to investigate the association between DNA methylation, measured across the genome through targeted custom capture sequencing of whole blood, and levels of anti‐citrullinated protein antibodies (ACPA), a preclinical marker for rheumatoid arthritis (RA) risk. Our analysis revealed 23 significant genes that potentially contribute to ACPA‐related differential methylation, highlighting the relevance of cell signaling and collagen metabolism in RA. We implemented our method in the R Bioconductor package called “SOMNiBUS.”
Introduction:
Recent literature proposes that amyloid-β and phosphorylated tau (p-tau) synergism accelerates biomarker abnormalities in controls. Yet, it remains to be answered whether this synergism is the driving force behind Alzheimer disease (AD) dementia.
Methods:
We stratified 314 mild cognitive impairment individuals using [18F]florbetapir positron emission tomography amyloid-β imaging and cerebrospinal fluid p-tau. Regression and voxel-based logistic regression models with interaction terms evaluated 2-year changes in cognition and clinical status as a function of baseline biomarkers.
Results:
We found that the synergism between [18F]florbetapir and p-tau, rather than their additive effects, was associated with cognitive decline and progression to AD. Furthermore, voxel-based analysis revealed that the temporal and inferior parietal regions were where the synergism determined an increased likelihood of developing AD.
Discussion:
Together, the present results support the view that progression to AD dementia is driven by a synergistic rather than a merely additive effect between amyloid-β and p-tau proteins.
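To make the distinction between additive and synergistic effects concrete, a logistic model with an interaction term can be written as below. This is a generic sketch, not the authors' exact specification: the symbols A (baseline florbetapir amyloid-β load), T (baseline cerebrospinal fluid p-tau), and the covariate vector Z are notation assumed here for illustration.

```latex
\operatorname{logit}\,\Pr(\text{progression to AD} \mid A, T, Z)
  = \beta_0 + \beta_1 A + \beta_2 T + \beta_3 \,(A \times T) + \gamma^{\top} Z
```

Under this form, a purely additive effect corresponds to \beta_3 = 0, whereas synergism corresponds to a positive interaction coefficient \beta_3.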
Objectives:
The aim was to identify the most important features of structural knee osteoarthritis (OA) progressors and classification using machine learning methods.
Methods:
Participants, features and outcomes were from the Osteoarthritis Initiative. Features were taken at baseline (n = 1107), including articular knee tissue features (n = 135) assessed by quantitative magnetic resonance imaging (MRI). OA progressors were ascertained by four outcomes: cartilage volume loss in the medial plateau at 48 and 96 months (Prop_CV_48M, 96M), Kellgren–Lawrence (KL) grade ⩾ 2 and medial joint space narrowing (JSN) ⩾ 1 at 48 months. Six feature selection models were used to identify the common features in each outcome. Six classification methods were applied to measure the accuracy of the selected features in classifying the subjects into progressors and non-progressors. Classification of the best features was done using an automatic machine learning interface and the area under the curve (AUC). To prioritize the top five features, the sparse partial least squares (sPLS) method was used.
Results:
For the classification of the best common features in each outcome, Multi-Layer Perceptron (MLP) achieved the highest AUC in Prop_CV_96M, KL and JSN (0.80, 0.88, 0.95), and Gradient Boosting Machine achieved the highest AUC for Prop_CV_48M (0.70). sPLS showed that the top five baseline features for predicting knee OA progressors are the joint space width, mean cartilage thickness of the medial tibial plateau and sub-regions, and JSN.
Conclusion:
In this comprehensive study using a large number of features (n = 1107) and MRI outcomes in addition to radiological outcomes, we identified the best features and classification methods for knee OA structural progressors. Data revealed baseline X-ray and MRI-based features could predict early OA knee progressors and that MLP is the best classification method.
• Gray-white contrast (GWC) decreases in most areas of the cortex in early life.
• The trajectories of GWC decline tend to be cubic across the cortex.
• Nonetheless, in a few areas, GWC followed simple linear and quadratic trajectories.
• The trajectories of GWC decline exhibit a high level of symmetry across hemispheres.
In the last few years, a significant amount of work has aimed to characterize maturational trajectories of cortical development. The role of pericortical microstructure putatively characterized as the gray-white matter contrast (GWC) at the pericortical gray-white matter boundary and its relationship to more traditional morphological measures of cortical morphometry has emerged as a means to examine finer grained neuroanatomical underpinnings of cortical changes. In this work, we characterize the GWC developmental trajectories in a representative sample (n = 394) of children and adolescents (~4 to ~22 years of age), with repeated scans (1–3 scans per subject, total scans n = 819). We tested whether linear, quadratic, or cubic trajectories of contrast development best described changes in GWC. A best-fit model was identified vertex-wise across the whole cortex via the Akaike Information Criterion (AIC). GWC across nearly the whole brain was found to significantly change with age. Cubic trajectories were likeliest for 63% of vertices, quadratic trajectories were likeliest for 20% of vertices, and linear trajectories were likeliest for 16% of vertices. A main effect of sex was observed in some regions, where males had a higher GWC than females. However, no sex by age interactions were found on GWC. In summary, our results suggest a progressive decrease in GWC at the pericortical boundary throughout childhood and adolescence. This work contributes to efforts seeking to characterize typical, healthy brain development and, by extension, can help elucidate aberrant developmental trajectories.
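The vertex-wise model selection described above can be illustrated with a minimal sketch. It is a simplification under stated assumptions: it uses ordinary least squares rather than the mixed-effects models that repeated scans would call for, it uses the Gaussian least-squares form of the AIC, and the function names (`polynomial_rss`, `best_polynomial_degree`) are hypothetical. It is not the authors' pipeline.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def polynomial_rss(ages, values, degree):
    """Residual sum of squares of a least-squares polynomial fit in age."""
    centre = sum(ages) / len(ages)
    scale = (max(ages) - min(ages)) / 2 or 1.0
    # Centre and scale age so the normal equations stay well conditioned.
    z = [(a - centre) / scale for a in ages]
    design = [[zi ** p for p in range(degree + 1)] for zi in z]
    k = degree + 1
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, values)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in design]
    return sum((y - f) ** 2 for y, f in zip(values, fitted))


def aic(rss, n, n_params):
    """Gaussian AIC up to an additive constant: n * ln(RSS / n) + 2k."""
    return n * math.log(max(rss, 1e-300) / n) + 2 * n_params


def best_polynomial_degree(ages, values, degrees=(1, 2, 3)):
    """Return (degree with the lowest AIC, dict of AIC per degree)."""
    n = len(ages)
    # Parameters counted: polynomial coefficients plus the error variance.
    scores = {d: aic(polynomial_rss(ages, values, d), n, d + 2) for d in degrees}
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Synthetic single-vertex data (illustrative values only, not study data).
    ages = [4 + 18 * i / 39 for i in range(40)]
    contrast = [0.001 * (a - 13) ** 3 + 0.01 * (-1) ** i for i, a in enumerate(ages)]
    degree, scores = best_polynomial_degree(ages, contrast)
    print(degree, scores)
```

In a vertex-wise analysis this selection would be repeated independently at every vertex, and the share of vertices won by each degree would then be tallied.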
Capturing the conditional covariances or correlations among the elements of a multivariate response vector based on covariates is important to various fields including neuroscience, epidemiology and biomedicine. We propose a new method called Covariance Regression with Random Forests (CovRegRF) to estimate the covariance matrix of a multivariate response given a set of covariates, using a random forest framework. Random forest trees are built with a splitting rule specially designed to maximize the difference between the sample covariance matrix estimates of the child nodes. We also propose a significance test for the partial effect of a subset of covariates. We evaluate the performance of the proposed method and significance test through a simulation study which shows that the proposed method provides accurate covariance matrix estimates and that the Type-1 error is well controlled. An application of the proposed method to thyroid disease data is also presented. CovRegRF is implemented in a freely available R package on CRAN.
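The idea of a split that maximizes the difference between child-node covariance estimates can be sketched as follows. The abstract does not specify the distance or the weighting, so both are assumptions here: the sketch uses the Euclidean distance between the upper triangles of the two sample covariance matrices, weighted by the square root of the product of the child sizes. All function names are hypothetical, and this is not the CovRegRF package's code.

```python
import math


def sample_covariance(rows):
    """Unbiased sample covariance matrix of a list of response vectors."""
    n, q = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(q)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(q)]
        for i in range(q)
    ]


def covariance_distance(a, b):
    """Euclidean distance over the upper triangles (assumed distance)."""
    q = len(a)
    return math.sqrt(
        sum((a[i][j] - b[i][j]) ** 2 for i in range(q) for j in range(i, q))
    )


def split_score(left, right):
    """Size-weighted covariance difference between two child nodes."""
    weight = math.sqrt(len(left) * len(right))  # assumed weighting
    return weight * covariance_distance(
        sample_covariance(left), sample_covariance(right)
    )


def best_split(x, y, min_node_size=3):
    """Scan thresholds on one covariate x; return (threshold, score)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best_threshold, best_score = None, -1.0
    for cut in range(min_node_size, len(x) - min_node_size + 1):
        lo, hi = x[order[cut - 1]], x[order[cut]]
        if lo == hi:
            continue  # cannot place a threshold between equal values
        left = [y[i] for i in order[:cut]]
        right = [y[i] for i in order[cut:]]
        score = split_score(left, right)
        if score > best_score:
            best_threshold, best_score = (lo + hi) / 2, score
    return best_threshold, best_score


if __name__ == "__main__":
    # Toy data: the two responses are positively correlated for x < 6
    # and negatively correlated for x >= 6.
    x = list(range(12))
    signs = [1 if i % 2 == 0 else -1 for i in x]
    y = [(s, s) if i < 6 else (s, -s) for i, s in zip(x, signs)]
    print(best_split(x, y))
```

A full tree would apply this search over every candidate covariate at each node and recurse on the two children.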
The epilepsy clinic at the Montreal Neurological Institute receives a high volume of referrals. Although most patients assessed in the clinic are eventually diagnosed with epilepsy, other disorders causing alteration of consciousness or paroxysmal symptoms that could be misdiagnosed as seizures are seen frequently. The incidence and clinical characteristics of such patients have not yet been determined. We aimed to determine the proportion and clinical characteristics of patients referred to our epilepsy clinic who had a final diagnosis other than epilepsy.
We performed a retrospective chart analysis of consecutive patient referrals to the epilepsy clinic from January 2013 to January 2015, inclusively.
Of the 404 patient referrals evaluated, 106 (26%) had a final diagnosis other than epilepsy. Referrals came primarily from general practitioners and nonneurology specialists. Although most patients had a normal routine electroencephalography (EEG) prior to the clinic visit, sleep-deprived EEG and cardiac investigations were rarely performed. Patients received a final diagnosis other than epilepsy after 1 to 2 visits in 92% of cases and with minimal paraclinical investigations. Prolonged video-EEG recording was required in 27% of patients. The most common diagnoses were syncope (33%) and psychiatric symptoms (20%), followed by migraine (10%) and psychogenic nonepileptic seizures (9%).
A significant proportion of patients seen in our tertiary care epilepsy clinic do not, in fact, have epilepsy. Enhanced knowledge of these differential diagnoses and of the anamnesis components important for ruling out seizures will help improve guidelines for referral to epilepsy clinics and cost-effectively optimize the use of paraclinical investigations.
• Many patients referred to specialty epilepsy clinics don’t have epilepsy.
• The diagnosis of seizures remains a challenge regardless of physician’s specialty.
• Epilepsy ruled out within 2 visits, with minimal investigations, emphasizing the role of history taking in the diagnosis of seizures.
• Missing a diagnosis of epilepsy is perceived as having more consequences than misdiagnosing conditions that mimic seizures.
• Current diagnostic practices result in poor utilization of resources. Physician education is needed to improve outcomes and health care costs.
We thank Hattab and colleagues for their correspondence and their investigation of cell-type mixture correction methods in methyl-CG binding domain sequencing. Here, we speculate on why surrogate variable analysis (SVA) performed differently between their two data sets, and poorly in one of them. Please see related Correspondence article: https://genomebiology.biomedcentral.com/articles/10/1186/s13059-017-1148-8 and related Research article: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0935-y.
We propose an extension to quantile normalization that removes unwanted technical variation using control probes. We adapt our algorithm, functional normalization, to the Illumina 450k methylation array and address the open problem of normalizing methylation data with global epigenetic changes, such as human cancers. Using data sets from The Cancer Genome Atlas and a large case-control study, we show that our algorithm outperforms all existing normalization methods with respect to replication of results between experiments, and yields robust results even in the presence of batch effects. Functional normalization can be applied to any microarray platform, provided suitable control probes are available.
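For context, the baseline that the abstract extends can be sketched in a few lines. The code below is plain quantile normalization only: it forces every sample to share the same empirical distribution and does not use control probes, so it is not functional normalization. The function name is hypothetical, and ties are broken by original probe order, which is a simplifying assumption.

```python
def quantile_normalize(samples):
    """Plain quantile normalization.

    `samples` is a list of equal-length lists, one list of probe
    intensities per sample. Returns new lists in which every sample
    has the same set of values (the rank-wise means across samples).
    """
    n_probes = len(samples[0])
    if any(len(s) != n_probes for s in samples):
        raise ValueError("all samples must have the same number of probes")

    # Reference distribution: mean of the r-th smallest value across samples.
    sorted_samples = [sorted(s) for s in samples]
    reference = [
        sum(s[rank] for s in sorted_samples) / len(samples)
        for rank in range(n_probes)
    ]

    normalized = []
    for s in samples:
        # Probe indices ordered from smallest to largest intensity.
        order = sorted(range(n_probes), key=lambda i: s[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = reference[rank]
        normalized.append(out)
    return normalized


if __name__ == "__main__":
    # Two toy samples with three probes each (illustrative values only).
    print(quantile_normalize([[5, 2, 3], [4, 1, 6]]))
```

Because every sample is mapped onto one common distribution, this baseline can erase genuine global shifts between groups, which is the situation the abstract highlights for cancer samples.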