Gaussian graphical models are usually estimated from unreplicated data. The data are, however, likely to comprise signal and noise. These two cannot be deconvoluted from unreplicated data. Pragmatically, the noise is then ignored. We point out the consequences of this practice for the reconstruction of the conditional independence graph of the signal. Replicated data allow for the deconvolution of signal and noise and the reconstruction of the former's conditional independence graph. To this end we present a penalized Expectation‐Maximization algorithm. The penalty parameter is chosen to maximize the F‐fold cross‐validated log‐likelihood. Sampling schemes of the folds from replicated data are discussed. By simulation we investigate the effect of replicates on the reconstruction of the signal's conditional independence graph. Moreover, we compare the proposed method to several obvious competitors. In an application we use data from oncogenomic studies with replicates to reconstruct the gene‐gene interaction networks, operationalized as conditional independence graphs. This yields a realistic portrait of the effect of ignoring sources of variation other than sampling variation. In addition, it has implications for the reproducibility of inferred gene‐gene interaction networks reported in the literature.
When data arrive in a sequence of two or more datasets, modeling on the most recent dataset should take previous datasets into account. We specifically investigate a strategy for regression modeling when parameter estimates from previous data can be used as anchoring points, yet may not be available for all parameters, so that covariance information cannot be reused. A procedure that updates through targeted penalized estimation, which shrinks the estimator toward a nonzero value, is presented. The parameter estimate from the previous data serves as this nonzero value when an update is sought from novel data. This naturally extends to a sequence of datasets with the same response, but potentially only partial overlap in covariates. The iteratively updated regression parameter estimator is shown to be asymptotically unbiased and consistent. The penalty parameter is chosen through constrained cross-validated log-likelihood optimization. The constraint bounds the amount of shrinkage of the updated estimator toward the current one from below. The bound aims to preserve the (updated) estimator's goodness of fit on all-but-the-novel data. The proposed approach is compared to other regression modeling procedures. Finally, it is illustrated on an epidemiological study where the data arrive in batches with different covariate availability and the model is refitted whenever a novel batch becomes available.
Supplementary materials for this article are available online.
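The targeted penalized update described in the abstract above can be illustrated with a minimal sketch. It assumes a linear model with squared-error loss and a single ridge penalty that shrinks toward the previous estimate; the function name `targeted_ridge`, the simulated data, and the fixed penalty values are hypothetical, and the constrained cross-validation step for choosing the penalty parameter is not implemented here.

```python
import numpy as np


def targeted_ridge(X, y, lam, target):
    """Minimize ||y - X b||^2 + lam * ||b - target||^2 over b.

    Assumption: linear regression with squared-error loss; `target` is the
    nonzero shrinkage target (e.g. the estimate from an earlier dataset).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    target = np.asarray(target, dtype=float)
    p = X.shape[1]
    # Normal equations of the penalized least-squares problem:
    # (X'X + lam I) b = X'y + lam * target
    lhs = X.T @ X + lam * np.eye(p)
    rhs = X.T @ y + lam * target
    return np.linalg.solve(lhs, rhs)


# Simulated illustration (hypothetical data, not from the article).
rng = np.random.default_rng(0)
beta_true = np.array([1.0, -2.0, 0.5])

# First dataset: shrink toward zero, as no earlier estimate exists yet.
X_old = rng.normal(size=(50, 3))
y_old = X_old @ beta_true + rng.normal(scale=0.5, size=50)
beta_old = targeted_ridge(X_old, y_old, lam=1.0, target=np.zeros(3))

# Novel dataset: shrink toward the estimate from the first dataset.
X_new = rng.normal(size=(20, 3))
y_new = X_new @ beta_true + rng.normal(scale=0.5, size=20)
beta_updated = targeted_ridge(X_new, y_new, lam=5.0, target=beta_old)
```

In this sketch a larger `lam` keeps the updated estimate closer to `beta_old`, while `lam = 0` reduces to ordinary least squares on the novel data alone.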
The ridge inverse covariance estimator is generalized to allow for entry-wise penalization. An efficient algorithm for its evaluation is proposed. Its computational accuracy is benchmarked against implementations of the specific cases that the generalized ridge inverse covariance estimator encompasses. The proposed estimator shrinks toward a user-specified, nonrandom target matrix and is shown to be positive definite and consistent. It is pointed out how the generalized ridge inverse covariance estimator can be used to obtain a generalization of the graphical lasso estimator as well as of its elastic net counterpart. The usage of the presented estimator is illustrated in graphical modeling of omics data.
Supplementary materials for this article are available online.
Computationally efficient evaluation of penalized estimators of multivariate exponential family distributions is sought. These distributions encompass, among others, Markov random fields with variates of mixed type (e.g., binary and continuous) as a special case of interest. The model parameter is estimated by maximization of the pseudo-likelihood augmented with a convex penalty. The estimator is shown to be consistent. With a world of multi-core computers in mind, a computationally efficient parallel Newton–Raphson algorithm is presented for numerical evaluation of the estimator alongside conditions for its convergence. Parallelization comprises the division of the parameter vector into subvectors that are estimated simultaneously and subsequently aggregated to form an estimate of the original parameter. This approach may also enable efficient numerical evaluation of other high-dimensional estimators. The performance of the proposed estimator and algorithm is evaluated and compared in a simulation study. Finally, the presented methodology is applied to data of an integrative omics study.
The ridge estimation of the precision matrix is investigated in the setting where the number of variables is large relative to the sample size. First, two archetypal ridge estimators are reviewed and it is noted that their penalties do not coincide with common quadratic ridge penalties. Subsequently, starting from a proper ℓ2-penalty, analytic expressions are derived for two alternative ridge estimators of the precision matrix. The alternative estimators are compared to the archetypes with regard to eigenvalue shrinkage and risk. The alternatives are also compared to the graphical lasso within the context of graphical modeling. The comparisons may give reason to prefer the proposed alternative estimators.
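To make the idea of an analytic ridge precision estimator concrete, the sketch below assumes one particular quadratic penalty, namely maximizing log|Ω| − tr(SΩ) − (λ/2)‖Ω − T‖²_F for a sample covariance S and a symmetric target T. The abstract does not spell out the penalty, so this form, the function name `ridge_precision`, and the simulated data are assumptions for illustration only. Setting the gradient to zero gives Ω⁻¹ − λΩ = S − λT, a matrix quadratic in Ω⁻¹ that can be solved through an eigendecomposition.

```python
import numpy as np


def ridge_precision(S, lam, target):
    """Closed-form maximizer of log|O| - tr(S O) - (lam/2) * ||O - target||_F^2.

    Assumption: this penalty form is chosen for illustration; `target` must be
    symmetric and lam > 0.
    """
    S = np.asarray(S, dtype=float)
    target = np.asarray(target, dtype=float)
    D = S - lam * target
    D = (D + D.T) / 2.0  # guard against round-off asymmetry
    evals, evecs = np.linalg.eigh(D)
    # Eigenvalues of the inverse of the estimate: d/2 + sqrt(lam + d^2/4) > 0,
    # so the estimate is positive definite even when S is singular.
    inv_evals = evals / 2.0 + np.sqrt(lam + evals**2 / 4.0)
    # Rebuild V diag(1 / inv_evals) V'.
    return (evecs / inv_evals) @ evecs.T


# Simulated illustration with more variables than samples (hypothetical data).
rng = np.random.default_rng(1)
n, p = 5, 10
X = rng.normal(size=(n, p))
S = X.T @ X / n  # singular, as n < p (zero mean assumed)
target = np.eye(p)
lam = 1.0
omega = ridge_precision(S, lam, target)
```

Under these assumptions the estimate shares its eigenvectors with S − λT, so the penalty acts purely through eigenvalue shrinkage.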
We present the two-dimensional targeted fused ridge estimator of the linear and logistic regression models. The estimator (i) handles both unpenalised and penalised covariates, (ii) accommodates possible relations among the covariates’ coefficients through a fusion penalty, and (iii) incorporates prior information on the regression parameter through a non-zero shrinkage target. In this work, the aforementioned relations are similarities among the covariates’ coefficients due to spatial proximity in a two-dimensional grid. In an extensive re-analysis of an epidemiological and an image analysis study, we illustrate the use of the estimator’s aforementioned features that result in a tangibly interpretable predictor.
Mechanical stress determines bone mass and structure. It is not known whether mechanical loading affects expression of bone regulatory genes in a combined deficiency of estrogen and vitamin D. We studied the effect of mechanical loading on the messenger RNA (mRNA) expression of bone regulatory genes during vitamin D and/or estrogen deficiency. We performed a single bout of in vivo axial loading with 14 N peak load, 2 Hz frequency and 360 cycles in the right ulnae of nineteen-week-old female control Wistar rats with or without ovariectomy (OVX), vitamin D deficiency and the combination of OVX and vitamin D deficiency (N = 10/group). Total bone RNA was isolated 6 hours after loading, and mRNA expression of Mepe, Fgf23, Dmp1, Phex, Sost, Col1a1, Cyp27b1, Vdr, and Esr1 was measured. Serum levels of 25(OH)D, 1,25(OH)2D and estradiol were also measured at this time point. The effects of loading, vitamin D deficiency and estrogen deficiency, and their interactions, on bone gene expression were tested using a mixed effect model analysis. Mechanical loading significantly increased the mRNA expression of Mepe and Sost, whereas it decreased the mRNA expression of Fgf23 and Esr1. Mechanical loading showed a significant interaction with vitamin D deficiency with regard to mRNA expression of Vdr and Esr1. Mechanical loading affected gene expression of Mepe, Fgf23, Sost, and Esr1 independently of vitamin D or estrogen, indicating that mechanical loading may affect bone turnover even during vitamin D deficiency and after menopause.
CGHcall achieves high calling accuracy for array CGH data by effective use of breakpoint information from segmentation and by inclusion of several biological concepts that are ignored by existing algorithms. The algorithm is validated for simulated and verified real array CGH data. By incorporating more than three classes, CGHcall improves detection of single copy gains and amplifications. Moreover, it allows effective inclusion of chromosome arm information.
Availability: An R-package (GUI), a manual and an example data set are available at http://www.few.vu.nl/~mavdwiel/CGHcall.html.
Contact: mark.vdwiel@vumc.nl
Supplementary information: Supplementary data are available at Bioinformatics online.
Hypomyelination is observed in the context of a growing number of genetic disorders that share clinical characteristics. The aim of this study was to determine the possible role of magnetic resonance imaging pattern recognition in distinguishing different hypomyelinating disorders, which would facilitate the diagnostic process. Only patients with hypomyelination of known cause were included in this retrospective study. A total of 112 patients with Pelizaeus–Merzbacher disease, hypomyelination with congenital cataract, hypomyelination with hypogonadotropic hypogonadism and hypodontia, Pelizaeus–Merzbacher-like disease, infantile GM1 and GM2 gangliosidosis, Salla disease and fucosidosis were included. The brain scans were rated using a standard scoring list; the raters were blinded to the diagnoses. Grouping of the patients was based on cluster analysis. Ten clusters of patients with similar magnetic resonance imaging abnormalities were identified. The most important discriminating items were early cerebellar atrophy, homogeneity of the white matter signal on T2-weighted images, abnormal signal intensity of the basal ganglia, signal abnormalities in the pons and additional T2 lesions in the deep white matter. Eight clusters each represented mainly a single disorder (i.e. Pelizaeus–Merzbacher disease, hypomyelination with congenital cataract, hypomyelination with hypogonadotropic hypogonadism and hypodontia, infantile GM1 and GM2 gangliosidosis, Pelizaeus–Merzbacher-like disease and fucosidosis); only two clusters contained multiple diseases. Pelizaeus–Merzbacher-like disease was divided between two clusters and Salla disease did not cluster at all. This study shows that it is possible to separate patients with hypomyelination disorders of known cause into clusters based on magnetic resonance imaging abnormalities alone.
In most cases of Pelizaeus–Merzbacher disease, hypomyelination with congenital cataract, hypomyelination with hypogonadotropic hypogonadism and hypodontia, Pelizaeus–Merzbacher-like disease, infantile GM1 and GM2 gangliosidosis and fucosidosis, the imaging pattern gives clues for the diagnosis.