Power analysis has become an essential step in the experimental design of current biomedical research. Complex designs allowing diverse correlation structures are commonly used in RNA-Seq experiments. However, the field currently lacks statistical methods to calculate sample size and estimate power for RNA-Seq differential expression studies under such designs. To fill this gap, simulation-based methods have a great advantage in providing numerical solutions, since theoretical distributions of test statistics are typically unavailable for such designs.
In this paper, we propose a novel simulation-based procedure for power estimation of differential expression that employs generalized linear mixed-effects models for correlated expression data. We also propose a new procedure for power estimation of differential expression that uses a bivariate negative binomial distribution for paired designs. We compare the performance of the likelihood ratio test and the Wald test under a variety of simulation scenarios with the proposed procedures. The simulated null distribution of the test statistics was used to achieve the desired false-positive control and was compared with the asymptotic chi-square distribution. In addition, we applied the procedure for paired designs to the TCGA breast cancer data set.
In summary, we provide a framework for power estimation of RNA-Seq differential expression under complex experimental designs. Simulation results demonstrate that both proposed procedures properly control the false-positive rate at the nominal level.
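As an illustration of the general idea (not the authors' exact implementation), the sketch below estimates power for a single gene in a paired two-group design by repeatedly simulating correlated negative binomial counts with a subject-level random effect, fitting a negative binomial GLMM with lme4::glmer.nb, and recording how often the likelihood ratio test rejects. The effect size, dispersion, random-effect SD, and sample size are illustrative assumptions.

```r
# Simulation-based power estimate for one gene in a paired design (sketch).
library(lme4)

estimate_power <- function(n_pairs = 10, mu0 = 100, fc = 1.5, theta = 10,
                           sigma_u = 0.3, B = 200, alpha = 0.05) {
  rejections <- 0
  for (b in seq_len(B)) {
    subject   <- factor(rep(seq_len(n_pairs), each = 2))
    condition <- factor(rep(c("control", "treated"), times = n_pairs))
    u  <- rnorm(n_pairs, sd = sigma_u)              # shared subject effect induces correlation
    mu <- exp(log(mu0) + log(fc) * (condition == "treated") + u[subject])
    y  <- rnbinom(2 * n_pairs, mu = mu, size = theta)
    d  <- data.frame(y, condition, subject)
    fit1 <- try(glmer.nb(y ~ condition + (1 | subject), data = d), silent = TRUE)
    fit0 <- try(glmer.nb(y ~ 1 + (1 | subject), data = d), silent = TRUE)
    if (inherits(fit1, "try-error") || inherits(fit0, "try-error")) next
    p <- anova(fit0, fit1)[2, "Pr(>Chisq)"]          # likelihood ratio test
    if (!is.na(p) && p < alpha) rejections <- rejections + 1
  }
  rejections / B                                      # empirical power
}
```

In the procedure described in the abstract, the rejection threshold comes from a simulated null distribution of the test statistic rather than the asymptotic chi-square cutoff used here; that refinement is omitted to keep the sketch short.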
Normalization is an essential step with considerable impact on high-throughput RNA sequencing (RNA-seq) data analysis. Although there are numerous methods for read count normalization, it remains a challenge to choose an optimal method because multiple factors contribute to read count variability and affect the overall sensitivity and specificity. To properly determine the most appropriate normalization methods, it is critical to compare the performance and shortcomings of a representative set of normalization routines under different dataset characteristics. Therefore, we set out to evaluate the performance of commonly used methods (DESeq, TMM-edgeR, FPKM-CuffDiff, TC, Med, UQ, and FQ) and two new methods we propose: Med-pgQ2 and UQ-pgQ2 (per-gene normalization after per-sample median or upper-quartile global scaling). Our per-gene normalization approach allows comparisons between conditions based on similar count levels. Using the benchmark Microarray Quality Control Project (MAQC) and simulated datasets, we performed differential gene expression analysis to evaluate these methods. When evaluating MAQC2 with two replicates, we observed that Med-pgQ2 and UQ-pgQ2 achieved a slightly higher area under the receiver operating characteristic curve (AUC), a specificity > 85%, a detection power > 92%, and an actual false discovery rate (FDR) under 0.06 at the nominal FDR (≤0.05). Although the most commonly used methods (DESeq and TMM-edgeR) yield a higher power (>93%) for the MAQC2 data, they trade this off against reduced specificity (<70%) and a slightly higher actual FDR than our proposed methods. In addition, an analysis based on the qualitative characteristics of the sample distributions for the MAQC2 and human breast cancer datasets shows that only our gene-wise normalization methods corrected data skewed towards lower read counts. However, when we evaluated MAQC3, which has less variation across its five replicates, all methods performed similarly. Thus, our proposed Med-pgQ2 and UQ-pgQ2 methods perform slightly better for differential gene expression analysis of RNA-seq data skewed towards lowly expressed read counts with high variation, improving specificity while maintaining good detection power and control of the nominal FDR level.
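A minimal sketch of the two-stage idea behind UQ-pgQ2, under the assumption that "pgQ2" denotes rescaling each gene by its own median (second quartile) across samples after per-sample upper-quartile global scaling; the function and object names below are hypothetical, not the published implementation.

```r
# counts: genes x samples matrix of raw read counts (toy example below)
uq_pgq2 <- function(counts) {
  # Stage 1: per-sample upper-quartile (75th percentile) global scaling,
  # computed on non-zero counts and rescaled to a common reference.
  uq <- apply(counts, 2, function(x) quantile(x[x > 0], probs = 0.75))
  scaled <- sweep(counts, 2, uq / mean(uq), FUN = "/")
  # Stage 2: per-gene scaling by each gene's median (Q2) across samples,
  # so between-condition comparisons are made on similar count levels.
  gene_med <- apply(scaled, 1, median)
  gene_med[gene_med == 0] <- 1            # guard against division by zero
  sweep(scaled, 1, gene_med, FUN = "/")
}

set.seed(1)
toy <- matrix(rnbinom(20, mu = 50, size = 5), nrow = 5,
              dimnames = list(paste0("gene", 1:5), paste0("s", 1:4)))
round(uq_pgq2(toy), 2)
```

Under the same reading, Med-pgQ2 would simply replace the upper quartile in the first stage with the per-sample median.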
With the rise of metabolomics, the development of methods to address analytical challenges in the analysis of metabolomics data is of great importance. Missing values (MVs) are pervasive, yet the treatment of MVs can have a substantial impact on downstream statistical analyses. The MV problem in metabolomics is quite challenging: an MV can arise because the metabolite is not biologically present in the sample, is present but at a concentration below the lower limit of detection (LOD), or is present but undetected due to technical issues related to sample pre-processing steps. The first two cases are considered missing not at random (MNAR), while the third is an example of missing at random (MAR). Typically, such MVs are substituted by a minimum value, which may lead to severely biased results in downstream analyses.
We develop a Bayesian model, called BayesMetab, that systematically accounts for missing values based on a Markov chain Monte Carlo (MCMC) algorithm that incorporates data augmentation by allowing MVs to be due to either truncation below the LOD or other technical reasons unrelated to abundance. Based on a variety of performance metrics (power for detecting differential abundance, area under the curve, bias, and MSE for parameter estimates), our simulation results indicate that BayesMetab outperformed other imputation algorithms when there was a mixture of missingness due to MAR and MNAR. Further, our approach was competitive with other methods tailored specifically to MNAR in situations where missing data were completely MNAR. Applying our approach to an analysis of metabolomics data from a mouse myocardial infarction study revealed several statistically significant metabolites not previously identified that were of direct biological relevance to the study.
Our findings demonstrate that BayesMetab has improved performance in imputing missing values and performing statistical inference compared with other current methods when missing values arise from a mixture of MNAR and MAR. Analysis of real metabolomics data strongly suggests this mixture is likely to occur in practice; thus, it is important to consider an imputation model that accounts for a mixture of missing-data types.
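To make the two missingness mechanisms concrete, the sketch below simulates log-abundances for one metabolite, censors values below an assumed LOD (MNAR), removes a further random subset for reasons unrelated to abundance (MAR), and shows how minimum-value substitution can bias a simple group comparison. This illustrates the data-generating assumptions behind the simulation study, not the BayesMetab MCMC algorithm itself; all parameter values are made up.

```r
set.seed(42)
n <- 50                                            # samples per group
true_diff <- 0.5                                   # assumed group difference (log scale)
x <- c(rnorm(n, mean = 5), rnorm(n, mean = 5 + true_diff))
group <- rep(c("A", "B"), each = n)

lod <- 4.6                                         # assumed lower limit of detection
obs <- ifelse(x < lod, NA, x)                      # MNAR: censored below the LOD
mar_idx <- sample(which(!is.na(obs)), size = 5)    # MAR: technical dropouts
obs[mar_idx] <- NA

# Naive minimum-value substitution vs. the fully observed data
imp_min <- ifelse(is.na(obs), min(obs, na.rm = TRUE), obs)
t.test(x ~ group)$estimate                         # group means without missingness
t.test(imp_min ~ group)$estimate                   # biased group means after min substitution
```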
The R package clValid contains functions for validating the results of a clustering analysis. There are three main types of cluster validation measures available: "internal", "stability", and "biological". The user can choose from nine clustering algorithms in existing R packages, including hierarchical, K-means, self-organizing maps (SOM), and model-based clustering. In addition, we provide a function to perform the self-organizing tree algorithm (SOTA) method of clustering. Any combination of validation measures and clustering methods can be requested in a single function call. This allows the user to simultaneously evaluate several clustering algorithms while varying the number of clusters, to help determine the most appropriate method and number of clusters for the dataset of interest. Additionally, the package can automatically make use of the biological information contained in the Gene Ontology (GO) database to calculate the biological validation measures, via the annotation packages available in Bioconductor. The function returns an object of S4 class "clValid", which has summary, plot, print, and additional methods that allow the user to display the optimal validation scores and extract clustering results.
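A brief usage sketch of the package's main entry point, assuming an expression matrix expr with observations in rows; the matrix name, methods, and cluster range are illustrative choices, and the full argument list is documented in the package help pages.

```r
library(clValid)

# expr: numeric matrix with observations (e.g., genes) in rows and samples in columns
result <- clValid(expr, nClust = 2:6,
                  clMethods = c("hierarchical", "kmeans", "sota"),
                  validation = c("internal", "stability"))

summary(result)          # validation scores for each method and number of clusters
optimalScores(result)    # best score, method, and cluster number per measure
plot(result)             # plots of the validation measures across cluster numbers
```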
The Model for End-Stage Liver Disease (MELD) score has been successfully used to prioritize patients on the United States liver transplant waiting list since its adoption in 2002. The United Network for Organ Sharing (UNOS)/Organ Procurement and Transplantation Network (OPTN) allocation policy has evolved over the years, and notable recent changes include Share 35, inclusion of serum sodium in the MELD score, and a 'delay and cap' policy for hepatocellular carcinoma (HCC) patients. We explored the potential of a registrant's 30-day change in MELD score (ΔMELD30) to improve allocation both before and after these policy changes. Current MELD and ΔMELD30 were evaluated using cause-specific hazards models for waitlist dropout based on US liver transplant registrants added to the waitlist between 06/30/2003 and 06/30/2013. Two composite scores were constructed and then evaluated on UNOS data spanning the current policy era (01/02/2016 to 09/07/2018). Predictive accuracy was evaluated using the C-index for model discrimination and by comparing observed and predicted waitlist dropout probabilities for model calibration. After the change to MELD-Na, increased dropout associated with ΔMELD30 jumps is no longer evident at MELD scores below 30. However, the adoption of Share 35 has potentially resulted in discrepancies in waitlist dropout for patients with sharp MELD increases at higher MELD scores. Using ΔMELD30 to add points or to serve as a tiebreaker for patients with rapid deterioration may extend the benefit of Share 35 to better include those in most critical need.
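A schematic of the cause-specific hazards analysis described above, using the survival package: waitlist dropout is the event of interest (transplant and other removals are treated as censoring for this cause-specific hazard), and ΔMELD30 enters alongside the current MELD score. The data frame and variable names (waitlist, days, dropout, meld, delta_meld30) are hypothetical placeholders, not registry fields.

```r
library(survival)

# waitlist: one row per registrant with hypothetical columns
#   days          follow-up time on the waitlist
#   dropout       1 = waitlist dropout (death or too sick to transplant), 0 = otherwise
#   meld          current MELD (or MELD-Na) score
#   delta_meld30  change in MELD score over the preceding 30 days
fit <- coxph(Surv(days, dropout) ~ meld + delta_meld30, data = waitlist)
summary(fit)

# Discrimination: Harrell's C-index for the fitted cause-specific model
concordance(fit)
```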
Hospital length of stay (LOS) and time for a patient to reach clinical stability (TCS) have increasingly become important outcomes when investigating ways to combat Community-Acquired Pneumonia (CAP). Difficulties arise when deciding how to handle in-hospital mortality. Ad-hoc approaches that are commonly used to handle time-to-event outcomes with mortality can give disparate results and lead to conflicting conclusions based on the same data. To ensure comparability among studies investigating these outcomes, this type of data should be handled in a consistent and appropriate fashion.
Using both simulated data and data from the international Community Acquired Pneumonia Organization (CAPO) database, we evaluate two ad-hoc approaches for handling mortality when estimating the probability of hospital discharge and clinical stability: 1) restricting analysis to those patients who lived, and 2) assigning individuals who die the "worst" outcome (right-censoring them at the longest recorded LOS or TCS). Estimated probability distributions based on these approaches are compared with right-censoring the individuals who died at time of death (the complement of the Kaplan-Meier (KM) estimator), and treating death as a competing risk (the cumulative incidence estimator). Tests for differences in probability distributions based on the four methods are also contrasted.
The two ad-hoc approaches give different estimates of the probability of discharge and clinical stability. Analysis restricted to patients who survived is conceptually problematic, as estimation is conditioned on events that happen at a future time. Estimation based on assigning those patients who died the worst outcome (longest LOS and TCS) coincides with the complement of the KM estimator based on the subdistribution hazard, which has been previously shown to be equivalent to the cumulative incidence estimator. However, in either case the time to in-hospital mortality is ignored, preventing simultaneous assessment of patient mortality in addition to LOS and/or TCS. The power to detect differences in underlying hazards of discharge between patient populations differs for test statistics based on the four approaches, and depends on the underlying hazard ratio of mortality between the patient groups.
Treating death as a competing risk gives estimators that address the clinical questions of interest and allows for simultaneous modeling of both in-hospital mortality and TCS / LOS. This article advocates treating mortality as a competing risk when investigating other time-related outcomes.
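As a sketch of the comparison described above, the code below contrasts the complement of the Kaplan-Meier estimator (deaths right-censored at the time of death) with the Aalen-Johansen cumulative incidence of discharge treating death as a competing risk, using the survival package. The data frame and coding (los; status with 0 = censored, 1 = discharged, 2 = died) are hypothetical.

```r
library(survival)

# cap: hypothetical data frame with
#   los    hospital length of stay in days
#   status 0 = administratively censored, 1 = discharged alive, 2 = in-hospital death

# Ad-hoc view: censor deaths at the time of death, then 1 - KM for discharge
km <- survfit(Surv(los, status == 1) ~ 1, data = cap)
summary(km, times = c(3, 5, 7, 10))

# Competing-risks view: Aalen-Johansen estimator with death as a competing event
cap$state <- factor(cap$status, levels = 0:2, labels = c("censor", "discharge", "death"))
aj <- survfit(Surv(los, state) ~ 1, data = cap)
summary(aj, times = c(3, 5, 7, 10))   # probabilities of having been discharged or having died
```

The multi-state survfit call also yields the cumulative incidence of in-hospital death, which is what allows mortality and LOS (or TCS) to be assessed simultaneously.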
Gene expression profiling experiments with few replicates lead to great variability in the estimates of gene variances. To address this, several moderated t-test methods have been developed to reduce this variability and to increase power for testing differential expression. Most of these moderated methods are based on linear models with fixed effects, where residual variances are smoothed under a hierarchical Bayes framework. However, they are inadequate for designs with complex correlation structures; therefore, extensions of moderated methods to linear models with mixed effects are needed for differential expression analysis.
We demonstrate the implementation of a fully moderated t-statistic method for linear models with mixed effects, in which both residual variances and variance estimates of random effects are smoothed under a hierarchical Bayes framework. We compare the proposed method with two current moderated methods and show that the proposed method controls the expected number of false positives at the nominal level, while the two current methods fail to do so.
We propose an approach for testing differential expression under complex correlation structures while providing variance shrinkage. The proposed method improves power through moderation and properly controls the expected number of false positives at the nominal level.
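A rough sketch of the variance-moderation step, not the authors' exact estimator: gene-wise linear mixed models are fit with lme4, the residual variances are shrunk toward a common prior with limma::squeezeVar (an empirical/hierarchical Bayes step), and moderated t-statistics are formed for the fixed effect of interest. The handling of degrees of freedom is simplified, and the smoothing of random-effect variances described in the abstract is omitted; all object and level names are assumptions.

```r
library(lme4)
library(limma)

# expr: genes x samples matrix; meta: data.frame with columns condition and subject
moderated_mixed_t <- function(expr, meta, df_resid) {
  stats <- t(apply(expr, 1, function(y) {
    d   <- data.frame(y = y, meta)
    fit <- lmer(y ~ condition + (1 | subject), data = d, REML = TRUE)
    b   <- fixef(fit)["conditiontreated"]            # assumed factor level name
    se  <- sqrt(vcov(fit)["conditiontreated", "conditiontreated"])
    c(beta = unname(b), s2 = sigma(fit)^2, se_unit = unname(se) / sigma(fit))
  }))
  sq <- squeezeVar(stats[, "s2"], df = df_resid)      # hierarchical Bayes variance shrinkage
  t_mod <- stats[, "beta"] / (stats[, "se_unit"] * sqrt(sq$var.post))
  data.frame(beta = stats[, "beta"], t_mod = t_mod,
             p = 2 * pt(-abs(t_mod), df = df_resid + sq$df.prior))
}
```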
The main purpose of this paper was to model the process by which patients enter, are seen by physicians in, and are discharged from the Emergency Department (ED) at Nationwide Children's Hospital, as well as to identify modifiable factors associated with ED length of stay through the use of multistate modeling.
In this study, 75,591 patients admitted to the ED from March 1st, 2016 to February 28th, 2017 were analyzed using a multistate model of the ED process. Cox proportional hazards models with transition-specific covariates were used to model each transition in the multistate model and the Aalen-Johansen estimator was used to obtain transition probabilities and state occupation probabilities in the ED process.
Acuity level, season, time of day, and number of ED physicians had significant and varying associations with the six transitions in the multistate model. Race and ethnicity were significantly associated with the transition to leaving without being seen, but not with the other transitions. Conversely, age and gender were significantly associated with the transition from registration to room and with subsequent transitions in the model, though the magnitudes of association were not strong.
The multistate model presented in this paper decomposes the overall ED length of stay into constituent transitions for modeling covariate-specific effects on each transition. This allows physicians to understand the ED process and identify which potentially modifiable covariates would have the greatest impact on reducing the waiting times in each state in the model.
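A simplified sketch of this type of analysis with the survival package, assuming the ED visit has been expanded into a long-format data set with one row per patient per transition at risk; the state structure, column names, and covariates are illustrative only, not the study's actual variables.

```r
library(survival)

# ed_long: hypothetical long-format data, one row per patient per transition at risk, with
#   Tstart, Tstop  entry and exit times (minutes since ED arrival) for the interval at risk
#   status         1 if this transition occurred at Tstop, 0 otherwise
#   trans          transition label, e.g. "registration->room", "room->physician"
#   acuity, season, hour, n_physicians  covariates of interest

# One Cox proportional hazards model per transition (transition-specific covariate effects)
fits <- lapply(split(ed_long, ed_long$trans), function(d)
  coxph(Surv(Tstart, Tstop, status) ~ acuity + season + hour + n_physicians, data = d))
lapply(fits, function(f) round(summary(f)$conf.int, 2))   # HRs with 95% CIs per transition
```

Transition probabilities and state occupation probabilities via the Aalen-Johansen estimator could then be obtained from the fitted transition hazards, for example with the mstate package's msfit and probtrans functions.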
Differential scanning calorimetry (DSC) is used to determine thermally induced conformational changes of biomolecules within a blood plasma sample. Recent research has indicated that DSC curves (or thermograms) may have different characteristics based on disease status and, thus, may be useful as a monitoring and diagnostic tool for some diseases. Since thermograms are curves measured over a range of temperature values, they are considered functional data. In this paper we apply functional data analysis (FDA) techniques to analyze DSC data from individuals in the Lupus Family Registry and Repository (LFRR). The aim was to assess the effect of lupus disease status as well as additional covariates on the thermogram profiles, and to use FDA methods to create models for classifying lupus vs. control patients on the basis of the thermogram curves.
Thermograms were collected for 300 lupus patients and 300 controls without lupus who were matched to diseased individuals based on sex, race, and age. First, functional regression with a functional response (thermogram) and a categorical predictor (disease status) was used to determine how thermogram curve structure varied according to disease status and other covariates, including sex, race, and year of birth. Next, functional logistic regression with disease status as the response and functional principal component analysis (FPCA) scores as the predictors was used to model the effect of thermogram structure on disease status prediction. The prediction accuracy for patients with osteoarthritis (OA) or rheumatoid arthritis (RA) but without lupus was also calculated to determine the ability of the classifier to differentiate between lupus and other diseases. The data were divided 1000 times into separate 2/3 training and 1/3 test sets for evaluation of predictions. Finally, derivatives of the thermogram curves were included in the models to determine whether they aided in prediction of disease status.
Functional regression with thermogram as a functional response and disease status as the predictor showed a clear separation in thermogram curve structure between cases and controls. The logistic regression model with FPCA scores as the predictors gave the most accurate results, with a mean correct classification rate of 79.22%, a mean sensitivity of 79.70%, and a mean specificity of 81.48%. The model correctly classified OA and RA patients without lupus as controls at an average rate of 75.92%, with a mean sensitivity of 79.70% and a mean specificity of 77.6%. Regression models including FPCA scores for the derivative curves did not perform as well, nor did regression models including covariates.
Changes in thermograms observed in the disease state likely reflect covalent modifications of plasma proteins or changes in large protein-protein interaction networks resulting in the stabilization of plasma proteins towards thermal denaturation. By relating functional principal components from thermograms to disease status, our FPCA model provides results that are more easily interpretable than those of prior studies. Further, the model could also potentially be coupled with other biomarkers to improve diagnostic classification for lupus.
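A bare-bones sketch of the classification pipeline in base R: principal components of the discretized thermogram curves stand in for the FPCA scores, and a logistic regression on the leading scores is evaluated on one random 2/3 training / 1/3 test split. The object names (thermo, status) and the number of components retained are illustrative assumptions; the paper's analysis uses proper functional-data tools and repeats the split 1000 times.

```r
set.seed(1)
# thermo: n x T matrix of thermogram values on a common temperature grid
# status: 0/1 vector, 1 = lupus case, 0 = control

pc     <- prcomp(thermo, center = TRUE, scale. = FALSE)   # PCA of discretized curves (FPCA surrogate)
scores <- pc$x[, 1:5]                                      # leading component scores

train  <- sample(seq_along(status), size = round(2/3 * length(status)))
d      <- data.frame(status = status, scores)

fit    <- glm(status ~ ., data = d[train, ], family = binomial)
pred   <- predict(fit, newdata = d[-train, ], type = "response") > 0.5
mean(pred == (d$status[-train] == 1))                      # test-set correct classification rate
```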
The prognostic value of long-term glycemic variability is incompletely understood. We evaluated the influence of visit-to-visit variability (VVV) of fasting blood glucose (FBG) on incident cardiovascular disease (CVD) and mortality.
We conducted a prospective cohort analysis including 4,982 participants in the Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial (ALLHAT) who attended the baseline, 24-month, and 48-month visits. VVV of FBG was defined as the SD or the variability independent of the mean (VIM) across FBG measurements obtained at the three visits. Participants free of CVD during the first 48 months of the study were followed for incident CVD (coronary heart disease [CHD], stroke, and heart failure [HF]) and all-cause mortality.
Over a median follow-up of 5 years, there were 305 CVD events (189 CHD, 45 stroke, and 81 HF) and 154 deaths. The adjusted hazard ratio (HR) comparing participants in the highest versus lowest quartile of SD of FBG (≥26.4 vs. <5.5 mg/dL) was 1.43 (95% CI 0.93-2.19) for CVD and 2.22 (95% CI 1.22-4.04) for all-cause mortality. The HR for VIM was 1.17 (95% CI 0.84-1.62) for CVD and 1.89 (95% CI 1.21-2.93) for all-cause mortality. Among individuals without diabetes, the highest quartile of SD of FBG (HR 2.67, 95% CI 0.14-6.25) or VIM (HR 2.50, 95% CI 1.40-4.46) conferred a higher risk of death.
Greater VVV of FBG is associated with increased mortality risk. Our data highlight the importance of achieving normal and consistent glycemic levels for improving clinical outcomes.
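To make the exposure definition concrete, the sketch below computes the SD and the variability independent of the mean (VIM) across the three FBG measurements and fits Cox models for mortality by exposure quartile. The VIM construction shown (SD divided by the mean raised to a power estimated by regressing log SD on log mean, rescaled by the population mean) is the standard formulation and is assumed rather than quoted from the paper; the data frame and variable names are hypothetical, and covariate adjustment is omitted.

```r
library(survival)

# fbg: n x 3 matrix of fasting glucose at the baseline, 24-month, and 48-month visits
# dat: data frame with follow-up time in years (years), death indicator (died), and covariates

sd_fbg   <- apply(fbg, 1, sd)
mean_fbg <- apply(fbg, 1, mean)

beta <- coef(lm(log(sd_fbg) ~ log(mean_fbg)))[2]            # fitted power
vim  <- sd_fbg / mean_fbg^beta * mean(mean_fbg)^beta        # variability independent of the mean

dat$sd_q  <- cut(sd_fbg, quantile(sd_fbg, 0:4 / 4), include.lowest = TRUE,
                 labels = c("Q1", "Q2", "Q3", "Q4"))
dat$vim_q <- cut(vim, quantile(vim, 0:4 / 4), include.lowest = TRUE,
                 labels = c("Q1", "Q2", "Q3", "Q4"))

# Hazard ratios for all-cause mortality, each quartile vs. the lowest (Q1 is the reference)
summary(coxph(Surv(years, died) ~ sd_q, data = dat))
summary(coxph(Surv(years, died) ~ vim_q, data = dat))
```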