Meta-analyses increase statistical power by combining statistics from multiple studies. Meta-analysis methods have mostly been evaluated under the condition that all the data in each study have an association with the given phenotype. However, specific experimental conditions in each study or genetic heterogeneity can result in "unassociated statistics" that are derived from the null distribution. Here, we show that the power of conventional meta-analysis methods rapidly decreases as an increasing number of unassociated statistics are included, whereas the classical Fisher's method and its weighted variant (wFisher) exhibit relatively high power that is robust to the addition of unassociated statistics. We also propose another robust method based on the joint distribution of ordered p-values (ordmeta). Simulation analyses for t-test, RNA-seq, and microarray data demonstrated that wFisher and ordmeta outperformed existing meta-analysis methods when only a small number of studies have an association. We performed meta-analyses of nine microarray datasets (prostate cancer) and four association summary datasets (body mass index), where our methods exhibited high biological relevance and were able to detect genes that state-of-the-art methods missed. The metapro R package that implements the proposed methods is available from both CRAN and GitHub ( http://github.com/unistbig/metapro ).
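As a rough illustration of the classical Fisher's method mentioned above, the following Python sketch combines p-values through the statistic -2 Σ ln p, which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. It covers only the unweighted method; wFisher and ordmeta are not reproduced here, and the p-values are made-up numbers chosen for illustration, not results from the study.

```python
import math


def chi2_sf_even_df(x, k):
    """Survival function of a chi-square variable with 2*k degrees of freedom.

    For even degrees of freedom the tail probability has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combine(pvalues):
    """Classical (unweighted) Fisher combined p-value."""
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    return chi2_sf_even_df(stat, len(pvalues))


# Hypothetical inputs: two studies with a signal, four drawn from the null.
associated = [0.001, 0.002]
unassociated = [0.35, 0.62, 0.48, 0.81]

p_small = fisher_combine(associated)
p_mixed = fisher_combine(associated + unassociated)
print(f"associated only: {p_small:.2e}")
print(f"with unassociated studies added: {p_mixed:.2e}")
```

In this toy case the combined p-value grows when the unassociated studies are added but stays below 0.05, which is the kind of robustness the abstract describes.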
Benchmarking RNA-seq differential expression analysis methods using spike-in and simulated RNA-seq data has often yielded inconsistent results. The spike-in data, which were generated from the same bulk RNA sample, only represent technical variability, making the test results less reliable. We compared the performance of 12 differential expression analysis methods for RNA-seq data, including recent variants in widely used software packages, using both RNA spike-in data and simulation data for the negative binomial (NB) model. The performance of edgeR, DESeq2, and ROTS differed markedly between the two benchmark tests. Each method was then tested under extensive simulation conditions, which demonstrated in particular the large impacts of the proportion, dispersion, and balance of differentially expressed (DE) genes. DESeq2, a robust version of edgeR (edgeR.rb), voom with TMM normalization (voom.tmm), and voom with sample weights (voom.sw) showed good overall performance regardless of the presence of outliers and the proportion of DE genes. The performance of RNA-seq DE gene analysis methods depended substantially on the benchmark used. Based on the simulation results, suitable methods were suggested for various test conditions.
Full text
Available for:
DOBA, IZUM, KILJ, NUK, PILJ, PNG, SAZU, SIK, UILJ, UKNU, UL, UM, UPUK
The Hsp90 family proteins Hsp90, Grp94, and TRAP1 are present in the cell cytoplasm, endoplasmic reticulum, and mitochondria, respectively; all play important roles in tumorigenesis by regulating protein homeostasis in response to stress. Thus, simultaneous inhibition of all Hsp90 paralogs is a reasonable strategy for cancer therapy. However, since the existing pan-Hsp90 inhibitor does not accumulate in mitochondria, the potential anticancer activity of pan-Hsp90 inhibition has not yet been fully examined in vivo. Analysis of The Cancer Genome Atlas database revealed that all Hsp90 paralogs were upregulated in prostate cancer. Inactivation of all Hsp90 paralogs induced mitochondrial dysfunction, increased cytosolic calcium, and activated calcineurin. Active calcineurin blocked prosurvival heat shock responses upon Hsp90 inhibition by preventing nuclear translocation of HSF1. The purine scaffold derivative DN401 inhibited all Hsp90 paralogs simultaneously and showed stronger anticancer activity than other Hsp90 inhibitors. Pan-Hsp90 inhibition increased cytotoxicity and suppressed mechanisms that protect cancer cells, suggesting that it is a feasible strategy for the development of potent anticancer drugs. The mitochondria-permeable drug DN401 is a newly identified in vivo pan-Hsp90 inhibitor with potent anticancer activity.
In differential expression analysis of RNA-sequencing (RNA-seq) read count data for two sample groups, it is known that highly expressed genes (or longer genes) are more likely to be differentially expressed, a phenomenon called read count bias (or gene length bias). This bias has a great effect on the downstream Gene Ontology over-representation analysis. However, such a bias has not been systematically analyzed for different replicate types of RNA-seq data.
We show, by mathematical inference and by tests on a number of simulated and real RNA-seq datasets, that the dispersion coefficient of a gene in the negative binomial modeling of read counts is the critical determinant of the read count bias (and gene length bias). We demonstrate that the read count bias is mostly confined to data with small gene dispersions (e.g., technical replicates and some genetically identical replicates such as cell lines or inbred animals), and that many biological replicate datasets from unrelated samples do not suffer from such a bias except for genes with some small counts. It is also shown that the sample-permuting GSEA method yields a considerable number of false positives caused by the read count bias, while the preranked method does not.
We showed for the first time that small gene variance (similarly, dispersion) is the main cause of read count bias (and gene length bias), and we analyzed the read count bias for different replicate types of RNA-seq data and its effect on gene-set enrichment analysis.
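One way to see why dispersion matters is the standard negative binomial mean-variance relation, Var = μ + φμ², which gives a squared coefficient of variation of 1/μ + φ. The Python sketch below uses only this textbook relation; it is not the authors' derivation, and the mean counts and dispersion values are arbitrary illustrative choices.

```python
import math


def nb_cv(mean, dispersion):
    """Coefficient of variation of a negative binomial count.

    Uses the parameterization Var = mean + dispersion * mean**2,
    so CV**2 = 1/mean + dispersion.
    """
    variance = mean + dispersion * mean ** 2
    return math.sqrt(variance) / mean


# Compare a low-count gene (mean 10) with a high-count gene (mean 1000).
for dispersion in (0.001, 0.5):
    low = nb_cv(10, dispersion)
    high = nb_cv(1000, dispersion)
    print(
        f"dispersion={dispersion}: CV(mean=10)={low:.3f}, "
        f"CV(mean=1000)={high:.3f}, ratio={low / high:.2f}"
    )
```

With a small dispersion the relative noise drops sharply as the count grows, so high-count genes are easier to call differentially expressed; with a large dispersion the relative noise is dominated by φ and is nearly independent of the count.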
Integration of single-cell RNA sequencing data between different samples has been a major challenge for analyzing cell populations. However, strategies to integrate differential expression analysis of single-cell data remain underinvestigated. Here, we benchmark 46 workflows for differential expression analysis of single-cell data with multiple batches. We show that batch effects, sequencing depth and data sparsity substantially impact their performance. Notably, we find that the use of batch-corrected data rarely improves the analysis for sparse data, whereas batch covariate modeling improves the analysis for substantial batch effects. We show that for low-depth data, single-cell techniques based on zero-inflation models degrade performance, whereas the analysis of uncorrected data using limmatrend, the Wilcoxon test and the fixed effects model performs well. We suggest several high-performance methods under different conditions based on various simulation and real data analyses. Additionally, we demonstrate that differential expression analysis for a specific cell type outperforms that of large-scale bulk sample data in prioritizing disease-related genes.
Non-alcoholic fatty liver disease (NAFLD) is characterized by excessive lipid accumulation and imbalances in lipid metabolism in the liver. Although nuclear receptors (NRs) play a crucial role in hepatic lipid metabolism, the underlying mechanisms of NR regulation in NAFLD remain largely unclear.
Using network analysis and RNA-seq to determine the correlation between NRs and microRNA in human NAFLD patients, we revealed that
specifically targets
mimic and anti-
were administered to human HepG2 and Huh-7 cells and mouse primary hepatocytes as well as high-fat diet (HFD)- or methionine-deficient diet (MCD)-fed mice to verify the specific function of
in NAFLD. We tested the inhibition of the therapeutic effect of a PPARα agonist, fenofibrate, by
and the synergic effect of combination of fenofibrate with anti-
in NAFLD mouse model.
We revealed that
specifically targets
through miRNA regulatory network analysis of nuclear receptor genes in NAFLD. The expression of
was upregulated in free fatty acid (FA)-treated hepatocytes and the livers of both obesity-induced mice and NAFLD patients. Overexpression of
significantly increased hepatic lipid accumulation and triglyceride levels. Furthermore,
significantly reduced FA oxidation and mitochondrial biogenesis by targeting
. In
-introduced mice, the effect of fenofibrate to ameliorate hepatic steatosis was significantly suppressed. Finally, inhibition of
significantly increased FA oxidation and uptake, resulting in improved insulin sensitivity and a decrease in NAFLD progression. Moreover, combination of fenofibrate and anti-
exhibited the synergic effect on improvement of NAFLD in MCD-fed mice.
Taken together, our results demonstrate that the novel
targets
, plays a significant role in hepatic lipid metabolism, and present an opportunity for the development of novel therapeutics for NAFLD.
This research was funded by Korea Mouse Phenotyping Project (2016M3A9D5A01952411), the National Research Foundation of Korea (NRF) grant funded by the Korea government (2020R1F1A1061267, 2018R1A5A1024340, NRF-2021R1I1A2041463, 2020R1I1A1A01074940, 2016M3C9A394589324), and the Future-leading Project Research Fund (1.210034.01) of UNIST.
We present an accurate and fast web server, WegoLoc, for predicting subcellular localization of proteins based on sequence similarity and weighted Gene Ontology (GO) information. A term weighting method from the text categorization process is applied to GO terms for a support vector machine classifier. As a result, WegoLoc surpasses the state-of-the-art methods on previously used test datasets. WegoLoc supports three eukaryotic kingdoms (animals, fungi and plants), provides human-specific analysis, and covers several sets of cellular locations. In addition, WegoLoc provides (i) multiple possible localizations of input protein(s) as well as their corresponding probability scores, (ii) weights of GO terms representing the contribution of each GO term in the prediction, and (iii) a BLAST E-value for the best hit with GO terms. If the similarity score does not meet a given threshold, an amino acid composition-based prediction is applied as a backup method.
WegoLoc and its user's guide are freely available at http://www.btool.org/WegoLoc
smchiks@ks.ac.kr; dougnam@unist.ac.kr
Supplementary data is available at http://www.btool.org/WegoLoc.
The world is facing a serious biodiversity-loss crisis and stream ecosystems are among the most vulnerable. Human-induced disturbances are accelerating the loss of stream biodiversity; however, their ecological impacts are poorly understood. Here, we comprehensively investigated the impact of biodiversity loss on stream food webs using massive food web data (> 1 300 webs). We analyzed the structural changes of food webs upon accumulation of biodiversity loss and specifically compared the severity of losses between fish and benthic macroinvertebrates. In particular, we focused on currently threatened and near-threatened species, to reflect realistic extinction. We simulated their sequential and accumulative extinctions and analyzed the changes in food web structural indices using a linear mixed effect model. Stream food webs tended to be robust against the loss of threatened species; however, the accumulated extinction, including both threatened and near-threatened species, caused substantial changes in food web structures. Notably, significant decreases in the number of links, link density, and generality were observed, indicating the vulnerability of the system. The loss of fish caused larger changes in the food web structure than that of benthic macroinvertebrates, indicating the relative importance of fish species in sustaining food web structures. Food web alteration may lead to substantial changes in ecosystem functioning. Our study suggests preemptive action to protect near-threatened species as well as threatened ones for conserving stream ecosystems and their services. Furthermore, we suggest that the food web framework is useful for diagnosing ecosystem-level impacts of species loss in biodiversity conservation.
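The structural indices named above can be illustrated on a toy food web. The Python sketch below assumes common textbook definitions (link density as links per species, generality as the mean number of prey per consumer) and a made-up seven-species web; the species, the links, and the simplification that no secondary extinctions occur are all assumptions for illustration, not the study's data or procedure.

```python
def web_indices(links):
    """Compute simple structural indices for a food web.

    links is a set of (prey, predator) pairs. Only species that still
    take part in at least one link are counted.
    """
    species = {s for link in links for s in link}
    consumers = {predator for _, predator in links}
    n_links = len(links)
    return {
        "species": len(species),
        "links": n_links,
        "link_density": n_links / len(species) if species else 0.0,
        "generality": n_links / len(consumers) if consumers else 0.0,
    }


def remove_species(links, lost):
    """Drop every link that involves a lost species (no secondary extinctions)."""
    return {(prey, pred) for prey, pred in links if prey not in lost and pred not in lost}


# Hypothetical stream web: (prey, predator) pairs.
web = {
    ("algae", "mayfly"), ("algae", "snail"), ("detritus", "caddisfly"),
    ("mayfly", "trout"), ("snail", "trout"), ("caddisfly", "trout"),
    ("mayfly", "sculpin"),
}

baseline = web_indices(web)
without_fish = web_indices(remove_species(web, {"trout"}))
without_invertebrate = web_indices(remove_species(web, {"snail"}))
for label, result in [("baseline", baseline), ("trout lost", without_fish),
                      ("snail lost", without_invertebrate)]:
    print(label, result)
```

In this toy web, removing the generalist fish lowers link density and generality more than removing one invertebrate, which mirrors the direction of the reported result without saying anything about its magnitude.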
Bdellovibrio bacteriovorus 109J is a predatory bacterium that lives by preying on other Gram-negative bacteria to obtain the nutrients it needs for replication and survival. Here, we evaluated the effects that two classes of bacterial signaling molecules (acyl homoserine lactones (AHLs) and diffusible signaling factor (DSF)) have on B. bacteriovorus 109J behavior and viability. While AHLs had no significant impact on predation rates, DSF considerably delayed predation and bdelloplast lysis. Subsequent experiments showed that 50 μM DSF also reduced the motility of attack-phase B. bacteriovorus 109J cells by 50% (38.2 ± 14.9 vs. 17 ± 8.9 μm/s). Transcriptomic analyses found that DSF caused genome-wide changes in B. bacteriovorus 109J gene expression patterns during both the attack and intraperiplasmic phases, including the significant downregulation of the flagellum assembly genes and numerous serine protease genes. While the former accounts for the reduced speeds observed, the latter was confirmed experimentally, with 50 μM DSF completely blocking protease secretion from attack-phase cells. Additional experiments found that 30% of the total cellular ATP was released into the supernatant when B. bacteriovorus 109J was exposed to 200 μM DSF, implying that this QS molecule negatively impacts membrane integrity.
Deregulated pathways identified from transcriptome data of two sample groups have played a key role in many genomic studies. Gene-set enrichment analysis (GSEA) has been commonly used for pathway or functional analysis of microarray data, and it is also being applied to RNA-seq data. However, most RNA-seq data so far have only a small number of replicates. This forces the use of the gene-permuting GSEA method (or preranked GSEA), which results in a great number of false positives due to the inter-gene correlation in each gene-set. We demonstrate that incorporating the absolute gene statistic in one-tailed GSEA considerably improves the false-positive control and the overall discriminatory ability of the gene-permuting GSEA methods for RNA-seq data. To test the performance, a simulation method to generate correlated read counts within a gene-set was newly developed, and a dozen currently available RNA-seq enrichment analysis methods were compared, where the proposed methods outperformed others that do not account for the inter-gene correlation. Analysis of real RNA-seq data also supported the proposed methods in terms of false positive control, ranks of true positives and biological relevance. An efficient R package (AbsFilterGSEA) coded with C++ (Rcpp) is available from CRAN.
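The mechanics of a one-tailed, gene-permuting enrichment test on absolute gene statistics can be sketched in a few lines of Python. This is a generic weighted running-sum score with a gene-permutation null, written under the assumption that every gene in the set has a score; it is not the AbsFilterGSEA algorithm (no filtering step is included), and the gene names and scores are synthetic.

```python
import random


def enrichment_score(scores, gene_set):
    """One-tailed running-sum score over genes ranked by absolute statistic.

    Hits step up in proportion to |score|, misses step down by a constant,
    and only the maximum positive deviation is kept.
    """
    ranked = sorted(scores, key=lambda g: abs(scores[g]), reverse=True)
    hit_total = sum(abs(scores[g]) for g in gene_set)
    miss_step = 1.0 / (len(ranked) - len(gene_set))
    running, best = 0.0, 0.0
    for gene in ranked:
        if gene in gene_set:
            running += abs(scores[gene]) / hit_total
        else:
            running -= miss_step
        best = max(best, running)
    return best


def gene_permutation_pvalue(scores, gene_set, n_perm=500, seed=0):
    """Compare the observed score with scores of random gene sets of equal size."""
    rng = random.Random(seed)
    genes = list(scores)
    observed = enrichment_score(scores, gene_set)
    exceed = sum(
        enrichment_score(scores, set(rng.sample(genes, len(gene_set)))) >= observed
        for _ in range(n_perm)
    )
    return observed, (exceed + 1) / (n_perm + 1)


# Synthetic data: 190 background genes plus a 10-gene set that changes
# in both directions (half up, half down).
rng = random.Random(1)
scores = {f"bg{i}": rng.gauss(0.0, 1.0) for i in range(190)}
gene_set = {f"set{i}" for i in range(10)}
scores.update({g: (3.0 if i % 2 == 0 else -3.0) for i, g in enumerate(sorted(gene_set))})

observed, pvalue = gene_permutation_pvalue(scores, gene_set)
print(f"ES={observed:.3f}, p={pvalue:.4f}")
```

Because the ranking uses absolute values, the bidirectional set ends up at the top of the list and is detected by a one-tailed test; a signed ranking would split the same genes between both ends of the list.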