Biclustering is an important exploratory analysis tool that simultaneously clusters rows (e.g., samples) and columns (e.g., variables) of a data matrix. Checkerboard-like biclusters reveal intrinsic associations between rows and columns. However, most existing methods rely on Gaussian assumptions and only apply to matrix data. In practice, non-Gaussian and/or multi-way tensor data are frequently encountered. A new CO-clustering method via Regularized Alternating Least Squares (CORALS) is proposed, which generalizes biclustering to non-Gaussian data and multi-way tensor arrays. Non-Gaussian data are modeled with single-parameter exponential family distributions and co-clusters are identified in the natural parameter space via sparse CANDECOMP/PARAFAC tensor decomposition. A regularized alternating (iteratively reweighted) least squares algorithm is devised for model fitting and a deflation procedure is exploited to automatically determine the number of co-clusters. Comprehensive simulation studies and three real data examples demonstrate the efficacy of the proposed method. The data and code are publicly available at https://github.com/reagan0323/CORALS.
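As an illustration of the alternating scheme described above, the following minimal sketch fits one sparse rank-one layer to a two-way Gaussian matrix by alternating soft-thresholded least-squares updates. It is not the CORALS implementation: the exponential-family reweighting, the multi-way CANDECOMP/PARAFAC structure, and the deflation step are all omitted, and the names and penalty values (`sparse_rank1`, `lam_u`, `lam_v`) are assumptions made for this example.

```python
import numpy as np


def soft_threshold(x, lam):
    """Elementwise soft-thresholding (proximal operator of the L1 penalty)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def sparse_rank1(X, lam_u=0.5, lam_v=0.5, n_iter=100, tol=1e-8):
    """Approximate X by d * outer(u, v) with sparse unit-norm u and v.

    Hypothetical helper for illustration only (Gaussian, two-way case).
    Rows with nonzero u and columns with nonzero v form one bicluster.
    """
    # Start from the leading singular vectors of X.
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    u, v = U[:, 0], Vt[0]
    for _ in range(n_iter):
        u_old = u
        # Least-squares update of u with v fixed, then L1 shrinkage.
        u = soft_threshold(X @ v, lam_u)
        if not u.any():  # penalty removed every row: no bicluster found
            return 0.0, np.zeros(X.shape[0]), np.zeros(X.shape[1])
        u = u / np.linalg.norm(u)
        # Least-squares update of v with u fixed, then L1 shrinkage.
        v = soft_threshold(X.T @ u, lam_v)
        if not v.any():
            return 0.0, np.zeros(X.shape[0]), np.zeros(X.shape[1])
        v = v / np.linalg.norm(v)
        if np.linalg.norm(u - u_old) < tol:
            break
    d = u @ X @ v  # scale of the fitted layer
    return d, u, v
```

Larger values of `lam_u` and `lam_v` produce smaller biclusters; in practice these penalties would need to be tuned rather than fixed as they are here.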
Errors of coupled general circulation models (CGCMs) limit their utility for climate prediction and projection. Origins of and feedback for tropical biases are investigated in the historical climate simulations of 18 CGCMs from phase 5 of the Coupled Model Intercomparison Project (CMIP5), together with the available Atmospheric Model Intercomparison Project (AMIP) simulations. Based on an intermodel empirical orthogonal function (EOF) analysis of tropical Pacific precipitation, the excessive equatorial Pacific cold tongue and double intertropical convergence zone (ITCZ) stand out as the most prominent errors of the current generation of CGCMs. The comparison of CMIP–AMIP pairs enables us to identify whether a given type of error originates from atmospheric models. The equatorial Pacific cold tongue bias is associated with deficient precipitation and surface easterly wind biases in the western half of the basin in CGCMs, but these errors are absent in atmosphere-only models, indicating that the errors arise from the interaction with the ocean via Bjerknes feedback. For the double ITCZ problem, excessive precipitation south of the equator correlates well with excessive downward solar radiation in the Southern Hemisphere (SH) midlatitudes, an error traced back to atmospheric model simulations of cloud during austral spring and summer. This extratropical forcing of the ITCZ displacements is mediated by tropical ocean–atmosphere interaction and is consistent with recent studies of ocean–atmospheric energy transport balance.
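An intermodel EOF analysis of the kind mentioned above can be sketched with a singular value decomposition: each model's climatological field forms one row, the multi-model mean is removed, and the leading right singular vectors give the spatial patterns along which the models differ most. The sketch below is a generic illustration, not the authors' procedure; the function name `intermodel_eof` is hypothetical, and area weighting of grid points is deliberately left out.

```python
import numpy as np


def intermodel_eof(fields, n_modes=2):
    """EOFs of the spread across models (hypothetical helper).

    fields : array of shape (n_models, n_gridpoints), one climatology per row.
    Returns (eofs, pcs, var_frac):
      eofs     (n_modes, n_gridpoints) spatial patterns,
      pcs      (n_models, n_modes) loading of each model on each pattern,
      var_frac (n_modes,) fraction of intermodel variance explained.
    """
    # Deviation of every model from the multi-model ensemble mean.
    anomalies = fields - fields.mean(axis=0, keepdims=True)
    U, s, Vt = np.linalg.svd(anomalies, full_matrices=False)
    var_frac = s**2 / np.sum(s**2)
    pcs = U[:, :n_modes] * s[:n_modes]
    return Vt[:n_modes], pcs, var_frac[:n_modes]
```

Regressing other variables (for example sea surface temperature) onto a column of `pcs` would then show which biases co-vary with a given precipitation error pattern across models.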
Natural medicines were the only option for the prevention and treatment of human diseases for thousands of years. Natural products are important sources for drug development. The amounts of bioactive natural products in natural medicines are always fairly low. Today, it is crucial to develop effective and selective methods for the extraction and isolation of those bioactive natural products. This paper intends to provide a comprehensive view of a variety of methods used in the extraction and isolation of natural products. It also presents the advantages, disadvantages and practical examples of conventional and modern techniques involved in natural products research.
ChIPseeker is an R package for annotating ChIP-seq data. It supports annotating ChIP peaks and provides functions to visualize ChIP peak coverage over chromosomes and profiles of peaks binding to TSS regions. Comparison of ChIP peak profiles and annotation is also supported. Moreover, it supports evaluating significant overlap among ChIP-seq datasets. Currently, ChIPseeker contains information on 15 000 BED files from the GEO database. These datasets can be downloaded and compared with the user's own data to find significantly overlapping datasets, from which co-regulation or transcription factor complexes can be inferred for further investigation.
ChIPseeker is released under the Artistic-2.0 License. The source code and documents are freely available through Bioconductor (http://www.bioconductor.org/packages/release/bioc/html/ChIPseeker.html).
Since 2018, China has put into practice the new Environmental Protection Tax Law, which levies taxes on non-greenhouse pollutants with stringent standards. To fully study the socioeconomic and environmental impacts of this policy, this paper establishes a dynamic country-level Computable General Equilibrium (CGE) model with the electricity sector disaggregated into 5 technologies and 19 major taxable pollutants included. Five scenarios are developed, including the baseline scenario BaU, the low environmental tax scenario LowET, the high environmental tax scenario HighET, the low environmental tax and low carbon tax scenario LowETC, and the high environmental tax and high carbon tax scenario HighETC. The simulation results show that the environmental tax could help reduce emissions of most kinds of pollutants but would have negative effects on the Gross Domestic Product (GDP). Compared to the BaU scenario, the GDP loss by 2030 would be 0.10%, 0.21%, 0.32% and 0.67% in the LowET, HighET, LowETC and HighETC scenarios, respectively. The emission of sulphur dioxide (SO2) would decrease by 3.55%, 7.15%, 6.70% and 13.01%, and the emission of carbon dioxide (CO2) would be reduced by 2.21%, 4.62%, 8.91% and 16.77%. The results also show that the heavily polluting sectors and energy-intensive sectors would suffer higher output loss, while the clean energy sectors and service sector would experience an increase in output in the policy scenarios.
• The environmental tax could help reduce emissions of most kinds of pollutants.
• The GDP loss would vary from 0.10% to 0.67% under different tax levels.
• Heavily polluting or energy-intensive sectors will suffer higher output loss.
• The environmental tax has a synergistic effect of reducing carbon emissions.
Disease ontology (DO) annotates human genes in the context of disease. DO is an important annotation for translating molecular findings from high-throughput data to clinical relevance. DOSE is an R package providing semantic similarity computations among DO terms and genes, which allows biologists to explore the similarities of diseases and of gene functions from a disease perspective. Enrichment analyses, including the hypergeometric model and gene set enrichment analysis, are also implemented to support discovering disease associations of high-throughput biological data. This allows biologists to verify disease relevance in a biological experiment and identify unexpected disease associations. Comparison among gene clusters is also supported.
DOSE is released under the Artistic-2.0 License. The source code and documents are freely available through Bioconductor (http://www.bioconductor.org/packages/release/bioc/html/DOSE.html).
Supplementary data are available at Bioinformatics online.
Contact: gcyu@connect.hku.hk or tqyhe@jnu.edu.cn.
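The hypergeometric enrichment model mentioned in the DOSE abstract above reduces to an upper-tail probability: given a background of N genes of which K are annotated to a disease term, how likely is it that a list of n genes contains at least k annotated ones by chance? The Python sketch below illustrates that calculation only; it is not the DOSE API (DOSE is an R package), and the function and argument names are assumptions made for this example.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """Upper-tail hypergeometric p-value P(X >= k) (hypothetical helper).

    N : number of genes in the background
    K : background genes annotated to the disease term
    n : size of the gene list under test
    k : genes in the list that are annotated to the term
    """
    upper = min(n, K)  # the overlap cannot exceed either set
    total = comb(N, n)  # all ways to draw the gene list
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Example: 3 genes drawn from 10, 4 of which are annotated; at least 2 hits.
p = hypergeom_enrichment_p(k=2, n=3, K=4, N=10)  # equals 40/120 = 1/3
```

When many disease terms are tested at once, the resulting p-values would normally be adjusted for multiple testing before interpretation.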
Recent studies have shown that the performance of single-image super-resolution methods can be significantly boosted by using deep convolutional neural networks. In this study, we present a novel single-image super-resolution method by introducing dense skip connections in a very deep network. In the proposed network, the feature maps of each layer are propagated into all subsequent layers, providing an effective way to combine the low-level features and high-level features to boost the reconstruction performance. In addition, the dense skip connections in the network enable short paths to be built directly from the output to each layer, alleviating the vanishing-gradient problem of very deep networks. Moreover, deconvolution layers are integrated into the network to learn the upsampling filters and to speed up the reconstruction process. Further, the proposed method substantially reduces the number of parameters, enhancing the computational efficiency. We evaluate the proposed method using images from four benchmark datasets and set a new state of the art.
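The dense connectivity pattern described above, in which every layer receives the feature maps of all preceding layers, can be sketched in a few lines. The example below is a simplified illustration rather than the proposed network: it uses 1x1 convolutions with ReLU in place of the real convolution layers, omits the deconvolution (upsampling) stage, and the name `dense_block` and the weight layout are assumptions made for this sketch.

```python
import numpy as np


def dense_block(x, weights):
    """Forward pass through a densely connected block (hypothetical helper).

    x       : input feature maps, shape (C0, H, W)
    weights : list of 1x1 convolution kernels; weights[i] has shape
              (growth, C0 + i * growth) because layer i sees the input
              plus the outputs of all i earlier layers.
    Returns the concatenation of the input and every layer's output.
    """
    features = [x]
    for W in weights:
        # Dense skip connections: stack all earlier feature maps channel-wise.
        inp = np.concatenate(features, axis=0)
        # 1x1 convolution (channel mixing at each pixel) followed by ReLU.
        out = np.maximum(np.einsum("oc,chw->ohw", W, inp), 0.0)
        features.append(out)
    return np.concatenate(features, axis=0)


# Usage: 3 input channels, 3 layers, growth rate 4 -> 3 + 3 * 4 = 15 channels.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))
weights = [rng.standard_normal((4, 3 + 4 * i)) for i in range(3)]
out = dense_block(x, weights)
```

Because each layer adds only a small number of channels (the growth rate) while reusing earlier features, the block stays narrow, which is one way such networks keep their parameter count low.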
Inspired by the zwitterion species generated from the splitting of H2 by frustrated Lewis pairs, we put forward a novel frustrated Lewis pair that combines a Lewis acid and a Lewis base incorporating Hδ‑ and Hδ+, respectively. Piers’ borane and chiral tert-butylsulfinamide were chosen as the FLP, and a metal-free asymmetric transfer hydrogenation of imines was realized with high enantioselectivities. Significantly, with ammonia borane as hydrogen source, a catalytic asymmetric reaction using 10 mol % of Piers’ borane, chiral tert-butylsulfinamide, and pyridine additive has been successfully achieved to furnish optically active amines in 78–99% yields with 84–95% ee’s. Experimental and theoretical mechanistic studies reveal an interesting 8-membered ring hydrogen transfer transition state and an expected regeneration of reactive species with ammonia borane. Accordingly, a plausible catalytic pathway for this reaction is depicted.
Esophageal squamous cell carcinoma (ESCC) is one of the deadliest cancers. We performed exome sequencing on 113 tumor-normal pairs, yielding a mean of 82 non-silent mutations per tumor, and 8 cell lines. The mutational profile of ESCC closely resembles those of squamous cell carcinomas of other tissues but differs from that of esophageal adenocarcinoma. Genes involved in cell cycle and apoptosis regulation were mutated in 99% of cases by somatic alterations of TP53 (93%), CCND1 (33%), CDKN2A (20%), NFE2L2 (10%) and RB1 (9%). Histone modifier genes were frequently mutated, including KMT2D (also called MLL2; 19%), KMT2C (MLL3; 6%), KDM6A (7%), EP300 (10%) and CREBBP (6%). EP300 mutations were associated with poor survival. The Hippo and Notch pathways were dysregulated by mutations in FAT1, FAT2, FAT3 or FAT4 (27%) or AJUBA (JUB; 7%) and NOTCH1, NOTCH2 or NOTCH3 (22%) or FBXW7 (5%), respectively. These results define the mutational landscape of ESCC and highlight mutations in epigenetic modulators with prognostic and potentially therapeutic implications.
A finite mixture of logistic regressions (FMLR) model was applied to analyze the heterogeneity within the merging driver population. This model can automatically provide useful hidden information about the characteristics of the driver population. The EM and Newton-Raphson algorithms were used to estimate the parameters. To accomplish the objective of this study, the FMLR model was applied to a trajectory dataset extracted from the NGSIM dataset and a 2-component FMLR model was identified. The important findings can be summarized as follows: The studied drivers can be classified into two components. One is called Risk-Rejecting Drivers. These drivers behave consistently with previous studies: they primarily merge as soon as possible and have a distinct preference for large gaps. The other is the Risk-Taking Drivers, who are much less sensitive to the gap size and pay more attention to surrounding traffic conditions such as the speed of the front vehicle in the auxiliary lane and the lead space gap between the merging vehicle and its leading vehicles in the auxiliary lane. Risk-Taking Drivers use the auxiliary lane to reach an area of the main lane that is further downstream or less congested. The proposed model also achieves higher prediction accuracy than a standard logistic regression model.
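The estimation strategy named above, EM with Newton-Raphson updates, can be sketched as follows for a mixture of logistic regressions. This is a generic illustration under stated assumptions, not the authors' code: it assumes that each driver contributes several binary merge decisions and belongs to one latent component (the `groups` argument), it adds a small ridge term to the Hessian purely as a numerical safeguard, and all names (`em_fmlr`, `sigmoid`) are hypothetical.

```python
import numpy as np


def sigmoid(z):
    # Clip the linear predictor so probabilities stay strictly inside (0, 1).
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def em_fmlr(X, y, groups, n_comp=2, n_iter=50, newton_steps=3, seed=0):
    """EM for a finite mixture of logistic regressions (illustrative sketch).

    X      : (n, p) covariates, including an intercept column if wanted
    y      : (n,) binary outcomes as floats (1.0 = merge, 0.0 = reject)
    groups : (n,) integer driver index in 0..G-1 for every observation
    Returns mixing proportions (K,), coefficients (K, p), and
    responsibilities (G, K).
    """
    rng = np.random.default_rng(seed)
    n_groups, p = int(groups.max()) + 1, X.shape[1]
    betas = rng.normal(scale=0.5, size=(n_comp, p))
    pis = np.full(n_comp, 1.0 / n_comp)
    for _ in range(n_iter):
        # E-step: posterior probability that each driver belongs to each component.
        P = sigmoid(X @ betas.T)
        ll_obs = y[:, None] * np.log(P) + (1.0 - y[:, None]) * np.log(1.0 - P)
        ll_grp = np.zeros((n_groups, n_comp))
        np.add.at(ll_grp, groups, ll_obs)  # sum log-likelihoods per driver
        logw = ll_grp + np.log(pis)
        logw -= logw.max(axis=1, keepdims=True)  # stabilise the exponentials
        resp = np.exp(logw)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update proportions, then weighted Newton-Raphson per component.
        pis = resp.mean(axis=0)
        for k in range(n_comp):
            w = resp[groups, k]
            for _ in range(newton_steps):
                pk = sigmoid(X @ betas[k])
                grad = X.T @ (w * (y - pk))
                hess = (X * (w * pk * (1.0 - pk))[:, None]).T @ X
                betas[k] += np.linalg.solve(hess + 1e-6 * np.eye(p), grad)
    return pis, betas, resp
```

Assigning each driver to the component with the largest responsibility gives a classification of the kind described above; in practice the number of components would be chosen by comparing fitted models rather than being fixed in advance.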