Partial least squares (PLS) has found wide application, especially in chemometrics, metabolomics/metabonomics, and bioinformatics. Here, we present libPLS, a library that integrates not only basic PLS modeling algorithms but also advanced and/or recently developed methods for model assessment, outlier detection, and variable selection. The package features a set of Model Population Analysis (MPA)-type approaches that have not yet been integrated into a single package, and it thus functionally complements existing toolboxes. libPLS provides an integrated platform for developing PLS regression and/or PLS linear discriminant analysis (PLS-LDA) models. It is written in MATLAB and is freely available at www.libpls.net.
• Provides an integrated library for partial least squares regression and discriminant analysis.
• Features model population analysis approaches.
• Contains a series of versatile variable selection methods.
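As a rough illustration of the basic PLS regression that such a library builds on, the following pure-Python sketch fits a single-response PLS model with NIPALS-style deflation. It is not libPLS code (libPLS is written in MATLAB); the function names `pls1_fit` and `pls1_predict` and the returned model layout are hypothetical, chosen only for this example.

```python
def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model by successive deflation (illustrative)."""
    n, p = len(X), len(X[0])
    # Center predictors and response.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    f = [v - y_mean for v in y]
    components = []
    for _ in range(n_components):
        # Weight vector: direction of maximal covariance with the residual response.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        # Scores, X loadings, and the regression coefficient of y on the scores.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        p_load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * p_load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p_load, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict the response for one sample by replaying the deflation steps."""
    x_mean, y_mean, components = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, p_load, q in components:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p_load)]
    return y_hat
```

Prediction replays the deflation per sample instead of forming a coefficient vector, which avoids a matrix inverse and keeps the sketch dependency-free. In practice the number of components would be chosen by cross-validation.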
Sequence-derived structural and physicochemical features have been frequently used for analysing and predicting structural, functional, expression and interaction profiles of proteins and peptides. To facilitate extensive studies of proteins and peptides, we developed a freely available, open-source python package called protein in python (propy) for calculating the widely used structural and physicochemical features of proteins and peptides from amino acid sequence. It computes five feature groups composed of 13 features, including amino acid composition, dipeptide composition, tripeptide composition, normalized Moreau-Broto autocorrelation, Moran autocorrelation, Geary autocorrelation, sequence-order-coupling number, quasi-sequence-order descriptors, composition, transition and distribution of various structural and physicochemical properties, and two types of pseudo amino acid composition (PseAAC) descriptors. These features can generally be regarded as different modes of Chou's PseAAC. In addition, it can easily compute the above descriptors based on user-defined properties, which are automatically available from the AAindex database.
The python package, propy, is freely available via http://code.google.com/p/protpy/downloads/list, and it runs on Linux and MS-Windows.
Supplementary data are available at Bioinformatics online.
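To make the simplest of the sequence-derived features above concrete, the following sketch computes amino acid composition and dipeptide composition as fractions. It does not use propy itself; the function names are hypothetical, and reporting fractions (rather than, say, percentages) is an assumption of this example.

```python
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Fraction of each of the 20 standard residues in the sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return {aa: seq.count(aa) / len(seq) for aa in AMINO_ACIDS}


def dipeptide_composition(sequence):
    """Fraction of each of the 400 ordered residue pairs among adjacent pairs."""
    seq = sequence.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs with non-standard residues are ignored
            counts[pair] += 1
    total = len(seq) - 1
    return {pair: n / total for pair, n in counts.items()}
```

For a sequence such as `"ACDA"`, the first function yields a 20-entry dictionary and the second a 400-entry dictionary, which can be concatenated into a fixed-length feature vector regardless of sequence length.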
Osteosarcoma, an aggressive malignant cancer, has a high lung metastasis rate and lacks therapeutic targets. Here, we report that chromobox homolog 4 (CBX4) was overexpressed in osteosarcoma cell lines and tissues. CBX4 promoted metastasis by transcriptionally up-regulating Runx2 via the recruitment of GCN5 to the Runx2 promoter. The phosphorylation of CBX4 at T437 by casein kinase 1α (CK1α) facilitated its ubiquitination at both K178 and K280 and subsequent degradation by CHIP, and this phosphorylation of CBX4 could be reduced by TNFα. Consistently, CK1α suppressed cell migration and invasion through inhibition of CBX4. There was an inverse correlation between CK1α and CBX4 in osteosarcoma tissues, and CK1α was a valuable marker to predict clinical outcomes in osteosarcoma patients with metastasis. Pyrvinium pamoate (PP), a selective activator of CK1α, could inhibit osteosarcoma metastasis via the CK1α/CBX4 axis. Our findings indicate that targeting the CK1α/CBX4 axis may benefit osteosarcoma patients with metastasis.
The heterogeneous nature of tumour microenvironment (TME) underlying diverse treatment responses remains unclear in nasopharyngeal carcinoma (NPC). Here, we profile 176,447 cells from 10 NPC tumour-blood pairs, using single-cell transcriptome coupled with T cell receptor sequencing. Our analyses reveal 53 cell subtypes, including tumour-infiltrating CD8+ T, regulatory T (Treg), and dendritic cells (DCs), as well as malignant cells with different Epstein-Barr virus infection status. Trajectory analyses reveal exhausted CD8+ T and immune-suppressive TNFRSF4+ Treg cells in tumours might derive from peripheral CX3CR1+CD8+ T and naïve Treg cells, respectively. Moreover, we identify immune-regulatory and tolerogenic LAMP3+ DCs. Noteworthily, we observe intensive inter-cell interactions among LAMP3+ DCs, Treg, exhausted CD8+ T, and malignant cells, suggesting potential cross-talks to foster an immune-suppressive niche for the TME. Collectively, our study uncovers the heterogeneity and interacting molecules of the TME in NPC at single-cell resolution, which provide insights into the mechanisms underlying NPC progression and the development of precise therapies for NPC.
Baseline drift always blurs or even swamps signals and deteriorates analytical results, particularly in multivariate analysis. It is necessary to correct baseline drift before performing further data analysis. Simple or modified polynomial fitting has been found to be effective to some extent. However, this method requires user intervention and is prone to variability, especially in low signal-to-noise ratio environments. A novel algorithm named adaptive iteratively reweighted Penalized Least Squares (airPLS), which does not require any user intervention or prior information such as peak detection, is proposed in this work. The method works by iteratively changing the weights of the sum of squared errors (SSE) between the fitted baseline and the original signals, and the weights of the SSE are obtained adaptively using the difference between the previously fitted baseline and the original signals. The baseline estimator is fast and flexible. Theory, implementation, and applications in simulated and real datasets are presented. The algorithm is implemented in R and MATLAB™ and is available as open-source software (http://code.google.com/p/airpls).
A novel algorithm named adaptive iteratively reweighted Penalized Least Squares (airPLS) is proposed for baseline correction in analytical chemistry. Correction results on real data show it to be simple yet flexible, valid, and fast for baseline estimation.
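The iterative reweighting idea can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' R or MATLAB implementation: it assumes a first-order difference penalty (which makes the linear system tridiagonal), an exponential weight update for points lying below the current baseline, and hypothetical names and defaults (`airpls`, `lam`, `max_iter`, `tol`).

```python
import math


def whittaker_smooth(y, w, lam):
    """Solve (W + lam * D'D) z = W y for a first-order difference matrix D."""
    n = len(y)
    # D'D is tridiagonal: diagonal 1, 2, ..., 2, 1 and off-diagonals -1.
    diag = [w[i] + lam * (1.0 if i in (0, n - 1) else 2.0) for i in range(n)]
    rhs = [w[i] * y[i] for i in range(n)]
    off = -lam
    # Thomas algorithm: forward elimination, then back substitution.
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    z = [0.0] * n
    z[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        z[i] = d[i] - c[i] * z[i + 1]
    return z


def airpls(y, lam=100.0, max_iter=15, tol=1e-3):
    """Estimate a baseline for signal y (len(y) >= 2) by iterative reweighting."""
    n = len(y)
    w = [1.0] * n
    total = sum(abs(v) for v in y)
    z = list(y)
    for t in range(1, max_iter + 1):
        z = whittaker_smooth(y, w, lam)
        d = [yi - zi for yi, zi in zip(y, z)]
        neg = [v for v in d if v < 0]
        dssn = abs(sum(neg))
        # Stop once the signal barely dips below the fitted baseline.
        if not neg or dssn < tol * total:
            break
        # Points above the baseline (likely peaks) get zero weight;
        # points below it get weights that grow with the iteration count.
        w = [math.exp(t * abs(v) / dssn) if v < 0 else 0.0 for v in d]
        # Keep both endpoints weighted so the system stays well-posed.
        w[0] = w[-1] = math.exp(t * max(neg) / dssn)
    return z
```

The smoothness parameter `lam` is the only setting that usually needs attention: larger values give a stiffer baseline. Subtracting the returned list from the input signal gives the baseline-corrected signal.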
The first enantioselective total synthesis of (−)‐aspidophylline A, including assignment of its absolute configuration, has been accomplished. A key element of the synthesis is a highly enantioselective indole allylic alkylation/iminium cyclization cascade, which was developed by employing a combination of Lewis acid activation and an iridium/ligand catalyst. This strategy relies on the direct use of 2,3‐disubstituted indoles with secondary allylic alcohols appended at C2 and heteronucleophiles appended at C3; these indoles are easily prepared from simple starting materials under C−H activation conditions.
The enantioselective total synthesis of (−)‐aspidophylline A, including assignment of its absolute configuration, has been accomplished. A key element of the synthesis is a highly enantioselective indole allylic alkylation/iminium cyclization cascade, which was developed by employing a combination of Lewis acid activation and an iridium/ligand catalyst. cod=1,5‐cyclooctadiene, Tf=trifluoromethanesulfonyl.
Molecular representation for small molecules has been routinely used in QSAR/SAR, virtual screening, database search, ranking, drug ADME/T prediction and other drug discovery processes. To facilitate extensive studies of drug molecules, we developed a freely available, open-source python package called chemoinformatics in python (ChemoPy) for calculating the commonly used structural and physicochemical features. It computes 16 drug feature groups composed of 19 descriptors that include 1135 descriptor values. In addition, it provides seven types of molecular fingerprint systems for drug molecules, including topological fingerprints, electro-topological state (E-state) fingerprints, MACCS keys, FP4 keys, atom pairs fingerprints, topological torsion fingerprints and Morgan/circular fingerprints. By calling the semi-empirical quantum chemistry program MOPAC, ChemoPy can also conveniently compute a large number of 3D molecular descriptors.
The python package, ChemoPy, is freely available via http://code.google.com/p/pychem/downloads/list, and it runs on Linux and MS-Windows.
Supplementary data are available at Bioinformatics online.
► The proposed method possesses advantages of RJMCMC algorithms.
► The proposed method is easier to implement than RJMCMC.
► Competitive results over published work were obtained.
► Random frog is computationally very efficient.
The identification of disease-relevant genes represents a challenge in microarray-based disease diagnosis, where the sample size is often limited. Among established methods, reversible jump Markov Chain Monte Carlo (RJMCMC) methods have proven to be quite promising for variable selection. However, the design and application of an RJMCMC algorithm requires, for example, special criteria for prior distributions. Also, simulation from the joint posterior distributions of models is computationally intensive and may even be mathematically intractable. These disadvantages may limit the applications of RJMCMC algorithms. Therefore, there is a need for algorithms that possess the advantages of RJMCMC methods and are also efficient and easy to follow for selecting disease-associated genes. Here we report an RJMCMC-like method, called random frog, that possesses the advantages of RJMCMC methods and is much easier to implement. Using the colon and the estrogen gene expression datasets, we show that random frog is effective in identifying discriminating genes. The top 2 ranked genes for colon and estrogen are Z50753, U00968, and Y10871_at, Z22536_at, respectively. (The source codes with GNU General Public License Version 2.0 are freely available to non-commercial users at: http://code.google.com/p/randomfrog/.)