Abstract
To speed up the discovery of COVID-19 disease mechanisms from X-ray images, this research developed a new diagnosis platform using a deep convolutional neural network (DCNN) that is able to assist radiologists with diagnosis by distinguishing COVID-19 pneumonia from non-COVID-19 pneumonia in patients based on chest X-ray classification and analysis. Such a tool can save time in interpreting chest X-rays and increase accuracy, thereby enhancing our medical capacity for the detection and diagnosis of COVID-19. An explainable method is also used in the DCNN to select instances of the X-ray dataset images to explain the behavior of training-learning models to achieve higher prediction accuracy. The average accuracy of our method is above 96%; the method can replace manual reading and has the potential to be applied to large-scale rapid screening of COVID-19 for a wide range of use cases.
The existing results show the applicability of the over-parameterized model based subspace identification method (OPM-like SIM) developed for consistent estimates of Hammerstein systems under completely unknown periodic disturbances. However, it requires estimating extra parameters and performing a low-rank approximation step. Therefore, it may give rise to unnecessarily high variance in parameter estimates, especially when using a small and noisy data set. To overcome this corruptive phenomenon, we propose a parsimonious model based SIM to obtain a consistent parameter estimate for Hammerstein systems under completely unknown periodic disturbances. Two parsimonious models instead of the OPM are used to describe the Hammerstein systems, and an orthogonal projection based fixed point iteration method is proposed to eliminate the disturbance effects and give a consistent parameter estimate. Avoiding the estimation of extra parameters and the low-rank approximation step of the classical OPM-like SIM has the potential to improve the accuracy and variance properties of the parameter estimates. The effectiveness and merits are demonstrated with strict mathematical proofs, along with simulation examples.
Predicting residue‐residue distance relationships (e.g., contacts) has become the key direction to advance protein structure prediction since the 2014 CASP11 experiment, while deep learning has revolutionized the technology for contact and distance distribution prediction since its debut in the 2012 CASP10 experiment. During the 2018 CASP13 experiment, we enhanced our MULTICOM protein structure prediction system with three major components: contact distance prediction based on deep convolutional neural networks, distance‐driven template‐free (ab initio) modeling, and protein model ranking empowered by deep learning and contact prediction. Our experiment demonstrates that contact distance prediction and deep learning methods are the key reasons that MULTICOM was ranked 3rd out of all 98 predictors in both template‐free and template‐based structure modeling in CASP13. Deep convolutional neural networks can utilize global information in pairwise residue‐residue features such as coevolution scores to substantially improve contact distance prediction, which played a decisive role in correctly folding some free modeling and hard template‐based modeling targets. Deep learning also successfully integrated one‐dimensional structural features, two‐dimensional contact information, and three‐dimensional structural quality scores to improve protein model quality assessment, where contact prediction was demonstrated for the first time to consistently enhance the ranking of protein models. The success of the MULTICOM system clearly shows that protein contact distance prediction and model selection driven by deep learning hold the key to solving the protein structure prediction problem. However, there are still challenges in accurately predicting protein contact distance when there are few homologous sequences, folding proteins from noisy contact distances, and ranking models of hard targets.
Abstract
Motivation
Protein fold recognition is an important problem in structural bioinformatics. Almost all traditional fold recognition methods use sequence (homology) comparison to indirectly predict the fold of a target protein based on the fold of a template protein with known structure, which cannot explain the relationship between sequence and fold. Only a few methods have been developed to classify protein sequences into a small number of folds due to methodological limitations, and these are not generally useful in practice.
Results
We develop a deep 1D-convolution neural network (DeepSF) to directly classify any protein sequence into one of 1195 known folds, which is useful for both fold recognition and the study of the sequence-structure relationship. Different from traditional sequence alignment (comparison) based methods, our method automatically extracts fold-related features from a protein sequence of any length and maps it to the fold space. We train and test our method on the datasets curated from SCOP1.75, yielding an average classification accuracy of 75.3%. On the independent testing dataset curated from SCOP2.06, the classification accuracy is 73.0%. We compare our method with a top profile-profile alignment method, HHSearch, on hard template-based and template-free modeling targets of CASP9-12 in terms of fold recognition accuracy. The accuracy of our method is 12.63-26.32% higher than HHSearch on template-free modeling targets and 3.39-17.09% higher on hard template-based modeling targets for top 1, 5 and 10 predicted folds. The hidden features extracted from sequence by our method are robust against sequence mutation, insertion, deletion and truncation, and can be used for other protein pattern recognition problems such as protein clustering, comparison and ranking.
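The "top 1, 5 and 10 predicted folds" comparison above is a top-k accuracy: a prediction counts as correct when the true fold appears among the k highest-scoring folds. The sketch below is a minimal illustration of that metric only, not the DeepSF evaluation code; the function name `top_k_accuracy` and the toy scores are assumptions made for this example.

```python
def top_k_accuracy(scores, true_labels, k):
    """Fraction of samples whose true class is among the k highest-scoring classes.

    scores      -- list of per-sample score lists (one score per fold class)
    true_labels -- list of true class indices, one per sample
    k           -- number of top-ranked classes to consider
    (All names here are hypothetical; they do not come from DeepSF.)
    """
    hits = 0
    for row, label in zip(scores, true_labels):
        # Rank class indices from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Toy example with 3 samples and 3 classes (made-up numbers).
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
toy_labels = [1, 1, 0]
top1 = top_k_accuracy(toy_scores, toy_labels, 1)
top2 = top_k_accuracy(toy_scores, toy_labels, 2)
```

Larger k can only keep or raise the accuracy, which is why results for top 1, 5 and 10 are usually reported together.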
Availability and implementation
The DeepSF server is publicly available at: http://iris.rnet.missouri.edu/DeepSF/.
Supplementary information
Supplementary data are available at Bioinformatics online.
The rapid and accurate taxonomic identification of fossils is of great significance in paleontology, biostratigraphy, and other fields. However, taxonomic identification is often labor-intensive and tedious, and acquiring the extensive prior knowledge needed about a taxonomic group requires long-term training. Moreover, identification results are often inconsistent across researchers and communities. Accordingly, in this study, we used deep learning to support taxonomic identification. We used web crawlers to collect the Fossil Image Dataset (FID) via the Internet, obtaining 415,339 images belonging to 50 fossil clades. Then we trained three powerful convolutional neural networks on a high-performance workstation. The Inception-ResNet-v2 architecture achieved an average accuracy of 0.90 in the test dataset when transfer learning was applied. The clades of microfossils and vertebrate fossils exhibited the highest identification accuracies of 0.95 and 0.90, respectively. In contrast, clades of sponges, bryozoans, and trace fossils with various morphologies or with few samples in the dataset exhibited a performance below 0.80. Visual explanation methods further highlighted the discrepancies among different fossil clades and suggested similarities between the identifications made by machine classifiers and taxonomists. Collecting large paleontological datasets from various sources, such as the literature, digitization of dark data, citizen-science data, and public data from the Internet, may further enhance deep learning methods and their adoption. Such developments may also lead to image-based systematic taxonomy being replaced by machine-aided classification in the future. Pioneering studies can include microfossils and some invertebrate fossils. To contribute to this development, we deployed our model on a server for public access at www.ai-fossil.com.
Tau pathology is a hallmark of Alzheimer's disease (AD) and other tauopathies. During disease progression, abnormally phosphorylated forms of tau aggregate and accumulate into neurofibrillary tangles, leading to synapse loss, neuroinflammation, and neurodegeneration. Thus, targeting of tau pathology is expected to be a promising strategy for AD treatment.
The effect of rutin on tau aggregation was detected by thioflavin T fluorescence and transmission electron microscope imaging. The effect of rutin on tau oligomer-induced cytotoxicity was assessed by MTT assay. The effect of rutin on tau oligomer-mediated production of IL-1β and TNF-α in vitro was measured by ELISA. The uptake of extracellular tau by microglia was determined by immunocytochemistry. Six-month-old male Tau-P301S mice were treated with rutin or vehicle by oral administration daily for 30 days. Cognitive performance was determined using the Morris water maze test, Y-maze test, and novel object recognition test. The levels of pathological tau, gliosis, NF-kB activation, proinflammatory cytokines such as IL-1β and TNF-α, and synaptic proteins including synaptophysin and PSD95 in the brains of the mice were evaluated by immunolabeling, immunoblotting, or ELISA.
We showed that rutin, a natural flavonoid glycoside, inhibited tau aggregation and tau oligomer-induced cytotoxicity, lowered the production of proinflammatory cytokines, protected neuronal morphology from toxic tau oligomers, and promoted microglial uptake of extracellular tau oligomers in vitro. When applied to the Tau-P301S mouse model of tauopathy, rutin reduced pathological tau levels, regulated tau hyperphosphorylation by increasing the PP2A level, suppressed gliosis and neuroinflammation by downregulating the NF-kB pathway, prevented microglial synapse engulfment, and rescued synapse loss in mouse brains, resulting in a significant improvement of cognition.
In combination with the previously reported therapeutic effects of rutin on Aβ pathology, rutin is a promising drug candidate for AD treatment based on its combinatorial targeting of tau and Aβ.
Driven by deep learning, inter-residue contact/distance prediction has been significantly improved and has substantially enhanced ab initio protein structure prediction. Currently, most distance prediction methods classify inter-residue distances into multiple distance intervals instead of directly predicting real-value distances. The output of the former has to be converted into real-value distances to be used in tertiary structure prediction.
To explore the potential of predicting real-value inter-residue distances, we develop a multi-task deep learning distance predictor (DeepDist) based on new residual convolutional network architectures to simultaneously predict real-value inter-residue distances and classify them into multiple distance intervals. Tested on 43 CASP13 hard domains, DeepDist achieves comparable performance in real-value distance prediction and multi-class distance prediction. The average mean square error (MSE) of DeepDist's real-value distance prediction is 0.896 Å² when filtering out predicted distances ≥ 16 Å, which is lower than the 1.003 Å² of DeepDist's multi-class distance prediction. When distance predictions are converted into contact predictions at the 8 Å threshold (the standard threshold in the field), the precision of the top L/5 and L/2 contact predictions of DeepDist's multi-class distance prediction is 79.3% and 66.1%, respectively, higher than the 78.6% and 64.5% of its real-value distance prediction and the best results in the CASP13 experiment.
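The two evaluation quantities above (MSE restricted to predicted distances below 16 Å, and the precision of the top L/5 or L/2 contacts at an 8 Å threshold) can be sketched as follows. This is a minimal illustration under stated assumptions, not DeepDist's evaluation code: the function names, the ranking of pairs by smallest predicted distance, and the `min_sep` sequence-separation parameter are all choices made for this example.

```python
def filtered_mse(pred_dist, true_dist, cutoff=16.0):
    """Mean square error over residue pairs whose predicted distance is below cutoff."""
    n = len(pred_dist)
    errors = [
        (pred_dist[i][j] - true_dist[i][j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
        if pred_dist[i][j] < cutoff
    ]
    if not errors:
        raise ValueError("no residue pair has a predicted distance below the cutoff")
    return sum(errors) / len(errors)


def top_contact_precision(pred_dist, true_dist, fraction=5, threshold=8.0, min_sep=6):
    """Precision of the top L/fraction predicted contacts.

    Pairs are ranked by smallest predicted distance (an assumption of this sketch);
    a pair is a true contact when its true distance is below threshold (in Å).
    min_sep is a hypothetical minimum sequence separation between the two residues.
    """
    n = len(pred_dist)
    pairs = [(pred_dist[i][j], i, j) for i in range(n) for j in range(i + min_sep, n)]
    pairs.sort()
    top = pairs[: max(1, n // fraction)]
    hits = sum(1 for _, i, j in top if true_dist[i][j] < threshold)
    return hits / len(top)


# Toy 10-residue "chain" where the distance grows linearly with separation.
L = 10
true_d = [[3.8 * abs(i - j) for j in range(L)] for i in range(L)]
perfect = [row[:] for row in true_d]                  # prediction equal to the truth
inverted = [[40.0 - d for d in row] for row in true_d]  # deliberately wrong ranking

precision_perfect = top_contact_precision(perfect, true_d, fraction=5, min_sep=2)
precision_inverted = top_contact_precision(inverted, true_d, fraction=5, min_sep=2)
mse_perfect = filtered_mse(perfect, true_d)
```

The same `top_contact_precision` call with `fraction=2` gives the top L/2 precision.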
DeepDist can predict inter-residue distances well and improve binary contact prediction over the existing state-of-the-art methods. Moreover, the predicted real-value distances can be directly used to reconstruct protein tertiary structures better than multi-class distance predictions due to the lower MSE. Finally, we demonstrate that predicting the real-value distance map and multi-class distance map at the same time performs better than predicting real-value distances alone.
To improve state-of-health (SOH) estimation and remaining useful life (RUL) prediction, a prognostic framework shared by multiple batteries is proposed. A variant long-short-term memory (LSTM) neural network (NN), called AST-LSTM NN, is designed to guarantee the performance of the proposed framework. Firstly, the input and forget gates are coupled by a fixed connection, which leads to simultaneous determination of old information and new data. Secondly, the element-wise product of the new inputs and the historical cell states is computed to screen out more beneficial information. Thirdly, a peephole connection from the “constant error carousel” (CEC) is added into the output gate to shield the unwanted error signals. AST-LSTM NNs, with mapping structures of many-to-one and one-to-one, are trained separately for the prediction of SOH and RUL. Compared with other data-driven methods, the experiments carried out on the NASA dataset demonstrate that our method achieves a lower average root mean square error, 0.0216, and conjunct error, 0.0831, for SOH and RUL, respectively.
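To make the first and third modifications concrete, here is a minimal scalar sketch of one LSTM step with a coupled input/forget gate and a peephole into the output gate. It illustrates the generic coupled-gate and peephole ideas only and is not the AST-LSTM cell: the second modification (the element-wise product of new inputs and historical cell states) is left out because the abstract does not specify it, and the function name, the parameter dictionary `p`, and its keys are assumptions of this example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def coupled_lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell with coupled gates and an output peephole.

    x, h_prev, c_prev -- current input, previous hidden state, previous cell state
    p                 -- dict of scalar weights (hypothetical names, not from the paper)
    """
    # Forget gate decides how much of the old cell state to keep.
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])
    # Coupling: the input gate is fixed to 1 - f, so forgetting old information
    # and admitting new data are decided at the same time.
    i = 1.0 - f
    # Candidate cell content computed from the new input.
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])
    # The new cell state is a convex combination of the old state and the candidate.
    c = f * c_prev + i * g
    # Peephole: the output gate also sees the updated cell state (the CEC).
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["po"] * c + p["bo"])
    h = o * math.tanh(c)
    return h, c


# With all weights at zero, every gate sits at 0.5 and the candidate is 0.
zero_params = {k: 0.0 for k in ("wf", "uf", "bf", "wg", "ug", "bg", "wo", "uo", "po", "bo")}
h1, c1 = coupled_lstm_step(x=0.3, h_prev=0.0, c_prev=1.0, p=zero_params)
```

Because `i = 1 - f`, the cell needs one fewer gate's worth of parameters than a standard LSTM, and the new cell state always lies between the old state and the candidate.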
• More generic framework applicable to multiple batteries.
• Upgraded LSTM NN with tracking cell states actively.
• Enhanced LSTM cell with novel structure.
• Model with many-to-one structure for SOH estimation.
• Accurate and robust multi-step RUL prediction.
Recent advances in mass spectrometry (MS)-based proteomics have enabled tremendous progress in the understanding of cellular mechanisms, disease progression, and the relationship between genotype and phenotype. Though many popular bioinformatics methods in proteomics are derived from other omics studies, novel analysis strategies are required to deal with the unique characteristics of proteomics data. In this review, we discuss the current developments in the bioinformatics methods used in proteomics and how they facilitate the mechanistic understanding of biological processes. We first introduce bioinformatics software and tools designed for mass spectrometry-based protein identification and quantification, and then we review the different statistical and machine learning methods that have been developed to perform comprehensive analysis in proteomics studies. We conclude with a discussion of how quantitative protein data can be used to reconstruct protein interactions and signaling networks.