Aspect-based sentiment analysis is a fine-grained sentiment analysis task that aims to detect the sentiment polarity towards a given aspect. Recently, graph neural models over the dependency tree have been widely applied to aspect-based sentiment analysis. Most existing works, however, focus on learning the dependency information from contextual words to aspect words based on the dependency tree of the sentence, and fail to exploit contextual affective knowledge with regard to the specific aspect. In this paper, we propose a graph convolutional network based on SenticNet, called Sentic GCN, to leverage the affective dependencies of the sentence according to the specific aspect. Specifically, we explore a novel way of constructing graph neural networks by integrating the affective knowledge from SenticNet to enhance the dependency graphs of sentences. On this basis, the new affective enhanced graph model considers both the dependencies between contextual words and aspect words and the affective information between opinion words and the aspect. Experimental results on multiple public benchmark datasets show that our proposed model outperforms state-of-the-art methods.
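As a rough illustration of how affective knowledge could enhance a dependency graph, the sketch below re-weights dependency edges using word-level affective scores and an aspect bonus. The toy lexicon `TOY_SENTIC`, the function name `affective_adjacency`, and the weighting rule (1 + |s_i + s_j| + aspect bonus) are assumptions made for illustration only; they are not taken from the abstract and may differ from the actual Sentic GCN formulation.

```python
# Toy affective scores in [-1, 1]; a hypothetical stand-in for SenticNet lookups.
TOY_SENTIC = {"great": 0.8, "terrible": -0.9}


def affective_adjacency(tokens, dep_edges, aspect_idx, lexicon=TOY_SENTIC):
    """Build an n x n adjacency matrix with affect-weighted dependency edges.

    tokens     : list of words in the sentence
    dep_edges  : iterable of (head, dependent) index pairs from a dependency parse
    aspect_idx : set of token indices that belong to the aspect term
    Assumed rule: weight = 1 + |s_i + s_j| + (1 if either word is an aspect word),
    applied to dependency edges (made symmetric) and to self-loops.
    """
    n = len(tokens)
    score = [lexicon.get(tok.lower(), 0.0) for tok in tokens]

    # Collect undirected dependency edges plus self-loops.
    linked = {(i, i) for i in range(n)}
    for head, dep in dep_edges:
        linked.add((head, dep))
        linked.add((dep, head))

    adj = [[0.0] * n for _ in range(n)]
    for i, j in linked:
        aspect_bonus = 1.0 if (i in aspect_idx or j in aspect_idx) else 0.0
        adj[i][j] = 1.0 + abs(score[i] + score[j]) + aspect_bonus
    return adj


if __name__ == "__main__":
    sentence = ["The", "food", "is", "great"]
    edges = [(1, 0), (3, 1), (3, 2)]  # hand-written toy parse
    for row in affective_adjacency(sentence, edges, aspect_idx={1}):
        print(row)
```

In this toy sentence, the edge between the opinion word "great" and the aspect word "food" receives the largest off-diagonal weight, which is the kind of emphasis an affect-enhanced graph is meant to provide before graph convolution is applied.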
Playing crucial roles in various cellular processes, such as recognition of specific nucleotide sequences, regulation of transcription, and regulation of gene expression, DNA-binding proteins are essential ingredients of both eukaryotic and prokaryotic proteomes. With the avalanche of protein sequences generated in the postgenomic age, it is a critical challenge to develop automated methods for accurately and rapidly identifying DNA-binding proteins based on their sequence information alone. Here, a novel predictor, called "iDNA-Prot|dis", was established by incorporating the amino acid distance-pair coupling information and the amino acid reduced alphabet profile into the general pseudo amino acid composition (PseAAC) vector. The former captures the characteristics of DNA-binding proteins and thereby enhances prediction quality, while the latter reduces the dimension of the PseAAC vector and thereby speeds up prediction. Rigorous jackknife and independent dataset tests showed that the new predictor outperformed the existing predictors for the same purpose. As a user-friendly web-server, iDNA-Prot|dis is accessible to the public at http://bioinformatics.hitsz.edu.cn/iDNA-Prot_dis/. Moreover, for the convenience of the vast majority of experimental scientists, a step-by-step protocol guide is provided on how to use the web-server to get their desired results without the need to follow the complicated mathematical equations, which are presented in this paper only for the completeness of the development process. It is anticipated that the iDNA-Prot|dis predictor may become a useful high-throughput tool for large-scale analysis of DNA-binding proteins, or at the very least, play a complementary role to the existing predictors in this regard.
Protein secondary structure is the three-dimensional form of local segments of proteins, and its prediction is an important problem in protein tertiary structure prediction. Developing computational approaches for protein secondary structure prediction is becoming increasingly urgent.
We present a novel deep learning based model, referred to as CNNH_PSS, which uses a multi-scale CNN with highways. In CNNH_PSS, any two neighboring convolutional layers are connected by a highway that delivers information from the current layer to the output of the next one in order to preserve local contexts. As lower layers extract local contexts while higher layers extract long-range interdependencies, the highways between neighboring layers give CNNH_PSS the ability to extract both. We evaluate CNNH_PSS on two commonly used datasets: CB6133 and CB513. CNNH_PSS outperforms the multi-scale CNN without highways by at least 0.010 Q8 accuracy and also performs better than CNF, DeepCNF and SSpro8, which cannot extract long-range interdependencies, by at least 0.020 Q8 accuracy, demonstrating that both local contexts and long-range interdependencies are indeed useful for prediction. Furthermore, CNNH_PSS also performs better than GSM and DCRNN, which need an extra complex model to extract long-range interdependencies. This demonstrates that CNNH_PSS not only requires fewer computing resources but also achieves better prediction performance.
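The idea of a highway between neighboring convolutional layers can be sketched as a gated mix of the layer input and the convolved output. The code below is a minimal pure-Python sketch on a scalar sequence; the function names, the ReLU activation, the sigmoid gate computed by a second convolution, and the odd kernel length are all assumptions for illustration and are not details given in the abstract.

```python
import math


def conv1d_same(seq, kernel):
    """1-D cross-correlation with zero padding; assumes an odd kernel length
    so that the output has the same length as the input."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(seq) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(seq))
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def highway_conv_layer(seq, kernel, gate_kernel, gate_bias=0.0):
    """One convolutional layer with a highway connection.

    output = t * ReLU(conv(x)) + (1 - t) * x, where t = sigmoid(gate_conv(x) + bias).
    A gate near 0 carries the lower-layer (local) signal through unchanged;
    a gate near 1 passes only the newly convolved features.
    """
    transformed = [max(0.0, v) for v in conv1d_same(seq, kernel)]
    gate = [sigmoid(v + gate_bias) for v in conv1d_same(seq, gate_kernel)]
    return [t * h + (1.0 - t) * x for t, h, x in zip(gate, transformed, seq)]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    print(highway_conv_layer(x, kernel=[0.0, 2.0, 0.0], gate_kernel=[0.0, 0.0, 0.0]))
```

Stacking several such layers lets later layers see a wider context while the gate keeps a path open for the local information extracted earlier.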
CNNH_PSS is able to extract both local contexts and long-range interdependencies by combining a multi-scale CNN with a highway network. The evaluations on common datasets and comparisons with state-of-the-art methods indicate that CNNH_PSS is a useful and efficient tool for protein secondary structure prediction.
Prediction of DNA-binding residues is important for understanding the protein-DNA recognition mechanism. Many computational methods have been proposed for this prediction, but most of them do not consider the relationships of evolutionary information between residues.
In this paper, we first propose a novel residue encoding method, referred to as the Position Specific Score Matrix (PSSM) Relation Transformation (PSSM-RT), to encode residues by utilizing the relationships of evolutionary information between residues. PDNA-62 and PDNA-224 are used to evaluate PSSM-RT and two existing PSSM encoding methods by five-fold cross-validation. Performance evaluations indicate that PSSM-RT is more effective than previous methods. This validates the point that the relationship of evolutionary information between residues is indeed useful in DNA-binding residue prediction. An ensemble learning classifier (EL_PSSM-RT) is also proposed, which combines an ensemble learning model with PSSM-RT to better handle the imbalance between binding and non-binding residues in the datasets. EL_PSSM-RT is evaluated by five-fold cross-validation using PDNA-62 and PDNA-224 as well as two independent datasets, TS-72 and TS-61. Performance comparisons with existing predictors on the four datasets demonstrate that EL_PSSM-RT is the best-performing method among all the prediction methods, with improvements of 0.02-0.07 in MCC, 4.18-21.47% in ST and 0.013-0.131 in AUC. Furthermore, we analyze the importance of the pair-relationships extracted by PSSM-RT, and the results validate the usefulness of PSSM-RT for encoding DNA-binding residues.
We propose a novel method for predicting DNA-binding residues that incorporates the relationship of evolutionary information between residues and ensemble learning. Performance evaluation shows that the relationship of evolutionary information between residues is indeed useful in DNA-binding residue prediction and that ensemble learning can be used to address the data imbalance between binding and non-binding residues. A web service of EL_PSSM-RT ( http://hlt.hitsz.edu.cn:8080/PSSM-RT_SVM/ ) is provided for free access to the biological research community.
Neural attention mechanisms have achieved many successes in various natural language processing tasks. However, existing neural attention models based on a densely connected network are only loosely related to the attention mechanism found in psychology and neuroscience. Motivated by the finding in neuroscience that humans possess a template-searching attention mechanism, we propose to use the convolution operation to simulate attention and give a mathematical explanation of our neural attention model. We then introduce a new network architecture, which combines a recurrent neural network with our convolution-based attention model and further stacks an attention-based neural model to build a hierarchical sentiment classification model. The experimental results show that our proposed models can capture salient parts of the text to improve the performance of sentiment classification at both the sentence level and the document level.
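One way to picture convolution as template-searching attention is to slide a small template over the sequence of word vectors, score how well each window matches it, and normalize the scores into attention weights. The sketch below follows that reading; the function name `conv_attention`, the softmax normalization, the zero padding, and the odd template width are assumptions for illustration rather than the model defined in the paper.

```python
import math


def conv_attention(word_vectors, template):
    """Convolution-style attention over a sequence of word vectors.

    word_vectors : list of n vectors (lists of floats), one per word
    template     : list of `width` vectors of the same dimension (odd width assumed)
    Returns (weights, context): softmax-normalized match scores per position,
    and the attention-weighted sum of the word vectors.
    """
    n, width = len(word_vectors), len(template)
    dim = len(word_vectors[0])
    half = width // 2
    zero = [0.0] * dim
    padded = [zero] * half + list(word_vectors) + [zero] * half

    # Match score at position i: dot product between the window and the template.
    scores = []
    for i in range(n):
        window = padded[i:i + width]
        scores.append(
            sum(w * t for vec, tvec in zip(window, template) for w, t in zip(vec, tvec))
        )

    # Numerically stable softmax over positions.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    context = [
        sum(wt * vec[d] for wt, vec in zip(weights, word_vectors))
        for d in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    words = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
    weights, context = conv_attention(words, template=[[5.0, 0.0]])
    print(weights, context)
```

In this toy input the template responds to the first vector dimension, so the middle word receives most of the attention weight and dominates the context vector.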
Transcription factor binding sites (TFBSs) play an important role in gene expression regulation. Many computational methods for TFBS prediction need sufficient labeled data. However, many transcription factors (TFs) lack labeled data in some cell types. We propose a novel method, referred to as DANN_TF, for TFBS prediction. DANN_TF consists of a feature extractor, a label predictor, and a domain classifier. The feature extractor and the domain classifier constitute an adversarial network, which ensures that the learned features are common across different cell types. DANN_TF is evaluated on five TFs in five cell types, for a total of 25 cell-type TF pairs, and compared to a baseline method that does not use the adversarial network. For both data augmentation and cross-cell-type prediction, DANN_TF performs better than the baseline method on most cell-type TF pairs. DANN_TF is further evaluated on an additional 13 TFs in the five cell types, for a total of 65 cell-type TF pairs. Results show that DANN_TF achieves significantly higher AUC than the baseline method on 96.9% of the 65 cell-type TF pairs. This is a strong indication that DANN_TF can indeed learn common features for cross-cell-type TFBS prediction.
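Domain-adversarial training of this kind is commonly implemented with gradient reversal: the domain classifier is trained to tell cell types apart, while the feature extractor receives the negated domain gradient so that its features become hard to tell apart. The abstract does not state how DANN_TF implements the adversarial objective, so the sketch below is an assumption based on the standard domain-adversarial formulation; the function names and the trade-off weight `lam` are hypothetical.

```python
def feature_extractor_gradient(grad_label_loss, grad_domain_loss, lam=1.0):
    """Gradient applied to the feature-extractor weights under gradient reversal.

    grad_label_loss  : d(label loss)/d(weights), followed as usual
    grad_domain_loss : d(domain loss)/d(weights), sign-flipped and scaled by lam,
                       so the extractor moves to INCREASE the domain loss
    """
    return [gl - lam * gd for gl, gd in zip(grad_label_loss, grad_domain_loss)]


def sgd_step(weights, gradient, lr=0.1):
    """Plain gradient-descent update."""
    return [w - lr * g for w, g in zip(weights, gradient)]


if __name__ == "__main__":
    weights = [1.0, 1.0]
    grad = feature_extractor_gradient([0.5, -0.2], [0.1, 0.3], lam=2.0)
    print(sgd_step(weights, grad))
```

The label predictor and domain classifier themselves would be updated with their ordinary gradients; only the path from the domain loss back into the shared features is reversed.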
Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis task that aims to classify the polarities expressed towards aspects. The vector representations of current models are generally constrained to real values. Based on mathematical formulations of quantum theory, quantum language models have drawn increasing attention. Words in such models can be projected as physical particles in quantum systems, and naturally represented by representation-rich complex-valued vectors in a Hilbert space, rather than real-valued ones. In this paper, the Hilbert space representation for ABSA models is investigated, and complex-valued counterparts of three strong real-valued baselines are constructed. Experimental results demonstrate the effectiveness of complexification and the superior performance of our complex-valued models, illustrating that the complex-valued embedding can carry additional information beyond the real embedding. In particular, a complex-valued RoBERTa model outperforms or approaches the previous state-of-the-art on three standard benchmarking datasets.
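To make the contrast between real-valued and complex-valued embeddings concrete, the sketch below represents each embedding component as an amplitude with a phase and compares two vectors with a Hermitian inner product. The amplitude-phase parameterization and the function names are assumptions chosen for illustration; the abstract does not specify how the baselines are complexified.

```python
import cmath
import math


def complexify(amplitudes, phases):
    """Turn a real-valued vector into a complex one: z_k = r_k * exp(i * theta_k)."""
    return [cmath.rect(r, theta) for r, theta in zip(amplitudes, phases)]


def hermitian_inner(u, v):
    """Inner product on a complex Hilbert space: sum of conj(u_k) * v_k."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


if __name__ == "__main__":
    u = complexify([1.0, 2.0], [0.0, 0.0])
    # Same amplitudes, different phases: the real part of the vector alone
    # would not distinguish these two representations.
    v_in_phase = complexify([3.0, 4.0], [0.0, 0.0])
    v_shifted = complexify([3.0, 4.0], [math.pi / 2, math.pi / 2])
    print(hermitian_inner(u, v_in_phase))
    print(hermitian_inner(u, v_shifted))
```

With all phases set to zero the inner product reduces to the ordinary real dot product, while non-zero phases change the result even though the amplitudes are identical, which is one way a complex embedding can carry information beyond its real counterpart.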
Uric acid is an important diagnostic marker of catabolism of the purine nucleosides, and accurate measurements of serum uric acid are necessary for proper diagnosis of gout or the appearance of renal disease. A candidate reference method involving isotope dilution coupled with liquid chromatography/mass spectrometry (LC/MS) has been described. An isotopically labeled internal standard, 1,3-¹⁵N₂ uric acid, was added to serum, followed by equilibration and protein-removal cleanup to prepare samples for liquid chromatography/mass spectrometry electrospray ionization (LC/MS-ESI) analyses. (M−H)⁻ ions at m/z 167 and 169 for uric acid and its labeled internal standard were monitored for LC/MS. The accuracy of the measurement was evaluated by comparing the results of this candidate reference method on lyophilized human serum reference materials for uric acid (Standard Reference Materials SRM909b) with the certified values determined by gas chromatography/mass spectrometry reference methods, and by a recovery study for the added uric acid. The method performed well against the established reference method of ion-exchange followed by derivatization and isotope dilution (ID) gas chromatography/mass spectrometry (ID-GC/MS). The results of this method for uric acid agreed with the certified values to within 0.10%. The amounts of uric acid recovered and added were in good agreement for the three concentrations. This method was applied to determine uric acid in samples of frozen serum pools. Excellent precision was obtained, with within-set CVs of 0.08–0.18% and between-set CVs of 0.02–0.07% for LC/MS analyses. Liquid chromatography/tandem mass spectrometry electrospray ionization (LC/MS/MS-ESI) analysis was also performed. The LC/MS and LC/MS/MS results were in very good agreement (within 0.14%). This LC/MS method, which demonstrates good accuracy and precision and offers fast analysis without the need for a derivatization stage, qualifies as a candidate reference method. It can be used as an alternative reference method to provide an accuracy base against which routine methods can be compared.
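The two quantities behind these figures can be sketched briefly: an isotope-dilution estimate from the ratio of the signals at m/z 167 and 169, and the coefficient of variation (CV) used to report precision. The single-point ratio calculation with a response factor of 1 is a simplifying assumption (reference methods typically calibrate the ratio against standards), and the function names and example numbers are hypothetical.

```python
import statistics


def isotope_dilution_amount(area_analyte, area_labeled, labeled_amount,
                            response_factor=1.0):
    """Estimate the analyte amount from the analyte/internal-standard signal ratio.

    area_analyte    : peak area at the analyte ion (m/z 167 for uric acid)
    area_labeled    : peak area at the labeled-standard ion (m/z 169)
    labeled_amount  : known amount of labeled internal standard added
    response_factor : assumed calibration factor relating signal ratio to amount ratio
    """
    return (area_analyte / area_labeled) * labeled_amount / response_factor


def cv_percent(values):
    """Coefficient of variation in percent: 100 * sample standard deviation / mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


if __name__ == "__main__":
    # Hypothetical peak areas and spike amount, for illustration only.
    print(isotope_dilution_amount(1200.0, 1000.0, labeled_amount=5.0))
    # Hypothetical replicate measurements.
    print(cv_percent([9.0, 10.0, 11.0]))
```

Because the labeled standard is added before sample cleanup, losses during preparation affect both ions equally, which is why the ratio, rather than the absolute signal, carries the quantitative information.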
Personalized voice triggering is a key technology in voice assistants and serves as the first step for users to activate the voice assistant. Personalized voice triggering involves keyword spotting (KWS) and speaker verification (SV). Conventional approaches to this task develop the KWS and SV systems separately. This paper proposes a single system called the multi-task deep cross-attention network (MTCANet) that simultaneously performs KWS and SV, while effectively utilizing information relevant to both tasks. The proposed framework integrates a KWS sub-network and an SV sub-network to enhance performance in challenging conditions such as noisy environments, short-duration speech, and model generalization. At the core of MTCANet are three modules: a novel deep cross-attention (DCA) module to integrate KWS and SV tasks, a multi-layer stacked shared encoder (SE) to reduce the impact of noise on the recognition rate, and soft attention (SA) modules to allow the model to focus on pertinent information in the middle layer while preventing gradient vanishing. Our proposed model demonstrates outstanding performance in the well-off test set, improving by 0.2%, 0.023, and 2.28% over the well-known SV model emphasized channel attention, propagation, and aggregation in time delay neural network (ECAPA-TDNN) and the advanced KWS model Convmixer in terms of equal error rate (EER), minimum detection cost function (minDCF), and accuracy (Acc), respectively.