Part of Speech (PoS) information has been broadly leveraged in previous image captioning methods to guide the decoder module in controlling whether visual information is required for generating the target words. However, existing methods primarily focus on enhancing the generation of visual words (VWs) while neglecting that of non-visual words (NVWs). In response, we introduce a novel PoS clues-aware adaptive attention mechanism (NPoSC-A3) that leverages PoS clues to adaptively incorporate visual and semantic attention contexts into the language model, so that visual and semantic information are leveraged in generating VWs and NVWs, respectively. NPoSC-A3 comprises four key modules:
global semantic context generator (GSCG), PoS context generator (PoSCG), PoS predictor (PoSP), and PoS clues-aware adaptive attention mechanism (PoSC-A3). GSCG generates a global semantic context that our model leverages for generating NVWs. PoSP predicts the PoS information of the word to be generated at the current time step. PoSC-A3 adaptively incorporates visual and global semantic features into the decoder module based on the PoS guidance. PoSCG constrains the effect of the visual and global semantic contexts on the captioning process to generate more syntactic captions. Extensive experiments on the MSCOCO standard dataset demonstrate that our method improves the effectiveness of image captioning and outperforms recent state-of-the-art image captioning works on standard evaluation metrics, attaining a CIDEr score of 127.2.
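The PoS-guided mixing of visual and semantic contexts described above can be sketched as a simple gating step. In this toy illustration, a scalar gate derived from the predicted PoS decides how much the decoder draws on the visual context versus the global semantic context at each time step; the gate values and vectors are hand-set for illustration, whereas in the model they are learned.

```python
# A toy sketch of PoS-guided adaptive attention: a gate near 1.0 favors
# the visual context (for visual words such as nouns), a gate near 0.0
# favors the global semantic context (for non-visual words such as "the").
# All values here are illustrative, not the paper's learned parameters.

def blend_contexts(visual_ctx, semantic_ctx, pos_gate):
    """Mix two context vectors elementwise with a scalar gate."""
    return [pos_gate * v + (1.0 - pos_gate) * s
            for v, s in zip(visual_ctx, semantic_ctx)]

visual = [1.0, 0.0]
semantic = [0.0, 1.0]
print(blend_contexts(visual, semantic, 0.9))  # mostly visual, e.g. for a noun
print(blend_contexts(visual, semantic, 0.1))  # mostly semantic, e.g. for "the"
```

In the actual model the gate would be produced by the PoS predictor rather than supplied by hand; this sketch only shows how such a gate trades off the two context sources.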
Natural language processing (NLP) tools have sparked a great deal of interest due to rapid improvements in information and communications technologies, and as a result, many different NLP tools are being produced. However, there are many challenges in developing efficient and effective NLP tools that accurately process natural languages. One such tool is part-of-speech (POS) tagging, which tags a particular sentence or the words in a paragraph by looking at the context of the sentence or words inside the paragraph. Despite enormous efforts by researchers, POS tagging still faces challenges in improving accuracy while reducing false-positive rates, and in tagging unknown words. Furthermore, the ambiguity that arises when tagging terms with different contextual meanings inside a sentence cannot be overlooked. Recently, deep learning (DL)- and machine learning (ML)-based POS taggers have been implemented as potential solutions to efficiently identify words in a given sentence across a paragraph. This article first clarifies the concept of POS tagging. It then provides a broad categorization based on the prominent ML and DL techniques employed in designing and implementing POS taggers. A comprehensive review of the latest POS tagging articles is provided, discussing the weaknesses and strengths of the proposed approaches. Recent trends and advancements in DL- and ML-based POS taggers are then presented in terms of the approaches deployed and their performance evaluation metrics. Based on the limitations of the proposed approaches, we highlight various research gaps and present recommendations for future research to advance DL- and ML-based POS tagging.
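The ambiguity problem the survey highlights can be made concrete with a minimal dictionary-based tagger: the same word carries different tags depending on context, so a tagger needs at least some contextual rule to choose among candidates. The lexicon, tag names, and the single disambiguation rule below are toy assumptions for illustration only.

```python
# A minimal dictionary-based POS tagger sketch illustrating tag ambiguity:
# "can" may be a modal, a noun, or a verb, and context decides which.

LEXICON = {
    "the": ["DET"],
    "can": ["MD", "NN", "VB"],   # ambiguous: modal, noun ("a can"), verb
    "fish": ["NN", "VB"],        # ambiguous: noun or verb
    "swims": ["VBZ"],
}

def tag(tokens):
    """Greedy left-to-right tagging with one context rule:
    after a determiner, prefer a noun reading for ambiguous words."""
    tags = []
    for i, tok in enumerate(tokens):
        candidates = LEXICON.get(tok.lower(), ["NN"])  # unknown words default to NN
        if len(candidates) > 1 and i > 0 and tags[-1] == "DET" and "NN" in candidates:
            tags.append("NN")
        else:
            tags.append(candidates[0])
    return tags

print(tag(["the", "can"]))  # 'can' resolved to NN after a determiner
```

Real ML/DL taggers replace this single hand-written rule with scores learned from annotated corpora, but the underlying decision, choosing one tag per token given its neighbors, is the same.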
As one of the fundamental tasks in text analysis, phrase mining aims at extracting quality phrases from a text corpus and has various downstream applications, including information extraction/retrieval, taxonomy construction, and topic modeling. Most existing methods rely on complex, trained linguistic analyzers, and thus are likely to perform unsatisfactorily on text corpora of new domains and genres without extra, expensive adaptation. None of the state-of-the-art models, even the data-driven ones, is fully automated, because they require human experts to design rules or label phrases. In this paper, we propose a novel framework for automated phrase mining, AutoPhrase, which supports any language as long as a general knowledge base (e.g., Wikipedia) in that language is available, while benefiting from, but not requiring, a POS tagger. Compared to the state-of-the-art methods, AutoPhrase shows significant improvements in both effectiveness and efficiency on five real-world datasets across different domains and languages. In addition, AutoPhrase can be extended to model single-word quality phrases.
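The "no human labeling" idea can be sketched with a toy distant-supervision step: phrase candidates enumerated from the corpus are scored, with a boost for candidates confirmed by a knowledge base such as Wikipedia titles. The corpus, knowledge base, and scoring rule below are toy assumptions, not AutoPhrase itself.

```python
# A toy sketch of knowledge-base-supervised phrase scoring: enumerate
# contiguous n-grams as candidates, then score by frequency with a boost
# for candidates that appear in a (here, tiny and hypothetical) KB.

from collections import Counter

def candidate_phrases(tokens, max_len=3):
    """Enumerate contiguous n-grams (n = 2..max_len) as phrase candidates."""
    grams = Counter()
    for n in range(2, max_len + 1):
        for i in range(len(tokens) - n + 1):
            grams[" ".join(tokens[i:i + n])] += 1
    return grams

def score(phrase, freq, knowledge_base):
    """Toy quality score: frequency, doubled if the KB confirms the phrase."""
    return freq * (2.0 if phrase in knowledge_base else 1.0)

kb = {"support vector machine"}
toks = "we train a support vector machine on text".split()
cands = candidate_phrases(toks)
best = max(cands, key=lambda p: score(p, cands[p], kb))
print(best)  # "support vector machine" wins via the KB boost
```

The real framework adds corpus-level statistical features and phrasal segmentation on top of this distant-supervision signal; the sketch only shows how a knowledge base can stand in for human labelers.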
Named Entity Recognition (NER) plays a pivotal role in knowledge extraction and in improving the intelligence of edge computing. The effectiveness of span-based NER models predominantly depends on the representation of spans. Existing methods primarily utilize semantic features to represent spans, often neglecting other vital information. This paper proposes a method that incorporates part-of-speech (POS) information into span representations to overcome this limitation. Central to this methodology is a span POS encoder designed to extract the POS feature of spans. To migrate the method to edge devices, this paper introduces a fast span POS encoder, which significantly reduces the time complexity of POS feature extraction. Building upon this innovation, a span-based NER model named IPSI (Incorporating Part of Speech Information in span representation) is developed, exhibiting outstanding performance on nested and flat datasets. Comparing the original and fast span POS encoders reveals that while the fast encoder slightly compromises performance, it markedly accelerates the training and inference processes. Finally, through a series of experiments and sample analyses, this article explores the intrinsic mechanism through which the span POS feature influences entity recognition and further illustrates the importance of the POS feature.
•A span POS encoder is proposed to extract the POS feature of a span.
•An efficient variation of the span POS encoder is developed to cater to the limited computational capacity of edge devices.
•A NER model is developed, which incorporates the POS feature to enhance span representation, outperforming competitor models.
•The significance of the POS feature for NER is substantiated through a series of experiments.
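One simple way to picture a span POS feature is as a fixed-length vector of POS-tag counts over the span's tokens. The tag set, helper name, and count-vector design below are illustrative assumptions, not the paper's actual encoder.

```python
# A minimal sketch of a span POS feature: represent a candidate span
# [start, end) by how many tokens of each POS class it contains.

POS_TAGS = ["NOUN", "VERB", "ADJ", "DET", "OTHER"]
TAG_INDEX = {t: i for i, t in enumerate(POS_TAGS)}

def span_pos_feature(pos_sequence, start, end):
    """Count the POS tags of tokens in span [start, end) as a dense vector."""
    vec = [0] * len(POS_TAGS)
    for tag in pos_sequence[start:end]:
        vec[TAG_INDEX.get(tag, TAG_INDEX["OTHER"])] += 1
    return vec

# "New York City" as a 3-token all-noun span inside a 5-token sentence:
pos_seq = ["DET", "NOUN", "NOUN", "NOUN", "VERB"]
print(span_pos_feature(pos_seq, 1, 4))  # [3, 0, 0, 0, 0]
```

Computed naively, every candidate span costs time proportional to its length; precomputing per-tag prefix sums would make each span query constant-time, which is the kind of complexity reduction a "fast" encoder for edge devices targets.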
Integrating linguistic features has been widely utilized in statistical machine translation (SMT) systems, resulting in improved translation quality. However, for low-resource languages such as Thai and Myanmar, the integration of linguistic features into neural machine translation (NMT) systems has yet to be implemented. In this study, we propose transformer-based NMT models (transformer, multi-source transformer, and shared-multi-source transformer models) using linguistic features for two-way translation of Thai-to-Myanmar, Myanmar-to-English, and Thai-to-English. Linguistic features such as part-of-speech (POS) tags or universal part-of-speech (UPOS) tags are added to each word on the source side, the target side, or both, and the proposed models are trained accordingly. The multi-source transformer and shared-multi-source transformer models take two inputs (i.e., string data and string data with POS tags) and produce string data or string data with POS tags. A transformer model that utilizes only word vectors was used as the first baseline for comparison with the proposed models. An Edit-Based Transformer with Repositioning (EDITOR) model was used as the second baseline. The experimental findings show that adding linguistic features to transformer-based models enhances neural machine translation performance for low-resource language pairs. Moreover, the best translation results were obtained with the shared-multi-source transformer models with linguistic features, which yielded higher Bilingual Evaluation Understudy (BLEU) scores and character n-gram F-scores (chrF) than the baseline transformer and EDITOR models.
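The preprocessing step of attaching a POS or UPOS tag to each word can be sketched as below. The "word|TAG" factored format is one common convention for feeding linguistic features to an NMT model, assumed here for illustration rather than taken from the paper.

```python
# A sketch of POS-augmented NMT input: pair each source token with its
# POS tag before feeding the sequence to the transformer.

def add_pos_factors(tokens, pos_tags, sep="|"):
    """Combine parallel token and POS sequences into factored tokens."""
    if len(tokens) != len(pos_tags):
        raise ValueError("tokens and POS tags must align one-to-one")
    return [f"{w}{sep}{t}" for w, t in zip(tokens, pos_tags)]

src = ["the", "cat", "sleeps"]
tags = ["DET", "NOUN", "VERB"]
print(add_pos_factors(src, tags))  # ['the|DET', 'cat|NOUN', 'sleeps|VERB']
```

A multi-source model would instead keep the plain token sequence and the tagged sequence as two separate inputs, which is essentially the same alignment requirement without the string concatenation.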
Linguistic features; Part-of-speech; Universal part-of-speech; Neural machine translation; Transformer architecture
The goal of aspect-based sentiment classification (ASC) is to predict the emotion expressed toward a specific target in a sentence. Among neural network-based methods for ASC, sophisticated models such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs) are widespread. Recently, ongoing research has integrated syntactic structures into graph neural networks (GNNs) to deal with ASC tasks. However, these methods are limited by noise in syntactic dependency trees and by inefficient use of the information they contain. This paper proposes a novel GNN-based deep learning model to overcome the deficiencies of prior studies. In the proposed model, to exploit the information in syntactic dependency trees, a novel part-of-speech (POS) guided syntactic dependency graph is constructed for a relational graph attention network (RGAT) to reduce the noise. Further, a syntactic distance attention-guided layer is designed for a densely connected graph convolutional network (DCGCN), which can fully extract semantic dependencies between contextual words. Experiments on three public datasets are carried out to evaluate the effectiveness of the proposed model. Compared to the baselines, our model achieves state-of-the-art performance.
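One simple reading of a POS-guided dependency graph is to prune the dependency tree so that only edges between content-word POS classes survive before the graph reaches the attention network. The tag set and the pruning rule below are illustrative assumptions, not the paper's exact construction.

```python
# A minimal sketch of POS-guided edge pruning on a dependency tree:
# keep only edges whose endpoints both carry "content" POS tags,
# dropping edges to function words as likely noise.

CONTENT_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}

def pos_guided_edges(edges, pos_tags):
    """Filter dependency edges (head_idx, dep_idx) by the POS of both ends."""
    return [
        (h, d) for h, d in edges
        if pos_tags[h] in CONTENT_TAGS and pos_tags[d] in CONTENT_TAGS
    ]

pos = ["DET", "NOUN", "VERB", "ADJ"]  # "the food tastes great"
edges = [(2, 1), (1, 0), (2, 3)]      # tastes->food, food->the, tastes->great
print(pos_guided_edges(edges, pos))   # the food->the edge is pruned
```

In the actual model the POS guidance would feed relation-typed edges into the RGAT rather than simply deleting them, but the sketch shows how POS classes can decide which syntactic links matter.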
•POS taggers are developed for MSA and GLF variants of the Arabic language using CRF and BiLSTM.
•The gold standard annotated datasets that have been constructed for POS tagging are made accessible to the research community.
•An exploratory analysis of the behavior of using hashtags in Arabic tweets is presented, and this can be leveraged in future studies.
•The POS tagger for Arabic tweets using the BiLSTM achieves the best performance.
•Experiments show that there is no need for a dialect-specific POS tagger.
Over the past few years, Twitter has experienced massive growth and the volume of its online content has increased rapidly. This content has been a rich source for several studies focused on natural language processing (NLP) research. However, Twitter data pose numerous challenges and obstacles to NLP tasks. For the English language, Twitter has an NLP tool that provides tweet-specific NLP tasks, which presents significant opportunities for English NLP research and applications. Part-of-speech (POS) tagging for English tweets is one of the tasks offered and facilitated by such a tool. In contrast, only a few attempts have been made to develop POS taggers for Arabic content on Twitter. In this paper, we consider POS tagging, one of the NLP tasks that directly affects the performance of subsequent text processing tasks. We introduce three manually annotated datasets for the POS tagging of Arabic tweets: the ‘Mixed,’ ‘MSA,’ and ‘GLF’ datasets with 3000, 1000, and 1000 Arabic tweets, respectively. In addition, we present an exploratory analysis of the behavior of using hashtags in Arabic tweets, a phenomenon that affects the task of POS tagging. We also present two supervised POS taggers developed using two approaches: Conditional Random Fields (CRF) and Bidirectional Long Short-Term Memory (Bi-LSTM) models. We conclude that the Bi-LSTM-based POS tagger achieves state-of-the-art results on the ‘Mixed’ dataset with 96.5% accuracy, while the dialect-specific taggers trained on the ‘MSA’ and ‘GLF’ datasets achieve accuracies of 95.6% and 95%, respectively. The results on the ‘Mixed’ dataset indicate the effectiveness of developing a joint POS tagger without the need for a dialect-specific one.
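What CRF and Bi-LSTM(-CRF) taggers share is that they score whole tag sequences rather than tagging each token independently; the best sequence is then recovered with Viterbi decoding. The toy decoder below sketches that joint decoding step with made-up additive emission and transition scores; the real models learn these scores from annotated datasets such as the ones introduced above.

```python
# A toy Viterbi decoder over additive scores: emit[token][tag] scores a
# tag for a token, trans[prev][tag] scores a tag transition. All numbers
# here are hand-picked for illustration only.

def viterbi(tokens, tags, emit, trans):
    """Return the highest-scoring tag sequence under additive scores."""
    # best[t] = (score, path) for the best sequence ending in tag t
    best = {t: (emit[tokens[0]][t], [t]) for t in tags}
    for tok in tokens[1:]:
        new_best = {}
        for t in tags:
            score, path = max(
                (best[p][0] + trans[p][t] + emit[tok][t], best[p][1])
                for p in tags
            )
            new_best[t] = (score, path + [t])
        best = new_best
    return max(best.values())[1]

tags = ["DET", "NOUN"]
emit = {"the": {"DET": 2.0, "NOUN": 0.0}, "tweet": {"DET": 0.0, "NOUN": 2.0}}
trans = {"DET": {"DET": -1.0, "NOUN": 1.0}, "NOUN": {"DET": 0.0, "NOUN": 0.0}}
print(viterbi(["the", "tweet"], tags, emit, trans))  # ['DET', 'NOUN']
```

A CRF learns the emission and transition weights directly; a Bi-LSTM tagger produces the emission scores from contextual embeddings, often with the same decoding layer on top.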
Sometimes Internet users struggle to find what they are looking for online due to information overload. Search engines intend to identify documents related to a given keyphrase and provide suggestions. Having some background knowledge about a topic or domain helps in building effective search keyphrases that lead to accurate information retrieval results. This problem is further pronounced among students who rely on the Internet to learn about a new topic: they may not have the background knowledge required to build effective keyphrases and find what they are looking for. In this research, we address this problem and aim to help students find relevant information online. This research furthers the existing literature by enhancing information retrieval frameworks through keyphrase assignment, aiming to expose students to new terminologies and thereby reducing their dependence on background knowledge about the domain under study. We evaluated this framework and identified how it can be enhanced to suggest more effective search keyphrases. Our proposed suggestion is to introduce a keyphrase ranking mechanism that improves the keyphrase assignment part of the framework by taking into consideration the part-of-speech of the generated keyphrases. To evaluate the proposed approach, various data sets were downloaded and processed. The results obtained show that our proposed approach produces more effective keyphrases than the existing framework.
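One way to picture POS-aware keyphrase ranking is to score each candidate by how well its POS pattern matches noun-phrase shapes (adjectives followed by nouns), so that phrases like "renewable energy" outrank candidates ending in a preposition. The scoring weights and tag set below are illustrative assumptions, not the paper's actual mechanism.

```python
# A toy POS-pattern scorer for keyphrase candidates: reward noun endings
# and adjective/noun bodies, penalize function words.

def pos_score(pos_pattern):
    """Score a keyphrase's POS pattern; higher is more noun-phrase-like."""
    if not pos_pattern:
        return 0.0
    score = 1.0 if pos_pattern[-1] == "NOUN" else -1.0
    score += sum(0.5 if t in ("ADJ", "NOUN") else -0.5 for t in pos_pattern[:-1])
    return score

def rank_keyphrases(candidates):
    """candidates: list of (phrase, pos_pattern) pairs; best-scoring first."""
    return sorted(candidates, key=lambda c: pos_score(c[1]), reverse=True)

cands = [
    ("renewable energy", ["ADJ", "NOUN"]),
    ("energy of", ["NOUN", "ADP"]),
    ("generate power", ["VERB", "NOUN"]),
]
print([p for p, _ in rank_keyphrases(cands)])
```

Under these toy weights, "renewable energy" ranks first and "energy of" last, which matches the intuition that noun-phrase-shaped keyphrases make better search queries.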
This article introduces a new corpus of eye movements in silent reading: the Russian Sentence Corpus (RSC). Russian uses the Cyrillic script, which has not yet been investigated in cross-linguistic eye movement research. As in every language studied so far, we confirmed the expected effects of low-level parameters, such as word length, frequency, and predictability, on the eye movements of skilled Russian readers. These findings allow us to add Slavic languages using the Cyrillic script (exemplified by Russian) to the growing number of languages with different orthographies, ranging from Roman-based European languages to logographic Asian ones, whose basic eye movement benchmarks conform to the universal comparative science of reading (Share, 2008). We additionally report basic descriptive corpus statistics and three exploratory investigations of the effects of Russian morphology on basic eye movement measures, which illustrate the kinds of questions that researchers can answer using the RSC. The annotated corpus is freely available from its project page at the Open Science Framework: https://osf.io/x5q2r/.
Relation classification plays an important role in the field of natural language processing (NLP). Previous research on relation classification has verified the effectiveness of using convolutional neural networks (CNNs) and recurrent neural networks (RNNs). In this paper, we propose a model that combines an RNN and a CNN (RCNN) to exploit their respective advantages: the RNN can learn temporal and context features, especially long-term dependencies between two entities, while the CNN is capable of capturing additional potential features. We evaluate our model on the SemEval-2010 Task 8 dataset, and the results show that our method is superior to most existing methods.