Abstractive summarization is distinguished by its use of novel phrases that do not appear in the source text. However, most previous research ignores this feature in favour of enhancing syntactical similarity with the reference. To improve novelty, we use multiple warm-started models with varying encoder and decoder checkpoints and vocabulary. These models are then adapted to the paraphrasing task and to the sampling decoding strategy to further boost novelty and quality. In addition, to avoid relying only on syntactical similarity, two additional abstractive summarization metrics are introduced: 1) NovScore: a new novelty metric that delivers a summary novelty score; and 2) NSSF: a new comprehensive metric that ensembles Novelty, Syntactic, Semantic, and Faithfulness features into a single score to simulate human assessment and provide a reliable evaluation. Finally, we compare our models to the state-of-the-art sequence-to-sequence models using the current and the proposed metrics. The results show that warm-starting, sampling, and paraphrasing improve novelty by 2%, 5%, and 14%, respectively, while maintaining comparable scores on other metrics.
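The abstract does not define how NovScore is computed, so the following is only a minimal sketch of a common novelty proxy: the fraction of summary n-grams that never occur in the source. The function names, the whitespace tokenization, and the default n=2 are assumptions made for illustration, not the paper's metric.

```python
def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def novel_ngram_ratio(source, summary, n=2):
    """Fraction of summary n-grams absent from the source (hypothetical proxy)."""
    source_ngrams = set(ngrams(source.lower().split(), n))
    summary_ngrams = ngrams(summary.lower().split(), n)
    if not summary_ngrams:
        return 0.0  # summary shorter than n tokens: nothing to score
    novel = sum(gram not in source_ngrams for gram in summary_ngrams)
    return novel / len(summary_ngrams)


source = "the cat sat on the mat"
summary = "the cat rested on the mat"
print(novel_ngram_ratio(source, summary, n=1))
print(novel_ngram_ratio(source, summary, n=2))
```

A purely extractive summary scores 0 under this proxy, while a fully reworded one approaches 1, which is why such a score needs to be read alongside faithfulness and similarity metrics.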
Summarization is the process of selecting important information from a source text. Summarizing strategies are the core cognitive processes in summarization activity. Since summarization can be an important tool for improving comprehension, it has attracted the interest of teachers, who teach summary writing through direct instruction. To do this, they need to review and assess the students' summaries, and these tasks are very time-consuming. Thus, computer-assisted assessment can help teachers conduct this task more effectively.
This paper proposes an algorithm that combines semantic relations between words with their syntactic composition to identify the summarizing strategies employed by students in summary writing. An innovative aspect of our algorithm lies in its ability to identify summarizing strategies at both the syntactic and semantic levels. The efficiency of the algorithm is measured in terms of Precision, Recall and F-measure. We then implemented the algorithm in an automated summarization assessment system that can be used to identify the summarizing strategies used by students in summary writing.
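As a brief illustration of the evaluation measures named above, the sketch below computes Precision, Recall and F-measure by comparing the set of strategies an algorithm identified against a human-annotated set. The strategy labels are hypothetical placeholders; the abstract does not list the actual strategies.

```python
def precision_recall_f(predicted, gold):
    """Set-based precision, recall and balanced F-measure."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical labels, used only to exercise the function.
predicted = {"deletion", "paraphrase", "copy"}
gold = {"deletion", "paraphrase", "generalization", "topic_sentence"}
print(precision_recall_f(predicted, gold))
```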
Centralized deepfake service providers have large amounts of computing power and training data, giving them the ability to produce high-quality deepfake content. However, once these service providers are attacked or malfunction, the entire deepfake ecosystem may collapse, making deepfake a potential threat to data security. This monopolistic development has led to an uneven distribution of deepfake resources, which in turn creates the risk of single points of failure. To deal with the problem, this paper proposes a decentralized deepfake task management algorithm (DD-TMA) based on blockchain and edge computing. The blockchain in this algorithm provides a decentralized storage and management platform to ensure that the data and models of deepfake tasks are not tampered with or lost. Edge computing distributes tasks to edge devices close to the data source for processing, reducing data transmission delays and bandwidth consumption, and improving the efficiency and security of deepfake tasks. The paper integrates blockchain, federated computing, and edge computing. Firstly, the algorithm establishes a decentralized computing platform based on blockchain. Subsequently, it enhances computing power during the execution of decentralized deepfake tasks through the integration of federated computing and edge computing. Finally, the algorithm increases the number of active performers of decentralized deepfake tasks through gamification, thereby improving task execution efficiency. Experiments conducted in this study on public data sets demonstrate that the algorithm is efficient, robust, and reusable. Compared with other algorithms, the efficiency of DD-TMA is improved by more than 20% and the stability is improved by more than 13%. This algorithm proves effective in solving the problems encountered in the execution of centralized deepfake tasks.
The research provides new ideas for future evaluations of decentralized deepfake effects based on different strategies.
Aspect extraction represents a core task of aspect-based sentiment analysis. This study presents a supervised algorithm for explicit aspect extraction from formal and informal texts. To build the new algorithm, 126 aspect extraction rules are combined to cover both formal and informal texts, because customer reviews are a mix of the two. These 126 rules include certain dependency-based rules and pattern-based rules from previous studies, in addition to newly developed rules intended to overcome prior rules’ weaknesses. In addition, many aspect extraction rules have remained unexplored by previous studies. However, many of these 126 rules are irrelevant and should be removed. Thus, a proper selection of the included rules is required. Therefore, in this study we also improved the Whale Optimization Algorithm (WOA) to address the rule selection problem, yielding an algorithm called improved WOA (IWOA). Two major improvements were incorporated into IWOA. The first improvement is the development of a new update equation based on Cauchy mutation to improve WOA population diversity. The second improvement is the development of a new local search algorithm (LSA) to address WOA's tendency to fall into local optima. The IWOA algorithm is applied to the full set of rules to select the best rule subset and remove low-quality rules. Finally, a new pruning algorithm (PA) has been developed to remove incorrect aspects and retain correct aspects. The results on seven benchmark datasets demonstrate that IWOA+PA outperforms all other state-of-the-art baseline works and most recent works.
•Development of new aspects extraction rules for sentiment analysis at aspect level.
•The combination of different aspects extraction rules types.
•An Improved Whale Optimization Algorithm (IWOA) for rules selection.
•The development of aspect pruning algorithm.
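The abstract does not give IWOA's update equation, so the sketch below shows only the generic idea of a Cauchy mutation: each coordinate of a candidate solution is perturbed by a heavy-tailed Cauchy step, which occasionally produces large jumps and so helps population diversity. The function name, the scale of 0.1, and the clipping to bounds are illustrative assumptions.

```python
import math
import random


def cauchy_mutate(position, lower, upper, scale=0.1, rng=random):
    """Perturb each coordinate with a Cauchy-distributed step, clipped to bounds."""
    mutated = []
    for x in position:
        # Inverse-CDF sampling of a standard Cauchy variate from a uniform draw.
        step = scale * math.tan(math.pi * (rng.random() - 0.5))
        mutated.append(min(upper, max(lower, x + step)))
    return mutated


rng = random.Random(42)
print(cauchy_mutate([0.2, 0.5, 0.8], lower=0.0, upper=1.0, rng=rng))
```

Compared with a Gaussian step of similar scale, the Cauchy distribution's heavy tails make long-range moves far more likely, which is the usual motivation for using it to escape crowded regions of the search space.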
15.
Dynamic Salp swarm algorithm for feature selection Tubishat, Mohammad; Ja'afar, Salinah; Alswaitti, Mohammed ...
Expert systems with applications,
February 2021, Volume: 164
Journal Article
Peer reviewed
•A dynamic Salp swarm algorithm is proposed for feature selection.
•The development of novel update equation to improve solutions diversity.
•The development of new Local search algorithm to improve algorithm exploitation.
•The algorithm was tested on 23 datasets and it outperformed other algorithms.
Recently, many optimization algorithms have been applied to Feature selection (FS) problems and have clearly outperformed traditional FS methods. This has motivated our study to apply the new Salp swarm algorithm (SSA) to the FS problem. However, SSA, like other optimization algorithms, suffers from limited population diversity and tends to fall into local optima. To solve these problems, this study presents an enhanced version of SSA known as the Dynamic Salp swarm algorithm (DSSA). Two main improvements were made to SSA to solve its problems. The first improvement is the development of a new equation for salps’ position update. The use of this new equation is controlled by Singer's chaotic map. The purpose of the first improvement is to enhance the diversity of SSA solutions. The second improvement is the development of a new local search algorithm (LSA) to improve SSA exploitation. The proposed DSSA was combined with the K-nearest neighbor (KNN) classifier in a wrapper mode. Twenty benchmark datasets from the UCI repository and 3 Hadith datasets were used to test and evaluate the effectiveness of the proposed DSSA algorithm. The DSSA results were compared with the original SSA and four well-known optimization algorithms: Particle Swarm Optimization (PSO), Genetic Algorithm (GA), Ant Lion Optimizer (ALO), and Grasshopper Optimization Algorithm (GOA). DSSA outperformed the original SSA and the other well-known optimization algorithms over the 23 datasets in terms of classification accuracy, fitness function values, the number of selected features, and convergence speed. DSSA accuracy results were also compared with the most recent variants of the SSA algorithm. DSSA showed a significant improvement over the competing algorithms in statistical analysis.
These results confirm the capability of the proposed DSSA to simultaneously improve the classification accuracy while selecting the minimal number of the most informative features.
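To make the wrapper mode concrete, here is a minimal sketch of how a candidate feature subset could be scored with a KNN classifier. It is not the paper's implementation: the leave-one-out evaluation, the weight alpha=0.99, and the exact form of the fitness (weighted error rate plus the ratio of selected features) are assumptions commonly seen in wrapper feature selection.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def wrapper_fitness(mask, X, y, k=3, alpha=0.99):
    """Lower is better: weighted leave-one-out KNN error plus feature ratio."""
    selected = [j for j, keep in enumerate(mask) if keep]
    if not selected:
        return 1.0  # an empty subset cannot classify anything
    reduced = [[row[j] for j in selected] for row in X]
    errors = 0
    for i in range(len(reduced)):
        # Hold out sample i and classify it using all remaining samples.
        train_X = reduced[:i] + reduced[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_X, train_y, reduced[i], k) != y[i]:
            errors += 1
    error_rate = errors / len(X)
    return alpha * error_rate + (1 - alpha) * len(selected) / len(mask)


# Toy data: feature 0 separates the classes, feature 1 is misleading noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 1.1], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]
print(wrapper_fitness([1, 0], X, y, k=1))
print(wrapper_fitness([0, 1], X, y, k=1))
```

An optimizer such as SSA would search over the binary masks, so the fitness rewards subsets that classify well while staying small.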
The performance of most metaheuristic algorithms depends on parameters whose settings play a key role in determining the quality of the solution and the efficiency of the search. A trend that has emerged recently is to make the algorithm parameters automatically adapt to different problems during optimization, thereby liberating the user from the tedious and time-consuming task of manual setting. These fine-tuning techniques continue to be the object of ongoing research. Differential evolution (DE) is a simple yet powerful population-based metaheuristic. It has demonstrated good convergence, and its principles are easy to understand. DE is very sensitive to its parameter settings and mutation strategy; thus, this study aims to investigate these settings with the diverse versions of adaptive DE algorithms. This study has two main objectives: (1) to present an extension for the original taxonomy of evolutionary algorithms (EAs) parameter settings that has been overlooked by prior research and therefore minimize any confusion that might arise from the former taxonomy and (2) to investigate the various algorithmic design schemes that have been used in the different variants of adaptive DE and convey them in a new classification style. In other words, this study describes in depth the structural analysis and working principle that underlie the promising and recent work in this field, to analyze their advantages and disadvantages and to gain future insights that can further improve these algorithms. Finally, the interpretation of the literature and the comparative analysis of the algorithmic schemes offer several guidelines for designing and implementing adaptive DE algorithms. The proposed design framework provides readers with the main steps required to integrate any proposed meta-algorithm into parameter and/or strategy adaptation schemes.
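For orientation, the sketch below shows one generation of the classic DE/rand/1/bin scheme with a fixed scale factor F and crossover rate CR. These are the parameters, together with the mutation strategy, that adaptive DE variants adjust during the run. The function name and the default values f=0.5 and cr=0.9 are illustrative assumptions; no adaptive scheme from the survey is reproduced here.

```python
import random


def de_generation(population, objective, f=0.5, cr=0.9, rng=random):
    """One DE/rand/1/bin generation for minimisation; returns the new population."""
    dim = len(population[0])
    next_population = []
    for i, target in enumerate(population):
        # Pick three distinct individuals, all different from the target.
        others = [p for j, p in enumerate(population) if j != i]
        a, b, c = rng.sample(others, 3)
        j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
        trial = [
            a[d] + f * (b[d] - c[d]) if (rng.random() < cr or d == j_rand) else target[d]
            for d in range(dim)
        ]
        # Greedy selection: keep the trial only if it is no worse than the target.
        next_population.append(trial if objective(trial) <= objective(target) else target)
    return next_population


def sphere(x):
    """Simple test objective with its minimum at the origin."""
    return sum(v * v for v in x)


rng = random.Random(1)
population = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(8)]
for _ in range(50):
    population = de_generation(population, sphere, rng=rng)
print(min(sphere(p) for p in population))
```

Because selection is greedy, no individual ever gets worse from one generation to the next; how fast the population improves depends heavily on F and CR, which is the sensitivity that motivates adaptive schemes.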
•ATS field overview: We provide a full review of the ATS field, providing:
•A taxonomy with different categories.
•A brief history of models’ evolution.
•Evaluation measurements review.
•Datasets comparisons.
•ATS and MT research fields comparison and relationship.
•Models comprehensive review: We collect abundant resources on the main topics of this study and provide a comprehensive review of SOTA research work: Starting from Deep neural sequence-to-sequence models, then RL approaches, and finally TL architectures, including PTLMs.
•Challenges: We analyze previous and current challenges that faced and are facing researchers in the focused fields and the proposed solutions.
•Comparisons: We provide different kinds of comparisons of the investigated models from different perspectives: theoretically, practically, and models’ evaluation results. Then the best models are highlighted.
•Future trends: We suggest and discuss possible future research trends.
Automatic Text Summarization (ATS) is an important area in Natural Language Processing (NLP) with the goal of shortening a long text into a more compact version by conveying the most important points in a readable form. ATS applications continue to evolve and utilize effective approaches that are being evaluated and implemented by researchers. State-of-the-Art (SotA) technologies that demonstrate cutting-edge performance and accuracy in abstractive ATS are deep neural sequence-to-sequence models, Reinforcement Learning (RL) approaches, and Transfer Learning (TL) approaches, including Pre-Trained Language Models (PTLMs). The graph-based Transformer architecture and PTLMs have influenced tremendous advances in NLP applications. Additionally, the incorporation of recent mechanisms, such as the knowledge-enhanced mechanism, significantly enhanced the results. This study provides a comprehensive review of recent research advances in the area of abstractive text summarization for works spanning the past six years. Past and present problems are described, as well as their proposed solutions. In addition, abstractive ATS datasets and evaluation measurements are also highlighted. The paper concludes by comparing the best models and discussing future research directions.
To help individuals or companies make systematic and more accurate decisions, sentiment analysis (SA) is used to evaluate the polarity of reviews. In SA, feature selection is an important phase for machine learning classifiers, specifically when the datasets used in training are huge. The Whale Optimization Algorithm (WOA) is a recent metaheuristic optimization algorithm that mimics the whale hunting mechanism. However, WOA suffers from the same problem faced by many other optimization algorithms and tends to fall into local optima. To overcome this problem, two improvements to the WOA algorithm are proposed in this paper. The first improvement uses Elite Opposition-Based Learning (EOBL) at the initialization phase of WOA. The second improvement incorporates evolutionary operators from the Differential Evolution algorithm (mutation, crossover, and selection) at the end of each WOA iteration. In addition, we used Information Gain (IG) as a filter feature selection technique with WOA, using a Support Vector Machine (SVM) classifier, to reduce the search space explored by WOA. To verify our proposed approach, four Arabic benchmark datasets for sentiment analysis are used, since only a few sentiment analysis studies have been conducted for the Arabic language as compared to English. The proposed algorithm is compared with six well-known optimization algorithms and two deep learning algorithms. The comprehensive experimental results show that the proposed algorithm outperforms all other algorithms in terms of sentiment analysis classification accuracy through finding the best solutions, while it also minimizes the number of selected features.
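The Information Gain filter mentioned above can be sketched as follows. This is the textbook definition, IG = H(labels) minus the conditional entropy of the labels given the feature, computed for a discrete feature; the function names and the toy term-presence data are assumptions for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


labels = ["pos", "pos", "neg", "neg"]
print(information_gain([1, 1, 0, 0], labels))  # term present only in positive reviews
print(information_gain([1, 0, 1, 0], labels))  # term unrelated to polarity
```

Ranking features by this score and keeping only the top ones shrinks the space that a wrapper optimizer then has to search.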
•To review free-text clinical text classification approaches from six aspects.
•In selected studies, mostly content-based and concept-based features were used.
•The datasets used in selected studies were categorized into four distinct types.
•Selected studies used either supervised machine learning or rule-based approaches.
•Ten open research challenges are presented in clinical text classification domain.
The pervasive use of electronic health databases has increased the accessibility of free-text clinical reports for supplementary use. Several text classification approaches, such as supervised machine learning (SML) or rule-based approaches, have been utilized to obtain beneficial information from free-text clinical reports. In recent years, many researchers have worked in the clinical text classification field and published their results in academic journals. However, to the best of our knowledge, no comprehensive systematic literature review (SLR) has recapitulated the existing primary studies on clinical text classification in the last five years. Thus, the current study aims to present SLR of academic articles on clinical text classification published from January 2013 to January 2018. Accordingly, we intend to maximize the procedural decision analysis in six aspects, namely, types of clinical reports, data sets and their characteristics, pre-processing and sampling techniques, feature engineering, machine learning algorithms, and performance metrics. To achieve our objective, 72 primary studies from 8 bibliographic databases were systematically selected and rigorously reviewed from the perspective of the six aspects. This review identified nine types of clinical reports, four types of data sets (i.e., homogeneous–homogenous, homogenous–heterogeneous, heterogeneous–homogenous, and heterogeneous–heterogeneous), two sampling techniques (i.e., over-sampling and under-sampling), and nine pre-processing techniques. Moreover, this review determined bag of words, bag of phrases, and bag of concepts features when represented by either term frequency or term frequency with inverse document frequency, thereby showing improved classification results. SML-based or rule-based approaches were generally employed to classify the clinical reports. 
To measure the performance of these classification approaches, we used precision, recall, F-measure, accuracy, AUC, and specificity in binary class problems. In multi-class problems, we primarily used micro or macro-averaging precision, recall, or F-measure. Lastly, open research issues and challenges are presented for future scholars who are interested in clinical text classification. This SLR will definitely be a beneficial resource for researchers engaged in clinical text classification.
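The difference between micro- and macro-averaging in multi-class problems can be illustrated with a short sketch. Macro-averaging computes the F-measure per class and then takes the unweighted mean; micro-averaging pools the true-positive, false-positive and false-negative counts across classes first. The function names and the toy labels are assumptions made for this example.

```python
def per_class_counts(y_true, y_pred):
    """Return {label: (tp, fp, fn)} for every label seen in either list."""
    counts = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        counts[label] = (tp, fp, fn)
    return counts


def f1(tp, fp, fn):
    """F-measure from raw counts; defined as 0 when there is nothing to count."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def macro_f1(y_true, y_pred):
    counts = per_class_counts(y_true, y_pred)
    return sum(f1(*c) for c in counts.values()) / len(counts)


def micro_f1(y_true, y_pred):
    counts = per_class_counts(y_true, y_pred).values()
    return f1(sum(c[0] for c in counts), sum(c[1] for c in counts), sum(c[2] for c in counts))


# Imbalanced toy case: the rare class "b" is never predicted.
y_true = ["a", "a", "a", "a", "b"]
y_pred = ["a", "a", "a", "a", "a"]
print(macro_f1(y_true, y_pred), micro_f1(y_true, y_pred))
```

On imbalanced data the two averages can diverge sharply, since the macro score penalizes a missed rare class as heavily as a missed frequent one.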
Sentiment analysis is a branch of text classification, defined as the process of extracting sentiment terms (i.e. feature/aspect, or opinion) and determining their opinion semantic orientation. At the aspect level, aspect extraction is the core task of sentiment analysis, and aspects can be either implicit or explicit. The growth of sentiment analysis has resulted in the emergence of various techniques for both explicit and implicit aspect extraction. However, the majority of research attempts have targeted explicit aspect extraction, which indicates a lack of research on implicit aspect extraction. This research provides a review of implicit aspect/feature extraction techniques from different perspectives. The first perspective is a comparative analysis of the techniques available for implicit term extraction, with a brief summary of each technique. The second perspective is classifying and comparing the performance, datasets, language used, and shortcomings of the available techniques. In this study, over 50 articles have been reviewed; however, only 45 articles on implicit aspect extraction that span from 2005 to 2016 were analyzed and discussed. The majority of researchers on implicit aspect extraction rely heavily on unsupervised methods, which account for about 64% of the 45 articles, followed by supervised methods at about 27%, and lastly semi-supervised methods at 9%. In addition, 25 articles conducted the research work solely on product reviews, and 5 articles used product reviews jointly with other types of data, which makes product review datasets the most frequently used data type. Furthermore, research on implicit aspect extraction has focused on the English and Chinese languages compared to other languages. Finally, this review also provides recommendations for future research directions and open problems.