• An improved Salp Swarm Algorithm is proposed for feature selection.
• Opposition-based learning was used to improve its population diversity.
• A new local search algorithm was developed to avoid the local optima problem.
• The algorithm shows superior performance in comparison with other algorithms.
Many fields, such as data science and data mining, suffer from the rapid growth of data volume and high data dimensionality. The main problems faced by these fields include high computational cost, high memory cost, and low accuracy. These problems occur because these fields mainly use machine learning classifiers, whose accuracy is degraded by noisy and irrelevant features and whose computational and memory costs depend mainly on the size of the datasets used. Thus, to solve these problems, feature selection can be used to select an optimal subset of features and reduce the data dimensionality. Feature selection represents an important preprocessing step in many intelligent and expert systems such as intrusion detection, disease prediction, and sentiment analysis. An improved version of the Salp Swarm Algorithm (ISSA) is proposed in this study to solve feature selection problems and select the optimal subset of features in wrapper mode. Two main improvements were made to the original SSA to alleviate its drawbacks and adapt it for feature selection problems. The first improvement is the use of Opposition Based Learning (OBL) at the initialization phase of SSA to improve its population diversity in the search space. The second improvement is the development and use of a new Local Search Algorithm with SSA to improve its exploitation. To confirm and validate the performance of the proposed ISSA, it was applied to 18 datasets from the UCI repository. In addition, ISSA was compared with four well-known optimization algorithms: Genetic Algorithm, Particle Swarm Optimization, Grasshopper Optimization Algorithm, and Ant Lion Optimizer. In these experiments, four different assessment criteria were used.
The results demonstrate that ISSA outperforms all baseline algorithms in terms of fitness values, accuracy, convergence curves, and feature reduction on most of the datasets used. The results obtained over different types of datasets confirm that the wrapper feature selection mode can be used in different application areas of expert and intelligent systems.
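The opposition-based initialization described in this abstract can be illustrated with a minimal sketch. It assumes the common OBL formulation, in which the opposite of a value x in [lb, ub] is lb + ub - x, and it keeps the fittest half of the combined population. The names `obl_initialize`, `to_feature_mask`, and `toy_fitness` are hypothetical and are not taken from the paper; in a wrapper setting the toy fitness would be replaced by a classifier-based score.

```python
import random


def obl_initialize(fitness, n_agents, dim, lb=0.0, ub=1.0, seed=None):
    """Return the n_agents fittest positions drawn from a random
    population and its opposite population (lower fitness is better)."""
    rng = random.Random(seed)
    population = [[rng.uniform(lb, ub) for _ in range(dim)]
                  for _ in range(n_agents)]
    # Opposite point of x within [lb, ub] is lb + ub - x.
    opposites = [[lb + ub - x for x in agent] for agent in population]
    # Rank all 2 * n_agents candidates and keep the best n_agents.
    candidates = sorted(population + opposites, key=fitness)
    return candidates[:n_agents]


def to_feature_mask(position, threshold=0.5):
    """Map a continuous position to a binary feature mask
    (assumption: a feature is selected when its value exceeds 0.5)."""
    return [x > threshold for x in position]


def toy_fitness(position):
    """Hypothetical stand-in for a wrapper fitness such as
    classification error plus a feature-count penalty."""
    return sum(position)


# Usage: initialize 5 agents over 4 candidate features.
initial = obl_initialize(toy_fitness, n_agents=5, dim=4, seed=0)
masks = [to_feature_mask(agent) for agent in initial]
```

Evaluating each random agent together with its opposite doubles the number of fitness evaluations at start-up, which is the price paid for the broader initial coverage of the search space.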
Robust findings of citations have a positive impact on researchers and significantly contribute to academic development. As a paper is cited more frequently or used as a reference in other articles, its citation count increases. Papers with higher citations tend to be more influential than those less cited. Research on predicting citation counts has evolved over the years in various fields. However, despite its recent growth, research on identifying commonly used features and techniques still lacks a comprehensive literature analysis. The present study addresses this gap and identifies frequently used features, existing techniques, and their evaluation process for predicting an article’s citations. This study reviewed 150 articles from 2010 to 2023 and selected 107 based on established exclusion and inclusion criteria. It provides an overview of publication features and the standard techniques used for their identification to facilitate improvements in this field. The findings indicate that previous works frequently used (i) selected features, such as paper features and citation features, and (ii) machine learning techniques to predict article citations. These findings can provide beneficial information for researchers aiming to enhance their papers and maximize their impact.
This study provides a systematic review of technology-assisted language learning. It summarizes the reviewed articles in terms of technology usage, language, learning skills, and the benefits offered by technology in language learning. The study focused on articles published between 2012 and 2022. Out of 5719 articles initially retrieved from five academic databases and reviewed, twenty-seven (27) research articles were selected. Based on the review findings, the most used technology is the intelligent system (n=7). The study also revealed that the most common target language is English (n=22), whereas skills such as vocabulary, writing, and grammar gained the most attention in the selected studies. The review also identified and analyzed the empirical evidence on the benefits of technology in language learning, such as language performance development, motivation, metacognitive skills, positive attitudes towards learning, enhancement of students' learning retention, collaborative learning model, and extensive learning opportunity. Barriers to the implementation of the technology, such as learning anxiety, insufficient technology literacy, and technical limitations, were also recognized, and some suggestions were provided to overcome those barriers. Thus, this review can be used as a guide for educators and researchers who intend to design technology-assisted language learning and teaching in the future.
Background: Growing demand for student-centered learning (SCL) has been observed in higher education settings including dentistry. However, application of SCL in dental education is limited. Hence, this study aimed to facilitate SCL application in dentistry utilising a decision tree machine learning (ML) technique to map dental students’ preferred learning styles (LS) with suitable instructional strategies (IS) as a promising approach to develop an IS recommender tool for dental students. Methods: A total of 255 dental students in Universiti Malaya completed the modified Index of Learning Styles (m-ILS) questionnaire containing 44 items which classified them into their respective LS. The collected data, referred to as the dataset, was used in decision tree supervised learning to automate the mapping of students' learning styles with the most suitable IS. The accuracy of the ML-empowered IS recommender tool was then evaluated. Results: The application of a decision tree model in the automation process of the mapping between LS (input) and IS (target output) was able to instantly generate the list of suitable instructional strategies for each dental student. The IS recommender tool demonstrated perfect precision and recall for overall model accuracy, suggesting good sensitivity and specificity in mapping LS with IS. Conclusion: The decision tree ML-empowered IS recommender tool was proven to be accurate at matching dental students’ learning styles with the relevant instructional strategies. This tool provides a workable path to planning student-centered lessons or modules that potentially will enhance the learning experience of the students.
Aspect-based sentiment analysis (ABSA) is currently among the most vigorous areas in natural language processing (NLP). Individuals and private and government institutions are increasingly using media sources for decision making. In the last decade, aspect extraction has been the most essential phase of sentiment analysis (SA) to conduct an abridged sentiment classification. However, previous studies on sentiment analysis mostly focused on explicit aspect extraction, with limited work on implicit aspects. To the best of our knowledge, this is the first systematic review that covers implicit, explicit, and the combination of both implicit and explicit aspect extractions. Therefore, this systematic review has been conducted to: 1) identify techniques used for extracting implicit, explicit, or both implicit and explicit aspects; 2) analyze the various evaluation metrics, data domains, and languages involved in implicit and explicit aspect extraction in sentiment analysis from 2008 to 2019; 3) identify the key challenges associated with the techniques based on the result of a comprehensive comparative analysis; and finally, 4) highlight the feasible opportunities for future research directions. This review can be used to assist novice and prominent researchers to understand the concept of both implicit and explicit aspect extractions in the aspect-based sentiment analysis domain.
Feature selection (FS) represents an important task in classification. Hadith classification is one example to which FS can be applied. Hadiths are the second major source of Islam after the Quran. Thousands of Hadiths are available in Islam, and these Hadiths are grouped into a number of classes. In the literature, many studies have been conducted on Hadith classification. The Sine Cosine Algorithm (SCA) is a new metaheuristic optimization algorithm. SCA is mainly based on exploring the search space using sine and cosine mathematical formulas to find the optimal solution. However, SCA, like other Optimization Algorithms (OAs), suffers from the problems of local optima and solution diversity. In this paper, to overcome these problems and use SCA for the FS problem, two major improvements were introduced to the standard SCA. The first improvement is the use of the Singer chaotic map within SCA to improve solution diversity. The second improvement is the use of the Simulated Annealing (SA) algorithm as a local search operator within SCA to improve its exploitation. In addition, the Gini Index (GI) is used to filter the resulting selected features to reduce the number of features to be explored by SCA. Furthermore, three new Hadith datasets were created. To evaluate the proposed Improved SCA (ISCA), the three new Hadith datasets were used in our experiments. Furthermore, to confirm the generality of ISCA, we also applied it to 14 benchmark datasets from the UCI repository. The ISCA results were compared with the original SCA and state-of-the-art algorithms such as Particle Swarm Optimization (PSO), Genetic Algorithm (GA), Grasshopper Optimization Algorithm (GOA), and the most recent optimization algorithm, Harris Hawks Optimizer (HHO). The obtained results confirm that ISCA clearly outperforms the other optimization algorithms and the Hadith classification baseline works.
The obtained results indicate that ISCA can improve classification accuracy while selecting the most informative features.
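The sine and cosine search mentioned in this abstract can be sketched as follows. This is a minimal illustration of the standard SCA position update as it is commonly stated, not the improved ISCA: the Singer chaotic map, the Simulated Annealing local search, and the Gini Index filter are omitted. The function name `sca_step`, the bounds, and the default `a=2.0` are assumptions made for the example.

```python
import math
import random


def sca_step(positions, best, t, max_iter, a=2.0, lb=0.0, ub=1.0, rng=random):
    """Apply one standard SCA update to every agent.

    positions: list of agents, each a list of floats
    best:      best position found so far (the destination point)
    t:         current iteration, from 0 to max_iter
    """
    # r1 shrinks linearly from a to 0, moving from exploration to exploitation.
    r1 = a * (1.0 - t / max_iter)
    new_positions = []
    for agent in positions:
        updated = []
        for x, p in zip(agent, best):
            r2 = rng.uniform(0.0, 2.0 * math.pi)  # direction of the move
            r3 = rng.uniform(0.0, 2.0)            # random weight on the destination
            r4 = rng.random()                     # chooses the sine or cosine branch
            wave = math.sin(r2) if r4 < 0.5 else math.cos(r2)
            x_new = x + r1 * wave * abs(r3 * p - x)
            # Keep the agent inside the search bounds.
            updated.append(min(max(x_new, lb), ub))
        new_positions.append(updated)
    return new_positions


# Usage: move two agents toward or around the current best position.
agents = [[0.2, 0.8], [0.5, 0.5]]
moved = sca_step(agents, best=[0.1, 0.9], t=0, max_iter=10,
                 rng=random.Random(1))
```

Because the step size r1 decays to zero, late iterations make only small moves around the best position, which is one reason a separate local search operator is often paired with this kind of update.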
Aspect-based sentiment analysis (ABSA) is described as one of the most vibrant research areas over the last decade. Due to the exponential increase in aspect-based sentiment research, there is massive interest in advanced explicit aspect extraction (EAE) techniques. This interest has produced a huge amount of literature in the EAE domain. This study aims to investigate and identify the existing approaches, techniques, types of research, quantity of publications, publication trends, and demographics shaping the EAE research domain in the last decade (2009-2019). Accordingly, an evidence-based systematic methodology was adopted to effectively capture all the relevant studies. The main findings revealed that: 1) there has been a considerable and continuous rise in EAE research activities around different parts of the globe in the last five years, particularly in Asia, the Middle East, and European countries; 2) EAE research has been very limited among African countries, which needs to be addressed due to its role in business intelligence as well as semantic values; 3) three research facets were highlighted based on this study, i.e. solution research, validation research, and evaluation research, of which solution research gets the highest attention; and finally 4) the EAE challenges, as well as feasible future recommendations, were highlighted in this study.
Automatic hate speech identification in Arabic tweets has generated substantial attention among academics in the fields of text mining and natural language processing (NLP). The quantity of studies done on this subject has experienced significant growth. This study aims to provide an overview of this field by conducting a systematic review of literature that focuses on automatic hate speech identification, particularly in the Arabic language. The goal is to examine the research trends in Arabic hate speech identification and offer guidance to researchers by highlighting the most significant studies published between 2018 and 2023. This systematic study addresses five specific research questions concerning the types of the Arabic language used, hate speech categories, classification techniques, feature engineering techniques, performance metrics, validation methods, existing challenges faced by researchers, and potential future research directions. Through a comprehensive search across nine academic databases, 24 studies that met the predefined inclusion criteria and quality assessment were identified. The review findings revealed the existence of many Arabic linguistic varieties used in hate speech on Twitter, with Modern Standard Arabic (MSA) being the most prominent. Among identification techniques, machine learning techniques are the most used for Arabic hate speech identification. The results also show the different feature engineering techniques used and indicate that N-gram and CBOW are the most used. F1-score, precision, recall, and accuracy were also identified as the most used performance metrics. The review also shows that the most used validation method is the train/test split method. Therefore, the findings of this study can serve as valuable guidance for researchers in enhancing the efficacy of their models in future investigations.
In addition, algorithm development, policy rule regulation, community management, and legal and ethical considerations are other real-world applications that can benefit from this research.
Sentiment analysis (SA) is a study where people's opinions and emotions are automatically extracted in the form of sentiments from natural language text. In social media monitoring, it is very useful because it allows users to gain an overall picture of the extensive public opinion behind many topics. Most works on SA are for English text. Only a few works focus on the Malay language. Currently, reviews on SA for the Malay language only focus on the SA approaches and the datasets. Some major issues, such as the pre-processing techniques used to normalize the noisy text, the most employed performance measures for Malay SA, and the challenges for Malay SA, have not been reviewed. Malaysians tend not to fully follow any abbreviation rules when writing on social media. Thus, a lot of noisy text can be found on social media sites like Facebook and Twitter, which creates some issues for the SA process. Hence, the aim of this study is to investigate the state of the art, challenges, and future works of SA for Malay social media text. This study provides a review of the various approaches, datasets, performance measures, and pre-processing techniques used in previous works on SA of Malay text. More than 700 articles from journals and conference proceedings were identified using the search keywords; however, only 17 relevant articles published from 2013 to 2018 were reviewed. The findings from this review focus on three commonly used SA approaches, which are lexicon-based, machine learning, and hybrid.