Sarcasm detection in the Indonesian language poses a unique set of challenges due to the linguistic nuances and cultural specificities of the Indonesian social media landscape. Understanding the dynamics of sarcasm in this context requires a deep dive into not only language patterns but also the socio-cultural background that shapes the use of sarcasm as a form of criticism and expression. In this study, we developed the first publicly available Indonesian sarcasm detection benchmark datasets from social media texts. We extensively investigated the results of classical machine learning algorithms, pre-trained language models, and recent large language models (LLMs). Our findings show that fine-tuning pre-trained language models is still superior to other techniques, achieving F1 scores of 62.74% and 76.92% on the Reddit and Twitter subsets, respectively. Further, we show that recent LLMs fail to perform zero-shot classification for sarcasm detection and that tackling data imbalance requires a more sophisticated data augmentation approach than our basic methods.
Diabetic retinopathy (DR) is a complication of diabetes that can lead to blindness if not detected early. Recently, many classification systems for diabetic retinopathy have been developed. However, several problems remain: classification accuracy for certain classes is still suboptimal, the results lack in-depth analysis, and overall accuracy can still be improved. In this work, we evaluate and combine recent deep learning models such as EfficientNet, EfficientNetV2, LCNet, MobileNetV3, TinyNet, and FBNetV3 using ensemble stacking with four different meta-learners (decision trees, logistic regression, ANN, and SVM) to classify the severity of diabetic retinopathy more accurately. On the APTOS 2019 dataset, our approach achieves training, validation, and testing accuracies of 96.56%, 95.33%, and 84.17%, respectively, and an F1 score of 70.16%.
Drug-Target Interaction (DTI) has drawn growing attention since the COVID-19 outbreak. DTI prediction is one of the stages of finding a new cure for a recent disease. It determines whether a chemical compound would affect a particular protein, a property known as binding affinity. Recently, significant efforts have been devoted to artificial intelligence (AI) powered DTI. However, the use of transfer learning in DTI has not been explored extensively. This paper aims to build a more general DTI model by investigating DTI prediction with transfer learning. Three popular models are tested and observed: CNN, RNN, and Transformer. These models are combined in several scenarios involving two extensive public DTI datasets (BindingDB and DAVIS) to find the optimal architecture. We find that the CNN model with BindingDB as the source data is the most recommended pre-trained model for real DTI cases. This conclusion is supported by a 6% AUPRC increase when the BindingDB pre-trained model is fine-tuned on the DAVIS dataset, compared with training on DAVIS without pre-training.
Sarcasm is the use of words to mock or annoy someone, or for humorous purposes. Sarcasm is widely used on social networks and microblogging websites, where people mock or censure in a way that makes it difficult even for humans to tell whether what is said is what is meant. Failure to identify sarcastic utterances in Natural Language Processing applications such as sentiment analysis and opinion mining confuses classification algorithms and produces false results. Several studies on sarcasm detection have used different learning algorithms. However, most of these models focus only on the content of the expression and ignore contextual information. As a result, they fail to capture the context of the sarcastic expression. Moreover, the datasets used in several studies are imbalanced, which affects model results. In this paper, we propose a contextual model for sarcasm identification on Twitter using RoBERTa, and we augment the dataset by applying Global Vector representation (GloVe) for word embedding construction and context learning to generate more data and balance the dataset. The effectiveness of this technique is tested with various datasets and data augmentation settings. In particular, we achieve a gain of 3.2 percentage points on the iSarcasm dataset when using data augmentation to increase the data labeled as sarcastic by 20%, resulting in an F-score of 40.4% compared to 37.2% without data augmentation.
This research paper focuses on the development and evaluation of Automatic
Speech Recognition (ASR) technology using the XLS-R 300m model. The study aims
to improve ASR performance in converting spoken language into written text,
specifically for Indonesian, Javanese, and Sundanese languages. The paper
discusses the testing procedures, datasets used, and methodology employed in
training and evaluating the ASR systems. The results show that the XLS-R 300m
model achieves competitive Word Error Rate (WER) measurements, with a slight
compromise in performance for Javanese and Sundanese languages. The integration
of a 5-gram KenLM language model significantly reduces WER and enhances ASR
accuracy. The research contributes to the advancement of ASR technology by
addressing linguistic diversity and improving performance across various
languages. The findings provide insights into optimizing ASR accuracy and
applicability for diverse linguistic contexts.
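Word Error Rate, the metric reported above, is conventionally defined as the word-level edit distance (substitutions, deletions, and insertions) between a hypothesis and its reference transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the paper's evaluation code; the function name `word_error_rate` and the sample sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (one fewer than the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Illustrative example (made-up sentences): one dropped word out of four.
example_wer = word_error_rate("saya pergi ke pasar", "saya pergi pasar")
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many inserted words.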
Many studies have been conducted on Dijkstra's algorithm, and one of its applications is vehicle path selection, or routing. The traditional Dijkstra's algorithm can solve simple path routing problems; however, it is not suitable for complex situations. This paper proposes an improved version of Dijkstra's algorithm that analyzes influencing factors such as traffic congestion, travel time reliability, and the weight of each equivalent path, and then produces the best path for each situation. The new method improves the efficiency of shortest path search and reduces time complexity. This paper applies Dijkstra's algorithm to analyze travel routes between Bung Karno Stadium and Merlynn Park. A node diagram is created based on cost tables and transformed into cost metrics. The goal is to find the shortest route while considering traffic data and analyzing how the result differs from one based on road distances alone. The cost metrics are fed into Dijkstra's algorithm, yielding the minimum cost to each vertex from the source node. Two optimal travel paths are determined based on different datasets, highlighting the impact of traffic data. Further evaluations consider travel times and costs with and without traffic. Incorporating predicted traffic times into the algorithm improves travel cost predictions. This research concludes that Dijkstra's algorithm is effective in solving the shortest route problem and can be applied to urban traffic networks when traffic predictions are included, addressing its drawbacks and making route planning more effective.
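The effect of traffic data on the chosen route can be illustrated with a small sketch. It runs the standard heap-based Dijkstra's algorithm twice on the same toy network, once with road distances and once with travel times under traffic. The node labels and edge costs below are invented for illustration and are not the paper's data; `dijkstra` and `path_to` are hypothetical helper names.

```python
import heapq


def dijkstra(graph, source):
    """Return (minimum cost to every node, predecessor map) from source.

    graph maps each node to a dict {neighbor: non-negative edge cost}.
    """
    dist = {node: float("inf") for node in graph}
    prev = {node: None for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist[node]:
            continue  # stale heap entry; a cheaper route was already found
        for neighbor, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < dist[neighbor]:
                dist[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    return dist, prev


def path_to(prev, target):
    """Rebuild the route to target by walking the predecessor map backwards."""
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]


# Toy network (assumed values): same roads, two different cost tables.
distance_km = {"S": {"A": 2, "B": 5}, "A": {"T": 6}, "B": {"T": 2}, "T": {}}
traffic_min = {"S": {"A": 5, "B": 20}, "A": {"T": 12}, "B": {"T": 6}, "T": {}}

dist_km, prev_km = dijkstra(distance_km, "S")
dist_min, prev_min = dijkstra(traffic_min, "S")
route_by_distance = path_to(prev_km, "T")  # shortest by road distance
route_by_traffic = path_to(prev_min, "T")  # fastest under assumed traffic
```

In this toy case the shortest route by distance goes through `B`, while the fastest route under the assumed traffic goes through `A`, which mirrors the abstract's point that the two cost tables can produce different optimal paths.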
Analyzing The Most Effective Algorithm For Knapsack Problems
Boes, Devano Fernando; Dewanto, Kenrick Panca; Raushanfikir, Mahesa Insan ...
2023 3rd International Conference on Smart Cities, Automation & Intelligent Computing Systems (ICON-SONICS),
2023-Dec.-6
Conference Proceeding
The knapsack problem is a classical optimization problem in computer science and programming. Its objective is to determine the maximum profit that can be carried within the knapsack's maximum capacity, where each item has its own weight and profit. In this paper, we investigate which algorithm is most effective in solving the 0/1 knapsack and fractional knapsack problems. The algorithms studied are brute force, greedy, dynamic programming, and branch and bound. Each algorithm is tested on both knapsack problems, using several datasets per problem. The algorithms are implemented in Python. The results show that dynamic programming has a 100% success rate on both problems; brute force has a 100% success rate on the 0/1 knapsack but only a 20% success rate on the fractional knapsack; the greedy algorithm surprisingly has a 100% success rate on the fractional knapsack and a 60% success rate on the 0/1 knapsack; and branch and bound has a 100% success rate on the fractional knapsack but a 0% success rate on the 0/1 knapsack. This research also considers the time complexity of each algorithm in drawing its conclusion. Graphs are provided to visualize the results.
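The difference between the greedy and dynamic programming results can be illustrated with a short sketch. It uses textbook formulations of the algorithms, not the paper's implementation, and assumes positive integer weights; the function names and the three-item instance are illustrative assumptions.

```python
def knapsack_01_dp(weights, profits, capacity):
    """Optimal 0/1 knapsack profit via bottom-up dynamic programming."""
    best = [0] * (capacity + 1)  # best[c] = max profit within capacity c
    for weight, profit in zip(weights, profits):
        # Iterate capacities downwards so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + profit)
    return best[capacity]


def _by_profit_density(weights, profits):
    """Items sorted by profit per unit weight, highest first."""
    return sorted(zip(weights, profits), key=lambda wp: wp[1] / wp[0], reverse=True)


def knapsack_fractional_greedy(weights, profits, capacity):
    """Fractional knapsack: take the densest items first, splitting the last."""
    total = 0.0
    remaining = capacity
    for weight, profit in _by_profit_density(weights, profits):
        if remaining <= 0:
            break
        take = min(weight, remaining)
        total += profit * take / weight
        remaining -= take
    return total


def knapsack_01_greedy(weights, profits, capacity):
    """Same greedy rule applied to 0/1 knapsack; items cannot be split."""
    total = 0
    remaining = capacity
    for weight, profit in _by_profit_density(weights, profits):
        if weight <= remaining:
            total += profit
            remaining -= weight
    return total


# Illustrative instance (assumed values, not one of the paper's datasets).
weights, profits, capacity = [10, 20, 30], [60, 100, 120], 50
dp_result = knapsack_01_dp(weights, profits, capacity)
greedy_01_result = knapsack_01_greedy(weights, profits, capacity)
greedy_fractional_result = knapsack_fractional_greedy(weights, profits, capacity)
```

On this instance the greedy rule returns 160 for the 0/1 problem while dynamic programming finds 220, and the same greedy rule reaches 240.0 on the fractional problem, where splitting the last item is allowed.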