Drowsiness is one of the main causes of road accidents and endangers the lives of road users. Recently, there has been considerable interest in using features extracted from electroencephalography (EEG) signals to detect driver drowsiness. However, in most of the work in this area, the eyeblink or ocular artifacts present in EEG signals are treated as noise and removed during the preprocessing stage. In this study, we examined the possibility of extracting features from the EEG ocular artifacts themselves to classify alert and drowsy states. We used the BLINKER algorithm to extract 25 blink-related features from a public dataset comprising raw EEG signals collected from 12 participants. Different machine learning classification models, including the decision tree, the support vector machine (SVM), the K-nearest neighbor (KNN) method, and the bagged and boosted tree models, were trained on seven selected features. These models were further optimized to improve their performance. We showed that features from EEG ocular artifacts can distinguish drowsy from alert states, with the optimized ensemble-boosted trees yielding the highest accuracy of 91.10% among all classic machine learning models.
This research paper presents a comparative study assessing the quality of annotations in Turkish, Indonesian, and Minangkabau Natural Language Processing (NLP) tasks, with a specific focus on the contrast between annotations generated by human annotators and those produced by Large Language Models (LLMs). In NLP, high-quality annotations play a pivotal role in training and evaluating machine-learning models. The study encompasses three core NLP tasks: topic classification, tweet sentiment analysis, and emotion classification, each reflecting a distinct aspect of text analysis. The research methodology incorporates a curated dataset sourced from a variety of text data, spanning diverse topics and emotions. Human annotators, proficient in the Turkish, Indonesian, and Minangkabau languages, were tasked with producing high-quality annotations, adhering to comprehensive annotation guidelines. Additionally, fine-tuned Turkish LLMs were employed to generate annotations for the same tasks. The evaluation process employed precision, recall, and F1-score metrics, tailored to each specific NLP task. The findings underscore the nuanced nature of annotation quality. While LLM-generated annotations demonstrated competitive quality, particularly in sentiment analysis, human-generated annotations consistently outperformed LLM-generated ones in more intricate NLP tasks. The observed differences highlight LLM limitations in understanding context and addressing ambiguity. This research contributes to the ongoing discourse on annotation sources in Turkish, Indonesian, and Minangkabau NLP, emphasizing the importance of judicious selection between human and LLM-generated annotations. It also underscores the necessity for continued advancements in LLM capabilities, as they continue to reshape the landscape of data annotation in NLP and machine learning.
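As an illustration of the evaluation metrics named above, the following is a minimal sketch of per-label precision, recall, and F1 when LLM-generated labels are scored against human labels. The function name and the toy label lists are hypothetical; the abstract does not state how the metrics were averaged or tailored per task.

```python
def precision_recall_f1(gold, predicted, label):
    """Precision, recall, and F1 for a single label, treating `gold` as the reference."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)  # true positives
    fp = sum(1 for g, p in pairs if g != label and p == label)  # false positives
    fn = sum(1 for g, p in pairs if g == label and p != label)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical toy data: human labels as reference, LLM labels as predictions.
human_labels = ["pos", "neg", "pos", "neu", "neg", "pos"]
llm_labels = ["pos", "neg", "neg", "neu", "neg", "pos"]

for label in ("pos", "neg", "neu"):
    p, r, f1 = precision_recall_f1(human_labels, llm_labels, label)
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Averaging the per-label scores (macro averaging) would give a single figure per task, though which averaging scheme the study used is not stated.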
Indonesia has a variety of ethnic languages, most of which belong to the same language family: the Austronesian languages. Due to the shared language family, words in Indonesian ethnic languages are very similar. However, previous research suggests that these Indonesian ethnic languages are endangered. To help prevent their loss, we propose the creation of bilingual dictionaries between ethnic languages, using a neural network approach to extract transformation rules, employing character-level embedding and the Bi-LSTM method in a sequence-to-sequence model. The model has an encoder and a decoder. The encoder reads the input sequence character by character, generates context, and then extracts a summary of the input. The decoder produces an output sequence in which the character at each timestep, as well as each subsequent character, is influenced by the previous character. The first experiment focuses on the Indonesian and Minangkabau languages with 10,277 word pairs. To evaluate the model's performance, five-fold cross-validation was used. The character-level seq2seq method (Bi-LSTM as an encoder and LSTM as a decoder), with an average precision of 83.92%, outperformed the SentencePiece byte pair encoding (vocab size of 33), with an average precision of 79.56%. Furthermore, to evaluate the performance of the neural network model in finding the pattern, a rule-based approach was used as the baseline. The neural network approach obtained 542 more correct translations than the baseline. We applied the best setting (character-level embedding with Bi-LSTM as the encoder and LSTM as the decoder) to four other Indonesian ethnic languages: Malay, Palembang, Javanese, and Sundanese. These languages have input dictionaries half the size. The average precision scores for these languages are 65.08%, 62.52%, 59.69%, and 58.46%, respectively. This shows that the neural network approach can identify transformation patterns from the Indonesian language to closely related languages (such as Malay and Palembang) better than to distantly related languages (such as Javanese and Sundanese).
Creating a bilingual dictionary is the first crucial step in enriching low-resource languages. Especially for closely related languages, it has been shown that the constraint-based approach is useful for inducing bilingual lexicons from two bilingual dictionaries via a pivot language. However, if no machine-readable dictionaries are available as input, we need to consider manual creation by bilingual native speakers. When the goal is to comprehensively create multiple bilingual dictionaries, even if several machine-readable bilingual dictionaries already exist, it is still difficult to determine the execution order of the constraint-based approach that minimizes the total cost. Plan optimization is crucial in composing the order of bilingual dictionary creation with consideration of the methods and their costs. We formalize the plan optimization for creating bilingual dictionaries as a Markov Decision Process (MDP), with the goal of obtaining a more accurate estimate of the most feasible optimal plan with the least total cost before fully implementing the constraint-based bilingual lexicon induction. We model a prior beta distribution of bilingual lexicon induction precision with language similarity and polysemy of the topology as the α and β parameters. It is further used to model the cost function and the state transition probability. We estimated the cost of all investment plans as a baseline for evaluating the proposed MDP-based approach, with total cost as the evaluation metric. After using the posterior beta distribution from the first batch of experiments to construct the prior beta distribution in the second batch of experiments, the result shows a 61.5% cost reduction compared to the estimated all investment plans and a 39.4% cost reduction compared to the estimated MDP optimal plan. The MDP-based proposal outperformed the baseline on total cost.
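The step of turning a first-batch posterior into a second-batch prior can be sketched with the standard conjugate update of a beta distribution. This is a minimal illustration only: the function names and all numbers are hypothetical, and the abstract does not specify how language similarity and polysemy are mapped to the two parameters.

```python
def beta_mean(alpha, beta):
    """Expected precision under a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def update_beta(alpha, beta, correct, incorrect):
    """Conjugate update: add observed correct/incorrect translation pairs."""
    return alpha + correct, beta + incorrect


# Hypothetical prior for one language pair (values are assumptions).
prior = (2.0, 2.0)

# Hypothetical first-batch evaluation: 8 correct pairs, 2 incorrect pairs.
posterior = update_beta(*prior, correct=8, incorrect=2)

# The first-batch posterior serves as the prior for the second batch.
second_batch_prior = posterior

print("prior mean precision:", beta_mean(*prior))
print("second-batch prior mean precision:", beta_mean(*second_batch_prior))
```

A sharper estimate of expected precision feeds directly into the cost function and transition probabilities, which is why reusing the posterior can change the plan the MDP selects.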
Lexicostatistic and language similarity clusters are useful for computational linguistics research that depends on language similarity or cognate recognition. Nevertheless, no published lexicostatistic or language similarity clusters of Indonesian ethnic languages are available. We formulate an approach for creating language similarity clusters that uses the ASJP database to generate a language similarity matrix, then generates hierarchical clusters with complete linkage and mean linkage clustering, and further extracts two stable clusters with high language similarities. We introduce an extended semi-supervised k-means clustering to evaluate the stability level of the hierarchical stable clusters, that is, whether they remain grouped together as the number of clusters changes. The higher the number of trials, the more likely we are to find the two hierarchical stable clusters distinctly in the generated k-clusters. However, for all five experiments, the stability level of the two hierarchical stable clusters is highest at 5 clusters. Therefore, we take the 5 clusters as the best clusters of Indonesian ethnic languages. Finally, we plot the generated 5 clusters on a geographical map.
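The complete-linkage step can be sketched in plain Python as follows. This is a minimal sketch, not the authors' pipeline: the distance values are invented for illustration (they are not ASJP figures), and distance is assumed to be one minus similarity.

```python
def complete_linkage(dist, names, n_clusters):
    """Agglomerative clustering; cluster distance is the largest pairwise distance."""
    clusters = [[i] for i in range(len(names))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: farthest pair of members across the two clusters.
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(names[i] for i in members) for members in clusters]


# Hypothetical symmetric distance matrix (assumed values, NOT ASJP data).
names = ["Indonesian", "Malay", "Minangkabau", "Javanese", "Sundanese"]
dist = [
    [0.00, 0.10, 0.30, 0.70, 0.60],
    [0.10, 0.00, 0.25, 0.70, 0.65],
    [0.30, 0.25, 0.00, 0.75, 0.70],
    [0.70, 0.70, 0.75, 0.00, 0.40],
    [0.60, 0.65, 0.70, 0.40, 0.00],
]

print(complete_linkage(dist, names, n_clusters=2))
```

Replacing `max` with an average over the pairwise distances would give the mean-linkage variant under the same assumptions.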
The lack or absence of parallel and comparable corpora makes bilingual lexicon extraction a difficult task for low-resource languages. The pivot language and cognate recognition approaches have proven useful for inducing bilingual lexicons for such languages. We propose constraint-based bilingual lexicon induction for closely related languages by extending constraints from the recent pivot-based induction technique and further enabling multiple symmetry assumption cycles to reach many more cognates in the transgraph. We further identify cognate synonyms to obtain many-to-many translation pairs. This article uses four datasets: one Austronesian low-resource language and three Indo-European high-resource languages. As baselines, we use three constraint-based methods from our previous work, the Inverse Consultation method, and translation pairs generated from the Cartesian product of the input dictionaries. We evaluate our results using precision, recall, and F-score. Our customizable approach allows the user to conduct cross-validation to predict the optimal hyperparameters (cognate threshold and cognate synonym threshold) with various combinations of heuristics and numbers of symmetry assumption cycles to gain the highest F-score. Our proposed methods show a statistically significant improvement in precision and F-score compared to our previous constraint-based methods. The results show that our method has the potential to complement other bilingual dictionary creation methods, such as word alignment models using parallel corpora for high-resource languages, while also handling low-resource languages well.
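One plausible reading of the Cartesian product baseline is sketched below: every word of language A that shares a pivot word with a word of language B becomes a candidate translation pair. The function name, the dictionary layout, and the placeholder tokens are assumptions made for illustration; the abstract does not describe the baseline in this detail.

```python
from itertools import product


def pivot_cartesian_pairs(pivot_to_a, pivot_to_b):
    """Candidate A-B pairs: Cartesian product of translations sharing a pivot word."""
    pairs = set()
    # Only pivot words present in both input dictionaries can connect A and B.
    for pivot_word in pivot_to_a.keys() & pivot_to_b.keys():
        pairs.update(product(pivot_to_a[pivot_word], pivot_to_b[pivot_word]))
    return pairs


# Placeholder tokens stand in for real dictionary entries (hypothetical data).
pivot_to_a = {"p1": {"a1", "a2"}, "p2": {"a3"}}
pivot_to_b = {"p1": {"b1"}, "p2": {"b2", "b3"}, "p3": {"b4"}}

for pair in sorted(pivot_cartesian_pairs(pivot_to_a, pivot_to_b)):
    print(pair)
```

Because every combination is kept, a baseline of this kind tends to favor recall over precision when pivot words are polysemous, which is the gap the constraint-based methods aim to close.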
Indonesia has a diverse ethnic and cultural background. However, this diversity sometimes creates social problems, such as intertribal conflict. Because of the large differences among tribal languages, it is often difficult for conflicting parties to engage in dialogue for conflict resolution. To address this problem, we aim to find intermediary closely related languages from a language similarity knowledge graph using the best-performing pathfinding algorithms. In this research, we analyze the performance of two pathfinding algorithms, namely Dijkstra and Yen's K, by comparing their execution time and the total lexical distance of the intermediary languages (called "the cost"). Our findings show that even though the Dijkstra and Yen's K algorithms have equal total cost for all cases, Yen's K outperformed Dijkstra at searching for closely related intermediary languages, with an average of 160% higher performance on execution time. The selection of native speakers of the obtained intermediary languages as mediators is formalized as an optimization problem with four criteria: language similarity, geographical distance, background, and expected salary. We present a case study where the intermediary closely related languages can be used as a guideline to find mediators who can help resolve intertribal conflicts among Indonesian tribes. To calculate the first criterion, we implemented the Yen's K algorithm to calculate the shortest path between target languages and return the path via the intermediary languages. This implementation shows the potential use of the mediator selection model defined in this paper in various other roles, such as trader or salesperson, politician's spokesperson, and reporter or journalist.
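The search for intermediary languages can be sketched as a shortest-path query over a weighted graph whose edge weights are lexical distances. The following is a minimal Dijkstra sketch only; the node names and edge weights are hypothetical, and it does not reproduce the Yen's K implementation evaluated in the paper.

```python
import heapq


def dijkstra_path(graph, source, target):
    """Return (path, total_cost) for the cheapest route, or (None, inf) if unreachable."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue  # stale heap entry
        visited.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if target not in dist:
        return None, float("inf")
    # Walk predecessors back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[target]


# Hypothetical undirected graph: nodes are languages, weights are lexical distances.
edges = [("A", "B", 0.2), ("B", "C", 0.3), ("A", "C", 0.9), ("C", "D", 0.1), ("B", "D", 0.8)]
graph = {}
for u, v, w in edges:
    graph.setdefault(u, {})[v] = w
    graph.setdefault(v, {})[u] = w

path, cost = dijkstra_path(graph, "A", "D")
print(path, round(cost, 2))
```

In this toy graph the interior nodes of the returned path play the role of the intermediary languages, and the summed weights correspond to "the cost".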
The COVID-19 pandemic has challenged the education sector (schools) to adapt quickly so that education can still be delivered to students effectively even under limited circumstances. One of the efforts schools continue to make is the application of information technology as a learning medium. The partner for this Community Service activity is SMP YLPI Marpoyan Pekanbaru, which during the pandemic experienced obstacles in using the library and the Learning Management System (Google Classroom). To that end, the team built the book&quiz application and conducted training on the use of the book&quiz application and Google Classroom. The book&quiz application is expected to facilitate the management of digital libraries and increase students' interest in reading. In addition, the use of Google Classroom can support better online class management. Overall, the teachers reported being very satisfied with and helped by this community service activity.
Pembuatan Plugin Tile-Based Game Pada Unity 3D
Nasution, Salhazan; Nasution, Arbi Haza; Hakim, Arif Lukman
IT journal research and development (Online), 08/2019, Volume 4, Issue 1
Journal Article, Peer reviewed, Open access
Video games have become commonplace in societies around the world. Accordingly, the game development process has improved with the emergence of game engines. One of the most frequently used game engines is Unity. Unity offers a wide range of features, one of which is support for plugins. However, Unity itself does not yet have a plugin for tile-based game development. Without plugin support, developing a tile-based game takes a very long time, because the position of each tile must be set very precisely on the x, y, and z coordinates every time a new tile is created. The solution to this problem is to build a GUI (Graphical User Interface) in the Unity editor by extending Unity's Editor class. By extending this class, a new menu system can be created specifically for level editing in tile-based games. With this plugin, tile-based game development can become more effective and efficient in terms of resources, time, and ease of work.
English is one of the languages used as a universal means of communication; without English proficiency, a person will have difficulty communicating properly in an international setting. This study develops an augmented reality-based machine translation application that educates students through a different medium in order to increase their interest in learning English. The application uses the Vuforia SDK library, which can display 3-dimensional characters in augmented reality using a markerless technique. The final result of this study is an application that runs on smartphones with the Android operating system. Testing showed that the application can display 3-dimensional characters in dim light with a light intensity of 28 lux, at a distance of 10 cm to 60 cm and a viewing angle of 10° to 90°. In a review of the application, 99% of respondents rated it as good, so the application can help students review English outside of school.