OpenAI has released ChatGPT (Chat Generative Pre-trained Transformer) and revolutionized the artificial intelligence approach to human-model interaction. Even first contact with the chatbot reveals its ability to provide detailed and precise answers in various areas. Several publications on ChatGPT evaluation test its effectiveness on well-known natural language processing (NLP) tasks. However, the existing studies are mostly non-automated and tested on a very limited scale. In this work, we examined ChatGPT’s capabilities on 25 diverse analytical NLP tasks, most of them subjective even to humans, such as sentiment analysis, emotion recognition, offensiveness, and stance detection. The remaining tasks require more objective reasoning, such as word sense disambiguation, linguistic acceptability, and question answering. We also evaluated the GPT-4 model on five selected subsets of NLP tasks. We automated the ChatGPT and GPT-4 prompting process and analyzed more than 49k responses. Comparing these results with available State-of-the-Art (SOTA) solutions showed that the average quality loss of the ChatGPT model was about 25% for zero-shot and few-shot evaluation. For the GPT-4 model, the loss on semantic tasks is significantly lower than for ChatGPT. We showed that the more difficult the task (the lower the SOTA performance), the higher the ChatGPT loss. This applies especially to pragmatic NLP problems such as emotion recognition. We also tested the ability to personalize ChatGPT responses for selected subjective tasks via Random Contextual Few-Shot Personalization and obtained significantly better user-based predictions. Additional qualitative analysis revealed a ChatGPT bias, most likely due to the rules imposed on human trainers by OpenAI.
Our results provide the basis for a fundamental discussion of whether the high quality of recent predictive NLP models can indicate a tool’s usefulness to society and how the learning and validation procedures for such systems should be established.
•The results of ChatGPT and GPT-4 evaluation on 25 tasks using 48k+ prompts.
•Context-awareness and personalization are valuable capabilities of ChatGPT.
•ChatGPT and GPT-4 are consistently worse than SOTA methods, by 4% to over 70%.
•ChatGPT loss tends to be higher for more difficult reasoning problems.
•ChatGPT can boost AI development and change our daily lives.
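The automated zero-shot and few-shot prompting described above can be illustrated with a minimal sketch. The prompt templates below are illustrative assumptions for a sentiment-analysis task, not the templates used in the study:

```python
# Sketch: building zero-shot and few-shot prompts for an analytical NLP task
# (sentiment analysis). The wording of the templates is a hypothetical example.

def zero_shot_prompt(text: str) -> str:
    """Ask the model to classify with no labelled examples."""
    return (
        "Classify the sentiment of the following text as "
        "positive, negative, or neutral.\n"
        f"Text: {text}\n"
        "Sentiment:"
    )

def few_shot_prompt(text: str, examples: list[tuple[str, str]]) -> str:
    """Prepend labelled examples (shots) before the target text."""
    shots = "\n".join(
        f"Text: {t}\nSentiment: {label}" for t, label in examples
    )
    return (
        "Classify the sentiment of the following text as "
        "positive, negative, or neutral.\n"
        f"{shots}\n"
        f"Text: {text}\n"
        "Sentiment:"
    )

demo = few_shot_prompt(
    "The plot dragged on forever.",
    [("I loved every minute.", "positive"),
     ("Terrible service.", "negative")],
)
print(demo.count("Sentiment:"))  # one occurrence per shot, plus the query
```

Generating one such prompt per test instance and collecting the model's completions is what makes large-scale, automated evaluation (tens of thousands of responses) feasible.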
This open access book provides an in-depth description of the EU project European Language Grid (ELG). Its motivation lies in the fact that Europe is a multilingual society with 24 official European Union Member State languages and dozens of additional languages including regional and minority languages. The only meaningful way to enable multilingualism and to benefit from this rich linguistic heritage is through Language Technologies (LT) including Natural Language Processing (NLP), Natural Language Understanding (NLU), Speech Technologies and language-centric Artificial Intelligence (AI) applications. The European Language Grid provides a single umbrella platform for the European LT community, including research and industry, effectively functioning as a virtual home, marketplace, showroom, and deployment centre for all services, tools, resources, products and organisations active in the field. Today the ELG cloud platform already offers access to more than 13,000 language processing tools and language resources. It enables all stakeholders to deposit, upload and deploy their technologies and datasets. The platform also supports the long-term objective of establishing digital language equality in Europe by 2030 – to create a situation in which all European languages enjoy equal technological support. This is the very first book dedicated to Language Technology and NLP platforms. Cloud technology has only recently matured enough to make the development of a platform like ELG feasible on a larger scale. The book comprehensively describes the results of the ELG project. Following an introduction, the content is divided into four main parts: (I) ELG Cloud Platform; (II) ELG Inventory of Technologies and Resources; (III) ELG Community and Initiative; and (IV) ELG Open Calls and Pilot Projects.
Some content-processing tasks in natural language processing (NLP), such as detecting hate or offensive speech and emotional or funny texts, are subjective by nature. Each human may perceive the same content differently. The existing reasoning methods commonly rely on agreed output values, the same for all recipients. We propose fundamentally different, personalized solutions applicable to any subjective NLP task. Our five new deep learning models take into account not only the textual content but also the opinions and beliefs of a given person. They differ in their approaches to learning Human Bias (HuBi) and fusion with content (text) representation. The experiments were carried out on 14 tasks related to offensive, emotional, and humorous texts. Our personalized HuBi methods radically outperformed the generalized ones for all NLP problems. Personalization also has a greater impact on reasoning quality than commonly explored pre-trained and fine-tuned language models. We discovered a high correlation between human bias calculated using our dedicated formula and that learned by the model. Multi-task solutions achieved better outcomes than single-task architectures. Human and word embeddings also provided additional insights.
•Human-centered neural architectures suitable for subjective NLP problems are introduced.
•Personalized NLP requires dedicated validation procedures.
•Personalized methods revealed their superiority over generalized approaches for 14 tasks related to hate speech, emotions and humor.
•Language models, multi-tasking and fine-tuning have less impact than personalization.
•There is a correlation between formula-based human bias and bias learned by the neural model.
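The core idea of fusing a shared text representation with a learned per-annotator bias can be shown in a deliberately minimal toy sketch. This is only an illustration of the principle, not the HuBi architectures themselves; all weights and feature values below are made up:

```python
# Toy sketch of personalized reasoning: a shared (generalized) text score
# shifted by a learned per-annotator bias term. Not the authors' models.
import math

def personalized_score(text_vec, weights, human_bias):
    """Sigmoid of the shared text score plus the annotator's bias."""
    logit = sum(w * x for w, x in zip(weights, text_vec)) + human_bias
    return 1.0 / (1.0 + math.exp(-logit))

weights = [0.8, -0.5, 1.2]   # shared text weights (hypothetical)
text_vec = [1.0, 0.0, 0.5]   # features of one text (hypothetical)

# The same text, judged by a lenient vs. a sensitive annotator:
lenient = personalized_score(text_vec, weights, human_bias=-1.5)
sensitive = personalized_score(text_vec, weights, human_bias=+1.5)
print(lenient < sensitive)  # the bias term shifts the decision per person
```

In the actual models the bias is a learned embedding fused with a deep text representation, but the effect is the same: identical content can yield different predictions for different people.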
Natural language processing (NLP) has recently gained much attention for representing and analyzing human language computationally. Its applications have spread to various fields such as machine translation, email spam detection, information extraction, summarization, medicine, and question answering. In this paper, we first distinguish four phases by discussing different levels of NLP and components of Natural Language Generation, followed by presenting the history and evolution of NLP. We then discuss the state of the art in detail, presenting the various applications of NLP, current trends, and challenges. Finally, we present a discussion of some available datasets, models, and evaluation metrics in NLP.
Social media has become a major platform for news consumption. On the one hand, its low cost, easy access, and rapid dissemination lead people to seek out and consume news on social media. On the other hand, it enables the wide spread of “spam,” i.e., low-quality news containing deliberately false information. The widespread circulation of spam can have very negative impacts on people and society. Consequently, spam detection on social media has recently become an important research area that draws tremendous attention. NLP, a division of artificial intelligence (AI), uses computers and human natural language to produce useful data. NLP is widely used in text classification tasks such as spam detection and sentiment analysis, as well as text generation, language translation, and document classification.
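A bag-of-words Naive Bayes model is one of the simplest forms of the text classification described above. The following self-contained sketch uses toy training data (the texts and labels are made-up assumptions), purely to illustrate the technique:

```python
# Minimal bag-of-words Naive Bayes spam classifier with add-one smoothing.
# Training data is a toy assumption for illustration only.
import math
from collections import Counter

def train(docs):
    """docs: list of (text, label) pairs. Returns word counts and class counts."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for text, label in docs:
        counts[label].update(text.lower().split())
        totals[label] += 1
    return counts, totals

def classify(text, counts, totals):
    """Pick the class with the higher smoothed log-probability."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    best, best_lp = None, -math.inf
    for label in ("spam", "ham"):
        n = sum(counts[label].values())
        lp = math.log(totals[label] / sum(totals.values()))  # class prior
        for word in text.lower().split():
            lp += math.log((counts[label][word] + 1) / (n + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

docs = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting moved to monday", "ham"),
    ("lunch on monday?", "ham"),
]
counts, totals = train(docs)
print(classify("claim your free money", counts, totals))  # prints "spam"
```

Production spam detectors replace the hand-counted bag of words with learned representations, but the underlying task, mapping text to a label, is the same.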
ChatGpt: Open Possibilities. Mohammad Aljanabi; Mohanad Ghazi; Ahmed Hussein Ali
Iraqi Journal for Computer Science and Mathematics, 2023, Volume 4, Issue 1. Journal article, peer reviewed, open access.
ChatGPT-3 is a powerful language model developed by OpenAI that has the potential to revolutionize the way we interact with technology. This model has been trained on a massive amount of data, allowing it to understand and generate human-like text with remarkable accuracy. One of the most exciting possibilities of ChatGPT-3 is its potential to improve natural language processing (NLP) and natural language understanding (NLU) in a wide range of applications. In particular, ChatGPT-3 can be used to power chatbots, virtual assistants, and other conversational interfaces. These types of systems are becoming increasingly important as more and more people use voice and text to interact with technology. We outline ChatGPT’s role in each of the following sections.
We introduce Bidirectional and Auto-Regressive Transformer for Reactions (BARTReact), a self-supervised deep learning model designed to predict chemical reactions. Built on the powerful Bidirectional and Auto-Regressive Transformer (BART) architecture, BARTReact is trained using the SELF-referencIng Embedded Strings (SELFIES), a molecular representation that ensures the production of only viable molecules, achieving an outstanding accuracy of 98.6%.
This study investigates the effectiveness of the DistilBERT model in classifying tweets related to disasters. This study achieved significant predictive accuracy through a comprehensive analysis of the dataset and iterative refinement of the model, including adjustments to hyperparameters. The benchmark model developed highlights the benefits of DistilBERT, with its reduced size and improved processing speed contributing to greater computational efficiency while maintaining over 95% of BERT's capabilities. The results indicate an impressive average training accuracy of 92.42% and a validation accuracy of 82.11%, demonstrating the practical advantages of DistilBERT in emergency management and disaster response. These findings underscore the potential of advanced transformer models to analyze social media data, contributing to better public safety and emergency preparedness.
We present a statistical model, GERNERMED++, for German medical natural language processing trained for named entity recognition (NER) as an open, publicly available model. We demonstrate the effectiveness of combining multiple techniques to achieve strong entity recognition performance by means of transfer learning on pre-trained deep language models (LM), word alignment, and neural machine translation, outperforming a pre-existing baseline model on several datasets. Given the scarcity of open, public medical entity recognition models for German texts, this work offers the German research community a baseline model for medical NLP. The work serves as a refined successor to our first GERNERMED model. As with our previous work, our trained model is publicly available to other researchers. The sample code and the statistical model are available at: https://github.com/frankkramer-lab/GERNERMED-pp.