Artificial Intelligence and Language Technologies in the Field of Translation
The present paper deals with the opportunities that language technologies and artificial intelligence might offer translation professionals. It also examines the problems, hurdles and obstacles these technologies may entail, while arguing that translators possess the skills, competencies and prerequisites needed to implement and use them successfully, as illustrated in the Literature Review and Discussion sections of this research. It has also been shown that the successful implementation of such tools requires both institutional and technical prerequisites. A considerable problem, however, is whether, and to what extent, the significance of the digital translator can also be adequately demonstrated in the non-institutional private sector. Current developments in this respect unfortunately tend to point to the opposite, i.e. to the degradation of the translator's role to that of a mere "helper" of the machine that (apparently) does most of the translation work.
Translators, then, will ultimately only be able to truly benefit from language technology and artificial intelligence if their status is not further compromised as a result. Increased productivity and speed are of little help if fees decrease in inverse proportion because the computer appears to be doing most of the work. The end result of this development would be the digital slave rather than the highly professional digital language mediator. The paper projects that language technologies and artificial intelligence potentially offer benefits in that they promote a better understanding and functioning of the symbiosis between humans and machines through the automation and streamlining of certain language-related workflow processes, i.e. a set of tasks that can be performed primarily, or even entirely, through workflow automation and with the help of content management software. Drawing on previous research on the subject and on secondary data analysis, the paper shows that the resulting acceleration of work processes improves overall efficiency by requiring less time to complete the same tasks. This study therefore argues that artificial intelligence allows human actors to focus on more meaningful tasks associated with quality control, while routine and especially technical tasks are delegated, for the most part, to the machine. This also leaves more time for creative aspects. Artificial intelligence and language technology are relevant not only for translating but also for interpreting, even if progress in this area is still somewhat slower and a paradigm shift has yet to occur. In this regard, conference interpreting is of particular interest. Problematically, the relevant software has so far lacked the cognitive, cultural, intellectual and emotional skills that inevitably underlie high-quality interpretation.
But at least it is possible to improve the interpreter's preparation for a meeting or conference; so far, the relevant tools play only a supporting role. With the further development of artificial intelligence, machine translation has recently made significant progress. Nevertheless, it is common practice to first subject the machine-translated text to review or proofreading by a human translator. Post-editing is the process of taking a machine-translated text as a basis and having it improved by a human translator, which means that the human translator ultimately creates the final translation.
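The post-editing workflow described above can be sketched minimally: a machine produces a draft, and the human translator's corrections yield the final text. The "MT engine" and all names below are hypothetical toy illustrations, not a real translation API.

```python
# Toy sketch of the post-editing workflow: machine draft, then human corrections.
# The glossary-driven "engine" is a deliberate simplification for illustration.

def machine_translate(source: str, glossary: dict[str, str]) -> str:
    """Toy word-for-word 'MT engine' driven by a fixed glossary."""
    return " ".join(glossary.get(word, word) for word in source.split())

def post_edit(draft: str, corrections: dict[str, str]) -> str:
    """Human post-editing pass: apply the translator's corrections to the draft."""
    return " ".join(corrections.get(word, word) for word in draft.split())

# Hypothetical example data: the machine handles part of the lexicon,
# and the human translator fixes what it left untranslated.
glossary = {"Haus": "house", "rot": "red"}
draft = machine_translate("das Haus ist rot", glossary)        # "das house ist red"
final = post_edit(draft, {"das": "the", "ist": "is"})          # "the house is red"
```

The point of the sketch is the division of labour: the final translation is the human's, with the machine output serving only as a starting point.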
•The study describes the problem of the fake news phenomenon in digital information.
•The study provides a systematic review of the state of the art in automatic fake news detection.
•From the review, the main subtasks involved in automatic fake news detection are identified and classified.
•The review covers systems, resources and competitions in automatic fake news detection.
•The review outlines knowledge gaps and future challenges related to automatic fake news detection.
Post-truth is a term that describes a distorting phenomenon that aims to manipulate public opinion and behavior. One of its key engines is the spread of fake news. Nowadays most news is rapidly disseminated in written language via digital media and social networks. Therefore, to detect fake news it is becoming increasingly necessary to apply Artificial Intelligence (AI) and, more specifically, Natural Language Processing (NLP). This paper presents a review of the application of AI to the complex task of automatically detecting fake news. The review begins with a definition and classification of fake news. Considering the complexity of the fake news detection task, a divide-and-conquer methodology was applied to identify a series of subtasks to tackle the problem from a computational perspective. As a result, the following subtasks were identified: deception detection; stance detection; controversy and polarization; automated fact checking; clickbait detection; and credibility scores. For each subtask, a PRISMA-compliant systematic review of the main studies was undertaken using Google Scholar. The various approaches and technologies are surveyed, as well as the resources and competitions that have been involved in resolving the different subtasks. The review concludes with a roadmap for addressing the future challenges that have emerged from the analysis of the state of the art, providing a rich source of potential work for the research community going forward.
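The divide-and-conquer idea can be sketched computationally: route a news item through several subtask scorers and aggregate their outputs into a single credibility score. The scorers below are hypothetical stubs for illustration, not the systems surveyed in the paper.

```python
# Sketch of divide-and-conquer fake news detection: each subtask contributes
# a score (1.0 = credible, 0.0 = fake) and the scores are averaged.

SUBTASKS = ["deception", "stance", "clickbait", "fact_checking"]

def score_item(item: str, scorers: dict) -> float:
    """Aggregate per-subtask credibility scores by simple averaging."""
    scores = [scorers[name](item) for name in SUBTASKS]
    return sum(scores) / len(scores)

# Hypothetical stub scorers standing in for real subtask models.
stub_scorers = {
    "deception": lambda text: 0.8,
    "stance": lambda text: 0.6,
    "clickbait": lambda text: 0.2 if text.isupper() else 0.9,  # all-caps reads as clickbait
    "fact_checking": lambda text: 0.7,
}

credibility = score_item("Local council approves new budget", stub_scorers)
```

Real systems would replace the stubs with trained models and use a learned combination rather than a plain average; the sketch only shows the decomposition.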
One of the most relevant areas of AI is Natural Language Processing (NLP). In this area, even though most large language models are currently multilingual, there is an important difference between the models' capabilities in English and in other languages. Thus, the AINA project aims at developing the necessary infrastructure so that the inclusion of Catalan in AI applications becomes appealing and feasible. This article presents the objectives of the project and explains its main characteristics.
This paper contains a large literature review in the research field of Text Summarisation (TS) based on Human Language Technologies (HLT). TS helps users manage the vast amount of information available, by condensing documents’ content and extracting the most relevant facts or topics included in them. The rapid development of emerging technologies poses new challenges to this research field, which still need to be solved. Therefore, it is essential to analyse its progress over the years, and provide an overview of the past, present and future directions, highlighting the main advances achieved and outlining remaining limitations. With this purpose, several important aspects are addressed within the scope of this survey. On the one hand, the paper aims at giving a general perspective on the state of the art, describing the main concepts, as well as different summarisation approaches, and relevant international forums. Furthermore, it is important to stress that the birth of new requirements and scenarios has led to new types of summaries with specific purposes (e.g. sentiment-based summaries), and novel domains for which TS has also proven suitable (e.g. blogs). In addition, TS is successfully combined with a number of intelligent systems based on HLT (e.g. information retrieval, question answering, and text classification). On the other hand, a deep study of the evaluation of summaries is also conducted in this paper, where the existing methodologies and systems are explained, as well as new research that has emerged concerning the automatic evaluation of summaries’ quality. Finally, some thoughts about TS in general and its future will encourage the reader to think of novel approaches, applications and lines of research to pursue in the coming years. The analysis of these issues gives the reader a wide and useful background on the most important aspects of this research field.
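The extractive family of summarisation approaches mentioned above can be sketched in a few lines: score each sentence by how frequent its words are in the whole document and keep the top-scoring ones. This is a toy illustration of the general idea, not any specific surveyed system.

```python
# Minimal extractive summarisation sketch: frequency-based sentence scoring.
from collections import Counter

def summarise(sentences: list[str], n: int = 1) -> list[str]:
    """Return the n sentences whose words are most frequent document-wide."""
    words = [w.lower().strip(".,") for s in sentences for w in s.split()]
    freq = Counter(words)

    def score(sentence: str) -> int:
        return sum(freq[w.lower().strip(".,")] for w in sentence.split())

    return sorted(sentences, key=score, reverse=True)[:n]

doc = [
    "Summarisation condenses documents.",
    "It extracts the most relevant facts from documents.",
    "The weather was pleasant.",
]
summary = summarise(doc)  # picks the sentence sharing most vocabulary with the document
```

State-of-the-art systems replace raw frequency with learned sentence representations, but the extract-and-rank structure is the same.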
The article emphasizes the critical importance of language generation today, particularly focusing on three key aspects: Multitasking, Multilinguality, and Multimodality, which are pivotal for the Natural Language Generation community. It delves into the activities conducted within the Multi3Generation COST Action (CA18231) and discusses current trends and future perspectives in language generation.
This paper presents a novel architecture for dealing with automatic fake news detection. The architecture factors in the discourse structure of news in traditional digital media and is based on two premises. First, fake news tends to mix true and false information with the purpose of confusing readers. Second, this research is focused on fake news delivered in traditional digital media, so our approach considers the influence of the journalistic structure of news, and the way journalists tend to introduce the essential content of a news story through the 5W1H answers (who, what, when, where, why, how). Considering both premises, this proposal deals with the news components separately, because some may be true and others false, instead of assigning a single veracity value to the news article as a unit. A two-layer architecture is proposed, comprising a Structure layer and a Veracity layer. To demonstrate the validity of the proposal, a new dataset was created and annotated with a new fine-grained annotation scheme (FNDeepML) that considers the different elements of the news document and their veracity. Due to the severity of the COVID-19 pandemic crisis, health is the chosen domain, and Spanish is the language used to validate the architecture, given the lack of research in this language. However, the proposal can be applied to any other language or domain. The Veracity layer of our proposal, which factors in the traditional news article structure and the 5W1H annotation, achieves F1 = 0.807. This represents a strong improvement over the baseline, which uses the whole document with a single veracity value and obtains F1 = 0.605. These findings validate the suitability and effectiveness of our approach.
•A novel Automatic Fake News detection proposal based on determining the veracity of the essential content of news articles.
•A new benchmark Spanish Fake News dataset focused on health news is presented.
•A new Fake News Detection architecture comprising two layers (Structure Layer and Veracity Layer) is presented.
•Each layer of the architecture involves a set of phases and each phase is thoroughly described.
•Performance of each layer of the architecture is measured and analyzed.
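The reported gain (F1 = 0.807 against the baseline's 0.605) uses the standard F1 score. As a reminder of how that metric is computed, here is a minimal sketch; the counts used below are hypothetical illustrations, not the paper's data.

```python
# Standard F1: harmonic mean of precision and recall, computed from
# true positives (tp), false positives (fp) and false negatives (fn).

def f1_score(tp: int, fp: int, fn: int) -> float:
    precision = tp / (tp + fp)   # fraction of positive predictions that are correct
    recall = tp / (tp + fn)      # fraction of actual positives that are found
    return 2 * precision * recall / (precision + recall)

# Hypothetical confusion counts for illustration only.
f1 = f1_score(tp=80, fp=20, fn=20)  # precision = recall = 0.8, so F1 = 0.8
```

Because F1 balances precision and recall, the improvement from 0.605 to 0.807 means the component-wise veracity approach is both finding more fake content and raising fewer false alarms than the whole-document baseline.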
Discovering the main features of virality patterns in Twitter is the focus of this research. Five trending topics related to the COVID-19 pandemic were selected for the study, with Spanish as the target language. To carry out the discovery of virality patterns, we applied opinion mining techniques that enable us to structure the information based on the polarity of the messages and the emotions they contain. After transforming the information from an unstructured textual representation to a structured one, data mining techniques were applied, specifically association rule mining. Both the message patterns with the highest virality (high shares and high likes) and the most relevant characteristics of the patterns with less impact were extracted. After an exhaustive analysis of the most relevant non-redundant rules, it can be concluded that messages with a highly negative polarity and a very high emotional charge, especially emotions that have intensified during the COVID-19 pandemic, such as fear, sadness, anger and surprise, are more likely to go viral in social media. By contrast, little news coverage in the media, few authors, and the absence of surprise are the relevant features of messages with very low dissemination in social media.
•A novel approach to extracting virality patterns from the social media platform Twitter is presented.
•Opinion mining extracts subjective content, transforming it into structured data.
•Association rule mining is applied to the structured data to extract virality patterns.
•Virality patterns were discovered for high shares/likes and low shares/likes.
•Extracted and relevant patterns were measured and analyzed.
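The association rule mining step described above rests on two standard measures, support and confidence, computed over the structured message features. A minimal sketch, using hypothetical transactions rather than the study's Twitter data:

```python
# Support and confidence for an association rule such as
# {negative, fear} -> {viral}, over feature sets extracted from messages.

def support(transactions: list[set], itemset: set) -> float:
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions: list[set], antecedent: set, consequent: set) -> float:
    """Of the transactions matching the antecedent, the fraction also matching the consequent."""
    return support(transactions, antecedent | consequent) / support(transactions, antecedent)

# Hypothetical message feature sets (polarity, emotion, virality outcome).
msgs = [
    {"negative", "fear", "viral"},
    {"negative", "fear", "viral"},
    {"negative", "sadness", "viral"},
    {"positive", "joy"},
]
conf = confidence(msgs, {"negative", "fear"}, {"viral"})
```

Rules with high support and confidence, such as the fear/negativity patterns reported in the study, are the ones extracted as characteristic of viral messages.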