Social media have become an integral part of our lives, expanding our interlinking capabilities to new levels. There is plenty to be said about their positive effects. On the other hand, however, ...some serious negative implications of social media have been repeatedly highlighted in recent years, pointing at various threats to society and its more vulnerable members, such as teenagers, in particular, ranging from much-discussed problems such as digital addiction and polarization to manipulative influences of algorithms and further to more teenager-specific issues (e.g., body stereotyping). The impact of social media-both at an individual and societal level-is characterized by the complex interplay between the users' interactions and the intelligent components of the platform. Thus, users' understanding of social media mechanisms plays a determinant role. We thus propose a theoretical framework based on an adaptive "
" for educating and supporting an entire community, teenage students, to interact in social media environments in order to achieve desirable conditions, defined in terms of a community-specific and participatory designed measure of Collective Well-Being (CWB). This Companion combines automatic processing with expert intervention and guidance. The virtual Companion will be powered by a
(
) that will optimize a
metric instead of engagement or platform profit, which currently largely drives recommender systems thereby disregarding any societal collateral effect. CWB-RS will optimize CWB both in the short term by balancing the level of social media threats the users are exposed to, and in the long term by adopting an
role and enabling adaptive and personalized sequencing of playful learning activities. We put an emphasis on
and
in the
of the Companion. They play five key roles: (a) use the Companion in classroom-based educational activities; (b) guide the definition of the CWB; (c) provide a hierarchical structure of learning strategies, objectives and activities that will support and contain the adaptive sequencing algorithms of the CWB-RS based on hierarchical reinforcement learning; (d) act as moderators of direct conflicts between the members of the community; and, finally, (e) monitor and address ethical and educational issues that are beyond the intelligent agent's competence and control. This framework offers a possible approach to understanding how to design social media systems and embedded educational interventions that favor a more healthy and positive society. Preliminary results on the performance of the Companion's components and studies of the educational and psychological underlying principles are presented.
Much recent effort has been devoted to creating large-scale language models. Nowadays, the most prominent approaches are based on deep neural networks, such as BERT. However, they lack transparency ...and interpretability, and are often seen as black boxes. This affects not only their applicability in downstream tasks but also the comparability of different architectures or even of the same model trained using different corpora or hyperparameters. In this paper, we propose a set of intrinsic evaluation tasks that inspect the linguistic information encoded in models developed for Brazilian Portuguese. These tasks are designed to evaluate how different language models generalise information related to grammatical structures and multiword expressions (MWEs), thus allowing for an assessment of whether the model has learned different linguistic phenomena. The dataset that was developed for these tasks is composed of a series of sentences with a single masked word and a cue phrase that helps in narrowing down the context. This dataset is divided into MWEs and grammatical structures, and the latter is subdivided into 6 tasks: impersonal verbs, subject agreement, verb agreement, nominal agreement, passive and connectors. The subset for MWEs was used to test BERTimbau Large, BERTimbau Base and mBERT. For the grammatical structures, we used only BERTimbau Large, because it yielded the best results in the MWE task. In both cases, we evaluated the results considering the best candidates and the top ten candidates. The evaluation was done both automatically (for MWEs) and manually (for grammatical structures). The results obtained for MWEs show that BERTimbau Large surpassed both the other models in predicting the correct masked element. However, the average accuracy of the best model was only 52% when only the best candidates were considered for each sentence, going up to 66% when the top ten candidates were taken into account. As for the grammatical tasks, the results presented better prediction, but also varied depending on the type of morphosyntactic agreement. On the one hand, cases such as connectors and impersonal verbs, which do not require any agreement in the produced candidates, had precision of 100% and 98.78% among the best candidates. On the other hand, tasks that require morphosyntactic agreement had results consistently below 90% overall precision, with the lowest scores being reported for nominal agreement and verb agreement, both having scores below 80% in overall precision among the best candidates. Therefore, we identified that a critical and widely adopted resource for Brazilian Portuguese NLP presents issues concerning MWE vocabulary and morphosyntactic agreement, even if it is prolific in most cases. These models are a core component in many NLP systems, and our findings demonstrate the need of additional improvements in these models and the importance of widely evaluating computational representations of language.
Full text
Available for:
EMUNI, FIS, FZAB, GEOZS, GIS, IJS, IMTLJ, KILJ, KISLJ, MFDPS, NLZOH, NUK, OILJ, PNG, SAZU, SBCE, SBJE, SBMB, SBNM, UKNU, UL, UM, UPUK, VKSCE, ZAGLJ
Natural language processing and other areas of artificial intelligence have seen staggering progress in recent years, yet much of this is reported with reference to somewhat limited benchmark ...datasets.
We see the deployment of these techniques in realistic use cases as the next step in this development. In particular, much progress is still needed in educational settings, which can strongly improve users’ safety on social media. We present our efforts to develop multi-modal machine learning algorithms to be integrated into a social media companion aimed at supporting and educating users in dealing with fake news and other social media threats.
Inside the companion environment, such algorithms can automatically assess and enable users to contextualize different aspects of their social media experience. They can estimate and display different characteristics of content in supported users’ feeds, such as ‘fakeness’ and ‘sentiment’, and suggest related alternatives to enrich users’ perspectives. In addition, they can evaluate the opinions, attitudes, and neighbourhoods of the users and of those appearing in their feeds. The aim of the latter process is to raise users’ awareness and resilience to filter bubbles and echo chambers, which are almost unnoticeable and rarely understood phenomena that may affect users’ information intake unconsciously and are unexpectedly widespread.
The social media environment is rapidly changing and complex. While our algorithms show state-of-the-art performance, they rely on task-specific datasets, and their reliability may decrease over time and be limited against novel threats. The negative impact of these limits may be exasperated by users’ over-reliance on algorithmic tools.
Therefore, companion algorithms and educational activities are meant to increase users’ awareness of social media threats while exposing the limits of such algorithms. This will also provide an educational example of the limits affecting the machine-learning components of social media platforms.
We aim to devise, implement and test the impact of the companion and connected educational activities in acquiring and supporting conscientious and autonomous social media usage.
Full text
Available for:
EMUNI, FIS, FZAB, GEOZS, GIS, IJS, IMTLJ, KILJ, KISLJ, MFDPS, NLZOH, NUK, OILJ, PNG, SAZU, SBCE, SBJE, SBMB, SBNM, UKNU, UL, UM, UPUK, VKSCE, ZAGLJ
Dans cet article, nous présentons une typologie de transformations syntaxiques permettant une adaptation des contenus textuels à destination d’enfants faibles lecteurs et dyslexiques. Pour arriver à ...cette proposition, nous avons analysé des textes parallèles originaux et adaptés. Nous avons aussi appliqué des transformations lexicales, morpho-syntaxiques et discursives à des corpus habituellement lus entre CE1 et CM1 que nous avons soumis à des enfants dans ces classes, tous profils confondus. Sur la base de ces deux études, nous avons défini une typologie de transformations syntaxiques, avec des informations supprimées, conservées ou ajoutées, qui pourra servir de guide pour adapter des textes et faciliter l’apprentissage de la lecture dans des cas d’enfants en difficulté.
Syntactic transformations to help learning to read : typology, adequacy and adapted corpora.
In this paper, we present a typology of syntactic transformations targeted at adapting textual contents addressed to poor-readers and dyslexic children. To make this proposition, we have analyzed a set of parallel texts (original and adapted). We have also applied lexical, morpho-syntactic and discursive transformations to corpora usually read at primary school (second to fourth grades). The different versions have been read by different reader profiles at school. Based on both studies, we have defined a typology of syntactic transformations, with deleted, kept or added information, that could be used as guidelines to adapt texts and facilitate reading to children facing difficulties.
Social media have become an integral part of our lives, expanding our interlinking capabilities to new levels. There is plenty to be said about their positive effects. On the other hand, however, ...some serious negative implications of social media have been repeatedly highlighted in recent years, pointing at various threats to society and its more vulnerable members, such as teenagers. We thus propose a theoretical framework based on an adaptive "Social Media Virtual Companion" for educating and supporting an entire community, teenage students, to interact in social media environments in order to achieve desirable conditions, defined in terms of a community-specific and participatory designed measure of Collective Well-Being (CWB). This Companion combines automatic processing with expert intervention and guidance. The virtual Companion will be powered by a Recommender System (CWB-RS) that will optimize a CWB metric instead of engagement or platform profit, which currently largely drives recommender systems thereby disregarding any societal collateral effect.We put an emphasis on experts and educators in the educationally managed social media community of the Companion. They play five key roles: (a) use the Companion in classroom-based educational activities; (b) guide the definition of the CWB; (c) provide a hierarchical structure of learning strategies, objectives and activities that will support and contain the adaptive sequencing algorithms of the CWB-RS based on hierarchical reinforcement learning; (d) act as moderators of direct conflicts between the members of the community; and, finally, (e) monitor and address ethical and educational issues that are beyond the intelligent agent's competence and control. Preliminary results on the performance of the Companion's components and studies of the educational and psychological underlying principles are presented.
The task of a Question Answering (QA) system is to automatically answer a question in natural language, searching for information in a given data source, such as structured databases or unstructured ...natural language documents. In this paper we investigate how much is needed in terms of tools and resources for a QA system for Brazilian Portuguese. In particular we assess the impact of shallow and deep methods and the contribution of different tools and resources in terms of the performance of the system in the task. We argue that a combination of deep and shallow strategies results in optimal performance, but that shallow methods alone may already give adequate results.
Much recent effort has been devoted to creating large-scale language models. Nowadays, the most prominent approaches are based on deep neural networks, such as BERT. However, they lack transparency ...and interpretability, and are often seen as black boxes. This affects not only their applicability in downstream tasks but also the comparability of different architectures or even of the same model trained using different corpora or hyperparameters. In this paper, we propose a set of intrinsic evaluation tasks that inspect the linguistic information encoded in models developed for Brazilian Portuguese. These tasks are designed to evaluate how different language models generalise information related to grammatical structures and multiword expressions (MWEs), thus allowing for an assessment of whether the model has learned different linguistic phenomena. The dataset that was developed for these tasks is composed of a series of sentences with a single masked word and a cue phrase that helps in narrowing down the context. This dataset is divided into MWEs and grammatical structures, and the latter is subdivided into 6 tasks: impersonal verbs, subject agreement, verb agreement, nominal agreement, passive and connectors. The subset for MWEs was used to test BERTimbau Large, BERTimbau Base and mBERT. For the grammatical structures, we used only BERTimbau Large, because it yielded the best results in the MWE task.
This article lies in the interface between specialized lexicography, terminology, technology and knowledge management, and it aims to present the preliminary results of the project GLOSSRI which ...concerns the development of a bilingual glossary (English -- Portuguese) for apprentices of the area of International Relations. The glossary, which will be available online, will have multimedia elements, will be collaborative and will be based on the needs of its users. In preparing the proposed glossary, the project uses theoretical contributions from two areas within linguistics, namely the Communicative Theory of Terminology and Function Theory, in Lexicography. For the construction of this project five methodological steps were established: (i) design, (ii) planning, (iii) development, (iv) adequacy, and (v) the socialization of knowledge. At the end of a year, 80 terms are expected to be included in the glossary. These terms are in a receptive position, that is, they are relevant to the students who need to read the literature in the English language recommended by their teachers in the first half of their university course. Adapted from the source document