Language and speech are the primary source of data for psychiatrists to diagnose and treat mental disorders. In psychosis, the very structure of language can be disturbed, including semantic coherence (e.g., derailment and tangentiality) and syntactic complexity (e.g., concreteness). Subtle disturbances in language are evident in schizophrenia even prior to first psychosis onset, during prodromal stages. Using computer‐based natural language processing analyses, we previously showed that, among English‐speaking clinical (e.g., ultra) high‐risk youths, baseline reduction in semantic coherence (the flow of meaning in speech) and in syntactic complexity could predict subsequent psychosis onset with high accuracy. Herein, we aimed to cross‐validate these automated linguistic analytic methods in a second larger risk cohort, also English‐speaking, and to discriminate speech in psychosis from normal speech. We identified an automated machine‐learning speech classifier – comprising decreased semantic coherence, greater variance in that coherence, and reduced usage of possessive pronouns – that had an 83% accuracy in predicting psychosis onset (intra‐protocol), a cross‐validated accuracy of 79% of psychosis onset prediction in the original risk cohort (cross‐protocol), and a 72% accuracy in discriminating the speech of recent‐onset psychosis patients from that of healthy individuals. The classifier was highly correlated with previously identified manual linguistic predictors. Our findings support the utility and validity of automated natural language processing methods to characterize disturbances in semantics and syntax across stages of psychotic disorder. The next steps will be to apply these methods in larger risk cohorts to further test reproducibility, also in languages other than English, and identify sources of variability. This technology has the potential to improve prediction of psychosis outcome among at‐risk youths and identify linguistic targets for remediation and preventive intervention. More broadly, automated linguistic analysis can be a powerful tool for diagnosis and treatment across neuropsychiatry.
Incoherent speech in schizophrenia has long been described as the mind making “leaps” of large distances between thoughts and ideas. Such a view seems intuitive, and for almost two decades, attempts to operationalize these conceptual “leaps” in spoken word meanings have used language-based embedding spaces. An embedding space represents the meaning of words as numerical vectors, where greater proximity between word vectors represents more shared meaning. However, word vector-based operationalizations of coherence have limitations that can reduce their appeal and utility in clinical practice. First, the use of esoteric word embeddings can be conceptually hard to grasp, and this is complicated by several different operationalizations of incoherent speech. This problem can be overcome by a better visualization of methods. Second, temporal information from the act of speaking has been largely neglected since models have been built using written text, yet speech is spoken in real time. This issue can be resolved by leveraging time-stamped transcripts of speech. Third, contextual information, namely the situation in which something is spoken, has often only been inferred and never explicitly modeled. Addressing this situational issue opens up new possibilities for models with increased temporal resolution and contextual relevance. In this paper, direct visualizations of semantic distances are used to enable the inspection of examples of incoherent speech. Some common operationalizations of incoherence are illustrated, and suggestions are made for how temporal and spatial contextual information can be integrated in future implementations of measures of incoherence.
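As a rough illustration of the embedding idea described in this abstract, the sketch below scores the "leap" between consecutive words as the cosine similarity of their vectors. The three-dimensional vectors, the word list, and the function names are invented for illustration; this is not the authors' implementation, and a real analysis would use a trained embedding model with far more dimensions.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# Real embedding spaces typically have hundreds of dimensions.
TOY_VECTORS = {
    "dog": (0.9, 0.1, 0.0),
    "cat": (0.8, 0.2, 0.0),
    "bone": (0.7, 0.0, 0.3),
    "tax": (0.0, 0.1, 0.9),
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def consecutive_similarities(words, vectors=TOY_VECTORS):
    """Similarity between each known word and the next known word."""
    known = [vectors[w] for w in words if w in vectors]
    return [cosine_similarity(a, b) for a, b in zip(known, known[1:])]


# A word sequence that stays on topic versus one with a semantic "leap".
coherent = consecutive_similarities(["dog", "cat", "bone"])
derailed = consecutive_similarities(["dog", "tax", "cat"])
print("coherent:", [round(s, 2) for s in coherent])
print("derailed:", [round(s, 2) for s in derailed])
```

Low values in the second list mark the positions where consecutive words share little meaning. With time-stamped transcripts, the same per-pair scores could in principle be aligned to the moments at which the words were spoken.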
From many perspectives, the election of Donald Trump was seen as a departure from long-standing political norms. An analysis of Trump’s word use in the presidential debates and speeches indicated that he was exceptionally informal but at the same time, spoke with a sense of certainty. Indeed, he is lower in analytic thinking and higher in confidence than almost any previous American president. Closer analyses of linguistic trends of presidential language indicate that Trump’s language is consistent with long-term linear trends, demonstrating that he is not as much an outlier as he initially seems. Across multiple corpora from the American presidents, non-US leaders, and legislative bodies spanning decades, there has been a general decline in analytic thinking and a rise in confidence in most political contexts, with the largest and most consistent changes found in the American presidency. The results suggest that certain aspects of the language style of Donald Trump and other recent leaders reflect long-evolving political trends. Implications of the changing nature of popular elections and the role of media are discussed.
Researchers and policy makers worldwide are interested in measuring the subjective well-being of populations. When users post on social media, they leave behind digital traces that reflect their thoughts and feelings. Aggregation of such digital traces may make it possible to monitor well-being at large scale. However, social media-based methods need to be robust to regional effects if they are to produce reliable estimates. Using a sample of 1.53 billion geotagged English tweets, we provide a systematic evaluation of word-level and data-driven methods for text analysis for generating well-being estimates for 1,208 US counties. We compared Twitter-based county-level estimates with well-being measurements provided by the Gallup-Sharecare Well-Being Index survey through 1.73 million phone surveys. We find that word-level methods (e.g., Linguistic Inquiry and Word Count [LIWC] 2015 and Language Assessment by Mechanical Turk [LabMT]) yielded inconsistent county-level well-being measurements due to regional, cultural, and socioeconomic differences in language use. However, removing as few as three of the most frequent words led to notable improvements in well-being prediction. Data-driven methods provided robust estimates, approximating the Gallup data at up to r = 0.64. We show that the findings generalized to county socioeconomic and health outcomes and were robust when poststratifying the samples to be more representative of the general US population. Regional well-being estimation from social media data seems to be robust when supervised data-driven methods are used.
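To make the word-level approach in this abstract more concrete, the following sketch computes a frequency-weighted lexicon score and shows how dropping the most frequent lexicon words changes it. The lexicon, its valence scores, and the `lexicon_score` function are hypothetical, and frequency is counted within a single token list for brevity; the study's own procedure and the actual LIWC or LabMT dictionaries are not reproduced here.

```python
from collections import Counter

# Hypothetical valence lexicon; the words and scores are invented for illustration.
TOY_LEXICON = {"love": 8.0, "good": 7.0, "lol": 6.5, "sad": 2.5, "hate": 2.0}


def lexicon_score(tokens, lexicon=TOY_LEXICON, drop_top_k=0):
    """Frequency-weighted mean valence of the lexicon words found in tokens.

    drop_top_k removes the k most frequent lexicon words before averaging,
    an assumed simplification of the frequent-word removal mentioned above.
    Returns None when no lexicon word remains.
    """
    counts = Counter(t for t in tokens if t in lexicon)
    dropped = {word for word, _ in counts.most_common(drop_top_k)}
    kept = {w: c for w, c in counts.items() if w not in dropped}
    total = sum(kept.values())
    if total == 0:
        return None
    return sum(lexicon[w] * c for w, c in kept.items()) / total


tokens = ["lol", "lol", "lol", "sad", "sad", "good", "the"]
print(lexicon_score(tokens))                # all lexicon words counted
print(lexicon_score(tokens, drop_top_k=1))  # most frequent word ("lol") removed
```

In this toy example a single very frequent word dominates the average, which is one way a word-level estimate can be skewed by regional differences in how a word is used.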
Throughout history, scholars and laypeople alike have believed that our words contain subtle clues about what we are like as people, psychologically speaking. However, the ways in which language has been used to infer psychological processes have seen dramatic shifts over time and, with modern computational technologies and digital data sources, we are on the verge of a massive revolution in language analysis research. In this article, we discuss the past and current states of research at the intersection of language analysis and psychology, summarizing the central successes and shortcomings of psychological text analysis to date. We additionally outline and discuss a critical need for language analysis practitioners in the social sciences to expand their view of verbal behavior. Lastly, we discuss the trajectory of interdisciplinary research on language and the challenges of integrating analysis methods across paradigms, recommending promising future directions for the field along the way.
Hate speech in social media is a complex phenomenon, whose detection has recently gained significant traction in the Natural Language Processing community, as attested by several recent review works. Annotated corpora and benchmarks are key resources, considering the vast number of supervised approaches that have been proposed. Lexica play an important role as well for the development of hate speech detection systems. In this review, we systematically analyze the resources made available by the community at large, including their development methodology, topical focus, language coverage, and other factors. The results of our analysis highlight a heterogeneous, growing landscape, marked by several issues and avenues for improvement.
This study aims to describe the structure of turn-taking used by university students and pupils, as well as classroom activities, with the application of a student-centered approach. This research is descriptive qualitative research. The data sources in this study are two voice recordings of university students' and pupils' speech in the Teaching Campus activities of the 3rd batch of placements at SD Negeri Dolopo 2, Madiun Regency. The data collection technique used in this research is the listening-recording technique. The results showed that, first, in the learning process of university students and pupils, there were four structures of turn-taking exchange: (1) direct management speech, (2) indirect management speech, (3) direct discipline requests, and (4) verbal reactions. Second, this study found active learning activities using a student-centered approach. The active learning activities are: (1) the teacher divides the students into several small groups, (2) the teacher gives topics that each group will discuss, (3) in groups, students discuss the topics that have been given, and (4) students present the results of group discussions.
Human ratings of conceptual disorganization, poverty of content, referential cohesion and illogical thinking have been shown to predict psychosis onset in prospective clinical high risk (CHR) cohort studies. The potential value of linguistic biomarkers has been significantly magnified, however, by recent advances in natural language processing (NLP) and machine learning (ML). Such methodologies allow for the rapid and objective measurement of language features, many of which are not easily recognized by human raters. Here we review the key findings on language production disturbance in psychosis. We also describe recent advances in the computational methods used to analyze language data, including methods for the automatic measurement of discourse coherence, syntactic complexity, poverty of content, referential coherence, and metaphorical language. Linguistic biomarkers of psychosis risk are now undergoing cross-validation, with attention to harmonization of methods. Future directions in extended CHR networks include studies of sources of variance, and combination with other promising biomarkers of psychosis risk, such as cognitive and sensory processing impairments likely to be related to language. Implications for the broader study of social communication, including reciprocal prosody, face expression and gesture, are discussed.
We used a distributed-language model to examine the moral language employed by U.S. political elites. In Study 1, we analyzed 687,360 Twitter messages (tweets) posted by accounts belonging to Democratic and Republican members of Congress from 2016 to 2018. In Study 2, we analyzed 2,630,688 speeches given on the floor of the House and Senate from 1981 to 2017. We found that partisan differences in moral-language use shifted over time as the parties gained or lost political power. Overall, lower political power was associated with greater use of moral language for both Democrats and Republicans. On Twitter, Democrats used more moral language in the period after Donald Trump won the 2016 presidential election. In Congressional transcripts, both Democrats and Republicans used more of most kinds of moral language when they were in the minority.
Deep learning for sentiment analysis
Rojas-Barahona, Lina Maria
Language and Linguistics Compass, 12/2016, Volume 10, Issue 12
Journal Article, Peer reviewed, Open access
Research and industry are becoming more and more interested in automatically finding the polarised opinion of the general public regarding a specific subject. The advent of social networks has opened the possibility of having access to massive blogs, recommendations, and reviews. The challenge is to extract the polarity from these data, which is a task of opinion mining or sentiment analysis. The specific difficulties inherent in this task include issues related to subjective interpretation and linguistic phenomena that affect the polarity of words. Recently, deep learning has become a popular method of addressing this task. However, different approaches have been proposed in the literature. This article provides an overview of deep learning for sentiment analysis in order to place these approaches in context.