The technology for evaluating patient-provider interactions in psychotherapy, observational coding, has not changed in 70 years. It is labor-intensive, error-prone, and expensive, limiting its use in evaluating psychotherapy in the real world. Engineering solutions from speech and language processing provide new methods for the automatic evaluation of provider ratings from session recordings. The primary data are 200 Motivational Interviewing (MI) sessions from a study on MI training methods with observer ratings of counselor empathy. Automatic Speech Recognition (ASR) was used to transcribe sessions, and the resulting words were used in a text-based predictive model of empathy. Two supporting datasets trained the speech processing tasks including ASR (1200 transcripts from heterogeneous psychotherapy sessions and 153 transcripts and session recordings from 5 MI clinical trials). The accuracy of computationally-derived empathy ratings was evaluated against human ratings for each provider. Computationally-derived empathy scores and classifications (high vs. low) were highly accurate against human-based codes and classifications, with a correlation of 0.65 and F-score (the harmonic mean of precision and recall) of 0.86, respectively. Empathy prediction using human transcription as input (as opposed to ASR) resulted in a slight increase in prediction accuracies, suggesting that the fully automatic system with ASR is relatively robust. Using speech and language processing methods, it is possible to generate accurate predictions of provider performance in psychotherapy from audio recordings alone. This technology can support large-scale evaluation of psychotherapy for dissemination and process studies.
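The two agreement measures named in this abstract, a correlation for continuous empathy scores and an F-score for the high vs. low classification, can be sketched as follows. This is only an illustration: the ratings, the cut-off of 4.0, and the function names are invented for the example and are not taken from the study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def f_score(truth, pred, positive="high"):
    """F1: harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(truth, pred))
    fp = sum(t != positive and p == positive for t, p in zip(truth, pred))
    fn = sum(t == positive and p != positive for t, p in zip(truth, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical session-level empathy ratings (toy data, not study data).
human = [5.5, 3.0, 6.0, 2.5, 4.5, 6.5]
model = [5.0, 4.5, 5.5, 3.0, 3.5, 6.0]

CUTOFF = 4.0  # assumed threshold separating "high" from "low" empathy
human_cls = ["high" if s >= CUTOFF else "low" for s in human]
model_cls = ["high" if s >= CUTOFF else "low" for s in model]

print("correlation:", round(pearson_r(human, model), 3))
print("F-score:", round(f_score(human_cls, model_cls), 3))
```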
Qualitative data analysis software (QDAS) programs are well-established research tools, but little is known about how researchers use them. This article reports the results of a content analysis of 763 empirical articles, indexed in the Scopus database and published between 1994 and 2013, which explored how researchers use the ATLAS.ti™ and NVivo™ QDAS programs. The analysis specifically investigated who is using these tools (in terms of subject discipline and author country of origin), and how they are being used to support research (in terms of type of data, type of study, and phase of the research process that QDAS was used to support). The study found that the number of articles reporting QDAS is increasing each year, and that the majority of studies using ATLAS.ti™ and NVivo™ were published in health sciences journals by authors from the United Kingdom, United States, Netherlands, Canada, and Australia. Researchers used QDAS to support a variety of research designs and most commonly used the programs to support analyses of data gathered through interviews, focus groups, documents, field notes, and open-ended survey questions. Although QDAS can support multiple phases of the research process, the study found the vast majority of researchers are using it for data management and analysis, with fewer using it for data collection/creation or to visually display their methods and findings. This article concludes with some discussion of the extent to which QDAS users appear to have leveraged the potential of these programs to support new approaches to research.
Mobile apps for mental health have the potential to overcome access barriers to mental health care, but there is little information on whether patients use the interventions as intended and the impact they have on mental health outcomes.
The objective of our study was to document and compare use patterns and clinical outcomes across the United States between 3 different self-guided mobile apps for depression.
Participants were recruited through Web-based advertisements and social media and were randomly assigned to 1 of 3 mood apps. Treatment and assessment were conducted remotely on each participant's smartphone or tablet with minimal contact with study staff. We enrolled 626 English-speaking adults (≥18 years old) with mild to moderate depression as determined by a 9-item Patient Health Questionnaire (PHQ-9) score ≥5, or if their score on item 10 was ≥2. The apps were (1) Project: EVO, a cognitive training app theorized to mitigate depressive symptoms by improving cognitive control, (2) iPST, an app based on an evidence-based psychotherapy for depression, and (3) Health Tips, a treatment control. Outcomes were scores on the PHQ-9 and the Sheehan Disability Scale. Adherence to treatment was measured as number of times participants opened and used the apps as instructed.
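The enrollment rule described above (adult, and either a PHQ-9 total of at least 5 or an item 10 score of at least 2) can be written as a small check. This is a sketch, not the study's screening code: it assumes each of the nine PHQ-9 items and the additional item 10 are coded 0 to 3, and the function names are hypothetical.

```python
def phq9_total(items):
    """Sum the nine PHQ-9 items, assuming each is coded 0-3."""
    if len(items) != 9 or any(i not in (0, 1, 2, 3) for i in items):
        raise ValueError("expected nine item scores, each in 0..3")
    return sum(items)

def meets_inclusion(items, item10, age):
    """Inclusion rule as stated in the abstract: adult AND
    (PHQ-9 total >= 5 OR item 10 score >= 2)."""
    return age >= 18 and (phq9_total(items) >= 5 or item10 >= 2)

# Hypothetical respondents (invented for illustration).
print(meets_inclusion([1, 1, 1, 1, 1, 0, 0, 0, 0], item10=0, age=30))  # total 5
print(meets_inclusion([1, 0, 0, 0, 0, 0, 0, 0, 0], item10=2, age=45))  # item 10 rule
print(meets_inclusion([3, 3, 3, 3, 3, 3, 3, 3, 3], item10=3, age=17))  # under 18
```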
We randomly assigned 211 participants to iPST, 209 to Project: EVO, and 206 to Health Tips. Among the participants, 77.0% (482/626) had a PHQ-9 score >10 (moderately depressed). Among the participants using the 2 active apps, 57.9% (243/420) did not download their assigned intervention app but did not differ demographically from those who did. Differential treatment effects were present in participants with baseline PHQ-9 score >10, with the cognitive training and problem-solving apps resulting in greater effects on mood than the information control app (χ²(2)=6.46, P=.04).
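The reported figures can be checked with a few lines of arithmetic. The sketch below assumes the test statistic is a chi-square with 2 degrees of freedom, for which the upper-tail probability has the closed form exp(-x/2); the helper name is made up for the example.

```python
from math import exp

def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of
    freedom. For df = 2 the survival function is exactly exp(-x / 2)."""
    return exp(-x / 2)

# Group sizes and percentages as reported in the abstract.
arms = {"iPST": 211, "Project: EVO": 209, "Health Tips": 206}
total = sum(arms.values())                      # all randomized participants
active = arms["iPST"] + arms["Project: EVO"]    # the two active apps

print("total randomized:", total)
print("PHQ-9 > 10 (%):", round(100 * 482 / total, 1))
print("did not download (%):", round(100 * 243 / active, 1))
print("P value:", round(chi2_sf_df2(6.46), 3))
```

Under that assumption the P value rounds to .04, which matches the reported value.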
Mobile apps for depression appear to have their greatest impact on people with more moderate levels of depression. In particular, an app that is designed to engage cognitive correlates of depression had the strongest effect on depressed mood in this sample. This study suggests that mobile apps reach many people and are useful for more moderate levels of depression.
Clinicaltrials.gov NCT00540865; https://www.clinicaltrials.gov/ct2/show/NCT00540865 (Archived by WebCite at http://www.webcitation.org/6mj8IPqQr).
During a psychotherapy session, the counselor typically adopts techniques which are codified along specific dimensions (e.g., 'displays warmth and confidence', or 'attempts to set up collaboration') to facilitate the evaluation of the session. Those constructs, traditionally scored by trained human raters, reflect the complex nature of psychotherapy and highly depend on the context of the interaction. Recent advances in deep contextualized language models offer an avenue for accurate in-domain linguistic representations which can lead to robust recognition and scoring of such psychotherapy-relevant behavioral constructs, and support quality assurance and supervision. In this work, we propose a BERT-based model for automatic behavioral scoring of a specific type of psychotherapy, called Cognitive Behavioral Therapy (CBT), where prior work is limited to frequency-based language features and/or short text excerpts which do not capture the unique elements involved in a spontaneous long conversational interaction. The model focuses on the classification of therapy sessions with respect to the overall score achieved on the widely-used Cognitive Therapy Rating Scale (CTRS), but is trained in a multi-task manner in order to achieve higher interpretability. BERT-based representations are further augmented with available therapy metadata, providing relevant non-linguistic context and leading to consistent performance improvements. We train and evaluate our models on a set of 1,118 real-world therapy sessions, recorded and automatically transcribed. Our best model achieves an F1 score equal to 72.61% on the binary classification task of low vs. high total CTRS.
Marital and family researchers often study infrequent behaviors. These powerful psychological variables, such as abuse, criticism, and drug use, have important ramifications for families and society as well as for the statistical models used to study them. Most researchers continue to rely on ordinary least-squares (OLS) regression for these types of data, but estimates and inferences from OLS regression can be seriously biased for count data such as these. This article presents a tutorial on statistical methods for positively skewed event data, including Poisson, negative binomial, zero-inflated Poisson, and zero-inflated negative binomial regression models. These statistical methods are introduced through a marital commitment example, and the data and computer code to run the example analyses in R, SAS, SPSS, and Mplus are included in the online supplemental material. Extensions and practical advice are given to assist researchers in using these tools with their data.
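Before choosing among the count models listed above, it can help to look at two simple descriptive checks: whether the variance exceeds the mean (overdispersion, which points toward a negative binomial model) and whether there are more zeros than a Poisson distribution with the same mean would produce (which points toward a zero-inflated model). The sketch below is a rough diagnostic on invented counts, not the tutorial's own code or data.

```python
from math import exp
from statistics import mean, variance

def count_diagnostics(counts):
    """Descriptive checks for a list of non-negative event counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("all counts are zero; nothing to diagnose")
    v = variance(counts)  # sample variance
    return {
        "mean": m,
        "variance": v,
        "dispersion": v / m,                        # about 1 under a Poisson model
        "observed_zero_rate": counts.count(0) / len(counts),
        "poisson_zero_rate": exp(-m),               # P(Y = 0) for Poisson(mean)
    }

# Hypothetical counts of an infrequent behavior (toy data).
counts = [0, 0, 0, 0, 0, 0, 1, 0, 2, 0, 0, 7, 0, 1, 12]
for name, value in count_diagnostics(counts).items():
    print(f"{name}: {value:.3f}")
```

A dispersion ratio well above 1, or an observed zero rate well above the Poisson-implied rate, suggests that a plain Poisson model (and OLS even more so) is a poor fit for data of this shape.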
• Mild Parkinson’s can be distinguished from no Parkinson’s using voice features.
• Identity confounding leads to overestimates of model performance.
• Gradient boosted model outperforms Random Forest and Logistic Regression models.
Voice technology has grown tremendously in recent years and using voice as a biomarker has also been gaining evidence. We demonstrate the potential of voice in serving as a deep phenotype for Parkinson’s Disease (PD), the second most common neurodegenerative disorder worldwide, by presenting methodology for voice signal processing for clinical analysis. Detection of PD symptoms typically requires an exam by a movement disorder specialist and can be hard to access and inconsistent in findings. A vocal digital biomarker could supplement the cumbersome existing manual exam by detecting and quantifying symptoms to guide treatment. Specifically, vocal biomarkers of PD are a potentially effective method of assessing symptoms and severity in daily life, which is the focus of the current research. We analyzed a database of PD patients and non-PD subjects containing voice recordings that were used to extract paralinguistic features, which served as inputs to machine learning models to predict PD severity. The results are presented here and the limitations are discussed given the nature of the recordings. We note that our methodology only advances biomarker research and is not cleared for clinical use. Specifically, we demonstrate that conventional machine learning models applied to voice signals can be used to differentiate participants with PD who exhibit little to no symptoms from healthy controls. This work highlights the potential of voice to be used for early detection of PD and indicates that voice may serve as a deep phenotype for PD, enabling precision medicine by improving the speed, accuracy, accessibility, and cost of PD management.
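Identity confounding arises when recordings from the same person appear in both the training and the test data, so a model can score well by recognizing speakers rather than symptoms. One common guard, shown here as a sketch and not necessarily the procedure used in this study, is to split by speaker instead of by recording. The record layout, field names, and function name are assumptions made for the example.

```python
import random

def split_by_speaker(records, test_fraction=0.25, seed=0):
    """Split recordings so that every speaker lands entirely in the
    training set or entirely in the test set."""
    speakers = sorted({r["speaker"] for r in records})
    rng = random.Random(seed)          # fixed seed for a reproducible split
    rng.shuffle(speakers)
    n_test = max(1, round(len(speakers) * test_fraction))
    test_ids = set(speakers[:n_test])
    train = [r for r in records if r["speaker"] not in test_ids]
    test = [r for r in records if r["speaker"] in test_ids]
    return train, test

# Hypothetical dataset: 8 speakers with 3 recordings each.
records = [
    {"speaker": f"s{i}", "recording": j, "label": "PD" if i % 2 else "control"}
    for i in range(8)
    for j in range(3)
]

train, test = split_by_speaker(records)
print(len(train), "training recordings;", len(test), "test recordings")
print("shared speakers:", {r["speaker"] for r in train} & {r["speaker"] for r in test})
```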
Advances in artificial intelligence (AI) are enabling systems that augment and collaborate with humans to perform simple, mechanistic tasks such as scheduling meetings and grammar-checking text. However, such human–AI collaboration poses challenges for more complex tasks, such as carrying out empathic conversations, due to the difficulties that AI systems face in navigating complex human emotions and the open-ended nature of these tasks. Here we focus on peer-to-peer mental health support, a setting in which empathy is critical for success, and examine how AI can collaborate with humans to facilitate peer empathy during textual, online supportive conversations. We develop HAILEY, an AI-in-the-loop agent that provides just-in-time feedback to help participants who provide support (peer supporters) respond more empathically to those seeking help (support seekers). We evaluate HAILEY in a non-clinical randomized controlled trial with real-world peer supporters on TalkLife (N = 300), a large online peer-to-peer support platform. We show that our human–AI collaboration approach leads to a 19.6% increase in conversational empathy between peers overall. Furthermore, we find a larger, 38.9% increase in empathy within the subsample of peer supporters who self-identify as experiencing difficulty providing support. We systematically analyse the human–AI collaboration patterns and find that peer supporters are able to use the AI feedback both directly and indirectly without becoming overly reliant on AI while reporting improved self-efficacy post-feedback. Our findings demonstrate the potential of feedback-driven, AI-in-the-loop writing systems to empower humans in open-ended, social and high-stakes tasks such as empathic conversations.
AI language modelling and generation approaches have developed fast in the last decade, opening promising new directions in human–AI collaboration. An AI-in-the-loop conversational system called HAILEY is developed to empower peer supporters in providing empathic responses to mental health support seekers.
Motivational interviewing (MI) is an efficacious treatment for substance use disorders and other problem behaviors. Studies on MI fidelity and mechanisms of change typically use human raters to code therapy sessions, which requires considerable time, training, and financial costs. Natural language processing techniques have recently been utilized for coding MI sessions using machine learning techniques, rather than human coders, and preliminary results have suggested these methods hold promise. The current study extends this previous work by introducing two natural language processing models for automatically coding MI sessions via computer. The two models differ in the way they semantically represent session content, utilizing either 1) simple discrete sentence features (DSF model) or 2) more complex recursive neural networks (RNN model). Utterance- and session-level predictions from these models were compared to ratings provided by human coders using a large sample of MI sessions (N = 341 sessions; 78,977 clinician and client talk turns) from 6 MI studies. Results show that the DSF model generally had slightly better performance compared to the RNN model. The DSF model had “good” or higher utterance-level agreement with human coders (Cohen's kappa > 0.60) for open and closed questions, affirm, giving information, and follow/neutral (all therapist codes); considerably higher agreement was obtained for session-level indices, and many estimates were competitive with human-to-human agreement. However, there was poor agreement for client change talk, client sustain talk, and therapist MI-inconsistent behaviors. Natural language processing methods provide accurate representations of human-derived behavioral codes and could offer substantial improvements to the efficiency and scale at which MI mechanisms of change research and fidelity monitoring are conducted.
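Cohen's kappa, the utterance-level agreement statistic cited above, corrects raw agreement for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The following sketch computes it for two coders; the code labels and utterances are toy data made up for illustration, not data from the study.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equal-length sequences of categorical codes."""
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: product of each coder's marginal proportions per code.
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / n ** 2
    if expected == 1:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)

# Hypothetical utterance codes from a human coder and a model.
human = ["open", "closed", "affirm", "open", "closed", "open", "affirm", "closed"]
model = ["open", "closed", "affirm", "closed", "closed", "open", "open", "closed"]

print("kappa:", round(cohens_kappa(human, model), 3))
```

In this toy example the coders agree on 6 of 8 utterances, but the chance-corrected value is lower than the raw 0.75 agreement because some of that agreement would occur by chance alone.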
IMPORTANCE: Accessible and cost-effective interventions for suicidality are needed to address high rates of suicidal behavior among military service members. Caring Contacts are brief periodic messages that express unconditional care and concern and have been previously shown to prevent suicide deaths, attempts, ideation, and hospitalizations. OBJECTIVE: To test the effectiveness of augmenting standard military health care with Caring Contacts delivered via text message to reduce suicidal thoughts and behaviors over 12 months. DESIGN, SETTING, AND PARTICIPANTS: This randomized clinical trial was conducted at 3 military installations in the southern and western United States. Soldiers and Marines identified as being at risk of suicide were recruited between April 2013 and September 2016. The final follow-up was in September 2017. INTERVENTIONS: Both groups received standard care, and the Caring Contacts group also received 11 text messages delivered on day 1, at week 1, at months 1, 2, 3, 4, 6, 8, 10, and 12, and on participants’ birthdays. MAIN OUTCOMES AND MEASURES: Primary outcomes were current suicidal ideation and suicide risk incidents (hospitalization or medical evacuation). Secondary outcomes were worst-point suicidal ideation, emergency department visits, and suicide attempts. Suicidal ideation was measured by the Scale for Suicide Ideation; suicide risk incidents and emergency department visits by the Treatment History Interview; and attempted suicide by the Suicide Attempt Self-Injury Count. RESULTS: Among 658 randomized participants (329 randomly assigned to each group), data were analyzed for 657 individuals (mean [SD] age, 25.2 [6.1] years; 539 men [82.0%]). All participants reported suicidal ideation at baseline, and 291 (44.3%) had previously attempted suicide. Of the 657 participants, 461 (70.2%) were assessed at 12 months. Primary outcomes were nonsignificant. There was no significant effect on likelihood or severity of current suicidal ideation or likelihood of a suicide risk incident; there was also no effect on emergency department visits. However, participants who received Caring Contacts (172 of 216 participants [79.6%]) had lower odds than those receiving standard care alone (179 of 204 participants [87.7%]) of experiencing any suicidal ideation between baseline and follow-up (odds ratio, 0.56 [95% CI, 0.33-0.95]; P = .03) and fewer had attempted suicide since baseline (21 of 233 [9.0%] in the group receiving Caring Contacts vs 34 of 228 [14.9%] in the standard-care group; odds ratio, 0.52 [95% CI, 0.29-0.92]; P = .03). CONCLUSIONS AND RELEVANCE: This trial provides inconsistent results on the effectiveness of caring text messages between primary and secondary outcomes, but this inexpensive and scalable intervention offers promise for preventing suicide attempts and ideation in military personnel. Additional research is needed. TRIAL REGISTRATION: ClinicalTrials.gov identifier: NCT01829620
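For orientation, a crude (unadjusted) odds ratio with a Wald 95% confidence interval can be computed directly from the counts given in the results. This is a sketch with a hypothetical helper name; the crude values it produces differ somewhat from the odds ratios reported in the abstract, which presumably come from the trial's own statistical models (an assumption, since the abstract does not describe them).

```python
from math import exp, log, sqrt

def crude_odds_ratio(events_a, n_a, events_b, n_b, z=1.96):
    """Unadjusted odds ratio (group A vs group B) with a Wald interval."""
    a, b = events_a, n_a - events_a      # group A: events, non-events
    c, d = events_b, n_b - events_b      # group B: events, non-events
    odds_ratio = (a / b) / (c / d)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Counts as reported: Caring Contacts vs standard care alone.
ideation = crude_odds_ratio(172, 216, 179, 204)   # any suicidal ideation
attempts = crude_odds_ratio(21, 233, 34, 228)     # suicide attempt since baseline

for name, (or_, lo, hi) in {"ideation": ideation, "attempts": attempts}.items():
    print(f"{name}: OR {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```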
Critical research questions in the study of addictive behaviors concern how these behaviors change over time: either as the result of intervention or in naturalistic settings. The combination of count outcomes that are often strongly skewed with many zeroes (e.g., days using, number of total drinks, number of drinking consequences) with repeated assessments (e.g., longitudinal follow-up after intervention or daily diary data) presents challenges for data analysis. The current article provides a tutorial on methods for analyzing longitudinal substance use data, focusing on Poisson, zero-inflated, and hurdle mixed models, which are types of hierarchical or multilevel models. Two example datasets are used throughout, focusing on drinking-related consequences following an intervention and daily drinking over the past 30 days, respectively. Both datasets as well as R, SAS, Mplus, Stata, and SPSS code showing how to fit the models are available on a supplemental website.
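The difference between zero-inflated and hurdle models lies in how zeros are generated. A zero-inflated Poisson mixes a point mass at zero with an ordinary Poisson (so zeros come from two sources), while a hurdle model uses one process for zero vs. non-zero and a zero-truncated Poisson for the positive counts. The sketch below shows only these probability functions with arbitrary parameter values; it does not include the random effects that make the tutorial's models "mixed", and it is not the tutorial's code.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson distribution with rate lam."""
    return exp(-lam) * lam ** k / factorial(k)

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: with probability pi a structural zero,
    otherwise a Poisson(lam) draw (which may also be zero)."""
    base = (1 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

def hurdle_pmf(k, lam, pi):
    """Hurdle Poisson: zero with probability pi; positive counts follow
    a Poisson(lam) truncated at zero."""
    if k == 0:
        return pi
    return (1 - pi) * poisson_pmf(k, lam) / (1 - exp(-lam))

lam, pi = 3.0, 0.4  # arbitrary illustrative values
for k in range(4):
    print(k, round(poisson_pmf(k, lam), 4),
          round(zip_pmf(k, lam, pi), 4),
          round(hurdle_pmf(k, lam, pi), 4))
```

With the same `pi`, the zero-inflated model puts more mass at zero than the hurdle model, because the Poisson component contributes additional zeros on top of the structural ones.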