Machine learning is increasingly used in mental health research and has the potential to advance our understanding of how to characterize, predict, and treat mental disorders and associated adverse health outcomes (e.g., suicidal behavior). Machine learning offers new tools to overcome challenges for which traditional statistical methods are not well-suited. This paper provides an overview of machine learning with a specific focus on supervised learning (i.e., methods that are designed to predict or classify an outcome of interest). Several common supervised learning methods are described, along with applied examples from the published literature. We also provide an overview of supervised learning model building, validation, and performance evaluation. Finally, challenges in creating robust and generalizable machine learning algorithms are discussed.
• Machine learning may help characterize and predict psychiatric outcomes
• Description of commonly used supervised learning methods
• Introduction to model building, validation, and evaluation of algorithms
• Discussion of challenges and opportunities to move the field forward
Objective
Depression is one of the most common mental disorders in the United States in both civilian and military populations, but few prospective studies assess a wide range of predictors across multiple domains for new‐onset (incident) depression in adulthood. Supervised machine learning methods can identify predictors of incident depression out of many different candidate variables, without some of the assumptions and constraints that underlie traditional regression analyses. The objectives of this study were to identify predictors of incident depression across 5 years of follow‐up using machine learning, and to assess prediction accuracy of the algorithms.
Methods
Data were from a cohort of Army National Guard members free of history of depression at baseline (n = 1951 men and 298 women), interviewed once per year for probable depression. Classification trees and random forests were constructed and cross‐validated, using 84 candidate predictors from the baseline interviews.
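To make the idea of a cross-validated classifier concrete, here is a minimal sketch of k-fold cross-validation in plain Python. It is not the study's procedure: a one-split "decision stump" stands in for the classification trees and random forests, and the function names, the binary-feature encoding, and the choice of k = 5 are all assumptions made for illustration.

```python
import random


def fit_stump(rows, labels):
    """Fit a one-split tree: choose the binary feature (and polarity)
    with the fewest training errors. Hypothetical stand-in for a full tree."""
    best = None
    for j in range(len(rows[0])):
        for polarity in (0, 1):
            # The stump predicts 1 when feature j equals `polarity`.
            errors = sum(
                (1 if r[j] == polarity else 0) != y for r, y in zip(rows, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, j, polarity)
    _, j, polarity = best
    return lambda r: 1 if r[j] == polarity else 0


def cross_validated_accuracy(rows, labels, k=5, seed=0):
    """Estimate out-of-sample accuracy by k-fold cross-validation."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
    correct = 0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Fit only on the training folds, then score the held-out fold.
        model = fit_stump([rows[i] for i in train], [labels[i] for i in train])
        correct += sum(model(rows[i]) == labels[i] for i in held_out)
    return correct / len(rows)
```

Each observation is scored by a model that never saw it during fitting, which is why a cross-validated accuracy is a more honest estimate than accuracy computed on the training data.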
Results
Stressors and traumas such as emotional mistreatment and adverse childhood experiences, demographics such as being a parent or student, and military characteristics including paygrade and deployment location were predictive of probable depression. Cross‐validated random forest algorithms were moderately accurate (68% for women and 73% for men).
Conclusions
Events and characteristics throughout the life course, both in and outside of deployment, predict incident depression in adulthood among military personnel. Although replication studies are needed, these results may help inform potential intervention targets to reduce depression incidence among military personnel. Future research should further refine and explore interactions between identified variables.
HIGHLIGHTS
In a cohort study of U.S. Army National Guard personnel, we found that stressors and traumas such as emotional mistreatment and adverse childhood experiences, demographics such as being a parent or student, and military characteristics including paygrade and deployment location were predictive of new‐onset depression.
Cross‐validated random forest algorithms were moderately accurate for predicting depression.
Among military personnel, events and characteristics throughout the life course—in and outside of deployment—predict new‐onset depression in adulthood.
Objective: Directed acyclic graphs (DAGs) are visual representations of the presumed causal structure of an empirical research data set. They are important tools for researchers but have been used rarely in the psychological trauma literature. The purpose of this article is to explain what DAGs are and why (and how) they are useful for trauma researchers. Method: We first describe the utility of DAGs for making causal assumptions explicit, identifying causal effects, and preventing bias. Basic definitions and rules governing the use of DAGs are presented using a hypothetical DAG. We explain why conditioning on a variable, for example, by controlling for it in a multivariable model, can in some circumstances actually introduce bias and not prevent it. We also provide references for topics related to DAGs that are beyond the scope of this introductory article. Results: DAGs are illustrated using the example of the effect of posttraumatic stress disorder (PTSD) on Parkinson's disease. We demonstrate that a multivariable model controlling for all covariates that are being considered introduces bias and would make it impossible to identify the causal effect of PTSD on Parkinson's disease. Conclusions: DAGs can help trauma researchers to understand when they can and when they cannot draw causal conclusions based on research data. This introduction to DAGs should help readers understand their use in the articles on marginal structural models, causal mediation analysis, and instrumental variable methods in this special section, Causal inference and agent-based modeling in trauma research.
Clinical Impact Statement
This article is an introduction to directed acyclic graphs (DAGs). DAGs are a visual depiction of a researcher's assumptions about causal effects in a group of variables. They can help prevent research bias by showing which variables should and which should not be controlled in statistical analyses. In this article, DAGs are illustrated through an example of the effect of posttraumatic stress disorder on Parkinson's disease. Readers of this article should be able to understand what a DAG is and what it is useful for.
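The claim that conditioning on a variable can introduce bias is easiest to see with a collider, a variable caused by both the exposure and the outcome. The simulation below is a hedged sketch, not taken from the article: the variables `x`, `y`, and `c` are hypothetical, the true effect of `x` on `y` is set to zero, and conditioning is done by restricting to one stratum of the collider (`c > 1`) instead of fitting a multivariable model.

```python
import random


def mean(values):
    return sum(values) / len(values)


def slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_collider_bias(n=20000, seed=1):
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]  # exposure
    y = [rng.gauss(0, 1) for _ in range(n)]  # outcome, independent of x
    # Collider: caused by both exposure and outcome (x -> c <- y).
    c = [a + b + rng.gauss(0, 1) for a, b in zip(x, y)]
    crude = slope(x, y)
    # Conditioning on the collider by restricting to the stratum c > 1.
    kept = [(a, b) for a, b, s in zip(x, y, c) if s > 1]
    conditioned = slope([a for a, _ in kept], [b for _, b in kept])
    return crude, conditioned
```

Under these assumptions the crude slope should sit near zero, the true effect, while the slope within the collider stratum should be clearly negative, a spurious association created purely by the conditioning.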
This is an introduction to the special section “Causal Inference and Agent-Based Modeling” in trauma research. (PsycInfo Database Record (c) 2023 APA, all rights reserved)
Suicide is a major public health concern in the United States. Between 2000 and 2018, US suicide rates increased by 35%, contributing to the stagnation and subsequent decrease in US life expectancy. During 2019, suicide declined modestly, mostly owing to slight reductions in suicides among Whites. Suicide rates, however, continued to increase or remained stable among all other racial and ethnic groups, and little is known about recent suicide trends among other vulnerable groups. This article (a) summarizes US suicide mortality trends over the twentieth and early twenty-first centuries, (b) reviews potential group-level causes of increased suicide risk among subpopulations characterized by markers of vulnerability to suicide, and (c) advocates for combining recent advances in population-based suicide prevention with a socially conscious perspective that captures the social, economic, and political contexts in which suicide risk unfolds over the life course of vulnerable individuals.
Abstract
Although variables are often measured with error, the impact of measurement error on machine-learning predictions is seldom quantified. The purpose of this study was to assess the impact of measurement error on the performance of random-forest models and variable importance. First, we assessed the impact of misclassification (i.e., measurement error of categorical variables) of predictors on random-forest model performance (e.g., accuracy, sensitivity) and variable importance (mean decrease in accuracy) using data from the National Comorbidity Survey Replication (2001–2003). Second, we created simulated data sets in which we knew the true model performance and variable importance measures and could verify that quantitative bias analysis was recovering the truth in misclassified versions of the data sets. Our findings showed that measurement error in the data used to construct random forests can distort model performance and variable importance measures and that bias analysis can recover the correct results. This study highlights the utility of applying quantitative bias analysis in machine learning to quantify the impact of measurement error on study results.
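As a small illustration of what correcting for misclassification involves, the sketch below applies the standard back-calculation for a single binary variable measured with known sensitivity and specificity. It is not the study's random-forest procedure; the function name and the example values are hypothetical, and it assumes the sensitivity and specificity are known and apply equally to everyone.

```python
def corrected_positive_count(observed_positive, total, sensitivity, specificity):
    """Back-calculate the expected true number of positives from a
    misclassified count.

    Observed positives = Se * true_pos + (1 - Sp) * (total - true_pos),
    which is solved here for true_pos.
    """
    denominator = sensitivity + specificity - 1
    if denominator <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    return (observed_positive - (1 - specificity) * total) / denominator


# Hypothetical example: 240 of 1000 people are classified as exposed by a
# measure with sensitivity 0.8 and specificity 0.9.
estimate = corrected_positive_count(240, 1000, sensitivity=0.8, specificity=0.9)
```

In this made-up example the corrected count comes out at about 200, because 200 true positives would yield 0.8 × 200 + 0.1 × 800 = 240 observed positives.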
Suicide is a public health problem, with multiple causes that are poorly understood. The increased focus on combining health care data with machine-learning approaches in psychiatry may help advance the understanding of suicide risk.
To examine sex-specific risk profiles for death from suicide using machine-learning methods and data from the population of Denmark.
A case-cohort study nested within 8 national Danish health and social registries was conducted from January 1, 1995, through December 31, 2015. The source population was all persons born or residing in Denmark as of January 1, 1995. Data were analyzed from November 5, 2018, through May 13, 2019.
Exposures included 1339 variables spanning domains of suicide risk factors.
Death from suicide from the Danish cause of death registry.
A total of 14 103 individuals died by suicide between 1995 and 2015 (10 152 men [72.0%]; mean [SD] age, 43.5 [18.8] years and 3951 women [28.0%]; age, 47.6 [18.8] years). The comparison subcohort was a 5% random sample (n = 265 183) of living individuals in Denmark on January 1, 1995 (130 591 men [49.2%]; age, 37.4 [21.8] years and 134 592 women [50.8%]; age, 39.9 [23.4] years). With use of classification trees and random forests, sex-specific differences were noted in risk for suicide, with physical health more important to men's suicide risk than women's suicide risk. Psychiatric disorders and possibly associated medications were important to suicide risk, with specific results that may increase clarity in the literature. Generally, diagnoses and medications measured 48 months before suicide were more important indicators of suicide risk than when measured 6 months earlier. Individuals in the top 5% of predicted suicide risk appeared to account for 32.0% of all suicide cases in men and 53.4% of all cases in women.
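The "top 5% of predicted risk" summary can be computed from any set of predicted risks and observed outcomes. The following is a minimal sketch with a hypothetical function name; it ignores the sampling weights that a case-cohort design would require, so it illustrates the metric only and does not reproduce the study's calculation.

```python
def share_of_cases_in_top_fraction(risks, outcomes, fraction=0.05):
    """Proportion of all cases (outcome == 1) found among the people
    whose predicted risk falls in the highest `fraction` of the sample."""
    ranked = sorted(zip(risks, outcomes), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    cases_in_top = sum(outcome for _, outcome in ranked[:n_top])
    total_cases = sum(outcomes)
    if total_cases == 0:
        raise ValueError("no cases in the data")
    return cases_in_top / total_cases
```

Because the top 5% contains only 5% of people, a share well above 5% indicates that the model concentrates cases among those it ranks as highest risk.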
Despite decades of research on suicide risk factors, understanding of suicide remains poor. In this study, the first to date to develop risk profiles for suicide based on data from a full population, apparent consistency with what is known about suicide risk was noted, as well as potentially important, understudied risk factors with evidence of unique suicide risk profiles among specific subpopulations.
Depression increases the risk of suicide death and non-fatal suicide attempt. Between 2% and 6% of persons with depression will die by suicide [1] and 25% to 31% of persons with depression will make a non-fatal suicide attempt during their lifetime [2,3]. Despite the strong association between depression and suicidal behavior, the vast majority of persons with depression will not engage in suicidal behavior, making it difficult to accurately predict who is at risk for suicide and non-fatal suicide attempt. Identifying high-risk persons who should be connected to suicide prevention interventions is an important public health goal. Furthermore, depression often co-occurs with other mental disorders, which may exert an interactive influence on the risk of suicide and suicide attempt. Understanding the joint influence of depression and other mental disorders on suicide outcomes may inform prevention strategies. The goals of this dissertation were to predict suicide and non-fatal suicide attempt among persons with depression and to quantify the causal joint effect of depression and comorbid psychiatric disorders on suicide and suicide attempt. For all three studies, we used data from Danish registries, which routinely collect high-quality data in a setting of universal health care with long-term follow-up and registration of most health and life events [4].
In Study 1, we predicted suicide deaths among men and women diagnosed with depression using a case-cohort design (n = 14,737). Approximately 800 predictors were included in the machine learning models (classification trees and random forests), spanning demographic characteristics, income, employment, immigrant status, citizenship, family suicidal history (parent or spouse), previous suicide attempts, mental disorders, physical health disorders, surgeries, prescription drugs, and psychotherapy. In depressed men, we found interactions between hypnotics and sedatives, analgesics and antipyretics, and previous poisonings that were associated with a high risk of suicide. In depressed women, there were interactions between poisoning and anxiolytics and between anxiolytics and hypnotics and sedatives that were associated with suicide risk. The variables in the random forests that contributed the most to prediction accuracy in depressed men were previous poisoning diagnoses and prescriptions of hypnotics and sedatives and anxiolytics. In depressed women, the most important predictors of suicide were receipt of state pension, prescriptions for psychiatric medications (anxiolytics and antipsychotics) and diagnoses of poisoning, alcohol-related disorders, and reaction to severe stress and adjustment disorders. Prescriptions of analgesics and antipyretics (e.g., acetaminophen) and antithrombotic agents (e.g., aspirin) emerged as important predictors for both depressed men and women.
Study 2 predicted non-fatal suicide attempts among men and women diagnosed with depression using a case-cohort design (n = 17,995). Among depressed men, there was a high risk of suicide attempt among those who received a state pension and were diagnosed with toxic effects of substances. There was also an interaction between reaction to severe stress and adjustment disorder and not receiving psychological help that was associated with suicide attempt risk among depressed men. In depressed women, suicide attempt risk was high in those who were prescribed antipsychotics, diagnosed with specific personality disorders, did not have a poisoning diagnosis, and were not receiving a state pension. For both men and women, the random forest results showed that the strongest contributors to prediction accuracy of suicide attempts were poisonings, alcohol-related disorders, reaction to severe stress and adjustment disorders, drugs used to treat psychiatric disorders (e.g., drugs used in addictive disorders, anxiolytics, hypnotics and sedatives), anti-inflammatory medications, receipt of state pension, and remaining single.
Study 3 examined the joint effect of depression and other mental disorders on suicide and non-fatal suicide attempts using a case-cohort design (suicide death analysis n = 279,286; suicide attempt analysis n = 288,157). We examined pairwise combinations of depression with: 1) organic disorders, 2) substance use disorders, 3) schizophrenia, 4) bipolar disorder, 5) neurotic disorders, 6) eating disorders, 7) personality disorders, 8) intellectual disabilities, 9) developmental disorders, and 10) behavioral disorders. We fit sex-stratified joint marginal structural Cox models to account for time-varying confounding. We observed large hazard ratios for the joint effect of depression and comorbid mental disorders on suicide and suicide attempts, the effect of depression in the absence of comorbid mental disorders, and for the effect of comorbid mental disorders in the absence of depression. We observed positive and negative interdependence between different combinations of depression and comorbid mental disorders on the rate of suicide and suicide attempt, with variation by sex. Overall, depression and comorbid mental disorders are harmful exposures, both independently and jointly.
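One common way to summarize positive or negative interdependence from three such hazard ratios is an additive-scale interaction contrast, often reported as the relative excess risk due to interaction. The sketch below shows that calculation only as an illustration: the abstract does not state which measure the dissertation used, and the function name and input values are hypothetical.

```python
def interaction_contrast(hr_joint, hr_depression_only, hr_comorbidity_only):
    """Additive-scale interaction contrast from ratio measures that share
    one reference group (neither exposure).

    Positive values suggest the joint effect exceeds the sum of the separate
    effects; negative values suggest it falls short of that sum.
    """
    return hr_joint - hr_depression_only - hr_comorbidity_only + 1


# Hypothetical hazard ratios, not results from the dissertation.
excess = interaction_contrast(hr_joint=6.0, hr_depression_only=2.0,
                              hr_comorbidity_only=3.0)
```

With these made-up inputs the contrast is 2.0, which would be read as positive interdependence on the additive scale.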
All of the studies in this dissertation highlight the important role of interactions between risk factors in suicidal behavior among persons with depression. Depression is one of the most commonly assessed risk factors for suicide [5,6], and our findings underscore the value of considering additional risk factors such as other psychiatric disorders, psychiatric medications, and social factors in combination with depression. The results of this dissertation may help inform potential risk identification strategies which may facilitate the targeting of suicide prevention interventions to those most vulnerable.
In the absence of head-to-head trials, indirect treatment comparisons (ITCs) are often used to compare the efficacy of different therapies to support decision-making. Matching-adjusted indirect comparison (MAIC), a type of ITC, is increasingly used to compare treatment efficacy when individual patient data are available from one trial and only aggregate data are available from the other trial. This paper examines the conduct and reporting of MAICs to compare treatments for spinal muscular atrophy (SMA), a rare neuromuscular disease. A literature search identified three studies comparing approved treatments for SMA including nusinersen, risdiplam, and onasemnogene abeparvovec. The quality of the MAICs was assessed on the basis of the following principles consolidated from published MAIC best practices: (1) justification for the use of MAIC is clearly stated, (2) the included trials with respect to study population and design are comparable, (3) all known confounders and effect modifiers are identified a priori and accounted for in the analysis, (4) outcomes should be similar in definition and assessment, (5) baseline characteristics are reported before and after adjustment, along with weights, and (6) key details of a MAIC are reported. In the three MAIC publications in SMA to date, the quality of analysis and reporting varied greatly. Various sources of bias in the MAICs were identified, including lack of control for key confounders and effect modifiers, inconsistency in outcome definitions across trials, imbalances in important baseline characteristics after weighting, and lack of reporting key elements. These findings highlight the importance of evaluating MAICs according to best practices when assessing the conduct and reporting of MAICs.
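To show what reporting baseline characteristics before and after adjustment, along with weights, involves in practice, here is a minimal sketch of MAIC weighting for a single covariate. It uses the method-of-moments approach commonly described for MAIC, in which each patient's weight is exp(alpha × centered covariate), and solves for alpha with Newton's method. The function names, the single-covariate restriction, and the example ages are assumptions for illustration and are not drawn from the reviewed studies.

```python
import math


def maic_weights(ipd_values, target_mean, max_iterations=50):
    """Weights that make the weighted mean of the individual patient data
    (IPD) covariate equal the aggregate mean reported by the other trial."""
    centered = [value - target_mean for value in ipd_values]
    alpha = 0.0
    for _ in range(max_iterations):
        weights = [math.exp(alpha * c) for c in centered]
        # Gradient and curvature of the convex objective sum(exp(alpha * c)).
        gradient = sum(c * w for c, w in zip(centered, weights))
        hessian = sum(c * c * w for c, w in zip(centered, weights))
        step = gradient / hessian
        alpha -= step
        if abs(step) < 1e-12:
            break
    return [math.exp(alpha * c) for c in centered]


def effective_sample_size(weights):
    """Approximate number of independent patients retained after weighting."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


# Hypothetical IPD ages, matched to a hypothetical aggregate mean age of 55.
ages = [30, 40, 50, 60, 70]
weights = maic_weights(ages, target_mean=55)
weighted_mean = sum(w * a for w, a in zip(weights, ages)) / sum(weights)
```

Comparing `weighted_mean` with the target shows whether balance was achieved, and a low effective sample size relative to the original sample signals that a few heavily weighted patients are driving the comparison.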