To date, nutritional epidemiology has relied heavily on relatively weak methods including simple observational designs and substandard measurements. Despite low internal validity and other sources of bias, claims of causality are made commonly in this literature. Nutritional epidemiology investigations can be improved through greater scientific rigor and adherence to scientific reporting commensurate with research methods used. Some commentators advocate jettisoning nutritional epidemiology entirely, perhaps believing improvements are impossible. Still others support only normative refinements. But neither abolition nor minor tweaks are appropriate. Nutritional epidemiology, in its present state, offers utility, yet also needs marked, reformational renovation. Changing the status quo will require ongoing, unflinching scrutiny of research questions, practices, and reporting, and a willingness to admit that "good enough" is no longer good enough. As such, a workshop entitled "Toward more rigorous and informative nutritional epidemiology: the rational space between dismissal and defense of the status quo" was held from July 15 to August 14, 2020. This virtual symposium focused on: (1) Stronger Designs, (2) Stronger Measurement, (3) Stronger Analyses, and (4) Stronger Execution and Reporting. Participants from several leading academic institutions explored existing, evolving, and new better practices, tools, and techniques to collaboratively advance specific recommendations for strengthening nutritional epidemiology.
Objective
Topic modeling (TM) refers to a group of methods for mathematically identifying latent topics in large corpora of data. Although TM shows promise as a tool for social science research, most researchers lack awareness of the tool's utility. Therefore, this article provides a brief overview of TM's logic and processes, offers a simple example, and suggests several possible uses in social sciences.
Methods
Using latent semantic analysis in our example, we analyzed transcripts of the 2016 U.S. presidential debates between Hillary Clinton and Donald Trump.
Results
Resulting topics paralleled the most frequent policy‐related Internet searches at the time. When divided by candidate, changes in emergent topics reflected individual policy stances, with nuanced differences between the two.
Conclusion
Findings underscored the utility of TM to identify thematic patterns embedded in large quantities of text. TM, therefore, represents a valuable addition to the social scientist's methodological tool set.
Background
The COVID-19 pandemic led to mental health fallout in the US; yet research about mental health and COVID-19 primarily relies on samples that may overlook variance in regional mental health. Indeed, between-city comparisons of mental health decline in the US may provide further insight into how the pandemic is disproportionately affecting at-risk groups.
Purpose
This study leverages social media and city-level COVID-19 infection data to measure the longitudinal (January 22 to July 31, 2020) mental health effects of the COVID-19 pandemic in 20 metropolitan areas.
Methods
We used longitudinal VADER sentiment analysis of Twitter timelines (January-July 2020) for cohorts in 20 metropolitan areas to examine mood changes over time. We then conducted simple and multivariate Ordinary Least Squares (OLS) regressions to examine the relationships of city COVID-19 infection data, population, population density, and city demographics with sentiment across those 20 cities.
Results
Longitudinal sentiment tracking showed mood declines over time. The univariate OLS regression highlighted a negative linear relationship between COVID-19 city data and online sentiment (beta = -.017). Residing in predominantly white cities had a protective effect against COVID-19 driven negative mood (beta = .0629, p < .001).
Discussion
Our results reveal that metropolitan areas with larger communities of color experienced a greater subjective well-being decline than predominantly white cities, which we attribute to clinical and socioeconomic correlates that place communities of color at greater risk of COVID-19.
Conclusion
The COVID-19 pandemic is a driver of declining US mood in 20 metropolitan areas. Other factors, including social unrest and local demographics, may compound and exacerbate mental health outlook in racially diverse cities.
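The univariate OLS step described in the Methods above can be illustrated with a minimal sketch. The city values below are made-up placeholders, not the study's data, and `ols_fit` is a hypothetical helper name; the sketch only shows how a slope of the kind reported as beta is obtained from paired city-level observations.

```python
from statistics import mean


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of the predictor
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    # Sum of cross-products between predictor and outcome
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical values for illustration only (not taken from the study):
# an infection measure per city and that city's mean sentiment score.
infection = [1.0, 2.0, 3.0, 4.0, 5.0]
sentiment = [0.30, 0.26, 0.25, 0.21, 0.18]

intercept, slope = ols_fit(infection, sentiment)
print(f"intercept={intercept:.3f}, slope={slope:.4f}")
```

A negative slope here corresponds to sentiment falling as the infection measure rises. The multivariate models mentioned above would need a matrix-based solver, which this sketch does not attempt.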
Negative affect variability is associated with increased symptoms of internalizing psychopathology (i.e., depression, anxiety). The Contrast Avoidance Model (CAM) suggests that individuals with anxiety avoid negative emotional shifts by maintaining pathological worry. Recent evidence also suggests that the CAM can be applied to major depression and social phobia, both characterized by negative affect changes. Here, we compare negative affect variability between individuals with a variety of anxiety and depression diagnoses by measuring the levels and degree of change in the sentiment of their online communications.
Participants were 1,853 individuals on Twitter who reported that they had been clinically diagnosed with an anxiety disorder (A cohort, n = 896) or a depressive disorder (D cohort, n = 957). Mean negative affect (NA) and negative affect variability were calculated using the Valence Aware Dictionary and sEntiment Reasoner (VADER), an accurate sentiment analysis tool that scores text in terms of its negative affect content.
Findings showed differences in negative affect variability between the D and A cohorts, with higher levels of NA variability in the D cohort than in the A cohort, U = 367210, p < .001, r = 0.14, d = 0.25. Furthermore, we found that the A and D cohorts had different average NA, with the D cohort showing higher NA overall, U = 377368, p < .001, r = 0.12, d = 0.21.
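As a rough illustration of the kind of comparison reported here, the sketch below computes a per-user variability score and a Mann-Whitney U statistic in plain Python. Several points are assumptions, not details from the study: variability is taken to be the standard deviation of each user's per-post negative scores, the effect size uses r = |z| / sqrt(n1 + n2) with a normal approximation and no tie correction, and all scores are invented toy values. The function names are hypothetical.

```python
from math import sqrt
from statistics import pstdev


def mann_whitney_u(a, b):
    """U statistic for sample a: pairs with a_i > b_j count 1, ties count 0.5."""
    u = 0.0
    for ai in a:
        for bj in b:
            if ai > bj:
                u += 1.0
            elif ai == bj:
                u += 0.5
    return u


def z_and_r(u, n1, n2):
    """Normal approximation of U (no tie correction) and r = |z| / sqrt(n1 + n2)."""
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    return z, abs(z) / sqrt(n1 + n2)


# Toy per-post negative scores for three users per cohort (illustration only).
cohort_a_posts = [[0.10, 0.12, 0.11], [0.05, 0.06, 0.05], [0.20, 0.22, 0.21]]
cohort_d_posts = [[0.10, 0.40, 0.05], [0.00, 0.30, 0.10], [0.20, 0.60, 0.00]]

# Assumed variability measure: standard deviation of each user's scores.
var_a = [pstdev(posts) for posts in cohort_a_posts]
var_d = [pstdev(posts) for posts in cohort_d_posts]

u_d = mann_whitney_u(var_d, var_a)
z, r = z_and_r(u_d, len(var_d), len(var_a))
print(f"U={u_d}, z={z:.2f}, r={r:.2f}")
```

The pairwise loop is quadratic in sample size, which is fine for a demonstration but slow for cohorts of several hundred users; a rank-based computation gives the same U more efficiently.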
Our sample is limited to individuals who disclosed their diagnoses online, which may involve bias due to self-selection and stigma. Our sentiment analysis of online text may not completely capture all nuances of individual affect.
Individuals with depression diagnoses showed a higher degree of negative affect variability compared to individuals with anxiety disorders. Our findings support the idea that negative affect variability can be measured using computational approaches on large-scale social media data and that social media data can be used to study naturally occurring mental health effects at scale.
The purpose of this research was to examine individual differences related to fear of, perceived susceptibility to, and perceived severity of mpox as well as mpox knowledge, fear, perceived susceptibility, and perceived severity as predictors of vaccine intention in a national survey of U.S. adults (aged ≥18 years). Address-based sampling (ABS) methods were used to ensure full coverage of all households in the nation, reflecting the 2021 March Supplement of the Current Population Survey. Internet-based, self-administered surveys were fielded by Ipsos between September 16 and 26, 2022. A total of 1018 participants completed the survey. The survey included items, based partially on the Health Belief Model, assessing vaccine intention (1 item; responses from 1 Definitely not to 5 Definitely), fear of mpox (7-item scale; α = .89; theoretical range = 7-35), perceived susceptibility to mpox (3-item scale; α = .85; theoretical range = 3-15), and perceived severity of mpox (4-item scale; α = .65; theoretical range = 4-20). Higher scores indicate greater fear, susceptibility, and severity. One-way ANOVAs were run to examine mean score differences by demographic groups (e.g., gender, race/ethnicity, sexual orientation), and multiple regression analyses assessed the relationship between predictors (mpox knowledge, susceptibility/severity, fear) and a single outcome (vaccination intention), while controlling for demographic covariates. Sampling weights were applied to all analyses. Only 1.8% (n = 18) of respondents reported having received the mpox vaccine. While mpox vaccine intention was low (M = 2.09, SD = 0.99), overall differences between racial/ethnic, sexual orientation, education, and household income groups were statistically significant. Fear of mpox was very low (M = 13.13, SD = 5.33), and there were overall statistically significant differences in both fear and perceived severity among gender, race/ethnicity, sexual orientation, education, and household income groups.
While respondents reported not feeling very susceptible to mpox (M = 5.77, SD = 2.50), they generally rated mpox as just above the theoretical mean in terms of severity (M = 11.01, SD = 2.85). Mpox knowledge, fear, severity, and susceptibility, as well as race/ethnicity, were all statistically significant predictors of intention to vaccinate, with susceptibility representing the strongest predictor. Overall, mpox vaccination and vaccine intention among Americans were low. Gay/lesbian and racial/ethnic minority respondents felt more susceptible to mpox and viewed it as more severe, compared with heterosexual and White respondents, respectively. These data may be used to tailor risk and prevention (e.g., vaccination) interventions, as cases continue to surge in the current global mpox outbreak. Greater perceptions of susceptibility, severity, and fear about mpox exist largely among minority populations. While public health messaging to promote mpox vaccination can focus on improving knowledge, as well as addressing fear and perceived severity of, and susceptibility to, mpox, such messages should be carefully crafted to prevent disproportionate negative effects on marginalized communities.
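The scale statistics reported for this survey (Cronbach's α and the 7-35, 3-15, and 4-20 score bounds) can be made concrete with a short sketch. It assumes each item is scored from 1 to 5, which is inferred from the reported bounds and is not stated explicitly in the text. The responses are invented toy data, and `cronbach_alpha` and `scale_range` are hypothetical helper names.

```python
from statistics import variance


def scale_range(n_items, item_min=1, item_max=5):
    """Lowest and highest possible summed score for a scale of n_items."""
    return n_items * item_min, n_items * item_max


def cronbach_alpha(rows):
    """Cronbach's alpha; rows is a list of respondents, each a list of k item scores."""
    if len(rows) < 2:
        raise ValueError("need at least 2 respondents")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least 2 items")
    # Sample variance of each item across respondents
    item_var = sum(variance(col) for col in zip(*rows))
    # Sample variance of each respondent's summed score
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_var / total_var)


# Toy responses to a 3-item scale from five respondents (illustration only).
responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [3, 3, 3],
]

alpha = cronbach_alpha(responses)
print(scale_range(7), scale_range(3), scale_range(4))
print(f"alpha={alpha:.2f}")
```

Under the 1-to-5 assumption, `scale_range` reproduces the three reported bounds, which is why those figures describe the span of possible summed scores.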
Attitudes toward abortion have historically been characterized via dichotomized labels, yet research suggests that these labels do not appropriately encapsulate beliefs on abortion. Rather, contexts, circumstances, and lived experiences often shape views on abortion into more nuanced and complex perspectives. Qualitative data have also been shown to underpin belief systems regarding abortion. Social media, as a form of qualitative data, could reveal how attitudes toward abortion are communicated publicly in web-based spaces. Furthermore, in some cases, social media can also be leveraged to seek health information.
This study applies natural language processing and social media mining to analyze Reddit (Reddit, Inc) forums specific to abortion, including r/Abortion (the largest subreddit about abortion) and r/AbortionDebate (a subreddit designed to discuss and debate worldviews on abortion). Our analytical pipeline intends to identify potential themes within the data and the affect from each post.
We applied a neural network-based topic modeling pipeline (BERTopic) to uncover themes in the r/Abortion (n=2151) and r/AbortionDebate (n=2815) subreddits. After deriving the optimal number of topics per subreddit using an iterative coherence score calculation, we performed a sentiment analysis using the Valence Aware Dictionary and Sentiment Reasoner to assess positive, neutral, and negative affect and an emotion analysis using the Text2Emotion lexicon to identify potential emotionality per post. Differences in affect and emotion by subreddit were compared.
The iterative coherence score calculation revealed 10 topics for both r/Abortion (coherence=0.42) and r/AbortionDebate (coherence=0.35). Topics in the r/Abortion subreddit primarily centered on information sharing or offering a source of social support; in contrast, topics in the r/AbortionDebate subreddit centered on contextualizing shifting or evolving views on abortion across various ethical, moral, and legal domains. The average compound Valence Aware Dictionary and Sentiment Reasoner scores for the r/Abortion and r/AbortionDebate subreddits were 0.01 (SD 0.44) and -0.06 (SD 0.41), respectively. Emotionality scores were consistent across the r/Abortion and r/AbortionDebate subreddits; however, r/Abortion had a marginally higher average fear score of 0.36 (SD 0.39).
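To help read the compound scores above, the sketch below applies the cutoffs conventionally suggested in the VADER documentation (compound ≥ 0.05 positive, ≤ -0.05 negative, otherwise neutral). Whether the authors used these cutoffs is not stated, so this is an assumption. The cutoffs are normally applied to individual texts; labeling the two subreddit averages is only an illustration. `label_compound` is a hypothetical helper name.

```python
def label_compound(score, threshold=0.05):
    """Map a VADER-style compound score in [-1, 1] to a coarse sentiment label."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("compound scores are expected to lie in [-1, 1]")
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


# Average compound scores reported for the two subreddits.
reported_means = {"r/Abortion": 0.01, "r/AbortionDebate": -0.06}

for name, score in reported_means.items():
    print(name, label_compound(score))
```

Both averages sit close to zero with standard deviations above 0.4, so individual posts in either subreddit would span all three labels.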
Our findings suggest that people posting on abortion forums on Reddit are willing to share their beliefs, which manifested in diverse ways, such as sharing abortion stories including how their worldview changed, which critiques the value of dichotomized abortion identity labels, and information seeking. Notably, the style of discourse varied significantly by subreddit. r/Abortion was principally leveraged as an information and outreach source; r/AbortionDebate largely centered on debating across various legal, ethical, and moral abortion domains. Collectively, our findings suggest that abortion remains an opaque yet politically charged issue for people and that social media can be leveraged to understand views and circumstances surrounding abortion.
Social media is an important information source for a growing subset of the population and can likely be leveraged to provide insight into the evolving drug overdose epidemic. Twitter can provide valuable insight into trends, colloquial information available to potential users, and how networks and interactivity might influence what people are exposed to and how they engage in communication around drug use.
This exploratory study was designed to investigate the ways in which unsupervised machine learning analyses using natural language processing could identify coherent themes for tweets containing substance names.
This study involved harnessing data from Twitter, including large-scale collection of brand name (N=262,607) and street name (N=204,068) prescription drug-related tweets and use of unsupervised machine learning analyses (ie, natural language processing) of collected data with data visualization to identify pertinent tweet themes. Latent Dirichlet allocation (LDA) with coherence score calculations was performed to compare brand (eg, OxyContin) and street (eg, oxys) name tweets.
We found people discussed drug use differently depending on whether a brand name or street name was used. Brand name categories often contained political talking points (eg, border, crime, and political handling of ongoing drug mitigation strategies). In contrast, categories containing street names occasionally referenced drug misuse, though multiple social uses for a term (eg, Sonata) muddled topic clarity.
Content in the brand name corpus reflected discussion about the drug itself and less often reflected personal use. However, content in the street name corpus was notably more diverse and resisted simple LDA categorization. We speculate this may reflect effective use of slang terminology to clandestinely discuss drug-related activity. If so, straightforward analyses of digital drug-related communication may be more difficult than previously assumed. This work has the potential to be used for surveillance and detection of harmful drug use information. It also might be used for appropriate education and dissemination of information to persons engaged in drug use content on Twitter.
Shortly after the worst of the COVID-19 pandemic, an outbreak of mpox introduced another critical public health emergency. Like the COVID-19 pandemic, the mpox outbreak was characterized by a rising prevalence of public health misinformation on social media, through which many US adults receive and engage with news. Digital misinformation continues to challenge the efforts of public health officials in providing accurate and timely information to the public. We examine the evolving topic distributions of social media narratives during the mpox outbreak to map the tension between rapidly diffusing misinformation and public health communication.
This study aims to observe topical themes occurring in a large-scale collection of tweets about mpox using deep learning.
We leveraged a data set comprising all mpox-related tweets that were posted between May 7, 2022, and July 23, 2022. We then applied Sentence Bidirectional Encoder Representations From Transformers (S-BERT) to the content of each tweet to generate a representation of its content in high-dimensional vector space, where semantically similar tweets are located close together. We projected the set of tweet embeddings to a 2D map by applying principal component analysis and Uniform Manifold Approximation and Projection (UMAP). Finally, we grouped these data points into 7 topical clusters using k-means clustering and analyzed each cluster to determine its dominant topics. We analyzed the prevalence of each cluster over time to evaluate longitudinal thematic changes.
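The final clustering step can be sketched with a plain-Python version of Lloyd's k-means algorithm on 2D points. This is an illustration under stated assumptions, not the study's implementation: the points stand in for projected tweet embeddings and are invented, k is 2 here (the study used 7), and the starting centroids are supplied by hand to keep the run deterministic.

```python
from math import dist


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: returns (labels, centroids) for the given start centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids


# Invented 2D points standing in for projected tweet embeddings.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(labels)
print(centers)
```

K-means results depend on initialization, so library implementations typically run several restarts and keep the best solution; that refinement is omitted here.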
Our deep-learning pipeline revealed 7 distinct clusters of content: (1) cynicism, (2) exasperation, (3) COVID-19, (4) men who have sex with men, (5) case reports, (6) vaccination, and (7) World Health Organization (WHO). Clusters that largely communicated erroneous or irrelevant information began earlier and grew faster, reaching a wider audience than later communications by official institutions and health officials.
Within a few weeks of the first reported mpox cases, an avalanche of mostly false, misleading, irrelevant, or damaging information started to circulate on social media. Official institutions, including the WHO, acted promptly, providing case reports and accurate information within weeks, but were overshadowed by rapidly spreading social media chatter. Our results point to the need for real-time monitoring of social media content to optimize responses to public health emergencies.
Dry January, a temporary alcohol abstinence campaign, encourages individuals to reflect on their relationship with alcohol by temporarily abstaining from consumption during the month of January. Though Dry January has become a global phenomenon, there has been limited investigation into Dry January participants' experiences. One means through which to gain insights into individuals' Dry January-related experiences is by leveraging large-scale social media data (eg, Twitter chatter) to explore and characterize public discourse concerning Dry January.
We sought to answer the following questions: (1) What themes are present within a corpus of tweets about Dry January, and is there consistency in the language used to discuss Dry January across multiple years of tweets (2020-2022)? (2) Do unique themes or patterns emerge in Dry January 2021 tweets after the onset of the COVID-19 pandemic? and (3) What is the association between tweet composition (ie, sentiment and human-authored vs bot-authored) and engagement with Dry January tweets?
We applied natural language processing techniques to a large sample of tweets (n=222,917) containing the term "dry january" or "dryjanuary" posted from December 15 to February 15 across three separate years of participation (2020-2022). Term frequency inverse document frequency, k-means clustering, and principal component analysis were used for data visualization to identify the optimal number of clusters per year. Once data were visualized, we ran interpretation models to afford within-year (or within-cluster) comparisons. Latent Dirichlet allocation topic modeling was used to examine content within each cluster per given year. Valence Aware Dictionary and Sentiment Reasoner sentiment analysis was used to examine affect per cluster per year. The Botometer automated account check was used to determine average bot score per cluster per year. Last, to assess user engagement with Dry January content, we took the average number of likes and retweets per cluster and ran correlations with other outcome variables of interest.
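The first step of this pipeline, term frequency inverse document frequency (TF-IDF) weighting, can be sketched in a few lines. The variant below (relative term frequency times the unsmoothed log of N over document frequency) is one common textbook form and is an assumption; the exact weighting used in the study is not stated, and libraries often apply smoothed variants. The tweets are invented examples, and `tfidf` is a hypothetical helper name.

```python
from collections import Counter
from math import log


def tfidf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Invented example tweets (illustration only), already lowercased.
tweets = [
    "dry january day one no wine".split(),
    "dry january tips and mocktail recipes".split(),
    "dry january is over cheers".split(),
]

weights = tfidf(tweets)
for term, weight in sorted(weights[0].items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.3f}")
```

Terms that occur in every tweet, such as the search terms themselves, receive a weight of zero under this variant, so the clustering is driven by the words that distinguish tweets from one another.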
We observed several similar topics per year (eg, Dry January resources, Dry January health benefits, updates related to Dry January progress), suggesting relative consistency in Dry January content over time. Although there was overlap in themes across multiple years of tweets, unique themes related to individuals' experiences with alcohol during the midst of the COVID-19 global pandemic were detected in the corpus of tweets from 2021. Also, tweet composition was associated with engagement, including number of likes, retweets, and quote-tweets per post. Bot-dominant clusters had fewer likes, retweets, or quote tweets compared with human-authored clusters.
The findings underscore the utility of using large-scale social media data, such as discussions on Twitter, to study drinking reduction attempts and to monitor the ongoing dynamic needs of persons contemplating, preparing for, or actively pursuing attempts to quit or cut down on their drinking.
Although social connection to others with lived addiction experiences is a strong predictor of long-term recovery from substance use disorders (SUD), the COVID-19 pandemic greatly altered global abilities to physically connect with other people. Evidence suggests online forums for people with SUD may serve as a sufficient proxy for social connection; however, the efficacy of online spaces as addiction treatment adjuncts remains empirically understudied.
The purpose of this study is to analyze a collection of Reddit posts germane to addiction and recovery collected between March and August 2022.
We collected Reddit posts (n = 9,066) from 7 subreddits: (1) r/addiction, (2) r/DecidingToBeBetter, (3) r/SelfImprovement, (4) r/OpiatesRecovery, (5) r/StopSpeeding, (6) r/RedditorsInRecovery, and (7) r/StopSmoking. We applied several classes of natural language processing (NLP) methods to analyze and visualize our data, including term frequency inverse document frequency (TF-IDF) calculations, k-means clustering, and principal components analysis (PCA). We also applied Valence Aware Dictionary and sEntiment Reasoner (VADER) sentiment analysis to determine affect in our data.
Our analyses revealed three distinct clusters: (1) Personal addiction struggle, or sharing one's recovery journey (n = 2,520), (2) Giving advice, or offering counseling based on first-hand experiences (n = 3,885), and (3) Seeking advice, or asking for support or advice related to addiction (n = 2,661).
Addiction, SUD, and recovery dialogue on Reddit is exceedingly robust. Much of the content mirrors tenets for established addiction-recovery programs, which suggests Reddit, and other social networking websites, may serve as efficient tools to promote social connection among people with SUD.