Psychology advances knowledge by testing statistical hypotheses using empirical observations and data. The expectation is that most statistically significant findings can be replicated in new data and in new laboratories, but in practice many findings have replicated less often than expected, leading to claims of a replication crisis. We review recent methodological literature on questionable research practices, meta-analysis, and power analysis to explain the apparently high rates of failure to replicate. Psychologists can improve research practices to advance knowledge in ways that improve replicability. We recommend that researchers adopt open science conventions of preregistration and full disclosure and that replication efforts be based on multiple studies rather than on a single replication attempt. We call for more sophisticated power analyses, careful consideration of the various influences on effect sizes, and more complete disclosure of nonsignificant as well as statistically significant findings.
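The link between power and replication rates can be illustrated with a rough calculation. The sketch below is not taken from the article: it assumes a two-sided, two-group comparison, uses a normal approximation in place of the exact t distribution, and the function name and example values are hypothetical. It shows that even when an effect is real, a replication with modest power is expected to reach significance only some of the time.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_group(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    d           : assumed true standardized mean difference (hypothetical input)
    n_per_group : sample size in each of the two groups
    alpha       : two-sided significance level
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic with equal group sizes.
    ncp = d * sqrt(n_per_group / 2)
    # Probability of landing in the upper or the lower rejection region.
    return (1 - z.cdf(z_crit - ncp)) + z.cdf(-z_crit - ncp)


if __name__ == "__main__":
    # Illustrative values only: a small-to-medium effect at several sample sizes.
    for n in (20, 50, 100, 200):
        print(n, round(approx_power_two_group(0.3, n), 3))
```

Under these assumptions, the returned power can be read as the expected proportion of exact replications that would reach significance, which is one reason a single failed replication is weak evidence on its own.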
Social psychologists place high importance on understanding mechanisms and frequently employ mediation analyses to shed light on the process underlying an effect. Such analyses can be conducted with observed variables (e.g., a typical regression approach) or latent variables (e.g., a structural equation modeling approach), and choosing between these methods can be a more complex and consequential decision than researchers often realize. The present article adds to the literature on mediation by examining the relative trade-off between accuracy and precision in latent versus observed variable modeling. Whereas past work has shown that latent variable models tend to produce more accurate estimates, we demonstrate that this increase in accuracy comes at the cost of increased standard errors and reduced power, and examine this relative trade-off both theoretically and empirically in a typical 3-variable mediation model across varying levels of effect size and reliability. We discuss implications for social psychologists seeking to uncover mediating variables and provide 3 practical recommendations for maximizing both accuracy and precision in mediation analyses.
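One standard way to see why reliability matters for the accuracy of observed-variable estimates is the classical attenuation relation. The expression below comes from classical test theory rather than from the article, and the symbols are introduced here for illustration: $\rho_{T_X T_M}$ is the correlation between the true scores, and $\rho_{XX'}$ and $\rho_{MM'}$ are the reliabilities of the observed measures.

```latex
% Classical attenuation: the observed correlation shrinks toward zero
% as the reliability of either measure falls below 1.
\[
  \rho_{XM} \;=\; \rho_{T_X T_M}\,\sqrt{\rho_{XX'}\,\rho_{MM'}}
\]
% Example: a true-score correlation of .50 with both reliabilities at .70
% gives an observed correlation of .50 * sqrt(.49) = .35.
```

Latent variable models aim to recover the true-score association directly, which is consistent with the abstract's point that their gain in accuracy has to be weighed against larger standard errors.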
Maxwell, Cole, and Mitchell (2011) extended the work of Maxwell and Cole (2007), which raised important questions about whether mediation analyses based on cross-sectional data can shed light on longitudinal mediation processes. The latest article considers longitudinal processes that can only be partially explained by an intervening variable, and Maxwell et al. showed that the same general conclusions are obtained, namely that analyses of cross-sectional data will not reveal the longitudinal mediation process. While applauding the advances of the target article, this comment encourages the detailed exploration of alternate causal models in psychology beyond the autoregressive model considered by Maxwell et al. When inferences based on cross-sectional analyses are compared to alternate models, different patterns of bias are likely to be observed. I illustrate how different models of the causal process can be derived using examples from research on psychopathology.
The belief that the target of sexism has shifted from women to men is gaining popularity. Yet despite its potential theoretical and practical importance, the belief that men are now the primary target of sexism has not been systematically defined, nor has it been reliably measured. In this paper, we define the belief in sexism shift (BSS) and introduce a scale to measure it. We contend that BSS constitutes a new form of contemporary sexism characterized by the perception that anti-male discrimination is pervasive, that it now exceeds anti-female discrimination, and that it is caused by women's societal advancement. In four studies (N = 666), we develop and test a concise, one-dimensional, 15-item measure of BSS: the BSS scale. Our findings demonstrate that BSS is related to, yet distinct from, other forms of sexism (traditional, modern, and ambivalent sexism). Moreover, our results show that the BSS scale is a stable and reliable measure of BSS across different samples, time, and participant gender. The BSS scale is also less susceptible to social desirability concerns than other sexism measures. In sum, the BSS scale can be a valuable tool to help understand a new and potentially growing type of sexism that may hinder women in unprecedented ways.
Abstract
Background
Mediation analysis is an important tool for understanding the processes through which interventions affect health outcomes over time. Typically the temporal intervals between X, M, and Y are fixed by design, and little focus is given to the temporal dynamics of the processes.
Purpose
In this article, we aim to highlight the importance of considering the timing of the causal effects of a between-person intervention X on M and Y, which can lead to a deeper understanding of mediation.
Methods
We provide a framework for examining the impact of a between-person intervention X on M and Y over time when M and Y are measured repeatedly. Five conceptual and analytic steps involve visualizing the effects of the intervention on Y, M, the relationship of M and Y, and the mediating process over time and selecting an appropriate analytic model.
Results
We demonstrate how these steps can be applied to two empirical examples of health behavior change interventions. We show that the patterns of longitudinal mediation can be fit with versions of longitudinal multilevel structural equation models that represent how the magnitudes of direct and indirect effects vary over time.
Conclusions
We urge researchers and methodologists to pay more attention to temporal dynamics in the causal analysis of interventions.
Five conceptual and analytic steps are proposed to assess and better understand the causal effects of interventions on health outcomes over time.
Mediation is said to occur when a causal effect of some variable X on an outcome Y is explained by some intervening variable M. The authors recommend that with small to moderate samples, bootstrap methods (B. Efron & R. Tibshirani, 1993) be used to assess mediation. Bootstrap tests are powerful because they detect that the sampling distribution of the mediated effect is skewed away from 0. The authors argue that R. M. Baron and D. A. Kenny's (1986) recommendation of first testing the X → Y association for statistical significance should not be a requirement when there is an a priori belief that the effect size is small or suppression is a possibility. Empirical examples and computer setups for bootstrap analyses are provided.
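The bootstrap approach to the mediated effect can be sketched in a few lines. This is a minimal illustration, not the authors' computer setup: it assumes a single mediator, ordinary least squares paths, and a simple percentile interval, and the function names are hypothetical. The indirect effect is the product of the X → M slope (a) and the M → Y slope adjusted for X (b), re-estimated on cases resampled with replacement.

```python
import random


def _cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def indirect_effect(x, m, y):
    """Product a*b from OLS fits of M on X and of Y on X and M."""
    sxx, smm, sxm = _cov(x, x), _cov(m, m), _cov(x, m)
    sxy, smy = _cov(x, y), _cov(m, y)
    a = sxm / sxx                                            # slope of M on X
    b = (smy * sxx - sxy * sxm) / (sxx * smm - sxm ** 2)     # slope of Y on M, given X
    return a * b


def bootstrap_ci(x, m, y, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the indirect effect.

    Assumes continuous data, so a resample with zero variance in X or
    perfectly collinear X and M (which would divide by zero) is not expected.
    """
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # resample cases with replacement
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * n_boot)]
    upper = estimates[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper
```

Because the interval is read directly from the sorted bootstrap estimates, it can be asymmetric around the point estimate, which is how it accommodates the skew in the sampling distribution of a product of coefficients. Bias-corrected variants exist but are not shown here.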
In this article, we consider the statistical models that are appropriate to understand relationship processes between two people who are in a committed relationship. Some of these processes capture inherently individual processes where individuals happen to be interrelated and others capture inherently dyadic processes. We compare several different statistical approaches to model these phenomena, including the actor–partner interdependence model, common fate model, and a dyadic score model. We compare and contrast these models using a data set on closeness and time spent together from 201 couples where one partner is distinguished by stress associated with an upcoming professional exam. The models yield results that appear to give different interpretations of the data. We discuss situations in which each model may be preferred and invite relationship researchers to model relationship data using the statistical model that matches their conceptual framework, rather than using a rigid statistical methodology.
An important step in demonstrating the validity of a new measure is to show that it is a better predictor of outcomes than existing measures, a property often called incremental validity. Investigators can use regression methods to argue for the incremental validity of new measures, while adjusting for competing or existing measures. The argument is often based on patterns of binary significance tests (BST): (a) both measures are significantly related to the outcome, (b) when adjusted for the new measure the competing measure is no longer significantly related to the outcome, but (c) when adjusted for the competing measure the new measure is still significantly related to the outcome. We show that the BST argument can lead to false conclusions up to 30% of the time when the validity study has modest statistical power. We review alternate methods for making strong inferences about validity and illustrate these with data on construal level in the context of relationships.
Researchers often present results in black and white terms using statistical significance tests; the conclusions from such results can be misleading. We focus on a special case of this style of reporting whereby a new measure is said to be as good as, or better than, another measure because it is significantly related to an outcome whereas the other measure is not significant when both measures are tested jointly. In our tutorial on inference in regression, we show that arguments based on binary (black and white) patterns can lead to incorrect conclusions more than a third of the time, and we explain why this result is obtained. We further distinguish 3 situations where 2 measures are compared and show better ways of making arguments: (a) when 2 measures are thought to be literally equivalent, (b) when the new measure is thought to be better than the other, and (c) when the new measure adds information to the other, even if it is not equivalent or superior. We illustrate the statistical arguments with data on a new measure of construal level (specific vs. general thinking) in the context of relationships.
Background
Despite their good intentions, people often do not eat healthily. This is known as the intention-behavior gap. Although the intention-behavior relationship is theorized as a within-person process, most evidence is based on between-person differences.
Purpose
The purpose of the present study is to investigate the within-person intention-behavior association for unhealthy snack consumption.
Methods
Young adults (N = 45) participated in an intensive longitudinal study. They reported intentions and snack consumption five times daily for 7 days (n = 1068 observations analyzed).
Results
A within-person unit difference in intentions was associated with a halving of the number of unhealthy snacks consumed in the following 3 h (95% CI: 27–70%). Between-person differences in average intentions did not predict unhealthy snack consumption.
Conclusions
Consistent with theory, the intention-behavior relation for healthy eating is best understood as a within-person process. Interventions to reduce unhealthy snacking should target times of day when intentions are weakest.
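The within-person versus between-person distinction in this abstract rests on splitting each repeated measurement into two parts. The sketch below shows only that decomposition, using person-mean centering. It is an illustration under assumed names, not the study's analysis code, and the abstract does not state which regression model was fit to the resulting predictors.

```python
from collections import defaultdict


def decompose(person_ids, values):
    """Split repeated measures into between- and within-person components.

    between[i] : the mean of all observations from the same person
    within[i]  : the observation's deviation from that person's own mean
    """
    grouped = defaultdict(list)
    for pid, v in zip(person_ids, values):
        grouped[pid].append(v)
    means = {pid: sum(vs) / len(vs) for pid, vs in grouped.items()}
    between = [means[pid] for pid in person_ids]
    within = [v - means[pid] for pid, v in zip(person_ids, values)]
    return between, within


if __name__ == "__main__":
    # Hypothetical intention ratings from two participants.
    ids = ["p1", "p1", "p1", "p2", "p2", "p2"]
    intentions = [2, 3, 4, 5, 5, 8]
    between, within = decompose(ids, intentions)
    print(between)
    print(within)
```

Entering both components as separate predictors lets a model estimate whether moments of stronger-than-usual intention go with less snacking (the within-person association) separately from whether people with higher average intentions snack less (the between-person association).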
Three articles in this issue detail the process and results of reliability tests for proposed DSM-5 diagnoses and cross-diagnosis symptom domains. The editorial highlights the good reliability of borderline personality disorder and relates the questionable reliability of major depressive disorder to its heterogeneity. The editorial is also available in Spanish, Traditional Chinese, and Simplified Chinese.