A sizeable literature exists on the use of frequentist power analysis in the null-hypothesis significance testing (NHST) paradigm to facilitate the design of informative experiments. In contrast, there is almost no literature that discusses the design of experiments when Bayes factors (BFs) are used as a measure of evidence. Here we explore Bayes Factor Design Analysis (BFDA) as a useful tool to design studies for maximum efficiency and informativeness. We elaborate on three possible BF designs: (a) a fixed-n design, (b) an open-ended Sequential Bayes Factor (SBF) design, where researchers can test after each participant and can stop data collection whenever there is strong evidence for either ℋ₁ or ℋ₀, and (c) a modified SBF design that defines a maximal sample size where data collection is stopped regardless of the current state of evidence. We demonstrate how the properties of each design (i.e., expected strength of evidence, expected sample size, expected probability of misleading evidence, expected probability of weak evidence) can be evaluated using Monte Carlo simulations and equip researchers with the necessary information to compute their own Bayesian design analyses.
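As a rough illustration of how the properties of a fixed-n design could be evaluated by Monte Carlo simulation, the sketch below uses a deliberately simplified setting that is an assumption, not the article's method: a one-sample test with known unit variance and a Normal(0, prior_sd²) prior on the mean under ℋ₁, for which the Bayes factor has a closed form. The function names (`bf10`, `fixed_n_bfda`), the evidence threshold of 10, and all default values are hypothetical choices made for this example.

```python
import math
import random


def bf10(sample_mean, n, prior_sd=1.0):
    """Closed-form Bayes factor for H1: mu ~ Normal(0, prior_sd^2) vs. H0: mu = 0.

    Assumes n observations with known unit variance (a simplification
    chosen for this sketch, not the test used in the article).
    """
    shrink = n * prior_sd ** 2
    exponent = 0.5 * n * sample_mean ** 2 * shrink / (1.0 + shrink)
    return math.exp(exponent) / math.sqrt(1.0 + shrink)


def fixed_n_bfda(true_effect, n, threshold=10.0, n_sim=5000, seed=1):
    """Estimate how often a fixed-n study ends with evidence for H1, for H0, or weak evidence."""
    rng = random.Random(seed)
    counts = {"H1": 0, "H0": 0, "weak": 0}
    for _ in range(n_sim):
        # The mean of n unit-variance observations is Normal(true_effect, 1/n).
        mean = rng.gauss(true_effect, 1.0 / math.sqrt(n))
        bf = bf10(mean, n)
        if bf >= threshold:
            counts["H1"] += 1
        elif bf <= 1.0 / threshold:
            counts["H0"] += 1
        else:
            counts["weak"] += 1
    return {outcome: count / n_sim for outcome, count in counts.items()}


if __name__ == "__main__":
    # Under a true effect, "H0" is misleading evidence; under a null effect, "H1" is.
    print("true effect 0.5:", fixed_n_bfda(true_effect=0.5, n=50))
    print("true effect 0.0:", fixed_n_bfda(true_effect=0.0, n=50))
```

Running the simulation once under an assumed true effect and once under the null gives estimates of the probabilities of strong, misleading, and weak evidence for the chosen n.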
Unplanned optional stopping rules have been criticized for inflating Type I error rates under the null hypothesis significance testing (NHST) paradigm. Despite these criticisms, this research practice is not uncommon, probably because it appeals to researchers' intuition to collect more data to push an indecisive result into a decisive region. In this contribution, we investigate the properties of a procedure for Bayesian hypothesis testing that allows optional stopping with unlimited multiple testing, even after each participant. In this procedure, which we call Sequential Bayes Factors (SBFs), Bayes factors are computed until an a priori defined level of evidence is reached. This allows flexible sampling plans and is not dependent upon correct effect size guesses in an a priori power analysis. We investigated the long-term rate of misleading evidence, the average expected sample sizes, and the biasedness of effect size estimates when an SBF design is applied to a test of mean differences between 2 groups. Compared with optimal NHST, the SBF design typically needs 50% to 70% smaller samples to reach a conclusion about the presence of an effect, while having the same or lower long-term rate of wrong inference.
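The sequential procedure described here can be sketched in a few lines. The example below is only a toy version under stated assumptions: it replaces the two-group test of mean differences with a one-sample test with known unit variance and a Normal(0, prior_sd²) prior under the alternative, so that the Bayes factor has a closed form. The names `run_sbf` and `simulate_sbf`, the threshold of 10, the minimum sample size, and the optional `n_max` cap are hypothetical choices for illustration.

```python
import math
import random


def bf10(sample_mean, n, prior_sd=1.0):
    """Closed-form Bayes factor for H1: mu ~ Normal(0, prior_sd^2) vs. H0: mu = 0,
    assuming known unit variance (a simplification for this sketch)."""
    shrink = n * prior_sd ** 2
    exponent = 0.5 * n * sample_mean ** 2 * shrink / (1.0 + shrink)
    return math.exp(exponent) / math.sqrt(1.0 + shrink)


def run_sbf(true_effect, rng, threshold=10.0, n_min=10, n_max=None):
    """Add one observation at a time and stop once the evidence threshold is crossed.

    Returns (final sample size, decision). With n_max set, sampling also stops
    at that sample size regardless of the current evidence.
    """
    total, n = 0.0, 0
    while True:
        total += rng.gauss(true_effect, 1.0)
        n += 1
        if n < n_min:
            continue  # do not test before the minimum sample size
        bf = bf10(total / n, n)
        if bf >= threshold:
            return n, "H1"
        if bf <= 1.0 / threshold:
            return n, "H0"
        if n_max is not None and n >= n_max:
            return n, "undecided"


def simulate_sbf(true_effect, n_sim=2000, seed=1, **design):
    """Estimate the average sample size and the decision rates of the design."""
    rng = random.Random(seed)
    runs = [run_sbf(true_effect, rng, **design) for _ in range(n_sim)]
    mean_n = sum(n for n, _ in runs) / n_sim
    rates = {
        decision: sum(1 for _, d in runs if d == decision) / n_sim
        for decision in ("H1", "H0", "undecided")
    }
    return mean_n, rates


if __name__ == "__main__":
    print("true effect 0.5:", simulate_sbf(0.5, n_max=500))
    print("true effect 0.0:", simulate_sbf(0.0, n_max=500))
```

Under a true effect, the rate of "H0" decisions estimates the long-term rate of misleading evidence; under a null effect, the rate of "H1" decisions plays that role.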
Response surface analysis (RSA) is a statistical approach that enables researchers to test congruence hypotheses, that is, the proposition that the degree of congruence between people's values in 2 psychological constructs should be positively or negatively related to their value in an outcome variable. This is done by estimating a polynomial regression model and using the graph of the model and several parameters as a guide to interpret the resulting regression coefficients in terms of the congruence hypothesis. One problem with using RSA in applied research is that the model and the interpretation of the model's parameters in terms of congruence effects have only been thoroughly developed for single-level data. Here, we present an extension of RSA to multilevel data. Among other things, we show how the standard errors can be computed and how researchers can decide whether the occurrence of a congruence effect depends on a Level 2 covariate. We illustrate the suggested extension with 2 examples that guide readers through the test of congruence effects in the case of multilevel data. We also provide R scripts that researchers can adopt to conduct multilevel RSA.
Translational Abstract
Many psychological theories propose that the amount of congruence between two psychological variables is related to an outcome variable (e.g., that the congruence between competence demands of a person's job and the person's competence relates to job satisfaction). The present article introduces an extension of response surface analysis (RSA) that can be applied to multilevel data and provides R scripts to facilitate these analyses. The suggested approach allows researchers to examine their congruence hypotheses and other RSA effects when they have multilevel data.
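For orientation, the second-order polynomial regression model commonly estimated in single-level RSA, together with the surface parameters typically used to interpret it, can be written as follows. The notation (predictors X and Y, outcome Z, coefficients b_0 to b_5, surface parameters a_1 to a_4) follows common RSA convention and is an assumption here, since the abstract gives no formulas; the multilevel extension itself is not reproduced.

```latex
\begin{align}
Z   &= b_0 + b_1 X + b_2 Y + b_3 X^2 + b_4 XY + b_5 Y^2 + e \\
a_1 &= b_1 + b_2, \qquad a_2 = b_3 + b_4 + b_5 \\
a_3 &= b_1 - b_2, \qquad a_4 = b_3 - b_4 + b_5
\end{align}
```

Here a_1 and a_2 describe the slope and curvature of the surface along the line of congruence (X = Y), and a_3 and a_4 describe them along the line of incongruence (X = -Y).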
Researchers would be more willing to prioritize research quality over quantity if the incentive structure of the academic system aligned with this goal. The winner of a 2023 Einstein Foundation Award for Promoting Quality in Research explains how they rose to this challenge.
Replication (an important, uncommon, and misunderstood practice) is gaining appreciation in psychology. Achieving replicability is important for making research progress. If findings are not replicable, then prediction and theory development are stifled. If findings are replicable, then interrogation of their meaning and validity can advance knowledge. Assessing replicability can be productive for generating and testing hypotheses by actively confronting current understandings to identify weaknesses and spur innovation. For psychology, the 2010s might be characterized as a decade of active confrontation. Systematic and multi-site replication projects assessed current understandings and observed surprising failures to replicate many published findings. Replication efforts highlighted sociocultural challenges such as disincentives to conduct replications and a tendency to frame replication as a personal attack rather than a healthy scientific practice, and they raised awareness that replication contributes to self-correction. Nevertheless, innovation in doing and understanding replication and its cousins, reproducibility and robustness, has positioned psychology to improve research practices and accelerate progress.
Are religious people psychologically better or worse adjusted than their nonreligious counterparts? Hundreds of studies have reported a positive relation between religiosity and psychological adjustment. Recently, however, a comparatively small number of cross-cultural studies has questioned this staple of religiosity research. The latter studies find that religious adjustment benefits are restricted to religious cultures. Gebauer, Sedikides, and Neberich (2012) suggested the religiosity as social value hypothesis (RASV) as one explanation for those cross-cultural differences. RASV states that, in religious cultures, religiosity possesses much social value, and, as such, religious people will feel particularly good about themselves. In secular cultures, however, religiosity possesses limited social value, and, as such, religious people will feel less good about themselves, if at all. Yet, previous evidence has been inconclusive regarding RASV and regarding cross-cultural differences in religious adjustment benefits more generally. To clarify matters, we conducted 3 replication studies. We examined the relation between religiosity and self-esteem (the most direct and appropriate adjustment indicator, according to RASV) in a self-report study across 65 countries (N = 2,195,301), an informant-report study across 36 countries (N = 560,264), and another self-report study across 1,932 urban areas from 243 federal states in 18 countries (N = 1,188,536). Moreover, we scrutinized our results against 7 previously untested alternative explanations. Our results fully and firmly replicated and extended prior evidence for cross-cultural differences in religious adjustment benefits. These cross-cultural differences were best explained by RASV.
Publication bias and questionable research practices in primary research can lead to badly overestimated effects in meta-analysis. Methodologists have proposed a variety of statistical approaches to correct for such overestimation. However, it is not clear which methods work best for data typically seen in psychology. Here, we present a comprehensive simulation study in which we examined how some of the most promising meta-analytic methods perform on data that might realistically be produced by research in psychology. We simulated several levels of questionable research practices, publication bias, and heterogeneity, and used study sample sizes empirically derived from the literature. Our results clearly indicated that no single meta-analytic method consistently outperformed all the others. Therefore, we recommend that meta-analysts in psychology focus on sensitivity analyses, that is, report on a variety of methods, consider the conditions under which these methods fail (as indicated by simulation studies such as ours), and then report how conclusions might change depending on which conditions are most plausible. Moreover, given the dependence of meta-analytic methods on untestable assumptions, we strongly recommend that researchers in psychology continue their efforts to improve the primary literature and conduct large-scale, preregistered replications. We provide detailed results and simulation code at https://osf.io/rf3ys and interactive figures at http://www.shinyapps.org/apps/metaExplorer/.
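To make the overestimation mechanism concrete, the following toy simulation shows how selective publication alone can inflate a naive meta-analytic mean. It is a minimal sketch under assumptions that are not taken from the study: effect sizes are drawn from a normal approximation with standard error sqrt(2 / n_per_group), selection favors significant positive results (one-sided, z > 1.96), and nonsignificant results are published with probability `p_publish_nonsig`. The function name and all parameter values are hypothetical; the authors' actual simulation code is at the OSF link above.

```python
import math
import random


def simulate_meta(true_d, n_per_group, n_studies, p_publish_nonsig, seed=1):
    """Return the naive (unweighted) mean of the published effect sizes."""
    rng = random.Random(seed)
    # Large-sample approximation to the standard error of a standardized mean difference.
    se = math.sqrt(2.0 / n_per_group)
    published = []
    while len(published) < n_studies:
        d = rng.gauss(true_d, se)
        significant = d / se > 1.96  # significant in the positive direction
        # Significant studies are always published; others only with some probability.
        if significant or rng.random() < p_publish_nonsig:
            published.append(d)
    return sum(published) / len(published)


if __name__ == "__main__":
    for p in (1.0, 0.5, 0.0):
        estimate = simulate_meta(true_d=0.2, n_per_group=20, n_studies=5000,
                                 p_publish_nonsig=p)
        print(f"P(publish | nonsignificant) = {p:.1f}: naive mean d = {estimate:.3f}")
```

Comparing the naive mean across publication probabilities gives a feel for how strongly the estimate depends on an assumption that usually cannot be tested in real data.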
Congruence hypotheses play a major role in many areas of psychology. They refer to, for example, the consequences of person-environment fit, similarity, or self-other agreement. For example, are people psychologically better adjusted when their self-view is in line with their reputation? A valid statistical approach that can be applied to investigate congruence hypotheses of this kind is quadratic Response Surface Analysis (RSA), in which a second-order polynomial model is fit to the data and appropriately interpreted. However, quadratic RSA does not allow researchers to investigate more precise expectations about a congruence effect. Do the data support an asymmetric congruence effect, in the sense that congruence leads to the highest (or lowest) outcome, but incongruence in one direction (e.g., self-view exceeds reputation) affects the outcome differently than incongruence in the other direction (e.g., self-view falls behind reputation)? Is there a level-dependent congruence effect, such that the amount of congruence is more strongly related to the outcome variable for some levels of the predictors (e.g., high self-view and reputation) than for others (e.g., low self-view and reputation)? Such complex congruence hypotheses have frequently been suggested in the literature, but they could not be investigated because an appropriate statistical approach had yet to be developed. Here, we present analytical strategies, based on third-order polynomial models, that enable users to investigate asymmetric and level-dependent congruence effects, respectively. To facilitate the correct application of the suggested approaches, we provide step-by-step guidelines, corresponding R syntax, and illustrative analyses using simulated and real data.
Translational Abstract
Psychologists are often interested in examining whether the degree of congruence between 2 psychological variables is related to an outcome variable (e.g., whether people whose self-view is in line with the way they are seen by others are more satisfied with their social relationships than people whose reputation is discrepant from their self-view). To investigate whether such congruence effects are present in empirical data, one can analyze the data with quadratic Response Surface Analysis (RSA). However, researchers often have very nuanced expectations about congruence effects (e.g., that a discrepancy between self-view and reputation is especially detrimental when people see themselves in an overly positive light, whereas it is less detrimental when people hold a self-view that falls behind their reputation). These nuances could not yet be examined empirically. In the present article, we present cubic RSA, an extension of quadratic RSA, which can be used to detect such complex congruence effects. We also show how these analyses can be conducted with the software R and demonstrate it with 3 examples.
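As a point of reference, a full third-order polynomial model in two predictors extends the quadratic RSA model by four cubic terms. The notation below (predictors X and Y, outcome Z, coefficients b_0 to b_9) is an assumption for illustration; the specific constrained versions of this model that the article uses to test asymmetric and level-dependent congruence effects are not given in the abstract and are not reproduced here.

```latex
\begin{equation}
Z = b_0 + b_1 X + b_2 Y + b_3 X^2 + b_4 XY + b_5 Y^2
    + b_6 X^3 + b_7 X^2 Y + b_8 X Y^2 + b_9 Y^3 + e
\end{equation}
```

Setting b_6 = b_7 = b_8 = b_9 = 0 recovers the second-order model of quadratic RSA.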