Bayes factors provide a symmetrical measure of evidence for one model versus another (e.g. H1 versus H0) in order to relate theory to data. These properties help solve some (but not all) of the problems underlying the credibility crisis in psychology. The symmetry of the measure of evidence means that there can be evidence for H0 just as much as for H1; or the Bayes factor may indicate insufficient evidence either way. P-values cannot make this three-way distinction. Thus, Bayes factors indicate when the data count against a theory (and when they count for nothing); and thus they indicate when replications actually support H0 or H1 (in ways that power cannot). There is as much reason to publish evidence supporting the null as evidence going against it, because the evidence can be measured to be just as strong either way (thus the published record can be more balanced). Bayes factors can be B-hacked but they mitigate the problem because a) they allow evidence in either direction so people will be less tempted to hack in just one direction; b) as a measure of evidence they are insensitive to the stopping rule; c) families of tests cannot be arbitrarily defined; and d) falsely implying a contrast is planned rather than post hoc becomes irrelevant (though the value of pre-registration is not diminished).
• Bayes factors would help science deal with the credibility crisis.
• Bayes factors retain their meaning regardless of optional stopping.
• Bayes factors retain their meaning despite other tests being conducted.
• Bayes factors retain their meaning regardless of time of analysis.
• The logic of Bayes helps illuminate the benefits of pre-registration.
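For reference, the quantity these highlights refer to can be written as below. The symbols B_{10} and D (the observed data) are labels chosen here for illustration; they do not appear in the abstract.

```latex
B_{10} = \frac{P(D \mid H_1)}{P(D \mid H_0)}, \qquad
\frac{P(H_1 \mid D)}{P(H_0 \mid D)} = B_{10} \times \frac{P(H_1)}{P(H_0)}
```

Values well above 1 favour H1, values well below 1 favour H0, and values near 1 indicate data that do not discriminate between the two models. This is the three-way distinction described above. Cut-offs such as 3 and 1/3 are a common convention for "evidence worth reporting", but that convention is an assumption here and is not stated in the abstract.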
Researchers are often confused about what can be inferred from significance tests. One problem occurs when people apply Bayesian intuitions to significance testing, two approaches that must be firmly separated. This article presents some common situations in which the approaches come to different conclusions; you can see where your intuitions initially lie. The situations include multiple testing, deciding when to stop running participants, and when a theory was thought of relative to finding out results. The interpretation of nonsignificant results has also been persistently problematic in a way that Bayesian inference can clarify. The Bayesian and orthodox approaches are placed in the context of different notions of rationality, and I accuse myself and others of having been irrational, on a key notion of rationality, in the way we have been using statistics. The reader is shown how to apply Bayesian inference in practice, using free online software, to allow more coherent inferences from data.
No scientific conclusion follows automatically from a statistically non-significant result, yet people routinely use non-significant results to guide conclusions about the status of theories (or the effectiveness of practices). To know whether a non-significant result counts against a theory, or if it just indicates data insensitivity, researchers must use one of: power, intervals (such as confidence or credibility intervals), or else an indicator of the relative evidence for one theory over another, such as a Bayes factor. I argue Bayes factors allow theory to be linked to data in a way that overcomes the weaknesses of the other approaches. Specifically, Bayes factors use the data themselves to determine their sensitivity in distinguishing theories (unlike power), and they make use of those aspects of a theory's predictions that are often easiest to specify (unlike power and intervals, which require specifying the minimal interesting value in order to address theory). Bayes factors provide a coherent approach to determining whether non-significant results support a null hypothesis over a theory, or whether the data are just insensitive. They allow accepting and rejecting the null hypothesis to be put on an equal footing. Concrete examples are provided to indicate the range of application of a simple online Bayes calculator, which reveal both the strengths and weaknesses of Bayes factors.
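The kind of calculation such a calculator performs can be sketched as follows. This is a minimal illustration under stated assumptions, not the code of the online calculator mentioned above: it assumes the data are summarised by an observed mean difference with a standard error, that the likelihood is normal, that H0 predicts an effect of exactly zero, and that H1 is modelled as a half-normal distribution whose standard deviation is the roughly expected effect size. The function names and the example numbers are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def bayes_factor_half_normal(obs_mean, obs_se, h1_scale, steps=20000):
    """Bayes factor for H1 over H0 (hypothetical helper, illustration only).

    H0: the true effect is exactly 0.
    H1: the true effect follows a half-normal with SD = h1_scale (effects >= 0).
    Data: observed effect obs_mean with standard error obs_se (normal likelihood).
    """
    # Likelihood of the observed effect if the true effect is zero (H0).
    like_h0 = normal_pdf(obs_mean, 0.0, obs_se)

    # Marginal likelihood under H1: average the likelihood over the effects
    # H1 predicts, weighted by the half-normal density (midpoint rule).
    upper = max(6.0 * h1_scale, obs_mean + 6.0 * obs_se)
    width = upper / steps
    like_h1 = 0.0
    for i in range(steps):
        theta = (i + 0.5) * width
        prior = 2.0 * normal_pdf(theta, 0.0, h1_scale)  # half-normal density
        like_h1 += normal_pdf(obs_mean, theta, obs_se) * prior * width

    return like_h1 / like_h0


if __name__ == "__main__":
    # Hypothetical non-significant result: effect 0.5 units, SE 1.0,
    # with a theory that predicts effects of roughly 3 units.
    print(bayes_factor_half_normal(obs_mean=0.5, obs_se=1.0, h1_scale=3.0))
```

Because the same observed effect and standard error can yield a Bayes factor below, near, or above 1 depending on what the theory predicts, the sketch shows why a non-significant result alone does not say whether the data support H0 or are merely insensitive.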
How Do I Know What My Theory Predicts? Dienes, Zoltan
Advances in methods and practices in psychological science, 12/2019, Volume 2, Issue 4
Journal Article
Peer reviewed
Open access
To get evidence for or against a theory relative to the null hypothesis, one needs to know what the theory predicts. The amount of evidence can then be quantified by a Bayes factor. Specifying the sizes of the effect one’s theory predicts may not come naturally, but I show some ways of thinking about the problem, some simple heuristics that are often useful when one has little relevant prior information. These heuristics include the room-to-move heuristic (for comparing mean differences), the ratio-of-scales heuristic (for regression slopes), the ratio-of-means heuristic (for regression slopes), the basic-effect heuristic (for analysis of variance effects), and the total-effect heuristic (for mediation analysis).
The Role of Phenomenological Control in Experience Dienes, Zoltan; Lush, Peter
Current directions in psychological science: a journal of the American Psychological Society, 04/2023, Volume 32, Issue 2
Journal Article
Peer reviewed
Open access
To varying degrees, people have the capacity to alter their subjective experience such that it misrepresents reality in ways consistent with their goals and such that the misrepresentation can be sustained over at least minutes despite clear contrary evidence. In other words, people have a capacity for phenomenological control. People can use this capacity to fulfill requirements of social situations or personal needs. One such prominent situation is hypnosis. Another situation that psychologists often place people in is the psychological experiment, in which it is often clear to subjects what experiences are desired. Situations in life may also call for certain experiences, for example, encountering a spiritual world according to one’s religious beliefs. These experiences can be constructed so that they seem to confirm the beliefs of all the people involved.
Obtaining evidence that something does not exist requires knowing how big it would be were it to exist. Testing a theory that predicts an effect thus entails specifying the range of effect sizes consistent with the theory, in order to know when the evidence counts against the theory. Indeed, a theoretically relevant effect size must be specified for power calculations, equivalence testing, and Bayes factors in order that the inferential statistics test the theory. Specifying relevant effect sizes for power, or the equivalence region for equivalence testing, or the scale factor for Bayes factors, is necessary for many journal formats, such as registered reports, and should be necessary for all articles that use hypothesis testing. Yet there is little systematic advice on how to approach this problem. This article offers some principles and practical advice for specifying theoretically relevant effect sizes for hypothesis testing.
Inference using significance testing and Bayes factors is compared and contrasted in five case studies based on real research. The first study illustrates that the methods will often agree, both in motivating researchers to conclude that H1 is supported better than H0, and the other way round, that H0 is better supported than H1. The next four, however, show that the methods will also often disagree. In these cases, the aim of the paper will be to motivate the sensible evidential conclusion, and then see which approach matches those intuitions. Specifically, it is shown that a high-powered non-significant result is consistent with no evidence for H0 over H1 worth mentioning, which a Bayes factor can show, and, conversely, that a low-powered non-significant result is consistent with substantial evidence for H0 over H1, again indicated by Bayesian analyses. The fourth study illustrates that a high-powered significant result may not amount to any evidence for H1 over H0, matching the Bayesian conclusion. Finally, the fifth study illustrates that different theories can be evidentially supported to different degrees by the same data; a fact that P-values cannot reflect but Bayes factors can. It is argued that appropriate conclusions match the Bayesian inferences, but not those based on significance testing, where they disagree.
While theories of consciousness differ substantially, the ‘conscious access hypothesis’, which aligns consciousness with the global accessibility of information across cortical regions, is present in many of the prevailing frameworks. This account holds that consciousness is necessary to integrate information arising from independent functions such as the specialist processing required by different senses. We directly tested this account by evaluating the potential for associative learning between novel pairs of subliminal stimuli presented in different sensory modalities. First, pairs of subliminal stimuli were presented and then their association assessed by examining the ability of the first stimulus to prime classification of the second. In Experiments 1–4 the stimuli were word-pairs consisting of a male name preceding either a creative or uncreative profession. Participants were subliminally exposed to two name-profession pairs where one name was paired with a creative profession and the other an uncreative profession. A supraliminal task followed requiring the timed classification of one of those two professions. The target profession was preceded by either the name with which it had been subliminally paired (concordant) or the alternate name (discordant). Experiment 1 presented stimuli auditorily, Experiment 2 visually, and Experiment 3 presented names auditorily and professions visually. All three experiments revealed the same inverse priming effect with concordant test pairs associated with significantly slower classification judgements. Experiment 4 sought to establish if learning would be more efficient with supraliminal stimuli and found evidence that a different strategy is adopted when stimuli are consciously perceived. Finally, Experiment 5 replicated the unconscious cross-modal association achieved in Experiment 3 utilising non-linguistic stimuli.
The results demonstrate the acquisition of novel cross-modal associations between stimuli which are not consciously perceived and thus challenge the global access hypothesis and those theories embracing it.
A nonsignificant result against an H0 of no effect does not distinguish evidence for no effect from no evidence at all one way or the other. Thus, a researcher engaged primarily in significance testing may decide to follow up just the nonsignificant results with a test from another system of inference, such as equivalence tests (more generally, inference by intervals) or Bayes factors. However, selectively using two systems of inference in this way can lead to inferential inconsistency because different tests are based on different principles, and therefore a researcher can be tempted to select the way each system is used to get the results the researcher wants for just the tests that system is applied to. For a related set of tests, one system of inference should be consistently used.
Public Significance Statement
The article argues that when a researcher argues for there not being an effect, they should use the same method as when they argue that there is an effect. If different standards are used in the two cases, then there is an inconsistency of standards. If a researcher argues there are no side effects using weaker criteria than are used for claiming a drug had useful effects, then a drug that did as much harm as good may appear to do no harm and much good. Clearly having consistent standards for claims of effect and no effect is in the public interest.
Does sense of agency (SoA) arise merely from action-outcome associations, or does an additional real-time process track each step along the chain? Tracking control predicts that deviant intermediate steps between action and outcome should reduce SoA. In two experiments, participants learned mappings between two finger actions and two tones. In later test blocks, actions triggered a robot hand moving either the same or a different finger, and also triggered tones, which were congruent or incongruent with the mapping. The perceived delay between actions and tones gave a proxy measure for SoA. Action-tone binding was stronger for congruent than incongruent tones, but only when the robot movement was also congruent. Congruent tones also had reduced N1 amplitudes, but again only when the robot movement was congruent. We suggest that SoA partly depends on a real-time tracking control mechanism, since deviant intermediate action of the robot reduced SoA over the tone.