Abstract
Researchers often conclude an effect is absent when a null-hypothesis significance test yields a nonsignificant p value. However, it is neither logically nor statistically correct to conclude an effect is absent when a hypothesis test is not significant. We present two methods to evaluate the presence or absence of effects: equivalence testing (based on frequentist statistics) and Bayes factors (based on Bayesian statistics). In four examples from the gerontology literature, we illustrate different ways to specify alternative models that can be used to reject the presence of a meaningful or predicted effect in hypothesis tests. We provide detailed explanations of how to calculate, report, and interpret Bayes factors and equivalence tests. We also discuss how to design informative studies that can provide support for a null model or for the absence of a meaningful effect. We discuss the conceptual differences between Bayes factors and equivalence tests, and note when and why they might lead to similar or different inferences in practice. It is important that researchers are able to falsify predictions or quantify the support for predicted null effects. Bayes factors and equivalence tests provide useful statistical tools to improve inferences about null effects.
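As a minimal, self-contained illustration of the kind of Bayes factor discussed above (the example and all numbers below are ours, not taken from the article), consider a binomial test of a point null against a uniform prior, which has a closed form: for k successes in n trials with H0: p = 0.5 and H1: p ~ Uniform(0, 1), the marginal likelihood under H1 is 1/(n + 1), so BF01 reduces to C(n, k) · 0.5^n · (n + 1).

```python
from math import comb

def bf01_binomial(k: int, n: int) -> float:
    """Bayes factor BF01 for k successes in n trials.

    H0: p = 0.5 (point null); H1: p ~ Uniform(0, 1).
    The marginal likelihood under H1 is 1 / (n + 1), so
    BF01 = C(n, k) * 0.5**n * (n + 1).
    """
    return comb(n, k) * 0.5 ** n * (n + 1)

# 52 successes in 100 trials: data close to chance, so BF01 > 1,
# i.e. the data favor the null over the uniform alternative.
bf = bf01_binomial(52, 100)
```

A BF01 well above 1 is the kind of quantified support for a null model that a nonsignificant p value alone cannot provide.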
Psychologists must be able to test both for the presence of an effect and for the absence of an effect. In addition to testing against zero, researchers can use the two one-sided tests (TOST) procedure to test for equivalence and reject the presence of a smallest effect size of interest (SESOI). The TOST procedure can be used to determine if an observed effect is surprisingly small, given that a true effect at least as extreme as the SESOI exists. We explain a range of approaches to determine the SESOI in psychological science and provide detailed examples of how equivalence tests should be performed and reported. Equivalence tests are an important extension of the statistical tools psychologists currently use and enable researchers to falsify predictions about the presence, and declare the absence, of meaningful effects.
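The TOST logic can be sketched in a few lines. The sketch below uses a large-sample normal approximation rather than the t-distributions a real analysis would use (e.g. via dedicated software such as the TOSTER package), and every number in it is hypothetical: equivalence is declared when both one-sided tests against the bounds (−SESOI, +SESOI) are significant.

```python
from math import erf, sqrt

def phi(x: float) -> float:
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def tost(diff: float, se: float, bound: float, alpha: float = 0.05):
    """Two one-sided tests for equivalence against bounds (-bound, +bound).

    Large-sample normal approximation; small samples would use a
    t-distribution. Equivalence is declared when both one-sided tests
    are significant, i.e. when max(p_lower, p_upper) < alpha.
    """
    p_lower = 1.0 - phi((diff + bound) / se)  # H0: true diff <= -bound
    p_upper = phi((diff - bound) / se)        # H0: true diff >= +bound
    p = max(p_lower, p_upper)
    return p, p < alpha

# Hypothetical numbers: observed difference 0.05, SE 0.10, SESOI 0.3.
p, equivalent = tost(0.05, 0.10, 0.3)
```

Here the observed effect is surprisingly small under any true effect as large as the SESOI, so the test rejects the presence of a meaningful effect.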
In this article, we introduce two community-contributed commands, rddensity and rdbwdensity, that implement automatic manipulation tests based on density discontinuity and are constructed using the results for local-polynomial density estimators in Cattaneo, Jansson, and Ma (2017b, Simple local polynomial density estimators, Working paper, University of Michigan). These new tests exhibit better size properties (and more power under additional assumptions) than other conventional approaches currently available in the literature. The first command, rddensity, implements manipulation tests based on a novel local-polynomial density estimation technique that avoids prebinning of the data (improving size properties) and allows for restrictions on other features of the model (improving power properties). The second command, rdbwdensity, implements several bandwidth selectors specifically tailored for the manipulation tests discussed herein. We also provide a companion R package with the same syntax and capabilities as rddensity and rdbwdensity.
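For intuition only, a much cruder relative of such manipulation tests can be sketched in a few lines: an exact binomial sign test comparing the number of observations falling just below versus just above the cutoff. This is not the local-polynomial method that rddensity implements, and the counts below are hypothetical.

```python
from math import comb

def manipulation_sign_test(n_below: int, n_above: int) -> float:
    """Two-sided exact binomial test of equal density around a cutoff.

    Under no manipulation, an observation falling inside a small window
    around the cutoff is equally likely to land on either side (p = 0.5).
    Returns the two-sided p-value: the total probability of all outcomes
    no more likely than the observed count.
    """
    n = n_below + n_above
    probs = [comb(n, i) * 0.5 ** n for i in range(n + 1)]
    p_obs = probs[n_above]
    return sum(p for p in probs if p <= p_obs + 1e-12)

# Hypothetical window counts: 15 units just above vs. 5 just below
# the cutoff suggests bunching on one side.
p = manipulation_sign_test(5, 15)
```

A small p-value flags a density discontinuity at the cutoff, the same symptom of manipulation that the local-polynomial tests detect with better size and power properties.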
Hypothesizing after the results are known, or HARKing, occurs when researchers check their research results and then add or remove hypotheses on the basis of those results without acknowledging this process in their research report (Kerr, 1998). In the present article, I discuss 3 forms of HARKing: (a) using current results to construct post hoc hypotheses that are then reported as if they were a priori hypotheses; (b) retrieving hypotheses from a post hoc literature search and reporting them as a priori hypotheses; and (c) failing to report a priori hypotheses that are unsupported by the current results. These 3 types of HARKing are often characterized as being bad for science and a potential cause of the current replication crisis. In the present article, I use insights from the philosophy of science to present a more nuanced view. Specifically, I identify the conditions under which each of these 3 types of HARKing is most and least likely to be bad for science. I conclude with a brief discussion about the ethics of each type of HARKing.
Despite its wide usage in explaining political dynamics of non-democracies, preference falsification remains an empirical myth for students of authoritarian politics due to the challenge of measurement. We offer the first quantitative study of this phenomenon in a non-democratic setting by exploiting a rare coincidence between a major political purge in Shanghai, China, and the administration of a nationwide survey in 2006. We construct two synthetic measures for expressed and actual political support and track their changes before and after the purge. We find that the purge caused a dramatic increase in expressed support among Shanghai respondents, yet the increase was paralleled by an equally evident decline in actual support. We interpret this divergence as evidence for preference falsification and conduct a number of robustness checks to rule out alternative explanations. We also show that falsification was most intense among groups that had access to alternative information but were vulnerable to political sanctions.
Adversarial collaboration has been championed as the gold standard for resolving scientific disputes but has gained relatively limited traction in neuroscience and allied fields. In this perspective, we argue that adversarial collaborative research has been stymied by an overly restrictive concern with the falsification of scientific theories. We advocate instead for a more expansive view that frames adversarial collaboration in terms of Bayesian belief updating, model comparison, and evidence accumulation. This framework broadens the scope of adversarial collaboration to accommodate a wide range of informative (but not necessarily definitive) studies while affording the requisite formal tools to guide experimental design and data analysis in the adversarial setting. We provide worked examples that demonstrate how these tools can be deployed to score theoretical models in terms of a common metric of evidence, thereby furnishing a means of tracking the amount of empirical support garnered by competing theories over time.
Corcoran et al. present a Bayesian treatment of adversarial collaboration that operationalizes competing theoretical hypotheses as generative models. This approach enables the evaluation of alternative theories in terms of the evidential support their models garner across (potentially disparate) experimental settings.
The (latest) crisis in confidence in social psychology has generated much heated discussion about the importance of replication, including how it should be carried out as well as interpreted by scholars in the field. For example, what does it mean if a replication attempt "fails"? Does it mean that the original results, or the theory that predicted them, have been falsified? And how should "failed" replications affect our belief in the validity of the original research? In this paper, we consider the replication debate from a historical and philosophical perspective, and provide a conceptual analysis of both replication and falsification as they pertain to this important discussion. Along the way, we highlight the importance of auxiliary assumptions (for both testing theories and attempting replications), and introduce a Bayesian framework for assessing "failed" replications in terms of how they should affect our confidence in original findings.
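The core of such a Bayesian framework can be sketched in a few lines: confidence in the original finding is expressed as prior odds, and each replication contributes a Bayes factor that multiplies those odds. The sketch and all numbers below are ours, for illustration only.

```python
def update_odds(prior_odds: float, bayes_factors) -> float:
    """Posterior odds for H1 (the original effect) over H0 (the null).

    Each Bayes factor BF10 quantifies how strongly one study's data
    favor the original effect over the null; across independent
    studies, evidence accumulates multiplicatively.
    """
    odds = prior_odds
    for bf in bayes_factors:
        odds *= bf
    return odds

# Start mildly confident in the original finding (odds 3:1 for H1),
# then observe two "failed" replications with BF10 = 0.2 each.
posterior = update_odds(3.0, [0.2, 0.2])  # posterior odds now favor H0
```

On this view a "failed" replication does not falsify the original finding outright; it shifts the odds by an amount that depends on how diagnostic the replication data actually are.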
The factors contributing to violations of the current legislation on accounting and reporting, which lead to distortion of the reported financial and property status, have been clarified. The purpose, subject, and objects of an audit aimed at identifying signs of falsification of financial statements are defined. It is substantiated that analytical procedures make it possible to establish the presence or absence of deviations between the actual and expected values of indicators disclosed in the financial statements. It has been proven that the use of analytical procedures in an audit may confirm the reliability of financial statements or identify atypical deviations that are highly likely to indicate fraudulent actions during the documentation of major business processes and the distortion of interrelated financial statements. It is established that modern approaches to audit methodology for verifying the reliability of financial statements rely mainly on documentary and factual methods of control, while analytical procedures are limited to horizontal and vertical analysis and rapid analysis of financial condition based on correlation-regression analysis. A classification of analytical procedures according to the information base and the complexity of the methodology has been developed. The article presents typical violations that can be identified during an audit to detect signs of falsification of financial statements using analytical procedures.
Disproval of the Starch-Amyloplast Hypothesis?
Richter, Peter; Strauch, Sebastian M.; Lebert, Michael
Trends in Plant Science, April 2019, Volume 24, Issue 4
Journal Article, peer reviewed
In a recent publication, Edelmann (Protoplasma 2018; 255: 1877–1881) refuted the well-established starch-amyloplast hypothesis of gravitropism in plants. Gravitropic curvatures of shoots and roots were still present after amyloplast-containing tissues (in the sheath of vascular bundles and in root caps) were dissected. Here, we discuss Edelmann’s data in the light of Popper’s falsification principle.