The last 20 years have seen an uptick in research on missing data problems, and most software applications now implement one or more sophisticated missing data handling routines (e.g., multiple imputation or maximum likelihood estimation). Despite their superior statistical properties (e.g., less stringent assumptions, greater accuracy and power), the adoption of these modern analytic approaches is not uniform in psychology and related disciplines. Thus, the primary goal of this manuscript is to describe and illustrate the application of multiple imputation. Although maximum likelihood estimation is perhaps the easiest method to use in practice, psychological data sets often feature complexities that are currently difficult to handle appropriately in the likelihood framework (e.g., mixtures of categorical and continuous variables), but relatively simple to treat with imputation. The paper describes a number of practical issues that clinical researchers are likely to encounter when applying multiple imputation, including mixtures of categorical and continuous variables, item-level missing data in questionnaires, significance testing, interaction effects, and multilevel missing data. Analysis examples illustrate imputation with software packages that are freely available on the internet.
• An overview of multiple imputation and its application to clinical and psychological research.
• Multiple imputation has a flexible assumption about the cause of missingness, and it provides greater accuracy and power.
• Imputation is a straightforward solution for practical problems that may be difficult to deal with in other frameworks.
• Multiple imputation is available in statistical software packages, and included analysis examples illustrate its use.
White-nose syndrome (WNS) is an emerging disease of hibernating bats associated with cutaneous infection by the fungus Geomyces destructans (Gd), and responsible for devastating declines of bat populations in eastern North America. Affected bats appear emaciated and one hypothesis is that they spend too much time out of torpor during hibernation, depleting vital fat reserves required to survive the winter. The fungus has also been found at low levels on bats throughout Europe but without mass mortality. This finding suggests that Gd is either native to both continents but has been rendered more pathogenic in North America by mutation or environmental change, or that it recently arrived in North America as an invader from Europe. Thus, a causal link between Gd and mortality has not been established and the reason for its high pathogenicity in North America is unknown. Here we show that experimental inoculation with either North American or European isolates of Gd causes WNS and mortality in the North American bat, Myotis lucifugus. In contrast to control bats, individuals inoculated with either isolate of Gd developed cutaneous infections diagnostic of WNS, exhibited a progressive increase in the frequency of arousals from torpor during hibernation, and were emaciated after 3–4 mo. Our results demonstrate that altered torpor-arousal cycles underlie mortality from WNS and provide direct evidence that Gd is a novel pathogen to North America from Europe.
The year 2022 is the 20th anniversary of Joseph Schafer and John Graham's paper titled "Missing data: Our view of the state of the art," currently the most highly cited paper in the history of
. Much has changed since 2002, as missing data methodologies have continually evolved and improved; the range of applications that are possible with modern missing data techniques has increased dramatically, and software options are light years ahead of where they were. This article provides an update on the state of the art that catalogs important innovations from the past two decades of missing data research. The paper addresses topics described in the original paper, including developments related to missing data theory, full information maximum likelihood, Bayesian estimation, multiple imputation, and models for missing not at random processes. The paper also describes newer factored regression specifications and missing data handling for multilevel models, both of which have been a focus of recent research. The paper concludes with a summary of the current software landscape and a discussion of several practical issues. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
Attenuation of drug response with repeated administration is referred to as tachyphylaxis or tolerance, though the distinction between these two is obscured through both their usage in the literature and imprecise definitions in common pharmacology texts. In this perspective, I propose that these terms be distinguished by the mechanisms underlying the attenuation of drug response. Specifically, tachyphylaxis should be reserved for attenuation that occurs in response to cellular depletion, whereas tolerance should be used to describe attenuation that arises from cellular adaptations. A framework for understanding behavioral tolerance, physiologic tolerance, and dispositional tolerance as distinct phenomena is also discussed. Using this framework, a classification of drugs exhibiting attenuation of drug response with repeated administration is presented. SIGNIFICANCE STATEMENT: Distinction between tachyphylaxis and tolerance is unclear in the literature. Nonetheless, a mechanistic basis for distinguishing these important terms has practical implications for managing or preventing attenuation of drug response with repeated administration.
Approaches to handling missing data have improved dramatically in recent years and researchers can now choose from a variety of sophisticated analysis options. The methodological literature favors maximum likelihood and multiple imputation because these approaches offer substantial improvements over older approaches, including a strong theoretical foundation, less restrictive assumptions, and the potential for bias reduction and greater power. These benefits are especially important for developmental research where attrition is a pervasive problem. This article provides a brief introduction to modern methods for handling missing data and their application to developmental research.
The past decade has seen a noticeable shift in missing data handling techniques that assume a missing at random (MAR) mechanism, where the propensity for missing data on an outcome is related to other analysis variables. Although MAR is often reasonable, there are situations where this assumption is unlikely to hold, leading to biased parameter estimates. One such example is a longitudinal study of substance use where participants with the highest frequency of use also have the highest likelihood of attrition, even after controlling for other correlates of missingness. There is a large body of literature on missing not at random (MNAR) analysis models for longitudinal data, particularly in the field of biostatistics. Because these methods allow for a relationship between the outcome variable and the propensity for missing data, they require a weaker assumption about the missing data mechanism. This article describes 2 classic MNAR modeling approaches for longitudinal data: the selection model and the pattern mixture model. To date, these models have been slow to migrate to the social sciences, in part because they required complicated custom computer programs. These models are now quite easy to estimate in popular structural equation modeling programs, particularly Mplus. The purpose of this article is to describe these MNAR modeling frameworks and to illustrate their application on a real data set. Despite their potential advantages, MNAR-based analyses are not without problems and also rely on untestable assumptions. This article offers practical advice for implementing and choosing among different longitudinal models.
Appropriately centering Level 1 predictors is vital to the interpretation of intercept and slope parameters in multilevel models (MLMs). The issue of centering has been discussed in the literature, but it is still widely misunderstood. The purpose of this article is to provide a detailed overview of grand mean centering and group mean centering in the context of 2-level MLMs. The authors begin with a basic overview of centering and explore the differences between grand and group mean centering in the context of some prototypical research questions. Empirical analyses of artificial data sets are used to illustrate key points throughout. The article provides a number of practical recommendations designed to facilitate centering decisions in MLM applications.
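The two centering schemes named in the abstract above can be illustrated with a minimal sketch. It assumes a single Level 1 predictor and a cluster identifier. The function names, the six scores, and the two classroom labels are hypothetical and do not come from the article.

```python
from collections import defaultdict
from statistics import mean


def grand_mean_center(values):
    """Subtract the overall (grand) mean from every Level 1 score."""
    grand = mean(values)
    return [v - grand for v in values]


def group_mean_center(values, groups):
    """Subtract each cluster's own mean from the scores in that cluster."""
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    group_means = {g: mean(vs) for g, vs in by_group.items()}
    return [v - group_means[g] for v, g in zip(values, groups)]


# Hypothetical data: six students nested in two classrooms.
scores = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
classroom = ["a", "a", "a", "b", "b", "b"]

cgm = grand_mean_center(scores)             # deviations from the overall mean
cwc = group_mean_center(scores, classroom)  # deviations from each classroom mean
```

In this toy example the grand mean centered scores still differ between classrooms, whereas the group mean centered scores are identical across the two classrooms because all between-classroom mean differences have been removed.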
Missing data methodology has improved dramatically in recent years, and popular computer programs now offer a variety of sophisticated options. Despite the widespread availability of theoretically justified methods, researchers in many disciplines still rely on subpar strategies that either eliminate incomplete cases or impute the missing scores with a single set of replacement values. This article provides readers with a nontechnical overview of some key issues from the missing data literature and demonstrates several of the techniques that methodologists currently recommend. This article begins by describing Rubin's missing data mechanisms. After a brief discussion of popular ad hoc approaches, the article provides a more detailed description of five analytic approaches that have received considerable attention in the missing data literature: maximum likelihood estimation, multiple imputation, the selection model, the shared parameter model, and the pattern mixture model. Finally, a series of data analysis examples illustrate the application of these methods.
Despite the broad appeal of missing data handling approaches that assume a missing at random (MAR) mechanism (e.g., multiple imputation and maximum likelihood estimation), some very common analysis models in the behavioral science literature are known to cause bias-inducing problems for these approaches. Regression models with incomplete interactive or polynomial effects are a particularly important example because they are among the most common analyses in behavioral science research applications. In the context of single-level regression, fully Bayesian (model-based) imputation approaches have shown great promise with these popular analysis models. The purpose of this article is to extend model-based imputation to multilevel models with up to 3 levels, including functionality for mixtures of categorical and continuous variables. Computer simulation results suggest that this new approach can be quite effective when applied to multilevel models with random coefficients and interaction effects. In most scenarios that we examined, imputation-based parameter estimates were quite accurate and tracked closely with those of the complete data. The new procedure is available in the Blimp software application for macOS, Windows, and Linux, and the article includes a data analysis example illustrating its use.
Translational Abstract
Multiple imputation is a missing data handling technique that creates several copies of the incomplete data, each with different estimates of the missing values. The researcher analyzes each data set, and the resulting estimates and standard errors are averaged into a single set of results. The primary goal of this paper was to outline a novel multiple imputation approach to multilevel analyses with interactive effects. Multilevel data are exceedingly common throughout psychology and the behavioral sciences; examples of such nested data structures include children within classrooms, individuals within families, employees within workgroups, and repeated measurements within individuals, to name a few. Interactive effects are equally common and occur when the magnitude of an association between two variables is modified by a third variable. Most popular current approaches to handling multilevel missing data produce biased estimates of interactive effects, and our approach addresses this important practical problem. The study used computer simulation to create many artificial data sets with missing values, after which it imputed each data set and examined the accuracy of the resulting estimates. The computer simulation results indicated that the proposed procedure works quite well, with trivial biases in most cases. We provide a software program for macOS and Windows that implements the imputation strategy, and the paper illustrates its use.
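The pooling step described in the abstract above can be sketched in a few lines. This is a minimal sketch that assumes the standard Rubin's rules for combining results, which the abstract summarizes as averaging. Under those rules the point estimates are averaged, while the pooled standard error combines the average sampling variance with the variance of the estimates across imputed data sets. The function name and the three sets of estimates and standard errors are hypothetical, not values from the paper.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one parameter across m imputed data sets using Rubin's rules."""
    m = len(estimates)
    pooled_estimate = mean(estimates)
    # Within-imputation variance: average squared standard error.
    within = mean([se ** 2 for se in std_errors])
    # Between-imputation variance: sample variance of the m estimates.
    between = variance(estimates)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, sqrt(total)


# Hypothetical regression slope from m = 3 imputed data sets.
slopes = [0.50, 0.54, 0.46]
slope_ses = [0.10, 0.10, 0.10]

pooled_slope, pooled_se = pool_rubin(slopes, slope_ses)
```

The pooled standard error here exceeds each of the per-data-set standard errors because the between-imputation term adds the uncertainty that arises from the missing values themselves.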