Despite the broad appeal of missing data handling approaches that assume a missing at random (MAR) mechanism (e.g., multiple imputation and maximum likelihood estimation), some very common analysis models in the behavioral science literature are known to cause bias-inducing problems for these approaches. Regression models with incomplete interactive or polynomial effects are a particularly important example because they are among the most common analyses in behavioral science research applications. In the context of single-level regression, fully Bayesian (model-based) imputation approaches have shown great promise with these popular analysis models. The purpose of this article is to extend model-based imputation to multilevel models with up to 3 levels, including functionality for mixtures of categorical and continuous variables. Computer simulation results suggest that this new approach can be quite effective when applied to multilevel models with random coefficients and interaction effects. In most scenarios that we examined, imputation-based parameter estimates were quite accurate and tracked closely with those of the complete data. The new procedure is available in the Blimp software application for macOS, Windows, and Linux, and the article includes a data analysis example illustrating its use.
Translational Abstract
Multiple imputation is a missing data handling technique that creates several copies of the incomplete data, each with different estimates of the missing values. The researcher analyzes each data set, and the resulting estimates and standard errors are averaged into a single set of results. The primary goal of this paper was to outline a novel multiple imputation approach to multilevel analyses with interactive effects. Multilevel data are exceedingly common throughout psychology and the behavioral sciences. Examples of such nested data structures include children within classrooms, individuals within families, employees within workgroups, and repeated measurements within individuals, to name a few. Interactive effects are equally common and occur when the magnitude of an association between two variables is modified by a third variable. Most popular current approaches to handling multilevel missing data produce biased estimates of interactive effects, and our approach addresses this important practical problem. The study used computer simulation to create many artificial data sets with missing values, after which it imputed each data set and examined the accuracy of the resulting estimates. The computer simulation results indicated that the proposed procedure works quite well, with trivial biases in most cases. We provide a software program for MacOS and Windows that implements the imputation strategy, and the paper illustrates its use.
Abstract
Specialized imputation routines for multilevel data are widely available in software packages, but these methods are generally not equipped to handle a wide range of complexities that are typical of behavioral science data. In particular, existing imputation schemes differ in their ability to handle random slopes, categorical variables, differential relations at Level-1 and Level-2, and incomplete Level-2 variables. Given the limitations of existing imputation tools, the purpose of this manuscript is to describe a flexible imputation approach that can accommodate a diverse set of 2-level analysis problems that includes any of the aforementioned features. The procedure employs a fully conditional specification (also known as chained equations) approach with a latent variable formulation for handling incomplete categorical variables. Computer simulations suggest that the proposed procedure works quite well, with trivial biases in most cases. We provide a software program that implements the imputation strategy, and we use an artificial data set to illustrate its use.
Translational Abstract
Multiple imputation is a missing data handling technique that creates several copies of the incomplete data, each with different estimates of the missing values. The researcher analyzes each data set, and the resulting estimates and standard errors are averaged into a single set of results. The primary goal of this article was to outline a novel multiple imputation approach to multilevel data sets and examine its performance. Multilevel data are exceedingly common throughout psychology and the behavioral sciences. Examples of such nested data structures include children within classrooms, individuals within families, employees within workgroups, and repeated measurements within individuals, to name a few. Current approaches to handling multilevel missing data have limitations, and our approach addresses practical problems that are common in applied research (e.g., incomplete categorical variables, complex model structures). The study used computer simulation to create many artificial data sets with missing values, after which it imputed each data set and examined the accuracy of the resulting estimates. The computer simulation results indicated that the proposed procedure works quite well, with trivial biases in most cases. We provide a software program for MacOS and Windows that implements the imputation strategy, and the paper illustrates its use.
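The pooling step described above can be sketched in a few lines. This is a minimal illustration that assumes the conventional combining rules (Rubin's rules), under which the point estimates are averaged while the pooled standard error combines the average within-imputation variance with the between-imputation variance; the abstract itself does not spell out the rule, and the function name `pool_rubin` and the input numbers are hypothetical.

```python
import math


def pool_rubin(estimates, std_errors):
    """Pool one parameter across m imputed data sets (Rubin's rules).

    estimates  : point estimates, one per imputed data set
    std_errors : matching standard errors, one per imputed data set
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two imputations and matching lengths")

    # Pooled point estimate: simple average across imputations.
    q_bar = sum(estimates) / m

    # Within-imputation variance: average of the squared standard errors.
    within = sum(se ** 2 for se in std_errors) / m

    # Between-imputation variance: sample variance of the estimates.
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)

    # Total variance inflates the between part by (1 + 1/m).
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, math.sqrt(total)


if __name__ == "__main__":
    # Hypothetical regression slopes from three imputed data sets.
    est, se = pool_rubin([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
    print(f"pooled estimate = {est:.3f}, pooled SE = {se:.3f}")
```

The pooled standard error is larger than any single-imputation standard error whenever the estimates differ across data sets, which is how the procedure carries missing-data uncertainty into the final inference.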
Multiple imputation has enjoyed widespread use in social science applications, yet the application of imputation-based inference to structural equation modeling has received virtually no attention in the literature. Thus, this study has two overarching goals: evaluate the application of Meng and Rubin's (1992) pooling procedure for likelihood ratio statistics to the SEM test of model fit, and explore the possibility of using this test statistic to define imputation-based versions of common fit indices such as the TLI, CFI, and RMSEA. Computer simulation results suggested that, when applied to a correctly specified model, the pooled likelihood ratio statistic performed well as a global test of model fit and was closely calibrated to the corresponding full information maximum likelihood (FIML) test statistic. However, when applied to misspecified models with high rates of missingness (30%-40%), the imputation-based test statistic generally exhibited lower power than that of FIML. Using the pooled test statistic to construct imputation-based versions of the TLI, CFI, and RMSEA worked well and produced indices that were well-calibrated with those of full information maximum likelihood estimation. This article gives Mplus and R code to implement the pooled test statistic, and it offers a number of recommendations for future research.
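For orientation, the fit indices named above are functions of a model chi-square and a baseline (null model) chi-square. The sketch below uses the standard textbook definitions and is not the article's Mplus or R code; how the pooled likelihood ratio statistic is converted to a chi-square-scaled value is not described in the abstract and is not attempted here. The function name `fit_indices`, the use of N - 1 in the RMSEA denominator (some programs use N), and all input values are assumptions for illustration.

```python
import math


def fit_indices(chi2_model, df_model, chi2_base, df_base, n):
    """Compute CFI, TLI, and RMSEA from chi-square statistics.

    chi2_model, df_model : test statistic and df of the fitted model
    chi2_base, df_base   : test statistic and df of the baseline model
    n                    : sample size
    Assumes df_model > 0, df_base > 0, n > 1, and chi2_base / df_base != 1.
    """
    # Noncentrality estimates, truncated at zero.
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, 0.0)

    # CFI: relative reduction in noncentrality versus the baseline model.
    denom = max(d_model, d_base)
    cfi = 1.0 if denom == 0.0 else 1.0 - d_model / denom

    # TLI: compares chi-square/df ratios of baseline and fitted models.
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    tli = (ratio_base - ratio_model) / (ratio_base - 1.0)

    # RMSEA: noncentrality per degree of freedom, per observation.
    rmsea = math.sqrt(d_model / (df_model * (n - 1)))
    return cfi, tli, rmsea


if __name__ == "__main__":
    # Hypothetical statistics, not results from the article.
    cfi, tli, rmsea = fit_indices(30.0, 20, 500.0, 25, 201)
    print(f"CFI = {cfi:.3f}, TLI = {tli:.3f}, RMSEA = {rmsea:.3f}")
```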
Variation in prey resources influences the diet and behaviour of predators. When prey become limiting, predators may travel farther to find preferred food or adjust to existing local resources. When predators are habitat limited, local resource abundance impacts foraging success. We analysed the diet of Myotis lucifugus (little brown bats) from Nova Scotia (eastern Canada) to the Northwest Territories (north‐western Canada). This distribution includes extremes of season length and temperature and encompasses colonies on rural monoculture farms, and in urban and unmodified areas. We recognized nearly 600 distinct species of prey, of which ≈30% could be identified using reference sequence libraries. We found a higher than expected use of lepidopterans, which comprised a range of dietary richness from ≈35% early in the summer to ≈55% by late summer. Diptera were the second largest prey group consumed, representing ≈45% of dietary diversity early in the summer. We observed extreme local dietary variability and variation among seasons and years. Based on the species of insects that were consumed, we observed that two locations support prey species with extremely low pollution and acidification tolerances, suggesting that these are areas without environmental contamination. We conclude that there is significant local population variability in little brown bat diet that is likely driven by seasonal and geographical changes in insect diversity, and that this prey may be a good indicator of environment quality.
White-nose syndrome (WNS) is a new disease of bats that has devastated populations in eastern North America. Infection with the fungus, Geomyces destructans, is thought to increase the time bats spend out of torpor during hibernation, leading to starvation. Little is known about hibernation in healthy, free-ranging bats and more data are needed to help predict consequences of WNS. Trade-offs presumably exist between the energetic benefits and physiological/ecological costs of torpor, leading to the prediction that the relative importance of spring energy reserves should affect an individual's use of torpor and depletion of energy reserves during winter. Myotis lucifugus mate during fall and winter but females do not become pregnant until after spring emergence. Thus, female reproductive success depends on spring fat reserves while male reproductive success does not. Consequently, females should be "thrifty" in their use of fat compared to males. We measured body condition index (BCI; mass/forearm length) of 432 M. lucifugus in Manitoba, Canada during the winter of 2009/2010. Bats were captured during the fall mating period (n = 200), early hibernation (n = 125), and late hibernation (n = 128). Adult females entered hibernation with greater fat reserves and consumed those reserves more slowly than adult males and young of the year. Consequently, adult females may be more likely than males or young of the year to survive the disruption of energy balance associated with WNS, although surviving females may not have sufficient reserves to support reproduction.
Bats are the natural reservoirs of a number of high-impact viral zoonoses. We present a quantitative analysis to address the hypothesis that bats are unique in their propensity to host zoonotic viruses based on a comparison with rodents, another important host order. We found that bats indeed host more zoonotic viruses per species than rodents, and we identified life-history and ecological factors that promote zoonotic viral richness. More zoonotic viruses are hosted by species whose distributions overlap with a greater number of other species in the same taxonomic order (sympatry). Specifically in bats, there was evidence for increased zoonotic viral richness in species with smaller litters (one young), greater longevity and more litters per year. Furthermore, our results point to a new hypothesis to explain in part why bats host more zoonotic viruses per species: the stronger effect of sympatry in bats and more viruses shared between bat species suggests that interspecific transmission is more prevalent among bats than among rodents. Although bats host more zoonotic viruses per species, the total number of zoonotic viruses identified in bats (61) was lower than in rodents (68), a result of there being approximately twice the number of rodent species as bat species. Therefore, rodents should still be a serious concern as reservoirs of emerging viruses. These findings shed light on disease emergence and perpetuation mechanisms and may help lead to a predictive framework for identifying future emerging infectious virus reservoirs.
This paper summarizes recent methodologic advances related to missing data and provides an overview of two "modern" analytic options, direct maximum likelihood (DML) estimation and multiple imputation (MI). The paper begins with an overview of missing data theory, as explicated by Rubin. Brief descriptions of traditional missing data techniques are given, and DML and MI are outlined in greater detail; special attention is given to an "inclusive" analytic strategy that incorporates auxiliary variables into the analytic model. The paper concludes with an illustrative analysis using an artificial quality of life data set. Computer code for all DML and MI analyses is provided, and the inclusion of auxiliary variables is illustrated.
Background Abdominal aortic aneurysm (AAA) disease is an insidious condition with an 85% chance of death after rupture. Ultrasound screening can reduce mortality, but its use is advocated only for a limited subset of the population at risk. Methods We used data from a retrospective cohort of 3.1 million patients who completed a medical and lifestyle questionnaire and were evaluated by ultrasound imaging for the presence of AAA by Life Line Screening in 2003 to 2008. Risk factors associated with AAA were identified using multivariable logistic regression analysis. Results We observed a positive association with increasing years of smoking and cigarettes smoked and a negative association with smoking cessation. Excess weight was associated with increased risk, whereas exercise and consumption of nuts, vegetables, and fruits were associated with reduced risk. Blacks, Hispanics, and Asians had lower risk of AAA than whites and Native Americans. Well-known risk factors were reaffirmed, including male gender, age, family history, and cardiovascular disease. A predictive scoring system was created that identifies aneurysms more efficiently than current criteria and includes women, nonsmokers, and individuals aged <65 years. Using this model on national statistics of risk factors prevalence, we estimated 1.1 million AAAs in the United States, of which 569,000 are among women, nonsmokers, and individuals aged <65 years. Conclusions Smoking cessation and a healthy lifestyle are associated with lower risk of AAA. We estimated that about half of the patients with AAA disease are not eligible for screening under current guidelines. We have created a high-yield screening algorithm that expands the target population for screening by including at-risk individuals not identified with existing screening criteria.
Protein glycosylation, one of the most heterogeneous post-translational modifications, can play a major role in cellular signal transduction and disease progression. Traditional mass spectrometry (MS)-based large-scale glycoprotein sequencing studies heavily rely on identifying enzymatically released glycans and their original peptide backbone separately, as there is no efficient fragmentation method to produce unbiased glycan and peptide product ions simultaneously in a single spectrum, and that can be conveniently applied to high throughput glycoproteome characterization, especially for N-glycopeptides, which can have much more branched glycan side chains than relatively less complex O-linked glycans. In this study, a redefined electron-transfer/higher-energy collision dissociation (EThcD) fragmentation scheme is applied to incorporate both glycan and peptide fragments in one single spectrum, enabling complete information to be gathered and great microheterogeneity details to be revealed. Fetuin was first utilized to prove the applicability with 19 glycopeptides and corresponding five glycosylation sites identified. Subsequent experiments tested its utility for human plasma N-glycoproteins. Large-scale studies explored N-glycoproteomics in rat carotid arteries over the course of restenosis progression to investigate the potential role of glycosylation. The integrated fragmentation scheme provides a powerful tool for the analysis of intact N-glycopeptides and N-glycoproteomics. We also anticipate this approach can be readily applied to large-scale O-glycoproteome characterization.
A great deal of recent methodological research has focused on two modern missing data analysis methods: maximum likelihood and multiple imputation. These approaches have advantages over traditional techniques (e.g., deletion and mean imputation techniques) because they require less stringent assumptions and mitigate the pitfalls of traditional techniques. This article explains the theoretical underpinnings of missing data analyses, gives an overview of traditional missing data techniques, and provides accessible descriptions of maximum likelihood and multiple imputation. In particular, this article focuses on maximum likelihood estimation and presents two analysis examples from the Longitudinal Study of American Youth data. One of these examples includes a description of the use of auxiliary variables. Finally, the paper illustrates ways that researchers can use intentional, or planned, missing data to enhance their research designs.