Missing data is a problem that occurs frequently in many scientific areas. The most sophisticated method for dealing with this problem is multiple imputation. Contrary to other methods, like listwise deletion, this method does not throw away information, and partly repairs the problem of systematic dropout. Although from a theoretical point of view multiple imputation is considered to be the optimal method, many applied researchers are reluctant to use it because of persistent misconceptions about this method. Instead of providing an(other) overview of missing data methods, or extensively explaining how multiple imputation works, this article aims specifically at rebutting these misconceptions, and provides applied researchers with practical arguments supporting them in the use of multiple imputation.
Whenever multiple regression is applied to a multiply imputed data set, several methods for combining significance tests for $R^2$ and the change in $R^2$ across imputed data sets may be used: the combination rules by Rubin, the Fisher $z$-test for $R^2$ by Harel, and $F$-tests for the change in $R^2$ by Chaurasia and Harel. For pooling $R^2$ itself, Harel proposed a method based on a Fisher $z$ transformation. In the current article, it is argued that the pooled $R^2$ based on the Fisher $z$ transformation, the Fisher $z$-test for $R^2$, and the $F$-test for the change in $R^2$ have some theoretical flaws. An argument is made for using Rubin's method for pooling significance tests for $R^2$ instead, and alternative procedures for pooling $R^2$ are proposed: simple averaging and a pooled $R^2$ constructed from the pooled significance test by Rubin. Simulations show that the Fisher $z$-test and Chaurasia and Harel's $F$-tests generally give inflated type-I error rates, whereas the type-I error rates of Rubin's method are correct. Of the methods for pooling the point estimates of $R^2$, no method clearly performs best, but it is argued that the average of the $R^2$ values across imputed data sets is preferred.
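As a rough illustration of two of the point-estimate pooling approaches named above, the sketch below averages $R^2$ across imputed data sets and, for comparison, pools via a Fisher $z$ transformation. It assumes the transformation is applied to $R = \sqrt{R^2}$ and back-transformed after averaging. The function names and the example values are hypothetical, and the pooled significance tests discussed in the abstract are not covered.

```python
import math
from statistics import mean


def pool_r2_average(r2_values):
    """Pool R^2 by simple averaging across imputed data sets."""
    return mean(r2_values)


def pool_r2_fisher_z(r2_values):
    """Pool R^2 through a Fisher z transformation (assumed form).

    Each R^2 is converted to R = sqrt(R^2), transformed with atanh,
    averaged, back-transformed with tanh, and squared again.
    """
    z_values = [math.atanh(math.sqrt(r2)) for r2 in r2_values]
    return math.tanh(mean(z_values)) ** 2


# Hypothetical R^2 values from five imputed data sets.
r2_per_imputation = [0.30, 0.34, 0.28, 0.33, 0.31]
print("average:", pool_r2_average(r2_per_imputation))
print("Fisher z:", pool_r2_fisher_z(r2_per_imputation))
```

With values this close together the two pooled estimates differ only slightly; the differences between the methods concern their theoretical justification and their behavior in simulations.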
In the current study a three-generational design was used to investigate intergenerational transmission of child maltreatment (ITCM) using multiple sources of information on child maltreatment: mothers, fathers and children. A total of 395 individuals from 63 families reported on maltreatment. Principal Component Analysis (PCA) was used to combine data from mother, father and child about maltreatment that the child had experienced. This established components reflecting the convergent as well as the unique reports of father, mother and child on the occurrence of maltreatment. Next, we tested ITCM using the multi-informant approach and compared the results to those of two more common approaches: ITCM based on one reporter and ITCM based on different reporters from each generation. Results of our multi-informant approach showed that a component reflecting convergence between mother, father, and child reports explained most of the variance in experienced maltreatment. For abuse, intergenerational transmission was consistently found across approaches. In contrast, intergenerational transmission of neglect was only found using the perspective of a single reporter, indicating that transmission of neglect might be driven by reporter effects. In conclusion, the present results suggest that including multiple informants may be necessary to obtain more valid estimates of ITCM.
Whenever statistical analyses are applied to multiply imputed datasets, specific formulas are needed to combine the results into one overall analysis, also called combination rules. In the context of regression analysis, combination rules for the unstandardized regression coefficients, the $t$-tests of the regression coefficients, and the $F$-tests for testing $R^2$ for significance have long been established. However, there is still no general agreement on how to combine the point estimators of $R^2$ in multiple regression applied to multiply imputed datasets. Additionally, no combination rules for standardized regression coefficients and their confidence intervals seem to have been developed at all. In the current article, two sets of combination rules for the standardized regression coefficients and their confidence intervals are proposed, and their statistical properties are discussed. Additionally, two improved point estimators of $R^2$ in multiply imputed data are proposed, which in their computation use the pooled standardized regression coefficients. Simulations show that the proposed pooled standardized coefficients produce only small bias and that their 95% confidence intervals produce coverage close to the theoretical 95%. Furthermore, the simulations show that the newly proposed pooled estimates for $R^2$ are less biased than two earlier proposed pooled estimates.
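For orientation, the long-established combination rules for an unstandardized regression coefficient can be sketched as follows, assuming they are Rubin's rules: the pooled estimate is the mean across imputations, and the total variance is the within-imputation variance plus the between-imputation variance inflated by $1 + 1/m$. The function name and the numbers are hypothetical, and the degrees of freedom needed for the pooled $t$-test are left out. The sketch does not implement the newly proposed rules for standardized coefficients.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed data sets with Rubin's rules.

    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average squared SE
    between = variance(estimates)                  # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled_estimate, math.sqrt(total)


# Hypothetical coefficient estimates and standard errors from three imputations.
estimate, std_error = pool_rubin([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
print(estimate, std_error)
```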
As a procedure for handling missing data, multiple imputation consists of estimating the missing data multiple times to create several complete versions of an incomplete data set. All these data sets are analyzed by the same statistical procedure, and the results are pooled for interpretation. So far, no explicit rules for pooling F tests of (repeated-measures) analysis of variance have been defined. In this article we outline the appropriate procedure for the results of analysis of variance (ANOVA) for multiply imputed data sets. It involves both reformulation of the ANOVA model as a regression model using effect coding of the predictors and applying already existing combination rules for regression models. The proposed procedure is illustrated using 3 example data sets. The pooled results of these 3 examples provide plausible F and p values.
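The reformulation step can be pictured with a small sketch of effect coding, in which a factor with k levels becomes k - 1 predictor columns and the reference level is coded -1 on every column. This only illustrates the coding scheme. The function name, the choice of the last sorted level as reference, and the example data are assumptions, not the article's procedure.

```python
def effect_code(levels, reference=None):
    """Effect-code a categorical factor.

    Returns the coded (non-reference) categories and one row of
    k - 1 codes per observation: 1 for the observation's own category,
    0 for the others, and -1 on every column for the reference level.
    """
    categories = sorted(set(levels))
    if reference is None:
        reference = categories[-1]  # assumption: last sorted level is the reference
    coded = [c for c in categories if c != reference]
    rows = []
    for level in levels:
        if level == reference:
            rows.append([-1] * len(coded))
        else:
            rows.append([1 if level == c else 0 for c in coded])
    return coded, rows


# Hypothetical three-group factor.
columns, design = effect_code(["a", "b", "c", "a"])
print(columns)
print(design)
```

The resulting columns would be entered as predictors in a regression fitted to each imputed data set, after which the existing combination rules for regression models apply.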
Objective
The primary aim was to assess the cost‐effectiveness of an internet‐based self‐help program, expert‐patient support, and the combination of both compared to a care‐as‐usual condition.
Method
An economic evaluation from a societal perspective was conducted alongside a randomized controlled trial. Participants aged 16 or older with at least mild eating disorder symptoms were randomly assigned to four conditions: (1) Featback, an online unguided self‐help program, (2) chat or e‐mail support from a recovered expert patient, (3) Featback with expert‐patient support, and (4) care‐as‐usual. After a baseline assessment and intervention period of 8 weeks, five online assessments were conducted over 12 months of follow‐up. The main result constituted cost‐utility acceptability curves with quality‐of‐life adjusted life years (QALYs) and societal costs over the entire study duration.
Results
No significant differences between the conditions were found regarding QALYs, health care costs and societal costs. Nonsignificant differences in QALYs were in favor of the Featback conditions and the lowest societal costs per participant were observed in the Featback only condition (€16,741) while the highest costs were seen in the care‐as‐usual condition (€28,479). The Featback only condition had the highest probability of being efficient compared to the alternatives for all acceptable willingness‐to‐pay values.
Discussion
Featback, an internet‐based unguided self‐help intervention, was likely to be efficient compared to Featback with guidance from an expert patient, guidance alone and a care‐as‐usual condition. Results suggest that scalable interventions such as Featback may reduce health care costs and help individuals with eating disorders that are currently not reached by other forms of treatment.
Public significance statement
Internet‐based interventions for eating disorders might reach individuals in society who currently do not receive appropriate treatment at low costs. Featback, an online automated self‐help program for eating disorders, was found to improve quality of life slightly while reducing costs for society, compared to a do‐nothing approach. Consequently, implementing internet‐based interventions such as Featback likely benefits both individuals suffering from an eating disorder and society as a whole.
Objective
Many individuals with an eating disorder do not receive appropriate care. Low‐threshold interventions could help bridge this treatment gap. The study aim was to evaluate the effectiveness of Featback, a fully automated online self‐help intervention, online expert‐patient support and their combination.
Method
A randomized controlled trial with a 12‐month follow‐up period was conducted. Participants aged 16 or older with at least mild eating disorder symptoms were randomized to four conditions: (1) Featback, a fully automated online self‐help intervention, (2) chat or email support from a recovered expert patient, (3) Featback with expert‐patient support and (4) a waiting list control condition. The intervention period was 8 weeks and there was a total of six online assessments. The main outcome constituted reduction of eating disorder symptoms over time.
Results
Three hundred fifty‐five participants, of whom 43% had never received eating disorder treatment, were randomized. The three active interventions were superior to a waitlist in reducing eating disorder symptoms (d = −0.38), with no significant difference in effectiveness between the three interventions. Participants in conditions with expert‐patient support were more satisfied with the intervention.
Discussion
Internet‐based self‐help, expert‐patient support and their combination were effective in reducing eating disorder symptoms compared to a waiting list control condition. Guidance improved satisfaction with the internet intervention but not its effectiveness. Low‐threshold interventions such as Featback and expert‐patient support can reduce eating disorder symptoms and reach the large group of underserved individuals, complementing existing forms of eating disorder treatment.
Public significance statement
Individuals with eating‐related problems who received (1) a fully automated internet‐based intervention, (2) chat and e‐mail support by a recovered individual or (3) their combination, experienced stronger reductions in eating disorder symptoms than those who received (4) usual care. Such brief and easy‐access interventions play an important role in reaching individuals who are currently not reached by other forms of treatment.
The proportion of explained variance is an important statistic in multiple regression for determining how well the outcome variable is predicted by the predictors. Earlier research on 20 different estimators for the proportion of explained variance, including the exact Olkin-Pratt estimator and the Ezekiel estimator, showed that the exact Olkin-Pratt estimator produced unbiased estimates, and was recommended as a default estimator. In the current study, the same 20 estimators were studied in incomplete data, with missing data being treated using multiple imputation. In earlier research on the proportion of explained variance in multiply imputed data sets, one particular estimator was shown to be the preferred pooled estimator for regular $R^2$. For each of the 20 estimators in the current study, two pooled estimators were proposed: one where the estimator was the average across imputed data sets, and one where this preferred pooled estimator was used as input for the calculation of the specific estimator. Simulations showed that estimates based on the preferred pooled estimator performed best regarding bias and accuracy, and that the Ezekiel estimator was generally the least biased. However, none of the estimators were unbiased at all times, including the exact Olkin-Pratt estimator based on the preferred pooled estimator.
Researchers frequently have to analyze scales in which some participants have failed to respond to some items. In this paper we focus on the exploratory factor analysis of multidimensional scales (i.e., scales that consist of a number of subscales) where each subscale is made up of a number of Likert-type items, and the aim of the analysis is to estimate participants’ scores on the corresponding latent traits. Our approach uses the following steps: (1) multiple imputation creates several copies of the data, in which the missing values are imputed; (2) each copy of the data is subject to independent factor analysis, and the same number of factors is extracted from all copies; (3) all factor solutions are simultaneously orthogonally (or obliquely) rotated so that they are both (a) factorially simple, and (b) as similar to one another as possible; (4) latent trait scores are estimated for ordinal data in each copy; and (5) participants’ scores on the latent traits are estimated as the average of the estimates of the latent traits obtained in the copies. We applied the approach in a real dataset where missing responses were artificially introduced following a real pattern of non-responses and a simulation study based on artificial datasets. The results show that our approach was able to compute factor score estimates even for participants who have missing data.
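Step (5) above can be sketched as a plain average over the imputed copies. The sketch assumes that steps (1) to (4) have already been carried out, in particular that the factor solutions were rotated to be comparable so that factor f means the same thing in every copy. The function name, the nested-list layout, and the example scores are hypothetical.

```python
from statistics import mean


def average_factor_scores(scores_per_copy):
    """Average latent trait score estimates across imputed copies.

    scores_per_copy[k][i][f] is the estimated score of participant i
    on factor f in imputed copy k; all copies must have the same shape.
    """
    n_participants = len(scores_per_copy[0])
    n_factors = len(scores_per_copy[0][0])
    return [
        [mean([copy[i][f] for copy in scores_per_copy]) for f in range(n_factors)]
        for i in range(n_participants)
    ]


# Hypothetical scores: two imputed copies, two participants, two factors.
copies = [
    [[0.5, -1.0], [1.0, 0.0]],
    [[1.5, -2.0], [1.0, 1.0]],
]
print(average_factor_scores(copies))
```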