When estimating causal effects using observational data, it is desirable to replicate a randomized experiment as closely as possible by obtaining treated and control groups with similar covariate distributions. This goal can often be achieved by choosing well-matched samples of the original treated and control groups, thereby reducing bias due to the covariates. Since the 1970s, work on matching methods has examined how best to choose treated and control subjects for comparison. Matching methods are gaining popularity in fields such as economics, epidemiology, medicine, and political science. However, until now the literature and related advice have been scattered across disciplines. Researchers who are interested in using matching methods, or in developing methods related to matching, do not have a single place to turn to learn about past and current research. This paper provides a structure for thinking about matching methods and guidance on their use, coalescing the existing research (both old and new) and summarizing where the literature on matching methods is now and where it should be headed.
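As a minimal illustration of the basic idea (not this paper's specific method), the sketch below performs 1:1 nearest-neighbour matching on an estimated propensity score; the data-generating model and all variable names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: one confounder x drives both treatment assignment and outcome.
n = 500
x = rng.normal(size=n)
p_treat = 1 / (1 + np.exp(-x))               # true propensity score
t = rng.binomial(1, p_treat)
y = 2.0 * t + 3.0 * x + rng.normal(size=n)   # true treatment effect = 2

def fit_logistic(x, t, iters=25):
    """Estimate propensity scores with a simple Newton-Raphson logistic fit."""
    X1 = np.column_stack([np.ones(len(x)), x])
    beta = np.zeros(X1.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X1 @ beta))
        W = p * (1 - p)
        # Newton step: beta += (X' W X)^-1 X' (t - p)
        beta += np.linalg.solve(X1.T @ (W[:, None] * X1), X1.T @ (t - p))
    return 1 / (1 + np.exp(-X1 @ beta))

ps = fit_logistic(x, t)

# 1:1 nearest-neighbour matching on the propensity score, with replacement:
# each treated unit gets the control whose score is closest.
treated = np.where(t == 1)[0]
control = np.where(t == 0)[0]
matches = control[np.argmin(np.abs(ps[treated][:, None] - ps[control][None, :]), axis=1)]

att_naive = y[t == 1].mean() - y[t == 0].mean()  # confounded comparison
att_matched = (y[treated] - y[matches]).mean()   # comparison after matching
```

On this toy data the naive difference in means absorbs the confounding through x, while the matched comparison recovers an estimate close to the true effect of 2.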
Propensity score weighting is sensitive to model misspecification and outlying weights that can unduly influence results. The authors investigated whether trimming large weights downward can improve the performance of propensity score weighting and whether the benefits of trimming differ by propensity score estimation method. In a simulation study, the authors examined the performance of weight trimming following logistic regression, classification and regression trees (CART), boosted CART, and random forests to estimate propensity score weights. Results indicate that although misspecified logistic regression propensity score models yield increased bias and standard errors, weight trimming following logistic regression can improve the accuracy and precision of final parameter estimates. In contrast, weight trimming did not improve the performance of boosted CART and random forests. The performance of boosted CART and random forests without weight trimming was similar to the best performance obtainable by weight trimmed logistic regression estimated propensity scores. While trimming may be used to optimize propensity score weights estimated using logistic regression, the optimal level of trimming is difficult to determine. These results indicate that although trimming can improve inferences in some settings, in order to consistently improve the performance of propensity score weighting, analysts should focus on the procedures leading to the generation of weights (i.e., proper specification of the propensity score model) rather than relying on ad hoc methods such as weight trimming.
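One common trimming rule, capping weights above a chosen percentile at that percentile, can be sketched as follows (a hypothetical illustration, not the authors' simulation design):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: strong selection on x produces some very large ATE weights.
n = 2000
x = rng.normal(size=n)
p = 1 / (1 + np.exp(-(0.5 + 1.5 * x)))   # true propensity score
t = rng.binomial(1, p)

# ATE inverse-probability weights: 1/p for treated units, 1/(1-p) for controls.
w = np.where(t == 1, 1 / p, 1 / (1 - p))

def trim_weights(w, pct=99):
    """Cap weights above the given percentile at that percentile (one common rule)."""
    cap = np.percentile(w, pct)
    return np.minimum(w, cap)

w_trim = trim_weights(w, pct=99)
```

The choice of `pct` is exactly the "optimal level of trimming" the abstract flags as hard to determine: too low discards real information about the design, too high leaves the outliers in place.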
There is increasing interest in estimating the causal effects of treatments using observational data. Propensity-score matching methods are frequently used to adjust for differences in observed characteristics between treated and control individuals in observational studies. Survival or time-to-event outcomes occur frequently in the medical literature, but the use of propensity score methods in survival analysis has not been thoroughly investigated. This paper compares two approaches for estimating the Average Treatment Effect (ATE) on survival outcomes: Inverse Probability of Treatment Weighting (IPTW) and full matching. The performance of these methods was compared in an extensive set of simulations that varied the extent of confounding and the amount of misspecification of the propensity score model. We found that both IPTW and full matching resulted in estimation of marginal hazard ratios with negligible bias when the ATE was the target estimand and the treatment-selection process was weak to moderate. However, when the treatment-selection process was strong, both methods resulted in biased estimation of the true marginal hazard ratio, even when the propensity score model was correctly specified. When the propensity score model was correctly specified, bias tended to be lower for full matching than for IPTW. The reasons for these biases and for the differences between the two methods appeared to be due to some extreme weights generated for each method. Both methods tended to produce more extreme weights as the magnitude of the effects of covariates on treatment selection increased. Furthermore, more extreme weights were observed for IPTW than for full matching. However, the poorer performance of both methods in the presence of a strong treatment-selection process was mitigated by the use of IPTW with restriction and full matching with a caliper restriction when the propensity score model was correctly specified.
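The link the authors describe between selection strength and extreme weights can be shown on a toy example (hypothetical data; "restriction" is sketched here as discarding units with propensity scores outside [0.05, 0.95], which is one common form, not necessarily the paper's exact rule):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical illustration: stronger treatment selection produces more extreme
# IPTW weights, and restricting the propensity score range tames them.
n = 5000
x = rng.normal(size=n)

def max_ate_weight(beta, restrict=False):
    """Largest ATE weight under selection strength beta, optionally with restriction."""
    p = 1 / (1 + np.exp(-beta * x))      # true propensity score
    t = rng.binomial(1, p)
    if restrict:
        keep = (p > 0.05) & (p < 0.95)   # drop units with extreme scores
        p, t = p[keep], t[keep]
    w = np.where(t == 1, 1 / p, 1 / (1 - p))
    return w.max()

w_weak = max_ate_weight(0.5)                      # weak selection
w_strong = max_ate_weight(2.5)                    # strong selection
w_restricted = max_ate_weight(2.5, restrict=True) # strong selection + restriction
```

With scores restricted to (0.05, 0.95), no weight can exceed 1/0.05 = 20, which is what caps the variance inflation that extreme weights otherwise cause.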
We explore the structures of protoclusters and their relationship with high-redshift clusters using the Millennium Simulation combined with a semi-analytic model. We find that protoclusters are very extended, with 90 per cent of their mass spread across ∼35 h⁻¹ Mpc comoving at z = 2 (∼30 arcmin). The ‘main halo’, which can manifest as a high-redshift cluster or group, is only a minor feature of the protocluster, containing less than 20 per cent of all protocluster galaxies at z = 2. Furthermore, many protoclusters do not contain a main halo that is massive enough to be identified as a high-redshift cluster. Protoclusters exist in a range of evolutionary states at high redshift, independent of the mass they will evolve to at z = 0. We show that the evolutionary state of a protocluster can be approximated by the mass ratio of the first and second most massive haloes within the protocluster, and the z = 0 mass of a protocluster can be estimated to within 0.2 dex accuracy if both the mass of the main halo and the evolutionary state are known. We also investigate the biases introduced by only observing star-forming protocluster members within small fields. The star formation rate required for line-emitting galaxies to be detected is typically high, which leads to the artificial loss of low-mass galaxies from the protocluster sample. This effect is stronger for observations of the centre of the protocluster, where the quenched galaxy fraction is higher. This loss of low-mass galaxies, relative to the field, distorts the size of the galaxy overdensity, which in turn can contribute to errors in predicting the z = 0 evolved mass.
Objective
To provide a tutorial for using propensity score methods with complex survey data.
Data Sources
Simulated data and the 2008 Medical Expenditure Panel Survey.
Study Design
Using simulation, we compared the following methods for estimating the treatment effect: a naïve estimate (ignoring both survey weights and propensity scores), survey weighting, propensity score methods (nearest neighbor matching, weighting, and subclassification), and propensity score methods in combination with survey weighting. Methods are compared in terms of bias and 95 percent confidence interval coverage. In Example 2, we used these methods to estimate the effect on health care spending of having a generalist versus a specialist as a usual source of care.
Principal Findings
In general, combining a propensity score method and survey weighting is necessary to achieve unbiased treatment effect estimates that are generalizable to the original survey target population.
Conclusions
Propensity score methods are an essential tool for addressing confounding in observational studies. Ignoring survey weights may lead to results that are not generalizable to the survey target population. This paper clarifies the appropriate inferences for different propensity score methods and suggests guidelines for selecting an appropriate propensity score method based on a researcher's goal.
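One way such a combination can work, multiplying the inverse-probability-of-treatment weight by the survey design weight, can be sketched on simulated data (all names and the data-generating model below are hypothetical, not the tutorial's examples):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical sketch: combine IPTW propensity weights with survey design weights
# by multiplication, so the estimate targets the survey's target population.
n = 2000
survey_w = rng.uniform(0.5, 3.0, size=n)    # design weights (assumed known)
x = rng.normal(size=n)
p = 1 / (1 + np.exp(-x))                     # true propensity score
t = rng.binomial(1, p)
y = 1.5 * t + x + rng.normal(size=n)         # true treatment effect = 1.5

iptw = np.where(t == 1, 1 / p, 1 / (1 - p))  # ATE inverse-probability weights
w = iptw * survey_w                          # combined analysis weight

def wmean(v, w):
    """Weighted mean."""
    return np.sum(w * v) / np.sum(w)

naive = y[t == 1].mean() - y[t == 0].mean()                    # ignores both weights
ate = wmean(y[t == 1], w[t == 1]) - wmean(y[t == 0], w[t == 0])  # combined weights
```

The combined weight simultaneously removes the confounding through x (the IPTW factor) and restores representativeness for the survey target population (the design-weight factor).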
Li et al. discuss multiple imputation, a flexible tool for handling missing data. Multiple imputation fills in missing values by generating plausible numbers derived from the distributions of and relationships among observed variables in the data set. Multiple imputation differs from single imputation methods because missing data are filled in many times, with many different plausible values estimated for each missing value. Using multiple plausible values provides a quantification of the uncertainty in estimating what the missing values might be, avoiding the false precision that can arise with single imputation. Multiple imputation provides accurate estimates of quantities or associations of interest, such as treatment effects in randomized trials, sample means of specific variables, and correlations between 2 variables, as well as the related variances. In doing so, it reduces the chance of false-positive or false-negative conclusions.
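A minimal sketch of the procedure, imputing one variable M times and pooling with Rubin's rules, looks like this (a simplified illustration: a proper implementation would also draw the regression coefficients from their posterior rather than reusing the point estimates):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical data: y is missing at random given the fully observed x.
n = 400
x = rng.normal(size=n)
y = 2 + x + rng.normal(size=n)               # true population mean of y is 2
miss = rng.random(n) < 1 / (1 + np.exp(-x))  # missingness depends only on x
y_obs = np.where(miss, np.nan, y)

obs = ~np.isnan(y_obs)
# Fit y ~ x on the observed cases to build the imputation model.
X = np.column_stack([np.ones(obs.sum()), x[obs]])
beta, *_ = np.linalg.lstsq(X, y_obs[obs], rcond=None)
resid_sd = np.std(y_obs[obs] - X @ beta)

M = 20
est, var = [], []
for _ in range(M):
    y_imp = y_obs.copy()
    # Draw plausible values: predicted mean plus random noise, a new draw each time.
    y_imp[~obs] = beta[0] + beta[1] * x[~obs] + rng.normal(scale=resid_sd, size=(~obs).sum())
    est.append(y_imp.mean())                 # estimate of interest: the mean of y
    var.append(y_imp.var(ddof=1) / n)        # its within-imputation variance

# Rubin's rules: pool the point estimate and combine within- and between-imputation
# variance, so the extra uncertainty from imputing shows up in the total variance.
qbar = np.mean(est)
within = np.mean(var)
between = np.var(est, ddof=1)
total_var = within + (1 + 1 / M) * between
```

The between-imputation term is exactly the "quantification of the uncertainty" the abstract describes: with a single imputation it would be invisible, and the reported variance would be falsely small.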
The incorporation of causal inference in mediation analysis has led to theoretical and methodological advancements: effect definitions with causal interpretation, clarification of assumptions required for effect identification, and an expanding array of options for effect estimation. However, the literature on these results is fast-growing and complex, which may be confusing to researchers unfamiliar with causal inference or unfamiliar with mediation. The goal of this article is to help ease the understanding and adoption of causal mediation analysis. It starts by highlighting a key difference between the causal inference and traditional approaches to mediation analysis and making a case for the need for explicit causal thinking and the causal inference approach in mediation analysis. It then explains in as-plain-as-possible language existing effect types, paying special attention to motivating these effects with different types of research questions, and using concrete examples for illustration. This presentation differentiates 2 perspectives (or purposes of analysis): the explanatory perspective (aiming to explain the total effect) and the interventional perspective (asking questions about hypothetical interventions on the exposure and mediator, or hypothetically modified exposures). For the latter perspective, the article proposes tapping into a general class of interventional effects that contains as special cases most of the usual effect types: interventional direct and indirect effects, controlled direct effects, a generalized interventional direct effect type, and also the total effect and overall effect. This general class allows flexible effect definitions which better match many research questions than the standard interventional direct and indirect effects.
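In the simplest setting, linear models with no exposure-mediator interaction and no confounding, the causal direct and indirect effects reduce to familiar regression quantities; the toy sketch below (hypothetical data, not from this article) checks that the total effect decomposes exactly into their sum:

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical linear mediation model with no exposure-mediator interaction:
# A -> M -> Y plus a direct A -> Y path, and no unmeasured confounding.
n = 10_000
a = rng.binomial(1, 0.5, size=n)             # exposure
m = 0.8 * a + rng.normal(size=n)             # mediator
y = 1.0 * a + 1.5 * m + rng.normal(size=n)   # outcome; total effect = 1 + 0.8*1.5 = 2.2

def ols(X, y):
    """OLS coefficients (intercept first)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X1, y, rcond=None)[0]

alpha = ols(a, m)[1]                          # effect of A on M
_, theta_a, theta_m = ols(np.column_stack([a, m]), y)
nde = theta_a                                 # direct effect (A -> Y holding M)
nie = alpha * theta_m                         # indirect effect through M
total = ols(a, y)[1]                          # total effect of A on Y
```

Under these linearity assumptions the decomposition `total = nde + nie` holds exactly in-sample (it is the omitted-variable identity for nested OLS fits); the causal-inference framework is what makes the effects well defined when these convenient assumptions fail, for example with interactions or intermediate confounding.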
In-person schooling has proved contentious and difficult to study throughout the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic. Data from a massive online survey in the United States indicate an increased risk of COVID-19-related outcomes among respondents living with a child attending school in person. School-based mitigation measures are associated with significant reductions in risk, particularly daily symptom screens, teacher masking, and closure of extracurricular activities. A positive association between in-person schooling and COVID-19 outcomes persists at low levels of mitigation, but when seven or more mitigation measures are reported, a significant relationship is no longer observed. Among teachers, working outside the home was associated with an increase in COVID-19-related outcomes, but this association is similar to that observed in other occupations (e.g., health care or office work). Although in-person schooling is associated with household COVID-19 risk, this risk can likely be controlled with properly implemented school-based mitigation measures.
Missing data ubiquitously occur in randomized controlled trials and may compromise the causal inference if inappropriately handled. Some problematic missing data methods such as complete case (CC) analysis and last-observation-carried-forward (LOCF) are unfortunately still common in nutrition trials. This situation is partially caused by investigator confusion on missing data assumptions for different methods. In this statistical guidance, we provide a brief introduction of missing data mechanisms and the unreasonable assumptions that underlie CC and LOCF and recommend 2 appropriate missing data methods: multiple imputation and full information maximum likelihood.
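The bias LOCF can introduce is easy to demonstrate on a toy longitudinal example (hypothetical data, assuming the outcome genuinely improves between baseline and follow-up):

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical trial arm: the outcome improves by 3 units on average between
# baseline and follow-up, but 30% of participants drop out before follow-up.
n = 1000
baseline = rng.normal(10, 2, size=n)
followup = baseline + 3 + rng.normal(size=n)   # true mean change = +3
dropout = rng.random(n) < 0.3                  # follow-up missing for dropouts

# LOCF silently assumes dropouts stay frozen at their last observed value,
# here the baseline, which drags the estimated change toward zero.
locf = np.where(dropout, baseline, followup)
change_locf = locf.mean() - baseline.mean()
change_true = followup.mean() - baseline.mean()
```

With 30% dropout, LOCF recovers only about 70% of the true mean change; multiple imputation or full information maximum likelihood would instead use the baseline-follow-up relationship among completers to fill in (or integrate over) the missing follow-up values.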