In observational studies, propensity scores are commonly estimated by maximum likelihood but may fail to balance high-dimensional pretreatment covariates even after specification search. We introduce a general framework that unifies and generalizes several recent proposals to improve covariate balance when designing an observational study. Instead of the likelihood function, we propose to optimize special loss functions, covariate balancing scoring rules (CBSR), to estimate the propensity score. A CBSR is uniquely determined by the link function in the GLM and the estimand (a weighted average treatment effect). We show that CBSR loses no asymptotic efficiency in estimating the weighted average treatment effect compared to the Bernoulli likelihood, but is much more robust in finite samples. Borrowing tools developed in statistical learning, we propose practical strategies to balance covariate functions in rich function classes. This is useful for estimating the maximum bias of inverse probability weighting (IPW) estimators and constructing honest confidence intervals in finite samples. Lastly, we provide several numerical examples to demonstrate the bias-variance tradeoff in IPW-type estimators and the tradeoff in balancing different function classes of the covariates.
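As a minimal illustration of the weighting estimators this abstract discusses (the standard Hajek-type IPW estimate, not the paper's CBSR method itself), the weighted average treatment effect can be computed from estimated propensity scores as follows; the function name and interface are our own:

```python
import numpy as np

def ipw_ate(y, t, e):
    """Hajek (normalized) IPW estimate of the average treatment effect.

    y: outcomes, t: binary treatment indicator, e: estimated propensity scores.
    """
    y, t, e = map(np.asarray, (y, t, e))
    w1 = t / e            # weights for treated units
    w0 = (1 - t) / (1 - e)  # weights for control units
    # Normalizing by the weight sums is typically preferred in finite samples.
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)
```

The finite-sample robustness issue the abstract raises arises when some estimated `e` is close to 0 or 1, which makes these weights explode.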
Mendelian randomization (MR) is a method of exploiting genetic variation to unbiasedly estimate a causal effect in the presence of unmeasured confounding. MR is widely used in epidemiology and other related areas of population science. In this paper, we study statistical inference in the increasingly popular two-sample summary-data MR design. We show that a linear model for the observed associations approximately holds in a wide variety of settings when all the genetic variants satisfy the exclusion restriction assumption, or, in genetic terms, when there is no pleiotropy. In this scenario, we derive a maximum profile likelihood estimator with provable consistency and asymptotic normality. However, through analyzing real datasets, we find strong evidence of both systematic and idiosyncratic pleiotropy in MR, echoing the omnigenic model of complex traits recently proposed in genetics. We model the systematic pleiotropy by a random-effects model in which no genetic variant satisfies the exclusion restriction condition exactly. In this case, we propose a consistent and asymptotically normal estimator by adjusting the profile score. We then tackle the idiosyncratic pleiotropy by robustifying the adjusted profile score. We demonstrate the robustness and efficiency of the proposed methods using several simulated and real datasets.
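For context, the textbook baseline that the paper's profile-likelihood estimator improves upon is the inverse-variance weighted (IVW) estimate: a no-intercept weighted regression of the SNP-outcome associations on the SNP-exposure associations. A sketch (this is the standard IVW method, not the paper's adjusted profile score; names are ours):

```python
import numpy as np

def ivw_estimate(gamma, Gamma, se_Gamma):
    """IVW estimate of the causal effect in two-sample summary-data MR.

    gamma: SNP-exposure associations, Gamma: SNP-outcome associations,
    se_Gamma: standard errors of Gamma. Weighted regression of Gamma on
    gamma with weights 1/se^2 and no intercept.
    """
    gamma, Gamma = np.asarray(gamma), np.asarray(Gamma)
    w = 1.0 / np.asarray(se_Gamma) ** 2
    beta = np.sum(w * gamma * Gamma) / np.sum(w * gamma ** 2)
    se = np.sqrt(1.0 / np.sum(w * gamma ** 2))
    return beta, se
```

IVW ignores the measurement error in `gamma` (weak instruments) and any pleiotropy, which is precisely what motivates the profile-likelihood approach described above.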
Entropy Balancing is Doubly Robust
Zhao, Qingyuan; Percival, Daniel
Journal of Causal Inference, Volume 5, Issue 1, 9/2017
Journal Article, Peer reviewed, Open access
Covariate balance is a conventional key diagnostic for methods that estimate causal effects from observational studies. Recently, there has been emerging interest in directly incorporating covariate balance into the estimation. We study a recently proposed entropy maximization method called Entropy Balancing (EB), which exactly matches the covariate moments of the different experimental groups in its optimization problem. We show that EB is doubly robust with respect to linear outcome regression and logistic propensity score regression, and that it reaches the asymptotic semiparametric variance bound when both regressions are correctly specified. This is surprising to us because the original proposal of EB makes no attempt to model the outcome or the treatment assignment. Our theoretical results and simulations suggest that EB is a very appealing alternative to conventional weighting estimators that estimate the propensity score by maximum likelihood.
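The optimization problem the abstract describes (minimum-entropy weights subject to exact moment matching) has a convex dual with a closed-form weight map, which makes a compact sketch possible. A minimal version, assuming a single treated-mean target (function names are ours; Hainmueller's original formulation also handles base weights and multiple constraints):

```python
import numpy as np
from scipy.optimize import minimize

def entropy_balance(X_control, target_means):
    """Entropy Balancing: control-unit weights of minimum KL divergence
    from uniform whose weighted covariate means match target_means exactly.

    Solved via the convex dual: the optimal weights are proportional to
    exp(theta' x_i), with theta minimizing logsumexp(X theta) - theta' m.
    """
    X = np.asarray(X_control, float)
    m = np.asarray(target_means, float)

    def dual(theta):
        z = X @ theta
        zmax = z.max()  # stabilize the log-sum-exp
        return zmax + np.log(np.exp(z - zmax).sum()) - theta @ m

    theta = minimize(dual, np.zeros(X.shape[1]), method="BFGS").x
    w = np.exp(X @ theta)
    return w / w.sum()
```

At the dual optimum the gradient, which equals the weighted covariate mean minus the target, is zero, so the moment constraints hold automatically; this exponential-tilting form is also where the implicit logistic propensity model in the double-robustness result comes from.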
In randomized clinical trials, adjustment for baseline covariates at both the design and analysis stages is highly encouraged by regulatory agencies. A recent trend is to use a model-assisted approach for covariate adjustment to gain credibility and efficiency while producing asymptotically valid inference even when the model is incorrect. In this article we present three considerations for better practice when model-assisted inference is applied to adjust for covariates under simple or covariate-adaptive randomized trials: (a) guaranteed efficiency gain: a model-assisted method should often gain but never hurt efficiency; (b) wide applicability: a valid procedure should be applicable, and preferably universally applicable, to all commonly used randomization schemes; (c) robust standard error: variance estimation should be robust to model misspecification and heteroscedasticity. To achieve these goals, we recommend a model-assisted estimator under an analysis-of-heterogeneous-covariance working model that includes all covariates used in randomization. Our conclusions are based on an asymptotic theory that provides a clear picture of how covariate-adaptive randomization and regression adjustment alter statistical efficiency. Our theory is more general than existing ones in terms of studying arbitrary functions of response means (including linear contrasts, ratios, and odds ratios), multiple arms, guaranteed efficiency gain, optimality, and universal applicability.
Supplementary materials for this article are available online.
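The heterogeneous-covariance working model described above can be sketched, for two arms, as a regression of the outcome on treatment, centered covariates, and their interactions (Lin's 2013 fully interacted estimator, which the recommended procedure builds on; the implementation below is our simplification and omits the covariate-adaptive-randomization variance details):

```python
import numpy as np

def ancova_het(y, t, X):
    """ATE estimate from a working model with heterogeneous covariance:
    regress y on treatment, centered covariates, and their interactions.

    y: outcomes, t: binary treatment indicator, X: baseline covariates.
    Returns the coefficient on treatment, which estimates the ATE.
    """
    y, t = np.asarray(y, float), np.asarray(t, float)
    Xc = np.asarray(X, float)
    Xc = Xc - Xc.mean(axis=0)  # centering makes the t-coefficient the ATE
    D = np.column_stack([np.ones_like(t), t, Xc, t[:, None] * Xc])
    beta = np.linalg.lstsq(D, y, rcond=None)[0]
    return beta[1]
```

The interaction terms are what deliver the "never hurt efficiency" guarantee relative to the unadjusted difference in means.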
The fields of machine learning and causal inference have developed many concepts, tools, and theories that are potentially useful for each other. Through exploring the possibility of extracting causal interpretations from black-box machine-trained models, we briefly review the languages and concepts in causal inference that may be interesting to machine learning researchers. We start with the curious observation that Friedman's partial dependence plot has exactly the same formula as Pearl's back-door adjustment, and discuss three requirements for making causal interpretations: a model with good predictive performance, some domain knowledge in the form of a causal diagram, and suitable visualization tools. We provide several illustrative examples and find some interesting and potentially causal relations using visualization tools for black-box models.
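The observation at the core of this abstract is concrete: Friedman's partial dependence fixes one feature at a grid value and averages the model's prediction over the empirical distribution of the remaining features, which is exactly the back-door adjustment formula E_z[E(Y | x, z)] when those remaining features block all back-door paths. A generic sketch (the helper name is ours):

```python
import numpy as np

def partial_dependence(model_predict, X, feature, grid):
    """Friedman's partial dependence for one feature.

    For each value v in `grid`, set column `feature` of every row of X
    to v and average the model's predictions -- the same formula as
    Pearl's back-door adjustment over the empirical distribution of the
    other covariates.
    """
    X = np.asarray(X, float)
    out = []
    for v in grid:
        Xv = X.copy()
        Xv[:, feature] = v
        out.append(np.mean(model_predict(Xv)))
    return np.array(out)
```

Whether the resulting curve has a causal reading depends on the two non-computational requirements above: the causal diagram and the model's predictive quality.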
Over a decade of genome-wide association studies (GWAS) has led to the finding of extreme polygenicity of complex traits. The phenomenon that "all genes affect every complex trait" complicates Mendelian randomization (MR) studies, where natural genetic variations are used as instruments to infer the causal effect of heritable risk factors. We reexamine the assumptions of existing MR methods and show how they need to be clarified to allow for pervasive horizontal pleiotropy and heterogeneous effect sizes. We propose a comprehensive framework, GRAPPLE, to analyze the causal effect of target risk factors with heterogeneous genetic instruments and to identify possible pleiotropic patterns from data. Using GWAS summary statistics, GRAPPLE can efficiently use both strong and weak genetic instruments, detect the existence of multiple pleiotropic pathways, determine the causal direction, and perform multivariable MR to adjust for confounding risk factors. With GRAPPLE, we analyze the effect of blood lipids, body mass index, and systolic blood pressure on 25 disease outcomes, gaining new information on their causal relationships and the potential pleiotropic pathways involved.
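The multivariable MR step mentioned above has a simple baseline form (this is the standard multivariable IVW regression, not GRAPPLE's robust profile-likelihood machinery; the function name is ours): regress the SNP-outcome associations jointly on the matrix of SNP-exposure associations, so each exposure's effect is estimated while adjusting for the others.

```python
import numpy as np

def mv_ivw(G, Gamma, se_Gamma):
    """Multivariable IVW MR: weighted least squares of SNP-outcome
    associations Gamma on the n_snps x n_exposures matrix G of
    SNP-exposure associations, weights 1/se^2, no intercept.
    """
    G, Gamma = np.asarray(G, float), np.asarray(Gamma, float)
    w = 1.0 / np.asarray(se_Gamma, float) ** 2
    A = G * w[:, None]
    # Solve (G' W G) beta = G' W Gamma
    return np.linalg.solve(G.T @ A, A.T @ Gamma)
```

As in the single-exposure case, this baseline assumes no pleiotropy and strong instruments; relaxing both is what the framework described in the abstract is for.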
This article proposes a new quantity called the "sensitivity value," defined as the minimum strength of unmeasured confounders needed to change the qualitative conclusions of a naive analysis that assumes no unmeasured confounders. We establish the asymptotic normality of the sensitivity value in pair-matched observational studies. The theoretical results are then used to approximate the power of a sensitivity analysis and to select the design of a study. We explore the potential to use sensitivity values to screen multiple hypotheses in the presence of unmeasured confounding using a microarray dataset. Supplementary materials for this article are available online.
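To make the definition concrete, here is the sensitivity value computed for the simplest case, Rosenbaum's sensitivity analysis with a sign test in a pair-matched study (the paper treats general signed-score statistics; this sketch, including the grid search, is our simplification). Under bias at most Gamma, the worst-case one-sided p-value is a binomial tail with success probability Gamma/(1+Gamma); the sensitivity value is the smallest Gamma at which that bound crosses alpha.

```python
import numpy as np
from scipy.stats import binom

def sensitivity_value(n_pairs, n_positive, alpha=0.05):
    """Smallest Gamma at which the worst-case sign-test p-value exceeds
    alpha in a pair-matched study with n_pairs discordant pairs, of which
    n_positive favor the treated unit.
    """
    for gamma in np.arange(1.0, 20.0, 0.01):
        # Worst-case upper bound on the one-sided p-value at this Gamma
        p_upper = binom.sf(n_positive - 1, n_pairs, gamma / (1 + gamma))
        if p_upper > alpha:
            return gamma
    return np.inf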
Thin superconducting films form a unique platform for geometrically confined, strongly interacting electrons. They allow an inherent competition between disorder and superconductivity, which in turn enables the intriguing superconducting-to-insulating transition and is believed to facilitate the comprehension of high-T_c superconductivity. Furthermore, understanding thin-film superconductivity is technologically essential, e.g., for photodetectors and quantum computers. Consequently, the absence of established universal relationships between critical temperature (T_c), film thickness (d), and sheet resistance (R_s) hinders both our understanding of the onset of superconductivity and the development of miniaturized superconducting devices. We report that in thin films, superconductivity scales as dT_c(R_s); that is, the product of thickness and critical temperature is a universal function of sheet resistance. We demonstrated this scaling by analyzing data published over the past 46 years for different materials (and provide this database for further analysis). Moreover, we experimentally confirmed the discovered scaling for NbN films, quantified it with a power law, explored its possible origin, and demonstrated its usefulness for nanometer-length-scale superconducting film-based devices.
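The power-law quantification mentioned in the abstract amounts to fitting d*T_c = A * R_s^(-alpha), which becomes a straight line in log-log coordinates. A plausible fitting sketch (the paper's actual fitting procedure and parameter values may differ; names are ours):

```python
import numpy as np

def fit_power_law(Rs, dTc):
    """Fit d*Tc = A * Rs**(-alpha) by least squares in log-log space.

    Rs: sheet resistances, dTc: products of thickness and critical
    temperature. Returns (A, alpha).
    """
    slope, intercept = np.polyfit(np.log(Rs), np.log(dTc), 1)
    return np.exp(intercept), -slope
```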
In weakly cemented reservoirs or coal-bed methane reservoirs, the conductivity of hydraulic fractures always declines after a period of production, which greatly influences gas production. In this paper, a comprehensive model considering fine-grained particle migration and proppant embedment is proposed to give a precise prediction of conductivity decline. An experiment was then conducted to simulate this process. A published experiment using coal fines was also tested and simulated. The results indicate that both fine-grained particle migration and proppant embedment have a strong negative effect on the conductivity of fractures in weakly cemented sandstone and coal-bed methane reservoirs. The proposed formulation matches the experimental data well and can be widely used to predict conductivity decline in weakly cemented sandstone and coal-bed methane reservoirs. To examine the factors influencing the filtration coefficient in the particle transport model, a porous-media network model was established based on the theoretical model. The simulation results show that the filtration coefficient increases with increasing particle size and/or throat size, and increases with decreasing fluid velocity. At the same time, it was found that large throats do not easily cause particle retention, while large particles tend to cause retention.
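For orientation, the filtration coefficient discussed above plays the role of the decay rate in the classical deep-bed filtration balance dC/dx = -lam * C, giving an exponential decline of suspended particle concentration along the flow path (a textbook model, not the paper's full network model; the function name is ours):

```python
import numpy as np

def suspended_concentration(C0, lam, x):
    """Classical deep-bed filtration profile C(x) = C0 * exp(-lam * x).

    C0: inlet particle concentration, lam: filtration coefficient,
    x: distance along the flow path.
    """
    return C0 * np.exp(-lam * np.asarray(x, float))
```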