Oral

Causal Inference 1

Presenter: Scott Coggeshall

When: Monday, July 11, 2016      Time: 2:00 PM - 3:30 PM

Room: Salon C Carson Hall (Level 2)

Session Synopsis:

Causal Inference in Randomized Experiments with Partial Compliance

Randomized clinical trials (RCTs) are a powerful tool in causal inference for assessing the effect of an intervention. However, RCTs frequently encounter the problem of non-compliance with treatment assignment. Non-compliance does not affect inference for the intent-to-treat effect, but it complicates causal inference on the effect of actually receiving the intervention. When estimates and inference for the effect of actually receiving the treatment are desired, information about compliance behavior must be taken into account in the analysis. In the case of binary non-compliance, well-established frameworks exist for both Bayesian and frequentist inference. However, for interventions that consist of treatment regimens rather than one-time interventions, compliance with treatment assignment may be better modeled on a continuous scale. As an example, intensive behavioral therapy interventions are currently being studied as a way to improve outcomes in children with autism. These interventions call for children to receive a large number of hours of therapy per week for an extended period of time. However, the actual amount of therapy received will typically vary and may fall short of the amount indicated in the study protocol. For interventions such as these, so-called partial compliance methods may be more appropriate. Existing frequentist approaches to modeling partial compliance have focused on two-stage maximum-likelihood (ML) methods. Although computationally simple, these methods may produce invalid standard error estimates. To address this, we propose an approach for causal inference in RCTs with partial compliance through a full-ML method based on a weighted Expectation-Maximization (EM) algorithm. Through simulations, we compare the performance of this method to the two-stage ML method. We demonstrate that the weighted EM approach improves on the two-stage ML approach, with lower mean-squared error and better confidence interval coverage. We then use this approach to analyze data from an RCT of an intensive behavioral therapy intervention for children with autism.
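The abstract does not spell out the EM mechanics, so the following is only a generic sketch of the weighted E-step/M-step machinery that any weighted-EM scheme builds on. The model here (a two-component Gaussian mixture with known unit variances) and all names are illustrative assumptions, not the compliance model of the talk.

```python
import numpy as np

rng = np.random.default_rng(3)
# synthetic data: 30% from N(0, 1), 70% from N(4, 1)
x = np.concatenate([rng.normal(0, 1, 300), rng.normal(4, 1, 700)])

# two-component Gaussian mixture with unit variances;
# estimate the two means and the mixing weight by EM
pi, m0, m1 = 0.5, -1.0, 1.0
for _ in range(200):
    # E-step: posterior membership weights (responsibilities)
    d0 = (1 - pi) * np.exp(-0.5 * (x - m0) ** 2)
    d1 = pi * np.exp(-0.5 * (x - m1) ** 2)
    w = d1 / (d0 + d1)
    # M-step: weighted maximum-likelihood updates
    pi = w.mean()
    m0 = np.sum((1 - w) * x) / np.sum(1 - w)
    m1 = np.sum(w * x) / np.sum(w)
```

The responsibilities `w` play the role of the weights in a weighted EM: each M-step is an ordinary ML fit in which every observation contributes fractionally to each component.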

Causal Inference 1

Presenter: Mohammad Karim

When: Monday, July 11, 2016      Time: 2:00 PM - 3:30 PM

Room: Salon C Carson Hall (Level 2)

Session Synopsis:

Estimating inverse probability weights using super learner when weight-model specification is unknown in a marginal structural Cox model context: An application to multiple sclerosis

This study is motivated by the investigation of the impact of beta-interferon treatment in delaying disability progression in subjects with multiple sclerosis from British Columbia, Canada (1995-2008) using a marginal structural Cox model (MSCM). The distribution of inverse probability weights (IPWs) can influence the estimated effects from an MSCM and their accuracy. Previous research has shown that MSCM estimates are highly sensitive to weight model misspecification. It is common practice to estimate IPWs using parametric models, such as main-effects logistic regression models. In practical applications, researchers are usually unaware of the true specification of the weight model, and the assumptions underlying such parametric models likely do not hold. Data-adaptive statistical learning methods may provide an alternative in that respect. Many statistical learning approaches are available in the literature, and which particular approach works best in a given dataset is impossible to predict. Super Learner (SL) has been proposed as a tool for selecting an optimal learner from a set of candidates using cross-validation. In this study, we evaluate the usefulness of SL in estimating IPWs in four different MSCM simulation scenarios with respect to bias, MSE, and coverage probabilities of model-based nominal 95% confidence intervals. The simulation scenarios differed from each other with respect to whether the true weight model specification deviated from linearity, additivity, or both. Our simulation shows that, in the presence of weight model misspecification, with a rich and diverse set of candidate algorithms, SL can offer a better alternative than a logistic regression model or other individual statistical learning approaches in terms of MSE of the estimated effect in the MSCM context. The findings from the simulation studies guided the fitting of the MSCM in the British Columbia multiple sclerosis cohort data.
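As a rough sketch of the idea, the snippet below implements a *discrete* super learner for a treatment (weight) model: pick the candidate with the lowest cross-validated loss, then form stabilized inverse probability weights. The candidate set, data-generating model, and truncation level are my assumptions for illustration; the full Super Learner uses a cross-validated convex combination of candidates rather than a single pick, and the paper's setting is a time-varying Cox model, not this point-treatment toy.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=(n, 3))
# true treatment model is neither linear nor additive in x
p_true = 1 / (1 + np.exp(-(x[:, 0] * x[:, 1] + x[:, 2] ** 2 - 1)))
a = rng.binomial(1, p_true)

# candidate learners for the weight model
candidates = {
    "logit": LogisticRegression(max_iter=1000),
    "forest": RandomForestClassifier(n_estimators=200, random_state=0),
}
# discrete super learner: choose the candidate with the lowest
# 5-fold cross-validated log loss
cv_loss = {
    name: log_loss(a, cross_val_predict(mdl, x, a, cv=5, method="predict_proba")[:, 1])
    for name, mdl in candidates.items()
}
best = min(cv_loss, key=cv_loss.get)
ps = candidates[best].fit(x, a).predict_proba(x)[:, 1]
ps = np.clip(ps, 0.01, 0.99)  # truncate to avoid extreme weights
# stabilized inverse probability of treatment weights
w = np.where(a == 1, a.mean() / ps, (1 - a.mean()) / (1 - ps))
```

The fitted weights `w` would then be passed to the outcome model (here, the MSCM) as observation weights.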

Causal Inference 1

Presenter: Caleb Miles

When: Monday, July 11, 2016      Time: 2:00 PM - 3:30 PM

Room: Salon C Carson Hall (Level 2)

Session Synopsis:

A Class of Semiparametric Tests of Treatment Effect Robust to Measurement Error of a Confounder

When assessing the presence of an exposure causal effect on a given outcome, it is well known that classical measurement error of the exposure can seriously reduce the power of a test of the null hypothesis in question, although its type I error rate will generally remain controlled at the nominal level. In contrast, classical measurement error of a confounder can have disastrous consequences for the type I error rate of a test of treatment effect. In this paper, we develop a large class of semiparametric test statistics of an exposure causal effect that are completely robust to classical measurement error of a subset of confounders. A unique and appealing feature of our proposed methods is that they require no external information such as validation data or replicates of error-prone confounders. We present a doubly-robust form of this test that requires only one of two models -- an outcome-regression model and a propensity-score model -- to be correctly specified for the resulting test statistic to have the correct type I error rate. Validity within our class of test statistics is demonstrated in a simulation study. We apply the methods to a multi-city U.S. time-series dataset to test for an effect of temperature on mortality while adjusting for atmospheric particulate matter with diameter of 2.5 micrometres or less (PM2.5), which is well known to be measured with error.
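For readers unfamiliar with doubly-robust testing, here is a minimal sketch of the basic doubly-robust score test in the simple case where the confounders are measured without error; the paper's actual contribution, robustness to mismeasured confounders, is a different and more delicate construction not shown here, and the data-generating model and names below are mine.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(2)
n = 5000
c = rng.normal(size=(n, 2))                         # confounders, measured exactly here
e_true = 1 / (1 + np.exp(-(0.8 * c[:, 0] - 0.5 * c[:, 1])))
a = rng.binomial(1, e_true)                         # binary exposure
y = c[:, 0] + c[:, 1] + rng.normal(size=n)          # generated under the null: no exposure effect

ps = LogisticRegression(max_iter=1000).fit(c, a).predict_proba(c)[:, 1]
mu = LinearRegression().fit(c[a == 0], y[a == 0]).predict(c)  # outcome regression under no exposure
# doubly-robust score: mean zero under the null if EITHER the
# outcome-regression model or the propensity-score model is correct
u = (a - ps) * (y - mu)
z = np.sqrt(n) * u.mean() / u.std(ddof=1)           # approximately N(0, 1) under H0
```

Rejecting when |z| exceeds 1.96 gives a nominal 5% test; the double robustness lies in the product structure of the score `u`.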

Causal Inference 1

Presenter: Pedro Ramos Cerqueira

When: Monday, July 11, 2016      Time: 2:00 PM - 3:30 PM

Room: Salon C Carson Hall (Level 2)

Session Synopsis:

Investigating the underlying causal network on European football teams

Football, or soccer, can be considered one of the most important sports in the world. Managers, specialists and fans are always trying to identify the keys to building a good team. Evaluating team quality involves many variables and subjective concepts, so it is not simple to answer the question: how is quality defined? Another point to consider is the relative importance of aspects such as offense and defense: which one matters more in measuring the quality of a football team? For this task, we propose a causal model with latent variables as a tool to measure the subjective notion of team quality and how it can be affected by other aspects. Information from the four most important football leagues (England, Germany, Italy and Spain) over three seasons (2011-2012; 2012-2013; 2013-2014) was collected. We chose league championships rather than cups or playoffs to avoid the “lucky” effect. A causal model with latent variables, which allowed us to model the subjectivity of some concepts, was used to evaluate team quality and to verify whether this quality is affected by other factors. The analyses were performed in the R software with the “lavaan” package. A model with five latent variables (attack, creation, defense, discipline and quality) was considered the best. Quality was expressed by points rate, classification, goal difference, home points rate, away points rate and position. Attack was expressed by goals for, shots, shots on goal, offsides and wins. Defense was expressed by goals against, clean sheets and shots conceded. Discipline was expressed by fouls, yellow cards and red cards. Creation was expressed by passes, possession, interceptions and dribbles. Causal relationships among the latent variables were also obtained: creation and discipline exert effects on the offensive and defensive skills, respectively, and both attack and defense affect quality. The most important concept for quality is the offensive one, since the results show that the offensive aspect is almost three times more influential than the defensive aspect. These results reflect transfer-market strategies, because the most valuable players are generally those with more developed offensive skills, such as midfielders, forwards and strikers.

Causal Inference 1

Presenter: Linbo Wang

When: Monday, July 11, 2016      Time: 2:00 PM - 3:30 PM

Room: Salon C Carson Hall (Level 2)

Session Synopsis:

Robust Estimation of Propensity Score Weights via Subclassification

The propensity score plays a central role in inferring causal effects from observational studies. In particular, weighting and subclassification are two principal approaches to estimating the average causal effect based on estimated propensity scores. If the propensity score model is correctly specified, weighting methods, unlike the conventional subclassification estimator, offer consistent and possibly efficient estimation of the average causal effect. In practice, however, this theoretical appeal may be diminished by sensitivity to misspecification of the propensity score model. Subclassification methods, in contrast, are usually more robust to model misspecification. We hence propose to use subclassification for robust estimation of propensity score weights. Our approach is based on the intuition that the inverse probability weighting estimator can be seen as the limit of subclassification estimators as the number of subclasses goes to infinity. By formalizing this intuition, we propose novel propensity score weighting estimators that are both consistent and robust to model misspecification. Empirical studies show that the proposed estimators perform favorably compared with existing methods.
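The limiting intuition in the abstract can be seen numerically in a small simulation; everything below (the data-generating model, the use of the true propensity score for clarity, and all names) is my own illustration, not the paper's estimator. As the number of subclasses grows, the subclassification estimate of the average causal effect approaches the inverse probability weighting estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
e = 1 / (1 + np.exp(-x))             # true propensity score
t = rng.binomial(1, e)
y = 2 * t + x + rng.normal(size=n)   # true average causal effect = 2

def subclass_estimate(y, t, e, k):
    """Stratify on the propensity score into k quantile-based subclasses
    and take a weighted average of within-class mean differences."""
    edges = np.quantile(e, np.linspace(0, 1, k + 1))
    idx = np.clip(np.searchsorted(edges, e, side="right") - 1, 0, k - 1)
    ests, wts = [], []
    for j in range(k):
        m = idx == j
        if t[m].sum() == 0 or (1 - t[m]).sum() == 0:
            continue  # skip classes with no treated or no control units
        ests.append(y[m][t[m] == 1].mean() - y[m][t[m] == 0].mean())
        wts.append(m.sum())
    return np.average(ests, weights=wts)

est5 = subclass_estimate(y, t, e, 5)     # coarse stratification: residual bias
est100 = subclass_estimate(y, t, e, 100) # fine stratification: close to IPW
ipw = np.mean(t * y / e - (1 - t) * y / (1 - e))  # Horvitz-Thompson IPW estimate
```

All three estimates are close to the true effect of 2, with the fine stratification tracking the IPW estimate most closely, which is exactly the limit the abstract formalizes.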

Causal Inference 1

Presenter: Yi Zhao

When: Monday, July 11, 2016      Time: 2:00 PM - 3:30 PM

Room: Salon C Carson Hall (Level 2)

Session Synopsis:

Pathway Lasso: Estimate and Select Sparse Mediation Pathways with High Dimensional Mediators

In many scientific studies, it is becoming increasingly important to delineate causal pathways through a large number of mediators, such as genetic or brain-imaging mediators. Structural equation modeling (SEM) is a popular technique for estimating pathway effects, commonly expressed as products of coefficients. However, fitting such models becomes unstable when high dimensional mediators serve as predictors, especially in the general setting where all the mediators are causally dependent but the exact causal relationships among them are unknown. This paper proposes a sparse mediation model using a regularized SEM approach, where sparsity means that only a small number of mediators have nonzero mediation effects between a treatment and an outcome. To address the model selection challenge, we introduce a new penalty called Pathway Lasso. This penalty function is a convex relaxation of the non-convex product function, and it yields a computationally tractable optimization criterion for estimating and selecting many pathway effects simultaneously. We develop a fast ADMM-type algorithm to compute the model parameters, and we show that the iterative updates can be expressed in closed form. On both simulated data and a real fMRI dataset, the proposed approach yields higher pathway selection accuracy and lower estimation bias than competing methods.
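To make the product-of-coefficients target concrete, here is not the Pathway Lasso itself (whose convex relaxation jointly penalizes the products inside an ADMM loop) but the naive two-stage baseline it improves on: fit the treatment-to-mediator paths and a lasso for the mediator-to-outcome paths separately, then multiply. The data-generating model, the choice of lasso penalty, and all names are hypothetical, and this sketch ignores the causal dependence among mediators that the paper accommodates.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(4)
n, p = 500, 50
t = rng.binomial(1, 0.5, n).astype(float)      # treatment
alpha = np.zeros(p); alpha[:3] = 1.5           # only 3 mediators are active
beta = np.zeros(p);  beta[:3] = 1.0
m = np.outer(t, alpha) + rng.normal(size=(n, p))      # mediators
y = 0.5 * t + m @ beta + rng.normal(size=n)           # outcome

# stage 1: treatment -> mediator paths (one simple regression per mediator)
a_hat = np.array(
    [LinearRegression().fit(t[:, None], m[:, j]).coef_[0] for j in range(p)]
)
# stage 2: mediator -> outcome paths via lasso, adjusting for treatment
X = np.column_stack([t, m])
b_hat = Lasso(alpha=0.05).fit(X, y).coef_[1:]
# pathway (mediation) effects as products of coefficients
paths = a_hat * b_hat
```

The instability the abstract refers to arises because penalizing `a_hat` and `b_hat` separately does not directly control the products `paths`; the Pathway Lasso penalty targets those products with a convex surrogate instead.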