Oral

Spatial Epidemiology

Presenter: Patrick Brown

When: Thursday, July 14, 2016      Time: 4:30 PM - 6:00 PM

Room: Saanich 1-2 (Level 1)

Session Synopsis:

Using the Local-EM Algorithm for Spatio-Temporal Analysis of Spatially Aggregated Cancer Data

Public health data are often aggregated in space and time, either because of privacy concerns or simply because more precise information is not collected. Modelling spatial variation in the risk of rare diseases often requires data spanning many years, which raises the problem of aggregation boundaries that change over time. The local-EM methodology for aggregated spatial data and changing boundaries is extended to incorporate a temporal component and to handle spatial data that mix exact locations with aggregated data that are not necessarily nested. This added flexibility allows the modelling of data amalgamated from different sources and collected over many years. Also, while local-EM leads naturally to an EMS algorithm, here it leads to a modified algorithm that includes an additive term at every iteration to account for exact case data. A spatio-temporal analysis of bladder cancer incidence among males diagnosed in southwestern Nova Scotia is used to illustrate the methodology.

Spatial Epidemiology

Presenter: Tomas Goicoa

When: Thursday, July 14, 2016      Time: 4:30 PM - 6:00 PM

Room: Saanich 1-2 (Level 1)

Session Synopsis:

Identifiability issues in spatio-temporal disease mapping models

High-quality mortality and incidence registers, together with an increasing demand for information from epidemiologists and health policy makers, have led to the development of flexible statistical models, faster fitting techniques, and free software that includes recent methodological advances. However, a careless approach to these ready-to-use statistical resources can lead to misleading results due, among other causes, to incorrect specification of identifiability constraints, which standard software usually fixes at default values. Generalized linear mixed models (GLMMs) used in disease mapping are generally not identifiable, so how to cope with identifiability issues is crucial. Inference in spatio-temporal disease mapping has been carried out within a general Bayesian framework with two main approaches: an Empirical Bayes (EB) approach and a full Bayes approach. The first has traditionally relied on penalized quasi-likelihood (PQL), whereas the second has been based on Markov chain Monte Carlo (MCMC) techniques or, very recently, on integrated nested Laplace approximations (INLA). The literature on spatial and spatio-temporal disease mapping is extensive, and although identifiability problems have been mentioned, we think that this matter deserves further research and still needs to be clarified for practitioners. In this talk we deal with identifiability issues in space-time disease mapping models including an intercept (overall risk level) and spatial, temporal, and spatio-temporal random effects. We consider a conditional autoregressive (CAR) prior for the spatial random effects, and a first or second order random walk for time. The interaction random effects are defined in terms of the covariance/structure matrices of the spatial and the temporal random effects. We show that PQL automatically places constraints on the random effect estimates, leading to correct results if a first order random walk is used for time and to incorrect results if a second order random walk is used. A discussion about identifiability constraints in INLA is also provided.
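
The confounding between the intercept and a random-walk temporal effect can be illustrated with a minimal sketch. It is not taken from the talk: the function names (`rw_penalty`, `center`) and all numeric values are hypothetical, and only the intercept and temporal terms are covered, not the spatial or interaction effects. It relies on the fact that a random walk prior of a given order penalizes only the squared differences of that order, so a first order walk cannot see a constant shift and a second order walk cannot see a constant shift or a linear trend.

```python
def rw_penalty(x, order):
    """Sum of squared order-th differences: the quadratic form of an RW prior."""
    diffs = list(x)
    for _ in range(order):
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
    return sum(d * d for d in diffs)


def center(x):
    """Impose a sum-to-zero constraint by subtracting the mean."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]


# Hypothetical intercept and temporal effects for five time points.
intercept = -2.0
gamma = [0.3, 0.1, -0.2, 0.4, 0.0]

# Move a constant from the intercept into the temporal effects.
shift = 1.5
gamma_shifted = [g + shift for g in gamma]
intercept_shifted = intercept - shift

eta = [intercept + g for g in gamma]
eta_shifted = [intercept_shifted + g for g in gamma_shifted]

# The linear predictor and the RW1 penalty are both unchanged, so the data
# and the prior cannot tell the two parameterizations apart.
same_predictor = all(abs(a - b) < 1e-12 for a, b in zip(eta, eta_shifted))
same_rw1_penalty = abs(rw_penalty(gamma, 1) - rw_penalty(gamma_shifted, 1)) < 1e-12

# A second order random walk is also blind to a linear trend; a first order one is not.
trend = [0.5 * t for t in range(len(gamma))]
gamma_trend = [g + s for g, s in zip(gamma, trend)]
rw2_ignores_trend = abs(rw_penalty(gamma, 2) - rw_penalty(gamma_trend, 2)) < 1e-12
rw1_sees_trend = abs(rw_penalty(gamma, 1) - rw_penalty(gamma_trend, 1)) > 1e-6

# A sum-to-zero constraint removes the constant-shift ambiguity.
same_after_centering = all(
    abs(a - b) < 1e-12 for a, b in zip(center(gamma), center(gamma_shifted))
)
```

In this sketch the sum-to-zero constraint resolves the first order case, while the second order case would still leave the linear trend unconstrained, which is one way to see why the two cases behave differently.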

Spatial Epidemiology

Presenter: Samuel Manda

When: Thursday, July 14, 2016      Time: 4:30 PM - 6:00 PM

Room: Saanich 1-2 (Level 1)

Session Synopsis:

Assessing Joint Spatial Autocorrelations in Multiple Mortality Risks in South Africa, 1997-2014

Multivariate spatial disease models have recently been developed and applied to estimate common and disease-specific risks. For these analyses, estimated risk and exceedance probability maps showing areas with elevated disease risks are derived to aid health policy interventions and resource prioritisation. However, it may be important to assess whether the estimated high disease risks in a particular area occur independently or are related spatially or temporally. This paper is concerned with situations where multiple disease cases occur closely together in both space and time, forming multiple spatio-temporal clusters. Local join-count statistics are used to determine joint spatial associations among multiple age-gender mortality risks in South Africa between 1997 and 2014. The multivariate spatial clustering statistics are used in conjunction with both univariate and multivariate Bayesian spatial models for multiple disease outcomes to assess both temporal spatial clustering in the same specific mortality rate and joint spatial clustering in related age-gender mortality rates.
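
As a rough illustration of join-count statistics, the sketch below computes local and global counts for binary "high risk" indicators on a toy map. Everything in it is hypothetical: the six areas, their adjacency, and the indicator values are invented, and the joint version (applying the count to areas flagged for both outcomes) is only one possible way to combine two outcomes, not necessarily the statistic used in the paper.

```python
def local_join_counts(x, neighbors):
    """For each area flagged 1, count the neighbors that are also flagged 1."""
    return {i: x[i] * sum(x[j] for j in neighbors[i]) for i in x}


def global_join_count(x, neighbors):
    """Number of neighboring pairs with both areas flagged (each join counted once)."""
    return sum(local_join_counts(x, neighbors).values()) // 2


# Hypothetical 2 x 3 grid of areas with rook adjacency:
#   A B C
#   D E F
neighbors = {
    "A": ["B", "D"],
    "B": ["A", "C", "E"],
    "C": ["B", "F"],
    "D": ["A", "E"],
    "E": ["B", "D", "F"],
    "F": ["C", "E"],
}

# Hypothetical indicators: 1 if the estimated risk is flagged as high.
high_risk_1 = {"A": 1, "B": 1, "C": 0, "D": 1, "E": 0, "F": 0}
high_risk_2 = {"A": 1, "B": 0, "C": 0, "D": 1, "E": 1, "F": 0}

# Assumed joint indicator: area flagged for both outcomes.
both = {i: high_risk_1[i] * high_risk_2[i] for i in neighbors}

local_1 = local_join_counts(high_risk_1, neighbors)
local_joint = local_join_counts(both, neighbors)
joins_1 = global_join_count(high_risk_1, neighbors)
joins_joint = global_join_count(both, neighbors)
```

In practice the significance of such counts is usually assessed against a reference distribution, for example by permuting the indicators across areas; that step is omitted here.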

Spatial Epidemiology

Presenter: Gyanendra Pokharel

When: Thursday, July 14, 2016      Time: 4:30 PM - 6:00 PM

Room: Saanich 1-2 (Level 1)

Session Synopsis:

Gaussian process emulators for parameterizing spatial infectious disease transmission models incorporating infection time uncertainty

Mechanistic models of infectious disease spread are key to inferring spatio-temporal infectious disease transmission dynamics. Ideally, covariate data and the infection status of individuals over time would be used to parameterize such models, but in reality complete data are rarely available. For example, infection times are almost never observed. Bayesian data-augmented Markov chain Monte Carlo (DA-MCMC) methods are commonly used to infer the missing or censored data. However, for large disease systems, the method is computationally very expensive. In this paper, we propose two methods of inference for such situations, based on so-called emulation techniques. Both methods are set in a Bayesian MCMC framework but avoid calculating the computationally expensive likelihood function by replacing it with a Gaussian process-based likelihood approximation. In the first method, we incorporate the parameters of the incubation period distribution into a Gaussian process emulator, where the incubation period is the time from infection to diagnosis/reporting of infected individuals. This emulator is built by modelling the discrepancy between summary statistics of simulated and observed epidemic data. In the second method, we use a pseudo-marginal likelihood approximation to allow for infection time/period uncertainty and use the emulator to directly model the log-likelihood of the model parameters. We show how these methods offer substantial computational efficiency gains over standard Bayesian MCMC-based methods and can be used to infer the transmission of complex infectious disease systems.

Spatial Epidemiology

Presenter: Loni Tabb

When: Thursday, July 14, 2016      Time: 4:30 PM - 6:00 PM

Room: Saanich 1-2 (Level 1)

Session Synopsis:

Spatial Analysis of County-Level Diabetes Prevalence – Project CHANGE: Creating a Healthier Mississippi One Community at a Time

Diabetes is one of the nation’s pressing public health concerns; this chronic disease affects more than 25 million Americans. It has been shown to be associated with obesity, physical inactivity, and many built environment features, which range from access to healthy foods to rural-urban classification. The objective of this study was to evaluate the geographic distribution of diabetes by county in the state of Mississippi as it relates to both individual and county-level characteristics. In 2012, Mississippi ranked second in the nation in overall diabetes prevalence; therefore, understanding the spatial prevalence of diabetes in this state is crucial. Using My Brother’s Keeper’s Project CHANGE: Creating a Healthier Mississippi One Community at a Time survey, a statewide health assessment administered in 2013, as well as auxiliary data, we applied a small area estimation method to estimate county-level diabetes prevalence. Specifically, we used integrated nested Laplace approximations within a Bayesian statistical framework to estimate the impact of various individual and county-level characteristics on diabetes. Our hierarchical models allowed for the estimation of both fixed and random effects, as well as adjusting for the spatial correlation between neighboring counties within the entire state. We also assessed the geographic distribution of diabetes using county-level maps of Mississippi. We found that individual characteristics were significantly associated with diabetes. For instance, there is a 54% and 63% increase in the risk of diabetes for those who are African American and who currently use tobacco products, respectively. At the county level, the rural-urban classification was also associated with this disease: those living in nonmetropolitan areas adjacent to metropolitan areas (compared to those living in a completely rural area) have a 44% increase in the risk of diabetes. Additionally, our maps allowed for an examination of the geographic variability of diabetes by county. Interventions should target specific regions of Mississippi in efforts to reduce the risk of diabetes, and our study findings should aid in the development of these policies and programs.

Spatial Epidemiology

Presenter: Kunihiko Takahashi

When: Thursday, July 14, 2016      Time: 4:30 PM - 6:00 PM

Room: Saanich 1-2 (Level 1)

Session Synopsis:

A multiple cluster detection test based on scan statistics and generalized linear models for disease clustering

A number of statistical tests have been proposed and widely used in spatial epidemiology for detecting disease clustering. In particular, the cluster detection test (CDT) assesses whether disease is randomly distributed over a space, and detects statistically significant local clusters without any prior information on their locations. The spatial scan statistic is one of the most powerful tools for the CDT, and the standard spatial scan statistic adopts the maximum likelihood ratio test, scanning various windows; examples include Kulldorff’s circular scan statistic and Tango and Takahashi’s flexibly shaped scan statistic. Although these statistics assume the existence of a single cluster, there is likely more than one cluster in the space in question. The standard scan statistic procedure can also detect additional, mutually exclusive clusters with a significantly large likelihood ratio, known as secondary clusters. The p-values for those clusters are, however, calculated as if each of them were the primary (most likely) cluster, although an adjustment was proposed by Zhang et al. (2010). In this work, we propose a new test procedure that simultaneously detects multiple clusters, utilising generalized linear models. This framework encompasses the conventional single cluster detection procedure as a special case, since it uses the scan statistic test to list candidate clusters. The p-value is calculated through Monte Carlo hypothesis testing using simulated data under the null hypothesis that there is no cluster in the area. We present practical examples applying the proposed procedure, and compare the results with those from conventional procedures.
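
For readers unfamiliar with the conventional single-cluster procedure that the proposed method generalizes, the following is a minimal sketch of a Poisson scan statistic with a Monte Carlo p-value. It does not implement the multiple-cluster GLM procedure described in the abstract. The setting is an assumption made for brevity: regions lie on a line, candidate windows are contiguous runs of regions, expected counts are positive and sum to the total number of cases, and all names and data are hypothetical.

```python
import math
import random


def poisson_llr(c, e, total):
    """Log likelihood ratio for a window with c observed and e expected cases."""
    if c <= e:
        return 0.0  # only windows with an excess of cases are of interest
    llr = c * math.log(c / e)
    if c < total:
        llr += (total - c) * math.log((total - c) / (total - e))
    return llr


def candidate_windows(n, max_len):
    """All contiguous runs of at most max_len regions (an assumed window family)."""
    return [list(range(s, s + k)) for k in range(1, max_len + 1) for s in range(n - k + 1)]


def max_llr(cases, expected, windows):
    """Scan statistic: the largest LLR over all windows, and the window attaining it."""
    total = sum(cases)
    best, best_window = 0.0, None
    for w in windows:
        c = sum(cases[i] for i in w)
        e = sum(expected[i] for i in w)
        value = poisson_llr(c, e, total)
        if value > best:
            best, best_window = value, w
    return best, best_window


def monte_carlo_p(cases, expected, windows, n_sim=999, seed=0):
    """Rank the observed statistic among statistics simulated under the null."""
    rng = random.Random(seed)
    total = sum(cases)
    observed, window = max_llr(cases, expected, windows)
    regions = list(range(len(cases)))
    exceed = 0
    for _ in range(n_sim):
        # Under the null, cases fall in regions in proportion to expected counts.
        sim = [0] * len(cases)
        for i in rng.choices(regions, weights=expected, k=total):
            sim[i] += 1
        if max_llr(sim, expected, windows)[0] >= observed:
            exceed += 1
    return observed, window, (exceed + 1) / (n_sim + 1)


# Hypothetical data: eight regions with equal expected counts and an excess in regions 3 and 4.
cases = [2, 3, 1, 12, 10, 2, 3, 2]
expected = [sum(cases) / len(cases)] * len(cases)
windows = candidate_windows(len(cases), max_len=3)
observed, window, p_value = monte_carlo_p(cases, expected, windows)
```

Restricting `max_len` to fewer regions than the whole map keeps every window's expected count strictly below the total, which the likelihood ratio formula requires.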