### New Advances in Underdispersed Count Data Analysis

**Presenter:** Célestin Kokonendji

**When:** Monday, July 11, 2016 **Time:** 2:00 PM - 3:30 PM

**Room:** Salon A Carson Hall (Level 2)

#### Session Synopsis:

###### Multivariate underdispersion extension

Although less frequent than over-dispersion (see, e.g., [4,6] for the univariate case), multivariate under-dispersion appears both in real count data and in theoretical count models, as a departure in the opposite direction from a multivariate equi-dispersed distribution. Some tentative definitions have recently been proposed: e.g., [8] for a bivariate Fisher index of dispersion, [3] by deduction from the scaled generalized variance, and [2] using a dispersion matrix. Here we introduce a suitable approach by defining the generalized dispersion index [5], a scalar quantity for measuring multivariate over-, equi- and under-dispersion. It is defined, theoretically and empirically, as a ratio of two quadratic forms depending on the mean vector and the covariance matrix; under this definition the multivariate uncorrelated Poisson distribution is the equi-dispersed distribution that serves as the natural reference. Thus, the negatively correlated multivariate Poisson distribution [1] is under-dispersed. Also, the bivariate Bernoulli distribution [7] is under-dispersed with respect to the uncorrelated Poisson distribution. Other illustrations and properties, such as the relative (generalized) dispersion index, will be given. Examples of applications to real data will be presented and discussed.

Dedicated in honor and memory of Professor Bent Jørgensen.

References:
[1] Cuenin, J., Jørgensen, B., Kokonendji, C.C. 2016. Simulations of full multivariate Tweedie with flexible dependence structure. Comput. Statist. In press.
[2] Jørgensen, B., Kokonendji, C.C. 2016. Discrete dispersion models and their Tweedie asymptotics. AStA Adv. Statist. Anal. In press.
[3] Karlis, D., Xekalaki, E. 2005. Mixed Poisson distributions. Internat. Statist. Rev. 73, 35-58.
[4] Kokonendji, C.C. 2014. Over- and underdispersion models. In Methods and Applications of Statistics in Clinical Trials, vol. 2; Balakrishnan, N. (ed.), Wiley, pp. 506-526.
[5] Kokonendji, C.C., Puig, P. 2016. Generalized dispersion index for multivariate over-, equi- and underdispersion. In preparation.
[6] Kokonendji, C.C., Mizère, D., Balakrishnan, N. 2008. Connections of the Poisson weight function to overdispersion and underdispersion. J. Statist. Plann. Inference 138, 1287-1296.
[7] Marshall, A.W., Olkin, I. 1985. A family of bivariate distributions generated by the bivariate Bernoulli distribution. J. Amer. Statist. Assoc. 80, 332-338.
[8] Minkova, L.D., Balakrishnan, N. 2014. Type II bivariate Pólya-Aeppli distribution. Statist. Probab. Lett. 88, 40-49.
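
The abstract describes the generalized dispersion index only as a ratio of two quadratic forms in the mean vector and covariance matrix. The sketch below assumes one concrete form consistent with that description, namely (√m)ᵀ C (√m) / (mᵀ m) for mean vector m and covariance matrix C; this form and the function name `generalized_dispersion_index` are assumptions for illustration, not taken from the abstract. The numeric means and covariances are hypothetical.

```python
import math


def generalized_dispersion_index(mean, cov):
    """Ratio of two quadratic forms in the mean vector and covariance matrix.

    Assumed form: (sqrt(m)^T C sqrt(m)) / (m^T m). For an uncorrelated
    Poisson vector, C = diag(m), so numerator and denominator coincide.
    """
    root = [math.sqrt(m) for m in mean]
    k = len(mean)
    numerator = sum(root[i] * cov[i][j] * root[j]
                    for i in range(k) for j in range(k))
    denominator = sum(m * m for m in mean)
    return numerator / denominator


# Hypothetical bivariate example with Poisson-like marginal variances.
mean = [2.0, 3.0]
gdi_uncorrelated = generalized_dispersion_index(mean, [[2.0, 0.0], [0.0, 3.0]])
gdi_negative = generalized_dispersion_index(mean, [[2.0, -1.0], [-1.0, 3.0]])
gdi_positive = generalized_dispersion_index(mean, [[2.0, 1.0], [1.0, 3.0]])

print(gdi_uncorrelated)  # equals 1 up to rounding: equi-dispersed reference
print(gdi_negative)      # below 1: under-dispersed
print(gdi_positive)      # above 1: over-dispersed
```

Under this assumed form, negative covariance between components pulls the index below 1 even when each marginal has variance equal to its mean, which matches the abstract's statement about negatively correlated multivariate Poisson distributions.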

### New Advances in Underdispersed Count Data Analysis

**Presenter:** Clarice Demétrio

**When:** Monday, July 11, 2016 **Time:** 2:00 PM - 3:30 PM

**Room:** Salon A Carson Hall (Level 2)

#### Session Synopsis:

###### Competition and underdispersion

The standard distributions for the analysis of count and proportion data are the Poisson and binomial distributions. In practice they are frequently too restrictive, in that the variability in the data is either significantly greater (overdispersed) or less (underdispersed) than that implied by the model's variance function. For the analysis of count data, [1] state that overdispersion is the norm and not the exception, and it has been well studied; see [2] and many subsequent articles presenting a wide range of distributions. Although less common, underdispersion can arise, typically from dependent responses. For instance, competition between plants and animals can induce negative correlation in temporal and spatial counting processes. The range of distributions for modelling underdispersed count data is relatively limited, although models can be derived in specific situations. However, as [3] remark, the mechanisms leading to underdispersion may be unclear, and so simple empirical models may also be useful for describing data. Here we will discuss some real data examples and possible models to highlight potential sources of underdispersion, and consider general approaches for handling underdispersion.

References:
[1] McCullagh, P.; Nelder, J.A. (1989). Generalized Linear Models. Chapman and Hall.
[2] Hinde, J.; Demétrio, C.G.B. (1998). Overdispersion: Models and estimation. Computational Statistics and Data Analysis, 27: 151-170.
[3] Ridout, M.S.; Besbeas, P. (2004). An empirical model for underdispersed count data. Statistical Modelling, 4: 77-89.
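
As a rough illustration of how competition can produce underdispersed counts, the sketch below simulates a one-dimensional sequential inhibition process: candidate plant positions are accepted only if no established plant lies within a minimum distance. This toy model, its parameter values, and the names `simulate_competing_plants` and `dispersion_index` are assumptions made for illustration and are not taken from the abstract.

```python
import random
import statistics


def dispersion_index(counts):
    """Sample variance-to-mean ratio; values below 1 suggest underdispersion."""
    return statistics.variance(counts) / statistics.mean(counts)


def simulate_competing_plants(n_quadrats, min_distance, n_proposals, rng):
    """Counts per unit-length quadrat under a hard-core competition rule.

    A proposed position is kept only if every established plant is at
    least `min_distance` away. With min_distance = 0 all proposals are
    kept, which gives independent uniform positions.
    """
    accepted = []
    for _ in range(n_proposals):
        x = rng.uniform(0, n_quadrats)
        if all(abs(x - y) >= min_distance for y in accepted):
            accepted.append(x)
    counts = [0] * n_quadrats
    for x in accepted:
        counts[min(int(x), n_quadrats - 1)] += 1
    return counts


rng = random.Random(2016)
competing = simulate_competing_plants(200, 0.3, 2000, rng)
independent = simulate_competing_plants(200, 0.0, 500, rng)

print("competition:", dispersion_index(competing))
print("no competition:", dispersion_index(independent))
```

With competition, neighbouring positions repel each other, so quadrat counts vary much less than their mean; without it, the index stays close to 1, as expected for Poisson-like counts.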

### New Advances in Underdispersed Count Data Analysis

**Presenter:** John Hinde

**When:** Monday, July 11, 2016 **Time:** 2:00 PM - 3:30 PM

**Room:** Salon A Carson Hall (Level 2)

#### Session Synopsis:

### New Advances in Underdispersed Count Data Analysis

**Presenter:** Pedro Puig

**When:** Monday, July 11, 2016 **Time:** 2:00 PM - 3:30 PM

**Room:** Salon A Carson Hall (Level 2)

#### Session Synopsis:

###### Some mechanisms leading to underdispersion

The theory of Poisson-overdispersed count models has been developed in depth, and consequently there are many known "physical mechanisms" leading to overdispersion. For instance, the general families of Mixed Poisson and Compound Poisson distributions are always overdispersed. These physical mechanisms can be interpreted and successfully used for health sciences and biological modelling. There are also some mechanisms leading to underdispersion, but they are not well known. In this talk we review some of them and present new methods and applications. The first mechanism considered is a Poisson-type process where the waiting times are not exponentially distributed. Barlow and Proschan showed in the 1960s that Increasing (Decreasing) Failure Rate distributions for the waiting times produce under- (over-) dispersed count distributions. Examples of this mechanism are the models of Winkelmann (1995) using Gamma and Weibull waiting times. The second mechanism is the extended Poisson process of Faddy and Bosch (2001), based on the fact that any count distribution can be represented as a pure birth process with non-constant rates. This representation does not always have a simple and meaningful interpretation. The third mechanism is provided by the limiting distribution of an M/M/1 queuing model where the service time depends on the number of individuals in the queue. An example of this is the original development of the COM-Poisson distribution. This mechanism allows the construction of new distributions capable of explaining the behaviour of the counts of chromosomal aberrations under high doses of radiation (see Pujol et al., 2014). Finally, we will introduce some new mechanisms based on the binomial subsampling operation (p-thinning). It is known that the Poisson distribution is closed under p-thinning, but if p depends on the number of Poisson realizations, the resulting distribution can be underdispersed. Several examples of application will be analyzed and discussed.

References:
[1] Faddy, M.J. and Bosch, R.J. (2001). Likelihood-Based Modeling and Analysis of Data Underdispersed Relative to the Poisson Distribution. Biometrics, 57, 620-624.
[2] Pujol, M., Barquinero, J.F., Puig, P., Puig, R., Caballín, M.R., Barrios, L. (2014). A New Model of Biodosimetry to Integrate Low and High Doses. PLoS ONE, 9(12): e114137.
[3] Winkelmann, R. (1995). Duration Dependence and Dispersion in Count-Data Models. Journal of Business and Economic Statistics, 13(4), 467-474.
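
Two of the mechanisms above can be checked numerically. The sketch below is illustrative only: it simulates a renewal counting process with Gamma waiting times of mean 1 (shape above 1 gives an increasing failure rate, shape below 1 a decreasing one), and it evaluates the variance-to-mean ratio of a COM-Poisson distribution with weights proportional to λ^y / (y!)^ν by truncating the support. The function names, parameter values, and truncation point are assumptions, not part of the talk.

```python
import math
import random
import statistics


def renewal_dispersion(shape, horizon=10.0, n_paths=4000, rng=None):
    """Variance-to-mean ratio of event counts in [0, horizon].

    Waiting times are Gamma(shape, scale=1/shape), so their mean is 1.
    """
    rng = rng or random.Random(0)
    counts = []
    for _ in range(n_paths):
        t, n = 0.0, 0
        while True:
            t += rng.gammavariate(shape, 1.0 / shape)
            if t > horizon:
                break
            n += 1
        counts.append(n)
    return statistics.variance(counts) / statistics.mean(counts)


def com_poisson_dispersion(lam, nu, max_count=200):
    """Variance-to-mean ratio of a COM-Poisson law, truncated at max_count."""
    log_w = [y * math.log(lam) - nu * math.lgamma(y + 1)
             for y in range(max_count + 1)]
    shift = max(log_w)  # subtract the maximum to avoid overflow
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)
    mean = sum(y * wy for y, wy in enumerate(w)) / total
    second = sum(y * y * wy for y, wy in enumerate(w)) / total
    return (second - mean ** 2) / mean


print("Gamma shape 4 (IFR):", renewal_dispersion(4.0))
print("Gamma shape 0.5 (DFR):", renewal_dispersion(0.5))
print("COM-Poisson nu = 2:", com_poisson_dispersion(3.0, 2.0))
print("COM-Poisson nu = 1:", com_poisson_dispersion(3.0, 1.0))
```

In both cases a single parameter moves the counts from over- to underdispersion: shape 1 recovers exponential waiting times and ν = 1 recovers the Poisson distribution, each with a ratio of 1.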

### New Advances in Underdispersed Count Data Analysis

**Presenter:** Kimberly Sellers

**When:** Monday, July 11, 2016 **Time:** 2:00 PM - 3:30 PM

**Room:** Salon A Carson Hall (Level 2)

#### Session Synopsis:

###### Underdispersion models: models that "fly under the radar"

Most count data studies and analyses center on the Poisson assumption or on data overdispersion. However, examples exhibiting underdispersion relative to the Poisson distribution are increasingly common, yet little attention has focused on this phenomenon. This work surveys various count models that allow for data underdispersion (some better known than others) and studies their respective statistical properties. Example datasets are considered to illustrate and compare performance across these models.

Selected references:
Kokonendji, C.C. (2014). Over- and underdispersion models. In: The Wiley Encyclopedia of Clinical Trials Methods and Applications of Statistics in Clinical Trials (Vol. 2: Planning, Analysis, and Inferential Methods); Balakrishnan, N. (ed.). Wiley, New York, pp. 506-526, Chap. 30.
Ridout, M.S., and Besbeas, P. (2004). An empirical model for underdispersed count data. Statistical Modelling 4: 77-89.