Gamma-Poisson Distribution

Important properties and mathematical definitions

The Gamma-Poisson distribution is a statistical distribution for overdispersed count data. It is also known under the name Negative Binomial. Unlike the Negative Binomial which is primarily used for repeated trials and number of success / failures, the Gamma-Poisson is parametrized by the mean \(\mu\) and the overdispersion \(\alpha\). If \(\alpha = 0\), it reduces to the Poisson distribution.

General Properties

Definition: \[ Y \sim \text{GammaPoisson}\left(\mu,\alpha\right) \] where \(Y\) are non-negative integer (i.e., “counts”).

Probability Mass Function

\[ f_{\text{GP}}(y | \mu, \alpha) = \frac{\Gamma\left(y + 1/\alpha\right)}{\Gamma\left(1/\alpha\right)\Gamma\left(y + 1\right)} \left(\frac{\mu^2 \alpha}{\mu + \mu^2 \alpha}\right)^{y} \left(\frac{1}{1 + \mu \alpha}\right)^{1/\alpha} \]

Cumulative Distribution Function

\[ F_{\text{GP}}(y | \mu, \alpha) = I\left(\frac{1}{1+\mu\alpha}; 1/\alpha, 1 + y\right) \] where \(I(z; a, b)\) is the regularized incomplete beta function.

Momements

Mean

\[ \mathbb{E}[Y] = \mu \]

Variance

\[ \mathbb{V}\text{ar}[Y] = \mu + \alpha \mu^2 \]

Skewness

\[ \mathbb{S}\text{kew}[Y] = \frac{\sqrt{\frac{\mu}{1 + \mu\alpha}} \left(1 + 2 \mu\alpha \right)}{\mu} \]

Kurtosis

\[ \mathbb{K}\text{urtosis}[Y] = \frac{1 + 6 \mu\alpha + 6 \mu^2\alpha^2}{\mu + \mu^2\alpha} + 3 \]

Moment Generating Function

\[ M_{X\sim\text{GP}}(t) := \mathbb{E}[\exp\left(t X\right)] = \left(\frac{1}{1 - \mu\alpha (e^t-1)}\right)^{1/\alpha} \]

Characteristic Function

\[ M_{X\sim\text{GP}}(t) := \mathbb{E}[\exp\left(t X\right)] = \left(\frac{1}{1 - \mu\alpha (e^{i t}-1)}\right)^{1/\alpha} \]

Fisher Information

\[ \mathcal{I}(\mu) = \mathbb{E}\left[\left(\frac{\partial}{\partial\mu}\log f_{\text{GP}}(y)\right)^2\;\Bigg|\; \mu \right] = \frac{1}{\mu + \alpha \mu^2} \]

Mathematica fails to calculate \(\mathcal{I}(\alpha)\).

Useful derivatives

\[ \frac{\partial}{\partial\mu}\log\left(f_{GP}(y|\mu,\alpha\right) = \frac{y - \mu}{\mu+\alpha\mu^2} \]

\[ \frac{\partial}{\partial\beta_0}\log\left(f_{GP}(y|\mu,\alpha\right) = \frac{y - \mu}{1+\alpha\mu}, \] where \(\mu = \exp\left(\beta_0+\text{offset}\right)\).

Relation to the Negative Binomial Distribution

The Gamma-Poisson and the Negative Binomial distribution are mathematically identical, they just use different parametrizations. However, this can cause a lot of confusion, especially because the \(\texttt{R}\) and Wikipedia again use two different parametrizations of the Negative Binomial:

  • Wikipedia uses \(r\) for the number of trials and \(p\) for the number of failures
  • \(\texttt{R}\) uses \(\text{size}\) for the number of trials and \(p'\) for the number of successes, thus \(p' = 1 - p\).

To convert from the Negative Binomial to the Gamma-Poisson parametrization \[ \begin{aligned} \mu &= \frac{p r}{1 - p} &&= \frac{(1-p') r}{p'} \\ \alpha &= 1/r &&= 1/\text{size}. \end{aligned} \]

To convert from the Gamma-Poisson to the Negative Binomial parametrization: \[ \begin{aligned} p &= \frac{\alpha\mu}{1 + \alpha\mu} \\ p' &= \frac{1}{1+ \alpha\mu} \end{aligned} \] and \[ r = \text{size} = 1/\alpha. \]

\(\texttt{R}\) Implementation

dgampoi <- function(x, mean, overdispersion){
  gamma(x + 1/overdispersion)/(gamma(1/overdispersion) * gamma(x + 1)) * 
    ((mean^2 * overdispersion) / (mean + mean^2 * overdispersion))^x *
    (1/(1 + mean * overdispersion))^(1/overdispersion)
}


# Or based on the existing 'dnbinom'
dgampoi2 <- function(x, mean, overdispersion){
  dnbinom(x, mu = mean, size = 1 / overdispersion)
}
library(tidyverse)
cross_df(list(x = seq(0, 200), mu = 100, alpha = c(0, 0.05, 0.2, 2))) %>%
  mutate(dens = dgampoi2(x, mean = mu, overdispersion = alpha)) %>%
  ggplot(aes(x = x-0.5)) +
    geom_step(aes(y = dens, color = as.factor(alpha)), show.legend = FALSE) +
    annotate("text", x = 12, y = 0.03, label = expression(alpha==2), size = 7) +
    annotate("text", x = 30, y = 0.003, label = expression(alpha==0.2), size = 7, hjust = 1) +
    annotate("text", x = 120, y = 0.025, label = expression(alpha==0), size = 7) +
    annotate("text", x = 80, y = 0.015, label = expression(alpha==0.05), size = 7, hjust = 1) +
    cowplot::theme_cowplot(font_size = 20) +
    coord_trans(xlim = c(0, 150), ylim = c(0, 0.04)) + 
    scale_y_continuous(expand = expansion(0)) + 
    scale_x_continuous(expand = expansion(0), breaks = seq(0, 140, by = 20)) +
    labs(title ="Density of Gamma Poisson with µ=100",
         x = "y", y = expression(f[GP](y*"|"*mu==100,alpha)))

cross_df(list(x = seq(0, 30), mu = 10, alpha = c(0, 0.05, 0.2, 2))) %>%
  mutate(dens = dgampoi2(x, mean = mu, overdispersion = alpha)) %>%
  ggplot(aes(x = x-0.5)) +
    geom_step(aes(y = dens, color = as.factor(alpha)), show.legend = FALSE) +
    cowplot::theme_cowplot(font_size = 20) +
    coord_trans(xlim = c(0, 30)) +
    scale_y_continuous(expand = expansion(0)) +
    scale_x_continuous(expand = expansion(0), breaks = seq(0, 30, by = 5)) +
    labs(title ="Density of Gamma Poisson with µ = 10",
         x = "y", y = expression(f[GP](y*"|"*mu==10,alpha)))