Gamma-Poisson Distribution

Important properties and mathematical definitions

The Gamma-Poisson distribution is a statistical distribution for overdispersed count data. It is also known under the name Negative Binomial. Unlike the Negative Binomial which is primarily used for repeated trials and number of success / failures, the Gamma-Poisson is parametrized by the mean μ and the overdispersion α. If α=0, it reduces to the Poisson distribution.

General Properties

Definition: YGammaPoisson(μ,α) where Y are non-negative integer (i.e., “counts”).

Probability Mass Function

fGP(y|μ,α)=Γ(y+1/α)Γ(1/α)Γ(y+1)(μ2αμ+μ2α)y(11+μα)1/α

Cumulative Distribution Function

FGP(y|μ,α)=I(11+μα;1/α,1+y) where I(z;a,b) is the regularized incomplete beta function.

Momements

Mean

E[Y]=μ

Variance

Var[Y]=μ+αμ2

Skewness

Skew[Y]=μ1+μα(1+2μα)μ

Kurtosis

Kurtosis[Y]=1+6μα+6μ2α2μ+μ2α+3

Moment Generating Function

MXGP(t):=E[exp(tX)]=(11μα(et1))1/α

Characteristic Function

MXGP(t):=E[exp(tX)]=(11μα(eit1))1/α

Fisher Information

I(μ)=E[(μlogfGP(y))2|μ]=1μ+αμ2

Mathematica fails to calculate I(α).

Useful derivatives

μlog(fGP(y|μ,α)=yμμ+αμ2

β0log(fGP(y|μ,α)=yμ1+αμ, where μ=exp(β0+offset).

Relation to the Negative Binomial Distribution

The Gamma-Poisson and the Negative Binomial distribution are mathematically identical, they just use different parametrizations. However, this can cause a lot of confusion, especially because the R and Wikipedia again use two different parametrizations of the Negative Binomial:

  • Wikipedia uses r for the number of trials and p for the number of failures
  • R uses size for the number of trials and p for the number of successes, thus p=1p.

To convert from the Negative Binomial to the Gamma-Poisson parametrization μ=pr1p=(1p)rpα=1/r=1/size.

To convert from the Gamma-Poisson to the Negative Binomial parametrization: p=αμ1+αμp=11+αμ and r=size=1/α.

R Implementation

dgampoi <- function(x, mean, overdispersion){
  gamma(x + 1/overdispersion)/(gamma(1/overdispersion) * gamma(x + 1)) * 
    ((mean^2 * overdispersion) / (mean + mean^2 * overdispersion))^x *
    (1/(1 + mean * overdispersion))^(1/overdispersion)
}


# Or based on the existing 'dnbinom'
dgampoi2 <- function(x, mean, overdispersion){
  dnbinom(x, mu = mean, size = 1 / overdispersion)
}
library(tidyverse)
cross_df(list(x = seq(0, 200), mu = 100, alpha = c(0, 0.05, 0.2, 2))) %>%
  mutate(dens = dgampoi2(x, mean = mu, overdispersion = alpha)) %>%
  ggplot(aes(x = x-0.5)) +
    geom_step(aes(y = dens, color = as.factor(alpha)), show.legend = FALSE) +
    annotate("text", x = 12, y = 0.03, label = expression(alpha==2), size = 7) +
    annotate("text", x = 30, y = 0.003, label = expression(alpha==0.2), size = 7, hjust = 1) +
    annotate("text", x = 120, y = 0.025, label = expression(alpha==0), size = 7) +
    annotate("text", x = 80, y = 0.015, label = expression(alpha==0.05), size = 7, hjust = 1) +
    cowplot::theme_cowplot(font_size = 20) +
    coord_trans(xlim = c(0, 150), ylim = c(0, 0.04)) + 
    scale_y_continuous(expand = expansion(0)) + 
    scale_x_continuous(expand = expansion(0), breaks = seq(0, 140, by = 20)) +
    labs(title ="Density of Gamma Poisson with µ=100",
         x = "y", y = expression(f[GP](y*"|"*mu==100,alpha)))

cross_df(list(x = seq(0, 30), mu = 10, alpha = c(0, 0.05, 0.2, 2))) %>%
  mutate(dens = dgampoi2(x, mean = mu, overdispersion = alpha)) %>%
  ggplot(aes(x = x-0.5)) +
    geom_step(aes(y = dens, color = as.factor(alpha)), show.legend = FALSE) +
    cowplot::theme_cowplot(font_size = 20) +
    coord_trans(xlim = c(0, 30)) +
    scale_y_continuous(expand = expansion(0)) +
    scale_x_continuous(expand = expansion(0), breaks = seq(0, 30, by = 5)) +
    labs(title ="Density of Gamma Poisson with µ = 10",
         x = "y", y = expression(f[GP](y*"|"*mu==10,alpha)))