glmGamPoi

[Figure: Histogram of gene expression counts from a single-cell experiment]

The glmGamPoi package is optimized for fitting Gamma-Poisson [1] models to large single-cell datasets. It supports on-disk data via the HDF5Array package and is faster than edgeR or DESeq2 on large datasets.
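
Gamma-Poisson is another name for the Negative Binomial: a Poisson count whose rate is itself drawn from a Gamma distribution. The following is a minimal base-R sketch of that equivalence and does not use glmGamPoi. The values of `mu` and `theta` are illustrative assumptions, and the parameterization `Var = mu + theta * mu^2` is the usual mean/overdispersion form.

```r
# Gamma-Poisson as a mixture: draw a rate from a Gamma, then a count from a Poisson.
set.seed(42)
n     <- 1e5
mu    <- 4    # mean (illustrative value)
theta <- 0.5  # overdispersion (illustrative value); Var = mu + theta * mu^2

# A Gamma with shape 1/theta and scale theta * mu has mean mu and variance theta * mu^2.
lambda <- rgamma(n, shape = 1 / theta, scale = theta * mu)
y_mix  <- rpois(n, lambda)

# The same distribution drawn directly as a Negative Binomial with size = 1/theta.
y_nb <- rnbinom(n, mu = mu, size = 1 / theta)

# Both samples should have mean close to mu and variance close to mu + theta * mu^2.
expected_var <- mu + theta * mu^2
print(c(mean_mix = mean(y_mix), mean_nb = mean(y_nb), expected = mu))
print(c(var_mix = var(y_mix), var_nb = var(y_nb), expected = expected_var))
```

With `theta = 0` the Gamma collapses to a point mass at `mu` and the counts are plain Poisson, so `theta` measures the extra variance beyond Poisson.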

glmGamPoi has been available from Bioconductor since release 3.11 (May 2020). To install it:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("glmGamPoi")

Example

Fit a Gamma-Poisson model directly to a single-cell dataset (here, the "pbmc4k" data from TENxPBMCData):

sce <- TENxPBMCData::TENxPBMCData("pbmc4k")
## snapshotDate(): 2020-04-27
## see ?TENxPBMCData and browseVignettes('TENxPBMCData') for documentation
## loading from cache
system.time(
  fit <- glmGamPoi::glm_gp(sce, on_disk = FALSE)
)
##    user  system elapsed 
##  80.529  11.687  92.399
summary(fit)
## glmGamPoiFit object:
## The data had 33694 rows and 4340 columns.
## A model with 1 coefficient was fitted.
## The design formula is: Y~1
## 
## Beta:
##            Min 1st Qu. Median 3rd Qu.  Max
## Intercept -Inf    -Inf  -7.68   -3.61 5.37
## 
## deviance:
##  Min 1st Qu. Median 3rd Qu.   Max
##    0       0   30.8     746 27797
## 
## overdispersion:
##  Min 1st Qu. Median 3rd Qu.   Max
##    0       0      0     0.5 26483
## 
## Shrunken quasi-likelihood overdispersion:
##    Min 1st Qu. Median 3rd Qu.  Max
##  0.449       1      1       1 36.7
## 
## size_factors:
##    Min 1st Qu. Median 3rd Qu.  Max
##  0.743   0.968      1    1.03 2.06
## 
## Mu:
##  Min 1st Qu.   Median 3rd Qu. Max
##    0       0 0.000489  0.0269 442
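
For a quicker check that does not require downloading the PBMC data, the same call can be tried on simulated counts. This is a sketch under assumptions: the matrix dimensions and the simulation parameters (`true_mu`, `true_theta`) are made up for illustration, and glmGamPoi must already be installed. The intercept-only design `~ 1` corresponds to the `Y~1` formula shown in the summary above.

```r
library(glmGamPoi)

set.seed(1)
n_genes <- 200  # illustrative size, far smaller than a real experiment
n_cells <- 10
true_mu    <- 5    # assumed mean count per gene
true_theta <- 0.3  # assumed overdispersion; rnbinom() uses size = 1 / theta

# Simulate a genes x cells count matrix from a Gamma-Poisson (Negative Binomial).
counts <- matrix(
  rnbinom(n_genes * n_cells, mu = true_mu, size = 1 / true_theta),
  nrow = n_genes, ncol = n_cells
)

# Fit an intercept-only model to every gene (row).
fit <- glm_gp(counts, design = ~ 1)

# One coefficient row and one overdispersion estimate per gene.
dim(fit$Beta)
length(fit$overdispersions)

# Coefficients are on the natural-log scale, so exp() returns them to the count scale.
summary(exp(fit$Beta[, 1]))
```

Per-gene overdispersion estimates from only 10 cells are noisy, so individual values can differ substantially from `true_theta`.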

  1. Gamma-Poisson is another name for the Negative Binomial distribution.