mixdir

mixdir implements a Bayesian clustering algorithm for high-dimensional categorical data.

A detailed description of the algorithm and the features of the package can be found in the accompanying paper. If you find the package useful, please cite:

C. Ahlmann-Eltze and C. Yau, “MixDir: Scalable Bayesian Clustering for High-Dimensional Categorical Data”, 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA), Turin, Italy, 2018, pp. 526-539.

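# Load the package and the bundled mushroom example data
library(mixdir)
data("mushroom")

# Cluster the first 1000 rows and first 5 columns into 3 latent classes
result <- mixdir(mushroom[1:1000, 1:5], n_latent=3)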
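Once the model is fitted, the cluster assignments can be inspected. The following is a minimal sketch, assuming that the fitted object exposes the hard assignments as `pred_class` and the per-observation class probabilities as `class_prob`. These element names are recalled from the package documentation and should be checked against `?mixdir`. The call to `set.seed()` is an addition to make the run repeatable.

```r
library(mixdir)
data("mushroom")

# Fix the random seed so that repeated runs give the same clustering
set.seed(1)
result <- mixdir(mushroom[1:1000, 1:5], n_latent=3)

# Assumed element: most likely latent class for each observation
table(result$pred_class)

# Assumed element: matrix of class membership probabilities,
# one row per observation and one column per latent class
head(result$class_prob)
```

Rows of `class_prob` with no clearly dominant entry mark observations whose cluster membership is uncertain.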

For more information, see the mixdir CRAN page and the corresponding GitHub repository.

Publications

Cluster high-dimensional categorical observations using an approximate Bayesian inference algorithm