A Comparison of Two MCMC Algorithms for Hierarchical Mixture Models
Mixture models form an important class of models for unsupervised
learning, allowing data points to be assigned labels based on their
values. However, standard mixture model procedures do not deal well
with rare components. For example, pause times in student essays have
different lengths depending on what cognitive processes a student
engages in during the pause. Instances of student planning (and hence
very long pauses) are rare, so it is difficult to estimate the
parameters of that component from a single student's essays. A
hierarchical mixture model mitigates some of these problems by
pooling data across the higher-level units (students, in the example)
to estimate the parameters of the mixture components. One way
to estimate the parameters of a hierarchical mixture model is to use
MCMC. But these models have several issues, such as
non-identifiability under label switching, that make them difficult
to estimate using off-the-shelf MCMC tools. This paper examines the
steps necessary to
estimate these models using two popular MCMC packages: JAGS
(Metropolis-Hastings algorithm) and Stan (Hamiltonian Monte Carlo).
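To make the two issues concrete, a minimal Stan sketch of a two-component hierarchical mixture is shown below. This is an illustrative fragment, not the paper's model: the variable names and priors are assumptions, and the `ordered` type is one common device for breaking the label-switching symmetry, while the student-level means drawn from population-level hyperparameters illustrate the pooling.

```stan
// Illustrative sketch only (not the paper's model): two-component
// normal mixture of (log) pause times, with per-student component
// means partially pooled toward population means.  The ordered[]
// type forces mu[1] < mu[2], breaking label-switching symmetry.
data {
  int<lower=1> N;                          // number of pauses
  int<lower=1> J;                          // number of students
  array[N] int<lower=1, upper=J> student;  // student index per pause
  vector[N] y;                             // log pause durations
}
parameters {
  ordered[2] mu0;              // population-level component means
  vector<lower=0>[2] tau;      // between-student spread of means
  array[J] ordered[2] mu;      // per-student component means
  vector<lower=0>[2] sigma;    // within-component spread
  simplex[2] lambda;           // mixing proportions
}
model {
  mu0 ~ normal(0, 5);
  tau ~ lognormal(0, 1);
  sigma ~ lognormal(0, 1);
  for (j in 1:J)
    mu[j] ~ normal(mu0, tau);  // pooling: students share mu0, tau
  for (n in 1:N)               // marginalize the component label
    target += log_mix(lambda[1],
                      normal_lpdf(y[n] | mu[student[n], 1], sigma[1]),
                      normal_lpdf(y[n] | mu[student[n], 2], sigma[2]));
}
```

Marginalizing the discrete component label (via `log_mix`) is required in Stan, whose Hamiltonian Monte Carlo sampler cannot handle discrete parameters; in JAGS the label can instead be sampled directly.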
JAGS, Stan, and R code to estimate the models and compute model fit
statistics will be published along with the paper.