MCMC for Hierarchical Mixture Models

The purpose of this page is to share some of my R, JAGS and Stan code for fitting hierarchical mixture models. Hopefully, others will find this of use. I also may get some people to help me with feedback for improving the way these models are fit.

The following working paper describes the model that I am working with and some of the approaches in the code. The paper was presented at the 2014 Bayesian Application Workshop as part of the Uncertainty In Artificial Intelligence Conference. The slides from that conference are available (Power point, PDF), as are four graphs which show a direct comparison between the JAGS and Stan (unconstrained) runs for the case were both the data generation and estimated model have K=2:

I have not pulled this together into a formal R package. I have a couple of files full of R functions plus a couple of longer script files for running the whole script. At the moment, this script file needs a fair bit of fine tuning for any application, so I'm not ready to make this into a package yet.

Prerequisites

Code

Test Data

Results from Running on test data

Output scripts and plots

The .Rout files are plain text which should be readable by any reasonable text reader (Note: Microsoft's NOTEPAD.EXE is broken, don't use it.) The .1.pdf files are from the first part of the MCMC run and the .2.pdf files are from the second part of the MCMC run.
Data SetModel FitJAGSStan (constrained)Stan (unconstrained)
K=2K=2 Log File
1st Run Plots
2nd Run Plots
Log File
1st Run Plots
2nd Run Plots
Log File
1st Run Plots
2nd Run Plots
K=2K=3 Log File
1st Run Plots
2nd Run Plots
Log File
1st Run Plots
2nd Run Plots
Log File
1st Run Plots
2nd Run Plots
K=2K=4 Log File
1st Run Plots
2nd Run Plots
Log File
1st Run Plots
2nd Run Plots
Log File
1st Run Plots
2nd Run Plots
K=3K=2 Log File
1st Run Plots
2nd Run Plots
Log File
1st Run Plots
2nd Run Plots
Log File
1st Run Plots
2nd Run Plots
K=3K=3 Log File
1st Run Plots
2nd Run Plots
Log File
1st Run Plots
2nd Run Plots
Log File
1st Run Plots
2nd Run Plots
K=3K=4 Log File
1st Run Plots
2nd Run Plots
Log File
1st Run Plots
2nd Run Plots
Log File
1st Run Plots
2nd Run Plots
K=4K=2 Log File
1st Run Plots
2nd Run Plots
Log File
1st Run Plots
2nd Run Plots
Log File
1st Run Plots
2nd Run Plots
K=4K=3 Log File
1st Run Plots
2nd Run Plots
Log File
1st Run Plots
2nd Run Plots
Log File
1st Run Plots
2nd Run Plots
K=4K=4 Log File
1st Run Plots
2nd Run Plots
Log File
1st Run Plots
2nd Run Plots
Log File
1st Run Plots
2nd Run Plots

Run Times

Times are user times in minutes. Other processing was happening during runs, but never more than a 2–3 process machine. Runs were on a ZaReason laptop with 8 CPUs running Ubuntu 14.4.
Data SetModel FitJAGSStan (constrained)Stan (unconstrained)
K=2K=211.834.131.3
K=2K=324.760.6131.2
K=2K=451.2348.8389.0
K=3K=212.237.821.6
K=3K=332.276.857.9
K=3K=450.0337.5140.9
K=4K=211.181.990.8
K=4K=331.4438.2198.9
K=4K=451.8461.4340.6

Convergence Results

This table summarizes the maximum value of R-hat for the Level-2 parameters for each data/model combination using each method. The minimum effective MCMC sample size is given in parenthesis (note that the full MC sample is 45,000 cycles in each case).
Data SetModel FitJAGSStan (constrained)Stan (unconstrained)
K=2K=21.6 (142) 1.9 (308)1.4 (3084)
K=2K=32.0 (23)2.0 (278)1.5 (3164)
K=2K=41.6 (19)2.1 (10)1.3 (9917)
K=3K=22.2 (101)1.4 (1389)1.5 (1163)
K=3K=31.4 (64)2.7 (636)1.3 (462)
K=3K=41.6 (147)1.9 (60)1.4 (486)
K=4K=21.3 (17)1.4 (292)1.5 (805)
K=4K=31.8 (31)1.5 (108)1.3 (799)
K=4K=41.5 (45)1.5 (252)1.7 (647)

Model Fit indexes

This table looks at the DIC, and WAIC indexes for each model. Look at Chapter 7 of Bayesian Data Analysis, 3rd Edition for definitions. The number in parenthesis is the estimated dimensionality of the model. Note that the Markov chains have not reached pseudo-convergence, so these results need to be taken with an entire carton of salt.

If we were calculating AIC instead of WAIC, the number of parameters would vary according to the second column. The parameter count is (3K-1)I, where in all data sets I=10; so the counts should be 50 (K=2), 80 (K=3) or 110 (K=4). Nominally, the WAIC value should be lowest when the first and second columns match.
WAIC1
Data SetModel FitJAGSStan (constrained)Stan (unconstrained)
K=2K=21132 (36.4)1131 (36.3)1132 (36.4)
K=2K=31132 (35.5)1133 (36.5)1131 (35.5)
K=2K=41135 (38)1133 (35.8)1132 (35.5)
K=3K=21361 (67.7)1326 (49.6)1320 (43.0)
K=3K=31190 (60.29)1188 (60.7)1186.6 (60.0)
K=3K=41187 (63.8)1188 (60.6)1188 (60.9)
K=4K=21227 (50.5)1260 (77.3)1234 (53.3)
K=4K=31124 (73.4)1125 (74.9)1128 (76.6)
K=4K=41120 (97.7)1119 (97.1)1116 (93.4)


WAIC2
Data SetModel FitJAGSStan (constrained)Stan (unconstrained)
K=2K=21132 (36.3)1130 (36.3)1131 (36.3)
K=2K=31132 (35.5)1133 (36.5)1131 (35.5)
K=2K=41135 (38.2)1133 (35.8)1132 (35.5)
K=3K=21361 (67.7)1326 (49.7)1320 (43.0)
K=3K=31190 (60.3)1189 (60.7)1187 (60.0)
K=3K=41187 (63.8)1188 (60.5)1188 (60.9)
K=4K=21227 (50.5)1260 (77.3)1234 (53.3)
K=4K=31124 (73.4)1125 (74.9)1128 (76.6)
K=4K=41121 (97.7)1118 (97.1)1116 (93.5)


Something strange is happening currently in the DIC calculations. The pDIC is almost always turning out negative, and sometimes the DIC is negative as well. There is likely a bug in the code, I need to check it.
DIC
Data SetModel FitJAGSStan (constrained)Stan (unconstrained)
K=2K=2###
K=2K=3###
K=2K=4###
K=3K=2###
K=3K=3###
K=3K=4###
K=4K=2###
K=4K=3###
K=4K=4###


This is based on the DIC alternative suggested by Andrew Gelman, which uses the variance of the deviance to estimate the dimensionality of the model.
DIC alternate
Data SetModel FitJAGSStan (constrained)Stan (unconstrained)
K=2K=21318 (56.7)1572 (55.2)1475 (57.1)
K=2K=32546.7 (55.7)2417 (57.8)2372 (56.5)
K=2K=42822 (61.3)13309 (59.7)16352 (57.9)
K=3K=21805 (127.2)1326 (49.7)1458 (63.1)
K=3K=31548 (107.3)1839 (112.9)1601 (117.2)
K=3K=41686 (123.0)3579 (126.3)2314 (119.0)
K=4K=21359 (75.9)2132 (162.8)1767 (127.9)
K=4K=31760 (103.7)2795 (117.5)2382 (116.7)
K=4K=42008 (196.0)3217.9 (191.1)5174 (211.1)


Page maintained by Russell Almond. Last updated on 2014-06-23.