Tutorial: Bayesian Networks in Educational Assessment

Russell Almond, Florida State University
Roy Levy, Arizona State University
Duanli Yan, ETS
Diego Zapata, ETS


This is a collection of material related to our 2017 NCME Tutorial. This will also be available via memory stick at the tutorial. See you in San Antonio!

Instructions for Attendees. There is now a "live computing" exercise included in the seminar. To take part, we recommend that everybody who can bring a laptop do so.

If you don't have a laptop, you should be able to share with somebody who does. We also recommend that you complete the following steps:

  1. Download the student/demonstration version of the Netica software from Norsys. (Other possible software packages are listed below, but we will be preparing the exercises in Netica.) The student/demonstration version is enough to try the software out; we will have a temporary license code at the workshop.
  2. Download and install the appropriate version of R from CRAN. You do not need the absolute latest version; if you already have R version 3.X installed, you should be fine.
  3. [Optional] Many people prefer to run R from RStudio. You can download the free community edition of RStudio from RStudio.com.
  4. Install the rjags package from CRAN. You can do this by starting R and issuing the command install.packages("rjags").
  5. Download the code for the CPTtools, RNetica, Peanut and PNetica packages. These packages are not yet on CRAN, but can be found on the RNetica homepage. The following table has the latest versions. [Note that compiling RNetica from source (required for Unix versions) requires downloading the Netica C API from Norsys; see the INSTALL file in the tarball or the RNetica homepage for details.]
    Package   Source (Unix)          Windows             MacOS               Manual
    CPTtools  CPTtools_0.4-2.tar.gz  CPTtools_0.4-2.zip  CPTtools_0.4-4.tgz  CPTtools-manual_0.4-2.pdf
    RNetica   RNetica_0.4-5.tar.gz   RNetica_0.4-5.zip   RNetica_0.4-5.tgz   RNetica-manual_0.4-5.pdf
    Peanut    Peanut_0.2-2.tar.gz    Peanut_0.2-2.zip    Peanut_0.2-2.tgz    Peanut-manual_0.2-2.pdf
    PNetica   PNetica_0.2-2.tar.gz   PNetica_0.2-2.zip   PNetica_0.2-2.tgz   PNetica-manual_0.2-2.pdf
  6. Download the example networks to be used (see under each session below).
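For those who prefer to script steps 4 and 5, the following is a minimal sketch in R rather than an official installation procedure. It assumes the four source tarballs from the table have been saved to R's current working directory, and that installing them in the table's order satisfies their dependencies on one another (an assumption; check the RNetica homepage if an install fails). Note that rjags also needs the JAGS program itself to be installed on the machine, and that building RNetica from source needs the Netica C API as described in step 5.

```r
# Step 4: install rjags from CRAN.
# (rjags is an interface to JAGS, which must be installed separately.)
install.packages("rjags")

# Step 5: install the downloaded source packages from local files.
# ASSUMPTION: the tarballs sit in the current working directory and the
# order below (copied from the table) respects package dependencies.
tarballs <- c("CPTtools_0.4-2.tar.gz",
              "RNetica_0.4-5.tar.gz",
              "Peanut_0.2-2.tar.gz",
              "PNetica_0.2-2.tar.gz")

for (tb in tarballs) {
  if (file.exists(tb)) {
    # repos = NULL tells R to install from the local file, not from CRAN.
    install.packages(tb, repos = NULL, type = "source")
  } else {
    message("Not found, skipping: ", tb)
  }
}
```

Windows and Mac users would normally substitute the .zip or .tgz files from the table and drop the type = "source" argument.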

Mac and Linux users: Netica should run without problems in a variety of Windows emulators. In particular, it should run under WINE. I (Russell) have had success using WINE under both Mac OS X (version 10.6.8 up) and Ubuntu Linux (version 12.04 up). There are several options:

We will have this material on a CD-ROM and memory stick at the tutorial, so don't worry if you only have a slow internet connection.


Abstract

This tutorial follows the book Bayesian Networks in Educational Assessment (Almond, Mislevy, Steinberg, Yan and Williamson, 2015). The first part (Sessions I and II) contains an overview of Bayesian networks (Part I of the book), giving some examples of how they can be used. The second part (Sessions III and IV) looks at software and techniques for building networks from expert opinion and data.

Bayesian networks are a technique for managing multidimensional models. By representing the variables of the model as nodes in the graph and using edges in the graph to represent patterns of dependence and independence among the variables, the model graph serves as a bridge between educational and psychometric experts, and further helps the computer derive efficient computational strategies.
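To make the idea of nodes and edges concrete, here is a small sketch in base R of the simplest possible network: one edge from a Skill node to a Response node. The node names and every probability are invented for illustration and are not taken from the book or the tutorial networks. The code builds the joint distribution from the prior and the conditional probability table, then applies Bayes' rule after observing a correct response.

```r
# Hypothetical two-node network: Skill -> Response.
# All probabilities below are invented for illustration.

# Prior distribution over the parent node (Skill).
prior <- c(High = 0.5, Low = 0.5)

# Conditional probability table for the child node (Response);
# each row gives P(Response | Skill) and sums to 1.
cpt <- rbind(
  High = c(Correct = 0.8, Incorrect = 0.2),
  Low  = c(Correct = 0.3, Incorrect = 0.7)
)

# Joint distribution P(Skill, Response): row i of the CPT is scaled by prior[i].
joint <- prior * cpt

# Observe a correct response and renormalize that column (Bayes' rule).
posterior <- joint[, "Correct"] / sum(joint[, "Correct"])
print(round(posterior, 3))
```

With these made-up numbers, a correct response raises the probability of High skill from 0.5 to about 0.73. Real assessment networks have many more nodes, which is where dedicated engines such as Netica become necessary.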

This tutorial is based on the book Bayesian Networks in Educational Assessment now out from Springer.

Slides and Handouts

I. Evidence Centered Design and Bayesian Networks
Covers basic models of ECD and their application to Bayes nets. Slides (PDF), Handout (PDF), Session I networks (Netica).
II. Bayes Net Applications including ACED
This part looks at a number of simple applications of Bayes nets to provide more intuition about how they work. Slides (PDF), Handout (PDF), Simple Example Networks (Netica), ACED Subset (Netica).
III. RNetica and CPTtools
This session looks at tools for using and building Bayesian networks in R, particularly the CPTtools and RNetica packages. It includes examples of scoring and of using the built-in EM algorithm to fit models to data. The talk is split into two sets of slides. RNetica Slides (PDF), RNetica Handout (PDF), mini-ACED (Netica). Learning CPTs Slides (PDF), Learning CPTs Handout (PDF), A simple Learning Example.
IV. Advanced Topics
Covers two topics: learning with Markov chain Monte Carlo (MCMC) and dynamic Bayesian networks (networks that unfold across time). MCMC Slides (PDF), MCMC Handout (PDF). DBN Slides (PDF), DBN Handout (PDF). All Session IV networks (Netica).
Bibliography
Bayes net and ECD Bibliography (Note: this is an out-of-date version of the book bibliography).

The handout version is also available as one big file containing all sessions and the bibliography. Honkin' big handout (PDF).

On-line Resources

For quick reference, here are the on-line resources referenced in the bibliography.

Computer programs and documentation available on the Web:

This is a partial list of software packages we have used or think are worth paying attention to. The list of Bayes net software found at the bottom of the Bayesian network Wikipedia entry http://en.wikipedia.org/wiki/Bayesian_network is a reasonably complete and up to date list of both free and commercial software.

Netica (Norsys Software Corp.)
http://www.norsys.com/ Netica is a very complete commercial-grade Bayes net engine that includes some learning tools.
RNetica (Netica API for R)
http://pluto.coe.fsu.edu/RNetica This is a work-in-progress binding of the Netica API for the R language. Currently only a source version is available. (Windows and Mac binaries will be available at the conference.)
Genie/Smile (Decision Systems Lab, Univ. of Pittsburgh)
http://genie.sis.pitt.edu/ Open source project, free under GNU Public License. Also contains a "translator" which translates between network formats.

Useful (Bayesian) Statistical Software

BUGS (Bayesian inference Using Gibbs Sampling).
http://www.mrc-bsu.cam.ac.uk/bugs/welcome.shtml Downloadable version for Windows. BUGS is no longer actively maintained. For serious work, I recommend OpenBUGS http://mathstat.helsinki.fi/openbugs/
JAGS (Just Another Gibbs Sampler)
https://sourceforge.net/projects/mcmc-jags/ A rewrite of Classic BUGS (command line only, no GUI support) that runs under Linux, MacOS X, and Windows.
FBM: Flexible Bayesian Modeling
http://www.cs.utoronto.ca/~radford/fbm.software.html Radford Neal's Flexible Bayesian Modeling and Markov Chain Sampler.
R
http://www.r-project.org/ General purpose statistical computing environment based on S language.
Stan
http://mc-stan.org/ Stan is a package for obtaining Bayesian inference using the No-U-Turn sampler, a variant of Hamiltonian Monte Carlo.

Other On-Line Resources:

ECD Wiki
http://ecd.ralmond.net/ecdwiki/ Email Russell to get a password to contribute to the discussion.
Book page on the Wiki
http://ecd.ralmond.net/ecdwiki/BN/BN. We are slowly working at getting sample networks, errata and other resources for working through the book up at this site.
ACED Page on ECD Wiki
http://ecd.ralmond.net/ecdwiki/ACED/ACED Complete data from ACED field trial and ACED Bayes net are available at this site. This is a Wiki using the same user name and password as the ECD wiki.
Heckerman tutorial on learning (Heckerman, D. [1995])
ftp://ftp.research.microsoft.com/pub/tr/tr-95-06.pdf Note: Other Microsoft Research technical reports are available on-line from http://www.research.microsoft.com/
Association for Uncertainty in Artificial Intelligence home page
http://www.auai.org/ The UAI conference proceedings are the most important publication in this area.
CRESST Technical Report Archive
http://www.cse.ucla.edu/products/reports.asp Early versions of many of the Mislevy references (including in press references) are available here. (Hint: search for "Mislevy".) The CRESST web site changes frequently, so this link may be out of date. If the link is broken, search the web for "CRESST Reports".
CiteSeer Cross-Reference Database
http://citeseer.ist.psu.edu/cis On-line cross reference database with lots of articles on Bayes nets. Many of the bibliography entries are available through CiteSeer.

Copyright

All handouts and slides from this tutorial are unpublished work of their respective authors. Sessions I and II are Copyright 2002–17 by Educational Testing Service. Session III (RNetica and Learning CPTs) is Copyright 2017 by Russell G. Almond with some material copyright 2002–15 by ETS (Used by permission). Session IV (MCMC and DBN) is copyright 2017 by Roy Levy.

These materials are an unpublished, proprietary work of their respective authors. Any limited distribution shall not constitute publication. This work may not be reproduced or distributed to third parties without the author's prior written consent. Submit requests for the ETS material through http://www.ets.org/legal/copyright.html. Requests for material from Russell Almond and Roy Levy should be sent to the respective authors.

ACED development and data collection were sponsored by National Science Foundation Grant No. 0313202. Thanks to Val Shute for permission to use ACED data in this tutorial.


almond (at) acm.org
ralmond (at) fsu.edu
Last modified: Friday, Apr 21, 2017