--- title: "Knowledge Tracing EM" author: "Russell Almond" date: "1/20/2022" output: html_document runtime: shiny --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) library(tidyverse) library(DiagrammeR) ## This is syntaxtic sugar, so I can use this idiom in a pipe. applyappend <- function(listOfData) do.call(rbind,listOfData) ``` ## Knowledge Tracing Model Corbett and Anderson (1995) Hidden Markov Model Markov means past is independent of future given present. Hidden means one variable is latent. ```{r hmm-dot, fig.cap="Knowledge Tracing (Hidden Markov) model"} DiagrammeR::grViz("digraph hmm { graph[layout=dot] node[shape=circle] subgraph hidden { rank=min; L0; L1; L2; L3; L4; L5 L0 -> L1 -> L2 -> L3 -> L4 -> L5 } X0; X1; X2; X3; X4; X5 L0 -> X0; L1 -> X1; L2 -> X2; L3 -> X3; L4 -> X4; L5->X5; }") ``` $i$ represents opportunity to practice Knowledge, Skill or Ability (KSA) $L_i=1$ is mastery, $L_i=0$ is non-mastery at Time $i$ $X_i$ is response at $i$th opportunity (1=success, 0=failure) [Knowledge Tracing App](https://catherinesyeh.github.io/bkt-balloon/) ## Parameters of the Knowledge Tracing Model * Assumption 1: No forgetting -- $L_{i+1} \ge L_i$ * Assumption 2: Constant learning rate $P(L_{i+1}=1|L_i=0) = p(T)$, `learn` * Assumption 3: Constant item parameters - `guess` $P(X_i=1|L_i=0)$ - `slip` $P(X_i=0|L_i=1)$ Need one more parameter, $P(L_0=1)$, `init` ## Stopping Rule (Russell Addition) Original Corbett and Anderson paper assumed all students do same number of practices. This simulation is based on a tutoring system, where students can work on as many examples as they like. - `quit0` $P(quit|L_i=0)$ - `quit1` $P(quit|L_i=1)$ ```{r generator} simkt <- function(id=0,init=.5,learn=.25, guess=.2,slip=.2, quit0=.1,quit1=.25) { result <- numeric() # Result vector mastery <- runif(1) < init mvec <- numeric() done <- FALSE while (!done) { mvec <- c(mastery,mvec) result <- c( runif(1) < ifelse(mastery,1-slip,guess), result) if (!mastery) mastery <- runif(1) < learn done <- runif(1) < ifelse(mastery,quit1,quit0) } data.frame(id=id, trial=1:length(result), result=rev(result), mastery=rev(mvec)) } simkt(-1) ``` ## Now generate sample data Use `lapply` as this makes it easier to make things parallel; load `parallel` package and use `mclappy` instead of `lapply`. Wrap call to `simkt` in an anonymous function to make changing the parameters easier. The `do.call(rbind,...)` trick glues the data together. ```{r generate} N <- 100 lapply(1:N, function (id) { simkt(id) }) %>% applyappend -> simktdat ``` ## Thinking about the EM algorithm If we knew $\bf L$, then fitting the parameters would be each. The sufficient statistic for the model is a bunch of crosstabs: $L_i \times X_i$, $L_i \times L_{i-1}$, $L_1$. The expected value can be found by plugging in probabilities for $L_i$. E-Step: Use forward-backward algorithm to estimate probabilities and calculate CPTs M-Step: Solve CPTs. ## Forward Backward Algorithm https://en.wikipedia.org/wiki/Forward%E2%80%93backward_algorithm $$\pi_0^{(r)} = \left ( \begin{array}{c} P(L_0) \\ 1-P(L_0) \end{array} \right ) $$ Transition Matrix $$ {\bf T} = \left ( \begin{array}{cc} 1.0 & 0.0 \\ P(L_{t} | L_{t-1}) & 1- P(L_{t} | L_{t-1}) \\ \end{array}\right ) $$
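To make the E-step concrete, here is a minimal sketch of the forward-backward pass for a single student's response vector, under the assumptions above. The function name `fwdbwd` and its layout are illustrative choices, not from Corbett and Anderson, and for simplicity the sketch ignores the information carried by the stopping time (the `quit0`/`quit1` parameters). State 1 is mastered and state 2 is unmastered, matching the ordering of $\pi_0$.

```{r fwdbwd}
fwdbwd <- function(x, init=.5, learn=.25, guess=.2, slip=.2) {
  ## Transition matrix: rows = from-state, columns = to-state
  Tmat <- matrix(c(1, 0,                 # mastered stays mastered (no forgetting)
                   learn, 1 - learn),    # unmastered may learn
                 nrow = 2, byrow = TRUE)
  ## Emission probabilities: c(P(x | mastered), P(x | unmastered))
  emit <- function(xi) c(ifelse(xi == 1, 1 - slip, slip),
                         ifelse(xi == 1, guess, 1 - guess))
  n <- length(x)
  alpha <- matrix(NA_real_, 2, n)   # forward messages
  beta <- matrix(NA_real_, 2, n)    # backward messages
  alpha[, 1] <- c(init, 1 - init) * emit(x[1])
  alpha[, 1] <- alpha[, 1] / sum(alpha[, 1])   # normalize to avoid underflow
  for (t in seq_len(n)[-1]) {
    alpha[, t] <- (t(Tmat) %*% alpha[, t - 1]) * emit(x[t])
    alpha[, t] <- alpha[, t] / sum(alpha[, t])
  }
  beta[, n] <- c(1, 1)
  for (t in rev(seq_len(n - 1))) {
    beta[, t] <- Tmat %*% (emit(x[t + 1]) * beta[, t + 1])
    beta[, t] <- beta[, t] / sum(beta[, t])
  }
  gamma <- alpha * beta             # proportional to P(L_t | all responses)
  sweep(gamma, 2, colSums(gamma), "/")
}

## Posterior probability of mastery at each trial for student 1
fwdbwd(simktdat$result[simktdat$id == 1])[1, ]
```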
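Because the simulated data keeps the true `mastery` column, we can also show why the M-step is easy: each parameter estimate is just a ratio from the crosstabs. In the real EM, the true states are unknown, and these counts would be replaced by expected counts from the E-step. A sketch, assuming `simktdat` from above (the name `ktm` is just illustrative):

```{r mstep}
## Complete-data parameter estimates from the crosstabs.
## (Only possible here because the data are simulated; EM substitutes
## expected counts from the E-step for the true-state counts.)
simktdat %>%
  group_by(id) %>%
  mutate(prevmastery = lag(mastery)) %>%   # L_{i-1} within each student
  ungroup() -> ktm

c(init  = mean(ktm$mastery[ktm$trial == 1]),
  learn = mean(ktm$mastery[ktm$prevmastery == 0], na.rm = TRUE),
  guess = mean(ktm$result[ktm$mastery == 0]),
  slip  = mean(1 - ktm$result[ktm$mastery == 1]))
```

These should land near the generating values (`init=.5`, `learn=.25`, `guess=.2`, `slip=.2`) used in `simkt`.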