\name{LearnCPTs} \alias{LearnCPTs} \title{Learn Conditional Probability Tables with Missing Data.} \description{ This function updates the conditional probabilities associated with the given list of nodes based on the findings associated with that node and its parents found in the \code{caseStream} argument, which should be a \code{\linkS4class{CaseStream}} object. Unlike \code{\link{LearnCases}}, these algorithms can support cases with missing or latent variables. } \usage{ LearnCPTs(caseStream, nodelist, method = "COUNTING", maxIters = 1000L, maxTol = 1e-06, weight = 1) } \arguments{ \item{caseStream}{This should be a \code{\linkS4class{CaseStream}} object, or else an object which can be made into a case stream: either a pathname for a case file, or a data frame of the format described in \code{\linkS4class{MemoryCaseStream}}. The case stream can be either opened or closed. If closed it is reopened before updating. In either case, it is closed at the end of the function. \bold{Warning}, due to a bug in Netica, memory streams are not working and should not be used with Netica API 5.04 or earlier. See below. } \item{nodelist}{ A list of active \code{\linkS4class{NeticaNode}} objects that reference the conditional probability tables to be updated. } \item{method}{A character scalar giving the name of the method to be used. This should be one of \dQuote{GRADIENT}, \dQuote{EM} or \dQuote{COUNTING} (the default). See details. } \item{maxIters}{An integer scalar giving the maximum number of interactions for the EM and gradient decent algorithms. } \item{maxTol}{A real scalar giving the difference in log-likelihood required before the EM or gradient decent algorithms to be considered converged. } \item{weight}{ A multiplier for the weights of the cases in terms of number of observations. Negative weights unlearn previously learned cases. } } \details{ This function attempts to update the conditional probability tables of the nodes named in \code{nodelist} using the data referenced in the first argument. Three different algorithms are available: \emph{Counting}, \emph{EM} and \emph{Gradient Decent}. The \emph{Counting} algorithm cannot handle cases with missing data or latent variables in the model. The \code{method} argument determines which method is used. The \emph{Counting} algorithm is the same as the one used in \code{\link{LearnCases}}. Cases where either the parent or the child variable is missing are ignored when updating the conditional probability table for the node, that is the neither affect the \code{\link{NodeExperience}} or the \code{\link{NodeProbs}}. As a consequence, models with latent variables cannot be fit with this algorithm. The \emph{EM} is similar to the \emph{Counting} algorithms, but does more intelligent things with missing observations (particularly, missing parent variables). In particular, the complete data case of the \emph{EM} algorithm is the same as the counting algorithm. The \emph{Gradient Decent} algorithm is an alternative iterative algorithm. According to the Netica documentation it is similar to back propagation in neural networks. Again according to Netica, it is faster than EM, but more likely to find a local maxima. It appears not to respect prior information about the conditional probability tables, and it sets the node experience to \code{-Inf}. Both \code{EM} and \code{Gradient Decent} are an iterative algorithms. For these algorithms \code{maxIters} gives the maximum number of iterations, and \code{maxTol} gives the convergence criteria (required difference in log likelihood). These parameters are ignored for the \emph{Counting} algorithm. Currently, Netica gives no indication of whether the algorithm terminated by achieving convergence (difference in log likelihood less than \code{maxTol}) or by exceeding \code{maxIters}. Norsys says they will fix this in an upcoming release. If the case stream has a column \code{NumCases}, then the weight assigned to Row \eqn{j} is \code{weight*NumCases[j]}. If the case stream does not have such a column, then it is treated as if each column has weight 1. (Among other purposes, this allows case data to be stored in a compact format where all of the possible cases are enumerated along with a count of repetitions.) Note that negative weights will unlearn cases. } \value{ Currently, \code{NULL} is returned. In the future, an object containing details about the convergence will be returned. } \references{ \newcommand{\nref}{\href{http://norsys.com/onLineAPIManual/functions/#1.html}{#1()}} \url{http://norsys.com/onLineAPIManual/index.html}: \nref{LearnCPTS_bn}, \nref{NewLearner_bn}, \nref{SetLearnerMaxTol_bn}, \nref{SetLearnerMaxTol_bn} } \author{Russell G. Almond} \note{ The \code{LearnCPTs} function will not update the conditional probability table of a node unless \code{\link{NodeExperience}} has been set for that node. Instead it will issue a warning and update the other nodes. } \section{Netica Bugs}{ In version 5.04 of the Netica API, there is no indication of whether the call to LearnCPTs_bn has converged (terminated because the difference in log likelihood is less than \code{maxTol}) or not (terminated because the number of iterations exceeded \code{maxIters}). Norsys has indicated that they will add this functionality to a later release. In version 5.04 of the Netica API, there is a problem with using Memory Streams that seems to affect the functions \code{\link{LearnCases}} and \code{\link{LearnCPTs}}. Until this problem is fixed, most uses of Memory Streams will require file streams instead. Write the case file using \code{\link{write.CaseFile}}, and then create a file stream using \code{\link{CaseFileStream}}. } \seealso{ \code{\link{NodeExperience}}, \code{\link{NodeProbs}}, \code{\link{NodeFinding}}, \code{\link{FadeCPT}}, \code{\link{RetractNetFindings}}, \code{\link{LearnFindings}} \code{\link{LearnCases}} } \examples{ sess <- NeticaSession() startSession(sess) abb <- CreateNetwork("ABB", session=sess) A <- NewDiscreteNode(abb,"A",c("A1","A2")) B1 <- NewDiscreteNode(abb,"B1",c("B1","B2")) B2 <- NewDiscreteNode(abb,"B2",c("B1","B2")) AddLink(A,B1) AddLink(A,B2) A[] <- c(.5,.5) NodeExperience(A) <- 10 B1["A1"] <- c(.8,.2) B1["A2"] <- c(.2,.8) B2["A1"] <- c(.8,.2) B2["A2"] <- c(.2,.8) NodeExperience(B1) <- c(10,10) NodeExperience(B2) <- c(10,10) casesabb <- data.frame(A=c("A1","A1","A1","A1","A1","A2","A2","A2","A2","A2"), B1=c("B1","B1","B1","B2","B2","B2","B2","B2","B1","B1"), B2=c("B1","B1","B1","B1","B2","B2","B2","B2","B2","B1")) ## LearnCPTs(casesabb,list(A,B1)) ## There is currently a bug in Netica, so that this function does not ## work with memory streams. As a work around, use proper file streams ## instead. outfile <- tempfile("casesabb",fileext=".cas") write.CaseFile(casesabb,outfile, session=sess) LearnCPTs(outfile,list(A,B1)) ## Probs for A & B1 modified, but B2 left alone stopifnot( NodeExperience(A)==20, NodeExperience(B1)==c(15,15), NodeExperience(B2)==c(10,10), sum(abs(NodeProbs(A) - .5)) < .001, sum(abs(B1["A1",drop=TRUE] - c(11,4)/15)) < .001, sum(abs(B1["A2",drop=TRUE] - c(4,11)/15)) < .001, sum(abs(B2["A1",drop=TRUE] - c(8,2)/10)) < .001, sum(abs(B2["A2",drop=TRUE] - c(2,8)/10)) < .001 ) ## Missing Data ## NAs in parents affect both parent and child. casesabb1 <- data.frame(A=c("A1","A1","NA","A1","A1","A2","A2","A2","A2","A2"), B1=c("B1","B1","B1","B2","B2","B2","B2","NA","B1","B1"), B2=c("B1","B1","B1","NA","B2","B2","B2","B2","B2","B1")) outfile1 <- tempfile("casesabb1",fileext=".cas") write.CaseFile(casesabb1,outfile1, session=sess) LearnCPTs(outfile1,list(A,B1,B2)) stopifnot( NodeExperience(A)==29, NodeExperience(B1)==c(19,19), NodeExperience(B2)==c(13,15), sum(abs(NodeProbs(A) - c(14,15)/29)) < .001, sum(abs(B1["A1",drop=TRUE] - c(13,6)/19)) < .001, sum(abs(B1["A2",drop=TRUE] - c(6,13)/19)) < .001, sum(abs(B2["A1",drop=TRUE] - c(10,3)/13)) < .001, sum(abs(B2["A2",drop=TRUE] - c(3,12)/15)) < .001 ) DeleteNetwork(abb) #################################### ## Start again with EM learning. abb <- CreateNetwork("ABB", session=sess) A <- NewDiscreteNode(abb,"A",c("A1","A2")) B1 <- NewDiscreteNode(abb,"B1",c("B1","B2")) B2 <- NewDiscreteNode(abb,"B2",c("B1","B2")) AddLink(A,B1) AddLink(A,B2) A[] <- c(.5,.5) NodeExperience(A) <- 10 B1["A1"] <- c(.8,.2) B1["A2"] <- c(.2,.8) B2["A1"] <- c(.8,.2) B2["A2"] <- c(.2,.8) NodeExperience(B1) <- c(10,10) NodeExperience(B2) <- c(10,10) casesabb <- data.frame(A=c("A1","A1","A1","A1","A1","A2","A2","A2","A2","A2"), B1=c("B1","B1","B1","B2","B2","B2","B2","B2","B1","B1"), B2=c("B1","B1","B1","B1","B2","B2","B2","B2","B2","B1")) ## LearnCPTs(casesabb,list(A,B1),method="EM") ## There is currently a bug in Netica, so that this function does not ## work with memory streams. As a work around, use proper file streams ## instead. outfile <- tempfile("casesabb",fileext=".cas") write.CaseFile(casesabb,outfile, session=sess) LearnCPTs(outfile,list(A,B1),method="EM") ## Complete data, this should look identical to the counting case. ## Note that NodeExperience is no longer an integer stopifnot( abs(NodeExperience(A)-20) < .001, sum(abs(NodeExperience(B1)-c(15,15))) < .001, NodeExperience(B2)==c(10,10), sum(abs(NodeProbs(A) - .5)) < .001, sum(abs(B1["A1",drop=TRUE] - c(11,4)/15)) < .001, sum(abs(B1["A2",drop=TRUE] - c(4,11)/15)) < .001, sum(abs(B2["A1",drop=TRUE] - c(8,2)/10)) < .001, sum(abs(B2["A2",drop=TRUE] - c(2,8)/10)) < .001 ) ## Missing Data ## EM deals more intelligently with missing data. casesabb1 <- data.frame(A=c("A1","A1","NA","A1","A1","A2","A2","A2","A2","A2"), B1=c("B1","B1","B1","B2","B2","B2","B2","NA","B1","B1"), B2=c("B1","B1","B1","NA","B2","B2","B2","B2","B2","B1")) outfile1 <- tempfile("casesabb1",fileext=".cas") write.CaseFile(casesabb1,outfile1, session=sess) LearnCPTs(outfile1,list(A,B1,B2),method="EM") stopifnot( NodeExperience(A)>29, NodeExperience(B1)>c(19,19), NodeExperience(B2)>c(13,15) ) ## EM can handle complete latent variable case. casesabb2 <- data.frame(B1=c("B1","B1","B1","B2","B2","B2","B2","NA","B1","B1"), B2=c("B1","B1","B1","NA","B2","B2","B2","B2","B2","B1")) outfile2 <- tempfile("casesabb2",fileext=".cas") write.CaseFile(casesabb1,outfile2, session=sess) LearnCPTs(outfile1,list(A,B1,B2),method="EM") stopifnot( NodeExperience(A)>39, NodeExperience(B1)>c(24,23), NodeExperience(B2)>c(14,20) ) DeleteNetwork(abb) #################################### ## One more time with Gradient Decent learning. abb <- CreateNetwork("ABB", session=sess) A <- NewDiscreteNode(abb,"A",c("A1","A2")) B1 <- NewDiscreteNode(abb,"B1",c("B1","B2")) B2 <- NewDiscreteNode(abb,"B2",c("B1","B2")) AddLink(A,B1) AddLink(A,B2) A[] <- c(.5,.5) NodeExperience(A) <- 10 B1["A1"] <- c(.8,.2) B1["A2"] <- c(.2,.8) B2["A1"] <- c(.8,.2) B2["A2"] <- c(.2,.8) NodeExperience(B1) <- c(10,10) NodeExperience(B2) <- c(10,10) casesabb <- data.frame(A=c("A1","A1","A1","A1","A1","A2","A2","A2","A2","A2"), B1=c("B1","B1","B1","B2","B2","B2","B2","B2","B1","B1"), B2=c("B1","B1","B1","B1","B2","B2","B2","B2","B2","B1")) ## LearnCPTs(casesabb,list(A,B1),method="GRADIENT") ## There is currently a bug in Netica, so that this function does not ## work with memory streams. As a work around, use proper file streams ## instead. outfile <- tempfile("casesabb",fileext=".cas") write.CaseFile(casesabb,outfile, session=sess) LearnCPTs(outfile,list(A,B1),method="GRADIENT") ## Complete data, this should look identical to the counting case. ## Note that NodeExperience is no longer used, and the posterior ## distribution no longer reflects the prior. stopifnot( NodeExperience(B2)==c(10,10), sum(abs(NodeProbs(A) - .5)) < .001, sum(abs(B1["A1",drop=TRUE] - c(3,2)/5)) < .001, sum(abs(B1["A2",drop=TRUE] - c(2,3)/5)) < .001, sum(abs(B2["A1",drop=TRUE] - c(8,2)/10)) < .001, sum(abs(B2["A2",drop=TRUE] - c(2,8)/10)) < .001 ) ## Gradient algorithm sets experience to -infinity, so need to reset. NodeExperience(A) <- 10 NodeExperience(B1) <- c(10,10) NodeExperience(B2) <- c(10,10) ## Missing Data ## GRADIENT deals more intelligently with missing data. casesabb1 <- data.frame(A=c("A1","A1","NA","A1","A1","A2","A2","A2","A2","A2"), B1=c("B1","B1","B1","B2","B2","B2","B2","NA","B1","B1"), B2=c("B1","B1","B1","NA","B2","B2","B2","B2","B2","B1")) outfile1 <- tempfile("casesabb1",fileext=".cas") write.CaseFile(casesabb1,outfile1, session=sess) LearnCPTs(outfile1,list(A,B1,B2),method="GRADIENT") ## Gradient algorithm sets experience to -infinity, so need to reset. NodeExperience(A) <- 10 NodeExperience(B1) <- c(10,10) NodeExperience(B2) <- c(10,10) ## GRADIENT can handle complete latent variable case. casesabb2 <- data.frame(B1=c("B1","B1","B1","B2","B2","B2","B2","NA","B1","B1"), B2=c("B1","B1","B1","NA","B2","B2","B2","B2","B2","B1")) outfile2 <- tempfile("casesabb2",fileext=".cas") write.CaseFile(casesabb1,outfile2, session=sess) LearnCPTs(outfile1,list(A,B1,B2),method="GRADIENT") DeleteNetwork(abb) stopSession(sess) } \keyword{ interface } \keyword{ model }