\name{matSweep}
\alias{matSweep}
\alias{revSweep}
\title{Beaton (Dempster) Sweep operator for Matrixes}
\description{
  
  The sweep operator takes a covariance matrix and replaces the
  elements corresponding to the swept out rows and columns with the
  inverse correlation matrix.  The remaining rows and columns give the
  regression coefficients for predicting the unswept variables from the
  swept variables and the residual covariance matrix.  The reverse sweep
  operator is its inverse.

  This version of the sweep operator follows Dempster(1969) which is
  slightly different from the one given in Goodnight (1979).

}
\usage{
matSweep(A, ploc = 1L:nrow(A), tol = .Machine$double.eps^0.5)
revSweep(A, ploc = 1L:nrow(A), tol = .Machine$double.eps^0.5)
}
\arguments{
  \item{A}{
    A symmetric matrix representing a correlation matrix or an augmented
    correlation matrix.
  }
  \item{ploc}{
    A vector of integers giving the rows and columns to be swept.  
  }
  \item{tol}{
    A machine tolerance value to check for 0 pivots (singular
    matrixes). 
  }
}
\details{

  The sweep operator was introduced by Beaton (1964) as a way of
  inverting matrixes in place (important with the limited memory
  computers of the time).  Dempster (1969) uses Beaton's sweep operator
  and gives it an interpretation in terms of a regression in which the
  unswept variables are predicted from the swept variables.  The
  original purpose for inverting matrixes has been superseded by more
  numerically stable algorithms (see \code{\link[base]{solve}}), but the
  interpretation in terms of various partial regression models is
  interesting.  In particular, Little and Rubin (2002) use it in
  building imputation models for missing data.

  Let \eqn{\bold{Q}} by an \eqn{s} by \eqn{s} matrix, and let
  \eqn{\bold{Q}_11} be the first \eqn{k} rows and columns,
  \eqn{\bold{Q}_22} be the last \eqn{s-k} rows and columns, and
  \eqn{\bold{Q}_{12}} and \eqn{\bold{Q}_{21}} the remaining two parts of
  the original matrix.  Let \eqn{SWP[1:k]\bold{Q}}
  be the result of sweeping out the first \eqn{k} rows and columns.
  (The sweeping does not need to be done in order, but it makes the
  description cleaner if it is).  Then

  \deqn{SWP[1:k]\bold{Q} = \left [ \begin{array}{cc}
    -\bold{Q}_{11}^{-1} & \bold{H}_{12} \\
    \bold{H}_{21} & \bold{Q}_{22.1}
    \end{array} \right ] }

  where \eqn{\bold{H}_{12} = \bold{Q}_{11}^{-1}\bold{Q}_{12}} and
  \eqn{\bold{Q}_{22.1} = \bold{Q}_{22} -
  \bold{Q}_{21}\bold{Q}_{11}^{-1}\bold{Q}_{12}}.  Note that if all rows
  and columns of a matrix are swept out, then this produces the negative
  of the inverse of the matrix.  (Several built in R routines for
  inverting a matrix, e.g., \code{\link[base]{solve}}, are available.
  The built-in R routines are probably more numberically stable and
  certainly have been better tested.)

  The sweep operator is associative, so that \eqn{SWP[1]SWP[2]\bold{Q} =
  SWP[1:2]\bold{Q}} and communtative so that \eqn{SWP[2:1]\bold{Q} =
  SWP[1:2]\bold{Q}}.  The reverse sweep operator, \code{revSweep()} or
  \eqn{RSW[k]\bold{Q}} undoes the sweep operation.

  The properties of the sweep operator are more interesting when the
  matrix is augmented.  Let \eqn{\bold{X}} be an \eqn{n} by \eqn{p}
  matrix of observed data and let \eqn{\bold{X}_{(+)}} be \eqn{\bold{X}} 
  augmented by setting the \eqn{p+1}st column to 1.  Let
  \eqn{\bold{Q}_{(+)} = \bold{X}_{(+)}^T\bold{X}_{(+)}}.   Note that the
  last row and column of \eqn{\bold{Q}_{(+)}} is the column sums of
  \eqn{\bold{X}_{(+)}} with \eqn{q_{(+)p+1,p+1}=n}.  Sweeping on row and
  column \eqn{p+1} produces:

  \deqn{SWP[p+1]\bold{Q}_{(+)} = \left [ \begin{array}{cc}
    -\bold{T} & \bar \bold{X} \\
    \bar \bold{X} & -1/n
    \end{array} \right ] }

  where \eqn{\bold{T}} is \eqn{n-1} times the covariance matrix for
  \eqn{\bold{X}}. In general, sweeping out all rows columns except for
  \eqn{k} produces the regression of \eqn{X_k} on the remaining
  variables, with the values in the \eqn{k}th row and column the
  regression coefficients and the
  \eqn{SWP[c(1:(k-1),(k+1):(p+1))]\bold{Q}_{(+)}[k,k]} equal to the 
  residual variance.

  Let \eqn{\bold{Q} = \bold{X}^T\bold{X}} and \eqn{\bold{X}_1} is the
  first \eqn{k} columns and \eqn{\bold{X}_2} the remaining columns, and
  let \eqn{SWP[1:k]\bold{Q}} be the partitioned matrix shown above.
  Then the least squares linear predictor of \eqn{\bold{X}_2} from
  \eqn{\bold{X}_1} is \eqn{\bold{H}_{12}\bold{X}_1}, and
  \eqn{\bold{Q}_{22.1}} is the residual covariance matrix.

  
}
\value{
  A square matrix of the same size as \code{A}.  If \code{A} is positive
  definete, then the output will be as well.
  
}
\references{

  Beaton, A. E. (1964).  The Use of Special Matrix Operators in
  Statistical Calculus.  Educational Testing Service Research Report
  RB-64-51. 

  Dempsters, A. P. (1969).  \emph{Elements of Continuous Multivariate
  Analysis.}  Addison-Wesley.
  
  Goodnight, J. H. (1979).  A Tutorial on the SWEEP Operator.
  \emph{The American Statistician.}  \bold{33} (3).  140--158.

  Little, R. J. A. and Rubin, D. B. (2002).  \emph{Statistical Analysis
    with Missing Data, Second Edition.}  Wiley.
  
}
\author{Russell ALmond}
\note{
  The version of the sweep operator described in Goodnight (1979) and
  the one in Dempster (1969) (also Little and Rubin, 2002) are slightly
  different.  This implementation follows the one in Dempster.  In fact
  the example calculations found in Dempster, p. 64-65 and 151-152 are
  available in the test scripts.  (There are some minor discrepancies
  between the values calculated here and given in Dempster, probably due
  to typographical or rounding errors.)
}
\seealso{
  \code{\link{matASM}}
}
\examples{

## Following example from Dempster (1969, p 151-2).
data(eggs)
eggsPlus <- cbind(eggs,1)
eggsQplus <- t(eggsPlus)\%*\%eggsPlus

## Sweep out constant row and column.
eggsTplus <- matSweep(eggsQplus,4)
stopifnot(all.equal(eggsTplus[1:3,1:3],(nrow(eggs)-1)*cov(eggs)),
          all.equal(eggsTplus[4,1:3],apply(eggs,2,mean)),
          all.equal(eggsTplus[4,4],-1/nrow(eggs)))


## Predict log(Volume) from other measures.
eggsQ124 <- matSweep(eggsQplus,c(1:2,4))
lm.eggs <- lm(logVol~logLength+logWidth,data=as.data.frame(eggs))
stopifnot(all.equal(eggsQ124[3,c(4,1:2)], coef(lm.eggs),check.attributes=FALSE),
          all.equal(eggsQ124[3,3], deviance(lm.eggs)))


## Inversion using sweep
eggsQinv <- matSweep(eggsQplus,1:4)
stopifnot(all.equal(eggsQinv,-solve(eggsQplus)))


## Commutative and associative properties.
eggsQ421 <- matSweep(matSweep(eggsQplus,4),2:1)
stopifnot(all.equal(eggsQ421,eggsQ124))

## Reverse sweep operator
eggsQinv3 <- revSweep(eggsQinv,3)
stopifnot(all.equal(eggsQinv3,eggsQ124))

}
\keyword{ algebra }% use one of  RShowDoc("KEYWORDS")
\keyword{ array }% use one of  RShowDoc("KEYWORDS")