Function for new locations given a model object

The function spPredict collects posterior predictive samples for a set of new locations given a spLM, spMvLM, spGLM, spMvGLM, spMisalignLM, spMisalignGLM, bayesGeostatExact, bayesLMConjugate bayesLMRef or spSVC object.

Usage

spPredict(sp.obj, pred.coords, pred.covars, joint=FALSE, start=1, end, thin=1,
          verbose=TRUE, n.report=100, n.omp.threads=1, ...)

Arguments

sp.obj: an object returned by spLM, spMvLM, spGLM, spMvGLM, spMisalignLM, spMisalignGLM, bayesGeostatExact, bayesLMConjugate or bayesLMRef. For spSVC, sp.obj is an object from spRecover.
pred.coords: for spLM, spMvLM, spGLM, spMvGLM, and bayesGeostatExact pred.coords is a \(n^{\ast} \times 2\) matrix of \(n^{\ast}\) prediction location coordinates in \(R^2\) (e.g., easting and northing). For spMisalignLM and spMisalignGLM pred.coords is a list of \(q\) \(n^{\ast}_i \times 2\) matrices of prediction location coordinates where \(i=(1,2,\ldots,q)\). For spSVC pred.coords is an \(n^{\ast} \times m\) matrix of \(n^{\ast}\) prediction location coordinates in \(R^m\).
pred.covars: for spLM, spMvLM, spGLM, spMvGLM, bayesGeostatExact, bayesLMConjugate, bayesLMRef, and spSVC pred.covars is a \(n^{\ast} \times p\) design matrix associated with the new locations (including the intercept if one is specified in sp.obj's formula argument). If this is a multivariate prediction defined by \(q\) models, i.e., for spMvLM or spMvGLM, the multivariate design matrix can be created by passing a list of the \(q\) univariate design matrices to the mkMvX function. For spMisalignLM and spMisalignGLM pred.covars is a list of \(q\) \(n^{\ast}_i \times p_i\) design matrices where \(i=(1,2,\ldots,q)\)
joint: specifies whether posterior samples should be drawn from the joint or point-wise predictive distribution. This argument is only implemented for spSVC. Prediction for all other models uses the point-wise posterior predictive distribution.
start: specifies the first sample included in the composition sampling.
end: specifies the last sample included in the composition. The default is to use all posterior samples in sp.obj.
thin: a sample thinning factor. The default of 1 considers all samples between start and end. For example, if thin = 10 then 1 in 10 samples are considered between start and end.
verbose: if TRUE, model specification and progress of the sampler is printed to the screen. Otherwise, nothing is printed to the screen.
n.report: the interval to report sampling progress.
n.omp.threads: a positive integer indicating the number of threads to use for SMP parallel processing. The package must be compiled for OpenMP support. For most Intel-based machines, we recommend setting n.omp.threads up to the number of hyperthreaded cores. This argument is only implemented for spSVC.
...: currently no additional arguments.

Value

p.y.predictive.samples

a matrix that holds the response variable(s) posterior predictive samples. For multivariate models spMvLM or spMvGLM the rows of this matrix correspond to the predicted locations and the columns are the posterior predictive samples. If prediction is for \(q\) response variables the p.y.predictive.samples matrix has \(qn^{\ast}\) rows, where \(n^{\ast}\) is the number of prediction locations. The predictions for locations are held in rows \(1:q, (q+1):2q, \ldots, ((n^{\ast}-1)q+1):qn^{\ast}\) (i.e., the samples for the first location's \(q\) response variables are in rows \(1:q\), second location in rows \((q+1):2q\), etc.).

For spMisalignLM and spMisalignGLM the posterior predictive samples are organized differently in p.y.predictive.samples with the first response variable \(n^{\ast}_1\) locations held in rows \(1\ldots,n^{\ast}_1\) rows, then the next response variable samples held in the \((n^{\ast}_1+1),\ldots,(n^{\ast}_1+n^{\ast}_2)\), etc.

For spSVC given the \(r\) space-varying coefficients, p.y.predictive.samples has \(rn^{\ast}\) rows and the columns are the posterior predictive samples. The predictions for coefficient are held in rows \(1:r, (r+1):2r, \ldots, ((n^{\ast}-1)r+1):rn^{\ast}\) (i.e., the samples for the first location's \(r\) regression coefficients are in rows 1:r, second location in rows \((r+1):2r\), etc.).

For spGLM and spMisalignGLM the p.y.predictive.samples matrix holds posterior predictive samples \(\frac{1}{1+\exp(-x(s)'\beta-w(s))}\) and \(\exp(x(s)'\beta+w(s))\) for family binomial and poisson, respectively. Here \(s\) indexes the prediction location, \(\beta\) is the vector of regression coefficients, and \(w\) is the associated spatial random spatial effect. These values can be fed directly into rbinom or rpois to generate the realization from the respective distribution.

p.w.predictive.samples

a matrix organized the same as p.y.predictive.samples, that holds the spatial random effects posterior predictive samples.

p.w.predictive.samples.list

only returned for spSVC. This provides p.w.predictive.samples in a different (more convenient form). Elements in this list hold samples for each of the \(r\) coefficients. List element names indicate either the coefficient index or name specified in spSVC's svc.cols argument. The sample matrices are \(n^{\ast}\) rows and predictive samples along the columns.

p.tilde.beta.predictive.samples.list

only returned for spSVC. Like p.w.predictive.samples.list but with the addition of the corresponding \(\beta\) posterior samples (i.e., \(\beta+w(s)\)).

center.scale.pred.covars

only returned for the spSVC when its center.scale argument is TRUE. This is the prediction design matrix centered and scaled with respect to column means and variances of the design matrix used to estimate model parameters, i.e., the one defined in spSVC's formula argument.

References

Banerjee, S., Carlin, B.P., and Gelfand, A.E. (2004). Hierarchical modeling and analysis for spatial data. Chapman and Hall/CRC Press, Boca Raton, FL.

Finley, A.O., S. Banerjee, and A.E. Gelfand. (2015) spBayes for large univariate and multivariate point-referenced spatio-temporal data models. Journal of Statistical Software, 63:1–28. https://www.jstatsoft.org/article/view/v063i13.

Finley, A.O. and S. Banerjee (2019) Bayesian spatially varying coefficient models in the spBayes R package. https://arxiv.org/abs/1903.03028.

Author

Andrew O. Finley finleya@msu.edu,
Sudipto Banerjee sudipto@ucla.edu

Examples

if (FALSE) { # \dontrun{
rmvn <- function(n, mu=0, V = matrix(1)){
  p <- length(mu)
  if(any(is.na(match(dim(V),p))))
    stop("Dimension problem!")
  D <- chol(V)
  t(matrix(rnorm(n*p), ncol=p)%*%D + rep(mu,rep(n,p)))
}

set.seed(1)

n <- 200
coords <- cbind(runif(n,0,1), runif(n,0,1))
X <- as.matrix(cbind(1, rnorm(n)))

B <- as.matrix(c(1,5))
p <- length(B)
sigma.sq <- 10
tau.sq <- 0.01
phi <- 3/0.5

D <- as.matrix(dist(coords))
R <- exp(-phi*D)
w <- rmvn(1, rep(0,n), sigma.sq*R)
y <- rnorm(n, X%*%B + w, sqrt(tau.sq))

##partition the data for out of sample prediction
mod <- 1:100
y.mod <- y[mod]
X.mod <- X[mod,]
coords.mod <- coords[mod,]

n.samples <- 1000

starting <- list("phi"=3/0.5, "sigma.sq"=50, "tau.sq"=1)
tuning <- list("phi"=0.1, "sigma.sq"=0.1, "tau.sq"=0.1)
priors <- list("beta.Flat", "phi.Unif"=c(3/1, 3/0.1),
               "sigma.sq.IG"=c(2, 5), "tau.sq.IG"=c(2, 0.01))
cov.model <- "exponential"

m.1 <- spLM(y.mod~X.mod-1, coords=coords.mod, starting=starting, tuning=tuning,
priors=priors, cov.model=cov.model, n.samples=n.samples)

m.1.pred <- spPredict(m.1, pred.covars=X, pred.coords=coords,
start=0.5*n.samples)

y.hat <- apply(m.1.pred$p.y.predictive.samples, 1, mean)

quant <- function(x){quantile(x, prob=c(0.025, 0.5, 0.975))}

y.hat <- apply(m.1.pred$p.y.predictive.samples, 1, quant)

plot(y, y.hat[2,], pch=19, cex=0.5, xlab="observed y", ylab="predicted y")
arrows(y[-mod], y.hat[2,-mod], y[-mod], y.hat[1,-mod], angle=90, length=0.05)
arrows(y[-mod], y.hat[2,-mod], y[-mod], y.hat[3,-mod], angle=90, length=0.05)
} # }