Function for fitting univariate Bayesian spatially-varying coefficient regression models

The function spSVC fits Gaussian univariate Bayesian spatially-varying coefficient regression models.

Usage

spSVC(formula, data = parent.frame(), svc.cols=1, coords, 
      priors, starting, tuning, cov.model, center.scale=FALSE,
      amcmc, n.samples, n.omp.threads = 1,
      verbose=TRUE, n.report=100, ...)

Arguments

formula

a symbolic description of the regression model to be fit. See example below.

data

an optional data frame containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which spSVC is called.

svc.cols

a vector indicating which columns of the regression design matrix \(X\) should be space-varying. svc.cols can be an integer vector with values indicating \(X\) columns or a character vector with values corresponding to \(X\) column names. svc.cols default argument of 1 results in a space-varying intercept model (assuming an intercept is specified in the first column of the design matrix).

coords

an \(n \times m\) matrix of the observation coordinates in \(R^m\) (e.g., \(R^2\) might be easting and northing).

priors

a list with each tag corresponding to a parameter name. Valid tags are sigma.sq.ig, k.iw, tau.sq.ig, phi.unif, nu.unif, beta.norm, and beta.flat. Scalar variance parameters simga.sq and tau.sq are assumed to follow an inverse-Gamma distribution. Cross-covariance matrix parameter K is assumed to follow an inverse-Wishart. The spatial decay phi and smoothness nu parameters are assumed to follow Uniform distributions. The regression coefficient priors can be either flat or multivariate Normal.

There are two specification for the Gaussian Process (GP) on the svc.cols columns: 1) univariate GPs on each column; 2) multivariate GP on the \(r\) columns (i.e., where \(r\) equals length(svc.cols)). If univariate GPs are desired, specify sigma.sq.ig as a list of length two with the first and second elements corresponding to the length \(r\) shape and scale hyperparameter vectors, respectively. If a multivariate GP is desired, specify k.iw as a list of length two with the first and second elements corresponding to the degrees-of-freedom \(df\) and \(r\times r\) scale matrix, respectively. This inverse-Wishart prior is on the \(r\times r\) multivariate GP cross-covariance matrix defined as \(K=AA'\) where \(A\) is the lower-triangle Cholesky square root of \(K\).

If the regression coefficients, i.e., beta vector, are assumed to follow a multivariate Normal distribution then pass the hyperparameters as a list of length two with the first and second elements corresponding to the mean vector and positive definite covariance matrix, respectively. If beta is assumed flat then no arguments are passed. The default is a flat prior. Similarly, phi and nu are specified as lists of length two with the first and second elements holding vectors of length \(r\) lower and upper bounds of the Uniforms' support, respectively.

starting

a list with each tag corresponding to a parameter name. Valid tags are beta, sigma.sq, A, tau.sq, phi, and nu. The value portion of each tag is the parameter's starting value(s). Starting values must be set for the \(r\) univariate or multivariate GP phi and nu. For univariate GPs sigma.sq.ig is specified as a vector of length \(r\) and for a multivariate GP A is specified as a vector of \(r\times(r+1)/2\) that gives the lower-triangle elements in column major ordering of the Cholesky square root of the cross-covaraince matrix \(K=AA'\). tau.sq is a single value. See Finley and Banerjee (2019) for more details.

tuning

a list with each tag corresponding to a parameter name. Valid tags are sigma.sq, A, tau.sq, phi, and nu. The value portion of each tag defines the variance of the Metropolis sampler Normal proposal distribution. For sigma.sq, phi, and nu the tuning value vectors are of length \(r\) and A is of length \(r\times(r+1)/2\). tuning vector elements correspond to starting vector elements. tau.sq is a single value.

cov.model

a quoted keyword that specifies the covariance function used to model the spatial dependence structure among the observations. Supported covariance model key words are: "exponential", "matern", "spherical", and "gaussian". See below for details.

center.scale

if TRUE, non-constant columns of \(X\) are centered on zero and scaled to have variance one. If spPredict is subsequently called this centering and scaling is applied to pred.covars.

amcmc

a list with tags n.batch, batch.length, and accept.rate. Specifying this argument invokes an adaptive MCMC sampler, see Roberts and Rosenthal (2007) for an explanation.

n.samples

the number of MCMC iterations. This argument is ignored if amcmc is specified.

n.omp.threads

a positive integer indicating the number of threads to use for SMP parallel processing. The package must be compiled for OpenMP support. For most Intel-based machines, we recommend setting n.omp.threads up to the number of hyperthreaded cores.

verbose

if TRUE, model specification and progress of the sampler is printed to the screen. Otherwise, nothing is printed to the screen.

n.report

the interval to report Metropolis sampler acceptance and MCMC progress.

...

currently no additional arguments.

Value

An object of class spSVC, which is a list comprising:

coords: the \(n \times m\) matrix specified by coords.
p.theta.samples: a coda object of posterior samples for the defined parameters.
acceptance: the Metropolis sampling acceptance percent. Reported at batch.length or n.report intervals for amcmc specified and non-specified, respectively.

The return object will include additional objects used for subsequent parameter recovery, prediction, and model fit evaluation using spRecover, spPredict, spDiag, respectively.

Details

Model parameters can be fixed at their starting values by setting their tuning values to zero.

The no nugget model is specified by removing tau.sq from the starting list.

References

Finley, A.O., S. Banerjee, and A.E. Gelfand. (2015) spBayes for large univariate and multivariate point-referenced spatio-temporal data models. Journal of Statistical Software, 63:1–28. https://www.jstatsoft.org/article/view/v063i13.

Roberts G.O. and Rosenthal J.S. (2006). Examples of Adaptive MCMC. http://probability.ca/jeff/ftpdir/adaptex.pdf.

Finley, A.O. and S. Banerjee (2019) Bayesian spatially varying coefficient models in the spBayes R package. https://arxiv.org/abs/1903.03028.

Author

Andrew O. Finley finleya@msu.edu,
Sudipto Banerjee sudipto@ucla.edu

Examples

if (FALSE) { # \dontrun{

library(Matrix)

rmvn <- function(n, mu=0, V = matrix(1)){
  p <- length(mu)
  if(any(is.na(match(dim(V),p))))
    stop("Dimension problem!")
  D <- chol(V)
  t(matrix(rnorm(n*p), ncol=p)%*%D + rep(mu,rep(n,p)))
}


##Assume both columns of X are space-varying and the two GPs don't covary
set.seed(1)
n <- 200
coords <- cbind(runif(n,0,1), runif(n,0,1))

X <- as.matrix(cbind(1, rnorm(n)))
colnames(X) <- c("x.1", "x.2")

Z <- t(bdiag(as.list(as.data.frame(t(X)))))

B <- as.matrix(c(1,5))
p <- length(B)

sigma.sq <- c(1,5)
tau.sq <- 1
phi <- 3/0.5

D <- as.matrix(dist(coords))

C <- exp(-phi*D)%x%diag(sigma.sq)

w <- rmvn(1, rep(0,p*n), C)

mu <- as.vector(X%*%B + Z%*%w)

y <- rnorm(n, mu, sqrt(tau.sq))

##fit a model to the simulated dat
starting <- list("phi"=rep(3/0.5, p), "sigma.sq"=rep(1, p), "tau.sq"=1)

tuning <- list("phi"=rep(0.1, p), "sigma.sq"=rep(0.1, p), "tau.sq"=0.1)

cov.model <- "exponential"

priors <- list("phi.Unif"=list(rep(3/2, p), rep(3/0.0001, p)),
               "sigma.sq.IG"=list(rep(2, p), rep(2, p)),
               "tau.sq.IG"=c(2, 1))

##fit model
n.samples <- 2000

m.1 <- spSVC(y~X-1, coords=coords, starting=starting, svc.cols=c(1,2),
             tuning=tuning, priors=priors, cov.model=cov.model,
             n.samples=n.samples, n.omp.threads=4)

plot(m.1$p.theta.samples, density=FALSE)

##recover posterior samples
m.1 <- spRecover(m.1, start=floor(0.75*n.samples), thin=2, n.omp.threads=4)

summary(m.1$p.beta.recover.samples)
summary(m.1$p.theta.recover.samples)

##check fitted values
quant <- function(x){quantile(x, prob=c(0.025, 0.5, 0.975))}

##fitted y
y.hat <- apply(m.1$p.y.samples, 1, quant)

rng <- c(-15, 20)
plot(y, y.hat[2,], pch=19, cex=0.5, xlab="Fitted y", ylab="Observed y",
     xlim=rng, ylim=rng)
arrows(y, y.hat[2,], y, y.hat[1,], angle=90, length=0.05)
arrows(y, y.hat[2,], y, y.hat[3,], angle=90, length=0.05)
lines(rng, rng, col="blue")

##recovered w
w.hat <- apply(m.1$p.w.recover.samples, 1, quant)

w.1.indx <- seq(1, p*n, p)
w.2.indx <- seq(2, p*n, p)

par(mfrow=c(1,2))

rng <- c(-5,5)
plot(w[w.1.indx], w.hat[2,w.1.indx], pch=19, cex=0.5, xlab="Fitted w.1", ylab="Observed w.1",
     xlim=rng, ylim=rng)
arrows(w[w.1.indx], w.hat[2,w.1.indx], w[w.1.indx], w.hat[1,w.1.indx], angle=90, length=0.05)
arrows(w[w.1.indx], w.hat[2,w.1.indx], w[w.1.indx], w.hat[3,w.1.indx], angle=90, length=0.05)
lines(rng, rng, col="blue")

rng <- c(-10,10)
plot(w[w.2.indx], w.hat[2,w.2.indx], pch=19, cex=0.5, xlab="Fitted w.2", ylab="Observed w.2",
     xlim=rng, ylim=rng)
arrows(w[w.2.indx], w.hat[2,w.2.indx], w[w.2.indx], w.hat[1,w.2.indx], angle=90, length=0.05)
arrows(w[w.2.indx], w.hat[2,w.2.indx], w[w.2.indx], w.hat[3,w.2.indx], angle=90, length=0.05)
lines(rng, rng, col="blue")
} # }