
Function for fitting multivariate Bayesian spatial regression models to misaligned data
spMisalignLM.RdThe function spMisalignLM fits Gaussian multivariate Bayesian
spatial regression models to misaligned data.
Usage
spMisalignLM(formula, data = parent.frame(), coords,
starting, tuning, priors, cov.model,
amcmc, n.samples, verbose=TRUE, n.report=100, ...)Arguments
- formula
a list of \(q\) symbolic regression models to be fit. See example below.
- data
an optional data frame containing the variables in the model. If not found in
data, the variables are taken fromenvironment(formula), typically the environment from whichspMisalignLMis called.- coords
a list of \(q\) \(n_i \times 2\) matrices of the observation coordinates in \(R^2\) (e.g., easting and northing) where \(i=(1,2,\ldots,q)\). .
- starting
a list with tags corresponding to
A,phi,nu, andPsi. The value portion of each tag is a vector that holds the parameter's starting values.Ais of length \(\frac{q(q+1)}{2}\) and holds the lower-triangle elements in column major ordering of the Cholesky square root of the spatial cross-covariance matrix.phiandnuare of length \(q\). The vector of residual variancesPsiis also of length \(q\).- tuning
a list with tags
A,phi,nu, andPsi. The value portion of each tag defines the variance of the Metropolis sampler Normal proposal distribution.Ais of length \(\frac{q(q+1)}{2}\) andPsi,phi, andnuare of length \(q\).- priors
a list with tags
beta.flat,K.iw,Psi.ig,phi.unifandnu.unif. The hyperparameters of the inverse-Wishart for the cross-covariance matrix \(K=AA'\) are passed as a list of length two, with the first and second elements corresponding to the \(df\) and \(q\times q\) scale matrix, respectively. The inverse-Gamma hyperparameters for the non-spatial residual variances are specified as a listPsi.igof length two with the first and second list elements consisting of vectors of the \(q\) shape and scale hyperparameters, respectively. The hyperparameters of the Uniformphi.unif, andnu.unifare also passed as a list of vectors with the first and second list elements corresponding to the lower and upper support, respectively.- cov.model
a quoted keyword that specifies the covariance function used to model the spatial dependence structure among the observations. Supported covariance model key words are:
"exponential","matern","spherical", and"gaussian". See below for details.- amcmc
a list with tags
n.batch,batch.length, andaccept.rate. Specifying this argument invokes an adaptive MCMC sampler see Roberts and Rosenthal (2007) for an explanation.- n.samples
the number of MCMC iterations. This argument is ignored if
amcmcis specified.- verbose
if
TRUE, model specification and progress of the sampler is printed to the screen. Otherwise, nothing is printed to the screen.- n.report
the interval to report Metropolis acceptance and MCMC progress.
- ...
currently no additional arguments.
Value
An object of class spMisalignLM, which is a list with the following
tags:
- p.theta.samples
a
codaobject of posterior samples for the defined parameters.- acceptance
the Metropolis sampling acceptance percent. Reported at
batch.lengthorn.reportintervals foramcmcspecified and non-specified, respectively
The return object might include additional data used for subsequent prediction and/or model fit evaluation.
Details
Model parameters can be fixed at their starting values by setting their
tuning values to zero.
References
Banerjee, S., A.E. Gelfand, A.O. Finley, and H. Sang. (2008) Gaussian Predictive Process Models for Large Spatial Datasets. Journal of the Royal Statistical Society Series B, 70:825–848.
Banerjee, S., Carlin, B.P., and Gelfand, A.E. (2004). Hierarchical modeling and analysis for spatial data. Chapman and Hall/CRC Press, Boca Raton, Fla.
Finley, A.O., S. Banerjee, and B.D. Cook. (2014) Bayesian hierarchical models for spatially misaligned data. Methods in Ecology and Evolution, 5:514–523.
Finley, A.O., H. Sang, S. Banerjee, and A.E. Gelfand. (2009) Improving the performance of predictive process modeling for large datasets. Computational Statistics and Data Analysis, 53:2873–2884.
Finley, A.O., S. Banerjee, A.R. Ek, and R.E. McRoberts. (2008) Bayesian multivariate process modeling for prediction of forest attributes. Journal of Agricultural, Biological, and Environmental Statistics, 13:60–83.
Author
Andrew O. Finley finleya@msu.edu,
Sudipto Banerjee baner009@umn.edu
Examples
if (FALSE) { # \dontrun{
rmvn <- function(n, mu=0, V = matrix(1)){
p <- length(mu)
if(any(is.na(match(dim(V),p)))){stop("Dimension problem!")}
D <- chol(V)
t(matrix(rnorm(n*p), ncol=p)%*%D + rep(mu,rep(n,p)))
}
set.seed(1)
##generate some data
n <- 100 ##number of locations
q <- 3 ##number of outcomes at each location
nltr <- q*(q+1)/2 ##number of triangular elements in the cross-covariance matrix
coords <- cbind(runif(n,0,1), runif(n,0,1))
##parameters for generating a multivariate spatial GP covariance matrix
theta <- rep(3/0.5,q) ##spatial decay
A <- matrix(0,q,q)
A[lower.tri(A,TRUE)] <- c(1,1,-1,1,0.5,0.25)
K <- A%*%t(A)
K ##spatial cross-covariance
cov2cor(K) ##spatial cross-correlation
C <- mkSpCov(coords, K, diag(0,q), theta, cov.model="exponential")
w <- rmvn(1, rep(0,nrow(C)), C) ##spatial random effects
w.a <- w[seq(1,length(w),q)]
w.b <- w[seq(2,length(w),q)]
w.c <- w[seq(3,length(w),q)]
##covariate portion of the mean
x.a <- cbind(1, rnorm(n))
x.b <- cbind(1, rnorm(n))
x.c <- cbind(1, rnorm(n))
x <- mkMvX(list(x.a, x.b, x.c))
B.1 <- c(1,-1)
B.2 <- c(-1,1)
B.3 <- c(-1,-1)
B <- c(B.1, B.2, B.3)
Psi <- c(0.1, 0.1, 0.1) ##non-spatial residual variance, i.e., nugget
y <- rnorm(n*q, x%*%B+w, rep(sqrt(Psi),n))
y.a <- y[seq(1,length(y),q)]
y.b <- y[seq(2,length(y),q)]
y.c <- y[seq(3,length(y),q)]
##subsample to make spatially misaligned data
sub.1 <- 1:50
y.1 <- y.a[sub.1]
w.1 <- w.a[sub.1]
coords.1 <- coords[sub.1,]
x.1 <- x.a[sub.1,]
sub.2 <- 25:75
y.2 <- y.b[sub.2]
w.2 <- w.b[sub.2]
coords.2 <- coords[sub.2,]
x.2 <- x.b[sub.2,]
sub.3 <- 50:100
y.3 <- y.c[sub.3]
w.3 <- w.c[sub.3]
coords.3 <- coords[sub.3,]
x.3 <- x.c[sub.3,]
##call spMisalignLM
q <- 3
A.starting <- diag(1,q)[lower.tri(diag(1,q), TRUE)]
n.samples <- 5000
starting <- list("phi"=rep(3/0.5,q), "A"=A.starting, "Psi"=rep(1,q))
tuning <- list("phi"=rep(0.5,q), "A"=rep(0.01,length(A.starting)), "Psi"=rep(0.1,q))
priors <- list("phi.Unif"=list(rep(3/0.75,q), rep(3/0.25,q)),
"K.IW"=list(q+1, diag(0.1,q)), "Psi.ig"=list(rep(2,q), rep(0.1,q)))
m.1 <- spMisalignLM(list(y.1~x.1-1, y.2~x.2-1, y.3~x.3-1),
coords=list(coords.1, coords.2, coords.3),
starting=starting, tuning=tuning, priors=priors,
n.samples=n.samples, cov.model="exponential", n.report=100)
burn.in <- floor(0.75*n.samples)
plot(m.1$p.theta.samples, density=FALSE)
##recover regression coefficients and random effects
m.1 <- spRecover(m.1, start=burn.in)
round(summary(m.1$p.theta.recover.samples)$quantiles[,c(3,1,5)],2)
round(summary(m.1$p.beta.recover.samples)$quantiles[,c(3,1,5)],2)
##predict for all locations, i.e., observed and not observed
out <- spPredict(m.1, start=burn.in, thin=10, pred.covars=list(x.a, x.b,
x.c),
pred.coords=list(coords, coords, coords))
##summary and check
quants <- function(x){quantile(x, prob=c(0.5,0.025,0.975))}
y.hat <- apply(out$p.y.predictive.samples, 1, quants)
##unstack and plot
y.a.hat <- y.hat[,1:n]
y.b.hat <- y.hat[,(n+1):(2*n)]
y.c.hat <- y.hat[,(2*n+1):(3*n)]
par(mfrow=c(1,3))
plot(y.a, y.a.hat[1,], xlab="Observed y.a", ylab="Fitted & predicted y.a",
xlim=range(y), ylim=range(y.hat), main="")
arrows(y.a[-sub.1], y.a.hat[1,-sub.1], y.a[-sub.1], y.a.hat[2,-sub.1], length=0.02, angle=90)
arrows(y.a[-sub.1], y.a.hat[1,-sub.1], y.a[-sub.1], y.a.hat[3,-sub.1], length=0.02, angle=90)
lines(range(y.a), range(y.a))
plot(y.b, y.b.hat[1,], xlab="Observed y.b", ylab="Fitted & predicted y.b",
xlim=range(y), ylim=range(y.hat), main="")
arrows(y.b[-sub.2], y.b.hat[1,-sub.2], y.b[-sub.2], y.b.hat[2,-sub.2], length=0.02, angle=90)
arrows(y.b[-sub.2], y.b.hat[1,-sub.2], y.b[-sub.2], y.b.hat[3,-sub.2], length=0.02, angle=90)
lines(range(y.b), range(y.b))
plot(y.c, y.c.hat[1,], xlab="Observed y.c", ylab="Fitted & predicted y.c",
xlim=range(y), ylim=range(y.hat), main="")
arrows(y.c[-sub.3], y.c.hat[1,-sub.3], y.c[-sub.3], y.c.hat[2,-sub.3], length=0.02, angle=90)
arrows(y.c[-sub.3], y.c.hat[1,-sub.3], y.c[-sub.3], y.c.hat[3,-sub.3], length=0.02, angle=90)
lines(range(y.c), range(y.c))
} # }