Function for Fitting Logistic Models using Polya-Gamma Latent Variables

The function PGLogit fits logistic models to binomial data using Polya-Gamma latent variables.

Usage

PGLogit(formula, weights = 1, data = parent.frame(), n.samples,
        n.omp.threads = 1, fit.rep = FALSE, sub.sample, verbose = TRUE, ...)

Arguments

formula: a symbolic description of the regression model to be fit. See example below.
weights: specifies the number of trials for each observation. The default is 1 trial for each observation. Valid arguments are a scalar value that specifies the number of trials used if all observations have the same number of trials, and a vector of length \(n\) that specifies the number of trials for each observation when there are differences in the number of trials.
data: an optional data frame containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which PGLogit is called.
n.samples: the number of posterior samples to collect.
n.omp.threads: a positive integer indicating the number of threads to use for SMP parallel processing. The package must be compiled for OpenMP support. For most Intel-based machines, we recommend setting n.omp.threads up to the number of hyperthreaded cores. Note, n.omp.threads > 1 might not work on all systems.
fit.rep: if TRUE, regression fitted and replicate data will be returned. The argument sub.sample controls which MCMC samples are used to generate the fitted and replicated data.
sub.sample: an optional list that specifies the samples used for fit.rep. Valid tags are start, end, and thin. Given the value associated with the tags, the sample subset is selected using seq(as.integer(start), as.integer(end), by=as.integer(thin)). The default values are start=floor(0.5*n.samples), end=n.samples and thin=1.
verbose: if TRUE, model specification and progress of the sampler is printed to the screen. Otherwise, nothing is printed to the screen.
...: currently no additional arguments.

Value

An object of class PGLogit which is a list comprising:

p.beta.samples: a coda object of posterior samples for the regression coefficients.
y.hat.samples: if fit.rep=TRUE, regression fitted values from posterior samples specified using sub.sample.
y.hat.quants: if fit.rep=TRUE, 0.5, 0.025, and 0.975 quantiles of the y.hat.samples.
y.rep.samples: if fit.rep=TRUE, replicated outcome from posterior samples specified using sub.sample.
y.rep.quants: if fit.rep=TRUE, 0.5, 0.025, and 0.975 quantiles of the y.rep.samples.
s.indx: if fit.rep=TRUE, the subset index specified with sub.sample.
run.time: MCMC sampler execution time reported using proc.time().

The return object will include additional objects used for subsequent prediction and/or model fit evaluation.

References

Polson, N.G., J.G. Scott, and J. Windle. (2013) Bayesian Inference for Logistic Models Using Polya-Gamma Latent Variables. Journal of the American Statistical Association, 108:1339-1349.

Note

Some of the underlying code used for generating random number from the Polya-Gamma distribution is taken from the pgdraw package written by Daniel F. Schmidt and Enes Makalic. Their code implements Algorithm 6 in the PhD thesis of Jesse Bennett Windle (2013), University of Texas at Austin.

Author

Andrew O. Finley finleya@msu.edu,
Abhirup Datta abhidatta@jhu.edu,
Sudipto Banerjee sudipto@ucla.edu

Examples


##Generate binary data
set.seed(1)
n <- 100

x <- cbind(1, rnorm(n), runif(n,0,1))
beta <- c(0.1,-5, 5)
p <- 1/(1+exp(-(x%*%beta)))

##Assume 5 trials per outcome
weights <- rep(5, n)

y <- rbinom(n, size=weights, prob=p)

m <- PGLogit(y~x-1, weights = rep(5, n), n.samples = 1000)
#> ----------------------------------------
#> 	Model description
#> ----------------------------------------
#> Logistic regression with Polya-Gamma latent
#> variable fit with 100 observations.
#> 
#> Number of MCMC samples 1000.
#> 
#> 
#> Source compiled with OpenMP support and model fit using 1 thread(s).
#> 
#> Sampling ... 

summary(m)
#> 
#> Call:
#> PGLogit(formula = y ~ x - 1, weights = rep(5, n), n.samples = 1000)
#> 
#> Chain sub.sample:
#> start = 500
#> end = 1000
#> thin = 1
#> samples size = 501
#>    2.5%    25%     50%     75%     97.5%  
#> x1 -0.3382 0.0050  0.1961  0.3968  0.7574 
#> x2 -5.4268 -4.8924 -4.6033 -4.2737 -3.8055
#> x3 3.1918  4.0076  4.4282  4.8612  5.5409