
Function for Fitting Logistic Models using Polya-Gamma Latent Variables
PGLogit.RdThe function PGLogit fits logistic models to binomial data using Polya-Gamma latent variables.
Usage
PGLogit(formula, weights = 1, data = parent.frame(), n.samples,
n.omp.threads = 1, fit.rep = FALSE, sub.sample, verbose = TRUE, ...)Arguments
- formula
a symbolic description of the regression model to be fit. See example below.
- weights
specifies the number of trials for each observation. The default is 1 trial for each observation. Valid arguments are a scalar value that specifies the number of trials used if all observations have the same number of trials, and a vector of length \(n\) that specifies the number of trials for each observation when there are differences in the number of trials.
- data
an optional data frame containing the variables in the model. If not found in data, the variables are taken from
environment(formula), typically the environment from whichPGLogitis called.- n.samples
the number of posterior samples to collect.
- n.omp.threads
a positive integer indicating the number of threads to use for SMP parallel processing. The package must be compiled for OpenMP support. For most Intel-based machines, we recommend setting
n.omp.threadsup to the number of hyperthreaded cores. Note,n.omp.threads> 1 might not work on all systems.- fit.rep
if
TRUE, regression fitted and replicate data will be returned. The argumentsub.samplecontrols which MCMC samples are used to generate the fitted and replicated data.- sub.sample
an optional list that specifies the samples used for
fit.rep. Valid tags arestart,end, andthin. Given the value associated with the tags, the sample subset is selected usingseq(as.integer(start), as.integer(end), by=as.integer(thin)). The default values arestart=floor(0.5*n.samples),end=n.samplesandthin=1.- verbose
if
TRUE, model specification and progress of the sampler is printed to the screen. Otherwise, nothing is printed to the screen.- ...
currently no additional arguments.
Value
An object of class PGLogit which is a list comprising:
- p.beta.samples
a
codaobject of posterior samples for the regression coefficients.- y.hat.samples
if
fit.rep=TRUE, regression fitted values from posterior samples specified usingsub.sample.- y.hat.quants
if
fit.rep=TRUE, 0.5, 0.025, and 0.975 quantiles of they.hat.samples.- y.rep.samples
if
fit.rep=TRUE, replicated outcome from posterior samples specified usingsub.sample.- y.rep.quants
if
fit.rep=TRUE, 0.5, 0.025, and 0.975 quantiles of they.rep.samples.- s.indx
if
fit.rep=TRUE, the subset index specified withsub.sample.- run.time
MCMC sampler execution time reported using
proc.time().
The return object will include additional objects used for subsequent prediction and/or model fit evaluation.
References
Polson, N.G., J.G. Scott, and J. Windle. (2013) Bayesian Inference for Logistic Models Using Polya-Gamma Latent Variables. Journal of the American Statistical Association, 108:1339-1349.
Note
Some of the underlying code used for generating random number from the Polya-Gamma distribution is taken from the pgdraw package written by Daniel F. Schmidt and Enes Makalic. Their code implements Algorithm 6 in the PhD thesis of Jesse Bennett Windle (2013), University of Texas at Austin.
Author
Andrew O. Finley finleya@msu.edu,
Abhirup Datta abhidatta@jhu.edu,
Sudipto Banerjee sudipto@ucla.edu
Examples
##Generate binary data
set.seed(1)
n <- 100
x <- cbind(1, rnorm(n), runif(n,0,1))
beta <- c(0.1,-5, 5)
p <- 1/(1+exp(-(x%*%beta)))
##Assume 5 trials per outcome
weights <- rep(5, n)
y <- rbinom(n, size=weights, prob=p)
m <- PGLogit(y~x-1, weights = rep(5, n), n.samples = 1000)
#> ----------------------------------------
#> Model description
#> ----------------------------------------
#> Logistic regression with Polya-Gamma latent
#> variable fit with 100 observations.
#>
#> Number of MCMC samples 1000.
#>
#>
#> Source compiled with OpenMP support and model fit using 1 thread(s).
#>
#> Sampling ...
summary(m)
#>
#> Call:
#> PGLogit(formula = y ~ x - 1, weights = rep(5, n), n.samples = 1000)
#>
#> Chain sub.sample:
#> start = 500
#> end = 1000
#> thin = 1
#> samples size = 501
#> 2.5% 25% 50% 75% 97.5%
#> x1 -0.3382 0.0050 0.1961 0.3968 0.7574
#> x2 -5.4268 -4.8924 -4.6033 -4.2737 -3.8055
#> x3 3.1918 4.0076 4.4282 4.8612 5.5409