Introduction

In this exercise, we will model the distributions of the ten most common tree species in Vermont, USA. Our goal is to predict distributions (i.e., probability of occurrence) of the ten species at a fine resolution across the state. We will use data from the US Forest Service Forest Inventory and Analysis (FIA) program, which collects inventory data on forest resources across the US for a variety of applications, such as assessments of current carbon stock and flux, projections of forest change, natural resource planning, and climate change policies. FIA maintains over 300,000 plots across the US for varying objectives, which were selected through a quasi-systematic sampling design (Bechtold and Patterson 2005). Here our data set is relatively modest in size, with \(h = 10\) species (i.e., responses) and \(n = 688\) forest plots. Note a small amount of noise is added to the true spatial location of the forest plots to ensure confidentiality of the locations. As our interest is in modeling tree distributions, we will work with presence/absence data of each species at each site. Because we have a fairly large number of species, we will use a factor modeling dimension reduction approach to jointly model the distributions of each species.

Given our binary response variable, we will fit spatial factor models using a GLMM framwork with a logit link function, employing data augmentation to yield an efficient MCMC sampler (Polson, Scott, and Windle 2013). We will fit such models with the sfJSDM function in the spOccupancy R package (Doser et al. 2022). The sf stands for spatial factor and the JSDM stands for “Joint Species Distribution Model”, which is what joint models of species distributions are called in the ecological literature. We first load spOccupancy below, as well as a few other packages we will use for generating maps and other summaries of the model

set.seed(2210)
library(spOccupancy)
library(coda)
library(MCMCvis)
library(ggplot2)
library(pals)
library(sf)

Exploratory data analysis

The data are stored in an R data file object called vermont-tree-data.rda. Reading in this object will load an object called data.list.

load("vermont-tree-data.rda")
# Take a look at the data object
str(data.list)

List of 3
 $ y     : num [1:10, 1:688] 1 1 1 0 0 1 0 0 0 0 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:10] "sugar-maple" "red-maple" "yellow-birch" "American-beech" ...
  .. ..$ : NULL
 $ coords: num [1:688, 1:2] 1817 1820 1805 1804 1825 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:2] "LON" "LAT"
 $ covs  :'data.frame': 688 obs. of  8 variables:
  ..$ tmax         : num [1:688] 25.4 25.9 25.8 26.1 24.8 25.2 26.3 23.4 25.6 26.9 ...
  ..$ tmin         : num [1:688] -15.1 -15.3 -15 -15 -15.3 -14.9 -14.8 -15.5 -14.9 -14.3 ...
  ..$ prcp         : num [1:688] 1171 1177 1092 1085 1262 ...
  ..$ pet          : num [1:688] 504 508 513 515 496 ...
  ..$ aet          : num [1:688] 466 468 465 464 468 ...
  ..$ water_deficit: num [1:688] 37.4 39.8 47.4 50.8 27.9 42.1 57.4 15.8 44.3 81.7 ...
  ..$ vpd          : num [1:688] 0.614 0.638 0.65 0.664 0.586 0.636 0.686 0.5 0.624 0.728 ...
  ..$ elev         : num [1:688] 253 202 236 134 424 363 138 595 273 54 ...

# Look at the species we're modeling
(sp.names <- dimnames(data.list$y)[[1]])

 [1] "sugar-maple"        "red-maple"          "yellow-birch"      
 [4] "American-beech"     "eastern-hemlock"    "white-ash"         
 [7] "eastern-white pine" "red-spruce"         "paper-birch"       
[10] "northern-red oak"

The data are stored in the format required for spOccupancy, which is a list object with the following three components:

y: an \(h \times n\) matrix comprising the observed presence (1) or absence (0) of each of the \(h\) species at each of the \(n\) spatial locations.
coords: an \(n \times 2\) matrix of spatial coordinates in a projected coordinate system. Here, the coordinates are specified in Albers Equal Area.
covs: a \(n \times p\) matrix or data frame of potential covariates for use in the model. Here we include a variety of 30-year climate normals that we believe may be useful in explaining variability in the distributions of the ten species across Vermont. These include: (1) tmax: maximum temperature; (2) tmin: minimum temperature; (3) prcp: precipitation; (4) pet: potential evapotranspiration; (5) aet: actual evapotranspiration; (6) water_deficit: climate water deficit; (7) vpd: vapor pressure deficit; and (8) elev: elevation.

Let’s first generate some EDA plots of the observed presence/absence data for each species

h <- length(sp.names)
n <- nrow(data.list$coords)
my.crs <- "+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=37.5 +lon_0=-96 +x_0=0 +y_0=0 +ellps=GRS80 +datum=NAD83 +units=km +no_defs"
plot.df <- data.frame(val = c(t(data.list$y)),
                      x = rep(data.list$coords[, 1], times = h),
                      y = rep(data.list$coords[, 2], times = h),
                      species = factor(rep(sp.names, each = n)))
plot.df$val <- factor(ifelse(plot.df$val == 1, 'Present', 'Absent'))
plot.sf <- st_as_sf(plot.df, coords = c('x', 'y'), 
                    crs = my.crs)
ggplot() +
  geom_sf(data = plot.sf, aes(col = val), size = 0.5) +
  scale_color_viridis_d() +
  theme_bw(base_size = 10) +
  facet_wrap(~species, nrow = 2) +
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank()) +
  labs(x = "Longitude", y = "Latitude", col = "")

Observed presence/absence of the ten species across Vermont

Now let’s take a look at the covariates

cov.names <- colnames(data.list$covs)
p.cov <- length(cov.names)
# Standardize covariates for plotting
covs.std <- apply(data.list$covs, 2, scale)
plot.df <- data.frame(val = c(covs.std),
                      x = rep(data.list$coords[, 1], times = p.cov),
                      y = rep(data.list$coords[, 2], times = p.cov),
                      covariate = factor(rep(cov.names, each = n)))
plot.sf <- st_as_sf(plot.df, coords = c('x', 'y'), 
                    crs = my.crs)
ggplot() +
  geom_sf(data = plot.sf, aes(col = val), size = 0.5) +
  scale_color_viridis_c(option = 'plasma') +
  theme_bw(base_size = 10) +
  facet_wrap(~covariate, nrow = 2) +
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank()) +
  labs(x = "Longitude", y = "Latitude", col = "")

Predictor variables (centered and scaled)

Not surprisingly, the covariates seem to have very high correlation. Let’s dig deeper

pairs(data.list$covs, upper.panel = NULL, pch = 19, cex = 0.5)

cor(data.list$covs)

                     tmax        tmin        prcp        pet         aet
tmax           1.00000000  0.44380248 -0.79073330  0.9241158  0.02565207
tmin           0.44380248  1.00000000  0.03905879  0.6646919  0.37126485
prcp          -0.79073330  0.03905879  1.00000000 -0.5892543  0.43235321
pet            0.92411581  0.66469193 -0.58925432  1.0000000  0.25280047
aet            0.02565207  0.37126485  0.43235321  0.2528005  1.00000000
water_deficit  0.93803165  0.54018411 -0.76944149  0.9292062 -0.12262683
vpd            0.96470783  0.55807558 -0.69252632  0.9809843  0.14614895
elev          -0.83389753 -0.07480407  0.85958172 -0.6234173  0.24270407
              water_deficit        vpd        elev
tmax              0.9380316  0.9647078 -0.83389753
tmin              0.5401841  0.5580756 -0.07480407
prcp             -0.7694415 -0.6925263  0.85958172
pet               0.9292062  0.9809843 -0.62341731
aet              -0.1226268  0.1461489  0.24270407
water_deficit     1.0000000  0.9503862 -0.73198856
vpd               0.9503862  1.0000000 -0.73508308
elev             -0.7319886 -0.7350831  1.00000000

Given the high correlation between most of the predictor variables, we will only use minimum temperature and elevation for our model.

Model Fitting

We will fit the model with the sfJSDM() function in spOccupancy, which uses Nearest Neighbor Gaussian Processes (Datta et al. 2016) and Pólya-Gamma data augmentation (Polson, Scott, and Windle 2013) for a computationally efficient implementation of binary spatial factor models. Full details on the syntax for sfJSDM() can be found with ?sfJSDM and a detailed vignette here (among other models more specific for wildlife ecology).

We first set the number of spatial factors and number of nearest neighbors that we’ll use in the model. Here we’ll use three spatial factors and five neighbors in the NNGP approximation.

# Number of factors
n.factors <- 3
# Number of neighbors for NNGP
n.neighbors <- 5
# Exponential covariance model
cov.model <- 'exponential'

We’ll next specify our priors for the model. We will use Gaussian priors for the “community-level” hyper mean parameters (\(\boldsymbol{\mu}_{\beta}\)) with a mean of 0 and variance of 2.72 (which results in a near uniform prior on the probability scale), and inverse-Gamma priors for the hyper-variance parameters (\(\boldsymbol{\tau^2}_{\beta}\)) with shape and scale equal to .1. Recall that for spatial factor models, we fix the spatial variance parameter \(\sigma^2\) to 1 for each NNGP. The priors for the lower triangle of the factor loadings matrix are set to standard Gaussian priors. We then specify a uniform prior for the spatial decay parameters \(\boldsymbol{\phi}\). For an exponential correlation function, the effective spatial range is equal to \(3 / \phi\), so we can restrict the bounds of the effective spatial range by specifiying a prior of the form Uniform(3 / upper bound, 3 / lower bound). Here, we set the lower bound to 8km and the upper bound to 240km. We also set some initial values for most parameters in the model (which the function will do by default, but we do here just to be explicit).

# Priors
upper.dist <- 240
lower.dist <- 8
priors <- list(beta.comm.normal = list(mean = 0, var = 2.72), # Prior on hyper-means
               tau.sq.beta.ig = list(a = 0.1, b = 0.1), # Prior for hyper-variances
               phi.unif = list(3 / upper.dist, 3 / lower.dist))
# Set initial values
lambda.inits <- matrix(0, h, n.factors)
diag(lambda.inits) <- 1
lambda.inits

      [,1] [,2] [,3]
 [1,]    1    0    0
 [2,]    0    1    0
 [3,]    0    0    1
 [4,]    0    0    0
 [5,]    0    0    0
 [6,]    0    0    0
 [7,]    0    0    0
 [8,]    0    0    0
 [9,]    0    0    0
[10,]    0    0    0

inits <- list(beta.comm = 0, tau.sq.beta = 1, phi = 3 / 100,
              lambda = lambda.inits)

Lastly, we fit the model using an Adaptive MCMC sampler, where we adjust the tuning parameters adaptively for those parameters whose full conditional distributions are not accessible in closed form. Here, because of the Pólya-Gamma data augmentation scheme we use, the only parameters we need to tune are the spatial decay parameters \(\boldsymbol{\phi}\), which are updated using Metropolis-Hastings. In this adaptive algorithm, we set an initial tuning variance for \(\boldsymbol{\phi}\) and split the total number of MCMC iterations up into a set of batches (n.batches) each of some specified length of iterations (batch.length). After each batch, the tuning variance of \(\boldsymbol{\phi}\) is adjusted to try to achieve some target acceptance rate (accept.rate) of the proposed values. We will use a target acceptance rate of 0.43. Below, we set the initial tuning parameters, the number of batches, the length of each batch, and the amount of burn-in and thinning of each MCMC chain. Here we will just run a single MCMC chain.

# Set tuning parameters
tuning <- list(phi = c(1, 0.5, 0.5))

# Specify MCMC criteria
# Number of batches
n.batch <- 400
# Batch length
batch.length <- 25
# Total number of MCMC samples is batch.length * n.batch
(n.samples <- n.batch * batch.length)

[1] 10000

# Amount of burn-in
n.burn <- 5000
# Thinning rate
n.thin <- 5
# Number of chains
n.chains <- 1

Now we are ready to fit our model. We specify the covariates that we want to include in the model using standard R formula syntax. Here we include linear and quadratic effects of minimum temperature and elevation. See ?sfJSDM for more details on all model arguments.

out <- sfJSDM(formula = ~ scale(tmin) + I(scale(tmin)^2) +
                          scale(elev) + I(scale(elev)^2),
              data = data.list, priors = priors, inits = inits,
              n.neighbors = n.neighbors, cov.model = cov.model,
              n.factors = n.factors, n.batch = n.batch, tuning = tuning,
              batch.length = batch.length, n.burn = n.burn,
              accept.rate = 0.43, n.thin = n.thin,
              n.chains = n.chains, n.report = 100)

----------------------------------------
    Preparing to run the model
----------------------------------------

beta is not specified in initial values.
Setting initial values to random values from the community-level normal distribution

w is not specified in initial values.
Setting initial value to 0

----------------------------------------
    Building the neighbor list
----------------------------------------
----------------------------------------
Building the neighbors of neighbors list
----------------------------------------
----------------------------------------
    Model description
----------------------------------------
Spatial Factor NNGP JSDM with Polya-Gamma latent
variable fit with 688 sites and 10 species.

Samples per chain: 10000 (400 batches of length 25)
Burn-in: 5000 
Thinning Rate: 5 
Number of Chains: 1 
Total Posterior Samples: 1000 

Using the exponential spatial correlation model.

Using 3 latent spatial factors.
Using 5 nearest neighbors.

Source compiled with OpenMP support and model fit using 1 thread(s).

Adaptive Metropolis with target acceptance rate: 43.0
----------------------------------------
    Chain 1
----------------------------------------
Sampling ... 
Batch: 100 of 400, 25.00%
    Latent Factor   Parameter   Acceptance  Tuning
    1       phi     56.0        1.76827
    2       phi     68.0        0.80000
    3       phi     24.0        0.31250
-------------------------------------------------
Batch: 200 of 400, 50.00%
    Latent Factor   Parameter   Acceptance  Tuning
    1       phi     24.0        1.91554
    2       phi     32.0        0.83265
    3       phi     68.0        0.33183
-------------------------------------------------
Batch: 300 of 400, 75.00%
    Latent Factor   Parameter   Acceptance  Tuning
    1       phi     16.0        2.20340
    2       phi     32.0        0.49502
    3       phi     44.0        0.64201
-------------------------------------------------
Batch: 400 of 400, 100.00%

# Quick summary of model output
summary(out)


Call:
sfJSDM(formula = ~scale(tmin) + I(scale(tmin)^2) + scale(elev) + 
    I(scale(elev)^2), data = data.list, inits = inits, priors = priors, 
    tuning = tuning, cov.model = cov.model, n.neighbors = n.neighbors, 
    n.factors = n.factors, n.batch = n.batch, batch.length = batch.length, 
    accept.rate = 0.43, n.report = 100, n.burn = n.burn, n.thin = n.thin, 
    n.chains = n.chains)

Samples per Chain: 10000
Burn-in: 5000
Thinning Rate: 5
Number of Chains: 1
Total Posterior Samples: 1000
Run Time (min): 2.5224

----------------------------------------
    Community Level
----------------------------------------
Means (logit scale): 
                    Mean     SD    2.5%     50%   97.5% Rhat ESS
(Intercept)      -0.0578 0.4659 -1.0408 -0.0415  0.8109   NA 677
scale(tmin)       0.1694 0.2004 -0.2063  0.1592  0.5798   NA 152
I(scale(tmin)^2) -0.1163 0.0968 -0.3014 -0.1154  0.0748   NA 684
scale(elev)      -0.3236 0.4235 -1.1399 -0.3094  0.5615   NA 841
I(scale(elev)^2) -0.3367 0.1026 -0.5398 -0.3352 -0.1273   NA 862

Variances (logit scale): 
                   Mean     SD   2.5%    50%  97.5% Rhat  ESS
(Intercept)      2.2502 1.5417 0.7096 1.8535 6.2477   NA  108
scale(tmin)      0.3455 0.2418 0.0863 0.2787 1.0092   NA  140
I(scale(tmin)^2) 0.0730 0.0473 0.0233 0.0606 0.1909   NA 1000
scale(elev)      2.0426 1.4047 0.7305 1.7153 5.0673   NA  537
I(scale(elev)^2) 0.0788 0.0522 0.0258 0.0638 0.1988   NA  891

----------------------------------------
    Species Level
----------------------------------------
Estimates (logit scale): 
                                       Mean     SD    2.5%     50%   97.5% Rhat
(Intercept)-sugar-maple              1.9023 0.1785  1.5658  1.9018  2.2569   NA
(Intercept)-red-maple                0.4249 0.2106  0.0111  0.4287  0.8251   NA
(Intercept)-yellow-birch             0.4545 0.4905 -0.3552  0.4043  1.4905   NA
(Intercept)-American-beech           1.5888 0.4106  0.7745  1.5708  2.4346   NA
(Intercept)-eastern-hemlock         -0.8522 0.3440 -1.4053 -0.8899 -0.0993   NA
(Intercept)-white-ash                0.1773 0.2208 -0.2275  0.1627  0.6791   NA
(Intercept)-eastern-white pine      -1.4874 0.6840 -2.9165 -1.4373 -0.3150   NA
(Intercept)-red-spruce               0.0629 0.3634 -0.5880  0.0461  0.8218   NA
(Intercept)-paper-birch             -0.4466 0.2490 -0.8943 -0.4623  0.0945   NA
(Intercept)-northern-red oak        -2.3009 0.4886 -3.2622 -2.2931 -1.3973   NA
scale(tmin)-sugar-maple             -0.0047 0.1158 -0.2349 -0.0111  0.2223   NA
scale(tmin)-red-maple                0.1421 0.1493 -0.1458  0.1425  0.4467   NA
scale(tmin)-yellow-birch            -0.2182 0.2392 -0.7315 -0.2020  0.2302   NA
scale(tmin)-American-beech           0.8200 0.2608  0.3103  0.8241  1.3634   NA
scale(tmin)-eastern-hemlock          0.3741 0.2200 -0.0958  0.3928  0.7718   NA
scale(tmin)-white-ash                0.2246 0.1305 -0.0289  0.2274  0.4703   NA
scale(tmin)-eastern-white pine       0.0681 0.3189 -0.5157  0.0528  0.7106   NA
scale(tmin)-red-spruce              -0.2696 0.2138 -0.7494 -0.2542  0.1039   NA
scale(tmin)-paper-birch             -0.4701 0.1479 -0.7848 -0.4653 -0.1937   NA
scale(tmin)-northern-red oak         0.9597 0.3161  0.3790  0.9475  1.6312   NA
I(scale(tmin)^2)-sugar-maple        -0.1112 0.0930 -0.2894 -0.1146  0.0716   NA
I(scale(tmin)^2)-red-maple           0.2539 0.1207  0.0175  0.2495  0.5063   NA
I(scale(tmin)^2)-yellow-birch       -0.1389 0.1283 -0.3883 -0.1378  0.1099   NA
I(scale(tmin)^2)-American-beech     -0.1003 0.1402 -0.3714 -0.0961  0.1596   NA
I(scale(tmin)^2)-eastern-hemlock    -0.1744 0.1240 -0.4192 -0.1749  0.0744   NA
I(scale(tmin)^2)-white-ash          -0.1228 0.0875 -0.3023 -0.1210  0.0430   NA
I(scale(tmin)^2)-eastern-white pine -0.1948 0.1622 -0.5150 -0.1924  0.1266   NA
I(scale(tmin)^2)-red-spruce         -0.2139 0.1207 -0.4443 -0.2131  0.0170   NA
I(scale(tmin)^2)-paper-birch        -0.0874 0.0891 -0.2598 -0.0828  0.0851   NA
I(scale(tmin)^2)-northern-red oak   -0.2892 0.1935 -0.6896 -0.2789  0.0536   NA
scale(elev)-sugar-maple              0.0829 0.1035 -0.1150  0.0879  0.2858   NA
scale(elev)-red-maple               -0.7274 0.1436 -1.0274 -0.7210 -0.4608   NA
scale(elev)-yellow-birch             1.3283 0.2016  0.9234  1.3202  1.7398   NA
scale(elev)-American-beech           0.7434 0.2068  0.3499  0.7398  1.1763   NA
scale(elev)-eastern-hemlock         -1.4257 0.1925 -1.8245 -1.4201 -1.0494   NA
scale(elev)-white-ash               -0.8038 0.1205 -1.0482 -0.7986 -0.5613   NA
scale(elev)-eastern-white pine      -2.4137 0.3578 -3.1089 -2.3983 -1.7749   NA
scale(elev)-red-spruce               1.2416 0.1947  0.8802  1.2288  1.6540   NA
scale(elev)-paper-birch              0.2379 0.1112  0.0239  0.2325  0.4576   NA
scale(elev)-northern-red oak        -1.6505 0.2827 -2.2268 -1.6389 -1.1339   NA
I(scale(elev)^2)-sugar-maple        -0.3897 0.0827 -0.5487 -0.3871 -0.2322   NA
I(scale(elev)^2)-red-maple          -0.2688 0.0985 -0.4718 -0.2710 -0.0776   NA
I(scale(elev)^2)-yellow-birch       -0.1496 0.1216 -0.3981 -0.1487  0.0847   NA
I(scale(elev)^2)-American-beech     -0.3818 0.1160 -0.6224 -0.3826 -0.1633   NA
I(scale(elev)^2)-eastern-hemlock    -0.4747 0.1228 -0.7133 -0.4712 -0.2391   NA
I(scale(elev)^2)-white-ash          -0.5626 0.0975 -0.7499 -0.5624 -0.3846   NA
I(scale(elev)^2)-eastern-white pine -0.4154 0.1834 -0.7625 -0.4213 -0.0644   NA
I(scale(elev)^2)-red-spruce         -0.3725 0.1275 -0.6436 -0.3682 -0.1453   NA
I(scale(elev)^2)-paper-birch        -0.0003 0.0810 -0.1601 -0.0018  0.1556   NA
I(scale(elev)^2)-northern-red oak   -0.3566 0.1588 -0.6888 -0.3575 -0.0528   NA
                                     ESS
(Intercept)-sugar-maple              714
(Intercept)-red-maple                394
(Intercept)-yellow-birch               6
(Intercept)-American-beech           100
(Intercept)-eastern-hemlock           43
(Intercept)-white-ash                192
(Intercept)-eastern-white pine        10
(Intercept)-red-spruce                53
(Intercept)-paper-birch               71
(Intercept)-northern-red oak          14
scale(tmin)-sugar-maple              736
scale(tmin)-red-maple                293
scale(tmin)-yellow-birch              19
scale(tmin)-American-beech           114
scale(tmin)-eastern-hemlock           33
scale(tmin)-white-ash                308
scale(tmin)-eastern-white pine        13
scale(tmin)-red-spruce               123
scale(tmin)-paper-birch              187
scale(tmin)-northern-red oak          33
I(scale(tmin)^2)-sugar-maple        1000
I(scale(tmin)^2)-red-maple           637
I(scale(tmin)^2)-yellow-birch        113
I(scale(tmin)^2)-American-beech      496
I(scale(tmin)^2)-eastern-hemlock     577
I(scale(tmin)^2)-white-ash           910
I(scale(tmin)^2)-eastern-white pine  234
I(scale(tmin)^2)-red-spruce          603
I(scale(tmin)^2)-paper-birch         635
I(scale(tmin)^2)-northern-red oak    343
scale(elev)-sugar-maple              732
scale(elev)-red-maple                157
scale(elev)-yellow-birch              74
scale(elev)-American-beech           158
scale(elev)-eastern-hemlock          362
scale(elev)-white-ash                773
scale(elev)-eastern-white pine       164
scale(elev)-red-spruce               299
scale(elev)-paper-birch              323
scale(elev)-northern-red oak         341
I(scale(elev)^2)-sugar-maple         486
I(scale(elev)^2)-red-maple           586
I(scale(elev)^2)-yellow-birch        588
I(scale(elev)^2)-American-beech      504
I(scale(elev)^2)-eastern-hemlock     428
I(scale(elev)^2)-white-ash           559
I(scale(elev)^2)-eastern-white pine  260
I(scale(elev)^2)-red-spruce          655
I(scale(elev)^2)-paper-birch         844
I(scale(elev)^2)-northern-red oak    347

----------------------------------------
    Spatial Covariance
----------------------------------------
        Mean     SD   2.5%    50%  97.5% Rhat ESS
phi-1 0.3445 0.0246 0.2865 0.3504 0.3742   NA 352
phi-2 0.2252 0.0660 0.1012 0.2208 0.3597   NA  29
phi-3 0.0277 0.0185 0.0127 0.0184 0.0743   NA   5

Here we only ran one chain of the model, so the model summary does not report a Gelman-Rubin diagnostic (Rhat) for any of the model parameters. Looking at the effective sample sizes, we see slow mixing of the spatial decay parameters and the species-specific intercepts. We can look further at the traceplots for some of the model parameters to get a better sense of how well our model has converged.

# Components stored in the resulting model object
names(out)

 [1] "rhat"                "beta.comm.samples"   "tau.sq.beta.samples"
 [4] "beta.samples"        "lambda.samples"      "theta.samples"      
 [7] "w.samples"           "psi.samples"         "like.samples"       
[10] "z.samples"           "X.re"                "ESS"                
[13] "X"                   "y"                   "call"               
[16] "n.samples"           "x.names"             "sp.names"           
[19] "theta.names"         "type"                "coords"             
[22] "cov.model.indx"      "n.neighbors"         "q"                  
[25] "n.post"              "n.thin"              "n.burn"             
[28] "n.chains"            "monitors"            "update"             
[31] "psiRE"               "run.time"

plot(out$beta.comm.samples, density = FALSE)

Traceplots for hyper-mean parameters

plot(out$beta.samples[, 1:9], density = FALSE)

Traceplots for the first nine-species intercepts

plot(out$lambda.samples[, 2:10], density = FALSE)

Traceplots for the nine estimated loadings for the first spatial factor

plot(mcmc(out$psi.samples[, 1:9, 10]), density = FALSE)

Traceplots for occurrence probabilities for nine species at one forest plot

plot(out$theta.samples, density = FALSE)

Traceplots for spatial decay parameters for the three spatial factors

We see good mixing and convergence of the hyper-means, fairly slow mixing of some of the species-specific intercepts, adequate mixing and convergence of the loadings for the first spatial factor, and decent mixing and convergence of a few occurrence probabilities. The decay parameters (particularly the third one) do not show good mixing and/or convergence. This is not all that surprising, as these parameters are updated with a Metropolis-Hastings update while all other parameters have fully Gibbs updates.

Let’s look a bit more at some of the parameter estimates from the model.

# Create simple plot summaries using MCMCvis package
# "Community"-level effects. Mean effects across all species
MCMCplot(out$beta.comm.samples, ref_ovl = TRUE, ci = c(50, 95))

# Effects for one species of interest (sugar maple)
MCMCplot(out$beta.samples, ref_ovl = TRUE, ci = c(50, 95), exact = FALSE,
         param = 'sugar-maple')

# Latent factor loadings
MCMCplot(out$lambda.samples, ref_ovl = TRUE, ci = c(50, 95))

Looking at the factor loadings (as well as the factors themselves) can give some indication as to how many factors you want to estimate in the model. For a given factor, if estimates of the loadings and the values of the factor across space are all close to 0, that suggests you might be able to reduce the number of factors in the model.

Next, let’s compare the multivariate model to a couple of univariate models for two individual species. In the below set of code, we fit two univaraite models for American Beech and Eastern Hemlock, and then compare the univariate models to the joint model using WAIC. In a more formal comparison, we would generate some out-of-sample prediction criteria to compare the univariate and joint models.

# Use WAIC to compare spatial factor model to univariate spatial models for
# two species (American beech and eastern hemlock)
sp.names <- dimnames(data.list$y)[[1]]
# Eastern hemlock ---------------------
data.hemlock <- data.list
data.hemlock$y <- data.list$y[which(sp.names == 'eastern-hemlock'), ]
data.hemlock$weights <- rep(1, nrow(data.hemlock$covs))
str(data.hemlock)

List of 4
 $ y      : num [1:688] 0 1 0 1 1 0 1 0 1 0 ...
 $ coords : num [1:688, 1:2] 1817 1820 1805 1804 1825 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:2] "LON" "LAT"
 $ covs   :'data.frame':    688 obs. of  8 variables:
  ..$ tmax         : num [1:688] 25.4 25.9 25.8 26.1 24.8 25.2 26.3 23.4 25.6 26.9 ...
  ..$ tmin         : num [1:688] -15.1 -15.3 -15 -15 -15.3 -14.9 -14.8 -15.5 -14.9 -14.3 ...
  ..$ prcp         : num [1:688] 1171 1177 1092 1085 1262 ...
  ..$ pet          : num [1:688] 504 508 513 515 496 ...
  ..$ aet          : num [1:688] 466 468 465 464 468 ...
  ..$ water_deficit: num [1:688] 37.4 39.8 47.4 50.8 27.9 42.1 57.4 15.8 44.3 81.7 ...
  ..$ vpd          : num [1:688] 0.614 0.638 0.65 0.664 0.586 0.636 0.686 0.5 0.624 0.728 ...
  ..$ elev         : num [1:688] 253 202 236 134 424 363 138 595 273 54 ...
 $ weights: num [1:688] 1 1 1 1 1 1 1 1 1 1 ...

priors.univariate <- list(beta.normal = list(mean = 0, var = 2.72),
                          phi.unif = list(3 / upper.dist, 3 / lower.dist),
                          sigma.sq.ig = list(2, 1))
# Approx. run time: 1 minute
out.hemlock <- svcPGBinom(formula = ~ scale(tmin) + I(scale(tmin)^2) +
                                      scale(elev) + I(scale(elev)^2),
                          data = data.hemlock, priors = priors.univariate, svc.cols = 1,
                          n.neighbors = n.neighbors, cov.model = 'exponential',
                          n.batch = n.batch, batch.length = batch.length,
                          n.burn = n.burn, accept.rate = 0.43, n.thin = n.thin,
                          n.chains = n.chains, n.report = 100)

----------------------------------------
    Preparing to run the model
----------------------------------------

beta is not specified in initial values.
Setting initial values to random values from the prior distribution

phi is not specified in initial values.
Setting initial value to random values from the prior distribution

sigma.sq is not specified in initial values.
Setting initial values to random values from the prior distribution

w is not specified in initial values.
Setting initial value to 0

----------------------------------------
    Building the neighbor list
----------------------------------------
----------------------------------------
Building the neighbors of neighbors list
----------------------------------------
----------------------------------------
    Model description
----------------------------------------
Spatial NNGP Binomial model with Polya-Gamma latent
variable fit with 688 sites.

Samples per chain: 10000 (400 batches of length 25)
Burn-in: 5000 
Thinning Rate: 5 
Number of Chains: 1 
Total Posterior Samples: 1000 

Number of spatially-varying coefficients: 1 
Using the exponential spatial correlation model.

Using 5 nearest neighbors.

Source compiled with OpenMP support and model fit using 1 thread(s).

Adaptive Metropolis with target acceptance rate: 43.0
----------------------------------------
    Chain 1
----------------------------------------
Sampling ... 
Batch: 100 of 400, 25.00%
    Coefficient Parameter   Acceptance  Tuning
    1       phi     24.0        0.37908
-------------------------------------------------
Batch: 200 of 400, 50.00%
    Coefficient Parameter   Acceptance  Tuning
    1       phi     32.0        0.24414
-------------------------------------------------
Batch: 300 of 400, 75.00%
    Coefficient Parameter   Acceptance  Tuning
    1       phi     44.0        0.30422
-------------------------------------------------
Batch: 400 of 400, 100.00%

# American beech ----------------------
data.beech <- data.list
data.beech$y <- data.list$y[which(sp.names == 'American-beech'), ]
data.beech$weights <- rep(1, nrow(data.beech$covs))
str(data.beech)

List of 4
 $ y      : num [1:688] 0 0 0 0 0 0 1 0 1 0 ...
 $ coords : num [1:688, 1:2] 1817 1820 1805 1804 1825 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:2] "LON" "LAT"
 $ covs   :'data.frame':    688 obs. of  8 variables:
  ..$ tmax         : num [1:688] 25.4 25.9 25.8 26.1 24.8 25.2 26.3 23.4 25.6 26.9 ...
  ..$ tmin         : num [1:688] -15.1 -15.3 -15 -15 -15.3 -14.9 -14.8 -15.5 -14.9 -14.3 ...
  ..$ prcp         : num [1:688] 1171 1177 1092 1085 1262 ...
  ..$ pet          : num [1:688] 504 508 513 515 496 ...
  ..$ aet          : num [1:688] 466 468 465 464 468 ...
  ..$ water_deficit: num [1:688] 37.4 39.8 47.4 50.8 27.9 42.1 57.4 15.8 44.3 81.7 ...
  ..$ vpd          : num [1:688] 0.614 0.638 0.65 0.664 0.586 0.636 0.686 0.5 0.624 0.728 ...
  ..$ elev         : num [1:688] 253 202 236 134 424 363 138 595 273 54 ...
 $ weights: num [1:688] 1 1 1 1 1 1 1 1 1 1 ...

# Approx. run time: 1 minute
out.beech <- svcPGBinom(formula = ~ scale(tmin) + I(scale(tmin)^2) +
                                    scale(elev) + I(scale(elev)^2),
                        data = data.beech, priors = priors.univariate, svc.cols = 1,
                        n.neighbors = n.neighbors, cov.model = 'exponential',
                        n.batch = n.batch, batch.length = batch.length,
                        n.burn = n.burn, accept.rate = 0.43, n.thin = n.thin,
                        n.chains = n.chains, n.report = 100)

----------------------------------------
    Preparing to run the model
----------------------------------------

beta is not specified in initial values.
Setting initial values to random values from the prior distribution

phi is not specified in initial values.
Setting initial value to random values from the prior distribution

sigma.sq is not specified in initial values.
Setting initial values to random values from the prior distribution

w is not specified in initial values.
Setting initial value to 0

----------------------------------------
    Building the neighbor list
----------------------------------------
----------------------------------------
Building the neighbors of neighbors list
----------------------------------------
----------------------------------------
    Model description
----------------------------------------
Spatial NNGP Binomial model with Polya-Gamma latent
variable fit with 688 sites.

Samples per chain: 10000 (400 batches of length 25)
Burn-in: 5000 
Thinning Rate: 5 
Number of Chains: 1 
Total Posterior Samples: 1000 

Number of spatially-varying coefficients: 1 
Using the exponential spatial correlation model.

Using 5 nearest neighbors.

Source compiled with OpenMP support and model fit using 1 thread(s).

Adaptive Metropolis with target acceptance rate: 43.0
----------------------------------------
    Chain 1
----------------------------------------
Sampling ... 
Batch: 100 of 400, 25.00%
    Coefficient Parameter   Acceptance  Tuning
    1       phi     44.0        0.50158
-------------------------------------------------
Batch: 200 of 400, 50.00%
    Coefficient Parameter   Acceptance  Tuning
    1       phi     32.0        0.50158
-------------------------------------------------
Batch: 300 of 400, 75.00%
    Coefficient Parameter   Acceptance  Tuning
    1       phi     36.0        0.56553
-------------------------------------------------
Batch: 400 of 400, 100.00%

waic.joint <- waicOcc(out, by.sp = TRUE)
rownames(waic.joint) <- sp.names
waic.joint

                        elpd        pD     WAIC
sugar-maple        -274.8984  41.42404 632.6450
red-maple          -289.7761  99.15664 777.8654
yellow-birch       -258.1507  74.39994 665.1013
American-beech     -198.3603 137.05896 670.8385
eastern-hemlock    -263.3020  59.88226 646.3685
white-ash          -385.6940  26.05158 823.4911
eastern-white pine -170.5473  46.88716 434.8689
red-spruce         -263.0844 107.55150 741.2717
paper-birch        -374.5959  43.17852 835.5487
northern-red oak   -156.1504  19.15890 350.6186

waicOcc(out.hemlock)

      elpd         pD       WAIC 
-303.38693   33.50046  673.77477

waicOcc(out.beech)

     elpd        pD      WAIC 
-228.6058  142.4851  742.1818

WAIC is lower for both species, indicating the joint model provides some improvements compared to the univariate models.

Prediction

Next, let’s turn to our main task of predicting distributions across the state. First we load a set of covariates calculated across a 2 km grid of Vermont. We then standardize them by the values used to fit the data, and subsequently predict across the evenly spaced grid.

# Read in prediction data (reads in two objects coords.0 and X.0
load("prediction-data.rda")
str(coords.0)

'data.frame':   6297 obs. of  2 variables:
 $ X: num  1905 1895 1897 1899 1901 ...
 $ Y: num  1086 1084 1084 1084 1084 ...

str(X.0)

 num [1:6297, 1:8] 24.3 23.5 24 24.2 24.2 24.3 24.3 24.1 24.2 24.2 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:8] "tmax" "tmin" "ppt" "pet" ...

# Standardize covariate values by values used to fit the model
tmin.pred <- (X.0[, 'tmin'] - mean(data.list$covs$tmin)) / sd(data.list$covs$tmin)
elev.pred <- (X.0[, 'elev'] - mean(data.list$covs$elev)) / sd(data.list$covs$elev)
X.pred <- cbind(1, tmin.pred, tmin.pred^2, elev.pred, elev.pred^2)
colnames(X.pred) <- c('(Intercept)', 'tmin', 'tmin.2', 'elev', 'elev.2')

# Predict occurrence probabilities
# Approx. run time: 1 minute
out.pred <- predict(out, X.pred, coords.0)

----------------------------------------
    Prediction description
----------------------------------------
Spatial Factor NNGP Multispecies Occupancy model with Polya-Gamma latent
variable fit with 688 observations.

Number of covariates 5 (including intercept if specified).

Using the exponential spatial correlation model.

Using 5 nearest neighbors.
Using 3 latent spatial factors.

Number of MCMC samples 1000.

Predicting at 6297 non-sampled locations.


Source compiled with OpenMP support and model fit using 1 threads.
-------------------------------------------------
        Predicting
-------------------------------------------------
Location: 100 of 6297, 1.59%
Location: 200 of 6297, 3.18%
Location: 300 of 6297, 4.76%
Location: 400 of 6297, 6.35%
Location: 500 of 6297, 7.94%
Location: 600 of 6297, 9.53%
Location: 700 of 6297, 11.12%
Location: 800 of 6297, 12.70%
Location: 900 of 6297, 14.29%
Location: 1000 of 6297, 15.88%
Location: 1100 of 6297, 17.47%
Location: 1200 of 6297, 19.06%
Location: 1300 of 6297, 20.64%
Location: 1400 of 6297, 22.23%
Location: 1500 of 6297, 23.82%
Location: 1600 of 6297, 25.41%
Location: 1700 of 6297, 27.00%
Location: 1800 of 6297, 28.59%
Location: 1900 of 6297, 30.17%
Location: 2000 of 6297, 31.76%
Location: 2100 of 6297, 33.35%
Location: 2200 of 6297, 34.94%
Location: 2300 of 6297, 36.53%
Location: 2400 of 6297, 38.11%
Location: 2500 of 6297, 39.70%
Location: 2600 of 6297, 41.29%
Location: 2700 of 6297, 42.88%
Location: 2800 of 6297, 44.47%
Location: 2900 of 6297, 46.05%
Location: 3000 of 6297, 47.64%
Location: 3100 of 6297, 49.23%
Location: 3200 of 6297, 50.82%
Location: 3300 of 6297, 52.41%
Location: 3400 of 6297, 53.99%
Location: 3500 of 6297, 55.58%
Location: 3600 of 6297, 57.17%
Location: 3700 of 6297, 58.76%
Location: 3800 of 6297, 60.35%
Location: 3900 of 6297, 61.93%
Location: 4000 of 6297, 63.52%
Location: 4100 of 6297, 65.11%
Location: 4200 of 6297, 66.70%
Location: 4300 of 6297, 68.29%
Location: 4400 of 6297, 69.87%
Location: 4500 of 6297, 71.46%
Location: 4600 of 6297, 73.05%
Location: 4700 of 6297, 74.64%
Location: 4800 of 6297, 76.23%
Location: 4900 of 6297, 77.81%
Location: 5000 of 6297, 79.40%
Location: 5100 of 6297, 80.99%
Location: 5200 of 6297, 82.58%
Location: 5300 of 6297, 84.17%
Location: 5400 of 6297, 85.76%
Location: 5500 of 6297, 87.34%
Location: 5600 of 6297, 88.93%
Location: 5700 of 6297, 90.52%
Location: 5800 of 6297, 92.11%
Location: 5900 of 6297, 93.70%
Location: 6000 of 6297, 95.28%
Location: 6100 of 6297, 96.87%
Location: 6200 of 6297, 98.46%
Location: 6297 of 6297, 100.00%
Generating latent occupancy state

str(out.pred)

List of 6
 $ z.0.samples  : num [1:1000, 1:10, 1:6297] 1 1 1 1 1 0 1 0 1 1 ...
 $ w.0.samples  : num [1:1000, 1:3, 1:6297] -1.265 -1.224 -0.307 0.299 0.893 ...
 $ psi.0.samples: num [1:1000, 1:10, 1:6297] 0.363 0.549 0.662 0.864 0.92 ...
 $ run.time     : 'proc_time' Named num [1:5] 61.2 21.9 46.3 0 0
  ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
 $ call         : language predict.sfMsPGOcc(object = object, X.0 = X.0, coords.0 = coords.0, n.omp.threads = n.omp.threads,      verbose = | __truncated__
 $ object.class : chr "sfJSDM"
 - attr(*, "class")= chr "predict.sfJSDM"

We see the prediction object contains three main components:

z.0.samples: predictions of the binary presence/absence state at each of the prediction locations.
w.0.samples: predictions of the spatial factors at each of the prediction locations.
psi.0.samples: predictions of the occurrence probability (i.e., the back transformed linear predictor) across the prediction locations.

The psi.0.samples is what we will use to generate species distribution maps.

# Generate predictions of occurrence probability across Vermont
occ.pred.means <- apply(out.pred$psi.0.samples, c(2, 3), mean)
occ.pred.sds <- apply(out.pred$psi.0.samples, c(2, 3), sd)

n.pred <- nrow(X.pred)
plot.df <- data.frame(psi.mean = c(t(occ.pred.means)),
                      psi.sd = c(t(occ.pred.sds)),
                      x = rep(coords.0[, 1], times = length(sp.names)),
                      y = rep(coords.0[, 2], times = length(sp.names)),
                      species = factor(rep(sp.names, each = n.pred)))
pred.sf <- st_as_sf(plot.df, coords = c('x', 'y'))
ggplot() +
  geom_sf(data = pred.sf, aes(col = psi.mean)) +
  scale_color_viridis_c(option = 'plasma') +
  theme_bw(base_size = 9) +
  facet_wrap(~species, nrow = 2) +
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank()) +
  labs(x = "Longitude", y = "Latitude", col = "", title = 'Mean Occurrence Probability')

ggplot() +
  geom_sf(data = pred.sf, aes(col = psi.sd)) +
  scale_color_viridis_c(option = 'plasma') +
  theme_bw(base_size = 9) +
  facet_wrap(~species, nrow = 2) +
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank()) +
  labs(x = "Longitude", y = "Latitude", col = "", title = 'SD Occurrence Probability')

Lastly, we plot the three predicted spatial factors across the state.

# Generate predictions of spatial factors across Vermont
w.pred.means <- apply(out.pred$w.0.samples, c(2, 3), mean)
w.pred.sds <- apply(out.pred$w.0.samples, c(2, 3), sd)

n.pred <- nrow(X.pred)
plot.df <- data.frame(w.mean = c(t(w.pred.means)),
              w.sd = c(t(w.pred.sds)),
              x = rep(coords.0[, 1], times = n.factors),
              y = rep(coords.0[, 2], times = n.factors),
              id = factor(rep(1:n.factors, each = n.pred)))
pred.sf <- st_as_sf(plot.df, coords = c('x', 'y'))
ggplot() +
  geom_sf(data = pred.sf, aes(col = w.mean)) +
  scale_color_viridis_c(option = 'plasma') +
  theme_bw(base_size = 16) +
  facet_wrap(~id) +
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank()) +
  labs(x = "Longitude", y = "Latitude", col = "", title = 'Mean Spatial Factor')

ggplot() +
  geom_sf(data = pred.sf, aes(col = w.sd)) +
  scale_color_viridis_c(option = 'plasma') +
  theme_bw(base_size = 16) +
  facet_wrap(~id) +
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank()) +
  labs(x = "Longitude", y = "Latitude", col = "", title = 'SD Spatial Factor')

Here, the first two factors have quite a low effective spatial range, so little information is provided to non-sampled locations.

References

Bechtold, William A, and Paul L Patterson. 2005. The Enhanced Forest Inventory and Analysis Program–National Sampling Design and Estimation Procedures. 80. USDA Forest Service, Southern Research Station.

Datta, Abhirup, Sudipto Banerjee, Andrew O Finley, and Alan E Gelfand. 2016. “Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets.” Journal of the American Statistical Association 111 (514): 800–812.

Doser, Jeffrey W, Andrew O Finley, Marc Kéry, and Elise F Zipkin. 2022. “spOccupancy: An R package for single-species, multi-species, and integrated spatial occupancy models.” Methods in Ecology and Evolution 13 (8): 1670–8.

Polson, Nicholas G, James G Scott, and Jesse Windle. 2013. “Bayesian Inference for Logistic Models Using Pólya–Gamma Latent Variables.” Journal of the American Statistical Association 108 (504): 1339–49.

Modeling tree distributions across Vermont

Jeffrey W. Doser and Andrew O. Finley

May 15, 2023

Introduction

Exploratory data analysis

Model Fitting

Prediction

References