In this vignette we will present the `inlabru`

implementation of the covariance-based rational SPDE approach. For further technical details on the covariance-based approach, see the Rational approximation with the `rSPDE`

package vignette and (Xiong, Simas, and Bolin 2022)(https://www.tandfonline.com/doi/full/10.1080/10618600.2019.1665537).

We begin by providing a step-by-step illustration on how to use our implementation. To this end we will consider a real world data set that consists of precipitation measurements from the Paraná region in Brazil.

After the initial model fitting, we will show how to change some parameters of the model. In the end, we will also provide an example in which we have replicates.

The examples in this vignette are the same as those in the R-INLA implementation of the rational SPDE approach vignette. As in that case, it is important to mention that one can improve the performance by using the PARDISO solver. Please, go to https://www.pardiso-project.org/r-inla/#license to apply for a license. Also, use `inla.pardiso()`

for instructions on how to enable the PARDISO sparse library.

To illustrate our implementation of `rSPDE`

in `inlabru`

we will consider a dataset available in `R-INLA`

. This data has also been used to illustrate the SPDE approach, see for instance the book Advanced Spatial Modeling with Stochastic Partial Differential Equations Using R and INLA and also the vignette Spatial Statistics using R-INLA and Gaussian Markov random fields. See also (Lindgren, Rue, and Lindström 2011)(https://rss.onlinelibrary.wiley.com/doi/full/10.1111/j.1467-9868.2011.00777.x) for theoretical details on the standard SPDE approach.

The data consist of precipitation measurements from the Paraná region in Brazil and were provided by the Brazilian National Water Agency. The data were collected at 616 gauge stations in Paraná state, south of Brazil, for each day in 2011.

We will follow the vignette Spatial Statistics using R-INLA and Gaussian Markov random fields. As precipitation data are always positive, we will assume it is Gamma distributed. `R-INLA`

uses the following parameterization of the Gamma distribution, \[\Gamma(\mu, \phi): \pi (y) = \frac{1}{\Gamma(\phi)} \left(\frac{\phi}{\mu}\right)^{\phi} y^{\phi - 1} \exp\left(-\frac{\phi y}{\mu}\right) .\] In this parameterization, the distribution has expected value \(E(x) = \mu\) and variance \(V(x) = \mu^2/\phi\), where \(1/\phi\) is a dispersion parameter.

In this example \(\mu\) will be modelled using a stochastic model that includes both covariates and spatial structure, resulting in the latent Gaussian model for the precipitation measurements \[\begin{align} y_i\mid \mu(s_i), \theta &\sim \Gamma(\mu(s_i),c\phi)\\ \log (\mu(s)) &= \eta(s) = \sum_k f_k(c_k(s))+u(s)\\ \theta &\sim \pi(\theta) \end{align},\]

where \(y_i\) denotes the measurement taken at location \(s_i\), \(c_k(s)\) are covariates, \(u(s)\) is a mean-zero Gaussian Matérn field, and \(\theta\) is a vector containing all parameters of the model, including smoothness of the field. That is, by using the `rSPDE`

model we will also be able to estimate the smoothness of the latent field.

We will be using `inlabru`

. The `inlabru`

package is available on CRAN and also on GitHub.

We begin by loading some libraries we need to get the data and build the plots.

```
library(gridExtra)
library(ggplot2)
library(lattice)
library(INLA)
library(inlabru)
library(splancs)
library(fields)
```

Let us load the data and the border of the region

```
data(PRprec)
data(PRborder)
```

The data frame contains daily measurements at 616 stations for the year 2011, as well as coordinates and altitude information for the measurement stations. We will not analyze the full spatio-temporal data set, but instead look at the total precipitation in January, which we calculate as

`<- rowMeans(PRprec[, 3 + 1:31]) Y `

In the next snippet of code, we extract the coordinates and altitudes and remove the locations with missing values.

```
<- !is.na(Y)
ind <- Y[ind]
Y <- as.matrix(PRprec[ind, 1:2])
coords <- PRprec$Altitude[ind] alt
```

Let us build a plot for the precipitations:

```
<- ggplot() +
p geom_point(aes(
x = coords[, 1], y = coords[, 2],
colour = Y
size = 2, alpha = 1) +
), scale_colour_gradientn(colours = tim.colors(100)) +
geom_path(aes(x = PRborder[, 1], y = PRborder[, 2])) +
geom_path(aes(x = PRborder[1034:1078, 1], y = PRborder[
1034:1078,
2
colour = "red")
]), tryCatch(
{print(p)
},error = function(e) {
print("Unable to build the plot")
} )
```

The red line in the figure shows the coast line, and we expect the distance to the coast to be a good covariate for precipitation.

This covariate is not available, so let us calculate it for each observation location:

```
<- apply(spDists(coords, PRborder[1034:1078, ],
seaDist longlat = TRUE
1, min) ),
```

Now, let us plot the precipitation as a function of the possible covariates:

```
par(mfrow = c(2, 2))
plot(coords[, 1], Y, cex = 0.5, xlab = "Longitude")
plot(coords[, 2], Y, cex = 0.5, xlab = "Latitude")
plot(seaDist, Y, cex = 0.5, xlab = "Distance to sea")
plot(alt, Y, cex = 0.5, xlab = "Altitude")
```

`par(mfrow = c(1, 1))`

To use the `inlabru`

implementation of the `rSPDE`

model we need to load the functions:

`library(rSPDE)`

To create a `rSPDE`

model, one would the `rspde.matern()`

function in a similar fashion as one would use the `inla.spde2.matern()`

function.

We can use `R-INLA`

for creating the mesh. Let us create a mesh which is based on a non-convex hull to avoid adding many small triangles outside the domain of interest:

```
<- inla.nonconvex.hull(coords, -0.03, -0.05, resolution = c(100, 100))
prdomain <- inla.mesh.2d(boundary = prdomain, max.edge = c(0.45, 1), cutoff = 0.2)
prmesh plot(prmesh, asp = 1, main = "")
lines(PRborder, col = 3)
points(coords[, 1], coords[, 2], pch = 19, cex = 0.5, col = "red")
```

In place of a `inla.stack`

, we can set up a `data.frame()`

to use `inlabru`

. We refer the reader to vignettes in https://inlabru-org.github.io/inlabru/index.html for further details.

```
<- data.frame(long = coords[,1], lat = coords[,2],
prdata seaDist = inla.group(seaDist), y = Y)
coordinates(prdata) <- c("long","lat")
```

To set up an `rSPDE`

model, all we need is the mesh. By default it will assume that we want to estimate the smoothness parameter \(\nu\) and to do a covariance-based rational approximation of order 2.

Later in this vignette we will also see other options for setting up `rSPDE`

models such as keeping the smoothness parameter fixed and/or increasing the order of the covariance-based rational approximation.

Therefore, to set up a model all we have to do is use the `rspde.matern()`

function:

`<- rspde.matern(mesh = prmesh) rspde_model `

Notice that this function is very reminiscent of `R-INLA`

’s `inla.spde2.matern()`

function.

We will assume the following linkage between model components and observations \[\eta(s) \sim A x(s) + A \text{ Intercept} + \text{seaDist}.\] \(\eta(s)\) will then be used in the observation-likelihood, \[y_i\mid \eta(s_i),\theta \sim \Gamma(\exp(\eta (s_i)), c\phi).\]

We will build a model using the distance to the sea \(x_i\) as a covariate through an improper CAR(1) model with \(\beta_{ij}=1(i\sim j)\), which `R-INLA`

calls a random walk of order 1. We will fit it in `inlabru`

’s style. For `inlabru`

version `2.5.3.9002`

or higher we can use the following compact synthax:

```
<- y ~ Intercept(1) + distSea(seaDist, model="rw1") +
cmp field(coordinates, model = rspde_model)
```

For the current CRAN version of `inlabru`

(version 2.5.3), one should use:

```
<- y ~ Intercept(1) + distSea(seaDist, model="rw1") +
cmp field(coordinates, model = rspde_model, mapper = bru_mapper(rspde_model))
```

To fit the model we simply use the `bru()`

function:

```
<- bru(cmp, data = prdata,
rspde_fit family = "Gamma",
options = list(
inla.mode = "experimental",
control.inla = list(int.strategy = "eb"),
verbose = FALSE)
)
```

We can look at some summaries of the posterior distributions for the parameters, for example the fixed effects (i.e. the intercept) and the hyper-parameters (i.e. dispersion in the gamma likelihood, the precision of the RW1, and the parameters of the spatial field):

`summary(rspde_fit)`

```
## inlabru version: 2.5.3
## INLA version: 22.09.15
## Components:
## Intercept: Model types main='linear', group='exchangeable', replicate='iid'
## distSea: Model types main='rw1', group='exchangeable', replicate='iid'
## field: Model types main='rgeneric', group='exchangeable', replicate='iid'
## Likelihoods:
## Family: 'Gamma'
## Data class: 'SpatialPointsDataFrame'
## Predictor: y ~ .
## Time used:
## Pre = 4.05, Running = 24.2, Post = 0.146, Total = 28.4
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## Intercept 1.941 0.055 1.834 1.941 2.048 1.941 0
##
## Random effects:
## Name Model
## distSea RW1 model
## field RGeneric2
##
## Model hyperparameters:
## mean sd 0.025quant
## Precision parameter for the Gamma observations 13.262 0.910 11.53
## Precision for distSea 11373.771 9664.782 2070.18
## Theta1 for field -1.217 0.627 -2.45
## Theta2 for field 1.122 0.314 0.49
## Theta3 for field -0.884 0.389 -1.64
## 0.5quant 0.975quant mode
## Precision parameter for the Gamma observations 13.242 15.115 13.222
## Precision for distSea 8660.828 36994.303 5074.032
## Theta1 for field -1.218 0.017 -1.220
## Theta2 for field 1.128 1.724 1.150
## Theta3 for field -0.887 -0.112 -0.897
##
## Deviance Information Criterion (DIC) ...............: 2501.52
## Deviance Information Criterion (DIC, saturated) ....: 4748.62
## Effective number of parameters .....................: 83.25
##
## Watanabe-Akaike information criterion (WAIC) ...: 2504.19
## Effective number of parameters .................: 75.91
##
## Marginal log-Likelihood: -1263.54
## is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
```

Let \(\theta_1 = \textrm{Theta1}\), \(\theta_2=\textrm{Theta2}\) and \(\theta_3=\textrm{Theta3}\). In terms of the SPDE \[(\kappa^2 I - \Delta)^{\alpha/2}(\tau u) = \mathcal{W},\] where \(\alpha = \nu + d/2\), we have that \[\tau = \exp(\theta_1),\quad \kappa = \exp(\theta_2), \] and by default \[\nu = 4\Big(\frac{\exp(\theta_3)}{1+\exp(\theta_3)}\Big).\] The number 4 comes from the upper bound for \(\nu\), which will be discussed later in this vignette.

In general, we have \[\nu = \nu_{UB}\Big(\frac{\exp(\theta_3)}{1+\exp(\theta_3)}\Big),\] where \(\nu_{UB}\) is the value of the upper bound for the smoothness parameter \(\nu\).

Another choice for prior for \(\nu\) is a truncated lognormal distribution and will also be discussed later in this vignette.

`rSPDE`

-`INLA`

resultsWe can obtain outputs with respect to parameters in the original scale by using the function `rspde.result()`

:

```
<- rspde.result(rspde_fit, "field", rspde_model)
result_fit summary(result_fit)
```

```
## mean sd 0.025quant 0.5quant 0.975quant mode
## tau 0.359183 0.243961 0.0872512 0.295841 1.00687 0.200143
## kappa 3.223380 1.011430 1.6418300 3.091880 5.57968 2.838560
## nu 1.193500 0.316382 0.6522140 1.166790 1.88107 1.103850
```

We can also plot the posterior densities:

```
par(mfrow = c(1, 3))
plot(result_fit, caption = c("tau", "kappa", "nu"))
```

This function is reminiscent to the `inla.spde.result()`

function with the main difference that it has the `summary()`

and `plot()`

methods implemented.

Let us now obtain predictions (i.e. do kriging) of the expected precipitation on a dense grid in the region.

We begin by creating the grid in which we want to do the predictions. To this end, we can use the `inla.mesh.projector()`

function:

```
<- c(150, 100)
nxy <- inla.mesh.projector(prmesh,
projgrid xlim = range(PRborder[, 1]),
ylim = range(PRborder[, 2]), dims = nxy
)
```

This lattice contains 150 × 100 locations. One can easily change the resolution of the kriging prediction by changing `nxy`

. Let us find the cells that are outside the region of interest so that we do not plot the estimates there.

`<- inout(projgrid$lattice$loc, cbind(PRborder[, 1], PRborder[, 2])) xy.in `

Let us plot the locations that we will do prediction:

```
<- projgrid$lattice$loc[xy.in, ]
coord.prd plot(coord.prd, type = "p", cex = 0.1)
lines(PRborder)
points(coords[, 1], coords[, 2], pch = 19, cex = 0.5, col = "red")
```

Let us now create a `data.frame()`

of the coordinates:

```
<- data.frame(x1 = coord.prd[,1],
coord.prd.df x2 = coord.prd[,2])
coordinates(coord.prd.df) <- c("x1", "x2")
```

Since we are using distance to the sea as a covariate, we also have to calculate this covariate for the prediction locations. Finally, we add the prediction location to our prediction `data.frame()`

, namely, `coord.prd.df`

:

```
<- apply(spDists(coord.prd,
seaDist.prd 1034:1078, ],
PRborder[longlat = TRUE
1, min)
), $seaDist <- seaDist.prd coord.prd.df
```

```
<- predict(rspde_fit, coord.prd.df,
pred_obs ~exp(Intercept + field + distSea))
```

Let us now build the data frame with the results:

```
<- pred_obs@data
pred_df <- cbind(pred_df, pred_obs@coords) pred_df
```

Finally, we plot the results. First the predicted mean:

```
<- ggplot(pred_df, aes(x = x1, y = x2, fill = mean)) +
p geom_raster() +
scale_fill_gradient(low = "yellow", high = "red")
p
```

Then, the std. deviations:

```
<- ggplot(pred_df, aes(x = x1, y = x2, fill = sd)) +
p geom_raster()
p
```

For this example we will simulate a data with replicates. We will use the same example considered in the Rational approximation with the `rSPDE`

package vignette (the only difference is the way the data is organized). We also refer the reader to this vignette for a description of the function `matern.operators()`

, along with its methods (for instance, the `simulate()`

method).

Let us consider a simple Gaussian linear model with 30 independent replicates of a latent spatial field \(x(\mathbf{s})\), observed at the same \(m\) locations, \(\{\mathbf{s}_1 , \ldots , \mathbf{s}_m \}\), for each replicate. For each \(i = 1,\ldots,m,\) we have

\[\begin{align} y_i &= x_1(\mathbf{s}_i)+\varepsilon_i,\\ \vdots &= \vdots\\ y_{i+29m} &= x_{30}(\mathbf{s}_i) + \varepsilon_{i+29m}, \end{align}\]

where \(\varepsilon_1,\ldots,\varepsilon_{30m}\) are iid normally distributed with mean 0 and standard deviation 0.1.

We use the basis function representation of \(x(\cdot)\) to define the \(A\) matrix linking the point locations to the mesh. We also need to account for the fact that we have 30 replicates at the same locations. To this end, the \(A\) matrix we need can be generated by `inla.spde.make.A()`

function. The reason being that we are sampling \(x(\cdot)\) directly and not the latent vector described in the introduction of the Rational approximation with the `rSPDE`

package vignette.

We begin by creating the mesh:

```
<- 200
m <- matrix(runif(m * 2), m, 2)
loc_2d_mesh <- inla.mesh.2d(
mesh_2d loc = loc_2d_mesh,
cutoff = 0.05,
offset = c(0.1, 0.4),
max.edge = c(0.05, 0.5)
)plot(mesh_2d, main = "")
points(loc_2d_mesh[, 1], loc_2d_mesh[, 2])
```

We then compute the \(A\) matrix, which is needed for simulation, and connects the observation locations to the mesh:

```
<- 30
n.rep <- inla.spde.make.A(
A mesh = mesh_2d,
loc = loc_2d_mesh,
index = rep(1:m, times = n.rep),
repl = rep(1:n.rep, each = m)
)
```

Notice that for the simulated data, we should use the \(A\) matrix from `inla.spde.make.A()`

function.

We will now simulate a latent process with standard deviation \(\sigma=1\) and range \(0.1\). We will use \(\nu=0.5\) so that the model has an exponential covariance function. To this end we create a model object with the `matern.operators()`

function:

```
<- 0.5
nu <- 1
sigma <- 0.1
range <- sqrt(8 * nu) / range
kappa <- sqrt(gamma(nu) / (sigma^2 * kappa^(2 * nu) * (4 * pi) * gamma(nu + 1)))
tau <- 2
d <- matern.operators(
operator_information mesh = mesh_2d,
nu = nu,
kappa = kappa,
sigma = sigma,
m = 2
)
```

More details on this function can be found at the Rational approximation with the rSPDE package vignette.

To simulate the latent process all we need to do is to use the `simulate()`

method on the `operator_information`

object. We then obtain the simulated data \(y\) by connecting with the \(A\) matrix and adding the gaussian noise.

```
set.seed(1)
<- simulate(operator_information, nsim = n.rep)
u <- as.vector(A %*% as.vector(u)) +
y rnorm(m * n.rep) * 0.1
```

The first replicate of the simulated random field as well as the observation locations are shown in the following figure.

```
<- par(mfrow = c(1, 2), mgp = c(1.2, 0.5, 0),
opar mar = c(2, 2, 0.5, 0.5) + 0.1)
<- inla.mesh.projector(mesh_2d, dims = c(100, 100))
proj image(inla.mesh.project(proj, field = as.vector(u[, 1])),
xlab = "", ylab = "",
cex.main = 0.8, cex.axis = 0.8, cex.lab = 0.8
)plot(loc_2d_mesh[, 1], loc_2d_mesh[, 2],
cex = 0.2, pch = 16, xlab = "", ylab = "",
cex.main = 0.8, cex.axis = 0.8, cex.lab = 0.8
)par(opar)
```

`inlabru`

rSPDE modelLet us then use the rational SPDE approach to fit the data.

We begin by creating the model object.

`<- rspde.matern(mesh = mesh_2d) rspde_model.rep `

Let us now create the `data.frame()`

and the vector with the replicates indexes:

```
<- data.frame(y = y, x1 = rep(loc_2d_mesh[,1], 30),
rep.df x2 = rep(loc_2d_mesh[,2], 30))
coordinates(rep.df) <- c("x1", "x2")
<- rep(1:30, each=200) repl
```

Let us create the component and fit. It is extremely important not to forget the `replicate`

when fitting model with the `bru()`

function. It will not produce warning and might fit some meaningless model.

```
# For inlabru 2.5.3.9002 or above:
<-
cmp.rep ~ -1 + field(coordinates,
y model = rspde_model.rep,
replicate = repl
)
# For inlabru 2.5.3
<-
cmp.rep ~ -1 + field(coordinates,
y model = rspde_model.rep,
replicate = repl,
mapper = bru_mapper(rspde_model.rep)
)
<-
rspde_fit.rep bru(cmp.rep,
data = rep.df,
family = "gaussian",
options = list(
list(inla.mode = "experimental")
) )
```

We can get the summary:

`summary(rspde_fit.rep)`

```
## inlabru version: 2.5.3
## INLA version: 22.09.15
## Components:
## field: Model types main='rgeneric', group='exchangeable', replicate='iid'
## Likelihoods:
## Family: 'gaussian'
## Data class: 'SpatialPointsDataFrame'
## Predictor: y ~ .
## Time used:
## Pre = 3.61, Running = 305, Post = 9.74, Total = 318
## Random effects:
## Name Model
## field RGeneric2
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant
## Precision for the Gaussian observations 90.77 4.135 81.68 91.03
## Theta1 for field -2.95 0.096 -3.17 -2.95
## Theta2 for field 3.13 0.016 3.10 3.13
## Theta3 for field -1.70 0.040 -1.77 -1.70
## 0.975quant mode
## Precision for the Gaussian observations 100.64 90.21
## Theta1 for field -2.79 -2.91
## Theta2 for field 3.16 3.13
## Theta3 for field -1.61 -1.71
##
## Deviance Information Criterion (DIC) ...............: -5937.44
## Deviance Information Criterion (DIC, saturated) ....: -5963.13
## Effective number of parameters .....................: 4400.46
##
## Watanabe-Akaike information criterion (WAIC) ...: -6700.54
## Effective number of parameters .................: 2674.35
##
## Marginal log-Likelihood: -4401.07
## is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
```

and the summary in the user’s scale:

```
<- rspde.result(rspde_fit.rep, "field", rspde_model.rep)
result_fit_rep summary(result_fit_rep)
```

```
## mean sd 0.025quant 0.5quant 0.975quant mode
## tau 0.0524607 0.00492103 0.0423617 0.0527179 0.061389 0.0538517
## kappa 22.8835000 0.36844300 22.1646000 22.8821000 23.611600 22.8807000
## nu 0.6176790 0.02095560 0.5812400 0.6159060 0.663042 0.6114320
```

```
<- data.frame(
result_df parameter = c("tau", "kappa", "nu"),
true = c(tau, kappa, nu),
mean = c(
$summary.tau$mean,
result_fit_rep$summary.kappa$mean,
result_fit_rep$summary.nu$mean
result_fit_rep
),mode = c(
$summary.tau$mode,
result_fit_rep$summary.kappa$mode,
result_fit_rep$summary.nu$mode
result_fit_rep
)
)print(result_df)
```

```
## parameter true mean mode
## 1 tau 0.08920621 0.05246071 0.05385169
## 2 kappa 20.00000000 22.88345226 22.88065177
## 3 nu 0.50000000 0.61767858 0.61143178
```

`inlabru`

implementationThere are several additional options that are available. For instance, it is possible to change the order of the rational approximation, the upper bound for the smoothness parameter (which may speed up the fit), change the priors, change the type of the rational approximation, among others. These options are described in the “Further options of the `rSPDE`

-`INLA`

implementation” section of the R-INLA implementation of the rational SPDE approach vignette. Observe that all these options are passed to the model through the `rspde.matern()`

function, and therefore the resulting model object can directly be used in the `bru()`

function, in an identical manner to the examples above.

Lindgren, Finn, Håvard Rue, and Johan Lindström. 2011. “An Explicit Link Between Gaussian Fields and Gaussian Markov Random Fields: The Stochastic Partial Differential Equation Approach.” *Journal of the Royal Statistical Society. Series B. Statistical Methodology* 73 (4): 423–98.

Xiong, Zhen, Alexandre B. Simas, and David Bolin. 2022. “Covariance-Based Rational Approximations of Fractional SPDEs for Computationally Efficient Bayesian Inference.” *arXiv:2209.04670*.