Creating input landscapes in gen3sis

29.08.2020

This vignette aims at showing how to use the function create_input_landscape that creates input landscape objects necessary to run gen3sis. These objects are:

To create these files, the function create_input_landscape requires as arguments:

Below, we will generate the R objects required by the function create_input_landscape step by step. For this exercise, we will use the simple center of the world landscape that we used in the introduction vignette and in the case study 1 of the main package manuscript1.

The first step is to load the necessary packages and set the working directory:

# first we load the required packages
library(gen3sis)
library(raster)

# next we set the working directory to the package path
datapath <- system.file("extdata", package = "gen3sis")
setwd(datapath)

Creating the environmental rasters bricks

In this step, the user needs to create the rastebriks containing the spatio-temporal distribution of the environmental variables of interest. Here, we will use the temperature, aridity and area rasters of the 4d sample world that has been used in the introduction vignette. We first load the three corresponding raster bricks from the hard drive.

temperature_brick <- brick("InputRasters/WorldCenter/temp_rasters.grd")
aridity_brick <- brick("InputRasters/WorldCenter/aridity_rasters.grd")
area_brick <- brick("InputRasters/WorldCenter/area_rasters.grd")

If you want to learn how you can create a virtual dynamic landscape from scratch, please refer to the design_landscape vignette.

Creating a list with the environmental raster bricks

To create the input landscape for gen3sis, all environmental variables have to be stored as raster files in a named list. It is important to add the name of the environmental variable you want to use in the list, which will be used when specifying the config file. Alternatively, you can also create a list which contains all of the raster file paths on your hard drive.

We will now create a list that contains all of the layers of our raster bricks for temperature, aridity and area.

landscapes_list <- list(temp = NULL, arid = NULL, area = NULL)
for (i in 1:nlayers(temperature_brick)) {
    landscapes_list$temp <- c(landscapes_list$temp, temperature_brick[[i]])
    landscapes_list$arid <- c(landscapes_list$arid, aridity_brick[[i]])
    landscapes_list$area <- c(landscapes_list$area, area_brick[[i]])
}

Defining a cost function

The second argument we need to define is a cost function. This function should define the connection cost between sites. The species dispersal ability will determine if sites are reachable (i.e. if the dispersal event is higher than the connection cost).

First, let’s look at a very simple case, where no additional cost is added to the geographical distances, meaning that connections costs equal distance in grid cell units multiplied by the raster resolution. The costs here will be defined in cell unit weighted by degrees, so that the same landscape under different spatial resolution in equivalent. For landscapes with a coordinate system (crs) the cost is defined in meters, since the coordinate system is in meters. Hence, we can define a simple cost function in which the dispersal is not penalized:

cost_function_null <- function(source, habitable_src, dest, habitable_dest) {
    return(1)  #represents cost 4 in a 4degree landscape
}
This figure shows the connection costs from one site in central Africa to all other sites. To travel to the site in South America that is indicated with the arrow, the travelling cost is 64. The distance matrix was computed using the very simple cost function that has been introduced before and is not adding any penalty.

This figure shows the connection costs from one site in central Africa to all other sites. To travel to the site in South America that is indicated with the arrow, the travelling cost is 64. The distance matrix was computed using the very simple cost function that has been introduced before and is not adding any penalty.

Now, let’s look at a slightly more sophisticated cost function. Only sites which have environmental values (no NAs) are considered habitable. Here, suitable terrestrial sites should have a cost penalty of one (1), but dispersal over water sites should double (2) the cost of dispersal. Since our example landscape has resolution of 4 degrees, terrestrial sites will have cost penalty of four (4) and aquatic sites of eight (8).

cost_function_water <- function(source, habitable_src, dest, habitable_dest) {
    if (!all(habitable_src, habitable_dest)) {
        return(2)
    } else {
        return(1)
    }
}

Whether a specific site can be reached or not depends on the properties of the population or species.

Further useful arguments

Besides the (i) landscape list, (ii) cost function and (iii) experiment folder, we can define further arguments such as: directions, time-steps and calculate_full_distance_matrices.

The argument ‘directions’ requires an integer providing the amount of directions used to calculate distances. It can take 4, 8 or 16 as values to consider the four, eighth or 16 neighboring sites. This is the same argument used by the function transition in the gdistance package (gdistance::transition).

The argument ‘timesteps’ requires a vector of strings and is used to name the files and to name the timesteps in the landscapes.rds landscape object created by the create_input_landscape function. It is not necessary to define this argument, but it helps readability when manipulating the object.

The argument ‘calculate_full_distance_matrices’ defines whether the create_input_landscape function will create one large distance matrix for all suitable sites (calculate_full_distance_matrices=TRUE) or a list of small distance matrices for the neighboring sites specified in ‘directions’ around each suitable site (calculate_full_distance_matrices=FALSE). The full distance matrix confers faster computing speed for the model, but requires a larger storage. The local distance matrices cause a slower computing speed, but require smaller storage. The slower speed emerges form the fact that a full distance matrix is then reconstructed by concatenating local distances matrices each time-step during a simulation run.

The real geographic coordinate system can be provided which allows for the global spherical distance correction. For real landscapes with valide coordinate system (e.g. crs=“+proj=longlat +datum=WGS84”), the distance unit is in meters (m). If you would use hypothetical landscapes, such as in the design_landscape vignette, geographic distance would be measured in units of sites multiplied by the grid cell resolution.

Create input files

The create_input_landscape function creates .rds files that will be used as input for gen3sis. The file path where the landscapes and the distance matrices of all time steps should be stored can be specified within the function.

create_input_landscape(landscapes = landscapes_list, cost_function = cost_function_water, 
    output_directory = file.path(tempdir(), "WorldCenter"), timesteps = paste0(round(seq(150, 
        100, length.out = 301), 2), "Ma"), calculate_full_distance_matrices = F)

Finally, don’t forget to specify the important metadata for the landscape you just created. We recommend storing it in form of a METADATA.txt file within the landscape folder.A template is automatically created by the functions create_input_landscape. Once the input landscape is ready, the run_simulation function can start the simulation using the path of the new landscape.


  1. O. Hagen, B. Flück, F. Fopp, J.S. Cabral, F. Hartig, M. Pontarp, T.F. Rangel, L. Pellissier. (2020). GENƎSIS: the GENeral Engine for Eco-Evolutionary SImulationS on the origins of biodiversity. (in prep)↩︎