The `rtrim`

package is an complete reimplementation of the
original
TRIM software developed by Jeroen Pannekoek and Arco van Strien from
the 1990’s onwards. This vignette provides a quick getting started
manual that demonstrates the R-based workflow for computing TRIM
models.

- An extensive introduction showing many of the options can be found in the trim by example vignette.
- To use legacy TRIM command files and TRIM data files, see the section on tcf files.

TRIM was developed to estimate animal populations, based on repeated counts at various sites while counts may be missing for certain sites at certain times. Estimation is based on a model-based imputation method.

We assume that the reader is already familiar with the methodology behind TRIM but in short, TRIM estimates a piecewise loglinear growth model to compute imputations. There are three variants of this model which differ by their basic assumptions.

**Model 1:**Populations vary accross sites, but not over time.**Model 2:**Populations vary accross sites, but show the same growth everywhere. Growth rates are constant during specifief time intervals**Model 3:**Similar, but time effects are independent for each time point.

Note that both Model 1 and Model 3 can be seen as special cases of Model 2 (Model 1 is equivalent with Model 2 when where time effects or growth rate is set to zero; Model 3 is equivalent with Model 2 when growth rates are assumed to change every time point).

For each variant it is possible to include categorical covariates in the model, or to weight sites. Certain simplifying assumptions are made to keep computations tractable. A detailed description of the methodology can be found in the original TRIM3 manual.

We are going to use the `skylark`

dataset, which is
included with the package.

```
library(rtrim)
data(skylark)
head(skylark,3) # inspect the dataset
#> site time count Habitat Deposition
#> 1 1 1 11 2 2
#> 2 1 2 8 2 2
#> 3 1 3 5 2 2
```

Here, `skylark`

is a regular R
`data.frame`

.

The central function for computing TRIM models is called
`trim`

. Calling this function is very similar to calling
basic R modeling functions like `lm`

. Here, we compute TRIM
model 2.

Note that the data is passed to `trim`

as an R data.frame.
Information on which columns in the data frame represent the counts, the
site ID’s etc is encoded in the first argument, which is of the special
type `formula’. Because site identifiers and time points are treated
differently by the model, the order matters (see also model specification).

Alternatively, one can just pass the data frame as argument 1, and
explictly tell `trim`

in which columns the counts etc
are:

Note that although the name `year_col`

suggests that
counts must be on an annual interval, this is not necesarily the
case.

The result is an object of class `trim`

. Just like with
objects of class `lm`

, its various components can be
extracted using specialized functions. Here are some examples.

```
summary(m1) # summarize the model
#> Call:
#> tools::buildVignettes(dir = ".", tangle = TRUE)
#>
#> Model : 2
#> Method : ML (Convergence reached after 3 iterations)
#>
#> Coefficients:
#> from upto add se_add mul se_mul
#> 1 1 8 0.05482546 0.01043636 1.056356 0.01102452
#>
#>
#> Goodness of fit:
#> Chi-square = 210.53, df=146, p=0.0004
#> Likelihood Ratio = 204.63, df=146, p=0.0010
#> AIC (up to a constant) = -87.37
```

```
totals(m1) # Return time-totals
#> time imputed se_imp
#> 1 1 438 21
#> 2 2 392 20
#> 3 3 432 21
#> 4 4 433 21
#> 5 5 474 22
#> 6 6 521 23
#> 7 7 556 25
#> 8 8 591 28
```

```
gof(m1) # Retrieve goodness-of-fit
#> Goodness of fit:
#> Chi-square = 210.53, df=146, p=0.0004
#> Likelihood Ratio = 204.63, df=146, p=0.0010
#> AIC (up to a constant) = -87.37
```

```
coefficients(m1) # Extract the coefficients
#> from upto add se_add mul se_mul
#> 1 1 8 0.05482546 0.01043636 1.056356 0.01102452
```

These are just a few of of the functions that can be used to analyse the model. See any of their help files for a complete list of links to all analyses functions.

The names of variables in the dataset are not important and neither is their order. However, since TRIM models are designed to estimate the number of counts at counting sites, the formula specifying the model has to satisfy certain rules.

- The single variable at the left-hand side of hr tilde must represent the counted numbers.
- The
**first variable**on the right-hand of the tilde side must represent the**site**variable. - The
**second variable**on the right-hand side must represent the**time**identifier. - All other variables on the right-hand-side are interpreted as covariates.

For example, to use the variable `Habitat`

as covariate
when analysing the `skylark`

dataset (under model 2) one does
the following.

It is also possible to apply weights by specifyinga
`weights`

argument. The TRIM options `overdisp`

(for overdispersion) and `serialcor`

(for serial
correlation), are simple `TRUE/FALSE`

toggles. The breaks of
the piecewise loglinear model can be specified with the
`changepoints`

option. The `trim`

function will
give an error when too little observations are present in a time
segment, except when the `autodelete`

option is set to
`TRUE`

. In that case time segments are combined until enough
observations are present for a model to be estimated. See
`?trim`

for a precise description of all options. Below is an
example where we specify the maximum number of changepoints and let
`trim`

delete change points where necessary.

```
m3 <- trim(count ~ site + time + Habitat, data=skylark, model=2
, overdisp = TRUE, serialcor = TRUE, changepoints=1:7, autodelete=TRUE)
m3$changepoints
#> [1] 1 2 3 4 5 6 7
```

In this case, no change points are deleted.

In this example, the data sets consists of 8 time points, so time
points 1 to 7 are explicitly specified as change point. This notation,
which requires the prior identification of the number of time points
present within the data, can be replaced by the more convenient
expression `changepoints="all"`

.

Alternatively the `stepwise`

algorithm can be used. This
algorithm removes changepoints when the slope does not change
significantly from before to after a changepoint, yielding a simpler
(more sparse) model.

```
m4 <- trim(count ~ site + time + Habitat, data=skylark, model=2
, overdisp = TRUE, serialcor = TRUE, changepoints=1:7, stepwise = TRUE)
m4$changepoints
#> [1] 1 2
```

Again, the explicit setting of initial changepoints can be replaced
by the more convenient `changepoints="auto"`

, which combines
`changepoints="all"`

with `stepwise=TRUE`

.

The original TRIM software can be controlled with text files
containing a series of commands that specify both the location and
format of the data, an the model (or models) to compute. Such TRIM
command files (usually stored with the extension `.tcf`

)
should be considered legacy but for backwards compatability they can be
used from R.

To try this, execute the code below to create a `tcf`

file
and a TRIM data file in the current working directory of R.

```
library(rtrim)
tmp <- "FILE skylark.dat
TITLE skylark-1d
NTIMES 8
NCOVARS 2
LABELS
Habitat
Cov2
END
MISSING 999
WEIGHT Absent
COMMENT Example 1; using linear trend model
WEIGHTING off
OVERDISP on
SERIALCOR on
MODEL 2
"
write(tmp,file="skylark.tcf")
data(skylark)
skylark[is.na(skylark)] <- 999
write.table(skylark,file="skylark.dat",col.names=FALSE,row.names=FALSE)
```

Executing a TRIM command file is as easy as reading the file using
`read_tcf`

and passing the result to `trim`

.

The resulting `trim`

object can be evaluated as described
above. For example

The object `tc`

, resulting from `read_tcf`

is
an object of class `trimcommand`

. It stores all commands
defined in the TRIM command file. Note that logical parameters such as
`WEIGHT`

are transformed to `logical`

in R.

```
tc
#> Object of class trimcommand:
#> file: skylark.dat
#> title: skylark-1d
#> ntimes: 8
#> ncovars: 2
#> labels: Habitat, Cov2
#> missing: 999
#> weight: FALSE
#> comment: Example 1; using linear trend model
#> weighting: FALSE
#> serialcor: TRUE
#> overdisp: TRUE
#> basetime:
#> model: 2
#> covariates:
#> changepoints:
#> stepwise: FALSE
#> autodelete: FALSE
#> outputfiles:
#> overallchangepoints:
#> impcovout: FALSE
#> covin: FALSE
```

**NOTE.** Be aware that R has its own present working
directory. If relative paths (that is, file names not starting with the
full path to their location) are used in the TRIM command file, R will
interpret them as relative to the current working directory.

TRIM data files are basically space-separated, tabular textfiles
where the order and type of columns is fixed by a few parameters. Given
such a specification, a file can be read with `read_tdf`

.

An overview of count data can be obtained with the function
`count_summary`

```
data(skylark)
count_summary(skylark)
#> Total number of sites 55
#> Sites without positive counts (0):
#> Number of observed zero counts 0
#> Number of observed positive counts 202
#> Total number of observed counts 202
#> Number of missing counts 238
#> Total number of counts 440
```

The result is an overview similar to the one that used to be printed at the start of TRIM output files.

The TRIM model can only be computed when sufficient data is present.
With the function `check_observations`

one can check if a
certain model can be computed. Note the use of `year_col`

to
specify a non-default column name.

```
check_observations(skylark, model=2, year_col="time", changepoints=c(1,4))
#> $sufficient
#> [1] TRUE
#>
#> $errors
#> $errors$changepoint
#> numeric(0)
```

The result is a `list`

with boolean element
`sufficient`

. If `sufficient==FALSE`

, the element
`errors`

contains a `data.frame`

with the
sites/times/covariates with insufficient counts.