Global Influence

In the mixpoissonreg package the following global influence methods are implemented: `hatvalues`, `cooks.distance` and `influence`. Below, we discuss the implementations of these methods in detail. Notice that in this vignette we do not provide interpretations of the influence results, we focus on how to compute the measures. For interpretations of the influence measures we refer the reader to the Analyzing overdispersed count data with the mixpoissonreg package vignette. We also do not provide examples on customization of the plots since virtually all the arguments related to the customization of plots for the `local_influence_plot` and `local_influence_autoplot` methods are the same as their counterparts `plot` and `autoplot`, respectivelly. Therefore, we refer the reader to the Building and customizing base-R diagnostic plots with the mixpoissonreg package and Building and customizing ggplot2-based diagnostic plots with the mixpoissonreg package for examples on how to customize the plots (change point and line types, colors, sizes, as well as change titles, captions, colors, etc.).

`hatvalues` method

To define “hat values” for mixed Poisson regression models, we follow Zhu et al. (2001) to consider the negative of the hessian of the Q-function as weight matrix, and follow Pregibon (1981) to define the “hat” matrix with respect to this weight matrix. We can consider the hessian of the Q-function with respect to mean-related parameters, which is the default. We can also consider the hessian of the Q-function with respect to the precision-related parameters to give rise to hat values related to the precision parameters.

To obtain the mean-related hat values one simply calls the `hatvalues` method:

``````library(mixpoissonreg)

fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog,
data = Attendance, em_controls = list(maxit=1))

#>           1           2           3           4           5           6
#> 0.009079739 0.006988289 0.004624420 0.006642726 0.009645113 0.026264887``````

We are using the option `em_controls = list(maxit=1)` to reduce computational cost.

The hat values are used to obtain Cook’s distance. One can also use the hat values to define leverage-adjusted residuals by dividing the residuals by `sqrt(hatvalues(fitted_object))`.

To obtain precision-related hat values one must set `parameters` to “precision”:

``````head(hatvalues(fit, parameters = "precision"))
#>           1           2           3           4           5           6
#> 0.003308847 0.004137405 0.015968674 0.012503519 0.020321253 0.007444494``````

`cooks.distance` method

The implementation of the `cooks.distance` method for `mixpoissonreg` models contains several “Cook’s distance”-like measures. More precisely, it contains the standard Cook’s distance, the generalized Cook’s distance, the likelihood displacement and the Q-displacement.

Cook’s distance

The implementation of the standard Cook’s distance uses the usual formula for Cook’s distance in terms of the “hat” matrix, where the “hat” matrix is the one given above. The standard Cook’s distance returned by default in the `cooks.distance` method. So, to obtain Cook’s distance, we simply call the `cooks.distance` method:

``````head(cooks.distance(fit))
#>            1            2            3            4            5            6
#> 7.412521e-05 1.938596e-04 6.180137e-04 6.839692e-04 1.170998e-03 5.061764e-03``````

Since there is also a “hat” matrix with respect to the precision parameters, we may compute Cook’s distance using the hat values with respect to the precision parameters. To compute Cook’s distance with the “hat” matrix with respect to the precision parameters we simply set the `hat` argument to “precision”:

``````head(cooks.distance(fit, hat = "precision"))
#>            1            2            3            4            5            6
#> 2.670088e-05 1.141181e-04 2.183563e-03 1.302754e-03 2.521237e-03 1.380809e-03``````

generalized Cook’s distance and Q-displacement

The Generalized Cook’s distance and Q-displacement (also called Q-distance) for EM-based models were defined in Zhu et al. (2001) and computed for mixed Poisson regression models in Barreto-Souza and Simas (2016). We implemented first-order approximation to these quantities to make it computationally feasible. These first-order approximations are available in Barreto-Souza and Simas (2016). We also provide versions of generalized Cook’s distance for mean-related or precision-related parameters, whose details can be found in Barreto-Souza and Simas (2016).

To compute the generalized Cook’s distance with respect to the mean and precision parameters jointly, simply set the `type` argument to “GCD”:

``````head(cooks.distance(fit, type = "GCD"))
#>           1           2           3           4           5           6
#> 0.004246929 0.002821274 0.002072301 0.001018638 0.001979334 0.003445122``````

To compute the generalized Cook’s distance with respect to the mean-related parameters, set the `type` argument to “GCDmean”:

``````head(cooks.distance(fit, type = "GCDmean"))
#>            1            2            3            4            5            6
#> 6.176462e-05 1.885868e-04 1.180570e-03 1.009909e-03 1.806187e-03 1.660843e-03``````

and to compute the generalized Cook’s distance with respect to the precision-related parameters, set the `type` argument to “GCDprecision”:

``````head(cooks.distance(fit, type = "GCDprecision"))
#>            1            2            3            4            5            6
#> 4.185164e-03 2.632687e-03 8.917309e-04 8.728862e-06 1.731461e-04 1.784279e-03``````

To compute the Q-displacement one sets the `type` argument to “QD”:

``````head(cooks.distance(fit, type = "QD"))
#>           1           2           3           4           5           6
#> 0.004223071 0.002777084 0.002040897 0.001022165 0.001969990 0.003508785``````

Likelihood displacement

The likelihood displacement (also called likelihood distance) was defined in Cook and Weisberg (1982). To compute this measure, one simply set the `type` argument to “LD”:

``````head(cooks.distance(fit, type = "LD"))
#>            1            2            3            4            5            6
#> 0.0019140235 0.0013801409 0.0006543277 0.0001519963 0.0003607121 0.0012554162``````

`influence` method

The `influence` method returns a `list` with several quantities:

• hat.mean hat values with respect to the mean;
• hat.precision hat with respect to the precision parameters;
• pear.res Pearson residuals;
• score.res Score residuals

and if the argument `do.coef` is `TRUE` the returned `list` also contains:

• coefficients.mean first-order approximation to the impact on the estimate of each mean-related parameter if each observation is removed;
• coefficients.precision first-order approximation to the impact on the estimate of each precision-related parameter if each observation is removed.

For the two elements above, the ith row corresponds to the parameter estimates with the ith observation removed.

The `influence` method has only one argument, `do.coef`, which, by default, is set to `TRUE` since its computation is not computationally intensive.

Let us call this method on the `fit` object:

``````influence_fit <- influence(fit)

#>   (Intercept) gendermale         math progAcademic progVocational
#> 1    2.745887 -0.2448940 -0.006613360   -0.4258482      -1.269851
#> 2    2.746077 -0.2447048 -0.006619964   -0.4257226      -1.269747
#> 3    2.747010 -0.2459932 -0.006631644   -0.4252007      -1.269402
#> 4    2.747015 -0.2459013 -0.006632624   -0.4252938      -1.269393
#> 5    2.747544 -0.2460342 -0.006644830   -0.4252217      -1.269154
#> 6    2.746786 -0.2443429 -0.006641383   -0.4268347      -1.269387``````

Global influence plots

The main global influence plots are implemented in the `plot` and `autoplot` methods. They are the plots number 3, 4 and 5, which are, respectively, the Cook’s distance plot, the generalized Cook’s distance plot and Cook’s distance vs generalized Cook’s distance. The `plot` and `autoplot` methods provide the same plots, the difference between them being that the former uses R’s base graphics whereas the latter uses the `ggplot2` package.

Let us build these plots:

``plot(fit, which = c(3,4,5))``

and

``autoplot(fit, which = c(3,4,5))``

These plots identify the most extreme points. By the default they identify 3 points, but the number of identified points can be changed by setting the `id.n` argument to the desired value for the `plot` method and by setting the `label.n` argument to the desired value for the `autoplot` method:

``plot(fit, which = c(3,4,5), id.n = 5)``

and

``autoplot(fit, which = c(3,4,5), label.n = 5)``

For further details customizing plots of `mixpoissonreg` objects, we refer the reader to the Building and customizing base-R diagnostic plots with the mixpoissonreg package and Building and customizing ggplot2-based diagnostic plots with the mixpoissonreg package vignettes.

We now turn to the problem of plotting Q-displacements and likelihood displacements. Both of these plots can easily be built “by hand”.

R’s base graphics:

``````qd_fit <- cooks.distance(fit, type = "QD")

# Get the extreme points:
extreme_points <- as.vector(Rfast::nth(abs(qd_fit), k = 5,
num.of.nths = 5,
index.return = TRUE, descending = TRUE))
idx_y <- qd_fit[extreme_points]

ylim <- range(qd_fit, na.rm = TRUE)

ylim <- extendrange(r = ylim, f = 0.15)

plot(qd_fit, xlab = "Obs. number", ylab = "Q-displacement", ylim = ylim, type = "h")
text(extreme_points, idx_y, labels = extreme_points, pos = 3, offset = 0.2)``````

and

``````ld_fit <- cooks.distance(fit, type = "LD")

# Get 5 most extreme points:
extreme_points <- as.vector(Rfast::nth(abs(ld_fit), k = 5,
num.of.nths = 5,
index.return = TRUE, descending = TRUE))
idx_y <- ld_fit[extreme_points]

ylim <- range(ld_fit, na.rm = TRUE)

ylim <- extendrange(r = ylim, f = 0.15)

plot(ld_fit, xlab = "Obs. number", ylab = "Likelihood displacement", ylim = ylim, type = "h")
text(extreme_points, idx_y, labels = extreme_points, pos = 3, offset = 0.2)``````

Now the `ggplot2` version:

``````library(dplyr)
library(ggplot2)
library(ggrepel)

qd_fit <- cooks.distance(fit, type = "QD")

qd_tbl <- tibble("Q-displacement" = qd_fit, "Obs. number" = 1:length(qd_fit))

# Get 5 most extreme points
qd.extreme <- arrange(qd_tbl, desc(`Q-displacement`))

ggplot(qd_tbl, aes(x = `Obs. number`, y = `Q-displacement`)) +
geom_linerange(aes(ymin = 0, ymax = `Q-displacement`)) +
geom_text_repel(data = qd.extreme, aes(label = `Obs. number`))``````

and

``````ld_fit <- cooks.distance(fit, type = "LD")

ld_tbl <- tibble("Likelihood displacement" = ld_fit, "Obs. number" = 1:length(ld_fit))

# Get 5 most extreme points
ld.extreme <- arrange(ld_tbl, desc(`Likelihood displacement`))

ggplot(ld_tbl, aes(x = `Obs. number`, y = `Likelihood displacement`)) +
geom_linerange(aes(ymin = 0, ymax = `Likelihood displacement`)) +
geom_text_repel(data = ld.extreme, aes(label = `Obs. number`))``````

Local Influence

The mixpoissonreg package contains the `local_influence` method implemented. This method contains conformal normal curvatures and normal curvatures under several perturbation schemes. It returns a list whose elements are the perturbation schemes contained in the `perturbation` argument.

The local influence was introduced in the seminal paper Cook (1986), in which the normal curvatures under different perturbations schemes were introduced. Cook suggested the analysis of the perturbation schemes in the directions of largest curvatures. Lesaffre and Verbeke (1998) suggested the analysis for each canonical direction and called this “total local influence”. Poon and Poon (1999) introduced the conformal normal curvature which has the advantadge of taking values in the standard unit interval [0,1]. Finally, Zhu and Lee (2001) introduced local influence measures for EM-based models. They computed normal and conformal normal curvatures for EM-based models.

Below, we discuss the arguments of the `local_influence` method in detail. Notice that in this vignette we do not provide interpretations of the influence results, we focus on how to compute the measures. For interpretations of the influence measures we refer the reader to the Analyzing overdispersed count data with the mixpoissonreg package vignette.

Curvature

By default, the `local_influence` method for `mixpoissonreg` objects returns the conformal normal curvature since it takes values in the standard unit interval [0,1]. For conformal normal curvatures the benchmark suggested by Zhu and Lee (2001) is returned as the `benchmark` attribute to each returned element, i.e., for each perturbation considered.

For example:

``````fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog,
data = Attendance, em_controls = list(maxit=1))

loc_inf_fit <- local_influence(fit)

ls(loc_inf_fit)
#> [1] "case_weights"             "hidden_variable"
#> [3] "mean_explanatory"         "precision_explanatory"
#> [5] "simultaneous_explanatory"

#>            1            2            3            4            5            6
#> 0.0014599089 0.0009698309 0.0007123667 0.0003501633 0.0006804085 0.0011842828

attr(loc_inf_fit\$case_weights, "benchmark")
#> [1] 0.01398722``````

To compute the normal curvature, simply set the `curvature` argument to “normal”. For normal curvature the `benchmark` attribute contains the benchmark suggested by Verbeke and Molenberghs (2000, sect. 11.3). For example:

``````loc_inf_normal_fit <- local_influence(fit, curvature = "normal")

ls(loc_inf_normal_fit)
#> [1] "case_weights"             "hidden_variable"
#> [3] "mean_explanatory"         "precision_explanatory"
#> [5] "simultaneous_explanatory"

#>           1           2           3           4           5           6
#> 0.004246929 0.002821274 0.002072301 0.001018638 0.001979334 0.003445122

attr(loc_inf_normal_fit\$case_weights, "benchmark")
#> [1] 0.0185289``````

Direction

The default direction of `local_influence` method for `mixpoissonreg` objects is the canonical direction, that is, it computes the total local influence (Lesaffre and Verbeke, 1998).

For canonical directions the `local_influence` method also returns benchmarks for each perturbation scheme, following Zhu and Lee (2001) for conformal normal curvatures and Verbeke and Molenberghs (2000, sect. 11.3) for normal curvatures.

To change the directions to the directions of largest curvatures one must set the `direction` argument to “max.eigen”.

For mixed Poisson regression models (and actually for very large class of regression models) both the normal and conformal normal curvatures are quadratic forms with respect to the direction, that is, by seeing the direction as the variable. The name “max.eigen” for the direction of largest curvature comes from the fact that the direction of largest curvature is the direction of the eigenvector of the associated quadratic form associated to the eigenvalue with largest absolute value.

Notice that for direction of largest curvature no benchmark is returned, so if one calls the `benchmark` attribute, the returned value will be `NA`:

``````# Conformal normal curvature

loc_inf_fit_larg_curv <- local_influence(fit, direction = "max.eigen")

ls(loc_inf_fit_larg_curv)
#> [1] "case_weights"             "hidden_variable"
#> [3] "mean_explanatory"         "precision_explanatory"
#> [5] "simultaneous_explanatory"

#>            1            2            3            4            5            6
#> -0.057396493 -0.068340893  0.021970638 -0.004724979  0.006115105 -0.015803334

attr(loc_inf_fit_larg_curv\$case_weights, "benchmark")
#> [1] NA

# Normal curvature

loc_inf_normal_fit_larg_curv <- local_influence(fit, curvature = "normal",
direction = "max.eigen")

ls(loc_inf_normal_fit_larg_curv)
#> [1] "case_weights"             "hidden_variable"
#> [3] "mean_explanatory"         "precision_explanatory"
#> [5] "simultaneous_explanatory"

#>            1            2            3            4            5            6
#> -0.057396493 -0.068340893  0.021970638 -0.004724979  0.006115105 -0.015803334

attr(loc_inf_normal_fit_larg_curv\$case_weights, "benchmark")
#> [1] NA``````

Perturbation

The available perturbation schemes are “case_weights”, “hidden_variable”, “mean_explanatory”, “precision_explanatory”, “simultaneous_explanatory”. For mixed Poisson regression models, these perturbation schemes are given in Barreto-Souza and Simas (2016). The case weights and explanatory variable perturbations were introduced by Cook (1986) whereas the hidden variable perturbation requires a latent variable and was introduced by Barreto-Souza and Simas (2016). Notice that there is not a response variable perturbation scheme since the response variable is discrete, so it does not make sense to do such a perturbation.

By default, the `local_influence` method for `mixpoissonreg` object returns a list with all the available perturbation schemes.

``````loc_inf_fit <- local_influence(fit)

ls(loc_inf_fit)
#> [1] "case_weights"             "hidden_variable"
#> [3] "mean_explanatory"         "precision_explanatory"
#> [5] "simultaneous_explanatory"``````

Notice that if there is no precision covariates, then the returned values for “precision_explanatory” and “simultaneous_explanatory” are `NA` and a warning message is generated:

``````fit2 <- mixpoissonreg(daysabs ~ gender + math + prog,
data = Attendance,
em_controls = list(maxit=1))

loc_inf_fit2 <- local_influence(fit2)

ls(loc_inf_fit2)
#> [1] "case_weights"             "hidden_variable"
#> [3] "mean_explanatory"         "precision_explanatory"
#> [5] "simultaneous_explanatory"

#>            1            2            3            4            5            6
#> 0.0007928476 0.0007005787 0.0006602506 0.0005594278 0.0008113136 0.0017174080

#>  1  2  3  4  5  6
#> NA NA NA NA NA NA

#>  1  2  3  4  5  6
#> NA NA NA NA NA NA``````

To select a subset of perturbations, enter a list or vector with the desired perturbations as the `perturbation` argument:

``````loc_inf_1 <- local_influence(fit, perturbation = c("case_weights", "hidden_variable"))

ls(loc_inf_1)
#> [1] "case_weights"    "hidden_variable"

#>            1            2            3            4            5            6
#> 0.0014599089 0.0009698309 0.0007123667 0.0003501633 0.0006804085 0.0011842828

loc_inf_2 <- local_influence(fit, perturbation = c("case_weights", "hidden_variable"),
curvature = "normal",
direction = "max.eigen")

ls(loc_inf_2)
#> [1] "case_weights"    "hidden_variable"

#>            1            2            3            4            5            6
#> -0.057396493 -0.068340893  0.021970638 -0.004724979  0.006115105 -0.015803334``````

Parameters

The `parameters` argument of the `local_influence` method for `mixpoissonreg` objects refer to the “case_weights” and “hidden_variable” perturbation schemes. One can obtain case weights and hidden variable perturbations with respect to all parameters, with respect to the mean-related parameters and with respect to the precision-related parameters.

By default, the case weights and hidden variable perturbations are returned with respect to all parameters. To return case weights and hidden variable perturbations with respect to the mean-related parameters, one must set the `parameters` argument to “mean”:

``````loc_inf_fit_mean <- local_influence(fit, parameters = "mean")

#>            1            2            3            4            5            6
#> 6.649242e-05 2.030222e-04 1.270937e-03 1.087213e-03 1.944443e-03 1.787973e-03``````

Analogously, to return case weights and hidden variable perturbations with respect to the precision-related parameters, one must set the `parameters` argument to “precision”:

``````loc_inf_fit_precision <- local_influence(fit, parameters = "precision")

#>            1            2            3            4            5            6
#> 2.113570e-03 1.329546e-03 4.503374e-04 4.408205e-06 8.744134e-05 9.010877e-04``````

Perturbation schemes with respect to a subset of the covariates

The `mean.covariates` argument refers to the “mean.explanatory” and “simultaneous.explanatory” perturbations schemes, whereas the `precision.covariates` argument refers to the “precision.explanatory” and “simultaneous.explanatory” perturbations schemes. The idea is that only the covariates listed in these arguments will be perturbed, so one can see the influence of observations with respect to those covariates. If `mean.covariates` is `NULL`, then all the mean-related covariates will be considered. Analogously, if `precision.covariates` is `NULL`, then all the precision-related covariates will be considered.

Remark: Notice that factor and integer-valued covariates should not be perturbed since it does not make sense to make infinitesimal perturbations on them. You should only perturb continuous variables.

By default both `mean.covariates` and `precision.covariates` are `NULL`, so all non-factor covariates are perturbed. Notice that integer non-factor covariates will be perturbed, so you should manually remove them.

Let us consider, for example, the following model:

``````    set.seed(1234)

x1 <- rexp(200, rate = 2)
x2 <- rnorm(200)
x3 <- factor(as.integer(2*runif(200) + 1))
x4 <- as.integer(10*runif(200))

y <- stats::rnbinom(200, mu = exp(1-x1-x2-(x3==2)+0.1*x4),
size = exp(1+2*x1+x2))

fit_example <- mixpoissonreg(y ~ x1 + x2 + x3 + x4 | x1 + x2,
em_controls = list(maxit=1))

summary(fit_example)
#>
#> Negative Binomial Regression - Expectation-Maximization Algorithm
#>
#> Call:
#> mixpoissonreg(formula = y ~ x1 + x2 + x3 + x4 | x1 + x2, em_controls = list(maxit = 1))
#>
#>
#> Pearson residuals:
#>      RSS      Min       1Q   Median       3Q      Max
#> 184.5508  -1.5734  -0.6603  -0.3186   0.4243   3.8465
#>
#> Coefficients modeling the mean (with  link):
#>             Estimate Std.error z-value Pr(>|z|)
#> (Intercept)  0.88304   0.18902   4.672 2.99e-06 ***
#> x1          -1.28107   0.18555  -6.904 5.05e-12 ***
#> x2          -0.99686   0.08599 -11.592  < 2e-16 ***
#> x32         -1.16351   0.15225  -7.642 2.14e-14 ***
#> x4           0.14925   0.02512   5.940 2.84e-09 ***
#>
#> Coefficients modeling the precision (with  link):
#>             Estimate Std.error z-value Pr(>|z|)
#> (Intercept)   1.0731    0.4655   2.305  0.02115 *
#> x1            2.6327    1.0925   2.410  0.01597 *
#> x2            1.0746    0.3353   3.204  0.00135 **
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Efron's pseudo R-squared:  0.52115
#> Number of iterations of the EM algorithm = 1``````

Remark: Notice that since x3 is a factor and x4 is integer-valued, they are discrete and it does not make sense to make infinitesimal perturbations on them. You should only perturb continuous variables.

To consider the “mean_explanatory” and “simultaneous_explanatory” perturbation schemes with respect to the “x1” covariate, we simply set the `mean.covariates` argument to “x1”:

``````loc_inf_x1 <- local_influence(fit_example, mean.covariates = "x1")

#>            1            2            3            4            5            6
#> 9.184808e-09 3.332425e-06 1.352478e-06 4.736643e-07 5.720140e-09 1.752231e-06``````

To consider more than one covariate, simply enter the covariates as a vector. For instance, to consider the “mean_explanatory” and “simultaneous_explanatory” perturbation schemes with respect to the “x1” and “x2” covariates, we simply set the `mean.covariates` argument to `c("x1", "x2")`:

``````loc_inf_x1_x2 <- local_influence(fit_example, mean.covariates = c("x1", "x2"))

#>            1            2            3            4            5            6
#> 9.184808e-09 3.332425e-06 1.352478e-06 4.736643e-07 5.720140e-09 1.752231e-06``````

The covariates used in the “mean_explanatory” and “simultaneous_explanatory” perturbation schemes are returned as `covariates` attributes in the “mean_explanatory” and “simultaneous_explanatory” perturbation schemes:

``````attr(loc_inf_x1\$mean_explanatory, "covariates")
#> [1] "x1"
attr(loc_inf_x1\$simultaneous_explanatory, "covariates")
#> \$mean
#> [1] "x1"
#>
#> \$precision
#> [1] "all"

attr(loc_inf_x1_x2\$mean_explanatory, "covariates")
#> [1] "x1" "x2"
attr(loc_inf_x1_x2\$simultaneous_explanatory, "covariates")
#> \$mean
#> [1] "x1" "x2"
#>
#> \$precision
#> [1] "all"``````

Analogously, to consider the “precision_explanatory” and “simultaneous_explanatory” perturbation schemes with respect to the “x1” covariate, we simply set the `precision.covariates` argument to “x1”:

``````loc_inf_prec_x1 <- local_influence(fit_example, precision.covariates = "x1")

#>            1            2            3            4            5            6
#> 0.0112811796 0.0008030722 0.0015556764 0.0073939712 0.0054385258 0.0015444537``````

In the same manner as in the “mean_explanatory” perturbation scheme, to consider more than one covariate, simply enter the covariates as a vector. For instance, to consider the “precision_explanatory” and “simultaneous_explanatory” perturbation schemes with respect to the “x1” and “x2” covariates, we simply set the `precision.covariates` argument to `c("x1", "x2")`:

``````loc_inf_prec_x1_x2 <- local_influence(fit_example, precision.covariates = c("x1", "x2"))

#>            1            2            3            4            5            6
#> 0.0112811796 0.0008030722 0.0015556764 0.0073939712 0.0054385258 0.0015444537``````

The covariates used in the “precision_explanatory” and “simultaneous_explanatory” perturbation schemes are returned as `covariates` attributes in the “precision_explanatory” and “simultaneous_explanatory” perturbation schemes:

``````attr(loc_inf_prec_x1\$precision_explanatory, "covariates")
#> [1] "x1"
attr(loc_inf_prec_x1\$simultaneous_explanatory, "covariates")
#> \$mean
#> [1] "all"
#>
#> \$precision
#> [1] "x1"

attr(loc_inf_prec_x1_x2\$precision_explanatory, "covariates")
#> [1] "x1" "x2"
attr(loc_inf_prec_x1_x2\$simultaneous_explanatory, "covariates")
#> \$mean
#> [1] "all"
#>
#> \$precision
#> [1] "x1" "x2"``````

Local influence plots

It is possible to plot the perturbation schemes for all the possible combinations of arguments of the `local_influence` method by using the `local_influence_plot` and `local_influence_autoplot` methods. The `local_influence_plot` and `local_influence_autoplot` methods provide the same plots, the difference between them being that the former uses R’s base graphics whereas the latter uses the `ggplot2` package.

If the `direction` argument is set to “canonical” (the default), then the `n.influential` points above the benchmarks are automatically identified, where the default value for `n.influential` is 5. Recall that we use the benchmarks suggested by Zhu and Lee (2001) for conformal normal curvatures and the benchmarks suggested by Verbeke and Molenberghs (2000, sect. 11.3) for normal curvatures. For `direction = "max.eigen"`, no benchmark is provided. In this case, the `local_influence_plot` and `local_influence_autoplot` methods automatically identify the `n.influential` most extremes points.

Let us build these plots. First, the standard arguments provide the plots of conformal normal curvature in the canonical directions for the “case_weights”, “hidden_variable”, “mean_explanatory” and “precision_explanatory” perturbation schemes with “case_weights” and “hidden_variable” being computed for all parameters, and the explanatory perturbations being computed for all covariates:

``````fit <- mixpoissonreg(daysabs ~ gender + math + prog | gender + math + prog,
data = Attendance, em_controls = list(maxit=1))

# Notice that since gender and prog are factors,
# they are not considered in the computation of the
# explanatory variables perturbation schemes

local_influence_plot(fit)``````

and

``local_influence_autoplot(fit)``

To change to normal curvature simply set the `curvature` argument to “normal”:

``local_influence_plot(fit, curvature = "normal")``

and

``local_influence_autoplot(fit, curvature = "normal")``

We can change the direction to the direction of largest curvature by setting the `direction` to “max.eigen”:

``local_influence_plot(fit, direction = "max.eigen")``