Effect Sizes for Bayesian Models

Standardized Parameters

Introduction

Like in OLS / ML or other frequentists methods of model parameter estimation, standardizing the parameters of Bayesian (generalized) linear regression models can allow for the comparison of so-called “effects” within and between models, variables and studies.

As with frequentists methods, standardizing parameters should not be the only method of examining the role different predictors play in a particular Bayesian model, and this vignette generally assumes that the issues of model convergence, goodness of fit and model selection have already been taken care of. (Learn more about how to become a Bayesian master with the bayestestR package.)

Set up

We will examine the predictive role of overtime (xtra_hours), number of compliments given to the boss (n_comps) and seniority in predicting workers salaries. Let’s fit the model:

library(rstanarm)

data("hardlyworking", package = "effectsize")

head(hardlyworking)
>   salary xtra_hours n_comps age seniority
> 1  19745        4.2       1  32         3
> 2  11302        1.6       0  34         3
> 3  20636        1.2       3  33         5
> 4  23047        7.2       1  35         3
> 5  27342       11.3       0  33         4
> 6  25657        3.6       2  30         5
mod <- stan_glm(salary ~ xtra_hours + n_comps + seniority,
                data = hardlyworking,
                prior = normal(0, scale = c(1, 0.5, 0.5), autoscale = TRUE), # set some priors
                refresh = 0) 

parameters::model_parameters(mod, test = NULL)
> # Fixed effects 
> 
> Parameter   |   Median |              89% CI |  Rhat |     ESS |                      Prior
> -------------------------------------------------------------------------------------------
> (Intercept) | 10023.20 | [9365.61, 10724.72] | 0.999 | 5404.55 | Normal (20000 +- 15495.02)
> xtra_hours  |  1236.25 | [1189.92,  1285.99] | 1.000 | 4755.57 |      Normal (0 +- 1587.80)
> n_comps     |  2964.35 | [2754.07,  3165.00] | 1.000 | 5032.44 |      Normal (0 +- 3755.71)
> seniority   |   439.46 | [ 286.51,   605.06] | 0.999 | 4283.39 |      Normal (0 +- 2702.93)

Looking at the un-standardized (“raw”) parameters, it looks like all predictors positively predict workers’ salaries, but which has the highest predictive power? Unfortunately, the predictors are not on the same scale (hours, compliments, years), so comparing them is hard when looking at the raw data. This is where standardization comes in.

Like with frequentists models we can choose from the same standardization methods. Let’s use the (slow) "refit" method.

library(effectsize)

standardize_parameters(mod, method = "refit", ci = 0.89)
> Parameter   | Median (std.) |        89% CI
> -------------------------------------------
> (Intercept) |     -2.63e-04 | [-0.03, 0.03]
> xtra_hours  |          0.78 | [ 0.75, 0.81]
> n_comps     |          0.39 | [ 0.37, 0.42]
> seniority   |          0.08 | [ 0.05, 0.11]
> 
> # Standardization method: Refit

Note that the central tendency of the posterior distribution is still the median - the median of the standardized posterior distribution. We can easily change this, of the type of credible interval used:

library(effectsize)

standardize_parameters(mod, method = "basic", ci = 0.89,
                       centrality  = "MAP", ci_method = "eti")
> Parameter   | MAP (std.) |       89% CI
> ---------------------------------------
> (Intercept) |       0.00 | [0.00, 0.00]
> xtra_hours  |       0.78 | [0.75, 0.81]
> n_comps     |       0.39 | [0.37, 0.42]
> seniority   |       0.08 | [0.05, 0.11]
> 
> # Standardization method: Basic

As we can see, working harder (or at least for longer hours) has stronger predictive power than complementing or seniority. (Do note, however, that this does not mean that if you wish to have a higher salary you should work overtime - the raw parameters seem to suggest that complementing your boss is the way to go, with one compliment worth almost 3.5 times more than a full hours’ work!)

Eta2

Introduction

In classical frequentists models, the computation of (\eta^2) or (\eta^2_p) is straightforward: based on the right combinations of sums-of-squares (SSs), we get the correct proportion of variance accounted for by some predictor term. However such a computation is not as straightforward for Bayesian models, for various reasons (e.g., the model-SS and the residual-SS don’t necessarily sum to the total-SS). Although some have proposed Bayesian methods of estimating explained variance in ANOVA designs (Marsman et al. 2019), these are not yet easy to implement with stan-based models.

An alternative route to obtaining effect sizes of explained variance, is via the use of the posterior predictive distribution (PPD). The PPD is the Bayesian expected distribution of possible unobserved values. Thus, after observing some data, we can estimate not just the expected mean values (the conditional marginal means), but also the full distribution of data around these values (Gelman et al. 2014, chap. 7).

By sampling from the PPD, we can decompose the sample to the various SSs needed for the computation of explained variance measures. By repeatedly sampling from the PPD, we can generate a posterior distribution of explained variance estimates. But note that these estimates are conditioned not only on the location-parameters of the model, but also on the scale-parameters of the model! So it is vital to validate the PPD before using it to estimate explained variance measures.

Setup

Let’s factorize out data from above:

hardlyworking$age_f <- cut(hardlyworking$age, 
                           breaks = c(25,35,45), right = FALSE,
                           labels = c("Young", "Less_young"))
hardlyworking$comps_f <- cut(hardlyworking$n_comps, 
                             breaks = c(0,1,2,3),
                             include.lowest = TRUE, 
                             right = FALSE)

table(hardlyworking$age_f,hardlyworking$comps_f)
>             
>              [0,1) [1,2) [2,3]
>   Young        124   188   114
>   Less_young    13    34    27

And fit our model:

# use (special) effects coding
contrasts(hardlyworking$age_f) <- bayestestR::contr.bayes
contrasts(hardlyworking$comps_f) <- bayestestR::contr.bayes

modAOV <- stan_glm(salary ~ age_f * comps_f,
                   data = hardlyworking, family = gaussian(),
                   refresh = 0)

We can use eta_squared_posterior() to get the posterior distribution of (eta^2) or (eta^2_p) for each effect. Like an ANOVA table, we must make sure to use the right effects-coding and SS-type:

pes_posterior <- eta_squared_posterior(modAOV, 
                                       draws = 500, # how many times should the PPD be estimated
                                       partial = TRUE, # partial eta squared
                                       type = 3) # type 3 SS
> Simulating effect size... This can take a while...
head(pes_posterior)
>   age_f comps_f age_f:comps_f
> 1 0.028   0.161        0.0024
> 2 0.030   0.117        0.0252
> 3 0.014   0.091        0.0078
> 4 0.035   0.112        0.0118
> 5 0.017   0.100        0.0054
> 6 0.126   0.058        0.0013
bayestestR::describe_posterior(pes_posterior, rope_range = c(0, 0.1), test = "rope")
> # Description of Posterior Distributions
> 
> Parameter     | Median |         89% CI |       89% ROPE | % in ROPE
> --------------------------------------------------------------------
> age_f         |  0.027 | [0.000, 0.060] | [0.000, 0.100] |   100.000
> comps_f       |  0.135 | [0.068, 0.192] | [0.000, 0.100] |    14.350
> age_f:comps_f |  0.010 | [0.000, 0.031] | [0.000, 0.100] |   100.000

Compare to:

modAOV_f <- lm(salary ~ age_f * comps_f,
               data = hardlyworking)

eta_squared(car::Anova(modAOV_f, type = 3))
> Parameter     | Eta2 (partial) |       90% CI
> ---------------------------------------------
> age_f         |           0.03 | [0.01, 0.05]
> comps_f       |           0.13 | [0.09, 0.18]
> age_f:comps_f |       6.65e-03 | [0.00, 0.02]

References

Gelman, Andrew, John B Carlin, Hal S Stern, and Donald B Rubin. 2014. “Bayesian Data Analysis (Vol. 2).” Boca Raton, FL: Chapman.

Marsman, Maarten, Lourens Waldorp, Fabian Dablander, and Eric-Jan Wagenmakers. 2019. “Bayesian Estimation of Explained Variance in Anova Designs.” Statistica Neerlandica 73 (3): 351–72.