A Tutorial for the Paper: A Default Bayes Factor for testing Null Hypotheses About the Fixed Effects of Linear Two-level Models
Introduction
This tutorial presents a comprehensive and detailed version of the
examples section from my master’s thesis (Sekulovski & Hoijtink,
2023). The paper introduces a default Bayes factor (henceforth abbreviated as BF, Kass & Raftery, 1995) with
clear operating characteristics for testing whether the fixed effects of
linear two-level models are equal to zero. This was achieved by
generalizing an approach for linear regression presented in Hoijtink (2021), resulting in a BF of 19
when the marginal \(R^2\) for the fixed
effects is zero in the data. A wrapper function
for the
R
package bain
(Gu et al., 2021) was developed for testing
the fixed parameters of linear two-level models fitted with the
lmer
function from the R
package
lme4
(Bates et
al., 2015). The function includes an adjustment for the
fraction of the scaling parameter of the prior distribution, as proposed
in my paper. Researchers can access the function and the full paper from
this
repository. This tutorial also provides a step-by-step guide for
calculating the Multiple Imputation-based effective sample Size
(abbreviated as MI-based \(N_{eff}\)),
a new method for determining the effective sample size for two-level
models containing predictors with varying slopes. In the paper, I
demonstrate that the sample size does not impact the calculation of the
BF. However, I believe that the proposed method for determining the
effective sample size of two-level models is an innovative approach.
That’s why I have included a comprehensive, step-by-step guide on how to
calculate the MI-based \(N_{eff}\).
Unfortunately, at this time, there is not a user-friendly R function
readily available that automates this calculation.
The tutorial aims to:
Offer practical examples for using the
wrapper function
.Demonstrate the clear operating characteristics of the proposed Bayes factor when testing fixed effects in two-level models
Provide a clear and easy-to-follow guide for calculating the \(N_{eff}\). Which represents a promising new approach for calculating the effective sample size for two-level models.
Packages
library(R2MLwiN) # includes the data set
library(lme4) # fitting two-level models
library(tidyverse) # data manipulation and plotting
library(jtools) # model summaries & automatic calculation of R^2_m
library(summarytools) # descriptive statistics
library(DT) # interactive tables
# for the MI-based N_eff
library(rjags)
library(MASS)
#the wrapper function
source("wrapper_function.R")
Please note, that you need to have JAGS
installed
on your local machine to be able to compute the MI-based \(N_{eff}\).
The wrapper function
As illustrated in the example call above, the wrapper
takes the following arguments:
x
: the fittedlmer
object;hypotheses
: the specified hypotheses (saved as a character vector), the names of the parameters specified in the hypotheses should correspond to the names of the predictors as specified when callinglmer
(see the examples below);standardize
: a logical argument, indicating whether to compute the BF based on standardized data. This is not relevant when testing whether the fixed effects are equal to zero, however, as mentioned in the paper, it is important when comparing the parameters to each other;N
: the value for the sample size, which includes the following options:"level_1"
- computes the BF using the number of level-1 observations;"level_2"
- computes the BF using the number of level-2 observations;"ICC_effective"
- computes the BF using the ICC based \(N_{eff}\);- a number supplied by the user - used for the MI-based \(N_{eff}\).
fraction
: a multiplicative factor for the fraction b which is by default set to 1 i.e., the default value for J (the number of fixed effects that are set equal to zero in the hypothesis);jref
: a logical argument, which, if set to equalTRUE
, applies the calculation for the reference value for \(J\) (i.e., \(J_{ref}\)), which yields BFs with clear operating characteristics when testing whether the fixed effects are equal to zero. In this case, the previous argumentfraction
is ignored (and the used need not specify it).
For more details on how to test (informative) hypotheses for different statistical models, see this vignette.
The Data
Throughout the examples, we will be using the tutorial
data, a two-level data set available from the R
package
R2MLwiN
. The data represents a subset from a larger data
set of examination results from six inner London Education Authorities
with 4059 students nested within 65 schools. The variables used for the
aims of this example are: (i) the standardized students’ exam score
(normexam
) which will serve as the outcome variable; (ii)
the standardized students’ score at age 11 on the London Reading Test
(standlrt
); (iii) the school indicator
(school
) with 65 schools of varying size. Additionally, in
the end, a level-2 predictor in the form of the average LRT score for
each school (avslrt
) is included to illustrate that this
approach can also be for testing the coefficients of second-level
variables.
Load and inspect the data
data("tutorial")
tutorial.1 <- tutorial[, c(1,2,3,5)] # subset the data with only the variables of interest
st_options(descr.silent = TRUE)
descr(tutorial.1, stats = c("mean", "med", "sd", "min", "max")) # descriptive statistics for the outcome and the level-1 predictor
## Descriptive Statistics
## tutorial.1
## N: 4059
##
## normexam standlrt
## ------------- ---------- ----------
## Mean 0.00 0.00
## Median 0.00 0.04
## Std.Dev 1.00 0.99
## Min -3.67 -2.93
## Max 3.67 3.02
Example 1: Model with random intercept and a random level-1 predictor where \(H_0\) is false.
For this example, we fit a two-level model containing a random
intercept and a random slope for the standlrt
variable,
i.e.,
\[\text{normexam}_{ij} = \alpha +
\beta_1\;\text{standlrt}_{ij} + u_{0j} + u_{1j}\;\text{standlrt}_{ij} +
\epsilon_{ij},\] where \(i\) =
1,. . . , N and \(j\) = 1,. .
. , G with N denoting the number of students (level-1
observations i.e., 4059) and G denoting the total number of
schools (level-2 observations i.e., 65). \(\alpha\) represents the fixed intercept,
\(\beta_1\) represents the effect for
standlrt
, \(u_{0j}\)
represents the random component denoting the deviation of school \(j\) from the fixed intercept and \(u_{1j}\) denotes the deviation of school
\(j\) from the fixed intercept. Lastly,
\(\epsilon_{ij}\) represents the
standard residual error term for student \(i\) in school \(j\). \(u_{0j}\), \(u_{1j}\) and \(\epsilon_{ij}\) have estimated variance
components denoted as \(\sigma^2_{u0}\), \(\sigma^2_{u1}\) and \(\sigma^2_\epsilon\), respectively.
First, let’s visualize this model by plotting the predictor
standlrt
against the outcome normexam
, with
separate regression lines for each school:
We can see that the schools indeed have different slopes for the
predictor standlrt
. So, let’s fit this model with
lmer
.
Fit the model
We will estimate all the models using Full Maximum Likelihood Estimation (FML), here is an interesting blog post about the differences between FML and REML. However, researchers are advised to use whichever method they choose based on other methodological considerations which are not related to testing the fixed effects. As shown in the paper, the estimation method has a negligible influence on the value of the BF.
Inspect the estimates and calculate \(R^2_m\)
The summ
function from the package jtools
(Long, 2020)
gives visually pleasing summaries of fitted lme4
models and
reports the marginal effect size for the fixed effects (\(R^2_m\)). Additionally, this function
reports p-values based on the Satterthwaite approximation for the
degrees of freedom. The lme4
package deliberately omits
reporting p-values due to issues regarding the calculation of the
degrees of freedom (for details on evaluating the
fixed effects by using Null Hypothesis Significance Testing approaches,
see, Luke, 2017).
## MODEL INFO:
## Observations: 4059
## Dependent Variable: normexam
## Type: Mixed effects linear regression
##
## MODEL FIT:
## AIC = 9328.87, BIC = 9366.72
## Pseudo-R² (fixed effects) = 0.32
## Pseudo-R² (total) = 0.43
##
## FIXED EFFECTS:
## --------------------------------------------------------
## Est. S.E. t val. d.f. p
## ----------------- ------- ------ -------- ------- ------
## (Intercept) -0.01 0.04 -0.29 65.28 0.78
## standlrt 0.56 0.02 27.62 62.60 0.00
## --------------------------------------------------------
##
## p values calculated using Kenward-Roger standard errors and d.f.
##
## RANDOM EFFECTS:
## ------------------------------------
## Group Parameter Std. Dev.
## ---------- ------------- -----------
## school (Intercept) 0.30
## school standlrt 0.12
## Residual 0.74
## ------------------------------------
##
## Grouping variables:
## --------------------------
## Group # groups ICC
## -------- ---------- ------
## school 65 0.14
## --------------------------
The fixed effect for standlrt
is estimated to be larger
than zero, which is evident both by its SE and by the highly significant
p-value (rejecting the null hypothesis which states that the fixed
effect is equal to zero). The value for the ICC in this data set is .14,
which indicates the amount of within-school clustering with respect to
normexam
, before accounting for (part of) the explained
variance by introducing (random) predictors.
\(R^2_m\) using
summ()
From the table above we can see that the Pseudo-\(R^2\) (fixed effects) i.e., the \(R^2_m\) is .32, which indicates a large effect size based on Cohen (1992) for the \(R^2\) of multiple linear regression.
\(R^2_m\) by hand
We can also manually compute the marginal \(R^2\) which is defined as the proportion of variation in the outcome variable that can be explained by the fixed effect(s):
\[R^2_m = \frac{\sigma_f^2}{\sigma_f^2 + \sigma_{u0}^2 + \sigma_{u1}^2 + \sigma_\epsilon^2},\] where \(\sigma_f^2\) is calculated is defined as
\[\sigma_f^2 = \text{var}(\alpha +\beta_1\;\text{standlrt}_{ij}).\]
fixef <- fixef(model.1) # extract the fixed effect
y_hat <- fixef[1] + fixef[2]*tutorial$standlrt # predict the outcome
sigma_f <- var(y_hat) # obtain the variance in the predicted outcome based on the fixed effect
# extract the estimated random effects from the fitted model (you can use the summary function)
# i.e., summary(model.1)
sigma_u0 <- 0.09044
sigma_u1 <- 0.01454
sigma_e <- 0.55366
# calculate the R^2_m
marginal_Rsq_fixed <- (sigma_f)/(sigma_f + sigma_u0 + sigma_u1 + sigma_e)
marginal_Rsq_fixed
## [1] 0.3170484
We can see that this value corresponds to the Pseudo-\(R^2\) (fixed effects) given by the
summ
function. Thus, we expect the BF reject the null
(i.e., show support for the unconstrained hypothesis).
Calculate the MI-based \(N_{eff}\)
In the following section, I will provide a comprehensive, yet easy-to-follow guide on how to compute the recently proposed effective sample size using the tutorial data. While there may be more sophisticated and streamlined methods from a programming perspective, the goal of this tutorial is to clearly demonstrate the underlying processes involved in calculating Mi-based \(N_eff\) in a straightforward manner.
First, we need to specify the same model in JAGS
in a
text file:
jags.model <- "model {
# Likelihood
for (i in 1:4059){
normexam[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha[school[i]] + beta[school[i]] * standlrt[i]
}
# Level-2
for(j in 1:65){
alpha[j] <- U[j,1]
beta[j] <- U[j,2]
U[j,1:2] ~ dmnorm (MU[j,], invSigma[,])
MU[j,1] <- mu.alpha
MU[j,2] <- mu.beta
}
# (hyper)Priors
mu.alpha ~ dnorm(0, 0.0001)
mu.beta ~ dnorm(0, 0.0001)
tau ~ dgamma (0.001, 0.001) # resiudal variance
invSigma[1:2,1:2] ~ dwish(Tau, 2) # inverse of the covariance matrix following a Wishart prior
tau.alpha ~ dgamma (0.001, 0.001) # intercept variance
tau.beta ~ dgamma (0.001, 0.001) # slope variance
Tau[1,1] <- pow(tau.alpha, -1/2) # construct the scale matrix (a parameter of the Whishart prior)
Tau[2,2] <- pow(tau.beta, -1/2)
Tau[1,2] <- rho_1*tau.alpha*tau.beta # covariance between the slope of standlrt and the intrcept
Tau[2,1] <- Tau[1,2]
rho_1 ~ dunif(-1, 1) # correlation (between -1 and 1) not important
}
"
Compare the models
Since we are using uninformative priors, we will take a small detour
and show that the Bayesian posterior parameter estimates correspond to
those obtained with lmer
when using FML.
# Check and inspect the model
model.def <- jags.model(file = textConnection(jags.model),
inits = list(.RNG.name="base::Wichmann-Hill",
.RNG.seed=100),
data = tutorial, n.chains = 2)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 4059
## Unobserved stochastic nodes: 72
## Total graph size: 16539
##
## Initializing model
update(object = model.def, n.iter = 1000)
# ask only for these parameters
parameters <- c("mu.alpha", "mu.beta", "tau", "invSigma")
results <- coda.samples(model = model.def, variable.names = parameters, n.iter =1000)
summary(results)
##
## Iterations = 2001:3000
## Thinning interval = 1
## Number of chains = 2
## Sample size per chain = 1000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## invSigma[1,1] 13.52115 3.58318 0.0801224 0.1558283
## invSigma[2,1] -16.27213 8.90956 0.1992239 0.5807041
## invSigma[1,2] -16.27213 8.90956 0.1992239 0.5807041
## invSigma[2,2] 79.44014 31.90626 0.7134457 2.1708230
## mu.alpha -0.01271 0.04361 0.0009752 0.0013995
## mu.beta 0.55412 0.02121 0.0004743 0.0008501
## tau 1.80630 0.04071 0.0009103 0.0009105
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## invSigma[1,1] 8.05983 11.05483 13.07467 15.39220 22.17893
## invSigma[2,1] -39.38018 -20.46357 -14.63042 -10.09460 -4.02706
## invSigma[1,2] -39.38018 -20.46357 -14.63042 -10.09460 -4.02706
## invSigma[2,2] 37.66881 57.83871 72.84995 93.21918 170.70371
## mu.alpha -0.09981 -0.04166 -0.01324 0.01561 0.06769
## mu.beta 0.51329 0.53977 0.55368 0.56767 0.59710
## tau 1.72764 1.77944 1.80671 1.83268 1.89180
#transform the precisions of the random eff. into standard deviations
sqrt(1/13.52115) # intercept SD
## [1] 0.2719526
## [1] 0.1121967
## [1] 0.744055
Note how the posterior mean estimates along with their standard
deviations roughly correspond to their (Full) Maximum Likelihood
counterparts. Note, the (co)variance components are expressed in terms
of precisions, the reason for this are beyond the scope of the tutorial.
Thus, in order to be able to compare the results with the ones from
lmer
, we need to take the inverse for the values of
invSigma
and tau
, and then take the square
root (since the random effects summarized by the summ
function are expressed as standard deviations). The last three rows from
the output contain the estimated random effects expressed in terms of
standard deviations.
Obtain multiple imputed data sets
Now, we estimate the model again and save all the sampled random effects, which we will treat as multiple imputed values. We run the sampler by using two chains with a 1000 iterations each (excluding the burn-in period of a 1000 extra iterations).
# fit the model again but ask to monitor all the random effects
model.def <- jags.model(file = textConnection(jags.model),
inits = list(.RNG.name="base::Wichmann-Hill",
.RNG.seed=100),
data = tutorial, n.chains = 2)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 4059
## Unobserved stochastic nodes: 72
## Total graph size: 16539
##
## Initializing model
update(object = model.def, n.iter = 1000)
# the random effects (the ones we need for the aim of this approach)
parameters <- c("alpha", "beta", "mu.alpha", "mu.beta")
results <- coda.samples(model = model.def, variable.names = parameters, n.iter =1000)
Extract the samples from the posterior (from both chains):
chain1 <- as.data.frame(results[[1]])
chain2 <- as.data.frame(results[[2]])
samples <- rbind(chain1, chain2) # combine both chains
Before we start manipulating the sampled random (and fixed) effects, such that we can add each one to a copy of the original data set, it would be nice to have a glimpse of the data set’s layout.
Extract the fixed effect from each sampled vector (i.e., \(\alpha\) and \(\beta)\):
fixed_alphas <- samples[, 131]
fixed_betas <- samples[, 132]
fixed_alphas <- as.data.frame(fixed_alphas)
fixed_betas <- as.data.frame(fixed_betas)
Repeat each fixed effect N times (in this case 4059 - the number of \(N_{level-1}\) observations) and store it in a separate data frame that will be merged with the imputed data sets later on:
# the fixed intercepts
samp_fixed_alphas <- list()
for (i in 1:nrow(fixed_alphas)){
samp_fixed_alphas[[i]] <-rep.int(fixed_alphas[i, 1], nrow(tutorial.1))
}
# the fixed slopes
samp_fixed_betas <- list()
for (i in 1:nrow(fixed_betas)){
samp_fixed_betas[[i]] <-rep.int(fixed_betas[i, 1], nrow(tutorial.1))
}
Get the sampled random intercepts (\(\alpha_j's\))
First, split the data per group:
Afterwards, extract the size of each group:
Exclude the columns containing the fixed effects from the
samples
data set:
Extract the random \(\alpha_{j}s\) and \(\beta_{j}s\) into separate data frames:
alphas <- samples [, 1:65] # extract the random intercepts
betas <- samples [, 66:130] # extract the random slopes
Use the size of each group to replicate, from each iteration, the
respective \(\alpha_j\),
n_gr
number of times:
Extract these in a big matrix with the samples from every iteration stored in a separate column:
samp_alphas_mat <- matrix(nrow = nrow(tutorial.1), ncol = nrow(alphas))
for(i in 1:nrow(alphas)){
samp_alphas_mat[, i] <- unlist(samp_alphas[[i]])
}
Get the sampled random slopes (\(\beta_j's\))
Use the size of each group to replicate, from each iteration, the
respective \(\beta_j\),
n_gr
number of times:
Extract them in a big matrix with the samples from every iteration stored in a separate column:
samp_betas_mat <- matrix(nrow = nrow(tutorial.1), ncol = nrow(betas))
for(i in 1:nrow(betas)){
samp_betas_mat[, i] <- unlist(samp_betas[[i]])
}
Combine the sampled random effects
First, combine the \(\alpha_j's\):
imputed <- list()
for (i in 1:ncol(samp_alphas_mat)){
imputed[[i]] <- cbind(tutorial.1, samp_alphas_mat[, i])
}
Add the \(\beta_j's\):
Add the sampled fixed effects:
The \(\alpha's\):
The \(\beta's\):
Voilà, we obtained 2000 imputed data sets.
Transform the outcome variable
In order to be able to fit multiple linear regression models we need to transform the outcome variable as follows,
\[Z_{ij} = \text{normexam}_{ij} - \alpha_j - \beta_j\;\text{standlrt}_{ij} + \alpha + \beta\;\text{standlrt}_{ij},\]
which leads to \[ Z_i = \eta_0 + \eta_1\;\text{standlrt}_i + e_i.\]
Construct a function that calculates \(Z\):
transform_z <- function(df){
df$z <- df[, 3] - df[, 5] - df[, 6] * df[, 4] + df[, 7] + df[, 8] * df[, 4]
df
}
Apply the function to the “imputed” data sets and obtain a column for \(Z\):
Now let’s quickly have a look at one of the imputed data sets:
datatable(imputed[[1]][,-c(1,2)], options = list(pageLength = 5)) # exclude the school and student columns
Estimates:
Fit linear regression models to each imputed data set and extract the estimated coefficients and their respective variance-covariance matrices:
estimates <- list()
vCOV <- list()
for(i in seq_along(imputed)){
estimates[[i]] <- coef(lm(imputed[[i]][,9] ~ imputed[[i]][,4]))
vCOV [[i]] <- vcov(lm(imputed[[i]][,9] ~ imputed[[i]][,4]))
}
Extract these estimates in a data frame:
Apply the multiple imputation equations
The following equations are taken from Van Buuren (2018, Ch. 2.3).
Combined estimate:
\[ \bar{\boldsymbol{\eta}} = \frac{1}{m} \sum_{l = 1}^{m} \hat{\boldsymbol{\eta}_l} \]
Average over the variances:
\[ \bar{U} = \frac{1}{m} \sum_{l = 1}^{m} \bar{U}_l \]
Since the list vCOV
contains variance-covariance
matrices and we need to take the average across all of them (and not
within them), it is best to extract every element of each covariance
matrix and then take the average:
V <- matrix(nrow = nrow(estimates), ncol = 4)
for(i in seq_along(vCOV)){
V[i, 1] <- vCOV[[i]][1,1]
V[i, 2] <- vCOV[[i]][1,2]
V[i, 3] <- vCOV[[i]][2,1]
V[i, 4] <- vCOV[[i]][2,2]
}
U_bar <- apply(V, 2, mean)
U_bar <- matrix(U_bar, nrow = 2, ncol = 2)
Unbiased estimate of the variance between the m complete data estimates:
\[B = \frac{1}{m-1} \sum_{l = 1}^{m} (\hat{\boldsymbol{\eta}}_l - \bar{\boldsymbol{\eta}})(\hat{\boldsymbol{\eta}}_l - \bar{\boldsymbol{\eta}})'.\]
Total variance:
\[T = \bar{U} + (1 + \frac{1}{m})*B\]
Proportion of variation attributable to the missing data (a compromise over all estimates)
\[\bar{\lambda} =(1 + \frac{1}{m}) tr(BT^{-1})/k\]
k <- ncol(estimates) # number of parameters in eta_hat
lambda_hat <- (1+ 1/m) * sum(diag(B %*% ginv(Total_var)))/k
Degrees of freedom:
\[\nu_{old} = \frac{m-1}{\bar{\lambda}^2}; \; \nu_{com} = N_{level-1} - k; \; \nu_{obs} = \frac{\nu_{com} + 1}{\nu_{com} + 3}\nu_{com}(1-\bar{\lambda}); \; \nu = \frac{\nu_{old} \nu_{obs}}{\nu_{old} + \nu_{obs}}.\]
N <- nrow(tutorial.1) # N_level-1
nu_old <- (m - 1)/lambda_hat^2
nu_com <- N - k
nu_obs <- ((nu_com +1)/(nu_com + 3))*nu_com*(1 - lambda_hat)
nu <- (nu_old*nu_obs)/(nu_old + nu_obs)
Fraction of missing information:
\[\gamma = \frac{\nu + 1}{\nu + 3}\bar{\lambda}+\frac{2}{\nu + 3}\]
Calculate the MI-based \(N_{eff}\)
\[\text{MI-based}\; N_{eff} = N_{level-1} - \gamma*N_{level-1}\]
## [1] 874.2342
The MI-based effective sample size is 874.
Compare with the ICC-based \(N_{eff}\)
Calculate the ICC-based \(N_{eff}\)
We already know that the ICC = .17 since it was given by the
summ
function, however, here we calculate it by hand, to
illustrate that the ICC can only be computed using the estimated
variances from the intercept-only model and the average school (group)
size:
\[ICC = \frac{\sigma_{u0}}{\sigma_{u0} + \sigma_{\epsilon}}, \] where \(\sigma^2_{u0}\) denotes the variance for the random slope and \(\sigma^2_{\epsilon}\) denotes the residual variance estimated from a random intercept-only model.
# fit a random intercept-only model
model.0 <- lmer(normexam ~1 + (1|school), REML = FALSE, data = tutorial.1)
summary(model.0)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: normexam ~ 1 + (1 | school)
## Data: tutorial.1
##
## AIC BIC logLik deviance df.resid
## 11016.6 11035.6 -5505.3 11010.6 4056
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.9471 -0.6486 0.0117 0.6992 3.6578
##
## Random effects:
## Groups Name Variance Std.Dev.
## school (Intercept) 0.1686 0.4107
## Residual 0.8478 0.9207
## Number of obs: 4059, groups: school, 65
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) -0.01317 0.05363 -0.246
## [1] 0.1658796
The ICC-based \(N_{eff}\) is defined as
\[\text{ICC-based} \; N_{eff} = \frac{N_{level-1}}{1+(n_c - 1)ICC},\]
where \(n_c\) denotes the within-group sample size. As already mentioned, a drawback of the ICC-based \(N_{eff}\) is that when the data has varying group sizes, a compromise, such as taking the average group size, has to be made.
n_clus <- mean(sapply(split, nrow)) # the average group (i.e., school) size (62.4 in this case)
ICC_N_eff <- N / (1 + (n_clus - 1) * ICC)
ICC_N_eff
## [1] 362.6483
The value of the ICC-based effective sample size is 363. This
demonstrates that by accounting for the within-group variation through a
model that includes a random intercept and random slope for the
predictor standlrt
, the number of effective level-1
observations increases from 362 to 874. Essentially, the model with the
predictor “explains away” some of the clustering within groups indicated
by the ICC. As an exercise, you may want to try recalculating the
MI-based effective sample size by using a random intercept-only model
with JAGS, and compare the value obtained to that of the ICC-based
approach.
Test the fixed effects using the BF
For the aims of this example we test a null hypothesis stating that
the fixed effect for standlrt
is equal to zero and
a simple inequality constrained (informative) hypothesis stating that
the fixed effect is larger than zero:
\[H_0: \beta_1 = 0; \; H_i:\beta_1 >0.\]
We define these hypotheses in one single character vector (note the
names correspond to the variable names used when calling
lmer
):
Now we call the wrapper function
with the calculated
MI-based \(N_{eff}\) for the sample
size and ref
set to TRUE
:
BFs.1 <- bain_2lmer(model.1, hypotheses, standardize = FALSE,
N = MI_N_eff, seed = 123, jref = TRUE)
print(BFs.1)
## Bayesian informative hypothesis testing for an object of class numeric:
##
## Fit Com BF.u BF.c PMPa PMPb PMPc
## H1 0.000 1.053 0.000 0.000 0.000 0.000 0.000
## H2 1.000 0.500 2.000 22919082073133.293 1.000 0.667 1.000
## Hu 0.333
## Hc 0.000 0.500 0.000 0.000
##
## Hypotheses:
## H1: standlrt=0
## H2: standlrt>0
##
## Note: BF.u denotes the Bayes factor of the hypothesis at hand versus the unconstrained hypothesis Hu. BF.c denotes the Bayes factor of the hypothesis at hand versus its complement. PMPa contains the posterior model probabilities of the hypotheses specified. PMPb adds Hu, the unconstrained hypothesis. PMPc adds Hc, the complement of the union of the hypotheses specified.
The \(BF_{0u}\) is a very small number close to zero, which indicates there is no support in the data for the null hypothesis, we can also take the inverse of this number to obtain \(BF_{u0}\) (the BF of the unconstrained hypothesis against the null) which in this case, is equal to \(1.092\text{e+}169\), i.e., there is overwhelming support in the data for the unconstrained hypothesis.
## [1] 1.092365e+168
The \(BF_{iu}\) is around 2, which indicates that the support in the data is two times in favour of \(H_i\). Additionally, we can easily obtain the BF of the informative hypothesis against the null hypothesis, by taking the ratio of their respective BFs against the unconstrained hypothesis. In this case \(BF_{iu} =2.5\text{e+}181\).
## [1] 2.5036e+181
Finally, we inspect the value for the fraction b:
## [1] 0.002770083
The value for the fraction b is smaller than 0.05, allowing
us to interpret the resulting BFs as Approximate BFs. Thus, we
say: using the default AAFBF (Gu et al., 2018) set to equal
19 when the marginal \(R^2\)
for the fixed effects is zero in the data (Hoijtink, 2021), the
approximate BF of the informative hypothesis against the null
hypothesis is \(2.5\text{e+}181\), and
we conclude that given the data, the fixed coefficient for the
predictor standlrt
is larger than zero.
Example 2: Model with random intercept and a random level-1 predictor where \(H_0\) is true.
Simulate the data where \(H_0\) is true
In order to further clarify the operating characteristics of this BF,
we simulate the outcome normexam
by having the fixed effect
for standlrt
equal to zero and redo all the analyses.
set.seed(123) # set a random seed to make the results reproducible
nG <- 65 # number of schools
b <- 0 # new coefficient
# simulate the random effects (using the estimated variances from model.1)
intercept_var <- rnorm(nG, 0, 0.3)
u_0 <- rep(intercept_var, times = n_gr)
slope_var_1 <- rnorm(nG, 0, 0.1)
u_1 <- rep(slope_var_1, times = n_gr)
# same with the residual variance
epsilon <- rnorm(4059, 0, 0.7)
# calculate the new outcome
normexam <- b *tutorial$standlrt + u_0 + u_1*tutorial$standlrt + epsilon
# put everything together in a new df
tutorial.2 <- data.frame(tutorial$school, tutorial$student, normexam, tutorial$standlrt)
names(tutorial.2) <- c("school", "student", "normexam", "standlrt")
Fit the model
Inspect the parameter estimates and calculate \(R^2_m\)
## MODEL INFO:
## Observations: 4059
## Dependent Variable: normexam
## Type: Mixed effects linear regression
##
## MODEL FIT:
## AIC = 8800.37, BIC = 8838.23
## Pseudo-R² (fixed effects) = 0.00
## Pseudo-R² (total) = 0.16
##
## FIXED EFFECTS:
## -------------------------------------------------------
## Est. S.E. t val. d.f. p
## ----------------- ------ ------ -------- ------- ------
## (Intercept) 0.01 0.04 0.23 65.23 0.82
## standlrt 0.00 0.02 0.09 61.43 0.93
## -------------------------------------------------------
##
## p values calculated using Kenward-Roger standard errors and d.f.
##
## RANDOM EFFECTS:
## ------------------------------------
## Group Parameter Std. Dev.
## ---------- ------------- -----------
## school (Intercept) 0.29
## school standlrt 0.10
## Residual 0.70
## ------------------------------------
##
## Grouping variables:
## --------------------------
## Group # groups ICC
## -------- ---------- ------
## school 65 0.15
## --------------------------
Note that the value for the fixed effect of standlrt
is
estimated to be zero (this can also be seen from the highly
non-significant p-value). Moreover, now \(R^2_m = 0\), and we expect that our new
default BF will tend towards 19.
Additionally, the ICC is .15 which yields an ICC based \(N_{eff}\) of 375 and MI-based \(N_{eff}\) of 1070.
Test the fixed effects using the BF
BFs.2 <- bain_2lmer(model.2, hypotheses, standardize = FALSE,
N = 1070, seed = 123, jref = TRUE)
print(BFs.2)
## Bayesian informative hypothesis testing for an object of class numeric:
##
## Fit Com BF.u BF.c PMPa PMPb PMPc
## H1 23.062 1.218 18.927 18.927 0.946 0.901 0.904
## H2 0.535 0.500 1.070 1.150 0.054 0.051 0.051
## Hu 0.048
## Hc 0.465 0.500 0.930 0.044
##
## Hypotheses:
## H1: standlrt=0
## H2: standlrt>0
##
## Note: BF.u denotes the Bayes factor of the hypothesis at hand versus the unconstrained hypothesis Hu. BF.c denotes the Bayes factor of the hypothesis at hand versus its complement. PMPa contains the posterior model probabilities of the hypotheses specified. PMPb adds Hu, the unconstrained hypothesis. PMPc adds Hc, the complement of the union of the hypotheses specified.
The \(BF_{0u} = 18.93\) and the
\(BF_{iu} = 1.1\). In this case, we
conclude that given the data, the fixed coefficient for the
predictor standlrt
is not different from zero.
## [1] 16.45191
## [1] 0.002770083
The BF of the null hypothesis against the informative hypothesis,
\(BF_{0i} = 16.5\). Thus, we say that
the support in the data is around 16 times in favour of the null
hypothesis against the informative, that is, given the data the
fixed effect for standlrt
is equal to zero, based on the
approximate \(BF_{0i}\) (the
value for the fraction b is still 0.03).
Example 3: Model that includes a level-2 predictor where \(H_0\) is false
To illustrate that this approach can also be used when the model
contains a continuous level-2 predictor, we add the variable
avslrt
to the model, which represents the average LRT score
for each school. Thus, the linear equation for the model becomes:
\[\text{normexam}_{ij} = \alpha + \beta_1\;\text{standlrt}_{ij} + \beta_{1,2}\;\text{avslrt}_{j}+ u_{0j} + u_{1j}\;\text{standlrt}_{ij} + \epsilon_{ij},\]
where, \(\beta_{1,2}\) denotes the
estimated coefficient for the level-2 predictor avslrt
.
Fit the model and inspect the \(R^2_m\)
model.3 <- lmer(normexam ~ standlrt + avslrt + (standlrt | school), REML = FALSE, data = tutorial)
summ(model.3)
## MODEL INFO:
## Observations: 4059
## Dependent Variable: normexam
## Type: Mixed effects linear regression
##
## MODEL FIT:
## AIC = 9324.43, BIC = 9368.59
## Pseudo-R² (fixed effects) = 0.35
## Pseudo-R² (total) = 0.44
##
## FIXED EFFECTS:
## --------------------------------------------------------
## Est. S.E. t val. d.f. p
## ----------------- ------- ------ -------- ------- ------
## (Intercept) -0.00 0.04 -0.03 65.91 0.97
## standlrt 0.55 0.02 27.09 63.16 0.00
## avslrt 0.29 0.11 2.70 71.47 0.01
## --------------------------------------------------------
##
## p values calculated using Kenward-Roger standard errors and d.f.
##
## RANDOM EFFECTS:
## ------------------------------------
## Group Parameter Std. Dev.
## ---------- ------------- -----------
## school (Intercept) 0.27
## school standlrt 0.12
## Residual 0.74
## ------------------------------------
##
## Grouping variables:
## --------------------------
## Group # groups ICC
## -------- ---------- ------
## school 65 0.12
## --------------------------
Now, we see that the level-2 predictor is also larger than zero,
evident both by the value of the estimate and its respective SE but also
by the value of \(R^2_m\) which is .35
(compared to .32 when only including standlrt
as a
predictor).
Test the (fixed) effects using the BF
Now, the hypotheses are defined as follows:
\[H_0: \beta_1 = \beta_{1,2} = 0; \;H_i:\beta_1 > 0 \; \&\; \beta_{1,2} > 0.\]
We calculate the BFs by using the ICC-based \(N_{eff}\) = 358 (which can be calculated
automatically within the wrapper function
) since the
MI-based \(N_{eff}\) has not yet been
extended to include level-2 predictors and, as shown in the paper, the
sample size does not influence the BF when using \(J_{ref}\).
BFs.3 <- bain_2lmer(model.3, hypotheses, standardize = FALSE,
N = "ICC_effective", seed = 123, jref = TRUE)
print(BFs.3)
## Bayesian informative hypothesis testing for an object of class numeric:
##
## Fit Com BF.u BF.c PMPa PMPb PMPc
## H1 0.000 3.946 0.000 0.000 0.000 0.000 0.000
## H2 0.997 0.236 4.224 1229.900 1.000 0.809 0.999
## Hu 0.191
## Hc 0.003 0.764 0.003 0.001
##
## Hypotheses:
## H1: standlrt=avslrt=0
## H2: standlrt>0&avslrt>0
##
## Note: BF.u denotes the Bayes factor of the hypothesis at hand versus the unconstrained hypothesis Hu. BF.c denotes the Bayes factor of the hypothesis at hand versus its complement. PMPa contains the posterior model probabilities of the hypotheses specified. PMPb adds Hu, the unconstrained hypothesis. PMPc adds Hc, the complement of the union of the hypotheses specified.
## [1] 0.05263158
This yields \(BF_{0u} \simeq0\) and
\(BF_{iu} \simeq 4.2\), with a value
for b of exactly 0.05. Thus we say that based on the
approximate BF, there is evidence in the data that both the
fixed effect for standlrt
and the level-2 coefficient for
avslrt
are larger than zero.
## [1] 1.981648e+167
## [1] 2.437228e+170
Additionally, \(BF_{u0} \simeq 1.9\text{e+}167\) and \(BF_{i0}\simeq 2.4\text{e+}170\).
Since the value for b is exactly equal to 0.05 we say that
based on the approximate BF, there is substantial evidence in the data
that both the fixed effect for standlrt
and the level-2
coefficient for avslrt
are larger than zero.
Example 4: Model that includes a level-2 predictor where \(H_0\) is true
Finally, we repeat this analysis again, by simulating the outcome where the coefficients for both the level-1 and the level-2 predictors are zero i.e., the \(R^2_m = 0\).
set.seed(12) # set a random seed in order to make the results reproducible
nG <- 65 # number of schools
b <- 0 # new coefficient
# simulate the random effects (using the estimated variances from model.1)
intercept_var <- rnorm(nG, 0, 0.3)
u_0 <- rep(intercept_var, times = n_gr)
slope_var_1 <- rnorm(nG, 0, 0.1)
u_1 <- rep(slope_var_1, times = n_gr)
# same with the residual variance
epsilon <- rnorm(4059, 0, 0.7)
# calculate the new outcome
normexam <- b *tutorial$standlrt + b*tutorial$avslrt + u_0 + u_1*tutorial$standlrt + epsilon
# put everything together in a new df
tutorial.3 <- data.frame(tutorial$school, tutorial$student, normexam, tutorial$standlrt, tutorial$avslrt)
names(tutorial.3) <- c("school", "student", "normexam", "standlrt", "avslrt")
Fit the model and inspect the \(R^2_m\)
model.4 <- lmer(normexam ~ standlrt + avslrt + (standlrt | school), REML = FALSE, data = tutorial.3)
summ(model.4)
## MODEL INFO:
## Observations: 4059
## Dependent Variable: normexam
## Type: Mixed effects linear regression
##
## MODEL FIT:
## AIC = 8934.82, BIC = 8978.98
## Pseudo-R² (fixed effects) = 0.00
## Pseudo-R² (total) = 0.14
##
## FIXED EFFECTS:
## --------------------------------------------------------
## Est. S.E. t val. d.f. p
## ----------------- ------- ------ -------- ------- ------
## (Intercept) -0.01 0.04 -0.20 65.96 0.85
## standlrt -0.01 0.02 -0.35 62.44 0.73
## avslrt -0.08 0.11 -0.75 71.30 0.45
## --------------------------------------------------------
##
## p values calculated using Kenward-Roger standard errors and d.f.
##
## RANDOM EFFECTS:
## ------------------------------------
## Group Parameter Std. Dev.
## ---------- ------------- -----------
## school (Intercept) 0.26
## school standlrt 0.10
## Residual 0.71
## ------------------------------------
##
## Grouping variables:
## --------------------------
## Group # groups ICC
## -------- ---------- ------
## school 65 0.12
## --------------------------
We can now see that the \(R^2_m\) = 0, thus we expect the BF to be in favour of the null hypothesis.
Test the (fixed) effects using the BF
BFs.4 <- bain_2lmer(model.4, hypotheses, standardize = FALSE,
N = "ICC_effective", seed = 123, jref = TRUE)
print(BFs.4)
## Bayesian informative hypothesis testing for an object of class numeric:
##
## Fit Com BF.u BF.c PMPa PMPb PMPc
## H1 58.159 4.527 12.846 12.846 0.978 0.908 0.895
## H2 0.069 0.236 0.294 0.241 0.022 0.021 0.020
## Hu 0.071
## Hc 0.931 0.764 1.218 0.085
##
## Hypotheses:
## H1: standlrt=avslrt=0
## H2: standlrt>0&avslrt>0
##
## Note: BF.u denotes the Bayes factor of the hypothesis at hand versus the unconstrained hypothesis Hu. BF.c denotes the Bayes factor of the hypothesis at hand versus its complement. PMPa contains the posterior model probabilities of the hypotheses specified. PMPb adds Hu, the unconstrained hypothesis. PMPc adds Hc, the complement of the union of the hypotheses specified.
## [1] 53.19949
## [1] 0.05263158
We obtain a \(BF_{0u} = 12.8\) and a
\(BF_{iu} = 0.3\), which translates to
a \(BF_{0i}= 53.2\). Thus, we can say
that the evidence in the data is 53 times in favour of \(H_0\) (i.e., that the effects of both
standlrt
and avslrt
are zero) against \(H_i\) (i.e., that the effects of both
standlrt
and avslrt
are larger than zero).
Example 5: Model that includes a cross-level interaction
Finally, using the data set with the simulated outcome with no effect
of the predictors (used in the preceding example), we estimate a model
that includes a cross-level interaction between standlrt
and avslrt
.
model.5 <- lmer(normexam ~ standlrt + avslrt + standlrt:avslrt + (standlrt | school), REML = FALSE, data = tutorial.3)
summ(model.5)
## MODEL INFO:
## Observations: 4059
## Dependent Variable: normexam
## Type: Mixed effects linear regression
##
## MODEL FIT:
## AIC = 8933.29, BIC = 8983.76
## Pseudo-R² (fixed effects) = 0.00
## Pseudo-R² (total) = 0.14
##
## FIXED EFFECTS:
## ------------------------------------------------------------
## Est. S.E. t val. d.f. p
## --------------------- ------- ------ -------- ------- ------
## (Intercept) -0.01 0.04 -0.32 66.97 0.75
## standlrt -0.00 0.02 -0.18 63.31 0.86
## avslrt -0.04 0.11 -0.33 71.86 0.74
## standlrt:avslrt 0.10 0.05 1.87 69.68 0.07
## ------------------------------------------------------------
##
## p values calculated using Kenward-Roger standard errors and d.f.
##
## RANDOM EFFECTS:
## ------------------------------------
## Group Parameter Std. Dev.
## ---------- ------------- -----------
## school (Intercept) 0.26
## school standlrt 0.10
## Residual 0.71
## ------------------------------------
##
## Grouping variables:
## --------------------------
## Group # groups ICC
## -------- ---------- ------
## school 65 0.12
## --------------------------
Based on the estimate and its standard error, the cross-level interaction effect is larger than the fixed effects of the two predictors, which leads us to believe that the cross-level interaction is different from zero (it’s also “almost significant” based on its p-value).
In order to formally test this using our proposed framework, we include a second null hypothesis:
\[H_{0_{2}}:\beta_1 = \beta_{1,2} = \beta_{int} = 0\],
where \(\beta_{int}\) is the regression coefficient for the interaction term \(\text{standlrt}_{ij}*\text{avslrt}_j\).
We keep the other two hypotheses used in Examples 3 and 4 i.e., \(H_{0_{1}}: \beta_1 = \beta_{1,2} = 0\) and \(H_i:\beta_1 > 0 \; \&\; \beta_{1,2} > 0\).
hypotheses <- "standlrt = avslrt =0;
standlrt = avslrt = standlrt:avslrt = 0;
standlrt > 0 & avslrt > 0 & standlrt:avslrt > 0"
BFs.5 <- bain_2lmer(model.5, hypotheses, standardize = F, N = "ICC_effective", seed = 123, jref = TRUE)
print(BFs.5)
## Bayesian informative hypothesis testing for an object of class numeric:
##
## Fit Com BF.u BF.c PMPa PMPb PMPc
## H1 79.317 12.052 6.581 6.581 0.682 0.618 0.618
## H2 71.897 34.967 2.056 2.056 0.213 0.193 0.193
## H3 0.145 0.144 1.009 1.010 0.105 0.095 0.095
## Hu 0.094
## Hc 0.855 0.856 0.999 0.094
##
## Hypotheses:
## H1: standlrt=avslrt=0
## H2: standlrt=avslrt=standlrt:avslrt=0
## H3: standlrt>0&avslrt>0&standlrt:avslrt>0
##
## Note: BF.u denotes the Bayes factor of the hypothesis at hand versus the unconstrained hypothesis Hu. BF.c denotes the Bayes factor of the hypothesis at hand versus its complement. PMPa contains the posterior model probabilities of the hypotheses specified. PMPb adds Hu, the unconstrained hypothesis. PMPc adds Hc, the complement of the union of the hypotheses specified.
## [1] 0.1404422
## [1] 3.200749
We obtain the following results: \(BF_{0_{1}u}\) = 6.6, \(BF_{0_{2}u}\) = 2.1 and \(BF_{iu}\) = 1. Which yields a \(BF_{0_{1} 0_{2}}\) = 3.2. Since now the
fraction b is equal to 0.14, it follows that it should be
explicitly stated that the BFs given by bain
, in this case,
are treated as information criteria that are inspired by the BF.
However, for all practical purposes, the interpretation of the values
for the BFs (as support in the data for one of the hypotheses against
the other) remains the same.
Thus, we say that, based on the information criterion inspired by the BF, the evidence in the data is 3.2 times in favour of \(H_{0_{1}}\) against \(H_{0_{2}}\). This means that there is support in the data for the null hypothesis that only the fixed effects are equal to zero, whereas the interaction effect is larger than zero.