Fixed effects vs. random effects for web browsing data – a simulation
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
When you work with trace data — data that emerge when people interact with technology — you will notice that such data often have properties that open up questions about statistical modelling. I currently work with browsing records, obtained at several times from the same users (i.e., a panel data set). A first typical characteristic of such data: Browsing behaviors are skewed. If you’re interested in people reading politically extreme web sites, you will find a few people doing it a lot, and most people doing little to none.
A second characteristic relates to the panel nature of the data. If you’re looking at people’s visits to online shops, they do not change their habits much over time — at least these within-person differences are not as great as the differences across people. Panel data lend itself to hierarchical modelling, for example with (1) fixed effects (FE) or (2) random-effects multilevel modelling (RE). For commonalities and differences between these two modelling approaches, see for example Gelman and Hill (2006).[1] The amount of within-person variation relative to between-person variation has important implications for these two approaches.
Below, I simulate the performance of FE vs. RE models, with these data characteristics in mind. I am mainly interested in the statistical power of each model, although my code can be easily adapted to examine, for example, bias and the RSME. The code builds on two sources: Conceptually, on the paper “Should I Use Fixed or Random Effects?” by Clark and Linzer (2015)[2]. In terms of code architecture, on the R package simhelpers
, and its documentation.[3]
I consider a model with independent variable xᵢ and dependent variable yᵢ, for which we have i = 1, … m observations grouped in j = 1, … n units. Importantly, we model theunit effect (or unit intercept) aⱼ, which intuitively can be understood as explaining everything about y that cannot be explained by xᵢ at the unit level. The model formally: yᵢ = aⱼ+ bxᵢ + eᵢ. To understand how the unit effects are estimated differently by FE and RE, I again recommend Gelman and Hill (2007). One key difference between RE and FE is that for the former to be unbiased, it requires a zero correlation between unit effects aⱼ and xᵢ.
Let’s start with the required packages.
library(tidyverse) library(broomExtra) library(mnonr) library(MASS) library(plm) library(lme4) library(lmerTest) library(simhelpers)
In the simulation, I assume a small true effect b that is kept constant across simulations. I vary the following four parameters: (1) the number of units (500, 1000 or 2000); (2) the number of observations per unit (2, 3 or 5); (3) the true correlation between unit effects aⱼ and xᵢ (0 or 0.8); (4) the within-unit standard deviation in xᵢ (between 0.1 and 1, while the standard deviation of xᵢ between units is fixed at 1).
design_factors <- list( n = c(500, 1000, 2000), n_perunit = c(2, 3, 5), truecor = c(0, 0.8), b_1 = c(0.2), x_sd = seq(0.1, 1, 0.1))
I start by defining a data-generating function. It first creates a unit-level average of xᵢ, as well as the unit effects aⱼ. To account for the skewness of browsing data, I draw the unit-level averages of xᵢ from a skewed normal distribution (note that the values of 3 for skew and 33 for kurtosis are motivated by actual browsing data). I then create values of yᵢ according to the model outlined above, adding the normally distributed error eᵢ.
generate_dat <- function(n, n_perunit, truecor, b_1, x_sd) { # Generate unit effects and unit-level means v <- unonr(n, c(0, 0), matrix(c(1, truecor, truecor, 1), 2, 2), skewness = c(0, 3), kurtosis = c(0, 33)) b_0 <- v[, 1] xmean <- v[, 2] # Generate observations per unit dat <- NULL for (unit in 1:n) { x <- rnorm(n_perunit, xmean[unit], x_sd) y <- b_0[unit] + b_1*x + rnorm(n_perunit, 0, 1) dat <- rbind(dat, cbind(x, y, unit = unit)) } return(dat) }
Next, I define a function that estimates the models , and one that pulls the estimate, standard error and p-value for b. Note that I also estimate a standard “pooled” model for reference.
pull_stats <- function(model, method) { estimate <- broomExtra::tidy(model) %>% filter(term == “x”) %>% pull(estimate) p.value <- broomExtra::tidy(model) %>% filter(term == “x”) %>% pull(p.value) std.error <- broomExtra::tidy(model) %>% filter(term == “x”) %>% pull(std.error) return(tibble(method = method, est = estimate, se = std.error, p = p.value)) } estimate <- function(dat){ # Pooled model reg_pool <- lm(y ~ x, data = dat) # Random-effects model reg_fe <- plm(y ~ x, data = dat, index = “unit”, model = “within”) # Fixed-effects model reg_re <- lmerTest::lmer(y ~ x + (1|unit), data = dat) results <- bind_rows( pull_stats(reg_pool, "pooled"), pull_stats(reg_fe, “FE”), pull_stats(reg_re, “RE”)) return(results) }
Next, the function driving the simulation:
run_sim <- function(iterations, n, n_perunit, truecor, b_1, x_sd, seed = NULL) { if (!is.null(seed)) set.seed(seed) results <- rerun(iterations, { dat <- generate_dat(n, n_perunit, truecor, b_1, x_sd) estimate(dat)}) %>% bind_rows() }
At last, run the simulation:
set.seed(20220516) params <- cross_df(design_factors) %>% mutate( iterations = 1000, seed = round(runif(1) * 2³⁰) + 1:n() ) system.time( results <- params %>% mutate( res = pmap(., .f = run_sim) ) %>% unnest(cols = c(res)) )
Now you can inspect and plot the results.
power <- results %>% group_by(method, n, n_perunit, truecor, x_sd, b_1) %>% do(calc_rejection(., p_values = p)) power %>% as.data.frame() %>% filter(truecor == 0) %>% rename(“Number of units” = n, “Observations per unit” = n_perunit) %>% ggplot(aes(x = x_sd, y = rej_rate, group = method, color = method)) + geom_smooth(se = FALSE) + scale_y_continuous(name = “Rejection rate”) + scale_x_continuous(name = “Within-unit SD”) + scale_color_discrete(name = “Method”) + theme_light() + theme(legend.position = “bottom”) + facet_wrap(~ `Number of units` + `Observations per unit`, labeller = label_both)
The plot below shows that, as intuition would tell us, when the within-unit variation is small, FE models have a hard time detecting a true effect, especially when the number of units, and the number of observations per unit, are small.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.