Mixed Effects Models 2: Crossed vs. Nested Random Effects

Why do we need it? What are the benefits?

If we don’t account for repeated measurements, pseudoreplication inflates our sample size, and we will most likely get significant results that don’t make any sense. However, if we account for repeated measurements in the wrong way, we’ll answer some interesting, but also wrong, questions.

What’s the difference?

Random effects (factors) can be crossed or nested - it depends on the relationship between random-effect variables and your research question.

Nested random effects

Think of Russian nesting dolls, or of the hierarchy of management levels within a company. That is the reason these models are sometimes called hierarchical and sometimes multilevel linear models. Every doll can only be nested within one bigger doll. Similarly, every worker belongs to one particular department of a company and has only one boss, and all the department bosses also have one (big) boss. In other words, a random effect is nested if each of its levels appears within only one level of the factor above it in the hierarchy. In terms of data, a level is simply a category (group) of a categorical variable. Thus, one observation (data row) belongs to exactly one category of each grouping variable (data column), and each category of the lower variable occurs together with only one category of the higher variable.

Imagine two patients who always go to the same doctor, and two other patients who always go to another doctor:


library(tibble)

nested <- tibble(
  patient = c(rep(1,4), rep(2,4), rep(3,4), rep(4,4)),
  doctor  = c(rep("Bob", 8), rep("Mik", 8))
)
# cross table: rows are patients, columns are doctors
table(nested$patient, nested$doctor)
##     Bob Mik
##   1   4   0
##   2   4   0
##   3   0   4
##   4   0   4

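The table above shows 4 repeated measurements per patient, and every patient appears under only one doctor (the zeros in the table). Patients are therefore nested within doctors, and our nested mixed-effects model will look like this: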

lmer(response ~ predictor + (1|doctor/patient))
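The "zeros in the cross table" check can also be done programmatically. The following is a minimal sketch in base R; `is_nested()` is a hypothetical helper written for this illustration and is not part of lme4.

```r
# Hypothetical helper (not part of lme4): TRUE if every level of `inner`
# occurs within exactly one level of `outer`.
is_nested <- function(inner, outer) {
  counts <- table(inner, outer)
  all(rowSums(counts > 0) == 1)
}

# Same design as the `nested` data above
patient <- rep(1:4, each = 4)
doctor  <- rep(c("Bob", "Mik"), each = 8)

is_nested(patient, doctor)  # are patients nested within doctors?
is_nested(doctor, patient)  # are doctors nested within patients?
```

The first call returns TRUE and the second FALSE: every patient sees only one doctor, but every doctor sees several patients.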

Another example could be animals within farms, farms within regions, and regions within countries. Animals from the same country (Spain vs. Sweden), region (the mountains of Bavaria vs. the flat grasslands of Holstein in Germany), or farm (bio vs. traditional) could be similar (correlated). The random-effect structure of such a model would then look like this:

lmer(response ~ predictor + (1|country/region/farm/animal))
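The `/` in the grouping part is shorthand: `(1|a/b)` stands for `(1|a) + (1|a:b)`. As a rough illustration, base R's formula machinery expands `/` by the same rule, so `terms()` can list which grouping factors such a chain implies. This is only an analogy for reading the syntax; lme4 does its own expansion internally.

```r
# Expand the nesting operator with base R's formula rules
# (illustration only; lme4 expands the grouping term itself).
nesting <- ~ country/region/farm/animal
grouping_terms <- attr(terms(nesting), "term.labels")
grouping_terms
```

Each element corresponds to one random intercept: one per country, one per region within country, and so on down to the animal.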

Crossed random effects

The good news is that you have already seen and used crossed random effects in the previous post.

You know that every school has the same hierarchy of classes, e.g. Class 1, Class 2 etc.:

Picture originates from here

The random effects of classes within the schools are crossed, just like the arrows in the picture above.

Now, imagine the doctors started to switch shifts in the hospital, and any patient could now get to any of the doctors:

crossed <- tibble(
  patient = c(rep(1,4), rep(2,4), rep(3,4), rep(4,4)),
  doctor  = c(rep(c("Bob", "Mik"), 8))
)
# cross table: rows are patients, columns are doctors
table(crossed$patient, crossed$doctor)
##     Bob Mik
##   1   2   2
##   2   2   2
##   3   2   2
##   4   2   2

A good indicator of a crossed effect is a completely filled cross table (no zeros, unlike in the nested example). As you can see, a random effect is crossed if its levels appear in more than one level of the higher factor.

Crossed random effects can be fully crossed, as in the table above, where every lower level (e.g. patient) appears in every higher level (doctor), or partially crossed, where lower levels appear in more than one of the higher levels but not in all of them.

In this case, the crossed mixed-effects model would look like this:

lmer(response ~ predictor + (1|doctor) + (1|patient))
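The three situations described above (nested, partially crossed, fully crossed) can be told apart from the cross table. Below is a minimal sketch in base R; `classify_design()` is a hypothetical helper, and the doctor "Ann" is made up for the partially crossed case.

```r
# Hypothetical helper: classify how the levels of `lower` relate to `higher`
classify_design <- function(lower, higher) {
  filled   <- table(lower, higher) > 0   # TRUE where a combination occurs
  n_higher <- rowSums(filled)            # higher levels seen per lower level
  if (all(n_higher == 1)) {
    "nested"
  } else if (all(filled)) {
    "fully crossed"
  } else {
    "partially crossed"
  }
}

patient <- rep(1:4, each = 4)

# Each patient always sees the same doctor
design_nested  <- classify_design(patient, rep(c("Bob", "Mik"), each = 8))
# Every patient sees every doctor
design_crossed <- classify_design(patient, rep(c("Bob", "Mik"), 8))
# Every patient sees two of three doctors ("Ann" is invented here)
design_partial <- classify_design(
  rep(1:3, each = 2),
  c("Bob", "Mik", "Mik", "Ann", "Ann", "Bob")
)

c(design_nested, design_crossed, design_partial)
```

The check only describes what was recorded; whether that matches the real design is a separate question.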

The bad news is that crossed and nested effects may both be necessary in the same model, at different levels, and this can become very complex and confusing.

Crossed and Nested at the same time

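library(dplyr)

both <- nested %>% 
  mutate(season = rep(c("summer", "winter", "spring", "autumn"), 4))
# cross table: rows are patients, columns are seasons
table(both$patient, both$season)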
##     autumn spring summer winter
##   1      1      1      1      1
##   2      1      1      1      1
##   3      1      1      1      1
##   4      1      1      1      1
table(both$doctor, both$season)
##       autumn spring summer winter
##   Bob      2      2      2      2
##   Mik      2      2      2      2

You see that if we make a cross table with the season, there are no zeros: season is crossed with both patient and doctor. Therefore, we keep the nested part of the model unchanged and add a crossed part to it:

lmer(response ~ predictor + (1|doctor/patient) + (1|season))
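To see such a model run end to end, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the number of doctors and patients, the effect sizes, and the variable names. With only four seasons the season variance is estimated from very few levels, so lme4 may report a singular fit.

```r
library(lme4)

set.seed(1)
# Made-up design: 6 doctors, 5 patients per doctor,
# each patient measured once in each of 4 seasons.
d <- expand.grid(
  season  = c("summer", "winter", "spring", "autumn"),
  patient = 1:5,
  doctor  = paste0("doc", 1:6)
)
d$predictor <- rnorm(nrow(d))

# Random intercepts per doctor, per patient within doctor, and per season
doctor_eff  <- rnorm(6,  sd = 1.0)
patient_eff <- rnorm(30, sd = 0.7)
season_eff  <- rnorm(4,  sd = 0.5)

# Patient labels 1..5 are reused by every doctor, so the real patient
# is identified by the doctor:patient combination.
pat_id <- interaction(d$doctor, d$patient, drop = TRUE)

d$response <- 2 + 0.5 * d$predictor +
  doctor_eff[as.integer(d$doctor)] +
  patient_eff[as.integer(pat_id)] +
  season_eff[as.integer(d$season)] +
  rnorm(nrow(d), sd = 0.5)

# Nested part (patients within doctors) plus crossed part (season)
fit <- lmer(response ~ predictor + (1 | doctor/patient) + (1 | season),
            data = d)
VarCorr(fit)   # one variance component per grouping factor
ngrps(fit)     # number of levels per grouping factor
```

Because the patient labels are reused across doctors here, writing `(1|doctor) + (1|patient)` instead would treat "patient 1" of every doctor as the same person, which is not what this simulated design intends.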

Another example would be patients who are only allowed to go to their local hospital, making them nested: (1|hospital/patient). However, the doctors are allowed to work at several hospitals, making this part of the random-effect structure crossed: (1|hospital/patient) + (1|doctor).

Or think about school students: every student belongs to only one class and to only one school, so this effect is nested, while, as the picture above shows, every school has a “Class 1”, so this effect is crossed. This experimental design results in the following model structure, where the school intercept is already part of the nested term:

lmer(response ~ predictor + (1|school/class/student) + (1|class))


A picture from this amazing nature paper on nested designs summarizes my explanations perfectly:


The golden rule: if random effects aren’t nested, they are crossed!

It is important to know that, without knowledge of the experimental design, it might be impossible to figure out from the data alone whether the effects are nested or crossed (for example, the data could have been recorded wrongly).
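A small sketch of how this can go wrong, using made-up data in base R: if patient labels are reused within each doctor, the cross table looks fully crossed even though, by design, four different patients each see only one doctor. Only knowledge of the design tells us to build unique patient IDs.

```r
# Labels reused within each doctor: "patient 1" of Bob and "patient 1" of Mik
# are different people, but the recorded labels cannot tell us that.
doctor  <- rep(c("Bob", "Mik"), each = 8)
patient <- rep(rep(1:2, each = 4), times = 2)

table(patient, doctor)        # no zeros: looks fully crossed

# Assuming the design says patients never switch doctors,
# combine doctor and label into a unique patient ID.
patient_id <- interaction(doctor, patient, drop = TRUE)

table(patient_id, doctor)     # zeros reappear: patients nested in doctors
nlevels(patient_id)           # number of distinct patients
```

In a model, `(1|doctor/patient)` creates this doctor-by-patient combination for you, which is why the nested syntax is the safer choice when labels are reused.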


Yury Zablotski
Data Scientist at LMU Munich, Faculty of Veterinary Medicine

Passion for applying Biostatistics and Machine Learning to Life Science Data

