# Adding fixed and random effects

Here we will see how to add fixed effects and random effects to our linear animal model. We will also see how the addition of these effects can impact the calculation of heritability.

We still use the gryphon dataset with `birth_weight`

as the response, and MCMCglmm.

```
phenotypicdata <- read.csv("data/gryphon.csv")
pedigreedata <- read.csv("data/gryphonped.csv")
```

```
library(MCMCglmm)
```

```
inverseAmatrix <- inverseA(pedigree = pedigreedata)$Ainv
```

Previously we run the simple model `model1.2`

(not run again here):

```
model1.2 <- MCMCglmm(birth_weight ~ 1, #Response and Fixed effect formula
random = ~id, # Random effect formula
ginverse = list(id = inverseAmatrix), # correlations among random effect levels
data = phenotypicdata, # data set
burnin = 10000, nitt = 30000, thin = 20) # run the model for longer compare to the default
```

## Adding fixed effects

We will start by adding sex as a fixed effect. Why? Because sexes appear to be quite different in term of birth weight and our `model1.2`

does not capture those differences. A model that accounts for important structures in the response variables will fit the data better and will tend to provide empirical parameter estimates that are more accurate, precise and better correspond to the conceptual parameter of interest. The addition of any fixed effect must be motivated by the scientific question or the need to control structures in the data though, think about it carefully.

We can check that sexes are quite different in birth weight visually:

```
library(ggplot2)
ggplot(phenotypicdata, aes(x=as.factor(sex), y=birth_weight)) + geom_boxplot()
```

```
## Warning: Removed 230 rows containing non-finite outside the scale range
## (`stat_boxplot()`).
```

Sex is coded with values 1 and 2. It can be simpler to read model outputs if sex is coded with values 0 and 1, or with explicit labels. We will use `"M"`

and `"F"`

and store the new encoding as the column `sexMF`

:

```
phenotypicdata$sexMF <- ifelse(phenotypicdata$sex==1, "M", "F")
```

To add `sexMF`

as a fixed effect to the model we add the column name in the line which contains the response variable, on the right-hand side of the tilde (`~`

). Different fixed effects can be added using `+`

signs. The `1`

that was already there stands for the intercept.

```
model1.3 <- MCMCglmm(birth_weight ~ 1 + sexMF, #Response and Fixed effect formula
random = ~id,
ginverse = list(id = inverseAmatrix),
data = phenotypicdata,
burnin = 10000, nitt = 30000, thin = 20)
```

```
summary(model1.3)
```

```
##
## Iterations = 10001:29981
## Thinning interval = 20
## Sample size = 1000
##
## DIC: 3714.216
##
## G-structure: ~id
##
## post.mean l-95% CI u-95% CI eff.samp
## id 3.102 2.226 4.202 447.9
##
## R-structure: ~units
##
## post.mean l-95% CI u-95% CI eff.samp
## units 2.936 2.193 3.748 469.8
##
## Location effects: birth_weight ~ 1 + sexMF
##
## post.mean l-95% CI u-95% CI eff.samp pMCMC
## (Intercept) 8.269 7.978 8.529 1000.0 <0.001 ***
## sexMFM -2.211 -2.508 -1.891 921.4 <0.001 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

The model summary shows a posterior mean difference of -2.211 of males compared to females. You know the difference is about males because the name of the parameter in the summary ends with “M”. Females are therefore the reference in this model.

Fixed effects are stored in the `$Sol`

element of MCMCglmm models.
We can visualise the trace and posterior distribution of the difference:

```
plot(model1.3$Sol[,"sexMFM"])
```

And we can compute the usual summary of the posterior distribution of the sex difference:

```
median(model1.3$Sol[,"sexMFM"])
```

```
## [1] -2.219992
```

```
posterior.mode(model1.3$Sol[,"sexMFM"])
```

```
## var1
## -2.242623
```

```
HPDinterval(model1.3$Sol[,"sexMFM"])
```

```
## lower upper
## var1 -2.507572 -1.891486
## attr(,"Probability")
## [1] 0.95
```

### “Testing” fixed effects

We can also calculate posterior probabilities that are analogue to p-values using the proportion of posterior samples that are beyond a certain value. Thus, the posterior probability that the difference is not negative is:

```
mean( model1.3$Sol[,"sexMFM"] >= 0 )
```

```
## [1] 0
```

We will generally want to double that value to make it analogue to a two-sided p-value:

```
2 * mean( model1.3$Sol[,"sexMFM"] >= 0 )
```

```
## [1] 0
```

Here there are no samples above zero, so twice the posterior probability is estimated to exactly 0. Since we had only 1000 samples we are actually unable to estimate a probability less that 0.001. That is why we will want to report a posterior probability $p_{MCMC}$ of `<0.001`

. That is what is reported in the summary of the model.

Note that you can compute posterior probabilities testing the parameter estimate against arbitrary values. For instance, the $p_{MCMC}$ that the sex difference is above $-2$ is:

```
2 * mean( model1.3$Sol[,"sexMFM"] >= -2 )
```

```
## [1] 0.188
```

and the $p_{MCMC}$ that the sex difference is below $-2.5$ is:

```
2 * mean( model1.3$Sol[,"sexMFM"] <= -2.5 )
```

```
## [1] 0.064
```

### Adding a continuous fixed effect

There are not many variables to experiment with in this dataset, but let’s say we want to include cohort as a fixed effect, for instance because we want to account for a linear change in the response through time.

Let’s just re-scale cohort to avoid changing the intercept and perhaps to help a bit the estimation algorithm (it is often easier to have fixed effects on similar scales) and interpretation (it may be easier to see what is a big or small effect when fixed effects are standardized, although it is by no mean necessary).

```
model1.4 <- MCMCglmm(birth_weight ~ 1 + sexMF + scale(cohort), #Response and Fixed effect formula
random = ~id,
ginverse = list(id = inverseAmatrix),
data = phenotypicdata,
burnin = 10000, nitt = 30000, thin = 20)
```

```
summary(model1.4)
```

```
##
## Iterations = 10001:29981
## Thinning interval = 20
## Sample size = 1000
##
## DIC: 3711.059
##
## G-structure: ~id
##
## post.mean l-95% CI u-95% CI eff.samp
## id 3.101 2.068 4.004 347.7
##
## R-structure: ~units
##
## post.mean l-95% CI u-95% CI eff.samp
## units 2.912 2.146 3.701 427.4
##
## Location effects: birth_weight ~ 1 + sexMF + scale(cohort)
##
## post.mean l-95% CI u-95% CI eff.samp pMCMC
## (Intercept) 8.284178 8.034420 8.564222 1125 <0.001 ***
## sexMFM -2.226082 -2.540083 -1.928606 1000 <0.001 ***
## scale(cohort) -0.180259 -0.360857 0.005216 1000 0.052 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

As previously we can plot the trace and posterior distribution of the effect of cohort, compute its posterior mean, median, mode, credible interval, and $p_{MCMC}$

```
plot(model1.4$Sol[,"scale(cohort)"])
```

```
mean(model1.4$Sol[,"scale(cohort)"])
```

```
## [1] -0.1802587
```

```
median(model1.4$Sol[,"scale(cohort)"])
```

```
## [1] -0.1803727
```

```
posterior.mode(model1.4$Sol[,"scale(cohort)"])
```

```
## var1
## -0.1720475
```

```
HPDinterval(model1.4$Sol[,"scale(cohort)"])
```

```
## lower upper
## var1 -0.3608569 0.005216475
## attr(,"Probability")
## [1] 0.95
```

```
2*mean(model1.4$Sol[,"scale(cohort)"]>0)
```

```
## [1] 0.052
```

## Adding random effects

We will often want to include random effects in addition to the additive genetic variance random effect. In particular it is an important way to control for non-genetic sources of similarity between relatives which may inflate the estimates of additive genetic variance and heritability.

For instance, it is very common to include the mother identity as a random effect, as individuals born from the same mother may share an environment in early life (e.g., nest or food provisioning).

To include mother as a random effect we write the variable name in the argument `random`

, after `ìd +`

. We do not need to change the argument `ginverse`

if we assume that the levels of the additional random effect are uncorrelated (this is often not strictly true, and it can be interesting to model the genetic correlations between maternal levels, but this is a more advanced topic.)

```
model1.5 <- MCMCglmm(birth_weight ~ 1 + sexMF + scale(cohort),
random = ~id + mother, # Random effect formula
ginverse = list(id = inverseAmatrix),
data = phenotypicdata,
burnin = 10000, nitt = 30000, thin = 20)
```

```
summary(model1.5)
```

```
##
## Iterations = 10001:29981
## Thinning interval = 20
## Sample size = 1000
##
## DIC: 3568.019
##
## G-structure: ~id
##
## post.mean l-95% CI u-95% CI eff.samp
## id 2.671 1.594 3.921 189.5
##
## ~mother
##
## post.mean l-95% CI u-95% CI eff.samp
## mother 1.071 0.5586 1.589 563.9
##
## R-structure: ~units
##
## post.mean l-95% CI u-95% CI eff.samp
## units 2.306 1.375 3.174 201.3
##
## Location effects: birth_weight ~ 1 + sexMF + scale(cohort)
##
## post.mean l-95% CI u-95% CI eff.samp pMCMC
## (Intercept) 8.319287 8.034262 8.582700 1000 <0.001 ***
## sexMFM -2.233155 -2.542082 -1.917310 1000 <0.001 ***
## scale(cohort) -0.181168 -0.375359 -0.004372 1000 0.05 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

Posterior distributions of random effect variance parameters are stored in the `$VCV`

element of MCMCglmm models.

We can visualise the trace and posterior distribution of the variance associated with mother identity as:

```
plot(model1.5$VCV[,"mother"])
```

And we can compute summaries of the posterior distribution:

```
mean(model1.5$VCV[,"mother"])
```

```
## [1] 1.071025
```

```
median(model1.5$VCV[,"mother"])
```

```
## [1] 1.059679
```

```
posterior.mode(model1.5$VCV[,"mother"])
```

```
## var1
## 0.981803
```

```
HPDinterval(model1.5$VCV[,"mother"])
```

```
## lower upper
## var1 0.5586244 1.589435
## attr(,"Probability")
## [1] 0.95
```

Note that it is not really meaningful to compute $p_MCMC$ to test a variance in greater than zero, as the parameter is constrained to be greater than zero, and the probability will always be 1. It is possible to compute $p_MCMC$ for other values though, for instance, we can estimate the probability that the variance associated with mother identity is greater than 1.5:

```
2*mean(model1.5$VCV[,"mother"] >= 1.5 )
```

```
## [1] 0.14
```

or we can estimate if the variance associated with mother identity is less than the additive genetic variance:

```
2*mean(model1.5$VCV[,"mother"] > model1.5$VCV[,"id"] )
```

```
## [1] 0.022
```

(here the additive genetic variance is almost certainly greater than the mother variance.)

We can add a third random effect, for instance cohort (yes, we can include cohort both as fixed and random effect):

```
model1.6 <- MCMCglmm(birth_weight ~ 1 + sexMF + scale(cohort),
random = ~id + mother + cohort, # Random effect formula
ginverse = list(id = inverseAmatrix),
data = phenotypicdata,
burnin = 10000, nitt = 30000, thin = 20)
```

## Computing heritability without accounting for fixed effects

We could compute heritability as the ratio of additive genetic variance over the sum of random effect and residual variance. In the case of model1.6 that would be $h^2 = V_A / (V_A + V_M + V_C + V_R) $

```
h2_nofixef <- model1.6$VCV[,"id"] / (model1.6$VCV[,"id"] + model1.6$VCV[,"mother"] + model1.6$VCV[,"cohort"] + model1.6$VCV[,"units"])
```

```
median(h2_nofixef)
```

```
## [1] 0.386974
```

```
HPDinterval(h2_nofixef)
```

```
## lower upper
## var1 0.2455021 0.5369925
## attr(,"Probability")
## [1] 0.95
```

Note that the estimate of heritability is a bit less than what we estimated from model1.2. This is likely because the random effect `mother`

corrects for some similarity between siblings that was conflated with additive genetic variance in the that did not account for mother identity.

However, this calculation of heritability may not be satisfactory as it leaves out some variance that is accounted in the fixed effects.

## Computing heritability while accounting for fixed effects

Often we will want to include the variance accounted in the fixed effect back into the computation of heritability. But this is a complex issue and depending on data and scientific questions it may be best to either exclude fixed effects, include all fixed effects, or include some fixed effects but not others. It may even be useful to exclude some random effects (for instance if a random effect structurally captures measurement error.)

The variance due to fixed effect can be computed as the variance in model predictions (without accounting for random effects). Those predictions can be computed as the matrix product of predictors by parameter estimates. In MCMCglmm it can be done like this:

```
predictions1.6 <- model1.6$X %*% t(model1.6$Sol)
```

This object contains one row for each data point and one column for each posterior sample.

```
dim(predictions1.6)
```

```
## [1] 1084 1000
```

For each posterior sample we can compute the variance:

```
fixef_variance <- apply(predictions1.6, MARGIN = 2, var)
```

We can then plug it in the calculation of heritability:

```
h2_fixef <- model1.6$VCV[,"id"] / (model1.6$VCV[,"id"] + model1.6$VCV[,"mother"] + model1.6$VCV[,"cohort"] + model1.6$VCV[,"units"] + fixef_variance)
```

```
median(h2_fixef)
```

```
## [1] 0.3300817
```

## Feedback

Was this page helpful?

Glad to hear it! Please tell us how we can improve.

Sorry to hear that. Please tell us how we can improve.