# Reference code and tutorials

This is a quick reference guide to fit and interpret animal models.

Animal models can be fitted by a variety of software. Currently we present only workflows using the statistical software R using a few specialised packages widely used by the community of wild animal quantitative geneticists: MCMCglmm, ASReml-R, brms, Stan.

If you are completely new to quantitative genetics or animal models we strongly recommend you start by reading some background information about the goals and basic principles of the framework before going through this documentation. See for instance the freely available ecologists guide to the animal model, published in the *Journal of Animal Ecology* and available here. Then start with Simple univariate animal model and walk your way through the different topics in order.

If you are not completely new, look for topics of interests in the menu on the left-hand side.

## Where should I go next?

# 1 - Getting Started

Installing R, packages, loading test data

We will need R, some packages and some data. Here we just make sure you have all you need to get started on the Wild Animal Model adventure. RStudio is not necessary but we may assume you use some of its functionalities.

## Prerequisites

To install R, see https://cran.rstudio.com/index.html.

If you do not know R at all, a good place to get started may be the Carpentries courses. For instance the Data Carpentries on data analysis for ecology in R: https://datacarpentry.org/R-ecology-lesson/
You can follow the content on your own or see if you can attend or request a course in your area (https://carpentries.org/workshops/)

### Install the R-packages you want to use

You can fit most models with any of the packages we present. No need to install all the packages, just pick the one (or few) you want to try. For now we will favour *MCMCglmm*, which is easy to install with a single command:

```
install.packages("MCMCglmm")
```

Same for brms:

Unlike the other packages, ASREML-R relies on a non-free software (ASREML). You will need a license to use it (many universities or research centers have one). If you want to work with ASREML-R see at https://vsni.co.uk/software/asreml-r.

Check also asremlPlus for extra features (https://cran.r-project.org/web/packages/asremlPlus/index.html)

# 1.1 - Loading data

Download data to use in the first examples.

For this tutorial we will use the simulated gryphon dataset (download gryphon data as zip file).

# 2 - Simple univariate animal model

Fitting a simple univariate model in R.

Here we demonstrate how to fit the simplest univariate animal model to calculate the additive genetic variance component of a phenotypic variable, for example body size. In later sections we show how to build more complex models with fixed or random effects and how to compute heritability.

is the simplest of all animal models and assumes no repeated measures across years or individuals.

# Example data

For this tutorial we will use the simulated gryphon dataset (download zip file).

```
phenotypicdata <- read.csv("data/gryphon.csv")
pedigreedata <- read.csv("data/gryphonped.csv")
```

# The univariate animal model

**Definitions**

Sample code in the following examples includes the following variables:

- Response variable: Size
- Fixed effects: Intercept (\(\mu\))
- Random effects: Additive genetic variance
- Data containing phenotypic information: phenotypicdata
- Data containing pedigree data:pedigreedata

We first fit the simplest possible animal model: no fixed effects apart from the interecept, a single random effect (the breeding values, associated with the additive genetic variance), and Gaussian redisuals.

Here it is not too difficult to describe the model in plain English, but for more complex models a mathematical equation may prove easier, much shorter and less ambiguous. Here we could write the model as (y_i = \mu + a_i + \epsilon_i), where $y_i $ is the response for individual (i), with the residuals assumed to follow a normal distribution of mean zero and variance (\sigma_R^2), which we can write as (\mathbf{\epsilon} \sim N(0,\mathbf{I}\sigma_R^2)), where (\mathbf{I}) is an identity matrix; and with the breeding values (\mathbf{a} \sim N(0,\mathbf{A} \sigma_A^2), where (\mathbf{A}) is the pairwise additive genetic relatedness matrix.

## Using MCMCglmm

Here is the simplest implementation of an animal model in MCMCglmm:

```
library(MCMCglmm) #load the package
```inverseAmatrix <- inverseA(pedigree = pedigreedata)$Ainv # compute the inverse relatedness matrix

```
model1.1<-MCMCglmm(birth_weight~1, #Response and Fixed effect formula
random=~id, # Random effect formula
ginverse = list(id=inverseAmatrix), # correlations among random effect levels (here breeding values)
data=phenotypicdata)# data set
```

Note the use of the argument `ginverse`

`summary(model1.1)`

```
##
## Iterations = 3001:12991
## Thinning interval = 10
## Sample size = 1000
##
## DIC: 3914.063
##
## G-structure: ~id
##
## post.mean l-95% CI u-95% CI eff.samp
## id 3.406 2.34 4.71 177.8
##
## R-structure: ~units
##
## post.mean l-95% CI u-95% CI eff.samp
## units 3.845 2.916 4.9 195.4
##
## Location effects: birth_weight ~ 1
##
## post.mean l-95% CI u-95% CI eff.samp pMCMC
## (Intercept) 7.592 7.322 7.889 865.2 <0.001 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

## Using ASReml

From R:

```
model1<-asreml(fixed=SIZE~ 1 #1
, random= ~ped(ANIMAL,var=T,init=1) #2
, data=phenotypicdata #3
,ginverse=list(ANIMAL=pedigreedata_inverse) #4
, na.method.X="omit", na.method.Y="omit") #5
```

# 2.1 - Adding fixed effects

A short lead description about this content page. It can be **bold** or *italic* and can be split over multiple paragraphs.

# Adding fixed effects to a model

## Using MCMCglmm

```
model1.2<-MCMCglmm(BWT~SEX,random=~animal,
pedigree=Ped,data=Data,prior=prior1.1,
nitt=65000,thin=50,burnin=15000,verbose=FALSE)
posterior.mode(model1.2$Sol[,"SEX2"])
HPDinterval(model1.2$Sol[,"SEX2"],0.95)
posterior.mode(model1.2$VCV)
posterior.heritability1.2<-model1.2$VCV[,"animal"]/
(model1.2$VCV[,"animal"]+model1.2$VCV[,"units"])
posterior.mode(posterior.heritability1.2)
HPDinterval(posterior.heritability1.2,0.95)
```

## Using ASReml

```
ASReml analysis of size
ANIMAL !P
SIZE
SEX !A #denotes a factor
AGE #here treated as a linear effect
pedigreedata.ped !skip 1
phenotypicdata.dat !skip 1 !DDF 1 !FCON #specifies method of sig testing
SIZE ~ mu AGE SEX AGE.SEX !r ANIMAL
```

# 2.2 - Calculating heritability

A short lead description about this content page. It can be **bold** or *italic* and can be split over multiple paragraphs.

# Calculating heritability

## Using MCMCglmm

```
posterior.heritability1.1<-model1.1$VCV[,"animal"]/
(model1.1$VCV[,"animal"]+model1.1$VCV[,"units"])
HPDinterval(posterior.heritability1.1,0.95)
posterior.mode(posterior.heritability1.1)
plot(posterior.heritability1.1)
```

## Using ASReml

In ASReml standalone:
In ASReml a second command file (with extension .pin) is used to caculate functions of estimated variance components ad their associated standard errors. So for a model in the .as file such as

`SIZE ~ mu ! ANIMAL`

the primary output file (.asr) will contain two variance components. The first will be the ANIMAL (i.e. additive genetic component), the second will be the residual variance. A .pin file to calculate heritability from these components migt be

```
F VP 1+2 #adds components 1 and 2 to make a 3rd variance denoted VP
H h2 1 3 #divides 1 (VA) by 3 (VP) to calculate h2
```

NOTE - if you change the random effects stucture of your model in .as you need to modify the .pin file accordingly or you will get the wrong answer!

From R:

`summary(model)$varcomp[1,3]/sum(summary(model)$varcomp[,3])`

#1: SIZE is the response variable and the only fixed effect is the mean(denoted as1)
#2: fit random effect of ANIMAL Va with an arbitrary starting value of 1
#3: use data file phenotypic data
#4: connect the individual in the data file to the pedigree
#5: omit any rows where the response or predictor variables are missing

to see the estimates of the fixed effects:

`summary(model)$coef.fixed`

and the estimates of the random effects:

`summary(model)$varcomp`

# 2.3 - Testing random effects

A short lead description about this content page. It can be **bold** or *italic* and can be split over multiple paragraphs.

# Testing significance of random effects

## Using MCMCglmm

MCMCglmm, and more in general Bayesian methods, do not provide a simple, consensual way to test the statistical significance of a variance parameters. Indeed, variances parameters are constrained to be positive, and their credible intervals (e.g., as returned by HPDinterval()) cannot include exactly zero (although it may look like it due to rounding. Covariance and correlation parameters do not have this issue because they are not constrained to be positive and their credible interval can be used to estimate the probability that they are positive or negative. The old WAMBAM website recommended to compare DIC (Deviance Information Criterion, analog to AIC) across models with and without a random effect. However, DIC may be focused at different levels of a mixed model, and is calculated for the lowest level of the hierarchy in MCMCglmm, which may not be appropriate for comparing different random effect structures.

## Using ASReml

In ASReml statistical the significance of a variance parameter can be tested using a Likelihood Ratio Test. Fit a model with and without a particular random effect. Then use log likelihoods reported in the primary results file to perform a ratio test.

From R:

```
model1<-asreml(fixed=SIZE~1+SEX
,random=~ped(ANIMAL,var=T,init=1)+RANDOMEFFECT
,data=phenotypicdata
,ginverse=list(ANIMAL=ainv), na.method.X="omit', na.method.Y="omit')
model2<-asreml(fixed=SIZE~1+SEX
,random=~ped(ANIMAL,var=T,init=1)
,data=phenotypicdata
,ginverse=list(ANIMAL=ainv), na.method.X="omit', na.method.Y="omit')
#calculate the chi-squared stat for the log-likelihood ratio test
2*(model1$loglik-model2$loglik)
#calculate the associated significance
1-pchisq(2*(model1$loglik-model2$loglik),df=1)
```

However, this test is conservative with 1 degree of freedom. Using df=0.5 gives a better (but still a bit conservative) test.

# 2.4 - Testing random effects

A short lead description about this content page. It can be **bold** or *italic* and can be split over multiple paragraphs.

# Testing significance of random effects

## Using MCMCglmm

MCMCglmm, and more in general Bayesian methods, do not provide a simple, consensual way to test the statistical significance of a variance parameters. Indeed, variances parameters are constrained to be positive, and their credible intervals (e.g., as returned by HPDinterval()) cannot include exactly zero (although it may look like it due to rounding. Covariance and correlation parameters do not have this issue because they are not constrained to be positive and their credible interval can be used to estimate the probability that they are positive or negative.
The old WAMBAM website recommended to compare DIC (Deviance Information Criterion, analog to AIC) across models with and without a random effect. However, DIC may be focused at different levels of a mixed model, and is calculated for the lowest level of the hierarchy in MCMCglmm, which may not be appropriate for comparing different random effect structures.

## Using ASReml

In ASReml statistical the significance of a variance parameter can be tested using a Likelihood Ratio Test. Fit a model with and without a particular random effect. Then use log likelihoods reported in the primary results file to perform a ratio test.

From R:

```
model1<-asreml(fixed=SIZE~1+SEX
,random=~ped(ANIMAL,var=T,init=1)+RANDOMEFFECT
,data=phenotypicdata
,ginverse=list(ANIMAL=ainv), na.method.X="omit', na.method.Y="omit')
model2<-asreml(fixed=SIZE~1+SEX
,random=~ped(ANIMAL,var=T,init=1)
,data=phenotypicdata
,ginverse=list(ANIMAL=ainv), na.method.X="omit', na.method.Y="omit')
#calculate the chi-squared stat for the log-likelihood ratio test
2*(model1$loglik-model2$loglik)
#calculate the associated significance
1-pchisq(2*(model1$loglik-model2$loglik),df=1)
```

However, this test is conservative with 1 degree of freedom. Using df=0.5 gives a better (but still a bit conservative) test.

# 3 - Self-contained tutorials

Collection of self-contained tutorials developped for classes and workshops.

This is a collection of self-contained tutorials developped for classes and workshops