Simple univariate animal model

Fitting a simple univariate model in R.

Here we demonstrate how to fit the simplest univariate animal model to calculate the additive genetic variance component of a phenotypic variable, for example body size. In later sections we show how to build more complex models with fixed or random effects and how to compute heritability.

This is the simplest of all animal models and assumes no repeated measures across years or individuals.

Example data

For this tutorial we will use the simulated gryphon dataset (download zip file).

phenotypicdata <- read.csv("data/gryphon.csv")
pedigreedata <- read.csv("data/gryphonped.csv")
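
Before fitting anything, it is worth checking that both files have loaded as expected. A minimal sketch (the column names are those assumed by the examples below; your own files may differ):

str(phenotypicdata)  # one row per individual, containing the response and any covariates
str(pedigreedata)    # one row per individual: its identity followed by its parents
head(pedigreedata)   # parents of founder individuals should appear as missing (NA)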

The univariate animal model

Definitions

The sample code in the following examples uses these variables:

  • Response variable: Size
  • Fixed effects: Intercept (\(\mu\))
  • Random effects: Additive genetic variance (breeding values)
  • Data containing phenotypic information: phenotypicdata
  • Data containing pedigree information: pedigreedata

We first fit the simplest possible animal model: no fixed effects apart from the intercept, a single random effect (the breeding values, associated with the additive genetic variance), and Gaussian residuals.

Here it is not too difficult to describe the model in plain English, but for more complex models a mathematical equation may prove easier, much shorter and less ambiguous. Here we could write the model as \(y_i = \mu + a_i + \epsilon_i\), where \(y_i\) is the response for individual \(i\); the residuals are assumed to follow a normal distribution with mean zero and variance \(\sigma_R^2\), which we can write as \(\mathbf{\epsilon} \sim N(0,\mathbf{I}\sigma_R^2)\), where \(\mathbf{I}\) is an identity matrix; and the breeding values follow \(\mathbf{a} \sim N(0,\mathbf{A}\sigma_A^2)\), where \(\mathbf{A}\) is the pairwise additive genetic relatedness matrix.
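
Equivalently, stacking all observations into vectors, the same model can be written in the matrix notation commonly used for animal models (this is just a restatement of the equation above, not additional structure):

\[
\mathbf{y} = \mathbf{1}\mu + \mathbf{Z}\mathbf{a} + \mathbf{\epsilon}, \qquad \mathbf{a} \sim N(\mathbf{0},\mathbf{A}\sigma_A^2), \qquad \mathbf{\epsilon} \sim N(\mathbf{0},\mathbf{I}\sigma_R^2),
\]

where \(\mathbf{Z}\) is an incidence matrix linking each observation to the breeding value of the corresponding individual.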

Using MCMCglmm

Here is the simplest implementation of an animal model in MCMCglmm:

library(MCMCglmm) #load the package

inverseAmatrix <- inverseA(pedigree = pedigreedata)$Ainv # compute the inverse relatedness matrix

model1.1 <- MCMCglmm(birth_weight ~ 1,                    # response and fixed effect formula
                     random = ~ id,                        # random effect formula
                     ginverse = list(id = inverseAmatrix), # correlations among random effect levels (here breeding values)
                     data = phenotypicdata)                # data set

Note the use of the argument ginverse, which links the levels of the random effect id to the inverse relatedness matrix computed above, so that the breeding values are correlated according to the pedigree.

summary(model1.1)
## 
##  Iterations = 3001:12991
##  Thinning interval  = 10
##  Sample size  = 1000 
## 
##  DIC: 3914.063 
## 
##  G-structure:  ~id
## 
##    post.mean l-95% CI u-95% CI eff.samp
## id     3.406     2.34     4.71    177.8
## 
##  R-structure:  ~units
## 
##       post.mean l-95% CI u-95% CI eff.samp
## units     3.845    2.916      4.9    195.4
## 
##  Location effects: birth_weight ~ 1 
## 
##             post.mean l-95% CI u-95% CI eff.samp  pMCMC    
## (Intercept)     7.592    7.322    7.889    865.2 <0.001 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
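
The effective sample sizes (eff.samp) reported above are fairly modest for the variance components, so it is worth inspecting the chains before relying on the estimates. A minimal sketch using the fitted object (longer chains can be requested via the nitt, burnin and thin arguments, as in the next section):

plot(model1.1$VCV)          # trace and density plots of the variance components
autocorr.diag(model1.1$VCV) # autocorrelation between stored samples
effectiveSize(model1.1$VCV) # effective sample sizes (the eff.samp shown by summary())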

Using ASReml

From R:

model1<-asreml(fixed=SIZE~ 1                    #1  
  , random= ~ped(ANIMAL,var=T,init=1)           #2  
  , data=phenotypicdata                         #3
  ,ginverse=list(ANIMAL=pedigreedata_inverse)   #4
  , na.method.X="omit", na.method.Y="omit")     #5
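
The object pedigreedata_inverse used in line #4 is the inverse relatedness matrix, which has to be computed from the pedigree before fitting the model. A minimal sketch, assuming the asreml-R version 3 interface used above (the function name differs in other versions):

# assumes asreml-R v3, where the inverse A matrix is obtained with asreml.Ainverse()
pedigreedata_inverse <- asreml.Ainverse(pedigreedata)$ginv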

1 - Adding fixed effects


Adding fixed effects to a model

Using MCMCglmm

Note that this example follows the variable naming of the original gryphon tutorial files: BWT is the birth weight response, animal the individual identity, Ped the pedigree data frame and Data the phenotypic data frame; here the pedigree is passed directly via the pedigree argument rather than through ginverse.

model1.2<-MCMCglmm(BWT~SEX,                 # sex added as a fixed effect
                   random=~animal,           # breeding values
                   pedigree=Ped,             # pedigree, used internally to build the relatedness matrix
                   data=Data,
                   prior=prior1.1,           # prior on the variance components (see the sketch below)
                   nitt=65000,thin=50,burnin=15000,verbose=FALSE)
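
The prior object prior1.1 is not defined in this excerpt. A commonly used, weakly informative specification for a model with one random effect and a Gaussian residual would be something like the following (an assumption on our part; adjust to your own analysis):

prior1.1 <- list(G = list(G1 = list(V = 1, nu = 0.002)),  # prior for the additive genetic variance
                 R = list(V = 1, nu = 0.002))             # prior for the residual variance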

posterior.mode(model1.2$Sol[,"SEX2"])   # posterior mode of the sex effect

HPDinterval(model1.2$Sol[,"SEX2"],0.95) # 95% credible interval for the sex effect

posterior.mode(model1.2$VCV)            # posterior modes of the variance components

posterior.heritability1.2<-model1.2$VCV[,"animal"]/
                (model1.2$VCV[,"animal"]+model1.2$VCV[,"units"]) # heritability at each MCMC sample
posterior.mode(posterior.heritability1.2)   # point estimate of heritability

HPDinterval(posterior.heritability1.2,0.95) # 95% credible interval for heritability

Using ASReml

ASReml analysis of size  
 ANIMAL       !P 
 SIZE
 SEX         !A  #denotes a factor
 AGE          #here treated as a linear effect

pedigreedata.ped      !skip 1   
phenotypicdata.dat    !skip 1    !DDF 1 !FCON #specifies method of sig testing
 
SIZE ~ mu  AGE SEX AGE.SEX !r ANIMAL
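
From R, the same fixed effects can simply be added to the fixed formula of the asreml() call used earlier. A minimal sketch (hypothetical object name model.fixed; other arguments as in the first section, assuming the asreml-R v3 interface):

model.fixed<-asreml(fixed=SIZE~1+SEX+AGE+SEX:AGE   # sex, age and their interaction as fixed effects
  , random= ~ped(ANIMAL,var=T,init=1)
  , data=phenotypicdata
  , ginverse=list(ANIMAL=pedigreedata_inverse)
  , na.method.X="omit", na.method.Y="omit")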

2 - Calculating heritability


Calculating heritability

Using MCMCglmm


posterior.heritability1.1<-model1.1$VCV[,"id"]/
             (model1.1$VCV[,"id"]+model1.1$VCV[,"units"]) # VA/(VA+VR) at each MCMC sample; the column is "id", matching random=~id in model1.1

HPDinterval(posterior.heritability1.1,0.95)  # 95% credible interval

posterior.mode(posterior.heritability1.1)    # point estimate (posterior mode)

plot(posterior.heritability1.1)              # trace and density plots of the heritability posterior

Using ASReml

In ASReml standalone, a second command file (with extension .pin) is used to calculate functions of the estimated variance components and their associated standard errors. So for a model in the .as file such as

SIZE ~ mu !r ANIMAL

the primary output file (.asr) will contain two variance components. The first will be the ANIMAL (i.e. additive genetic) component, the second will be the residual variance. A .pin file to calculate heritability from these components might be

F VP 1+2  #adds components 1 and 2 to make a 3rd variance denoted VP
H h2 1 3  #divides 1 (VA) by 3 (VP) to calculate h2

NOTE - if you change the random effects structure of your model in the .as file, you need to modify the .pin file accordingly or you will get the wrong answer! See the sketch below for an example.
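
For instance, if a second random effect (say a year-of-birth effect) were fitted in addition to ANIMAL, there would be three variance components, and a .pin file along the following lines would be needed (a sketch only; always check the order of the components in your own .asr file):

F VP 1+2+3  #adds components 1, 2 and 3 to make a 4th variance denoted VP
H h2 1 4    #divides 1 (VA) by 4 (VP) to calculate h2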

From R:

summary(model)$varcomp[1,3]/sum(summary(model)$varcomp[,3])

For reference, the numbered comments (#1 to #5) in the asreml() call shown earlier correspond to:

  • #1: SIZE is the response variable and the only fixed effect is the mean (denoted as 1)
  • #2: fit the random effect of ANIMAL (VA) with an arbitrary starting value of 1
  • #3: use the data file phenotypicdata
  • #4: connect the individuals in the data file to the pedigree
  • #5: omit any rows where the response or predictor variables are missing

To see the estimates of the fixed effects:

summary(model)$coef.fixed

and the estimates of the random effects:

summary(model)$varcomp

3 - Testing random effects


Testing significance of random effects

Using MCMCglmm

MCMCglmm, and Bayesian methods more generally, do not provide a simple, consensual way to test the statistical significance of a variance parameter. Variance parameters are constrained to be positive, so their credible intervals (e.g., as returned by HPDinterval()) cannot include exactly zero (although they may appear to because of rounding). Covariance and correlation parameters do not have this issue: they are not constrained to be positive, and their credible intervals can be used to estimate the probability that they are positive or negative. The old WAMBAM website recommended comparing DIC (the Deviance Information Criterion, analogous to AIC) across models fitted with and without a random effect (see the sketch below). However, DIC may be focused at different levels of a mixed model, and in MCMCglmm it is calculated at the lowest level of the hierarchy, which may not be appropriate for comparing different random effect structures.
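
For completeness, here is what that DIC comparison looks like in practice, using the animal model fitted in the first section and a hypothetical model without the additive genetic effect (keeping the caveat above in mind when interpreting the difference):

model.noped <- MCMCglmm(birth_weight ~ 1,      # hypothetical comparison model with no random effect,
                        data = phenotypicdata) # i.e. only the residual variance is fitted
model1.1$DIC    # DIC of the animal model
model.noped$DIC # DIC without breeding values; a substantially lower DIC for the animal model would support retaining the effect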

Using ASReml

In ASReml, the statistical significance of a variance parameter can be tested using a likelihood ratio test. Fit a model with and without a particular random effect, then use the log-likelihoods reported in the primary results file (standalone) or stored in the fitted objects (R) to perform the ratio test.

From R:

model1<-asreml(fixed=SIZE~1+SEX
           ,random=~ped(ANIMAL,var=T,init=1)+RANDOMEFFECT  #model including the random effect to be tested
           ,data=phenotypicdata
           ,ginverse=list(ANIMAL=ainv)  #ainv: the inverse relatedness matrix (pedigreedata_inverse above)
           , na.method.X="omit", na.method.Y="omit")

model2<-asreml(fixed=SIZE~1+SEX
           ,random=~ped(ANIMAL,var=T,init=1)               #the same model without that random effect
           ,data=phenotypicdata
           ,ginverse=list(ANIMAL=ainv)
           , na.method.X="omit", na.method.Y="omit")

#calculate the chi-squared stat for the log-likelihood ratio test
2*(model1$loglik-model2$loglik) 

#calculate the associated significance
1-pchisq(2*(model1$loglik-model2$loglik),df=1)

However, this test is conservative when used with 1 degree of freedom, because the null value of a variance component (zero) lies on the boundary of its parameter space. Using df=0.5 gives a better (but still slightly conservative) test, as sketched below.
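
The same likelihood ratio, evaluated against a chi-squared distribution with half a degree of freedom (pchisq accepts non-integer df):

#significance using df=0.5 to account for the variance being tested at its boundary
1-pchisq(2*(model1$loglik-model2$loglik),df=0.5)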
