This is an old revision of the document!

BLUPF90+

Summary

This program combines blupf90, remlf90, and airemlf90.

See PREGSF90 for options using genomic (SNP) information.

Hint: type blupf90+ --help to see all the blupf90+ options or blupf90+ --help-genomic to see all the genomic options blupf90+ can take.

Options

OPTION method VCE (default BLUP with blupf90 options)

Runs airemlf90 for variance component estimation (default running blupf90)

OPTION conv_crit 1d-12

Convergence criterion (default 1d-10).

OPTION maxrounds 1000

Maximum rounds (default 5000). When the number = 0, the program calculates BLUP without iterating REML and some statistics (-2logL, AIC, SE for (co)variances, …).

OPTION EM-REML 10

Runs EM-REML (REMLF90) for first 10 rounds to get initial variances within the parameter space (default 0).

OPTION use_yams

Runs the program with YAMS (modified FSPAK). The computing time can be dramatically improved.

OPTION tol 1d-12

Tolerance (or precision) (default 1d-14) for positive definite matrix and g-inverse subroutines.
Convergence may be much faster by changing this value.

OPTION sol se

Stores solutions and those standard errors.

OPTION store_pev_pec 6

Stores triangular matrices of standard errors and its covariances for correlated random effects such as direct-maternal effects and random-regression effects in “pev_pec_bf90”.

OPTION residual

y-hat and residuals will be included in “yhat_residual”.

OPTION missing -999

Specifies the missing value (default 0) in integer.

OPTION constant_var 5 1 2 ...

5: effect number
1: first trait number
2: second trait number
implying the covariance between traits 1 and 2 for effect 5.

Heterogeneous residual variances for a single trait

OPTION hetres_pos 10 11

Specifies the position of covariables.

OPTION hetres_pol 4.0 0.1 0.1

Initial values of coefficients for heterogeneous residual variances using ln(a0, a1, a2, …) to make these values.

Heterogeneous residual variances for multiple traits
Convergence will be very slow with multiple trait heterogeneous residual variances

OPTION hetres_pos 10 10 11 11

or

OPTION hetres_pos 10 11 12 13

Specify the position of covariables (trait first). “10 10” or “10 11” could be linear for first and second traits.
“11 11” or “12 13” could be quadratic.

OPTION hetres_pol 4.0 4.0 0.1 0.1 0.01 0.01

Initial values of coefficients for heterogeneous residual variances using ln(a0, a1, a2, …) to make these values (trait first).
“4.0 4.0” are intercept for first and second traits.
“0.1 0.1” could be linear and “0.01 0.01” could be quadratic.
To transform back to the original scale, use exp(a0+a1*X1+a2*X2).

OPTION SNP_file snp

Specify the SNP file name to use genotype data.

OPTION se_covar_function <label> <function>

As an alternative of SE, calculate SD for function of (co)variances by repeated sampling of parameters estimates from their asymptotic multivariate normal distribution, following ideas presented by Meyer and Houle 2013.

<label>
A name for a particular function (e.g., P1 for phenotypic variance of trait 1, H2_1 for heritability for trait 1, rg12 for genetic correlation between traits 1 and 2, …).

<function>
A formula to calculate a function of (co)variances to estimate SD. All terms of the function should be written with no spaces.

Each term of the function corresponds to (co)variance elements and could include any random effects (G) and residual (R) (co)variances.

Notation is with reference to the effect number and the trait number (G_eff1_eff2_trt1_trt2) that indicate the element of the (co)variance matrix for random effect eff1 and eff2 and trt1 and trt2,
where eff1 and eff2 are effect numbers 1 and 2, and trt1 and trt2 are trait numbers 1 and 2.
R_trt1_trt1 indicates the element of the residual (co)variance matrix for traits 1 and 2.

Several functions could be added, with one OPTION line per function.

Examples:

OPTION se_covar_function P G_2_2_1_1+G_2_3_1_1+G_3_3_1_1+G_4_4_1_1+R_1_1

OPTION se_covar_function H2d G_2_2_1_1/(G_2_2_1_1+G_2_3_1_1+G_3_3_1_1+G_4_4_1_1+R_1_1)

OPTION se_covar_function H2t (G_2_2_1_1+1.5*G_2_3_1_1+0.5*G_3_3_1_1)/(G_2_2_1_1+G_2_3_1_1+G_3_3_1_1+G_4_4_1_1+R_1_1)

OPTION se_covar_function rg12 G_2_2_1_2/(G_2_2_1_1*G_2_2_2_2)**0.5

The first function calculates the SD for the total variance for a maternal model with permanent maternal effect, where 2 and 3 are the effect number for the direct and maternal additive genetic effects respectively, and 4 is the effect number for the maternal permanent random effect.

The second function calculates the heritability for the direct component.

The third function the total heritability.

The fourth function calculates the SD of the genetic correlation between traits 1 and 2 for the direct genetic effect (effect number 2)

OPTION samples_se_covar_function <n>

Set the number of samples to calculate SE for function of (co)variances.
default value 10000

OPTION out_se_covar_function

Indicate to store in file samples of (co)variances function for postprocessing (histogram, etc.)

Storing accuracy based on PEV

OPTION store_accuracy eff

Stores reliabilities based on PEV, where eff is the number of the animal effect.
By default, it uses inbreeding (F) in the denominator of the reliability formula: reliability = 1-PEV/($\sigma\mathbf{}_{u}^{2}$(1 + F)) Aguilar et al. (2020).
It uses inbreeding based on $\mathbf{A}$ or $\mathbf{H}$ from the direct inversion of $\mathbf{A}^{-1}$ or $\mathbf{H}^{-1}$, whichever is being used.

OPTION type 1.0

Select 1.0 for dairy cattle (Reliability) or 0.5 for beef cattle (BIF accuracy) (default 1.0).

OPTION correct_accuracy_by_inbreeding filaname

filename is the name of the inbreeding file if other than renf90.inb

OPTION correct_accuracy_by_inbreeding_direct 0

This option turns off the inbreeding correction in the reliability formula.

Omit A-inverse

  OPTION omit_ainv

This option prohibits the program from creating $\mathbf{A}^{-1}$. It is especially useful for GBLUP. For example, if you would like to perform the exact GBLUP, you can put the following options to your parameter file.

  OPTION omit_ainv
  OPTION TauOmega 1.0 0.0
  OPTION AlphaBeta 0.95 0.05

With the above options, the program doesn't create $\mathbf{A}^{-1}$ but calculates $\tau\mathbf{G}^{-1}-\omega\mathbf{A}_{22}^{-1}$. When the omega ($\omega$) is zero, only $\mathbf{G}^{-1}$ will be included in the equations. $\mathbf{G}$ is blended with $\mathbf{A}_{22}$ as $\alpha\mathbf{G}+\beta\mathbf{A}_{22}$ before the inversion ($\alpha=0.95$ and $\beta=0.05$ in this case).

Details

Assuming a single-trait ssGBLUP, the mixed model equations are as follows.

$ \left[ \begin{array}{ll} \mathbf{X}'\mathbf{X} & \mathbf{X}'\mathbf{Z}\\ \mathbf{Z}'\mathbf{X} & \mathbf{Z}'\mathbf{Z} + \lambda\mathbf{H}^{-1} \end{array} \right] \left[ \begin{array}{c} \mathbf{\hat{b}}\\ \mathbf{\hat{u}} \end{array} \right] = \left[ \begin{array}{c} \mathbf{X}'\mathbf{y}\\ \mathbf{Z}'\mathbf{y} \end{array} \right] $

where $\mathbf{H}$ is a matrix combining additive genetic relationship matrices and a genomic relationship matrix.

$ \mathbf{H}^{-1} = \mathbf{A}^{-1} + \left[ \begin{array}{cc} \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \tau\mathbf{G}^{-1}-\omega\mathbf{A}_{22}^{-1} \end{array} \right] $

If we omit $\mathbf{A}^{-1}$ and $\mathbf{A}_{22}^{-1}$, the equations are equivalent to GBLUP. GBLUP by BLUPF90 was not so easy because the program creates $\mathbf{A}^{-1}$ by default and there was no way to avoid this behavior. The new option removes $\mathbf{A}^{-1}$ from the equations so GBLUP will be easily performed.

Tricks

When the covariance matrix is close to non-positive definite, the AIREMLF90 may not converge. There are two options you might want to try:

1. change the tolerance value (xx) in the option:

OPTION tol xx

to a very strict value (e.g., 1d-20) or a lenient value (1d-06).

2. use an option to use EM-REML inside AI-REML:

OPTION EM-REML xx

where xx is the number of iterations for EM-REML you expect to get a good starting value for AI-REML. After running xx rounds with EM-REML, the AIREMLF90 program will automatically switch from EM-REML to AI-REML using the last estimate from EM-REML as a starting value for AI-REML.

Table of Contents