Using the BLUPF90 family of programs for genomics

Ignacy Misztal, Ignacio Aguilar, Shogo Tsuruta & Andres Legarra, September 22, 2011

Nearly all programs in the BLUPF90 family have been updated to support genomic analyses using the single-step approach. To use genotypes in these programs:

1.     Read the papers on the single-step methodology (see the list at the end).

2.     Look at example 1 and example 2.

3.     Download the data files for example 1 and run it.

4.     Prepare the genotype file in the correct format.

5.     Run renumf90 with the keyword for the SNP file.

6.     Run blupf90, remlf90, etc.

7.     If problems arise, sign up for the blupf90 group at groups.yahoo.com.
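Before running renumf90 in step 5, it can help to check the genotype file prepared in step 4. The sketch below is not part of the BLUPF90 programs, and the function name check_genotype_lines is hypothetical. It assumes a layout that this page does not spell out: one line per animal, holding an ID, one or more spaces, and then a string with one character per SNP that starts in the same column on every line, using the codes 0, 1, 2 and 5 (missing). Confirm that layout against the Readme before relying on the check.

```python
def check_genotype_lines(lines, allowed="0125"):
    """Return a list of problems found in the genotype lines.

    Assumed layout (an assumption, not taken from this page):
    ID, spaces, then one character per SNP, aligned across lines.
    """
    problems = []
    start_col = None  # column where the genotype string begins
    n_snp = None      # number of SNP expected on every line
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip()
        parts = line.split()
        if len(parts) != 2:
            problems.append(f"line {number}: expected an ID and one genotype string")
            continue
        genotypes = parts[1]
        col = len(line) - len(genotypes)
        bad = sorted(set(genotypes) - set(allowed))
        if bad:
            problems.append(f"line {number}: unexpected codes {bad}")
        if start_col is None:
            # The first well-formed line defines the reference layout.
            start_col, n_snp = col, len(genotypes)
            continue
        if col != start_col:
            problems.append(f"line {number}: genotypes start in column {col}, not {start_col}")
        if len(genotypes) != n_snp:
            problems.append(f"line {number}: {len(genotypes)} SNP, expected {n_snp}")
    return problems


sample = ["A1   01205", "A22  21010", "A333  0120"]
for problem in check_genotype_lines(sample):
    print(problem)
```

For a real file, the same function can be called on an open file handle, since it only iterates over lines.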

See also the additional documentation and the theoretical justification of the single-step approach.

 

Genomic calculations are normalized for the following case:

-        Single line of single breed

-        Genotypes with at least 10k SNP, or imputed from at least 3k SNP

Automatic edits on genotypes include:

a)     Removal of monomorphic SNP and SNP with MAF < 0.05

b)    Removal of SNP with call rate < 0.9

c)     Removal of genotyped animals with call rate < 0.9 (after the previous edit)

d)    Removal of parents with parent-offspring conflicts.

New edits are added as needed. Many defaults can be overridden with OPTION keywords; these keywords are listed in the Readme.
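The edits a) to c) can be illustrated with a short sketch. This is not the code used by the BLUPF90 programs. It assumes genotypes stored as allele counts 0, 1, 2 with 5 for a missing call, and the function names are hypothetical. The thresholds are the defaults listed above. For edit d), the sketch only counts opposing homozygotes between a parent and an offspring, which is one common way to detect conflicts; how the programs decide on removal is not described here.

```python
MISSING = 5  # assumed code for a missing genotype call


def snp_stats(genotypes, j):
    """Return (call_rate, maf) for SNP column j."""
    calls = [row[j] for row in genotypes if row[j] != MISSING]
    call_rate = len(calls) / len(genotypes)
    if not calls:
        return call_rate, 0.0
    p = sum(calls) / (2 * len(calls))  # frequency of the counted allele
    return call_rate, min(p, 1 - p)


def edit_genotypes(genotypes, min_maf=0.05, min_call=0.9):
    """Apply edits a) to c); return (kept_animal_indices, kept_snp_indices)."""
    kept_snp = []
    for j in range(len(genotypes[0])):
        call_rate, maf = snp_stats(genotypes, j)
        # a) drop monomorphic or low-MAF SNP; b) drop SNP with low call rate
        if maf > 0 and maf >= min_maf and call_rate >= min_call:
            kept_snp.append(j)
    kept_animals = []
    for i, row in enumerate(genotypes):
        # c) animal call rate, computed on the SNP that survived a) and b)
        called = sum(1 for j in kept_snp if row[j] != MISSING)
        if kept_snp and called / len(kept_snp) >= min_call:
            kept_animals.append(i)
    return kept_animals, kept_snp


def opposing_homozygotes(parent, offspring):
    """Count SNP where parent and offspring are opposite homozygotes (0 vs 2)."""
    return sum(1 for a, b in zip(parent, offspring) if {a, b} == {0, 2})
```

Computing the animal call rate only on the surviving SNP mirrors the note in c) that this edit runs after the SNP edits.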

The output from blupf90 and the related programs includes details of how the genomic information was processed. If problems occur, examine that output carefully. In particular, look at the number of SNP removed due to call rate, the number of animals removed due to call rate, and the number of animals removed due to parent-offspring conflicts. Also examine the correlations between genomic and pedigree relationships. With good genotypes and a few generations of pedigree, this correlation is 0.7-0.9. Low correlations may indicate wrong genotype IDs; the program stops when those correlations are < 0.3.
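The correlation check can be sketched as follows. This is an illustration only: it assumes the comparison is a Pearson correlation between the off-diagonal elements of the genomic relationship matrix (G) and the pedigree relationship matrix for the genotyped animals (A22), both given as nested lists in the same animal order. Which elements the programs actually compare is not stated here, and the function names are hypothetical. The 0.3 threshold is the one quoted above.

```python
from math import sqrt


def offdiag_correlation(G, A22):
    """Pearson correlation between the upper off-diagonals of G and A22."""
    n = len(G)
    x = [G[i][j] for i in range(n) for j in range(i + 1, n)]
    y = [A22[i][j] for i in range(n) for j in range(i + 1, n)]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def check_genotype_ids(G, A22, min_corr=0.3):
    """Return the correlation, or raise if it falls below min_corr."""
    r = offdiag_correlation(G, A22)
    if r < min_corr:
        raise ValueError(f"correlation {r:.2f} < {min_corr}: check genotype IDs")
    return r
```

A correlation well below the usual 0.7-0.9 range is a hint that genotypes and pedigree records may not refer to the same animals, for example after a mix-up of IDs.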

Currently, genotypes are limited to 3,000 animals with no limit on the number of pedigrees or phenotypes. 

 

Click here to access most of the papers listed below.

Paper                    Content/message

Misztal et al., 2009     Motivation for single-step; computations
Legarra et al., 2009     Construction of the H matrix
Aguilar et al., 2010     Inverse of H (H^-1); large application in dairy
Forni et al., 2010       Effect of scaling of G on estimates of variances and EBV
Chen et al., 2010        Application in chicken
Simeone et al., 2011a    Multiple lines and diagonals of G
Simeone et al., 2011b    Effects of scaling of G on multiple lines
Aguilar et al., 2011a    Efficient computations of G and A22
Aguilar et al., 2011b    MT application for dairy fertility
Tsuruta et al., 2011     Application for 18 type traits in Holsteins
Chen et al., 2011        Effect of MAF and different allele frequencies on accuracy and biases; origins of biases for genotyped subjects
Vitezica et al., 2011    Effects of selection on genomic analyses by single-step and two-step methods; theoretical derivation for scaling of G