This shows you the differences between two versions of the page.
Both sides previous revision Previous revision | |||
readme.blupf90new [2022/06/29 19:22] dani removed |
— (current) | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== BLUPF90+ ====== | ||
- | |||
- | ===== Summary ===== | ||
- | This program combines blupf90, remlf90, and airemlf90.\\ | ||
- | |||
- | \\ | ||
- | See [[http://nce.ads.uga.edu/wiki/doku.php?id=readme.pregsf90|PREGSF90]] for options using genomic (SNP) information. | ||
- | |||
- | \\ | ||
- | **Hint**: type ''blupf90+ %%--%%help'' to see all the blupf90+ options or ''blupf90+ %%--%%help-genomic'' to see all the genomic options blupf90+ can take. | ||
- | ===== Options ===== | ||
- | <file> | ||
- | OPTION method VCE (default BLUP with blupf90 options) | ||
- | </file> | ||
- | Runs airemlf90 for variance component estimation (default running blupf90) | ||
- | <file> | ||
- | OPTION conv_crit 1d-12 | ||
- | </file> | ||
- | Convergence criterion (default 1d-10). | ||
- | <file> | ||
- | OPTION maxrounds 1000 | ||
- | </file> | ||
- | Maximum rounds (default 5000). When the number = 0, the program calculates BLUP without iterating REML and some statistics (-2logL, AIC, SE for (co)variances, ...). | ||
- | <file> | ||
- | OPTION EM-REML 10 | ||
- | </file> | ||
- | Runs EM-REML (REMLF90) for first 10 rounds to get initial variances within the parameter space (default 0). | ||
- | <file> | ||
- | OPTION use_yams | ||
- | </file> | ||
- | Runs the program with YAMS (modified FSPAK). The computing time can be dramatically improved. | ||
- | <file> | ||
- | OPTION tol 1d-12 | ||
- | </file> | ||
- | Tolerance (or precision) (default 1d-14) for positive definite matrix and g-inverse subroutines.\\ | ||
- | Convergence may be much faster by changing this value. | ||
- | <file> | ||
- | OPTION sol se | ||
- | </file> | ||
- | Stores solutions and those standard errors. | ||
- | <file> | ||
- | OPTION store_pev_pec 6 | ||
- | </file> | ||
- | Stores triangular matrices of standard errors and its covariances for correlated random effects such as direct-maternal effects and random-regression effects in "pev_pec_bf90". | ||
- | <file> | ||
- | OPTION residual | ||
- | </file> | ||
- | y-hat and residuals will be included in "yhat_residual". | ||
- | <file> | ||
- | OPTION missing -999 | ||
- | </file> | ||
- | Specifies the missing value (default 0) in integer. | ||
- | <file> | ||
- | OPTION constant_var 5 1 2 ... | ||
- | </file> | ||
- | 5: effect number\\ | ||
- | 1: first trait number\\ | ||
- | 2: second trait number\\ | ||
- | implying the covariance between traits 1 and 2 for effect 5. | ||
- | |||
- | **Heterogeneous residual variances for a single trait** | ||
- | <file> | ||
- | OPTION hetres_pos 10 11 | ||
- | </file> | ||
- | Specifies the position of covariables. | ||
- | <file> | ||
- | OPTION hetres_pol 4.0 0.1 0.1 | ||
- | </file> | ||
- | Initial values of coefficients for heterogeneous residual variances using //ln//(a0, a1, a2, ...) to make these values. | ||
- | |||
- | **Heterogeneous residual variances for multiple traits**\\ | ||
- | Convergence will be very slow with multiple trait heterogeneous residual variances | ||
- | <file> | ||
- | OPTION hetres_pos 10 10 11 11 | ||
- | </file> | ||
- | or | ||
- | <file> | ||
- | OPTION hetres_pos 10 11 12 13 | ||
- | </file> | ||
- | Specify the position of covariables (trait first). | ||
- | "10 10" or "10 11" could be linear for first and second traits.\\ | ||
- | "11 11" or "12 13" could be quadratic. | ||
- | <file> | ||
- | OPTION hetres_pol 4.0 4.0 0.1 0.1 0.01 0.01 | ||
- | </file> | ||
- | Initial values of coefficients for heterogeneous residual variances using //ln//(a0, a1, a2, ...) to make these values (trait first).\\ | ||
- | "4.0 4.0" are intercept for first and second traits.\\ | ||
- | "0.1 0.1" could be linear and "0.01 0.01" could be quadratic.\\ | ||
- | To transform back to the original scale, use exp(a0+a1*X1+a2*X2). | ||
- | <file> | ||
- | OPTION SNP_file snp | ||
- | </file> | ||
- | Specify the SNP file name to use genotype data. | ||
- | |||
- | <file>OPTION se_covar_function <label> <function></file> | ||
- | As an alternative of SE, calculate SD for function of (co)variances by repeated sampling of parameters estimates from their asymptotic multivariate normal distribution, following ideas presented by Meyer and Houle 2013.\\ | ||
- | \\ | ||
- | ''<label>''\\ | ||
- | A name for a particular function (e.g., ''P1'' for phenotypic variance of trait 1, ''H2_1'' for heritability for trait 1, ''rg12'' for genetic correlation between traits 1 and 2, …).\\ | ||
- | \\ | ||
- | ''<function>''\\ | ||
- | A formula to calculate a function of (co)variances to estimate SD. All terms of the function should be written with no spaces.\\ | ||
- | \\ | ||
- | Each term of the function corresponds to (co)variance elements and could include any random effects (G) and residual (R) (co)variances.\\ | ||
- | \\ | ||
- | Notation is with reference to the effect number and the trait number (''G_eff1_eff2_trt1_trt2'') that indicate the element of the (co)variance matrix for random effect ''eff1'' and ''eff2'' and ''trt1'' and ''trt2'',\\ | ||
- | where ''eff1'' and ''eff2'' are effect numbers 1 and 2, and ''trt1'' and ''trt2'' are trait numbers 1 and 2.\\ | ||
- | ''R_trt1_trt1'' indicates the element of the residual (co)variance matrix for traits 1 and 2.\\ | ||
- | \\ | ||
- | Several functions could be added, with one OPTION line per function.\\ | ||
- | \\ | ||
- | Examples:\\ | ||
- | \\ | ||
- | ''OPTION se_covar_function P G_2_2_1_1+G_2_3_1_1+G_3_3_1_1+G_4_4_1_1+R_1_1''\\ | ||
- | |||
- | ''OPTION se_covar_function H2d G_2_2_1_1/(G_2_2_1_1+G_2_3_1_1+G_3_3_1_1+G_4_4_1_1+R_1_1)''\\ | ||
- | |||
- | ''OPTION se_covar_function H2t (G_2_2_1_1+1.5*G_2_3_1_1+0.5*G_3_3_1_1)/(G_2_2_1_1+G_2_3_1_1+G_3_3_1_1+G_4_4_1_1+R_1_1)''\\ | ||
- | |||
- | ''OPTION se_covar_function rg12 G_2_2_1_2/(G_2_2_1_1*G_2_2_2_2)**0.5''\\ | ||
- | \\ | ||
- | The first function calculates the SD for the total variance for a maternal model with permanent maternal effect, where 2 and 3 are the effect number for the direct and maternal additive genetic effects respectively, and 4 is the effect number for the maternal permanent random effect. | ||
- | |||
- | The second function calculates the heritability for the direct component. | ||
- | |||
- | The third function the total heritability. | ||
- | |||
- | The fourth function calculates the SD of the genetic correlation between traits 1 and 2 for the direct genetic effect (effect number 2) | ||
- | |||
- | <file>OPTION samples_se_covar_function <n></file> | ||
- | Set the number of samples to calculate SE for function of (co)variances.\\ | ||
- | default value 10000 | ||
- | <file>OPTION out_se_covar_function</file> | ||
- | Indicate to store in file samples of (co)variances function for postprocessing (histogram, etc.) | ||
- | |||
- | |||
- | ==== Storing accuracy based on PEV ==== | ||
- | |||
- | OPTION store_accuracy eff | ||
- | Stores reliabilities based on PEV, where //eff// is the number of the animal effect.\\ | ||
- | By default, it uses inbreeding (F) in the denominator of the reliability formula: reliability = 1-PEV/($\sigma\mathbf{}_{u}^{2}$(1 + F)) [[https://doi.org/10.1111/jbg.12470|Aguilar et al. (2020)]].\\ | ||
- | It uses inbreeding based on $\mathbf{A}$ or $\mathbf{H}$ from the direct inversion of $\mathbf{A}^{-1}$ or $\mathbf{H}^{-1}$, whichever is being used. \\ | ||
- | |||
- | OPTION type 1.0 | ||
- | Select 1.0 for dairy cattle (Reliability) or 0.5 for beef cattle (BIF accuracy) (default 1.0). | ||
- | |||
- | OPTION correct_accuracy_by_inbreeding filaname | ||
- | //filename// is the name of the inbreeding file if other than renf90.inb | ||
- | |||
- | OPTION correct_accuracy_by_inbreeding_direct 0 | ||
- | This option turns off the inbreeding correction in the reliability formula. | ||
- | |||
- | ==== Omit A-inverse ==== | ||
- | |||
- | OPTION omit_ainv | ||
- | |||
- | This option prohibits the program from creating $\mathbf{A}^{-1}$. | ||
- | It is especially useful for GBLUP. | ||
- | For example, if you would like to perform the exact GBLUP, you can put the following options to your parameter file. | ||
- | |||
- | OPTION omit_ainv | ||
- | OPTION TauOmega 1.0 0.0 | ||
- | OPTION AlphaBeta 0.95 0.05 | ||
- | |||
- | With the above options, the program doesn't create $\mathbf{A}^{-1}$ but calculates $\tau\mathbf{G}^{-1}-\omega\mathbf{A}_{22}^{-1}$. When the omega ($\omega$) is zero, only $\mathbf{G}^{-1}$ will be included in the equations. $\mathbf{G}$ is blended with $\mathbf{A}_{22}$ as $\alpha\mathbf{G}+\beta\mathbf{A}_{22}$ before the inversion ($\alpha=0.95$ and $\beta=0.05$ in this case). | ||
- | |||
- | === Details === | ||
- | Assuming a single-trait ssGBLUP, the mixed model equations are as follows. | ||
- | |||
- | \( | ||
- | \left[ | ||
- | \begin{array}{ll} | ||
- | \mathbf{X}'\mathbf{X} & \mathbf{X}'\mathbf{Z}\\ | ||
- | \mathbf{Z}'\mathbf{X} & \mathbf{Z}'\mathbf{Z} + \lambda\mathbf{H}^{-1} | ||
- | \end{array} | ||
- | \right] | ||
- | \left[ | ||
- | \begin{array}{c} | ||
- | \mathbf{\hat{b}}\\ | ||
- | \mathbf{\hat{u}} | ||
- | \end{array} | ||
- | \right] | ||
- | = | ||
- | \left[ | ||
- | \begin{array}{c} | ||
- | \mathbf{X}'\mathbf{y}\\ | ||
- | \mathbf{Z}'\mathbf{y} | ||
- | \end{array} | ||
- | \right] | ||
- | \) | ||
- | |||
- | where $\mathbf{H}$ is a matrix combining additive genetic relationship matrices and a genomic relationship matrix. | ||
- | |||
- | \( | ||
- | \mathbf{H}^{-1} | ||
- | = | ||
- | \mathbf{A}^{-1} | ||
- | + | ||
- | \left[ | ||
- | \begin{array}{cc} | ||
- | \mathbf{0} & \mathbf{0} \\ | ||
- | \mathbf{0} & \tau\mathbf{G}^{-1}-\omega\mathbf{A}_{22}^{-1} | ||
- | \end{array} | ||
- | \right] | ||
- | \) | ||
- | |||
- | If we omit $\mathbf{A}^{-1}$ and $\mathbf{A}_{22}^{-1}$, the equations are equivalent to GBLUP. | ||
- | GBLUP by BLUPF90 was not so easy because the program creates $\mathbf{A}^{-1}$ by default and there was no way to avoid this behavior. | ||
- | The new option removes $\mathbf{A}^{-1}$ from the equations so GBLUP will be easily performed. | ||
- | |||
- | |||
- | ===== Tricks ===== | ||
- | When the covariance matrix is close to non-positive definite, the AIREMLF90 may not converge. | ||
- | There are two options you might want to try: | ||
- | |||
- | 1. change the tolerance value (xx) in the option: | ||
- | |||
- | OPTION tol xx | ||
- | |||
- | to a very strict value (e.g., 1d-20) or a lenient value (1d-06). | ||
- | |||
- | 2. use an option to use EM-REML inside AI-REML: | ||
- | |||
- | OPTION EM-REML xx | ||
- | |||
- | where xx is the number of iterations for EM-REML you expect to get a good starting value for AI-REML. After running xx rounds with EM-REML, the AIREMLF90 program will automatically switch from EM-REML to AI-REML using the last estimate from EM-REML as a starting value for AI-REML. | ||