Differences

This shows you the differences between two versions of the page.

--- readme.aireml [2012/05/28 14:10] – created shogo
+++ readme.aireml [2024/03/25 18:22] (current) – external edit 127.0.0.1
@@ Line 1: / Line 1: @@
-AIREMLF90
+====== AIREMLF90 ======
-A modification of REMLF90 with computing by the Average-Information
+===== Summary =====
-Algorithm.
+A modification of REMLF90 for estimating variances with the Average-Information algorithm. Initially written by Shogo Tsuruta in 03/99-07/99. AIREMLF90 uses a second-derivative REML algorithm with extra heuristics, as described in Jensen et al. (1996-7). For most models, it converges in far fewer rounds than EM-REML as implemented in REMLF90: REMLF90 typically takes 50-300 rounds to converge, whereas AIREMLF90 converges in 5-15 rounds and to a higher accuracy. The final results are saved in "airemlf90.log".
+\\
-Initially written by Shogo Tsuruta, University of Georgia, 03/99-07/99
+\\
+See PREGSF90 with genotypes (SNP) for options.
-AIREMLF90 uses a second derivative REML algrithm with extra heuristics, as is
-described in Jensen et al. (1996-7). For most problems, it converges in far
-fewer rounds than EM REML as implemented in REMLF90. While typically REMLF90
-takes 50-300 rounds to converge, AIREMLF90 converges in 5-15 rounds and to a
-higher accuracy. For selected problems, AI REML fails to converge when the
-covariance matrix is close to non-positive definite. Adjust sensitivity of the
-program by setting the appropriate tolerance.
-Several options are avaiable:
-OPTION conv_crit 1d-12
-    convergence criterion (default 1d-10).
-OPTION maxrounds 500
-    maximum rounds (default 5000).
-    when it is negative, the program calculates BLUP without running REML.
-OPTION EM-REML 10
-    run EM-REML (REMLF90) for first 10 rounds to get initial variances within the
-    parameter space (default 0).
+===== Options =====
+<file>
+OPTION conv_crit 1d-10
+</file>
+Convergence criterion (default 1d-12).
+<file>
+OPTION maxrounds n
+</file>
+Maximum rounds (default 5000). When n = 0, the program calculates BLUP without iterating REML and provides some statistics (-2logL, AIC, SE for (co)variances, ...).
+<file>
+OPTION EM-REML n
+</file>
+Run EM-REML (REMLF90) for the first n rounds to obtain starting variances for AIREMLF90 that lie within the parameter space (default 0). When n is large (e.g., 1000 or 10000), AIREMLF90 runs as REMLF90 until convergence and then switches back to AI-REML.
+<file>
+OPTION use_yams
+</file>
+Run the program with YAMS (modified FSPAK). Computing time can be reduced dramatically.
+<file>
 OPTION tol 1d-12
+</file>
-    tolerance (or precision) (default 1d-14) for positive definite matrix and
+Tolerance (or precision) (default 1d-14) for positive definite matrix and g-inverse subroutines.\\
-    g-inverse subroutines. Convergence may be much faster by changing this
+Convergence may be much faster by changing this value.
-    value.
+<file>
 OPTION sol se
+</file>
+Store solutions and their standard errors.
+<file>
+OPTION store_pev_pec 6
+</file>
+Store triangular matrices of standard errors and their covariances for correlated random effects, such as direct-maternal effects and random-regression effects, in "pev_pec_bf90".
+<file>
+OPTION residual
+</file>
+Write y-hat and residuals to "yhat_residual".
+<file>
+OPTION missing -999
+</file>
+Specify the missing value as an integer (default 0).
+<file>
+OPTION constant_var 5 1 2 ...
+</file>
+5: effect number\\
+1: first trait number\\
+2: second trait number\\
+Together these refer to the covariance between traits 1 and 2 for effect 5.
-    store solutions and s.e.
+**Heterogeneous residual variances for a single trait**
+<file>
+OPTION hetres_pos 10 11
+</file>
+Specify the column positions of (two) covariables in the data file.
+<file>
+OPTION hetres_pol 4.0 0.1 0.1
+</file>
+Initial values of the coefficients for heterogeneous residual variances; use //ln//(a0, a1, a2, ...) to obtain these values.
-OPTION missing -1
+**Heterogeneous residual variances for multiple traits**\\
+Convergence will be very slow with multiple-trait heterogeneous residual variances.
+<file>
+OPTION hetres_pos 10 10 11 11
+</file>
+or
+<file>
+OPTION hetres_pos 10 11 12 13
+</file>
+Specify the column positions of covariables (trait first) in the data file.
+"10 10" or "10 11" could be linear for first and second traits.\\
+"11 11" or "12 13" could be quadratic.
+<file>
+OPTION hetres_pol 4.0 4.0 0.1 0.1 0.01 0.01
+</file>
+Initial values of the coefficients for heterogeneous residual variances; use //ln//(a0, a1, a2, ...) to obtain these values (trait first).\\
+"4.0 4.0" are the intercepts for the first and second traits.\\
+"0.1 0.1" could be linear and "0.01 0.01" could be quadratic.\\
+To transform back to the original scale, use exp(a0+a1*X1+a2*X2).
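The back-transformation above can be checked by hand with a short script. This is only a sketch of the formula exp(a0+a1*X1+a2*X2); the function name ''residual_variance'' and the covariable values are hypothetical, and the coefficients simply reuse the starting values from the single-trait example rather than real estimates.

```python
import math


def residual_variance(coeffs, covariables):
    """Return exp(a0 + a1*X1 + a2*X2 + ...) for one record.

    coeffs      -- [a0, a1, a2, ...] on the log scale (a0 is the intercept)
    covariables -- [X1, X2, ...] read from the hetres_pos columns
    """
    intercept, slopes = coeffs[0], coeffs[1:]
    if len(slopes) != len(covariables):
        raise ValueError("need one covariable per non-intercept coefficient")
    log_variance = intercept + sum(a * x for a, x in zip(slopes, covariables))
    return math.exp(log_variance)


# Hypothetical coefficients (the starting values shown for hetres_pol)
coeffs = [4.0, 0.1, 0.1]
print(residual_variance(coeffs, [0.0, 0.0]))  # exp(4.0), about 54.6
print(residual_variance(coeffs, [1.0, 1.0]))  # exp(4.2), about 66.7
```

Because the coefficients act on the log scale, the residual variance stays positive for any covariable value.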
+<file>
+OPTION SNP_file snp
+</file>
+Specify the SNP file name to use genotype data.
-    set the missing value (default 0).
+<file>OPTION se_covar_function <label> <function></file>
+As an alternative to SE, calculate the SD of a function of (co)variances by repeatedly sampling the parameter estimates from their asymptotic multivariate normal distribution, following the ideas presented by Meyer and Houle (2013).\\
+\\
+''<label>''\\
+A name for a particular function (e.g., ''P1'' for phenotypic variance of trait 1, ''H2_1'' for heritability for trait 1, ''rg12'' for genetic correlation between traits 1 and 2, …).\\
+\\
+''<function>''\\
+A formula for the function of (co)variances whose SD is to be estimated. All terms of the function must be written with no spaces.\\
+\\
+Each term of the function corresponds to (co)variance elements and could include any random effects (G) and residual (R) (co)variances.\\
+\\
+The notation refers to effect numbers and trait numbers: ''G_eff1_eff2_trt1_trt2'' indicates the element of the (co)variance matrix for random effects ''eff1'' and ''eff2'' and traits ''trt1'' and ''trt2'',\\
+where ''eff1'' and ''eff2'' are the numbers of the first and second effects, and ''trt1'' and ''trt2'' are the numbers of the first and second traits.\\
+''R_trt1_trt2'' indicates the element of the residual (co)variance matrix for traits ''trt1'' and ''trt2''.\\
+\\
+Several functions could be added, with one OPTION line per function.\\
+\\
+Examples:\\
+\\
+''OPTION se_covar_function  P  G_2_2_1_1+G_2_3_1_1+G_3_3_1_1+G_4_4_1_1+R_1_1''\\
+''OPTION se_covar_function  H2d  G_2_2_1_1/(G_2_2_1_1+G_2_3_1_1+G_3_3_1_1+G_4_4_1_1+R_1_1)''\\
-# Heterogeneous residual variances for a single trait
+''OPTION se_covar_function  H2t  (G_2_2_1_1+1.5*G_2_3_1_1+0.5*G_3_3_1_1)/(G_2_2_1_1+G_2_3_1_1+G_3_3_1_1+G_4_4_1_1+R_1_1)''\\
-OPTION hetres_pos 10 11
+''OPTION se_covar_function  rg12  G_2_2_1_2/(G_2_2_1_1*G_2_2_2_2)**0.5''\\
+\\
+The first function calculates the SD for the total variance for a maternal model with a permanent maternal effect, where 2 and 3 are the effect numbers for the direct and maternal additive genetic effects, respectively, and 4 is the effect number for the maternal permanent random effect.
+The second function calculates the heritability for the direct component.
-    specify the position of covariables.
+The third function calculates the total heritability.
-OPTION hetres_pol 4.0 0.1 0.1
+The fourth function calculates the SD of the genetic correlation between traits 1 and 2 for the direct genetic effect (effect number 2).
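To see how the notation in the examples above maps onto numbers, the formulas can be evaluated against a table of point estimates. The sketch below is not part of AIREMLF90: the helper ''evaluate_covar_function'' and all variance values are hypothetical, chosen only so the arithmetic is easy to follow. It relies on the fact that term names such as ''G_2_2_1_1'' are valid Python identifiers and that ''+'', ''*'', ''/'' and ''**'' mean the same in Python.

```python
def evaluate_covar_function(formula, estimates):
    """Evaluate a formula written in the G_eff1_eff2_trt1_trt2 / R_trt1_trt2 notation.

    A restricted eval is enough for a sketch; do not use it on untrusted input.
    """
    return eval(formula, {"__builtins__": {}}, dict(estimates))


# Hypothetical point estimates for a single-trait maternal model
estimates = {
    "G_2_2_1_1": 0.30,   # direct additive genetic variance (effect 2)
    "G_2_3_1_1": -0.05,  # direct-maternal genetic covariance (effects 2 and 3)
    "G_3_3_1_1": 0.10,   # maternal additive genetic variance (effect 3)
    "G_4_4_1_1": 0.05,   # maternal permanent variance (effect 4)
    "R_1_1": 0.60,       # residual variance
}

P = "G_2_2_1_1+G_2_3_1_1+G_3_3_1_1+G_4_4_1_1+R_1_1"
H2d = "G_2_2_1_1/(" + P + ")"
H2t = "(G_2_2_1_1+1.5*G_2_3_1_1+0.5*G_3_3_1_1)/(" + P + ")"

for label, formula in [("P", P), ("H2d", H2d), ("H2t", H2t)]:
    print(label, round(evaluate_covar_function(formula, estimates), 4))
```

With these made-up values the total variance is close to 1.0, so the direct heritability is close to 0.30 and the total heritability close to 0.275.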
-    initial values of coefficients for heterogeneous residual variances
+<file>OPTION samples_se_covar_function <n></file>
-    use ln(a0, a1, a2, ...) to make these values.
+Set the number of samples used to calculate the SE for functions of (co)variances (default 10000).
+<file>OPTION out_se_covar_function</file>
+Store the samples of the (co)variance functions in a file for postprocessing (histogram, etc.).
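The sampling idea behind ''se_covar_function'' can be illustrated outside the program. The following is a minimal sketch under stated assumptions, not the AIREMLF90 implementation: parameter vectors are drawn from a multivariate normal distribution centred on the estimates, the function is evaluated on each draw, and the SD of the results is reported. The function names, the two-parameter model (one genetic and one residual variance), the estimates, and their asymptotic covariance matrix are all hypothetical.

```python
import math
import random
import statistics


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def sample_function(func, estimates, covariance, n_samples=10000, seed=1):
    """Draw parameters from N(estimates, covariance) and evaluate func on each draw."""
    rng = random.Random(seed)
    lower = cholesky(covariance)
    n = len(estimates)
    values = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        draw = [estimates[i] + sum(lower[i][k] * z[k] for k in range(i + 1))
                for i in range(n)]
        values.append(func(draw))
    return values


# Hypothetical estimates: [genetic variance, residual variance]
estimates = [0.30, 0.70]
# Hypothetical asymptotic covariance matrix of those estimates
covariance = [[0.0025, -0.0010],
              [-0.0010, 0.0016]]


def heritability(params):
    return params[0] / (params[0] + params[1])


samples = sample_function(heritability, estimates, covariance)
print("mean:", statistics.mean(samples))
print("SD:  ", statistics.stdev(samples))
```

The list ''samples'' plays the role of the stored samples: it can be written to a file and summarized with a histogram or percentiles.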
+===== Tricks =====
+When the covariance matrix is close to non-positive definite, AIREMLF90 may not converge.
+There are two options you might want to try:
-# Heterogeneous residual variances for mutiple traits
+. change the tolerance value (xx) in the option:
-OPTION hetres_pos 10 10 11 11
+OPTION tol xx
-    specify the position of covariables (trait first).
+to a very strict value (e.g., 1d-20) or a lenient value (1d-06).
-OPTION hetres_pol 4.0 4.0 0.1 0.1 0.01 0.01
+. use an option to use EM-REML inside AI-REML:
-    initial values of coefficients for heterogeneous residual variances
+OPTION EM-REML xx
-    use ln(a0, a1, a2, ...) to make these values (trait first).
+where xx is the number of EM-REML iterations you expect to need to obtain a good starting value for AI-REML. After running xx rounds with EM-REML, the AIREMLF90 program automatically switches from EM-REML to AI-REML, using the last EM-REML estimate as the starting value for AI-REML.
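When several tolerance values or EM-REML round counts need to be tried, the OPTION lines can be generated rather than typed by hand. This is only a convenience sketch: the helper names are hypothetical, and only the two options described above (''OPTION tol'' and ''OPTION EM-REML'') are written. The number of EM-REML rounds used here is an arbitrary illustration.

```python
def fortran_double(exponent):
    """Format 10**exponent in the 1d-12 style used by the OPTION lines."""
    return f"1d{exponent:+03d}"


def trial_options(tol_exponent, em_rounds=0):
    """Return the OPTION lines for one trial run."""
    lines = [f"OPTION tol {fortran_double(tol_exponent)}"]
    if em_rounds > 0:
        lines.append(f"OPTION EM-REML {em_rounds}")
    return lines


# A strict and a lenient tolerance, each preceded by a few EM-REML rounds
for tol_exponent in (-20, -6):
    print("\n".join(trial_options(tol_exponent, em_rounds=10)))
    print()
```

Each printed group can be appended to a copy of the parameter file, giving one run per setting.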