Modifications of gibbs2f90 for heterogeneous residual variances written by Ignacy Misztal, September 19, 2002
It allows estimation of separate residual (co)variances for different subsets of observations. The estimation is enabled by addition of the following line at the end of the parameter file:
See PREGSF90 with genotypes (SNP) for options.
OPTION hetres_int col nlev
where col is column in the data file that selects which residual (co)variance to select, and nlev is the maximum number of levels. Different residual (co)variances need to be numbered consecutively starting from 1.
Initially, all residual (co)variances are assigned identical (co)variances as in the parameter file. During the estimation, (co)variances that are 0 in the parameter file will not be estimated.
The number of observations per each subset must be large enough to allow the estimation, and the missing-trait pattern should be similar. The program has not been tested in multiple-trait situations when one trait is present in some subclasses but always missing in another subclass.
number of samples and length of burn-in?
In the first run, if you have no idea about the number of samples and burn-in, just type your guess (10000 or whatever) for samples and (0) for burn-in. You may need 2 or 3 runs to figure out the convergence.
Give n to store every n-th sample?
Gibbs samples are usually highly correlated, so you do not have to keep all samples. Maybe every 10th,20th, 50th, …
To check the convergence and to calculate posterior means and SD, run postgibbsf90.
OPTION hetres_int 5 10
The position “5” to identify the interval in the data file and the number of intervals “10” for heterogeneous residual variances.
OPTION fixed_var all 1 2 3
All solutions and posterior means and SD for effects for effects1, 2, and 3 are stored in “all_solutions” and in “final_solutions” using fixed variances. Without numbers, all solutions for all effects are stored.
OPTION fixed_var mean 1 2 3
Posterior means and SD for effects1, 2, and 3 in “final_solutions” using fixed variances.
x11 x12 x13 x21 x22 x23 x31 x32 x33 y11 y12 y13 y21 y22 y23 y31 y32 y33 : : : z11 z12 z13 z21 z22 z23 z31 z32 z33
When fixed_var
is used, you should provide the residual (co)variances in the file hetres
.
Give residual covariances for each interval.
For example, in the above file, we assume a 3 trait-model and each covariance matrix is 3 x 3 (e.g., x, y, and z).
Note that the older version reads the variance components from the parameter file.
The current version reads the external file “hetres” and the variance components included in the parameter file will not be used any more.
OPTION solution all 1 2 3
Caution: this option will create a huge output solution file when you run many rounds and/or use a large model. All solutions and posterior means and SD for effects1, 2, and 3 are stored in “all_solutions” and in “final_solutions” every round. Without numbers, all solutions for all effects are stored.
OPTION solution mean 1 2 3
Posterior means and SD for effects1, 2, and 3 in “final_solutions”.
OPTION cont 10000
10000 is the number of samples run previously when restarting the program from the last run.
OPTION prior 5 2 -1 5
The (co)variance priors are specified in the parameter file.
Degree of belief for all random effects should be specified using the following structure:
OPTION prior eff1 db1 eff2 db2 … effn dbn -1 dbres
effx correspond to the effect number and dbx to the degree of belief for this random effect, -1 corresponds to the degree of belief of the residual variance.
In this example 2 is the degree of belief for the 5th effect, and 5 is the degree of belief for the residual.
OPTION seed 123 321
Two seeds for a random number generator can be specified.
OPTION SNP_file snp
Specify the SNP file name to use genotype data.
Model
y_ijk1=hys_i1+sire_j1+e_ijk1 y_ijk2=hys_i2+sire_j2+e_ijk2
var(sire_i1)=75; var(e_ijk1)=50+ceil(k/2000)*50 y2=2*y1+normal(0,16)
50 sires, 10,000 records, 5 intervals
Data (datasire)
1 - HYS 2 - sire 3 - y1 4 - heterogeneous class 5 - y2
cat datasire
6 13 317.55 1 644.26 3 10 280.44 1 563.05 .................... 37 1 270.52 5 543.63 53 10 286.43 5 579.84
Parameter file (ex5)
DATAFILE datasire NUMBER_OF_TRAITS NUMBER_OF_EFFECTS OBSERVATION(S) WEIGHT(S) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED] 1 1 100 cross 2 2 50 cross RANDOM_RESIDUAL VALUES 500 100 100 1000 RANDOM_GROUP RANDOM_TYPE diagonal FILE (CO)VARIANCES 75 10 10 150 OPTION hetres_int 4 5
gibbs3f90
name of parameter file?ex5 GIBBS3F90 ver. 1.50 by I. Misztal Parameter file: ex5 Data file: datasire Number of Traits 2 Number of Effects 2 Position of Observations 3 5 Position of Weight (1) 0 Value of Missing Trait/Observation 0 EFFECTS # type position (2) levels [positions for nested] 1 cross-classified 1 1 100 2 cross-classified 2 2 50 Residual (co)variance Matrix 500.00 100.00 100.00 1000.0 Random Effect(s) 2 Type of Random Effect: diagonal trait effect (CO)VARIANCES 1 2 75.00 10.00 2 2 10.00 150.0 REMARKS (1) Weight position 0 means no weights utilized (2) Effect positions of 0 for some effects and traits means that such effects are missing for specified traits number of samples and length of burn-in 100 20 Give n to store every n-th sample? (1 means store all samples) 1 Option hetres_int -- to estimate heteregenous residual variances in intervals read: 5 intervals, data column 4 Data record length = 5 Missing traits found in 0 records 77.1 151. 151. 304. Residual variance, interval 1 df_r 1997 ee/n 108.294454569457 108. 199. 199. 416. Residual variance, interval 2 df_r 1997 ee/n 152.131225307972 150. 287. 287. 597. Residual variance, interval 3 df_r 1997 ee/n 204.357724084865 206. 396. 396. 807. Residual variance, interval 4 df_r 1997 ee/n 240.494017267479 239. 462. 462. 941. Residual variance, interval 5 df_r 1997 ee/n 301.022455438506 294. 579. 579. 0.119E+04 ..................................... round 97 111. 221. 221. 441. Residual variance, interval 1 df_r 1997 ee/n 100.000009514270 98.9 197. 197. 402. Residual variance, interval 2 df_r 1997 ee/n 145.159649196084 147. 293. 293. 592. Residual variance, interval 3 df_r 1997 ee/n 195.187063040059 206. 412. 412. 834. Residual variance, interval 4 df_r 1997 ee/n 232.588622554471 230. 460. 460. 929. Residual variance, interval 5 df_r 1997 ee/n 299.190674678875 300. 602. 602. 0.122E+04 round 98 209. 416. 416. 828. Residual variance, interval 1 df_r 1997 ee/n 99.4738134864675 101. 202. 202. 412. Residual variance, interval 2 df_r 1997 ee/n 146.518188769043 148. 296. 296. 602. Residual variance, interval 3 df_r 1997 ee/n 198.183671561078 198. 397. 397. 806. Residual variance, interval 4 df_r 1997 ee/n 232.307903786663 228. 455. 455. 917. Residual variance, interval 5 df_r 1997 ee/n 301.189371418363 311. 622. 622. 0.126E+04 round 99 80.4 160. 160. 319. ave G 90.1 180. 180. 358. SD G 24.2 48.3 48.3 96.3 Residual variance, interval 1 df_r 1997 ee/n 99.4617504906976 99.7 199. 199. 406. ave R 99.5 199. 199. 406. SD R 2.91 5.93 5.93 12.2 Residual variance, interval 2 df_r 1997 ee/n 144.322137485480 139. 279. 279. 567. ave R 145. 291. 291. 591. SD R 4.97 10.2 10.2 21.0 Residual variance, interval 3 df_r 1997 ee/n 198.649297883965 201. 404. 404. 821. ave R 200. 400. 400. 810. SD R 5.41 10.9 10.9 22.1 Residual variance, interval 4 df_r 1997 ee/n 233.760179013841 227. 451. 451. 904. ave R 232. 464. 464. 937. SD R 6.56 13.2 13.2 26.6 Residual variance, interval 5 df_r 1997 ee/n 298.462490006407 303. 607. 607. 0.123E+04 ave R 300. 602. 602. 0.122E+04 SD R 9.13 18.4 18.4 37.2 round 100 solutions stored in file: "solutions" Samples stored in file "gibbs_samples"