
GIBBSF90TEST

UNDER CONSTRUCTION

Summary

This program combines gibbs2f90, gibbs3f90, thrgibbs1f90, and thrgibbs3f90.
GIBBS1F90 implements a Gibbs sampler for mixed threshold-linear models involving multiple categorical and linear variables. Thresholds and variances can be either estimated or assumed known. Another version, thrgibbs1f90b, is available for binary responses.
See PREGSF90 with genotypes (SNP) for options.

Run-time options

Keyboard input

When you run the program, it asks several questions, and you type the answer to each one. The first question asks for the name of the parameter file.

 name of parameter file?

Then the program asks for the total number of samples (rounds) and the length of the burn-in. The program runs for the number of rounds specified here, and the samples from the burn-in rounds are not saved.

  number of samples and length of burn-in

Enter two integers separated by spaces (or by the Enter key). For example, if you need 1000 samples with no burn-in, enter 1000 0.

Finally, the program asks for the interval at which samples are saved, also known as thinning.

 Give n to store every n-th sample? (1 means store all samples)

For example, if you enter 10, the program saves one sample every 10 rounds and discards the others. To save all samples, enter 1.
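The three answers above can also be prepared as text ahead of time. The following is a minimal Python sketch, not part of gibbsf90test: the helper name `build_answers` is hypothetical, and it assumes the program reads its answers from standard input in the order of the prompts shown above.

```python
def build_answers(parfile, samples, burnin, interval):
    """Return the text for the three prompts, one answer per line.

    Hypothetical helper: the order follows the prompts described above
    (parameter file, then samples and burn-in, then interval).
    """
    if samples < 1 or burnin < 0 or interval < 1:
        raise ValueError("samples and interval must be >= 1, burn-in >= 0")
    # Samples and burn-in are given on the same line, separated by a space.
    return f"{parfile}\n{samples} {burnin}\n{interval}\n"


answers = build_answers("renf90.par", 1000, 0, 10)
print(answers, end="")
```

The resulting text could then be passed to the program, for example with `subprocess.run(["gibbsf90test"], input=answers, text=True)`. That the program accepts piped answers this way on your system is an assumption to verify.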

Command line options

Instead of typing the answers above, you can give them on the command line in a shell. The general form of the command line is as follows.

gibbsf90test parfile --samples i --burnin j --interval k

Replace each item with your own value.

  • parfile: the name of the parameter file (characters)
  • i: the number of samples (integer)
  • j: the length of the burn-in (integer)
  • k: the interval at which samples are saved (integer)

For example, to use `renf90.par` as the parameter file, draw 10000 samples with no burn-in, and save all samples, run the following command.

gibbsf90test renf90.par --samples 10000 --burnin 0 --interval 1

Tip: you can use --rounds instead of --samples (the singular forms --sample and --round also work). Likewise, --thinning can be used in place of --interval.
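When the program is launched from a script, the command line above can be assembled programmatically. This is a minimal Python sketch under that assumption; the function name `build_command` is hypothetical, and only the flags documented above (--samples, --burnin, --interval) are used.

```python
import shlex


def build_command(parfile, samples, burnin, interval, program="gibbsf90test"):
    """Return the argument list for the documented command-line form."""
    return [
        program, parfile,
        "--samples", str(samples),
        "--burnin", str(burnin),
        "--interval", str(interval),
    ]


cmd = build_command("renf90.par", 10000, 0, 1)
# Show the command as it would be typed in a shell.
print(shlex.join(cmd))
```

The list form can be handed directly to `subprocess.run(cmd)`, which avoids shell quoting problems if the parameter file name contains spaces.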

Parameters

The parameter file is the same as for BLUPF90, except for the options.

Options in parameter file

OPTION cat 0 0 2 5

Each “0” indicates that the corresponding trait is linear, so here the first and second traits are linear. “2” and “5” indicate that the third and fourth traits are categorical with 2 (binary) and 5 categories.

OPTION fixed_var all

Store all samples for solutions in “all_solutions” and posterior means and SD for all effects in “final_solutions”, assuming that (co)variances in the parameter file are known.

OPTION fixed_var all 1 2 3

Store all samples for solutions in “all_solutions” and posterior means and SD for effects 1, 2, and 3 in “final_solutions”, assuming that (co)variances in the parameter file are known.

OPTION fixed_var mean

Only posterior means and SD for solutions are calculated for all effects in “final_solutions”, assuming that (co)variances in the parameter file are known.

OPTION fixed_var mean 1 2 3

Only posterior means and SD for solutions are calculated for effects 1, 2, and 3 in “final_solutions”, assuming that (co)variances in the parameter file are known.

OPTION solution all

Caution: this option creates a huge output file when you run many rounds and/or use a large model. Store all samples for solutions in “all_solutions” and posterior means and SD for all effects.

OPTION solution all 1 2 3

Caution: this option creates a huge output file when you run many rounds and/or use a large model. Store all samples for solutions in “all_solutions” and posterior means and SD for effects 1, 2, and 3.

OPTION solution mean

Only posterior means and SD for solutions are calculated for all effects in “final_solutions” while sampling (co)variances.

OPTION solution mean 1 2 3

Only posterior means and SD for solutions are calculated for effects 1, 2, and 3 in “final_solutions” while sampling (co)variances.

OPTION save_halfway_samples 5000

The program saves intermediate results every “5000” samples so that a job can be restarted or recovered right after the last saved samples. This is useful when the program stops unexpectedly.

OPTION cont 10000

“10000” is the number of samples run previously. The user can restart the program from the last run. This option requires the “binary_final_solutions”, “gibbs_samples”, and “fort.99” files. When “OPTION cont” is used, all output files are replaced by new ones, so back up all files before running with this option.

OPTION prior 5 2 -1 5 

The (co)variance priors are specified in the parameter file.
The degree of belief must be specified for all random effects using the following structure:
OPTION prior eff1 db1 eff2 db2 … effn dbn -1 dbres
Here effx is the effect number and dbx is the degree of belief for that random effect; -1 marks the entry for the degree of belief of the residual variance.
In this example, 2 is the degree of belief for the 5th effect, and 5 is the degree of belief for the residual.
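The pairing of effect numbers and degrees of belief can be illustrated with a small Python sketch that writes the option line. The helper `prior_option` is hypothetical and only formats text according to the structure above. It lists effects in ascending order for readability; whether the program requires a particular order is not stated here.

```python
def prior_option(effect_beliefs, residual_belief):
    """Format an OPTION prior line.

    effect_beliefs: dict mapping random-effect number -> degree of belief.
    residual_belief: degree of belief for the residual (written after -1).
    """
    parts = []
    for effect in sorted(effect_beliefs):
        parts += [str(effect), str(effect_beliefs[effect])]
    # -1 introduces the degree of belief of the residual variance.
    parts += ["-1", str(residual_belief)]
    return "OPTION prior " + " ".join(parts)


print(prior_option({5: 2}, 5))
```

For the example in this section, the call reproduces the line `OPTION prior 5 2 -1 5`.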

OPTION seed 123 -432

Two seeds for a random number generator can be specified.

OPTION thresholds 0.0 1.0 2.0

Set the fixed thresholds. No need to set 0 for binary traits.

OPTION residual 1

Set the residual variance to 1 for categorical traits. For binary traits, the residual variance is automatically set to 1, so this option is not needed.

OPTION pos_def x.x

Enable the positive-definiteness (pos-def) check for fixed effects, where x.x is the tolerance (default=1d-08).

OPTION censored 1 0

Negative values for the categorical trait in the data set indicate censored records. “1 0” determines that the first categorical trait is censored and the second uncensored.

OPTION SNP_file snp

Specify the SNP file name to use genotype data.

OPTION external_script name n

The program stops every n iterations, writes samples of variances, unknowns, and thresholds, waits for the external process 'name' to run, and then resumes. This is useful for computing diagnostics from samples without storing them or modifying the program.

Save intermediate results for "cold start"

  OPTION save_halfway_samples n

This option supports the 'cold start', that is, continuing the sampling when the program accidentally stops before completing the run. An integer value n is required. Every n rounds, the program saves intermediate samples to 2 files (last_solutions and binary_final_solutions). The program can restart the sampling from the last round at which the intermediate files were saved. The program also writes a log file, save_halfway_samples.txt, with useful information for the next run.

To restart, add OPTION cont 1 to your parameter file and run gibbsf90test again. Enter the 3 numbers (samples, burn-in, and interval) given in save_halfway_samples.txt. gibbsf90test handles the whole restart process by itself, so no other tools are needed.

Tips

  • A small n makes the program slow because of frequent file writing. The value of n should be a multiple of the interval (the 3rd number you enter when the program starts).
  • If the program stops during burn-in, the restart will fail because gibbs_samples has not been created. The recommendation is burn-in=0 (but then posterior means and SD for solutions are not provided).
  • The cold start may add tiny numerical errors to the samples. Samples from a cold start will not be identical to samples from a non-stop analysis.
  • If the program is killed while it is saving the intermediate samples, the cold start will fail. To guard against this, you can manually back up gibbs_samples, fort.99, last_solutions, and binary_final_solutions at some point and copy them back if needed.
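The manual backup mentioned in the last tip could be scripted. The following Python sketch copies whichever of the four files exist into a backup directory. The function name and the directory layout are assumptions for illustration, not features of gibbsf90test.

```python
import shutil
from pathlib import Path

# File names taken from the tip above.
RESTART_FILES = ["gibbs_samples", "fort.99", "last_solutions", "binary_final_solutions"]


def backup_restart_files(run_dir, backup_dir):
    """Copy the restart files found in run_dir to backup_dir.

    Returns the list of file names that were copied.
    """
    run_dir, backup_dir = Path(run_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in RESTART_FILES:
        src = run_dir / name
        if src.is_file():
            shutil.copy2(src, backup_dir / name)  # copy2 also keeps timestamps
            copied.append(name)
    return copied
```

A call such as `backup_restart_files(".", "backup_round_800")` would copy the files from the current directory. A copy taken while the program is in the middle of writing these files may itself be incomplete, so the timing of the backup still matters.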

Example

Add the following option to your parameter file before you run the program for the first time.

  OPTION save_halfway_samples 100

Run gibbsf90test. You will see the following message on screen.

  '**** saving halfway samples in every         100  rounds (default=0)

In this example, we assume the number of total samples is 3000, the burn-in is 0, and the interval is 10.

   number of samples and length of burn-in
   3000 0
   Give n to store every n-th sample? (1 means store all samples)
   10

Make sure the intermediate results are saved to files.

          100  rounds
  G
    2758.       1900.       2019.    
    1900.       1690.       1656.    
    2019.       1656.       1737.    
  G
    225.5      -91.35      -9.998    
   -91.35       403.5       474.6    
   -9.998       474.6       702.2    
  R
    1755.       868.7       817.0    
    868.7       2122.       1361.    
    817.0       1361.       1804.    
  * Last seeds =  1877469549   348652151
  * Number of samples kept =         100
  solutions stored in binary file: "last_solutions"
  solutions stored in file: "binary_final_solutions"

Stop the program. In this case, the program was stopped at round 880.

           880  rounds
  forrtl: error (69): process interrupted (SIGINT)
  Image              PC                Routine            Line        Source             
  thrgibbs1f90       0000000000943031  Unknown               Unknown  Unknown
  thrgibbs1f90       0000000000941787  Unknown               Unknown  Unknown

Make sure the following 5 files exist.

  binary_final_solutions  fort.99  gibbs_samples  last_solutions  save_halfway_samples.txt

Browse the file save_halfway_samples.txt. Only the information in the 'Suggestion' block is needed for the next run. The halfway samples were last saved at round 800, so the next run starts sampling from round 801. The remaining 2200 rounds (3000 - 800) are needed to reach the original goal of 3000 rounds.

  Saved on 2017-03-10 10:53:22
  
  State in the current run:
    last round              =        800
    sampled in this run     =        800
    total number of samples =       3000
    number of burn-in       =          0
    interval                =         10
  
  Suggestion for input in the next run:
    total number of samples =       2200
    number of burn-in       =          0
    interval                =         10
  
  When you restart the program, do not forget to put the following option
  in your parameter file.
     OPTION cont 1
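The three numbers for the next run can also be read from the log file by a script. This is a minimal Python sketch, assuming the file keeps the layout shown above. The function name `parse_suggestion` is hypothetical, and the embedded text is an excerpt of the example log.

```python
import re

EXAMPLE = """\
State in the current run:
  last round              =        800
  sampled in this run     =        800
  total number of samples =       3000
  number of burn-in       =          0
  interval                =         10

Suggestion for input in the next run:
  total number of samples =       2200
  number of burn-in       =          0
  interval                =         10
"""


def parse_suggestion(text):
    """Return (samples, burn-in, interval) from the 'Suggestion' block."""
    marker = "Suggestion for input in the next run:"
    # Only look at the text after the marker; ValueError if it is missing.
    block = text[text.index(marker):]

    def field(label):
        match = re.search(re.escape(label) + r"\s*=\s*(\d+)", block)
        if match is None:
            raise ValueError(f"missing field: {label}")
        return int(match.group(1))

    return (field("total number of samples"),
            field("number of burn-in"),
            field("interval"))


print(parse_suggestion(EXAMPLE))
```

For the example log, the suggested number of samples equals the original total minus the last saved round (3000 - 800 = 2200).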

Add OPTION cont 1 to your parameter file by hand. It invokes the 'cold start' module in gibbsf90test.

  OPTION cont 1

Run gibbsf90test again. You will see the following message on screen.

  '*** continuous sampling selected *** previous # samples =           1

NOTE: Although the message may say that the previous number of samples is 1, you can safely ignore it. The program runs in cold start mode and works correctly.

Input the three numbers that are shown in save_halfway_samples.txt.

   number of samples and length of burn-in
   2200 0
   Give n to store every n-th sample? (1 means store all samples)
   10

The program starts from round 801, as expected.

           801  rounds
   G
     828.0       601.3       610.8    
     601.3       996.3       877.4    
     610.8       877.4       822.8    
   G
     2541.       1531.       1756.    
     1531.       1459.       1598.    
     1756.       1598.       1898.    
   R
     1800.       833.0       813.3    
     833.0       2114.       1303.    
     813.3       1303.       1778.    

You can interrupt the program again. The results will be essentially the same as those from a non-stop analysis.

readme.gibbsf90new.txt · Last modified: 2019/05/07 01:54 by yutaka