readme.renumf90



<file>

RENUMF90 - renumbering program for the BLUPF90 family

          now works with SNP info

Ignacy Misztal and Ignacio Aguilar, University of Georgia August 27, 2001 - Mar 17, 2011

Summary

RENUMF90 is a renumbering program for the BLUPF90 family of programs. It supports multiple traits, different effects per trait, and both alphanumeric and numeric fields. The program provides data statistics, performs comprehensive pedigree checking, and supports unknown parent groups.

It accepts files where fields in data and pedigree files are separated by spaces. The program is still in active development so errors are possible and some features may not work or work incorrectly.

Warnings

  1. input files cannot contain character #.
  2. missing animals have code 0; 00 may be treated as a known animal
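
The two warnings above can be checked before running the program. The following is a minimal sketch in Python, not part of RENUMF90; the function name check_input_lines and the default ID positions (1 2 3) are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple


def check_input_lines(lines: Iterable[str],
                      id_positions: Tuple[int, ...] = (1, 2, 3)) -> List[str]:
    """Return warnings for lines that may confuse RENUMF90.

    id_positions are 1-based positions of ID fields (an assumption of
    this sketch; adjust them to the actual file layout).
    """
    problems = []
    for lineno, line in enumerate(lines, start=1):
        # Warning 1: the character # must not appear in input files.
        if "#" in line:
            problems.append(f"line {lineno}: contains '#'")
        fields = line.split()
        for pos in id_positions:
            if pos <= len(fields):
                value = fields[pos - 1]
                # Warning 2: only a single 0 is the missing code;
                # 00, 000, ... may be treated as a known animal.
                if len(value) > 1 and set(value) == {"0"}:
                    problems.append(
                        f"line {lineno}: field {pos} is '{value}', "
                        "not the missing code '0'"
                    )
    return problems


# Hypothetical pedigree lines with one problem of each kind.
pedigree = ["qq 0 0", "aa 00 0", "bb qq a#1"]
warnings = check_input_lines(pedigree)
for w in warnings:
    print(w)
```

The check only reports problems; it does not modify the files.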

Structure of parameter file

The parameter file contains keywords in capital letters followed by specifications of a given effect/data item. The keywords need to be typed exactly and must occur in the order shown below.

Bugs

IDs starting with “-” may not work

Fields in the parameter file

# Parameter file for program renf90; it is translated to parameter
# file for BLUPF90 family of programs.

 Lines with # are treated as comments
    
 

DATAFILE f1

 The data file is f1

TRAITS t1 t2 .. tn

 t1-tn are positions of traits in datafile; n defines the number of traits

FIELDS_PASSED TO OUTPUT p1 p2 .. pm

 fields p1-pm are passed to output without changes; can be empty
 

WEIGHT(S) w

 w is the position of the weight if present; can be empty
 

RESIDUAL_VARIANCE r

 r is matrix of residual (co)variances  of size n x n

EFFECT e1.. en type form

 this line defines one group of effects; e1 .. en are positions of this 
 effect for all traits; positions can be different for each trait for fixed
 effects; for random effects, only one position + 0 (missing) effect are
 possible.
 
 type is 'cross' for crossclassified or 'cov' for covariables
 
 form is 'alpha' for alphanumeric or 'numer' for numeric

NESTED d1 .. dn form

 optional for covariables only, specifies nesting; form is as above

RANDOM rtype

 the RANDOM keyword occurs only if the current effect is random;
 rtype is 'diagonal', 'sire' or 'animal'
 

OPTIONAL o1 o2.. oq

 causes extra effects to be appended to the animal effect; current options include
 'pe' for permanent environment, 'mat' for maternal, and 'mpe' for maternal 
 permanent environment 
    

FILE fped

 for animal and sire model only, fped specifies the pedigree file

FILE_POS an s d alt_dam yob

 for animal effect only; specifies positions in the pedigree file of animal
 an, sire s, dam d, alternate (recipient) dam alt_dam, and year of birth yob;
 missing alt_dam or yob can be replaced by 0; if this line is not given,
 defaults are 1 2 3 0 0. If maternal effect is specified, the maternal effect
 is due to position of d if alt_dam field is 0, or otherwise is due to
 alt_dam; if alt_dam field is not zero, it should include ID of real or
 recipient dam.
 

SNP_FILE fsnp

 optional; fsnp specifies files with ID and SNP information; if present, the 
 relationship matrix will be constructed as in Aguilar et al. (2010) and will
 include the genomic information; file fsnp should start with ID with the
 same format as fped and SNP info needs to start from a fixed column and
 include digits 0, 1, 2 and 5; ID and SNP info need to be separated by
 at least one space; see info for program PreGSf90 
 

PED_DEPTH p

 
 optional for animal effect only; p specifies the depth of pedigree search; 
 the default is 3; all pedigrees are loaded if p=0.
 

GEN_INT min avg max

 optional; specifies minimum, average and maximum generation interval;
 applicable only if year of birth present; minimum and maximum used for
 pedigree checks; average used to predict year of birth of parent with missing
 pedigree.
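
As an illustration of how the three values might be used, here is a minimal Python sketch. The exact rules applied by RENUMF90 are not given in this readme, so both functions are assumptions: the check requires the parent-offspring difference in year of birth to lie between min and max, and the prediction subtracts avg from the year of birth of the offspring.

```python
def interval_ok(offspring_yob: int, parent_yob: int,
                min_gi: int, max_gi: int) -> bool:
    """Check that the generation interval lies within [min_gi, max_gi].

    Assumed rule for this sketch, not taken from the RENUMF90 source.
    """
    gap = offspring_yob - parent_yob
    return min_gi <= gap <= max_gi


def predict_parent_yob(offspring_yob: int, avg_gi: int) -> int:
    """Estimate year of birth of a parent with missing pedigree.

    Assumed rule: offspring year of birth minus the average interval.
    """
    return offspring_yob - avg_gi


# Hypothetical values corresponding to a GEN_INT line "2 5 10".
print(interval_ok(2005, 2000, 2, 10))   # gap of 5 years
print(interval_ok(2005, 2004, 2, 10))   # gap of 1 year
print(predict_parent_yob(2005, 5))
```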
 

REC_SEX i

 optional; if only one sex has records, specifies which parent it is; used for
 pedigree checks.
 

UPG_TYPE t

 optional; 
 if t is 'yob', the assignment is based on year of birth; the
 subsequent line should contain list of years to separate different UPG; 
 
 if t is 'in_pedigrees', the value of a missing parent should be -x, where x is
 UPG number that this missing parent should be allocated to; in this option, 
 all known parents should have pedigree lines, i.e., each parent field should
 contain either the ID of a real parent, or a negative UPG number.
 
 if t is 'internal',  allocation is by a user-written function 
 custom_upg(year_of_birth,sex,ID, parent_code).
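
For the 'in_pedigrees' option, the coding of a parent field can be sketched in Python as follows. This is only an illustration of the convention described above; the function classify_parent is hypothetical, and treating a plain 0 as "missing" is an assumption, since with this option parent fields are expected to hold either a real ID or a negative UPG number.

```python
from typing import Tuple, Union


def classify_parent(field: str) -> Tuple[str, Union[int, str]]:
    """Classify one parent field of a pedigree line under 'in_pedigrees'."""
    # -x means: missing parent allocated to unknown parent group x.
    if field.startswith("-") and field[1:].isdigit() and int(field[1:]) > 0:
        return ("upg", int(field[1:]))
    # Assumption of this sketch: a plain 0 has no UPG assigned.
    if field == "0":
        return ("missing", 0)
    # Anything else is taken as the ID of a real parent.
    return ("animal", field)


# Hypothetical pedigree line: animal bb, sire qq, dam in UPG 3.
animal, sire, dam = "bb qq -3".split()
print(classify_parent(sire))
print(classify_parent(dam))
```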

RANDOM_REGRESSION r_type

 Specifies that random regressions should be applied to the animal and
 corresponding effects (mat, pe and mpe); this keyword can also be
 applied to set covariables for fixed effects; r_type is 'data' if covariables for
 random regressions are in the data, or 'legendre' if Legendre polynomials are
 to be generated from a single data variable; not yet implemented
 

RR_POSITION r1 .. rq

 for random regressions, r1-rq specifies positions of covariables if
 r_type='data', or r1 is order of legendre polynomial and r2 is position of
 covariable if r_type='legendre'; not yet implemented
 

(CO)VARIANCES g

 g are (co)variances for the animal effect; the dimensions of g should 
 account for the maternal effect if present
 

(CO)VARIANCES_PE gpe

 gpe are (co)variances for the PE effect if present

(CO)VARIANCES_MPE gmpe

 gmpe are (co)variances for the MPE effect if present

Sections starting from EFFECT can be repeated any number of times. If (co)variances for any effect are missing, they are substituted with matrices containing 1.0 on diagonals and 0.1 on off-diagonals.
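
The substituted default matrix is easy to reproduce. The following Python sketch (the function name default_covariances is hypothetical) builds it for n traits; for n = 2 it gives the same values as the last (CO)VARIANCES block of renf90.par in the example further down.

```python
from typing import List


def default_covariances(n: int) -> List[List[float]]:
    """Build an n x n matrix with 1.0 on the diagonal and 0.1 elsewhere."""
    return [[1.0 if i == j else 0.1 for j in range(n)] for i in range(n)]


matrix = default_covariances(2)
for row in matrix:
    print(" ".join(f"{value:.3f}" for value in row))
```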

Warning: for variance estimation by EM REML, the convergence rate is usually better if the starting values for (co)variances are too large rather than too small.

The sequence of keywords should be as above although optional fields can be skipped. Keywords out of order may not be recognized.

The following options can be added at the end of the parameter file to redefine parameters used to read the input file:

- the default size of character fields

OPTION alpha_size nn
 where nn is the new size.

- the size of the record length

OPTION max_string_readline nn
 where nn is the new size.

- the maximum number of fields

OPTION max_field_readline nn
 where nn is the number of fields. 

The end of the parameter file for RENUMF90 can contain many lines beginning with OPTION. All of these lines are passed to parameter file renf90.par to be used by application programs.

Combining fields or interactions

Several fields in the data file can be combined into one using a COMBINE keyword.

COMBINE a b c ....

concatenates b c … into a. COMBINE keywords must be at the top of the parameter file, though they may follow comments. There may be many combined fields. For example:

COMBINE 7 2 3 4

combines content of fields 2 3 4 into field 7; the data file is not changed, only the program treats field 7 as fields 2 3 4 put together (without spaces). The combined fields can be treated as “numeric”, if they are composed of numbers and if their total length is <9. Otherwise, they need to be used as “alpha”. Please note that the maximum size of the combined variable is limited by the largest size of the “alpha” field.
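
The effect of COMBINE can be mimicked in Python as below. This is a minimal sketch and not the code used by RENUMF90; the function names are hypothetical, and "composed of numbers" is interpreted here as "digits only", which is an assumption.

```python
from typing import List


def combine_fields(record: str, target: int, sources: List[int]) -> List[str]:
    """Return the fields of a record with field `target` (1-based)
    set to the source fields put together without spaces."""
    fields = record.split()
    combined = "".join(fields[pos - 1] for pos in sources)
    # Extend the record if the target position lies past its end.
    while len(fields) < target:
        fields.append("")
    fields[target - 1] = combined
    return fields


def can_be_numeric(value: str) -> bool:
    """Digits only and total length < 9 (assumed reading of the rule)."""
    return value.isascii() and value.isdigit() and len(value) < 9


record = "1 aa 34.5 11 12 zz"            # first record of data.test
as_alpha = combine_fields(record, 7, [2, 3, 4])[6]   # COMBINE 7 2 3 4
as_numer = combine_fields(record, 7, [4, 5])[6]      # COMBINE 7 4 5
print(as_alpha, can_be_numeric(as_alpha))
print(as_numer, can_be_numeric(as_numer))
```

With COMBINE 7 2 3 4 the new field contains letters and a decimal point, so it would have to be declared as "alpha"; with COMBINE 7 4 5 it is a short string of digits and could be declared as "numer".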

Additive Pedigree File

The additive pedigree file(s) renadd* has the following structure:

 1) animal number (from 1)                                        
 2) parent 1 number or unknown parent group number for parent 1   
 3) parent 2 number or unknown parent group number for parent 2   
 4) 3 minus number of known parents                               
 5) known or estimated year of birth (0 if not provided)          
 6) number of known parents (parents might be eliminated if not   
    contributing); if animal has genotype, 10 + number of known parents
 7) number of records             
 8) number of progenies (before elimination due to other effects) 
    as parent 1
 9) number of progenies (before elimination due to other effects) 
    as parent 2  
10) original animal id                                            
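
A line of a renadd* file can be read into named fields following the list above. The Python sketch below is only an illustration; the class and function names are hypothetical, and the decoding of field 6 (values of 10 or more meaning a genotyped animal) follows the description in item 6.

```python
from typing import NamedTuple


class RenaddRecord(NamedTuple):
    animal: int
    parent1: int                 # parent or unknown parent group number
    parent2: int                 # parent or unknown parent group number
    parent_code: int             # field 4, stored as written
    yob: int                     # 0 if not provided
    known_parents: int
    genotyped: bool
    n_records: int
    n_progeny_as_parent1: int
    n_progeny_as_parent2: int
    original_id: str


def parse_renadd_line(line: str) -> RenaddRecord:
    """Split one pedigree line into its 10 fields."""
    f = line.split()
    if len(f) != 10:
        raise ValueError(f"expected 10 fields, found {len(f)}")
    raw = int(f[5])
    genotyped = raw >= 10        # 10 + number of known parents
    known = raw - 10 if genotyped else raw
    return RenaddRecord(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                        int(f[4]), known, genotyped, int(f[6]),
                        int(f[7]), int(f[8]), f[9])


# First line of renadd02.ped from the example in this readme.
rec = parse_renadd_line("1 6 3 1 0 2 2 0 0 bb")
print(rec.original_id, rec.parent1, rec.parent2, rec.known_parents)
```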

Extensions

The program is being modified to support inbreeding, dominance, random regressions with automatic calculations of Legendre polynomials,…

Example

data file - data.test


1 aa 34.5 11 12 zz
3 bb 21.333 22 23 xx
8 cc 23.666 33 34 yy
1 dd 29 44 45 xx
3 aa 30 55 56 yy
5 bb 1234567.890 66 67 zz

pedigree file - test.ped


qq 0 0
aa 0 0
bb qq aa
cc qq 0
dd 0 aa

parameter file - testpar1


# Parameter file for program renf90; it is translated to parameter
# file for BLUPF90 family of programs.
DATAFILE
data.test
TRAITS
3 4
FIELDS_PASSED TO OUTPUT
2 1 # passing alphanumeric
WEIGHT(S)

RESIDUAL_VARIANCE
5 2
2 4
EFFECT
1 1 cross alpha
EFFECT
2 2 cross alpha
RANDOM
animal
OPTIONAL
mat mpe pe
FILE
test.ped
(CO)VARIANCES
10 3 2 1
3 11 4 5
2 4 12 6
1 5 6 13.01
(CO)VARIANCES_PE
5.3 2.1
2.1 4.85
(CO)VARIANCES_MPE
1.03 .27
.27 .85
EFFECT
5 0 cov
NESTED
1 0 alpha
EFFECT
6 6 cross alpha
RANDOM
diagonal

printout (temporary; the amount of details may change)


RENUMF90 version 1.93
name of parameter file?
testpar1
datafile:data.test
traits: 3 4
fields passed: 2 1
R

 5.000       2.000    
 2.000       4.000    

Processing effect 1 of type cross item_kind=alpha

Processing effect 2 of type cross item_kind=alpha
Optional maternal effect
Optional maternal permanent environment
Optional permanent environment
pedigree file name "test.ped"
positions of animal, sire, dam, alternate dam and yob 1 2 3 0 0
Reading (CO)VARIANCES: 4 x 4
Reading (CO)VARIANCES_PE: 2 x 2
Reading (CO)VARIANCES_MPE: 2 x 2

Processing effect 3 of type cov item_kind=alpha

Processing effect 4 of type cross item_kind=alpha

Maximum size of character fields: 20

Maximum size of record (max_string_readline): 800

Maximum number of fields innput file (max_field_readline): 100

hash tables for effects set up
read 6 records
table with 4 elements sorted
added count
Effect group 1 of column 1 with 4 levels
table expanded from 10000 to 10000 records
added count
Effect group 2 of column 1 with 4 levels
table with 4 elements sorted
added count
Effect group 3 of column 1 with 4 levels
table expanded from 10000 to 10000 records
table with 3 elements sorted
added count
Effect group 4 of column 1 with 3 levels
table expanded from 10000 to 10000 records
wrote statistics in file "renf90.tables"

Basic statistics for input data (missing value code is 0)
Pos  Min         Max         Mean        SD                N

 3    21.333     0.12346E+07 0.20578E+06 0.50400E+06       6
 4    11.000      66.000      38.500      20.579           6
 5    12.000      67.000      39.500      20.579           6

Correlation matrix

      3     4     5
3   1.00  0.65  0.65
4   0.65  1.00  1.00
5   0.65  1.00  1.00

Counts of nonzero values (order as above)

        6         6         6
        6         6         6
        6         6         6

random effect 2 type:animal
opened output pedigree file "renadd02.ped"
read 5 pedigree records
loaded 3 parent(s) in round 1

Pedigree checks

Number of animals with records: 4
Number of parents without records: 1
Number of phantom dams: 2
Total number of animals: 7

random effect 4 type:diag

Wrote parameter file "renf90.par"
Wrote renumbered data "renf90.dat"

new parameter file - renf90.par


# BLUPF90 parameter file created by RENF90
DATAFILE
renf90.dat
NUMBER_OF_TRAITS

         2

NUMBER_OF_EFFECTS

         7

OBSERVATION(S)

  1    2

WEIGHT(S)

EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT[EFFECT NESTED]

3  3         4 cross 
4  4         7 cross 
5  5         7 cross
5  5         7 cross
4  4         7 cross
6  0         4 cov   7  0
8  8         3 cross 

RANDOM_RESIDUAL VALUES

 5.000       2.000    
 2.000       4.000    

RANDOM_GROUP

   2     3

RANDOM_TYPE
add_animal
FILE
renadd02.ped
(CO)VARIANCES

 10.00       3.000       2.000       1.000    
 3.000       11.00       4.000       5.000    
 2.000       4.000       12.00       6.000    
 1.000       5.000       6.000       13.01    

RANDOM_GROUP

   4

RANDOM_TYPE
diagonal
FILE

(CO)VARIANCES

 1.030      0.2700    
0.2700      0.8500    

RANDOM_GROUP

   5

RANDOM_TYPE
diagonal
FILE

(CO)VARIANCES

 5.300       2.100    
 2.100       4.850    

RANDOM_GROUP

   7

RANDOM_TYPE
diagonal
FILE

(CO)VARIANCES

 1.000      0.1000    
0.1000       1.000

data file - renf90.dat


34.5 11 1 3 5 12 1 3 aa 1
21.333 22 2 1 3 23 2 1 bb 3
23.666 33 4 4 7 34 4 2 cc 8
29 44 1 2 3 45 1 1 dd 1
30 55 2 3 5 56 2 2 aa 3
1234567.890 66 3 1 3 67 3 3 bb 5

Pedigree file (same format as from renum) - renadd02.ped


1 6 3 1 0 2 2 0 0 bb
6 0 0 1 0 0 0 2 0 qq
2 0 3 1 0 1 1 0 0 dd
7 0 0 1 0 0 0 0 1 D@@0000002
5 0 0 1 0 0 0 0 1 D@@0000001
3 0 5 1 0 1 2 0 2 aa
4 6 7 1 0 2 1 0 0 cc

renumbering tables - renf90.tables


Effect group 1 of column 1 with 4 levels
Value  #  consecutive number
1  2  1
3  2  2
5  1  3
8  1  4
Effect group 3 of column 1 with 4 levels
Value  #  consecutive number
1  2  1
3  2  2
5  1  3
8  1  4
Effect group 4 of column 1 with 3 levels
Value  #  consecutive number
xx  2  1
yy  2  2
zz  2  3
<\file>

readme.renumf90.1338166919.txt.gz · Last modified: 2024/03/25 18:22 (external edit)
