This shows you the differences between two versions of the page.
readme.pregsf90 [2019/09/04 17:31] ignacio [Quality Control (QC) for G] |
readme.pregsf90 [2020/11/10 19:25] dani [Input files] |
||
---|---|---|---|
Line 86: | Line 86: | ||
Useful for checking Mendelian conflicts and HWE (also with ''OPTION sex_chr'') and for GWAS (see the ''PostGSF90'' program) | Useful for checking Mendelian conflicts and HWE (also with ''OPTION sex_chr'') and for GWAS (see the ''PostGSF90'' program) | ||
- | The //file// should has a header with the following column names:\\ | + | The //file// should have a header with the following column names:\\ |
//SNP_ID// - identification of the SNP (alphanumeric) \\ | //SNP_ID// - identification of the SNP (alphanumeric) \\ | ||
//CHR// - chromosome number (numeric), starting from 1 \\ | //CHR// - chromosome number (numeric), starting from 1 \\ | ||
Line 95: | Line 95: | ||
- | First SNP in the Map fiel corresponds to first SNP in genotype file, and so on. | + | The first SNP in the Map file corresponds to the first SNP in the genotype file, and so on. |
- | Other alphanumeric field are optionals. | + | Other alphanumeric fields are optional. |
If ''OPTION saveCleanSNPs'' is present, these fields are output. | If ''OPTION saveCleanSNPs'' is present, these fields are output. | ||
Line 230: | Line 230: | ||
<file>OPTION hwe x</file> | <file>OPTION hwe x</file> | ||
Checks the departure of heterozygous frequency from Hardy-Weinberg Equilibrium.\\ | Checks the departure of heterozygous frequency from Hardy-Weinberg Equilibrium.\\ | ||
- | By default this QC is not run.\\ | + | By default, this QC is not run.\\ |
Optional parameter ''x'' sets the maximum difference between observed and expected frequency\\ | Optional parameter ''x'' sets the maximum difference between observed and expected frequency\\ | ||
- | default value is 0.15 as used in Wiggans et al., 2009 JDS | + | the default value is 0.15 as used in Wiggans et al., 2009 JDS |
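As an illustration of the check described above, here is a minimal Python sketch, not the program's actual implementation. It assumes genotypes are coded 0, 1, 2 with no missing calls, that the expected heterozygous frequency is 2p(1-p), and that a SNP is flagged when the absolute difference exceeds ''x''. The function names ''hwe_departure'' and ''fails_hwe'' are hypothetical.

```python
def hwe_departure(genotypes):
    """Absolute difference between observed and expected heterozygous
    frequency for one SNP (genotypes coded 0/1/2, no missing calls)."""
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Frequency of the allele counted by the 0/1/2 coding.
    p = sum(genotypes) / (2 * n)
    expected_het = 2 * p * (1 - p)  # Hardy-Weinberg expectation (assumed)
    observed_het = sum(1 for g in genotypes if g == 1) / n
    return abs(observed_het - expected_het)


def fails_hwe(genotypes, x=0.15):
    """True if the departure exceeds x (0.15 is the documented default)."""
    return hwe_departure(genotypes) > x
```

For example, a SNP where every animal is heterozygous has an observed frequency of 1.0 against an expected 0.5, so it would be flagged under this sketch.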
<file>OPTION high_correlation x y</file> | <file>OPTION high_correlation x y</file> | ||
Line 239: | Line 239: | ||
Optional parameter //x// sets the maximum difference in allele frequency for a pair of loci to be checked.\\ | Optional parameter //x// sets the maximum difference in allele frequency for a pair of loci to be checked.\\ | ||
If no value is given, 0.025 is used. Decrease this value to speed up calculation\\ | If no value is given, 0.025 is used. Decrease this value to speed up calculation\\ | ||
- | A pair of locus is consider high correlated if the all genotypes were the same (0-0, 1-1, 2-2) or the opposite (0-2, 1-1, 2-0) (Wiggans et al 2009 JDS)\\ | + | A pair of locus is considered highly correlated if all the genotypes are the same (0-0, 1-1, 2-2) or the opposite (0-2, 1-1, 2-0) (Wiggans et al 2009 JDS)\\ |
- | Optional parameter //y// can be used to set a threshold to check number of identical samples out of the number of genotypes.\\ | + | Optional parameter //y// can be used to set a threshold to check the number of identical samples out of the number of genotypes.\\ |
default values x=0.025 y=0.995 | default values x=0.025 y=0.995 | ||
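The roles of //x// and //y// can be sketched in Python as follows. This is only an illustration under stated assumptions, not the program's code: genotypes are coded 0, 1, 2 with no missing calls, the allele-frequency pre-filter for the "opposite" case compares against the complementary frequency, and a pair is reported when the share of matching samples reaches //y//. The name ''highly_correlated'' is hypothetical.

```python
def allele_freq(genotypes):
    """Frequency of the allele counted by the 0/1/2 coding."""
    return sum(genotypes) / (2 * len(genotypes))


def highly_correlated(snp_a, snp_b, x=0.025, y=0.995):
    """True if two SNPs carry (almost) the same or the opposite genotypes."""
    if len(snp_a) != len(snp_b) or not snp_a:
        raise ValueError("SNP vectors must be non-empty and of equal length")
    n = len(snp_a)
    pa, pb = allele_freq(snp_a), allele_freq(snp_b)
    # Pre-filter: only compare pairs whose allele frequencies are within x.
    # Using 1 - pb for the opposite coding is an assumption of this sketch.
    if min(abs(pa - pb), abs(pa - (1 - pb))) > x:
        return False
    # Share of samples with identical genotypes (0-0, 1-1, 2-2) ...
    same = sum(a == b for a, b in zip(snp_a, snp_b)) / n
    # ... or opposite genotypes (0-2, 1-1, 2-0).
    opposite = sum(a == 2 - b for a, b in zip(snp_a, snp_b)) / n
    return max(same, opposite) >= y
```

The pre-filter is what makes a smaller //x// faster: fewer pairs pass it, so fewer genotype-by-genotype comparisons are made.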
Line 256: | Line 256: | ||
<file>OPTION exclusion_threshold x</file> | <file>OPTION exclusion_threshold x</file> | ||
- | Number of parent-progeny exclusions as percentage all SNP to determine wrong relationship.\\ | + | Number of parent-progeny exclusions as percentage all SNP to determine the wrong relationship.\\ |
default value 1 | default value 1 | ||
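A minimal Python sketch of how such a threshold could be applied to one parent-progeny pair is shown below. It is an assumption-laden illustration, not the program's code: an exclusion is taken to be a pair of opposite homozygotes (0 versus 2), genotypes have no missing calls, and the relationship is declared wrong when the percentage is strictly greater than //x//. The function names are hypothetical.

```python
def exclusion_percentage(parent, progeny):
    """Percentage of SNPs at which parent and progeny are opposite
    homozygotes (0 vs 2), the assumed definition of an exclusion."""
    if len(parent) != len(progeny) or not parent:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    conflicts = sum(1 for p, o in zip(parent, progeny) if {p, o} == {0, 2})
    return 100.0 * conflicts / len(parent)


def wrong_relationship(parent, progeny, x=1.0):
    """True if the exclusion percentage exceeds x (documented default 1)."""
    return exclusion_percentage(parent, progeny) > x
```

With 100 SNPs and a single conflict the percentage is exactly 1, which this sketch does not treat as a wrong relationship; whether the program uses a strict or non-strict comparison is not stated here.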
<file>OPTION exclusion_threshold_snp x</file> | <file>OPTION exclusion_threshold_snp x</file> | ||
- | Number of parent-progeny exclusions for each locus as percentage, of pair of genotyped animals evaluated, to exclude an SNP from the analysis\\ | + | Number of parent-progeny exclusions for each locus as a percentage, of pair of genotyped animals evaluated, to exclude an SNP from the analysis\\ |
default value 10 | default value 10 | ||
Line 287: | Line 287: | ||
but they will be included for all remaining processes. **If you want to remove sex chromosomes**, which we do recommend, use ''OPTION excludeCHR''. | but they will be included for all remaining processes. **If you want to remove sex chromosomes**, which we do recommend, use ''OPTION excludeCHR''. | ||
- | A map file needs to be provided. See ''OPTION map_file''. However, note that sex chromosomes are not exlucded | + | A map file needs to be provided. See ''OPTION map_file''. However, note that sex chromosomes are not excluded |
<file>OPTION threshold_duplicate_samples x</file> | <file>OPTION threshold_duplicate_samples x</file> | ||
Line 304: | Line 304: | ||
<file>OPTION plotpca</file> | <file>OPTION plotpca</file> | ||
- | Plot first two principal components to look for stratification in the population. | + | Plot the first two principal components to look for stratification in the population. |
<file>OPTION extra_info_pca file col</file> | <file>OPTION extra_info_pca file col</file> | ||
Reads from //file// the column //col// to plot with different colors for different classes.\\ | Reads from //file// the column //col// to plot with different colors for different classes.\\ | ||
- | The file should contains at least one variable with different classes for each genotyped individual, and the order should match the order of the genotypes file.\\ | + | The file should contain at least one variable with different classes for each genotyped individual, and the order should match the order of the genotypes file.\\ |
Variables can be alphanumeric and are separated by one or more spaces. | Variables can be alphanumeric and are separated by one or more spaces. | ||
Line 345: | Line 345: | ||
<file>OPTION no_quality_control</file> | <file>OPTION no_quality_control</file> | ||
- | This option turn off all quality control !!!! \\ | + | This option turns off all quality control !!!! \\ |
Useful to speed up the run when QC was already performed on the data. | Useful to speed up the run when QC was already performed on the data. | ||