By Yutaka Masuda
In the practice of animal breeding, selection is a critical step. The selected individuals should be genetically superior, and we expect them to produce progeny that perform better than their parents. To predict an individual’s genetic merit, genomic information is now available in addition to phenotypic observations and pedigree information. The prediction is based on statistical methods applied to possibly “big” data sets. I am looking for robust, reliable, and efficient methods that use all available sources of information to identify individuals for selection. Quantitative genetics is essential for developing the statistical models.
In real-life breeding populations, especially for farm animals, the available data set is not necessarily ideal. For example, the data can contain human errors, some pedigree records may be missing, and genotypes may not be available for all individuals. Genomic-selection methods try to evaluate young individuals with limited information, but an individual's evaluation may change as more information on it accumulates. I am looking for ways to handle these difficulties in actual genetic and genomic evaluation. The solutions feed back into method development.
When new traits become available for selection, we have to build a statistical model suitable for genetic and genomic prediction. The model should separate the genetic and non-genetic factors in the observations. We have to know the genetic parameters that describe the genetic background of the traits (for example, how much genetic effects contribute to a trait compared with non-genetic effects). Some traits are observed repeatedly over time, and some are genetically correlated with each other. I propose reasonable evaluation models for such traits, taking into account the data design and the population structure.
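The contribution of genetic effects relative to non-genetic effects mentioned above is usually summarized as heritability. As a minimal sketch, assuming the simplest case with only an additive genetic variance and a residual variance (the symbols $\sigma_a^2$, $\sigma_e^2$, and $h^2$ are introduced here for illustration and do not come from a specific evaluation model):

```latex
h^2 = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_e^2}
```

For example, with hypothetical values $\sigma_a^2 = 30$ and $\sigma_e^2 = 70$, the heritability would be $h^2 = 30 / (30 + 70) = 0.3$. Real models often include further variance components, such as permanent environmental effects for repeated records, which would be added to the denominator.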
Any statistical method needs software before it can be applied to data. The software must become more computationally efficient as the data grow, which is the case for genomic prediction in animal breeding. I have been developing a set of computing modules that compute the predictions quickly, and they are now incorporated in the BLUPF90 series of programs. The software is used by several companies and associations worldwide for genomic evaluation.
I am one of the maintainers of BLUPF90; I regularly fix bugs and add new features to the programs. My software can optionally use parallel computing (shared and distributed memory), graphics processing units (GPUs), and efficient algorithms (e.g., sparse matrix operations). I have significantly improved the software in speed, memory usage, and usability.
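To give a flavor of why sparse matrix operations matter, here is a minimal Python sketch, not code from BLUPF90. It assumes a small symmetric positive-definite system of equations, stores only the non-zero coefficients, and solves the system with the conjugate gradient method, one common iterative solver for large sparse systems. All names (`sparse_matvec`, `conjugate_gradient`, the toy matrix `A`) are hypothetical and chosen for illustration only.

```python
# Illustrative sketch only; not the BLUPF90 implementation.

def sparse_matvec(rows, x):
    """Multiply a sparse matrix by a vector.

    The matrix is stored as one {column: value} dict per row,
    so zero coefficients cost neither memory nor arithmetic.
    """
    return [sum(v * x[j] for j, v in row.items()) for row in rows]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def conjugate_gradient(rows, b, tol=1e-12, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite sparse A."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, with x starting at zero
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old < tol:   # squared residual norm is small enough
            break
        Ap = sparse_matvec(rows, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Toy 3x3 symmetric positive-definite system (hypothetical numbers).
A = [
    {0: 4.0, 1: 1.0},
    {0: 1.0, 1: 3.0, 2: 1.0},
    {1: 1.0, 2: 2.0},
]
b = [1.0, 2.0, 3.0]
solution = conjugate_gradient(A, b)
```

Each iteration touches only the stored non-zero coefficients, so time and memory scale with the number of non-zeros rather than with the square of the number of equations. That property is what makes iterative solvers attractive when the number of equations is very large.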