The incorporation of genomic coefficients into the numerator relationship matrix allows estimation of breeding values using all phenotypic, pedigree and genomic information simultaneously. In such a single-step procedure, genomic and pedigree-based relationships have to be compatible. As there are many options to create genomic relationships, there is a question of which is optimal and what the effects of deviations from optimality are. Data on litter size (total number born per litter) for 338,346 sows were analyzed. Illumina PorcineSNP60 BeadChip genotypes were available for 1,989 animals. Analyses were carried out with the complete data set and with a subset of genotyped animals and a three-generation pedigree (5,090 animals). A single-trait animal model was used to estimate variance components and breeding values. Genomic relationship matrices were constructed using allele frequencies equal to 0.5 (G05), equal to the average minor allele frequency (GMF), or equal to observed frequencies (GOF). A genomic matrix considering random ascertainment of allele frequencies was also used (GOF*). A normalized matrix (GN) was obtained to have average diagonal coefficients equal to 1. The genomic matrices were combined with the numerator relationship matrix creating H matrices. In G05 and GMF, both diagonal and off-diagonal elements were on average greater than the pedigree-based coefficients. In GOF and GOF*, the average diagonal elements were smaller than pedigree-based coefficients. The mean of off-diagonal coefficients was zero in GOF and GOF*. Choices of G with average diagonal coefficients different from 1 led to greater estimates of additive variance in the smaller data set. The correlations between EBV and genomic EBV (n = 1,989) were: 0.79 using G05, 0.79 using GMF, 0.78 using GOF, 0.79 using GOF*, and 0.78 using GN. Accuracies calculated by inversion increased with all genomic matrices.
The accuracies of genomic-assisted EBV were inflated in all cases except when GN was used. Parameter estimates may be biased if the genomic relationship coefficients are on a different scale than pedigree-based coefficients. A reasonable scaling may be obtained by using observed allele frequencies and re-scaling the genomic relationship matrix to obtain average diagonal elements of 1.

/pdf/different-genomic-relationship-matrices-for-single-step-388safpn88.pdf

Different genomic relationship matrices for single-step analysis using phenotypic, pedigree and genomic information
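
The abstract above contrasts genomic relationship matrices built with different allele frequencies (G05, GOF) and a rescaled matrix with mean diagonal 1 (GN). The following is a minimal sketch of that idea, not the authors' implementation. It assumes the common construction G = ZZ' / (2 Σ p(1 − p)), where Z holds genotypes coded 0/1/2 and centered by twice the assumed allele frequency; this formula, the function names, and the toy genotypes are all assumptions made for illustration and are not taken from the abstract.

```python
def genomic_relationship(genotypes, freqs):
    """Assumed form G = ZZ' / (2 * sum p(1-p)); genotypes are coded 0/1/2."""
    n = len(genotypes)
    m = len(freqs)
    # Center each genotype by twice the allele frequency assumed for that SNP.
    z = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in genotypes]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    return [
        [sum(z[a][j] * z[b][j] for j in range(m)) / scale for b in range(n)]
        for a in range(n)
    ]


def observed_frequencies(genotypes):
    """Frequency of the counted allele at each SNP in the genotyped sample."""
    n = len(genotypes)
    m = len(genotypes[0])
    return [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]


def normalize_mean_diagonal(g):
    """Rescale a matrix so that its average diagonal element equals 1."""
    n = len(g)
    mean_diag = sum(g[i][i] for i in range(n)) / n
    return [[value / mean_diag for value in row] for row in g]


# Hypothetical toy data: 4 animals (rows) by 5 SNPs (columns).
genotypes = [
    [0, 1, 2, 1, 0],
    [1, 1, 2, 0, 0],
    [2, 0, 1, 1, 1],
    [1, 2, 1, 2, 1],
]

g05 = genomic_relationship(genotypes, [0.5] * 5)                     # frequencies fixed at 0.5
gof = genomic_relationship(genotypes, observed_frequencies(genotypes))  # observed frequencies
gn = normalize_mean_diagonal(gof)                                    # mean diagonal of 1
```

Centering by the observed frequencies makes the elements of the resulting matrix sum to zero across the genotyped animals, so its overall level depends on the frequencies chosen. This is one way to see why the abstract stresses putting G on a scale compatible with pedigree-based coefficients.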

In livestock, genomic selection (GS) has primarily been investigated by simulation of purebred populations. Traits of interest are, however, often measured in crossbred or mixed populations with uncertain breed composition. If such data are used as the training data for GS without accounting for breed composition, estimates of marker effects may be biased due to population stratification and admixture. To investigate this, a genome of 100 cM was simulated with varying marker densities (5 to 40 segregating markers per cM). After 1,000 generations of random mating in a population of effective size 500, 4 lines with effective size 100 were isolated and mated for another 50 generations to create 4 pure breeds. These breeds were used to generate combined, F1, F2, 3- and 4-way crosses, and admixed training data sets of 1,000 individuals with phenotypes for an additive trait controlled by 100 segregating QTL and heritability of 0.30. The validation data set was a sample of 1,000 genotyped individuals from one pure breed. Method Bayes-B was used to simultaneously estimate the effects of all markers for breeding value estimation. With 5 (40) markers per cM, the correlation of true with estimated breeding value of selection candidates (accuracy) was greatest, 0.79 (0.85), when data from the same pure breed were used for training. When the training data set consisted of crossbreds, the accuracy ranged from 0.66 (0.79) to 0.74 (0.83) for the 2 marker densities, respectively. The admixed training data set resulted in nearly the same accuracies as when training was in the breed to which selection candidates belonged. However, accuracy was greatly reduced when genes from the target pure breed were not included in the admixed or crossbred population.
This implies that, with high-density markers, admixed and crossbred populations can be used to develop GS prediction equations for all pure breeds that contributed to the population, without a substantial loss of accuracy compared with training on purebred data, even if breed origin has not been explicitly taken into account. In addition, using GS based on high-density marker data, purebreds can be accurately selected for crossbred performance without the need for pedigree or breed information. Results also showed that haplotype segments with strong linkage disequilibrium are shorter in crossbred and admixed populations than in purebreds, providing opportunities for QTL fine mapping.

/pdf/genomic-selection-in-admixed-and-crossbred-populations-3vs01id99d.pdf

Genomic selection in admixed and crossbred populations.
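
The abstract above defines accuracy as the correlation of true with estimated breeding values of selection candidates. As a minimal sketch of that definition, the function below computes a Pearson correlation using only the standard library. The function name and the toy values are hypothetical and are not data from the study.

```python
import math


def accuracy(true_bv, estimated_bv):
    """Pearson correlation between true and estimated breeding values."""
    n = len(true_bv)
    mean_t = sum(true_bv) / n
    mean_e = sum(estimated_bv) / n
    # Sums of cross-products and squares of deviations from the means.
    cov = sum((t - mean_t) * (e - mean_e) for t, e in zip(true_bv, estimated_bv))
    var_t = sum((t - mean_t) ** 2 for t in true_bv)
    var_e = sum((e - mean_e) ** 2 for e in estimated_bv)
    return cov / math.sqrt(var_t * var_e)


# Hypothetical toy values for five selection candidates.
true_values = [0.4, -1.2, 0.9, 0.1, -0.3]
estimates = [0.2, -0.8, 0.5, 0.3, -0.1]
r = accuracy(true_values, estimates)
```

In a simulation such as the one described, true breeding values are known, which is why accuracy can be computed directly in this way.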

Background: One of the main limitations of many livestock breeding programs is that selection is in pure breeds housed in high-health environments but the aim is to improve crossbred performance under field conditions. Genomic selection (GS) using high-density genotyping could be used to address this. However, in crossbred populations, 1) effects of SNPs may be breed specific, and 2) linkage disequilibrium may not be restricted to markers that are tightly linked to the QTL. In this study we apply GS to select for commercial crossbred performance and compare a model with breed-specific effects of SNP alleles (BSAM) to a model where SNP effects are assumed the same across breeds (ASGM). The impact of breed relatedness (generations since separation), size of the population used for training, and marker density were evaluated. Trait phenotype was controlled by 30 QTL and had a heritability of 0.30 for crossbred individuals. A Bayesian method (Bayes-B) was used to estimate the SNP effects in the crossbred training population and the accuracy of resulting GS breeding values for commercial crossbred performance was validated in the purebred population. Results: Results demonstrate that crossbred data can be used to evaluate purebreds for commercial crossbred performance. Accuracies based on crossbred data were generally not much lower than accuracies based on pure breed data and almost identical when the breeds crossed were closely related. The accuracy of both models (ASGM and BSAM) increased with marker density and size of the training data. Accuracies of both models also tended to decrease with increasing distance between breeds. However, the effects of marker density, training data size and distance between breeds differed between the two models. BSAM only performed better than ASGM when the number of markers was small (500), the number of records used for training was large (4000), and when breeds were distantly related or unrelated.
Conclusion: GS can be conducted in crossbred populations, and models that fit breed-specific effects of SNP alleles may not be necessary, especially with high marker density. This opens great opportunities for genetic improvement of purebreds for performance of their crossbred descendants in the field, without the need to track pedigrees through the system.

/pdf/genomic-selection-of-purebreds-for-crossbred-performance-3ge14ys9yo.pdf

Genomic selection of purebreds for crossbred performance.

Several studies have shown that selection of purebreds for increased performance of their crossbred descendants under field conditions is hampered by low genetic correlations between purebred and commercial crossbred (CC) performance. Although this can be addressed by including phenotypic data from CC relatives for selection of purebreds through combined crossbred and purebred selection (CCPS), this also increases rates of inbreeding and requires comprehensive systems for collection of phenotypic data and pedigrees at the CC level. This study shows that both these limitations can be overcome with marker-assisted selection (MAS) by using estimates of the effects of markers on CC performance. To evaluate the potential benefits of CC-MAS, a model to incorporate marker information in selection strategies was developed based on selection index theory, which allows prediction of responses and rates of inbreeding by using standard deterministic selection theory. Assuming a genetic correlation between purebred and CC performance of 0.7 for a breeding program representing a terminal sire line in pigs, CC-MAS was shown to substantially increase rates of response and reduce rates of inbreeding compared with purebred selection and CCPS, with 60 CC half sibs available for each purebred selection candidate. When the accuracy of marker-based EBV was 0.6, CC-MAS resulted in 34 and 10% greater responses in CC performance than purebred selection and CCPS, respectively. Corresponding rates of inbreeding were 1.4% per generation for CC-MAS, compared with 2.1% for purebred selection and 3.0% for CCPS. For marker-based EBV with an accuracy of 0.9, CC-MAS resulted in 75 and 43% greater responses than purebred selection and CCPS, respectively, and further reduced rates of inbreeding to 1.0% per generation. Selection on marker-based EBV derived from purebred phenotypes resulted in substantially less response in CC performance than in CC-MAS.
In conclusion, effective use of MAS requires estimates of the effect on CC performance, and MAS based on such estimates enables more effective selection for CC performance without the need for extensive pedigree recording and while reducing rates of inbreeding.

Marker-assisted selection for commercial crossbred performance.

Genomic predictions for New Zealand dairy bulls and integration with national genetic evaluation

Covariance between relatives in a multibreed population was derived for an additive model with multiple unlinked loci. An efficient algorithm to compute the inverse of the additive genetic covariance matrix is given. For an additive model, the variance for a crossbred individual is a function of the additive variances for the pure breeds, the covariance between parents, and segregation variances. Provided that the variance of a crossbred individual is computed as presented here, the covariance between crossbred relatives can be computed using formulae for purebred populations. For additive traits the inverse of the genotypic covariance matrix given here can be used both to obtain genetic evaluations by best linear unbiased prediction and to estimate genetic parameters by maximum likelihood in multibreed populations. For nonadditive traits, the procedure currently used to analyze multibreed data can be improved using the theory presented here to compute additive covariances together with a suitable approximation for nonadditive covariances.

Covariance between relatives in multibreed populations: additive model.

This paper presents theory and methods to compute genotypic means and covariances in a two breed population under dominance inheritance, assuming multiple unlinked loci. It is shown that the genotypic mean is a linear function of five location parameters and that the genotypic covariance between relatives is a linear function of 25 dispersion parameters. Recursive procedures are given to compute the necessary identity coefficients. In the absence of inbreeding, the number of parameters for the mean is reduced from five to three and the number for the covariance is reduced from 25 to 12. In a two-breed population, for traits exhibiting dominance, the theory presented here can be used to obtain genetic evaluations by best linear unbiased prediction and to estimate genetic parameters by maximum likelihood.

L. L. Lo

Papers

Covariance between relatives in multibreed populations: additive model.

Theory for modelling means and covariances in a two-breed population with dominance inheritance.