2011-09-30

InterBull: Partitioning of international genetic trends by origin in Brown Swiss bulls

I attended the InterBull meeting this year in Stavanger (Norway), which was jointly organised with the EAAP conference in the same place. I participated with a contribution titled "Partitioning of international genetic trends by origin in Brown Swiss bulls", co-authored with colleagues. In essence, we partitioned breeding values of Brown Swiss animals by origin of selection and summarized those partitions as origin specific genetic trends. This gives us an opportunity to evaluate the effect of selection performed in different countries and how it affects the global genetic trends. Results for this breed are quite shocking! Look into the paper and talk (see below) for more ;)

We were looking forward to comments and/or critiques of the applied method from the audience, but did not get any direct questions. Several people approached me after the talk, and one of the comments was that our method is nothing more than multiplying breeding values by the share of genes coming from different origins. This is not true and I will demonstrate this with a simple example.

Let us assume that we have a simple small pedigree as shown below with R code, where column names are: id = individual code, fid = father code, mid = mother code, ori = origin, and bv = breeding value. This pedigree could represent a situation where we constantly use foreign sires; only two generations are shown here. Origin represents the country of registering/selecting the animal.

## Simple example
example <- data.frame( id=c(  1,   2,   3,   4,   5),
                      fid=c( NA,  NA,   1,  NA,   3),
                      mid=c( NA,  NA,   2,  NA,   4),
                      ori=c("A", "B", "A", "B", "A"),
                       bv=c(100, 106, 104, 106, 103))

Now we would like to perform gene proportion analysis and partitioning of the supplied breeding values, both according to origin. This is easy to achieve "by hand" for this simple example (see the paper below for the maths), but tedious for bigger examples. I have written an R package partAGV (not yet publicly available, but you can contact me) that can be used for such analyses.

## For gene proportion analysis
example$gp <- 1
 
library(partAGV)
partAGV(example, colAGV=6:5)
 
## Gene proportions
##    id  fid  mid ori gp gp_pa gp_w gp_A gp_B
## 1   1 <NA> <NA>   A  1     0    1 1.00 0.00
## 2   2 <NA> <NA>   B  1     0    1 0.00 1.00
## 3   3    1    2   A  1     1    0 0.50 0.50
## 4   4 <NA> <NA>   B  1     0    1 0.00 1.00
## 5   5    3    4   A  1     1    0 0.25 0.75
 
## Partitions of breeding values 
##    id  fid  mid ori  bv bv_pa bv_w  bv_A  bv_B
## 1   1 <NA> <NA>   A 100     0  100 100.0   0.0
## 2   2 <NA> <NA>   B 106     0  106   0.0 106.0
## 3   3    1    2   A 104   103    1  51.0  53.0
## 4   4 <NA> <NA>   B 106     0  106   0.0 106.0
## 5   5    3    4   A 103   105   -2  23.5  79.5

As we can see, gene proportions are as expected: 1/2 for each origin in individual 3 and 1/4 vs. 3/4 for individual 5. Partitions of breeding values show that in animal 5, 23.5 (out of 103) is attributed to selection work done in country A and 79.5 is attributed to selection work done in country B. Now, if we multiply the breeding value of individual 5 (103) by the origin specific gene proportions (1/4 and 3/4) we get 25.75 and 77.25, which is similar to the partitions, but not the same. This shows that our method enables separation of gene flow and selection work performed by a particular country. In order to see this algebraically, we can write the breeding value of this individual as:

a_5   = 1/2 a_3 + 1/2 a_4 + w_5
      = 1/2 (1/2 a_1 + 1/2 a_2   +     w_3)  + 1/2 w_4 + w_5
      =      1/4 a_1 + 1/4 a_2   + 1/2 w_3   + 1/2 w_4 + w_5
      =      1/4 w_1 + 1/4 w_2   + 1/2 w_3   + 1/2 w_4 + w_5
      =      1/4 100 + 1/4 106   + 1/2   1   + 1/2 106 +  -2
      =           25 +      26.5 +       0.5 +      53 +  -2

If we now collect terms specific to each origin we get:


a_5_A =           25 +                   0.5 +            -2 = 23.5
a_5_B =                     26.5 +                  53       = 79.5
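
The same collection of terms can be sketched in a few lines of base R. This is not the partAGV code, just a minimal illustration under some assumptions: the helper name partitionByOrigin is made up for this example, parents are listed before their progeny, and the Mendelian sampling term (value minus parent average) is assigned to the animal's own origin. Passing a column of ones gives gene proportions, which is the same trick as example$gp <- 1 above.

```r
## Pedigree from the simple example above
ped <- data.frame( id=c(  1,   2,   3,   4,   5),
                  fid=c( NA,  NA,   1,  NA,   3),
                  mid=c( NA,  NA,   2,  NA,   4),
                  ori=c("A", "B", "A", "B", "A"),
                   bv=c(100, 106, 104, 106, 103),
                  stringsAsFactors=FALSE)

## Hypothetical helper (not part of partAGV): partition column 'col' by origin,
## assuming parents appear in the pedigree before their progeny
partitionByOrigin <- function(ped, col) {
  origins <- sort(unique(ped$ori))
  part <- matrix(0, nrow=nrow(ped), ncol=length(origins),
                 dimnames=list(as.character(ped$id), origins))
  for (i in seq_len(nrow(ped))) {
    ## Inherited part: half of the partitions of each known parent
    inherited <- rep(0, length(origins))
    for (p in c(ped$fid[i], ped$mid[i])) {
      if (!is.na(p)) inherited <- inherited + 0.5 * part[match(p, ped$id), ]
    }
    ## Mendelian sampling term: own value minus parent average
    w <- ped[[col]][i] - sum(inherited)
    part[i, ] <- inherited
    ## The sampling term is credited to the animal's own origin
    part[i, ped$ori[i]] <- part[i, ped$ori[i]] + w
  }
  part
}

## Partitions of breeding values
bvPart <- partitionByOrigin(ped, col="bv")

## Gene proportions: partition a column of ones
ped$gp <- 1
gpPart <- partitionByOrigin(ped, col="gp")

## "Naive" alternative: breeding value times gene proportions
naive <- gpPart * ped$bv

## Individual 5: partitions (23.5, 79.5) vs. naive (25.75, 77.25)
rbind(partition=bvPart["5", ], naive=naive["5", ])
```

The two rows differ because the partition tracks where each Mendelian sampling term arose, while the naive product only tracks where the genes came from.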



Partitioning of international genetic trends by origin in Brown Swiss bulls

2011-09-28

Polyploidy in sugarcane

While reading UseR conference abstracts I came across this sentence: "Sugarcane is polyploid, i.e., has 8 to 14 copies of every chromosome, with individual alleles in varying numbers." Wow! This generates a really complex genotype system. Say we have a biallelic gene with alleles A and B. In diploids the possible genotypes are AA, AB, and BB. Given the above sentence, the possible genotypes in sugarcane are any permutation of A's and B's in a series of 8 to 14 alleles. I am not sure if 9, 11, and 13 are also allowed, that is, having an odd number of chromosome copies. In any case, such permutations result in really large numbers!

Thinking about this a bit further, it appears that the whole system is not that complex once we realize that genotyping does not tell us about the order of alleles (we cannot distinguish between AB and BA). This simplifies the problem from all possible permutations to all possible combinations, e.g., for a biallelic gene in tetraploids this corresponds to 5 combinations instead of 16 permutations.

Below is an R snippet that shows how to enumerate all possible combinations or permutations:

## Load package having nice combinatorial functions
library(package="gtools")
 
## Specify alleles - just two for simplicity
alleles <- c("A", "B")
 
## Possible genotypes for diploids
combinations(n=length(alleles), r=2, v=alleles, repeats.allowed=TRUE)
##      [,1] [,2]
## [1,] "A"  "A" 
## [2,] "A"  "B" 
## [3,] "B"  "B" 
 
## Possible genotypes for tetraploids
combinations(n=length(alleles), r=4, v=alleles, repeats.allowed=TRUE)
##      [,1] [,2] [,3] [,4]
## [1,] "A"  "A"  "A"  "A" 
## [2,] "A"  "A"  "A"  "B" 
## [3,] "A"  "A"  "B"  "B" 
## [4,] "A"  "B"  "B"  "B" 
## [5,] "B"  "B"  "B"  "B" 
 
permutations(n=length(alleles), r=4, v=alleles, repeats.allowed=TRUE)
##       [,1] [,2] [,3] [,4]
##  [1,] "A"  "A"  "A"  "A" 
##  [2,] "A"  "A"  "A"  "B" 
##  [3,] "A"  "A"  "B"  "A" 
##  [4,] "A"  "A"  "B"  "B" 
##  [5,] "A"  "B"  "A"  "A" 
##  [6,] "A"  "B"  "A"  "B" 
##  [7,] "A"  "B"  "B"  "A" 
##  [8,] "A"  "B"  "B"  "B" 
##  [9,] "B"  "A"  "A"  "A" 
## [10,] "B"  "A"  "A"  "B" 
## [11,] "B"  "A"  "B"  "A" 
## [12,] "B"  "A"  "B"  "B" 
## [13,] "B"  "B"  "A"  "A" 
## [14,] "B"  "B"  "A"  "B" 
## [15,] "B"  "B"  "B"  "A" 
## [16,] "B"  "B"  "B"  "B" 
 
## Possible genotypes for 8-14 ploids
spectrum <- seq(from=8, to=14, by=2)
nS <- length(spectrum)
retC <- vector(mode="list", length=nS)
retP <- vector(mode="list", length=nS)
for(i in 1:nS) {
  retC[[i]] <- combinations(n=length(alleles), r=spectrum[i], v=alleles, repeats.allowed=TRUE)
  retP[[i]] <- permutations(n=length(alleles), r=spectrum[i], v=alleles, repeats.allowed=TRUE)
}
combC <- sapply(retC, nrow)
combP <- sapply(retP, nrow)
cbind(spectrum, combC, combP)
##      spectrum combC combP
## [1,]        8     9   256
## [2,]       10    11  1024
## [3,]       12    13  4096
## [4,]       14    15 16384
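
If only the counts are needed, enumeration can be skipped altogether. The sketch below uses the standard closed forms: choose(n + r - 1, r) for combinations with repetition and n^r for permutations with repetition, where n is the number of alleles and r the ploidy. The object names are made up for this example, and the odd ploidy levels are included only to show that the formulas do not care; whether they occur in sugarcane is a separate question.

```r
## Closed-form counts in base R: no enumeration, no extra packages
nAlleles <- 2
ploidy <- 8:14

## Unordered genotypes (combinations with repetition)
nComb <- choose(nAlleles + ploidy - 1, ploidy)

## Ordered genotypes (permutations with repetition)
nPerm <- nAlleles^ploidy

cbind(ploidy, nComb, nPerm)
```

For a biallelic gene the number of combinations is simply ploidy + 1, since a genotype is fully described by how many copies of allele B it carries (0 up to the ploidy level).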