2023-10-01

Overload of the term "additive" in quantitative genetics

The following discussion came up on the SLiM (https://messerlab.org/slim/) mailing list (https://groups.google.com/g/slim-discuss), and I think it is highly indicative of the confusion among many of us about additive and non-additive gene effects, additive and non-additive genetic values, and the corresponding additive and non-additive genetic variances. Doh!

It started with a common comment that there is not much non-additive genetic variation, so maybe we can ignore non-additive gene action in simulations. While this is often done, it's important to be careful about how the terms "additive" and "non-additive" are used, which I think is leading to lots of confusion among many (including me).

Let's see how the term "additive" is used in multiple ways in quantitative genetics!

1) The quantitative genetics model

phenotypic_value = intercept + genetic_value + environmental_value

is additive by construction to begin with, but nobody mentions this - it's likely that biology is not so linear! But this gives an "additive" model that we can work with well, and it seems to be giving us good predictions for many quantities.

2) The decomposition of genetic value

genetic_value = additive_genetic_value + dominance_deviation + epistasis_deviation,

is again additive by construction (we are summing things up), but let's move to the additive_genetic_value (= breeding value), which is the allele substitution effect (alpha) multiplied by the allele dosage (if dosages are, say, 0, 1, and 2, then we have 0*alpha, 1*alpha, and 2*alpha) for each locus and then summed over all causal loci. There are two "additivities" here (in addition to the additive decomposition of the phenotypic value and the genetic value!) - adding up allele substitution effects within a locus, and then across the loci.
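To make the two "additivities" in the breeding value concrete, here is a minimal R sketch. The dosages and allele substitution effects are invented for illustration (they do not come from any data set), and I use uncentred dosages, so the breeding values are not expressed as deviations from the population mean; centring would only shift all values by a constant.

```r
# Hypothetical example: 3 individuals (rows) and 4 causal loci (columns)
# Dosages (0, 1, or 2 copies of the counted allele) are invented for illustration
dosage <- matrix(c(0, 1, 2, 1,
                   2, 2, 0, 1,
                   1, 0, 1, 2),
                 nrow = 3, byrow = TRUE)

# Hypothetical allele substitution effects, one per locus
alpha <- c(0.5, -0.2, 0.1, 0.3)

# Additivity within a locus: dosage * alpha
# Additivity across loci: sum of the per-locus contributions
bv <- as.vector(dosage %*% alpha)
print(bv)
```

The matrix product does both sums at once: each locus contributes dosage times alpha, and the contributions are added over loci.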

3) The allele substitution effects and gene action

The allele substitution effect at a locus is obtained by a linear (=additive) regression of phenotypic values onto allele dosages, which (in a randomly mating population and without epistasis and GxE, but with dominance) turns out to be:

alpha = a + d(q-p),

where -a and +a are the values of the two homozygotes (with the "origin" in the middle), d is the value of the heterozygote relative to the "origin", and p and q are the frequencies of the two alleles - this is the standard quantitative genetics parameterization (see Falconer & Mackay green book page 109 - in the 1996 version). These -a, d, and +a values are the values of genotypes (genetic values) in the first (phenotype) model shown at the top.

The a value above is sometimes referred to as an additive gene action at a locus, and d as a dominant gene action at a locus.
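Here is a small R sketch that checks alpha = a + d(q-p) numerically. Instead of simulating phenotypes, I regress the three genotype values onto dosage and weight them by their Hardy-Weinberg frequencies, which is what the regression converges to in a large randomly mating population. The values of a, d, and p are arbitrary, and I am assuming that p is the frequency of the allele counted in the dosage (the one whose homozygote has value +a).

```r
# Arbitrary illustrative values (assumptions, not estimates from data)
a <- 1      # half the difference between the two homozygotes
d <- 0.5    # heterozygote value relative to the "origin"
p <- 0.3    # frequency of the counted allele (homozygote value +a)
q <- 1 - p

dosage <- c(0, 1, 2)                  # copies of the counted allele
value  <- c(-a, d, a)                 # genotype (genetic) values
freq   <- c(q^2, 2 * p * q, p^2)      # Hardy-Weinberg genotype frequencies

# Linear (=additive) regression of genotype values onto dosage,
# weighting each genotype by its frequency in the population
fit <- lm(value ~ dosage, weights = freq)
alpha_reg <- unname(coef(fit)["dosage"])

# Closed-form allele substitution effect
alpha_formula <- a + d * (q - p)

print(c(regression = alpha_reg, formula = alpha_formula))
```

Note that the regression slope (alpha) depends on d and on allele frequencies, so the "additive" effect already absorbs part of the dominant gene action.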

So, the above case shows ~5 uses of "additivity", but the show goes on! We typically see that variance of dominance deviations, and likely also epistatic deviations, is small, so we can conclude that non-additive genetic variance can be ignored? It depends! I think we need to distinguish between:

A) what is happening in reality - we don't really know, but clearly biology is highly non-linear (=non-additive), BUT 1st order approximations (=additive) will capture the majority of variation

B) what we simulate - this is what the mailing list discussion was about; we want simulations to mimic A, but we can only set parameters based on C (see next)

C) what we can estimate from the data - indeed many studies find that the variance of breeding values seems to explain most of the variance in genetic values, leading to the usual statement that most "genetic variance is additive", BUT there is a caveat that breeding values are a 1st order approximation and as such capture additive and some of the non-additive gene effects. Studies that report dominance variance, technically the variance of dominance deviations (the part not captured by breeding values - and note that breeding values capture some dominance variation!), often report small values, again indicating that most variation is "additive". BUT, some of these studies are underpowered to get accurate estimates of the variance of dominance deviations. On the other hand, there are quite a lot of studies of inbreeding depression and heterosis, indicating that there must/should be dominant gene effects (there are different hypotheses about this too that I will not go into!). I know that there is substantial inbreeding depression in maize inbred lines, which then generates very large heterosis in their hybrids. Then, I guess sometimes there are real dominant gene effects, but selection is keeping the allele frequency of unfavorable mutations low (so we only rarely see the unfavorable/unfavorable genotype!), meaning that the observed variance of dominance deviations at that locus will be low ...
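To illustrate the point in C with numbers, here is an R sketch for a single locus under random mating, using the standard single-locus results that the variance of breeding values is 2pq*alpha^2 and the variance of dominance deviations is (2pqd)^2. The helper name locus_variances and the parameter values are mine, chosen only for illustration; I set d = a (complete dominance of the favorable allele) and let q be the frequency of the unfavorable recessive allele.

```r
# Hypothetical helper: single-locus variances under random mating
# a, d - genotype value parameters (-a, d, +a); q - frequency of the
# allele whose homozygote has value -a (here the unfavorable allele)
locus_variances <- function(a, d, q) {
  p <- 1 - q
  alpha <- a + d * (q - p)            # allele substitution effect
  VA <- 2 * p * q * alpha^2           # variance of breeding values
  VD <- (2 * p * q * d)^2             # variance of dominance deviations
  # Total genetic variance computed directly from the genotype values
  freq  <- c(q^2, 2 * p * q, p^2)
  value <- c(-a, d, a)
  VG <- sum(freq * value^2) - sum(freq * value)^2
  c(q = q, alpha = alpha, VA = VA, VD = VD, VG = VG)
}

# Complete dominance (d = a) at three frequencies of the unfavorable allele
res <- t(sapply(c(0.05, 0.5, 0.9), locus_variances, a = 1, d = 1))
print(round(res, 4))
```

The gene action is identical in all three rows, yet the variance of dominance deviations is tiny when the unfavorable allele is rare, and when that allele is common most of the genetic variance is captured by breeding values. So a small estimated dominance variance tells us little about how much dominant gene action there is.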

So, all this "additivity" is convoluted.

These two papers touch on some of these points (there is lots more literature about this topic!):

Data and Theory Point to Mainly Additive Genetic Variance for Complex Traits
https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1000008


The Genetic Architecture of Quantitative Traits Cannot Be Inferred from Variance Component Analysis
https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1006421

2022-11-11

Find ancestor (R function)

We have a pedigree and want to find a sub-pedigree that lists the ancestors of some individuals. Here is an R function for this, but first, let's show an example:

library(package = "pedigreemm")
library(package = "graph")
library(package = "Rgraphviz")

ped <- data.frame( id = c(  1,   2,   3,   4,   5,   6,   7,   8,   9,  10),
                  fid = c( NA,  NA,   2,   2,   4,   2,   5,   5,  NA,   8),
                  mid = c( NA,  NA,   1,  NA,   3,   3,   6,   6,  NA,   9))
ped2 <- with(ped, pedigree(sire = fid, dam = mid, label = id))
g <- as(t(as(ped2, "sparseMatrix")), "graph")
plot(g)

Now the function

traceAncestors <- function(ids, ped, missing = NA) {
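  # ids - a vector of individuals, possibly not unique
  # ped - data.frame of global pedigree with id, father, and mother columns
  #       (in this order; columns are accessed by position)
  # missing - value(s) that denote an unknown parent
  # Returns the rows of ped for ids and all their known ancestors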
  
  # Take pedigree rows for ids
  sel <- ped[[1]] %in% ids
  ret <- ped[sel, ]
  # Find their parents (new ids)
  ids <- c(ped[[2]][sel], ped[[3]][sel])
  # ... that are unique and known
  ids <- unique(ids[!ids %in% missing])
  # ... that are not ids already
  ids <- ids[!ids %in% ret[[1]]]
  
  # Loop
  while (length(ids) > 0) {
    # Take pedigree rows for new ids
    sel <- ped[[1]] %in% ids
    ret <- rbind(ped[sel, ], ret)
    # Find their parents (new ids)
    ids <- c(ped[[2]][sel], ped[[3]][sel])
    # ... that are unique and known
    ids <- unique(ids[!ids %in% missing])
    # ... that are not ids already
    ids <- ids[!ids %in% ret[[1]]]
  }
  return(ret)
}

And a few examples

> traceAncestors(ids = 4, ped = ped)
  id fid mid
2  2  NA  NA
4  4   2  NA
> 
> traceAncestors(ids = 6, ped = ped)
  id fid mid
1  1  NA  NA
2  2  NA  NA
3  3   2   1
6  6   2   3
> 
> traceAncestors(ids = c(4, 6), ped = ped)
  id fid mid
1  1  NA  NA
2  2  NA  NA
3  3   2   1
4  4   2  NA
6  6   2   3
> 
> traceAncestors(ids = c(4, 6, 10), ped = ped)
   id fid mid
1   1  NA  NA
5   5   4   3
2   2  NA  NA
3   3   2   1
8   8   5   6
9   9  NA  NA
4   4   2  NA
6   6   2   3
10 10   8   9 
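A quick sanity check on such output is that the sub-pedigree is "closed", meaning every known parent also has its own row. The helper below is a hypothetical addition (isClosedPedigree is my name, not something from pedigreemm), and I retype the output of traceAncestors(ids = 6, ped = ped) by hand so the sketch runs on its own.

```r
# Hypothetical helper (not part of the function above): TRUE when every
# known parent in columns 2 and 3 also appears as an individual in column 1
isClosedPedigree <- function(ped, missing = NA) {
  parents <- c(ped[[2]], ped[[3]])
  parents <- unique(parents[!parents %in% missing])
  all(parents %in% ped[[1]])
}

# Sub-pedigree as printed above for traceAncestors(ids = 6, ped = ped)
sub <- data.frame( id = c( 1,  2, 3, 6),
                  fid = c(NA, NA, 2, 2),
                  mid = c(NA, NA, 1, 3))

closed <- isClosedPedigree(sub)

# Dropping the row for individual 3 leaves individual 6 with a mother
# that has no record of her own
notClosed <- isClosedPedigree(sub[sub$id != 3, ])

print(c(closed = closed, notClosed = notClosed))
```

Also note from the last example above that rows are not necessarily ordered with parents before offspring (individual 5 is listed before its parents 3 and 4), so the result may need reordering before passing it to software that expects that order.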

2015-08-13

Reliability of pedigree-based and genomic evaluations in selected populations

The paper "Reliability of pedigree-based and genomic evaluations in selected populations" has finally been published after tedious reviews. This work is a follow-up to the study on reliability of genetic evaluation in selected populations by Piter Bijma (link), but this time connecting pedigree-based and genomic evaluations. The take-home message is that reliabilities based on prediction error variance (PEV) are less affected by selection for genomic predictions than for pedigree predictions. This has implications when different breeding program designs are compared using PEV reliabilities - commonly PEV reliabilities of genomic and pedigree predictions are compared - and suggests that genotyping female selection candidates has been undervalued and should be reconsidered.

2015-03-16

Plant breeding and genetics club videos

The Plant Breeding and Genetics Club at K-State has a nice website, which hosts videos from the symposia they organised (there was one in 2013 and one is planned for 2015).

2015-03-07

Potential of genotyping-by-sequencing for genomic selection in livestock populations

Our new paper titled "Potential of genotyping-by-sequencing for genomic selection in livestock populations" has been published in Genetics Selection Evolution. This work shows that genotypes called from low-coverage sequencing data can be equally or even more powerful for genomic prediction than high-quality SNP genotypes. In particular, manipulating coverage allows us to increase the number of genotyped individuals at the expense of genotype quality, and this can bring us a long way before the accuracy of genomic predictions falls significantly. Another useful application is increasing selection intensity by genotyping more/all selection candidates.