Reliability of pedigree-based and genomic evaluations in selected populations

The paper "Reliability of pedigree-based and genomic evaluations in selected populations" has finally been published after tedious reviews. This work is a follow-up to the study on reliability of genetic evaluation in selected populations by Piter Bijma (link), but this time connecting pedigree-based and genomic evaluations. The take-home message is that reliabilities based on prediction error variance (PEV) are less affected by selection for genomic predictions than for pedigree predictions. This has implications when different breeding program designs are compared using PEV reliabilities (PEV reliabilities of genomic and pedigree predictions are commonly compared) and suggests that genotyping female selection candidates has been undervalued and should be reconsidered.


Plant breeding and genetics club videos

The Plant Breeding and Genetics Club at K-State has a nice website that hosts videos from the symposia they have organised (one took place in 2013 and another is planned for 2015).


Potential of genotyping-by-sequencing for genomic selection in livestock populations

Our new paper, "Potential of genotyping-by-sequencing for genomic selection in livestock populations", has been published in Genetics Selection Evolution. This work shows that genotypes called from low-coverage sequencing data can be as powerful as, or even more powerful than, high-quality SNP genotypes for genomic prediction. In particular, manipulating coverage allows us to increase the number of genotyped individuals at the expense of genotype quality, and this can bring us a long way before the accuracy of genomic predictions falls significantly. Another useful application is increasing selection intensity by genotyping more or all selection candidates.


Read line by line of a file in R

Are you using R to manipulate data for later use with other programs, i.e., with a workflow something like this:
  1. read data sets from a disk,
  2. modify the data, and
  3. write it back to a disk.
All fine, but if the data set is really big, you will soon run into memory issues. If the data processing is simple and you can read the data in chunks, say line by line, then the following might be useful:

## File
file <- "myfile.txt"
## Create connection
con <- file(description=file, open="r")
## Hopefully you know the number of lines from some other source;
## otherwise count them with wc (needs a Unix-like system)
com <- paste("wc -l ", file, " | awk '{ print $1 }'", sep="")
n <- as.integer(system(command=com, intern=TRUE))
## Loop over a file connection
for(i in seq_len(n)) {
  tmp <- scan(file=con, nlines=1, quiet=TRUE)
  ## do something on a line of data
}
## Close the connection when done
close(con)
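
When the number of lines is not known in advance, one alternative is to keep reading from the connection until nothing comes back. The following is only a minimal sketch of that idea: the helper name processLines and its arguments fun and chunk are made up for illustration, and it assumes that the per-line results are small enough to keep in memory.

```r
## Hypothetical helper (not from the original post): apply fun() to every
## line of a file, reading at most 'chunk' lines at a time
processLines <- function(file, fun, chunk = 1000) {
  con <- file(description = file, open = "r")
  on.exit(close(con))            # close the connection even on error
  out <- list()
  repeat {
    lines <- readLines(con, n = chunk)
    if (length(lines) == 0) break  # end of file reached
    out <- c(out, lapply(lines, fun))
  }
  out
}

## Usage on a small temporary file: sum the numbers on each line
tmpFile <- tempfile()
writeLines(c("1 2 3", "4 5 6", "7 8 9"), tmpFile)
res <- processLines(tmpFile,
                    function(line) sum(scan(text = line, quiet = TRUE)),
                    chunk = 2)
```

Only chunk lines are held in memory at once, so chunk can be tuned to trade memory use against the number of read calls.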


Parse arguments of an R script

R can also be used as a scripting tool. We just need to add a shebang as the first line of a file (script), for example:

#!/usr/bin/env Rscript

and then the R code should follow.

Often we want to pass arguments to such a script; these can be collected in the script with the commandArgs() function. We then need to parse the arguments and act on them. I came up with a rather general way of parsing these arguments using just these few lines:

## Collect arguments
args <- commandArgs(TRUE)
## Default setting when no arguments passed
if(length(args) < 1) {
  args <- c("--help")
}
## Help section
if("--help" %in% args) {
  cat("
      The R Script

      Arguments:
      --arg1=someValue   - numeric, blah blah
      --arg2=someValue   - character, blah blah
      --arg3=someValue   - logical, blah blah
      --help              - print this text

      Example:
      ./test.R --arg1=1 --arg2=\"output.txt\" --arg3=TRUE \n\n")
  ## Quit after printing the help text
  q(save="no")
}
## Parse arguments (we expect the form --arg=value)
parseArgs <- function(x) strsplit(sub("^--", "", x), "=")
argsDF <- as.data.frame(do.call("rbind", parseArgs(args)))
argsL <- as.list(as.character(argsDF$V2))
names(argsL) <- as.character(argsDF$V1)
## Arg1 default (test the parsed list argsL, not the raw vector args)
if(is.null(argsL$arg1)) {
  ## do something
}
## Arg2 default
if(is.null(argsL$arg2)) {
  ## do something
}
## Arg3 default
if(is.null(argsL$arg3)) {
  ## do something
}
## ... your code here ...
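
One thing to keep in mind with this approach is that every parsed value is a character string, so numeric and logical arguments have to be converted before use. The sketch below shows this on a simulated command line. The argument values are made up for illustration, and args is set by hand here instead of coming from commandArgs(TRUE).

```r
## Simulated arguments (hypothetical values), in the form that
## commandArgs(TRUE) would return them
args <- c("--arg1=1", "--arg2=output.txt", "--arg3=TRUE")

## Same parsing idea as above: strip the leading "--" and split on "="
parseArgs <- function(x) strsplit(sub("^--", "", x), "=")
argsDF <- as.data.frame(do.call("rbind", parseArgs(args)),
                        stringsAsFactors = FALSE)
argsL <- as.list(argsDF$V2)
names(argsL) <- argsDF$V1

## All values are character strings at this point; convert as needed
arg1 <- as.numeric(argsL$arg1)
arg2 <- argsL$arg2
arg3 <- as.logical(argsL$arg3)
```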

It takes some work, but I find it pretty neat and have been using it for quite a while now. I do wonder what others have come up with for this task. I hope I did not miss some very general solution.