2008-10-10

Plot of normal distribution with shaded area

The theory of quantitative genetics relies heavily on the normal (Gaussian) distribution, so one often wants to plot it for presentations, class notes, etc. This is not hard with the excellent tools we have today. I played with plotting in R and came up with the following plot of the normal distribution, showing:
  • the density of standardized breeding values with a mean of 100 units and a standard deviation of 12 units, and
  • a shaded area that roughly corresponds to the upper 5 % of the distribution.
Here is the plot, followed by the R code.




## Define the mean and the standard deviation
mu <- 100
sigma <- 12

## Define the proportion of the upper area to shade (0.05 = upper 5 %)
k <- 0.05

## Define the grid
x <- (seq(-5, 5, 0.01) * sigma) + mu

## Compute the density of normal distribution over the grid
y <- dnorm(x=x, mean=mu, sd=sigma)

## Compute the "threshold" for the upper k * 100 % of the distribution
t <- qnorm(p=1-k, mean=mu, sd=sigma)

## Open PDF
## pdf(file="normal.pdf", width=5, height=3, pointsize=12)

## Open Windows metafile --> good for inclusion into MS Office documents
## win.metafile(filename="normal.wmf", width=5, height=3, pointsize=12)

par(mar=c(5, 4, 1, 1) + 0.1, # c(bottom, left, top, right)
    bty="l",
    pty="m")
plot(y ~ x, type="l", axes=FALSE,
     xlab="Standardized breeding value",
     ylab="Distribution", lwd=3)
## Mark the mean and the "threshold"
abline(v=c(mu, t), lwd=2, lty=2)
abline(h=0, lwd=3)
axis(1)
axis(2, labels=FALSE, tick=FALSE)
box()

## Add shaded polygon
testK <- x >= t
xK <- x[testK]
yK <- y[testK]
polygon(x=c(xK, rev(xK)), y=c(yK, rep(0, times=length(xK))),
        col="black", border=NA)

## Add arrow --> this will need some manual work, i.e. modify the values
x2 <- t * 1.1                         # x position of the arrow tip
yT <- dnorm(x=t, mean=mu, sd=sigma)   # density at the "threshold"
y2 <- dnorm(x=x2, mean=mu, sd=sigma)  # density at the arrow tip
## Print some values to help with positioning
c(yT, y2, max(y))
arrows(x0=(mu + sigma * 3.3), y0=yT,
       x1=x2, y1=y2, lwd=2)
## Label the shaded area with its size, i.e. k * 100 %
text(x=142, y=yT + 0.0015, labels=paste(k * 100, "%"))
text(x=(mu + sigma * 3.3), y=0.02, labels="Good\n(+)")
text(x=(mu - sigma * 3.3), y=0.02, labels="Bad\n(-)")

## Close the device
## dev.off()
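
As a quick numerical check of the threshold used above (a sketch, not part of the original plot code), the same value can be obtained from the standard normal quantile, and the area to the right of it should equal k. The names z, tCheck and upper are introduced here only for illustration.

```r
## Same settings as in the plot code above
mu <- 100
sigma <- 12
k <- 0.05

## Threshold on the standard normal scale, then rescaled to breeding value units
z <- qnorm(p=1-k)
tCheck <- mu + z * sigma

## Area to the right of the threshold; this should equal k
upper <- pnorm(q=tCheck, mean=mu, sd=sigma, lower.tail=FALSE)

print(round(c(z=z, threshold=tCheck, upper=upper), 3))
```

With these settings the threshold is about 119.74 units (z is about 1.645), which is where the second dashed vertical line is drawn.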

5 comments:

Janez said...

This is just great! I like your graph, especially in PDF. Thanks for this useful post!

Grace said...

Hi, Thanks so much for posting your code! I searched for some code that graphed normal distributions with shaded areas and came across this. Just a question though, if I want to stretch out the graph a little, what would I have to change in order to do so? Also, what do I have to change in order to label the x-axis with value of the mean of the distribution and the threshold value? Thanks!

Gregor Gorjanc said...

About the stretching: you can stretch the file itself, since PDF and metafile are vector formats. Otherwise, you can play with the width and height arguments of the pdf() and win.metafile() functions.
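
A minimal sketch of changing the device size, assuming a wider 8 x 3 inch PDF is wanted instead of the 5 x 3 inch one in the post. The file name is a temporary one made up for this example, and curve() stands in for the full plotting code to keep the sketch short.

```r
## Hypothetical output file; a temporary file is used so nothing is overwritten
f <- tempfile(fileext=".pdf")

## Wider device: 8 x 3 inches instead of 5 x 3 inches
pdf(file=f, width=8, height=3, pointsize=12)
curve(dnorm(x, mean=100, sd=12), from=40, to=160, lwd=3,
      xlab="Standardized breeding value", ylab="Distribution")
dev.off()
```

Only the width and height values need to change; the plotting code between pdf() and dev.off() stays the same.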

About the axis labels: you can add them manually to the xlab argument, or build the label with paste().
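
A minimal sketch of both options, assuming the same mu, sigma and k as in the post. The names xl, at and lab are made up for this example, and rounding the threshold to one decimal is just a choice for readability. The second option, putting tick marks at the mean and the threshold with the at and labels arguments of axis(), is an alternative to the xlab approach.

```r
mu <- 100
sigma <- 12
k <- 0.05
t <- qnorm(p=1-k, mean=mu, sd=sigma)

## Option 1: build the x-axis title with paste0()
xl <- paste0("Standardized breeding value (mean = ", mu,
             ", threshold = ", round(t, 1), ")")

## Option 2: put tick marks at the mean and at the threshold
at <- c(mu, t)
lab <- as.character(c(mu, round(t, 1)))

x <- (seq(-5, 5, 0.01) * sigma) + mu
y <- dnorm(x=x, mean=mu, sd=sigma)
plot(y ~ x, type="l", axes=FALSE, lwd=3, xlab=xl, ylab="Distribution")
axis(1, at=at, labels=lab)
```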

Robin said...

Hey, very helpful, thanks for sharing! :)

NoC said...

Very helpful! I'll use your plot in a statistics class I'm teaching this semester.