2009-02-11

Fitting Legendre (orthogonal) polynomials in R

Frederick Novomestky packaged a series of orthogonal polynomials in the orthopolynom R package. However, his functions can not be used "directly" in a statistical model, say in lm(). There is no need to use functions from orthopolynom package, since there is a poly() function in stats package that is shipped with R. Nevertheless, I played with class of Legendre polynomials. (Wikipedia, Wolfram) This class of polynomials is very popular in my field since the introduction of so called random regression models (e.g. link1, link2, link3, ..., and some of our work), though the splines (Wikipedia, Wolfram) are making their way through (see this link and follow references). Let's go to orthopolynom package. Say we want to use Legendre polynomial of 4th order for a variable x. First we need to get the coefficients for basis functions:
library(package="orthopolynom")
(leg4coef <- legendre.polynomials(n=4, normalized=TRUE))
[[1]]
0.7071068

[[2]]
1.224745*x

[[3]]
-0.7905694 + 2.371708*x^2

[[4]]
-2.806243*x + 4.677072*x^3

[[5]]
0.7954951 - 7.954951*x^2 + 9.280777*x^4
To use this polynomial in a model, we need to create a design matrix with sensible column names and without the intercept:
x <- 1:100
leg4 <- as.matrix(as.data.frame(polynomial.values(polynomials=leg4coef,
x=scaleX(x, u=-1, v=1))))
colnames(leg4) <- c("leg0", "leg1", "leg2", "leg3", "leg4")
leg4 <- leg4[, 2:ncol(leg4)]
Now we can use this in a model, e.g., lm(y ~ leg4). I made this whole process easier - with the functions bellow, we can simply use lm(y ~ Legendre(x=scaleX(x, u=-1, v=1), n=4)). I contacted Fred and I hope he will add some version of these functions to his package.

Update 2009-03-16: If we want that the intercept has a value of 1, we need to rescale the polynomial coefficients by multiplying with sqrt(2).
Legendre <- function(x, n, normalized=TRUE, intercept=FALSE, rescale=TRUE)
{
## Create a design matrix for Legendre polynomials
## x - numeric
## n - see orthopolynom
## normalized - logical, see orthopolynom
## intercept - logical, add intercept
tmp <- legendre.polynomials(n=n, normalized=normalized)
if(!intercept) tmp <- tmp[2:length(tmp)]
polynomial.values(polynomials=tmp, x=x, matrix=TRUE)
}

polynomial.values <- function(polynomials, x, matrix=FALSE)
{
## Changed copy of polynomial.vales from orthopolynom in order
## to add matrix argument
require(polynom)
n <- length(polynomials)
if(!matrix) {
values <- vector(mode="list", length=n)
} else {
values <- matrix(ncol=n, nrow=length(x))
}
j <- 1
while(j <= n) {
if(!matrix) {
values[[j]] <- predict(polynomials[[j]], x)
} else {
values[, j] <- predict(polynomials[[j]], x)
}
j <- j + 1
}
values
}

3 comments:

Norman said...

Thanks! this example code is very helpful.
Norman Packard

Andrew T said...

I was looping over polynomial degree testing a few models, and found that I couldn't call the predict() generic on the model with a new dataset if the n parameter was a variable, rather than a constant, and this variable's value had subsequently changed. The following fails:

x=seq(2,20,by=2)
d=data.frame(x=x,y=cos(x/10))
degree=4
mdl=lm(y ~ Legendre(x=scaleX(x, u=-1, v=1), n=degree))
degree=3 # degree variable is modified
x2=seq(2,20,by=1)
d2=data.frame(x=x2)
d2$ypred=predict(mdl,d2) # fails as mdl seems to pull modified degree from environment

Gorjanc Gregor said...

Andrew,

I am confused. I did this in a fresh R session:

1. Copied function definitions for Legendre() and polynomial.values() from the above post

2. Run your example as:

library(package="orthopolynom")
x=seq(2,20,by=2)
d=data.frame(x=x,y=cos(x/10))
degree=4
mdl=lm(y ~ Legendre(x=scaleX(x, u=-1, v=1), n=degree), data=d)
degree=3 # degree variable is modified
x2=seq(2,20,by=1)
d2=data.frame(x=x2)
d2$ypred=predict(mdl,d2)

And got the following errors:

Error: variable 'Legendre(x = scaleX(x, u = -1, v = 1), n = degree)' was fitted with type "nmatrix.4" but type "nmatrix.3" was supplied

Yes, you are right, this fails, but I believe there is nothing I could do in my code to make it work as you defined Legendre(..., n=degree) and of course whenever you call (directly or indirectly) the function looks for degree in global environment. If you change it, you are in trouble. Do you have a proposal what to do?