2008-07-09

Symmetric sparse matrices with Matrix R package

These tests are flawed! See here

I need to solve the following system of normal equations i.e. I want to calculate x from Ax=b, where A is a square symmetric positive definite matrix. Since I use R, I tried to do this with the new Matrix package. However, I wanted to test different approaches of building and storing a symmetric matrix.

## Load the package
library(Matrix)

## Create a 100 x 100 sparse matrix
n <- 100 test <- Matrix(data=0, nrow=n, ncol=n, sparse=TRUE)
## See the structure
str(test)

I will only printout, that this is a matrix of dsCMatrix class which means that this is symmetric matrix stored in column compressed format.

object.size(test)
## 1.3 kB

## Adding some values to the first row
test[1, 1:n] <- 1:n
str(test)

Class is now dgCMatrix, since the matrix is not symmetric anymore.

object.size(test)
## 2.4 kB

## Force it to become symmetric
test2 <- forceSymmetric(test)
str(test2)
## dsCMatrix --> OK
object.size(test2)
## 2.5 kB
The size practically did not change, which is good - values are stored only once.


## Adding to the first column
test[1:n, 1] <- 1:n
str(test)


The class is still dgCMatrix, which means that there is no automatic checking that the matrix is
symmetric. This makes sense, since automatic checking could be a formidable task.


object.size(test)
## 3.6 kB
That is also the reason for larger size of the object in memory.


## Convert the class
test <- as(test, "dsCMatrix")
str(test)
## dsCMatrix
object.size(test)
## 2.5 Kb
After conversion to dsCMatrix class, we again drop redundant info from one off-diagonal part of the matrix.

Now let us try some sums to see how matrices behave.

## Summation of two sparse symmetric matrices
test2 <- test + test
str(test2)
## dsCMatrix
object.size(test2)
## 2.5 kB
The symmetry is preserved as well as the size of the object in memory - the "identical" size is just because we summed the same object.

Now let us try what happens if one matrix has upper and other lower triangle filled.

##Summation of upper and lower "triangle"
n <- 100
test1 <- test2 <- Matrix(data=0, nrow=n, ncol=n, sparse=TRUE)
test1[1, 1:n] <- 1:n
test2[1:n, 1] <- 1:n
test1 <- forceSymmetric(test1)
test2 <- forceSymmetric(test2)
str(test1)
## dsCMatrix
object.size(test1)
## 2.5 kB
str(test2)
## dsCMatrix
object.size(test2)
## 1.3 kB
Whoops! these matrices are the same in terms of the content, but the second one is almost half the size. This is probably due to the column compressed format i.e. lower triangle is better suited for this. This is good to know!


## Sum them
test <- test1 + test2
str(test)
## dsCMatrix
object.size(test)
## 2.5
The size of the sum is influenced by the "big" matrix.

## Reverse the sum
test <- test2 + test1
str(test)
## dsCMatrix
object.size(test)
## 2.5 kb
The same result as before - it seem that the order does not matter!


## Now sum only the "small" matrix
test <- test2 + test2
str(test)
## dsCMatrix
object.size(test)
## 1.3 kB
Aha! Again lower triangle wins.

## Now sum only the "big" matrix
test <- test1 + test1
str(test)
## dsCMatrix
object.size(test)
## 2.5 kB

No comments: