2008-11-26

Sweave.sh plays with cacheSweave

I have added caching support to the Sweave.sh script, using the cacheSweave R package written by Roger D. Peng. You can now turn caching on for time-consuming chunks (data import, long calculations, ...), and the Sweaving process will reuse the cached objects whenever they are needed. For details, read the cacheSweave package vignette. The --cache option of Sweave.sh should be easy to understand; here is a minimal example:
\documentclass{article}
\usepackage{Sweave}
\begin{document}

<<setup>>=
n <- 10
s <- 15
@

Let us first simulate \Sexpr{n} values from a normal distribution and add a \Sexpr{s} sec pause to show the effect of caching.

<<simulate, cache=true>>=
x <- rnorm(n)
Sys.sleep(s)
@

Now print the values:

<<print, results=verbatim>>=
print(x)
@

\end{document}
Now, one can run the following command:
Sweave.sh --cache test.Rnw
and the output on the command line is:
Run Sweave and postprocess with LaTeX directly from the command line
- cache mode via cacheSweave R package

R version 2.8.0 (2008-10-20)
Copyright (C) 2008 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(package='cacheSweave'); Sweave(file='test.Rnw', driver=cacheSweaveDriver);
Loading required package: filehash
filehash: Simple key-value database (2.0 2008-08-03)
Loading required package: stashR
A Set of Tools for Administering SHared Repositories (0.3-2 2008-04-30)
Writing to file test.tex
Processing code chunks ...
1 : echo term verbatim (label=setup)
2 : echo term verbatim (label=simulate)
3 : echo term verbatim (label=print)

You can now run LaTeX on 'test.tex'
When you repeat the Sweaving process, which you more or less always do, there is no need to wait for 15 seconds, since the cacheSweave package takes the x object from the cache! Excellent job Roger!
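
To make the idea behind chunk caching concrete, here is a minimal base-R sketch. It is not how cacheSweave is implemented; the helper cached_chunk() and the cache directory are hypothetical names invented for this illustration, and the pause is shortened to 1 second. The sketch stores a chunk's result on disk the first time it is evaluated and reads it back on later runs, so the expensive code is skipped entirely.

```r
# Illustration only: cached_chunk() is a hypothetical helper, not part of
# cacheSweave. It evaluates `expr` only if no cached result exists for `label`.
cache_dir <- tempfile("chunk-cache-")  # fresh directory for this demo

cached_chunk <- function(label, expr, dir = cache_dir) {
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  cache_file <- file.path(dir, paste0(label, ".rds"))
  if (file.exists(cache_file)) {
    # Cache hit: `expr` is a lazy promise and is never evaluated.
    return(readRDS(cache_file))
  }
  value <- expr                 # cache miss: the promise is forced here
  saveRDS(value, cache_file)    # store the result for the next run
  value
}

n <- 10
s <- 1  # shortened pause; the Rnw example above uses 15 seconds

simulate <- function() {
  cached_chunk("simulate", {
    Sys.sleep(s)
    rnorm(n)
  })
}

first  <- system.time(x1 <- simulate())[["elapsed"]]  # pays for the pause
second <- system.time(x2 <- simulate())[["elapsed"]]  # served from the cache

print(c(first = first, second = second))
print(identical(x1, x2))
```

The second call returns the same x without sleeping. Note that this sketch never invalidates the cache: if the chunk code changes, the stale .rds file has to be deleted by hand.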

11 comments:

Yihui said...

Thanks for your information! I haven't tried the package yet. Do you mean the code "Sys.sleep(s)" was ignored by cacheSweave? Or the system was still suspended as required but Sweave just moved on to the next code chunk to evaluate x? (parallel computation?)

Bernd said...

This is really helpful and much appreciated.

BTW: There's a small typo: It's "results" in

"print, result="verbatim".

Gregor Gorjanc said...

Thank you Bernd, I fixed the typo!

Yihui, the simulate chunk is not executed at all - the Sweave process skips the evaluation of the whole chunk (including the Sys.sleep()) since caching is turned on with the cache=true option! Object x is pulled out of the cache when it is needed, i.e., in the print chunk.

I used Sys.sleep() since it nicely mimics the "waiting problem" with Sweave. When I use Sweave, I write the Rnw document (the R code and LaTeX markup) and compile it often to check for typos or side effects. When the document gets large and the computations become more involved, you end up spending most of your time Sweaving the document. With cacheSweave, this time is much reduced.

Sarah Haile said...

This script is great, I don't know what I would do without it! Thanks a lot!

Angel said...

Hi Gregor,

I have been using your magnificent script since before it had cache support, and this tweak was the only option I felt was missing at the time. However, I am still wondering whether 'weaver' could also be easily integrated? Sorry for the lame question, but I do not feel comfortable checking/fixing bash scripts.

Cheers,
Angel
ps: weaver link: http://www.bioconductor.org/packages/1.9/bioc/html/weaver.html

Gregor Gorjanc said...

See here ;)

Angel said...

Man, that was really FAST! :) Compliment.

kjetil said...

When I open Sweave.sh in (GNU) Emacs (Ubuntu), I am warned that the script contains "unsafe" variables, and asked to change the script! (Emacs offers to change it automatically!)

What are these?

Kjetil halvorsen

Gregor Gorjanc said...

Kjetil, can you please let Emacs do the change and send me the script so that I can study the differences?

Tnx

Christiaan Ypma said...

I had already started hacking the original script before I came up with the glorious idea of googling for a solution :) Thanks a lot! It works like it should work!

Gregor Gorjanc said...

Kjetil, the message from Emacs about the variables is irrelevant. Emacs complains about the variables that I use for folding the file when I edit it in Emacs.