2013-12-01

Read line by line of a file in R

Are you using R for data manipulation for later use with other programs, i.e., a workflow something like this:
  1. read data sets from a disk,
  2. modify the data, and
  3. write it back to a disk.
All fine, but of data set is really big, then you will soon stumble on memory issues. If data processing is simple and you can read only chunks, say only line by line, then the following might be useful:

## File
file <- "myfile.txt"
 
## Create connection
con <- file(description=file, open="r")
 
## Hopefully you know the number of lines from some other source or
com <- paste("wc -l ", file, " | awk '{ print $1 }'", sep="")
n <- system(command=com, intern=TRUE)
 
## Loop over a file connection
for(i in 1:n) {
  tmp <- scan(file=con, nlines=1, quiet=TRUE)
  ## do something on a line of data 
}
Created by Pretty R at inside-R.org