- read data sets from a disk,
- modify the data, and
- write it back to a disk.
All fine, but if the data set is really big, you will soon run into memory issues. If the data processing is simple and you can read the data in chunks, say line by line, then the following might be useful:
    ## File
    file <- "myfile.txt"

    ## Create connection
    con <- file(description=file, open="r")

    ## Hopefully you know the number of lines from some other source,
    ## or you can count them with wc (UNIX/Linux only)
    com <- paste("wc -l ", file, " | awk '{ print $1 }'", sep="")
    n <- as.integer(system(command=com, intern=TRUE))

    ## Loop over a file connection
    for(i in seq_len(n)) {
      tmp <- scan(file=con, nlines=1, quiet=TRUE)
      ## do something on a line of data
    }

    ## Close the connection when done
    close(con)
2 comments:
Gave it a try with my text file but got this error
"Error in system(command = com, intern = TRUE) : 'wc' not found"
wc is a UNIX/Linux command that counts the number of lines, words, and bytes in a file. You could modify the code so that it does not rely on this utility.
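One way to avoid wc, sketched here under a few assumptions, is to keep reading until the connection returns nothing, so the number of lines never has to be known in advance. The helper name `process_lines` and its arguments are made up for this illustration, and the per-line work (summing the numbers on each line) is only a placeholder for whatever you need to do with the data.

```r
## Hypothetical helper (not from the original post): apply `fun` to each line
## of a file without knowing the number of lines beforehand.
process_lines <- function(path, fun) {
  con <- file(description = path, open = "r")
  on.exit(close(con))            # close the connection even if fun() fails
  n <- 0L
  repeat {
    line <- readLines(con, n = 1L, warn = FALSE)
    if (length(line) == 0L) break  # nothing left to read: end of file
    fun(line)
    n <- n + 1L
  }
  n                              # number of lines processed
}

## Small self-contained demonstration on a temporary file
tmp_file <- tempfile(fileext = ".txt")
writeLines(c("1 2 3", "4 5 6", "7 8 9"), tmp_file)

row_sums <- numeric(0)
n_lines <- process_lines(tmp_file, function(line) {
  values <- scan(text = line, quiet = TRUE)   # parse one line of numbers
  row_sums <<- c(row_sums, sum(values))       # placeholder "do something"
})
```

Only one line is held in memory at a time, and the same code should work on Windows since it uses no external commands. Reading larger chunks (for example `n = 1000L` in `readLines`) would likely be faster, at the cost of handling several lines per iteration.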