A better way to import data from a .csv file

Post date: May 9, 2014 6:21:04 AM

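First, I read only a few hundred rows of the data, then work out the class of each column from that sample, and lastly import the full data set using those classes. Columns that should stay as text, such as ID fields, are forced to character before the full import.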

setwd('~/research/RCS_data/')

filename <- "out"
RData.input <- paste(filename, '.RData', sep = "")
csv.input <- paste(filename, '.csv', sep = "")

## ------- routine to import RCS MDS data ----------
if ( !file.exists( RData.input ) ) {
  cat(sprintf('%s does not exist, so importing it...', RData.input))

  # Read a small sample and record the class inferred for each column
  sampleData <- read.csv(csv.input, header = TRUE, nrows = 300)
  classes <- sapply(sampleData, class)
  write.table(x = classes, file = 'RCS_mds_variables.csv', sep = ',',
              append = FALSE, col.names = TRUE)

  # Force ID-like columns to character
  classes[names(classes) %in% c("seller_id", "status_final", "workflow_instance_id")] <- "character"

  # Import the full file using the column classes
  df <- read.delim(csv.input,
                   sep = ",",
                   stringsAsFactors = FALSE,
                   header = TRUE,
                   na.strings = c("", "---", "9999"),
                   colClasses = classes)
  save(df, file = RData.input)
} else {
  load(RData.input)
}

## ------ preprocess ------
df[1:5,]
table(df$isFraudulent, useNA = 'ifany')
table(df$isBlocked, useNA = 'ifany')
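To see why the override to "character" matters, here is a minimal sketch of the same three steps on a toy file. The file, its columns (seller_id, amount, n_items) and the sample size of 2 rows are made up for illustration and are not the RCS data. One caveat worth keeping in mind: classes inferred from a sample can be wrong for later rows (for example, a column that looks like integers in the first rows but contains decimals further down), in which case the full read stops with an error and that column's class has to be adjusted by hand.

```r
# Hypothetical toy data, written to a temporary file for this sketch only.
csv.path <- tempfile(fileext = ".csv")
toy <- data.frame(seller_id = c("00123", "00456", "00789", "01011"),
                  amount    = c(10.5, 20, 30.25, 40),
                  n_items   = c(1L, 2L, 3L, 4L),
                  stringsAsFactors = FALSE)
write.csv(toy, csv.path, row.names = FALSE, quote = FALSE)

# Step 1: read only the first few rows and let R guess the classes.
sampleData <- read.csv(csv.path, header = TRUE, nrows = 2)
classes <- sapply(sampleData, class)
# seller_id is guessed as integer here, which would drop the leading zeros.

# Step 2: override the guess for ID-like columns.
classes[names(classes) %in% c("seller_id")] <- "character"

# Step 3: read the whole file with the fixed classes.
full <- read.csv(csv.path, header = TRUE,
                 stringsAsFactors = FALSE,
                 colClasses = classes)

str(full)
unlink(csv.path)
```

With colClasses supplied, R does not have to guess the type of every column while reading, and seller_id keeps its leading zeros because it is never converted to a number.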