Storing Objects

A few questions on storing objects in a file:

- can we always do fout << obj and fin >> obj?
  - how about a working example and a non-working example?
    - will int or double object work?
    - will string object work?
  - does "complex" obj work?
    - what if we output complex numbers as 3+i, 3, 5i?
    - can we read it back?
    - what if there is an empty line?
    - what if there is an invalid character (e.g. 3+ai)?
- what can we do to make it work for our object?
- in many real-life situations, we need to read files produced by others (e.g. stock market data).
  - fin >> stock?

Ideally we want to be able to store objects directly in a file, so that reading and writing become simple exercises.

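void ReadComplexFileEOF()
{
    complexType ca[CA_MAX_SIZE];
    ifstream fin;
    fin.open(ComplexFileName);   // error checking omitted
    int i = 0;
    // Loop on the extraction itself: fin.eof() turns true only after a read
    // has already failed, so while (!fin.eof()) would store one bogus element.
    // The bound check keeps the fixed-size array from overflowing.
    while (i < CA_MAX_SIZE && fin >> ca[i]) {
        ++i;
    }
    fin.close();
}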

That is certainly possible if we carefully design the on-file format of our object and write the << and >> operators to match. For example, we can represent every complex object in the file as (real, imaginary). The << and >> routines are then straightforward, and reading and writing are symmetric. If you control the file format, this is the preferred design. One implementation is shown below.

ostream& operator<< (ostream& os, const complexType& complex)
{
    os << "(" << complex.realPart << ", " << complex.imaginaryPart << ")" << endl;
    return os;
}

istream& operator>> (istream& is, complexType& complex)
{
    char ch;
    is >> ch;                     // read and discard (
    is >> complex.realPart;       // get the real part
    is >> ch;                     // read and discard comma
    is >> complex.imaginaryPart;  // get the imaginary part
    is >> ch;                     // read and discard )
    return is;
}

Of course, the body of the routine can be written more compactly as:

    is >> ch >> complex.realPart >> ch >> complex.imaginaryPart >> ch;

In the real world, many data files do not conform to our carefully designed format, so import and export routines are commonly used to do the necessary conversions. A simple-minded skeleton of an import conversion could be:

// Assumes every line of the import file has the form a+bi.
void ImportComplexFile(string fname)
{
    ifstream fin;
    double real, im;
    char plusorminus, ichar;
    string oneline;

    fin.open(fname.c_str());   // error checking needed
    // Loop on getline itself so the loop stops cleanly at end of file;
    // while (!fin.eof()) can process one extra, empty line.
    while (getline(fin, oneline)) {
        stringstream(oneline) >> real >> plusorminus >> im >> ichar;
    }
    fin.close();
}

It works with simple, well-formed files in which every object has the form a+bi. It does not work with 3i or 2, and for 2-3i the minus sign is read into plusorminus but never applied to the imaginary part. Additional care is needed to handle all these cases. Error conditions such as 2+%i are not considered either.

We can handle data such as 3i, 2, or 2-3i by checking what was actually read, as in the following example.

void ImportComplexFile2(string fname)
{   // works for the forms a+bi, a-bi, a, and bi
    ifstream fin;
    fin.open(fname.c_str());   // error checking omitted
    double real, im;
    char plusorminus, ichar;
    complexType c;
    string oneline;

    // Loop on getline itself so the loop ends as soon as no line is left.
    while (getline(fin, oneline)) {
        real = 0; im = 0; plusorminus = '\0'; ichar = '\0';
        stringstream(oneline) >> real >> plusorminus >> im >> ichar;
        switch (plusorminus) {
        case '-':  im = -im; break;             // a-bi
        case 'i':  im = real; real = 0; break;  // bi: the number read was b
        case '\0': im = 0; break;               // a: nothing after the number
        }
        c.setComplex(real, im);
        cout << c << endl;
    }
    fin.close();
}

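This is an improvement, but it is still not good enough. It does no error checking, and it fails on i and on 2+i. Input of 2i or 1i is fine, but i is not. Why? For i, the chain >>real>>plusorminus>>im>>ichar fails at the very first extraction: i is not a number, so the stream enters a failed state and every later extraction is skipped. For 2+i, the chain reads 2 and +, then fails at >>im, so the imaginary part stays 0. For the same reason, 3-1i is read correctly but 3-i is not.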

A more robust solution is to use the "state machine" approach.

CSV (Comma-Separated Values) File Format

CSV is a popular file format which uses a comma (,) or other designated character as a field delimiter. Wikipedia has an extensive explanation.

To parse CSV files, I prefer to use the C++ string library, especially the member functions find, find_first_of, and substr.

The lessons from this exercise:

- If we can control the format of the file, storing objects is quite straightforward.
- Designing the stream insertion and extraction overloads symmetrically is generally a good idea.
- Exporting to another format is easier than importing from one. This is particularly true when it comes to detecting errors.
- High-level I/O (stream operations) is usually easier, but it has limitations.
- "stringstream" is a good friend: it turns a string into a stream so that we can use high-level I/O on it.
- Low-level I/O (reading a string one character at a time) is harder, but may be necessary.

Note:

- All error checks (e.g. file not found) are omitted for clarity. You do need to make sure your data files are in the project directory, though.
- A fixed-size array is used for simplicity.
