leveldb

    1. 1. Why use LevelDB

      1. LevelDB is a high quality B+-tree implementation. We want to use it as a B+tree index to a data file to speed up query processing.

      2. The main functionalities of LevelDB we want are:

        1. Storing records as key/value pairs where the key is the key attributes of the records.

        2. Fast search of record(s) by the key.

        3. Traveral of records in the order of increasing key value.

      3. For the less patient readers, please refer to this well written user's guide on LevelDB API from the authors of LevelDB.

    2. 2. First few steps of creates LevelDB database

      1. LevelDB uses a directory to store its data. You should have already experimented with the basics of running LevelDB applications in "Setup" practical learning unit of this course.

    1. 2.1. Compiling with LevelDB

          1. Download leveldb-x.x.x.tar.gz from Google Code repository, and uncompress it to some directory.

          2. Run make in .../leveldb-x.x.x/. This compiles the static leveldb library, namedlibleveldb.a

          3. You need to include the libleveldb.a and the pthread library when compiling C/C++ source code that uses leveldb API.

        1. The following command compiles the my_program.cc:

      1. export LEVELDB=~/leveldb-1.9.0 g++ -o my_program my_program.cc $LEVELDB/libleveldb.a -lpthread -I $LEVELDB/include

        1. where we assume that ~/level-1.9.0/ is the path where leveldb was stored.

    1. 2.2. Creating LevelDB database

        1. #include "leveldb/db.h"

        2. ...

        3. /**

        4. * opens a database connection to "./leveldb_dir"

        5. */

        6. leveldb::DB *db;

        7. leveldb::Options options;

        8. options.create_if_missing = true;

        9. leveldb::Status status = leveldb::DB::Open(options, "./leveldb_dir", &db);

    1. 3. Writing and retrieving key/value pairs

    2. 3.1. Key value pairs

        1. LevelDB is a B+tree implementation, so it stores key/value pairs. It assumes that all keys are unique.

        2. In the C++ namespace leveldb, we find that DB class has three important methods:

        • DB::Put(WriteOptions& options, const Slice& key, const Slice& value)

        • DB::Get(ReadOptions& options, const Slice& key, std::string *value)

        • DB::Delete(WriteOptions& options, const Slice& key)

        1. We will be using Put and Get to write key/value pairs to the B+ tree index, and get them back.

        2. The default ReadOptions and WriteOptions can be constructed by:

        3. leveldb::WriteOptions()

        4. leveldb::ReadOptions()

        5. Put(...) models both the key and value as leveldb::Slice. Internally, it's just a byte array, but equipped with many very useful conversion mehods.

        6. We can construct a slice (key or value) from a string literal:

      1. leveldb::Slice key = "CSC443/WINTER";

        1. We can also construct a slice from a C++ string:

      2. leveldb::Slice value = std::string("This is a course about database implementation.")

        1. We can even construct a slice from a pointer.

        1. Get assumes std::string

        2. It's important to note that when reading key/value pairs from the B+ tree, leveldb returns the value as a C++ string.

    1. 3.2. Dealing with unique keys

        1. LevelDB assumes that all keys must be unique. Sometimes, we may wish to index records on attributes which are not keys of the table. A common trick suggested by the author of LevelDB is to append a running counter to the key.

        2. long unique_counter = 0;

        3. char key[10];

        4. char unique_key[10 + sizeof(long)];

        5. while(...) {

        6. ...

        7. unique_counter ++;

        8. get_key(record, &key);

        9. memcpy(unique_key, key, 10);

        10. memcpy(unique_key + 10, &unique_counter, sizeof(long));

        11. db->Put(Slice(unique_key, sizeof(unique_key)), value);

        12. }

    1. 4. Iterating over a range

        1. LevelDB supports efficient iteration in ascending order of keys.

        2. leveldb::Iterator is a class that supports very useful navigation in the B+ tree.

        3. The following code constructs (and frees) a new iterator:

        4. leveldb::Iterator* it = db->NewIterator(leveldb::ReadOptions());

        5. ...

        6. delete it;

        7. The following code prints out all the key/value pairs in the B+ tree:

        8. leveldb::Iterator* it = db->NewIterator(leveldb::ReadOptions());

        9. for (it->SeekToFirst(); it->Valid(); it->Next()) {

        10. leveldb::Slice key = it->key();

        11. leveldb::Slice value = it->value();

        12. std::string key_str = key.ToString();

        13. std::string val_str = value.ToString();

        14. cout << key_str << ": " << val_str << endl;

        15. }

        16. assert(it->status().ok()); // Check for any errors found during the scan

        17. delete it;

        18. To iterate over a finite range we can do this:

        19. leveldb::Slice start = ...;

        20. leveldb::Slice end = ...;

        21. leveldb::Options options;

        22. leveldb::Iterator* it = db->NewIterator(leveldb::ReadOptions());

        23. for (it->Seek(start); it->Valid(); it->Next()) {

        24. leveldb::Slice key = it->key();

        25. leveldb::Slice value = it->value();

        26. if(options.comparator->compare(key, end) > 0) {

        27. // key > end

        28. break;

        29. } else {

        30. /**

        31. * process (key, value)

        32. */

        33. }

        34. }

        35. delete it;

        1. We assume that you are using g++ compiler.

      1. comment

      2. comment