The extended version of the storage server will support tables where the records can have multiple columns, each with possibly different data types.
Here's an example of a table to store information about the TTC subway lines. The key is the colour code of the line, and the record consists of columns for the name of the line, the number of stops in each line, and the length of the line in kilometres.
The only modification to the configuration file is that the table parameter must now specify not just the name of the table but the schema of each table. The schema of a table consists of the name of each column in the table, and the type of the column. The column name is a string of alphanumeric characters, and the type is one of the following:
A column name and its type are separated by a colon, while different column specifications are separated by commas, with optional whitespace before and after the comma. In addition to the columns (which are explicitly specified in the table schema), each table has an implicit key which is unique for each record in the table. The table keys are always alphanumeric strings and are used in various client library functions in order to access the data in the table. Here are some examples of the table parameter in the configuration file.
table subwayLines name:char,stops:int,kilometres:int
table cities lowTemperature:int,highTemperature:int,province:char
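The schema format described above (name and type separated by a colon, columns separated by commas with optional whitespace) can be parsed with a few standard string functions. The following is a minimal sketch, not a reference implementation: the names `struct column`, `parse_schema`, `MAX_COLS`, and `MAX_NAME_LEN` are hypothetical, and the type text is stored as written without being checked against the list of allowed types.

```c
#include <ctype.h>
#include <string.h>

#define MAX_COLS 10       /* hypothetical limit, not taken from the assignment */
#define MAX_NAME_LEN 32   /* hypothetical limit, not taken from the assignment */

struct column {
    char name[MAX_NAME_LEN];
    char type[MAX_NAME_LEN];  /* type text as written in the config file */
};

/* Parse "name:type,name:type,..." into cols.
 * Returns the number of columns, or -1 on a malformed specification. */
int parse_schema(const char *spec, struct column *cols, int max_cols)
{
    int n = 0;
    const char *p = spec;
    for (;;) {
        const char *comma = strchr(p, ',');
        const char *end = comma ? comma : p + strlen(p);

        /* Trim the optional whitespace around the comma-separated item. */
        while (p < end && isspace((unsigned char)*p)) p++;
        while (end > p && isspace((unsigned char)end[-1])) end--;

        const char *colon = memchr(p, ':', (size_t)(end - p));
        if (colon == NULL || n >= max_cols) return -1;

        size_t name_len = (size_t)(colon - p);
        size_t type_len = (size_t)(end - colon - 1);
        if (name_len == 0 || name_len >= MAX_NAME_LEN) return -1;
        if (type_len == 0 || type_len >= MAX_NAME_LEN) return -1;

        /* Column names must be alphanumeric. */
        for (size_t i = 0; i < name_len; i++)
            if (!isalnum((unsigned char)p[i])) return -1;

        memcpy(cols[n].name, p, name_len);
        cols[n].name[name_len] = '\0';
        memcpy(cols[n].type, colon + 1, type_len);
        cols[n].type[type_len] = '\0';
        n++;

        if (comma == NULL) break;
        p = comma + 1;
    }
    return n;
}
```

A caller would pass the text that follows the table name on a `table` line; a return value of -1 signals that the configuration file should be rejected.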
Note that the table specification only specifies the columns in the record. The key, which is always an alphanumeric string as in the previous milestones, is implicit. For example, in the
In case of duplicate table definitions (i.e., the same table is defined with different columns or column types), the server should exit with an appropriate error.
Here are some bad examples of the table parameter in the configuration file.
// two definitions of the same table with different columns
table subwayLines name:char, stops:int, kilometres:int
table subwayLines stops:int, kilometres:int

// two definitions of the same table with different column types (Note
// the numbers 30 and 40 for char)
table subwayLines name:char, stops:int, kilometres:int
table subwayLines name:char, stops:int, kilometres:int
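One simple way to catch the bad examples above is to reject any table name that appears more than once while the configuration file is being read. This is a sketch under an assumption: the text only requires an error when the repeated definitions differ, so treating every repeated name as an error is a stricter policy chosen here for simplicity. The function name `find_duplicate_table` is hypothetical.

```c
#include <string.h>

/* Returns the index of the first table name that repeats an earlier one,
 * or -1 if all names are distinct. The comparison is case-sensitive. */
int find_duplicate_table(const char *const names[], int count)
{
    for (int i = 1; i < count; i++)
        for (int j = 0; j < i; j++)
            if (strcmp(names[i], names[j]) == 0)
                return i;
    return -1;
}
```

If the function returns a non-negative index, the server can print an error naming that table and exit.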
The client library is also different in this version of the storage server in order to support the more complex table schema.
There is one new
int storage_query(const char *table, const char *predicates, char **keys, const int max_keys, void *conn);
The table below summarizes the behaviour of the key functions. Note that the parameters passed to our
The record structure is not changed from the previous assignment, but the
name Bloor Danforth , stops 31 , kilometres 26
lowTemperature -7 , highTemperature 28 , province Ontario
brand BMW , price 43210
id 991234567 , grade 87
Note again that only the columns in the record are included in the
// Columns out of order. (For the subwayLines table.)
name Bloor Danforth , kilometres 26 , stops 31

// highTemperature should be an int. (For the cities table.)
lowTemperature -7 , highTemperature twentyeight, province Ontario

// brand column is missing. (For the cars table.)
price 43210

// Column name is misspelled (note the case-sensitivity). (For the students table.)
id 991234567 , Grade 87
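The errors illustrated above (wrong column order, a non-integer value in an integer column, a missing column, a misspelled column name) can all be detected by walking the value string and the schema in lockstep. The sketch below is one possible approach, not the required one. The names `col_def`, `col_type`, and `record_matches_schema` are hypothetical, and it assumes that string values never contain a comma.

```c
#include <ctype.h>
#include <string.h>

enum col_type { COL_INT, COL_STRING };   /* hypothetical type tags */

struct col_def {
    const char *name;
    enum col_type type;
};

/* Return 1 if s[0..len) is an optionally signed decimal integer. */
static int is_int(const char *s, size_t len)
{
    size_t i = 0;
    if (len > 0 && (s[0] == '-' || s[0] == '+')) i = 1;
    if (i == len) return 0;
    for (; i < len; i++)
        if (!isdigit((unsigned char)s[i])) return 0;
    return 1;
}

/* Check that value lists every schema column, in schema order, as
 * "colname value" pairs separated by commas. Returns 1 if valid, else 0. */
int record_matches_schema(const char *value, const struct col_def *cols, int ncols)
{
    const char *p = value;
    for (int i = 0; i < ncols; i++) {
        const char *comma = strchr(p, ',');
        const char *end = comma ? comma : p + strlen(p);

        /* Only the last column may (and must) lack a trailing comma. */
        if ((i == ncols - 1) != (comma == NULL)) return 0;

        while (p < end && isspace((unsigned char)*p)) p++;
        while (end > p && isspace((unsigned char)end[-1])) end--;

        /* The pair must start with the expected name (case-sensitive),
         * followed by whitespace and a non-empty value. */
        size_t name_len = strlen(cols[i].name);
        if ((size_t)(end - p) <= name_len) return 0;
        if (strncmp(p, cols[i].name, name_len) != 0) return 0;
        if (!isspace((unsigned char)p[name_len])) return 0;

        const char *v = p + name_len;
        while (v < end && isspace((unsigned char)*v)) v++;
        if (v == end) return 0;

        if (cols[i].type == COL_INT && !is_int(v, (size_t)(end - v))) return 0;

        if (comma) p = comma + 1;
    }
    return 1;
}
```

Note that `is_int` checks only the syntax of the number, not whether it fits in the integer type the server uses to store it.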
// Find subway lines with more than 10 stops. (For the subwayLines table.)
stops > 10

// Find cities in Ontario that don't get colder than -10 Celsius. (For the cities table.)
province = Ontario , lowTemperature > -10

// Find all Toyota cars that cost less than $20000. (For the cars table.)
brand = Toyota , price < 20000

// Find students who are failing. (For the students table.)
grade < 50
For a record to match a set of predicates, the record must match every predicate in the set.
// The stops column value must be an integer. (For the subwayLines table.)
stops > 10.0 , kilometres < 40

// Cannot use ">" operator for a string column type. (For the cities table.)
province > Ontario , lowTemperature > -10

// Note the missing comma and "=" operator. (For the cars table.)
brand Toyota price < 20000

// There is no Grade column name. (For the students table.)
Grade < 50
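To make the predicate rules concrete, here is a minimal sketch that parses a single predicate and evaluates it against a column value. It is an illustration under assumptions: the operator set `<`, `>`, `=` is taken from the examples above, the buffer sizes are arbitrary, and the names `struct predicate`, `parse_predicate`, `match_int`, and `match_string` are hypothetical. A full query would first split the predicate string on commas and require every predicate to match.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct predicate {
    char column[32];   /* arbitrary size chosen for this sketch */
    char op;           /* '<', '>' or '=' */
    char value[64];    /* arbitrary size chosen for this sketch */
};

/* Copy [start, end) into dst with surrounding whitespace removed.
 * Returns 0 on success, -1 if the result is empty or does not fit. */
static int copy_trimmed(char *dst, size_t cap, const char *start, const char *end)
{
    while (start < end && isspace((unsigned char)*start)) start++;
    while (end > start && isspace((unsigned char)end[-1])) end--;
    size_t len = (size_t)(end - start);
    if (len == 0 || len >= cap) return -1;
    memcpy(dst, start, len);
    dst[len] = '\0';
    return 0;
}

/* Parse one predicate of the form "column op value". Returns 0 on success. */
int parse_predicate(const char *text, struct predicate *out)
{
    const char *op = strpbrk(text, "<>=");
    if (op == NULL) return -1;
    if (copy_trimmed(out->column, sizeof out->column, text, op) != 0) return -1;
    if (copy_trimmed(out->value, sizeof out->value, op + 1, op + 1 + strlen(op + 1)) != 0)
        return -1;
    /* A column name with spaces usually means a missing comma or operator. */
    for (const char *c = out->column; *c; c++)
        if (!isalnum((unsigned char)*c)) return -1;
    out->op = *op;
    return 0;
}

/* Evaluate against an integer column. Returns 1 on match, 0 on no match,
 * -1 if the predicate value is not an integer (for example "10.0"). */
int match_int(const struct predicate *pr, long actual)
{
    char *rest;
    long wanted = strtol(pr->value, &rest, 10);
    if (rest == pr->value || *rest != '\0') return -1;
    switch (pr->op) {
    case '<': return actual < wanted;
    case '>': return actual > wanted;
    default:  return actual == wanted;
    }
}

/* Evaluate against a string column, which supports only '='.
 * Returns 1 on match, 0 on no match, -1 for an unsupported operator. */
int match_string(const struct predicate *pr, const char *actual)
{
    if (pr->op != '=') return -1;
    return strcmp(pr->value, actual) == 0;
}
```

Checking that the column named in the predicate actually exists in the table (the `Grade` case above) is left to the caller, which knows the schema.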
You need to modify the server to be able to process the configuration file according to the specification described above, as well as communicate with the client library, including the new
You may make the following assumptions:
The constants above are defined in a
The parsing in this assignment is more challenging. Both the configuration file parser and the client/server protocol must handle a more complex table schema. While you may write your own parsing algorithms, it is sometimes easier to use specialized tools to simplify this part of the code.
If you choose to do the parsing yourself, you may find functions such as
On the other hand, you may use the Flex and Bison tools to help in this task. If you do use these tools, you can earn bonus marks as outlined here. Consult the relevant lecture slides and resources in the Course Reader for help with these tools.
You must use the Check unit test framework to test the functionality of your code. You need to write at least three test suites, each in a separate directory as outlined below.
Each test application may use any of the
In all the tests you should consider testing corner cases, such as trying to get a non-existent key from a table, doing a query with column names not defined in a table, or storing a record with columns listed in the wrong order. You are free to define your own table schemas, but it would be a good idea to construct tables with columns of all supported types (strings and integers).
In each test directory, it should be possible to run the test by typing
Each test application should perform all initialization required, such as starting the server. The user should just be able to type
Your deliverable should contain at least 5 added tests.
Some of the marking tests are available at
> cd ~/ece297/storage/test
> tar zxf /cad2/ece297s/public/assignment3/a3-partial.tgz
> cd a3-partial
> make clean run
As in the previous assignment, here are some of the major tasks involved in this assignment.
Again, it is up to you whether you want to divide the tasks as above.
In this section we summarize key design decisions you will have to make when designing your storage server.
These design decisions are:
You should justify each design decision and discuss its pros and cons, such as ease of implementation, robustness to failures, and performance implications. Note that many decisions require balancing trade-offs. Try to understand what they are.
Please note that there are two deadlines for this assignment. You first hand in your design document, and later the code you developed.
By the design document deadline, you must hand in the M3 Design Document: Extended Storage Server that includes the following content:
The maximum length for the main body of your document is eleven (11) pages. This page count does not include Executive Summary, Table of Contents, List of References, and Appendices. For this document, you may include up to five (5) pages of appendices. Please refer to the grading rubric for further information.
By the software development deadline, you must hand in your software artifacts, which must include at least the following content.