On "A History and Evaluation of System R"

Post date: Nov 23, 2012 12:21:08 PM

The second paper introduces a database system, called system R, which follows the relational data model, a modeling tool for database systems where relation between entities (things that can be uniquely identified) is featured. The paper presents techniques that can be used to optimize file access while also considering data independence, the concept of abstracting applications from how files are being accessed.


The development of system R is divided into three phases. The initial phase is designed as a testing prototype designed for single user-access; the second phase is the improvement of the initial prototype which allows multi-user access; the third phase is the evaluation phase where performance of the resulting improved version is assessed. The initial phase serves as a prototype where only single-user access is allowed. The relational access method used, called XRM, doesn't have mechanisms for locking or recovery. The system enables user to (a) query and update the database and (b) dynamically create database relations. In the initial implementation, mechanism for implementing the join construct has not yet been accomplished. From the initial phase, the authors were able to determine a more appropriate measure of cost for a query. They were also able to realize the importance of implementing join operation. They concluded that the initial implementation is designed to handle complex queries, but there is a need to put more priority on handling of simple commands.

The next phase of the development is devoted to building a multi-user prototype. The resulting work is designed so that there is a separate subsystem for access methods, called Research Storage System (RSS) and another subsystem for optimizing SQL commands before processing (the system is called Relational Data System (RDS)); RDS runs on top of RSS. Modularizing access methods and SQL optimization allows for data independence since RSS no longer needs to be concerned with locking and logging functions, while RDS is abstracted from how files are being accessed. For the multi-user prototype, RDS uses locks and views to handle security and authorization issues. On the other hand, RSS uses a data structure called a B tree, whose strength lies in accessing large amount of data in the database. Moreover, log mechanism is used to provide recovery capabilities. In logging, any transaction performed in the database is being logged before the transaction is committed.

One of the highlight of this work is the realization that SQL commands can be executed from assembling machine-language fragments. They were able to provide a library of these machine-language fragments so that compilation of SQL commands can be optimized. Since relational database models is the commonly used model for database systems, most of the techniques (use of B trees, logs, views etc.) that were proposed in this paper remains adapted to current database systems. However, the design consideration for the work is the construction of a flexible system that can cater to many applications and still performs well (on average). For example, the use of B-trees lies in the assumption that most of the time, a single command requires access to a large amount of data. For applications where such is not the case, use of such feature may result to poor performance.