Assignments

Assignment 4

posted ‎‎Nov 13, 2008 2:41 PM‎‎ by Michael Armbrust   [ updated ‎‎Dec 4, 2008 4:59 PM‎‎ ]

Overview

In this assignment, you will be creating a simple web application called MyTunes using Ruby on Rails with SQLlite as your backend database. Your website will store information about music, like songs and artists, similar to the example from discussion section earlier in the semester.  You will be practicing the database design process by designing a relational schema to represent the data we give you, defining necessary DDL statements, and writing interesting SQL queries against your database.  The technologies used in this assignment are described in more detail below.

We give you a data file in XML that describes a collection of songs, and you will be responsible for designing a relational schema to represent this data, using Ruby to parse the XML file and populate your database.  After you have populated your database, you will write DML statements to answer the queries we specify. 

Part of this assignment is presenting a front-end view to the results of your queries. Ruby on Rails uses the Model-View-Controller paradigm. Your queries will be specified in a Controller and Views will display records matching each query in your web browser. While you won't be graded on aesthetics, feel free to be as creative as you'd like with the appearance of the Views. See the MVC description below.

Administrivia

  • This assignment may be done in partners
  • It will be due December 8, 2008 at 11:59pm

Key Technologies

In this section we describe some of the tools you will be using to complete this assignment.

Ruby

Ruby is a dynamic, open-source, scripting language.  While learning ruby is obviously not a major focus of this class you will need a cursory understanding in order to complete this assignment. We recommend looking at some of the following resources before you get started. Since this is a new language for some people, please don't hesitate to ask the TAs or newgroup questions about ruby basics, as we don't want you to waste time fighting with the language.

XML

The Extensible Markup Language (XML) is a general-purpose specification for describing hierarchical data.  If you know HTML it will probably look very familiar to you.  While the full specification and validation of XML documents can be fairly complex, this should not effect you when completing this assignment.  Below is an example of the XML schema we will be giving you:

 <data>
    <song title="SexyBack (radio version)" length="208.927" sha1="13a4fd349b19aa1a1007c8c1f48a2b53">
        <artist>Justin Timberlake</artist>
        <artist>Timbaland</artist>
        <album>FutureSex/LoveSounds</album>
        <album>SexyBack (single)</album>
    </song>     <user username='michael' password='test'>
<rating song='13a4fd349b19aa1a1007c8c1f48a2b53' score='10'/>
    </user>
</data>

Each element is represented between <>.  Elements can contain child elements (such as artist with in song), attributes (such as title within song), or just plain text (such as Justin Timberlake inside of <artist></artist>.  Some elements have a <start> and </end> tag, and some (like rating) are empty-element tags.  We will be using a Ruby library, REXML, to parse the xml and convert it to a relational schema of your design.  Example code is provided in the project skeleton.

In ruby we have written most of the code for you to parse the XML.  You can access attributes (song.attributes['sha1']) and the text of the entries (artist.text.strip) from the objects passed to the lambda functions you will need to write.

RubyOnRails

Overview

Ruby On Rails is an open-source web development framework written in Ruby.  Its goal is to make it easy to rapidly design, implement and deploy database backed, interactive websites.  We will be providing you with a partially implemented site so that you can focus on schema/query desgin and not on learning the ins and outs of Ruby/Rails.  However, if you are interested in learning more there are a number of great quick start tutorials and screencasts that might help you understand some of the black magic that is going on behind the scenes.

Get Michael's live coding example from class

Understanding MVC

Rails is based around the Model-View-Controller design pattern. 

<wikipedia>
It is common to split an application into separate layers that run on different computers: presentation (UI), domain logic, and data access. In MVC the presentation layer is further separated into view and controller.  MVC is often seen in web applications, where the view is the actual HTML page, and the controller is the code that gathers dynamic data and generates the content within the HTML. Finally, the model is represented by the actual content, usually stored in a database, and the business rules that transform that content based on user actions.

Though MVC comes in different flavors, control flow generally works as follows:

  1. The user interacts with the user interface in some way (e.g. presses a button).
  2. A controller handles the input event from the user interface, often via a registered handler or callback.
  3. The controller notifies the model of the user action, possibly resulting in a change in the model's state. (e.g. controller updates user's Shopping cart).[4]
  4. A view uses the model (indirectly) to generate an appropriate user interface (e.g. the view produces a screen listing the shopping cart contents). The view gets its own data from the model. The model has no direct knowledge of the view.
  5. The user interface waits for further user interactions, which begins then a new cycle.

By decoupling models and views, MVC helps to reduce the complexity in architectural design, and to increase flexibility and reuse.

</wikipedia>

You will see this functionality separated into directories in app/ in the skeleton provided. 

There are three controllers:

  • application.rb - This is the parent controller for all other controllers in your app.  You should not need to put any code here.
  • data_controller.rb - This controller has only a single method reload that takes in an XML document as a parameter.  You can access it by pointing your browser to http://<server>:<port>/data/reload.  You will need to write the code that deletes all data from your database, parses the XML, and inserts it into the relational schema you have designed.
  • query_controller.rb - This controller has a number of functions that run interesting queries users might want to run over the their myTunes database.  Comments are provided describing the specs for each one.  You can test them by visiting http://<server>:<port>/query/.

ActiveRecord

ActiveRecord is an Object-Relational Mapping library used by RubyOnRails.  It's fairly full featured and makes a number of assumptions about the layout of your schema.  If you are familiar with this library feel free to use any basic (no plug-ins) features you. If not, don't worry, as we have provided a list of these conventions and overrides, along with few simple example below about using ActiveRecord to run arbitrarily complex database queries over the Boats/Sailors example from lecture.

Conventions:
  • Tables are the plural of the class name for any active record instance.  (I.e. class Sailor < ActiveRecord:Base would refer to data in the Sailors table)
  • The primary key for any entity is id.
    • You can override this by putting set_primary_key :key or set_primary_keys [:key1, :key2] at the top of the class definition
  • Foreign keys are always named <foreign table name>_id
    • You can override this by specifiying :foreign_key => [:song] after the belongs_to or has_many
Example:

Sailors(sid, sname, rating, age)
Boats(bid, bname, color)
Reserves(sid, bid, day)

class Sailor < ActiveRecord:Base
    set_primary_key :sid
    has_many :reservations, :foreign_key => :sid
end


class Boat < ActiveRecord:Base
    set_primary_key :bid
    has_many :reservations, :foreign_key => :bid
end


class Reservation < ActiveRecord:Base
    set_primary_keys :sid, :bid, :day
    belongs_to :sailor, :foreign_key => :sid
    belongs_to :boat, :foreign_key => :bid
end


Make a new sailor:
s = Sailor.new
s[:sid] = 1
s[:sname] = "Michael"
s[:rating] = 10
s[:age] = 24
s.save


Find a sailor by name:
s = Sailor.find_by_sname("Michael")

Find all red boats:
b = Boat.find_all_by_color("red")

Find all reservations by a sailor:
s = Sailor.find_by_sname("Michael")
r = s.reservations

These are the basics, but sometimes you want to be able to write more complex queries.  For this you can call back on full SQL.  If you wanted all sailors who have reserved a boat today you could write.
s = Sailor.find_by_sql("SELECT DISTINCT sailor.* FROM sailors JOIN reservations ON sailors.sid = reservations.sid WHERE reservation.day = <today>")

You can even add computed collumns.  For example to find all sailors along with their reservation counts you could write:
s = Sailor.find_by_sql("SELECT sailor.*, count(*) as reservation_count JOIN reservations ON sailors.sid = reservations.sid GROUP BY sailors.sid")
s[0] #First sailor in the list
s[0][:sname] #name of the first sailor in the list
s[0][:reservation_count] #Reservation count for the first sailor in the list

Sqlite3

SQLite is a software library that implements a self-contained, serverless, zero-configuration, transactional SQL database engine.  Unlike postgres which uses a client server model, sqlite is an in-process database.  This simplifies deployment and removes many of the problems people had in previous labs getting postgres to run.

Creating a new database is as easy as typing: sqlite3 <db file>.  This will give you a SQL prompt where you can CREATE TABLE, SELECT, etc.  To exit the interpreter you simply type '.quit'.  For this project you will be working on a database called db/development.sqlite3

Assignment

Get the Skeleton

In your home directory on the instructional machines run the following command:
tar xvf ~cs186/fa08/hw4/skel.tar

You will see a number of directories in the main mytunes directory. Those of interest are:
  • app/
    This is the only directory you should need to make changes in. It contains source files  (*.rb and *.html.erb) for the Models, Views, and Controllers.
    Each controller file is in the controllers directory.
    There is a directory within views for each controller's view files (*.html.erb); each function in each controller has its own view, unless it directs its output to a different function's view by calling render :action => 'other_func_name'.
    Your models will go in the models subdirectory; you will create a model for each of the database relations in your schema.
    You don't need to use the helpers directory in controllers, or the layouts directory in views.
  • config/
    This directory contains the database connection configuration, the controllers' URL mappings, and other environment parameters.
  • db/
    Your database state is stored in this directory.
  • log/
    Log information is stored here. You'll be interested in development.log.

Design Your Schema

You will need to design a relational schema to hold the data.  Feel free to design it as you see fit.  Put the DDL (i.e. CREATE TABLE....) into a file called ./schema.sql in the root directory of the app.  DO NOT USE MIGRATIONS.  We will be executing this from a script to create your database.

To setup the database run:
rm -f db/development.sqlite3; sqlite3 db/development.sqlite3 < schema.sql

You can then play around with it in a console by running:
sqlite3 db/development.sqlite3

Find Functional Dependencies

Determine the set of functional dependencies present in your relational schema. Make any schema changes necessary. Save your functional dependencies in a text file called fd.txt

You do not need to list trivial functional dependencies (X -> X), but you do need to prove your schema is in BCNF by listing other dependencies that do exist and stating why they don't violate any of the BCNF conditions..

Start the Server

You can start the server by running:
script/server -p <port>

Pick any high numbered port (1024+).  Your rails app will then be accessible from http://<server>:<port>.  You can stop it with ctrl-c.  You don't need to restart the server everytime you make code changes, rails will auto-reload the files. Helpful debugging information can be found in log/development.log

Import the Data

We have provided sample data for you in the form of ./data.xml in the root of your app.  You can cut and paste this into the form at http://<server>:<port>/data/reload, after implementing the reload function. Make sure you copy the entire data file.

When importing the data you may find that there is extra whitespace surrounding the artist and album names.  The ruby strip command can be used to clean this up. (i.e.  to get the artist name type "artist.text.strip")

Write the Queries

Note: the comments in query_controller.rb for by_artist and by_album should read Fill the songs array with all songs that are by the artist (or album) with name '#{params[:key]}'

Each query should be specified in the query_controller.rb using the given function name (called an action).
  1. songs
    Show the names of all songs with their artists' names and album names
  2. artists
    Show the names of all artists
  3. albums
    Show the name of each album and its songs
  4. by_artist
    Given an artist name, show all songs by that artist
  5. by_album
    Given an album name, show all songs in that album
  6. add_rating
    Add a rating using the given user name, song, and score
  7. best_song
    Show the highest rated song (across all users)
  8. favorite_artist
    Show the highest rated artist for the given user name
  9. artist_ratings
    Show each artist and its average rating.  Artists without song ratings should still be displayed.
  10. two_artists_songs
    Given the names of two artists, show songs they have worked on together.
  11. distinct_collab
    List artists sorted by who has the largest set of distinct collaborations. The largest set is the number of other artists, not the number of collaborations.
You are free to modify the views HTML if you want to make the site look more awesome.  However, it is important you keep the classes of the tags that surround each part of the answer the same, as this is how we will be grading it.

Optimize Your Site

Too Many Queries

Try looking at the development log as you use different pages on the website by running `tail -f log/development.log` in a separate terminal window.  For each hit to the site you should see a block of text containing the following information:
  • The names of the controller and action that ran
  • The originating ip address and timestamp for the request
  • Debugging information like the session id of the user and parameters that were passed
  • A list of SQL queries executed by the request
  • Performance information
Depending on how careful you were, it it quite possible that some of your pages are executing many (10+) queries per request.  This can significantly impact the performance of your website.  Fix any action that does more than 5 queries.  We suggest looking at eager loading in the RoR API documentation.

Indexes

Similarly, some of you actions may be doing scans over large sections of the database to return an answer.  Thus you should modify your schema to add indicies to speed up the queries you run.  If you add the indexes dynamically at the sqlite console, make sure you also update your schema.sql file so that the indexes will be created when we grade the assignment. Briefly discuss your rationale for adding indexes in the fd.txt file you submit.  One useful tool you can use to find out how queries execute is EXPLAIN from inside of a sqlite3 console session.  Run this on specific queries that you have harvested from the development log.  Since EXPLAIN is not part of any standard, the output will be much different than what you saw in Assignment 3.  An example is shown below:

sqlite> explain select * from songs WHERE SHA1 = "793d5c8018384ff24983d55b6ca3ad38";;
addr opcode p1 p2 p3
---- -------------- ---------- ---------- ---------------------------------
0 Goto 0 23
1 Integer 0 0
2 OpenRead 0 2
3 SetNumColumns 0 3
4 Integer 0 0
5 OpenRead 1 3 keyinfo(1,BINARY)
6 String8 0 0 793d5c8018384ff24983d55b6ca3ad38
7 IsNull -1 20
8 MakeRecord 1 0 a
9 MemStore 0 0
10 MoveGe 1 20
11 MemLoad 0 0
12 IdxGE 1 20 +
13 IdxRowid 1 0

Each row represents an operation that is run on the sqlite virtual machine and a reference of what each of them does can be found here.  While it is not necessary to fully understand all of the instructions for the virtual machine, understanding some important operations (like IdxGE) will help you understand when indexes are being utilized vs table scans.  You should create any indicies that are necessary to efficiently execute the queries in the assignement.

What to turn in

Create a tar file of the things to turn in by running `tar cvf hw4.tar app/ schema.sql`.  Then run the submit command to turn in the following:
  1. hw4.tar - A tar file containing the app/ directory and schema.sql file for your rails app
  2. fd.txt - A description of your schema in BCNF along wiht a list of functional dependencies, as well as rationale for added indexes
  3. MY.PARTNERS - Listing the other person that you worked with

How to Submit

  1. Save your files in a directory called "hw4" within your cs186 home directory.
  2. cd into hw4
  3. Run: submit hw4

Tips

Running example

A running example of the assignment is accessible here:
 
http://sphere.cs.berkeley.edu:3000/query

Working Servers

We suggest you spread yourselves out over as many of the instructional machines as possible.  While in theory any of the machines of type "Linuxi686" or "SunOSi86pc" should work, some of them are overly busy or seem to have weird firewalls.  Below is a list of machines known to work (at least at one point).  Please post any additions or corrections to the newsgroup as you encounter them.  Note: since ruby is interpreted instead of compiled you do not need to 'make clean' and remake in order to switch machines.
  • sphere.cs.berkeley.edu
  • rhombus.cs.berkeley.edu
  • cube.cs.berkeley.edu

Debugging

  • You can run ruby code inside the rails enviroment from the command line using:
    script/console
  • Note your server's log in log/development.log

Port troubleshooting

If you get an error like TCPServer Error: Address already in use, you may be trying to start your server on someone else's port or you may already have a server running on that port. Check if you have a server running by ps -u user_name (works on SunOS machines). If you can't kill the ruby server using ctrl-c, kill it using its process id: kill -9 pid.

Assignment 3: PostgreSQL Query Optimizer

posted ‎‎Oct 27, 2008 10:41 AM‎‎ by akshay krishnamurthy   [ updated ‎‎Nov 7, 2008 12:36 PM‎‎ ]

In this homework you will carry out a number of exercises involving the optimization of relational queries using the PostgreSQL optimizer and the visualization command EXPLAIN. You will not have to write any code. This assignment has to be done individually.

There are four parts to this assignment:

  • Reading some PostgreSQL documentation
  • Setting up the database, loading the tables, creating indexes
  • Answering questions by observing the query plans generated by PostgreSQL
  • Writing and running some queries of your own.

General Instructions

  • Please spend some time to read this assignment completely before beginning.
  • You can get all the files for this assignment at ~cs186/Hw3.
  • We will be using PostgreSQL version 8.2.4, which is now set up to be the default for cs186 accounts. The instructions provided here are designed to work with the EECS instructional machines and may not work as smoothly elsewhere. The TAs and Professors will only support EECS instructional machines.
  • By default, you should be able to access the 8.2.4 versions of pg_ctl, psql, etc. (test this out by using the commands "psql --version", "pg_ctl --version"). If your path is pointing to some other version of these binaries, you may have to explicitly add to your PATH environment variable, the directory ~cs186/pgsql/bin, which has the 8.2.4 binaries. 
  • Homework 2 used PostgreSQL version 8.3.4, so make sure that you are not using that version with this assignment. Using the binaries at ~cs186/pgsql/bin should make everything work.
  • To make sure that your solutions are correct, please use one of the Solaris 10 X86 servers to do this assignment. These servers are named "rhombus", "pentagon", "cube", "sphere", "po", and "torus" followed by a ".cs.berkeley.edu".
  • Before moving from one question to the next, you should remove any indexes that you created specifically for the previous question. Also, enable any access/join methods that you had disabled for the previous question.
  • The deadline for this assignment is 5PM on Monday November 10, 2006 in the CS186 drop box in 283 Soda Hall.  
  • If you know how to use PostgreSQL, this assignment should be doable in a single sitting.

Documentation and PostgreSQL Tips

The PostgreSQL 8.0 Documentation is on-line (and searchable) at:  http://www.postgresql.org/docs/8.2/static/index.html


 

You will find the following parts particularly useful for this assignment:

  • PostgreSQL Performance Tips  http://www.postgresql.org/docs/8.2/static/performance-tips.html

  • The CREATE INDEX Command: http://www.postgresql.org/docs/8.2/static/sql-createindex.html

    • the command for creating indexes;  you will also need “DROP INDEX”
  • The SET Command: http://www.postgresql.org/docs/8.2/static/sql-set.html

    • this command lets you turn on and off operators that the query optimizer can consider.  For example, the SQL command “SET enable_hashjoin TO off;” tells the query optimizer not to use Hash Joins.
  • Runtime Configuration: http://www.postgresql.org/docs/8.2/static/runtime-config.html

    • see section 17.6.1 “planner method configuration” for  a list of the operators you can disable and re-enable for the optimizer using the SET command.
  • PostgreSQL System Catalogs: http://www.postgresql.org/docs/8.2/static/catalogs.html

    • note that PostgreSQL keeps its catalog information in tables.   The tables/views that are likely to be most useful in this assignment are “pg_stats”, “pg_class”, “pg_indexes and  pg_attribute”.   You can query these in psql just like regular tables.  For example, the query “select * from pg_indexes where tablename = 'rankings';” gives you information about the indexes defined on the rankings table.  You can see the statistics the optimizer has for the table ‘rankings’ using the query: “select * from pg_stats where tablename = 'rankings'” (note the single quotes around the tablename, which is a string attributed  in the pg_stats view).  The documentation for pg_stats (linked from the above page) explains what the numbers mean.  You can get information on tables by querying the “pg_class” table and/or using the “/d” command in psql. 

Setting up the Database
  • Create a database directory with initdb .  After that, start the postmaster with pg_ctl  (initdb tells you the exact command) Then, create a database with createdb.
$ ~cs186/pgsql/bin/initdb ~/pgdata
$ ~cs186/pgsql/bin/pg_ctl -D ~/pgdata -l hw3.log start
$ ~cs186/pgsql/bin/createdb colleges 

  • Create tables and import data into the database. We are creating three tables: 'rankings' and 'financials' and 'students' 
 $ ~cs186/pgsql/bin/psql colleges -f ~cs186/Hw3/hw3.sql
  • You MUST update the PostgreSQL statistics after adding or deleting indexes. The command for updating the statistics (in a psql session) is "VACUUM ANALYZE;". You MUST update the statistics once before starting the exercises. This step is crucial to getting correct answers to the questions!
  • When you start psql make sure you turn off the bitmap scan acces methods by typing "SET enable_bitmapscan TO off;" Do not re-enable this access method throughout the assignment.
After completing the above steps you should be able to analyze query plans produced by the optimizer for a given query (using EXPLAIN). You should also be able to modify the runtime variables for enabling/disabling join and access methods (using SET).

Questions –Hand in from here below…

CS 186 Fall 2008 Homework 3 Answer Sheet (don’t forget to staple it together!).
NAME: ____________________________              STUDENT ID:____ ______________   
IMPORTANT: Circle the last two letters of your class account:
 cs186 a b c d e f g h i j k l m n o p q r s t u v w x y z
       a b c d e f g h i j k l m n o p q r s t u v w x y z

 

1.      SOME EASY STUFF

Start psql and make sure you are running version 8.2.4 (it tells you when it starts up).  Then, using queries in psql, examine the set of tables and access methods in the database you just created. In particular examine the relation Rankings. Answer the following questions:

A. What are the attributes of the Students relation?




B. How many indexes are built on the Rankings relation? Name them and write down their type.




C. How many tuples are there in the Rankings relation?




D. How many distinct values of  “gradrate” does the query planner estimate there are for the Rankings relation?  (hint: Query pg_stats to find out)




E. How many distinct values of “gradrate” are there actually in the rankings table? (hint: it’s probably easiest to run a query to compute this!)





F. What query did you use to find the answer to E? 

 



 


2.      USING THE QUERY PLAN VIEWER

This question requires you to use the PostgreSQL query plan visualization command EXPLAIN. Read the documentation for EXPLAIN at the link given above.   Note that (like System R) EXPLAIN estimates query costs in units of disk I/Os (CPU instructions are added in by multiplying times a conversion factor).
Consider the following query:

SELECT * FROM rankings WHERE state = 'CA'; 

Answer the following questions looking at the query plan generated by the EXPLAIN command:


A. Briefly describe the plan chosen.   (e.g., what kind of scan is used?).




    

B. In what order would the tuples be returned by this plan? Why?



    

    

C. What is the estimated total cost of running the plan?

    



D. What is the estimated result cardinality for this plan? The estimated result cardinality is the number of colleges that the optimizer estimates to be in California.   Looking at the statistics, why does the optimizer come up with this estimate?


    



    

E. How many colleges actually do have “state = 'CA'”?  (hint: it’s probably easiest to run a query to compute this!)

    




F. Looking at the statistics, what are the top 10 states with the most colleges and the percentage of colleges in each of those state? (hint: you can select these by querying on the pg_stats table) 

    




G. Which value of  “state” is actually the most popular, and how many tuples have that “state”? How did you figure this out (what query did you use) ?

 



3.      SELECTS WITH INDEXES

Consider the same query from previous question: 

SELECT * FROM rankings WHERE state = 'CA'; 

Answer the following questions looking at the plans and the access methods:


A. Create a btree index on the attribute state of the relation Rankings. What is the plan chosen for the query now?   (e.g., what kind of scan is used and what is scanned?).

 



B. What is the estimated total cost of running the plan?



    

    

C. Compare this plan with the plan obtained in question 2.A above.  Which is cheaper and why?

 

 

 

4.      RANGE SELECTS

DROP the index that you created for Question 3. Don't forget to VACUUM ANALYZE

Now analyze the query plan that PostgreSQL comes up for the following query: 


SELECT * FROM rankings WHERE gradrate < 10;

Answer the following questions:


A. How many ranking tuples that have gradrate < 10 does the optimizer think there are?

 


B. How does the optimizer arrive at this estimate of the number of tuples? That is, what calculations does it perform, and where does the supporting data come from?

 

 

    

C. In what order will the tuples be returned by this plan?

 


    

D. What is a value of the constant (i.e. '10' in the above query) such that the optimizer chooses a different plan? What is that plan and why does the optimizer think it will be cheaper than the previous plan when used with this new constant?




 

     

    


E. Explain why one of the access methods is costlier than the other.

 




5.      SIMPLE JOIN

RE-ENABLE (using “SET”) the access method you turned off above.

Create a B-tree index on 'studentfacultyratio' of the rankings table.

Analyze the query plan for the following query that finds the average salary at schools who have a student to faculty ratio < 3.



SELECT R.name, F.avesalary 
FROM rankings R, financials F
WHERE R.id = F.id AND R.studentfacultyratio < 3; 


Answer the following questions:

A. What is the estimated cost of this plan?

    

    

B. What kind of join is used by the plan?




C. Disable the join type used in the above plan and re-optimize the query.  What type of join is used now, and what is the total estimated cost of the query?


 


d. What relations are sorted in this plan, and why?




6.      THREE-WAY JOIN

RE-ENABLE (using “SET”) the join method you turned off above.

Answer the following questions referring to the query below: 

SELECT S.name, R.name, F.avesalary

FROM students S, rankings R, financials F

WHERE S.school = R.id and R.id = F.id;


A. Describe the best plan estimated by the optimizer. List the joins and access methods it uses, and the order in which the relations are joined.

 

 

 


B. What is the Relational Algrebra expression for the above join order?





C. Modify the query by adding a condition R.studentfacultyratio < 10. What are the differences between this plan and the one above? Why is this new join ordering better for the extended query than the ordering obtained in part A?

 

 

 

7.      PLAYING WITH SQL 

Answer the following questions about the database (by writing queries!): 


A. What is the name of the public school (public = 1) with the highest average salary?  (hint: while one way to get close to this is simply to list all schools and the average salary for them sorted by average salary, you may also want to try writing a variant that only produces a single result row – it’s a bit ugly).




B. Find the public school (public = 1) with the largest difference between instate tuition and out of state tuition that has at least one student attending.




C. Show a query to find out some other interesting fact in the database and tell us what the answer is (or at least summarize it).

Assignment 2

posted ‎‎Oct 17, 2008 12:42 AM‎‎ by Beth Trushkowsky   [ updated ‎‎Oct 26, 2008 10:07 PM‎‎ ]

Project Overview

In Project 1, you studied how to change the page replacement policy of the PostgreSQL buffer manager. In this project you will move to a higher level in the system and add functionality to the PostgreSQL query executor. We will restrict our focus to grouped aggregates. This project will be considerably more complex than Project 1, both in terms of the amount of coding involved and in understanding existing code. The major parts of the project are:
  1. Gaining a “big picture” understanding of a query executor
  2. Examining and understanding existing code
  3. Implementing a new query operator
The grouped aggregate COUNT() that you've seen in lecture assumes that all elements for the column(s) specified will be counted and returned  For many cases however, you only actually want the groups with the k-highest counts.  One example would be a query that lists the 100 most popular videos on YouTube.  

Your task in this project is to implement a new grouped aggregate called APPROX_COUNT() that saves space in memory by only keeping track of a fixed number of elements to maintain an approximate count of the most frequent values. You will experiment with the amount of available memory and observe how your APPROX_COUNT aggregate performs compared to the traditional COUNT. You've seen in lecture saw how grouped aggregates can be implemented with sorting as well as hashing. In this project, you will be implementing your new aggregate operator using the hashing technique.

Administrivia

  • You may work on this project in teams of two
  • The project will be due October 29th.
  • Since understanding the code is an important part of this project, the TAs and Professor will not assist you in understanding the existing code beyond what is discussed here.

Description of APPROX_COUNT

The approximate count algorithm saves space by keeping track of the frequency of only the most popular items, instead of all items in the system.  Pseudocode for the approximate count aggregate is shown below.

Algorithm Pseudocode

Algorithm: 

Set up a structure to hold
elements (e_1, e_2,...e_m) and counts (count_1, ... count_m)
where m is the number of slots specified by the user.

per_input_tuple(e)
If e is in the structure as e_i,
increment count_i;
else{ let e_m be the element with least hits
Replace e_m with e;
Increment count_m;
}
end;
Your implementation will be a little different as the system automatically maintains a hash table of elements that are in your structure and tells you where e_i is located, if it exists already.

If you're interested in more details, see the original paper: 
http://www.cs.ucsb.edu/~dsl/publications/2005/ICDT2005-metwally.pdf

Setup

Clean-up old versions

In order to avoid going over quota, you will want to start by deleting the large directories we used in the last project (~/pgdata, /home/tmp/<USER>/postgres-8.0.3).  This can be done with 'rm -rf <file>'.  If that still prompts you for every file to remove, try typing 'unalias rm' first.

Download Postgres-8.3.4

$ cd /home/tmp/$USER
$ wget http://ftp7.us.postgresql.org/pub/postgresql/source/v8.3.4/postgresql-8.3.4.tar.gz
$ gunzip postgresql-8.3.4.tar.gz
$ tar xvf postgresql-8.3.4.tar

Apply the Patch We Provide

To get you started, we made many modifcations to Postgres.  All of these can be made on your source tree by applying a diff (a diff is a collection of line changes in a special format).  It is applied using the patch program.  Do this on icluster.eecs.berkeley.edu.
$ patch -p1 < ~cs186/fa08/hw2/approx_agg.diff

Build and Install Postgres 

Please refer to Assignment 1 for details on how to build and install postgres from source.

What to Implement

You will be modifying code in two files:
./src/backend/executor/nodeAgg.c (most of your code will be here) 
./src/include/nodes/execnodes.h

You must add any data structures that you think you will need. 
You will need to add code to some existing functions:
  • void approx_agg_per_input(AggState *aggstate, AggHashEntry entry, bool entry_is_new)
    • This will be called once for each tuple
    • You will need to take the data from entry and use it to update the state you are keeping in aggstate
    • If the entry is new, that means its either something you haven't seen before, or something you have seen, but evicted.  This means you will either need to make a new bucket for it, or if you are out of memory buckets, evict something.
    • If you evict an entry, you will need to delete it from the hash table as follows:
      • remove_hash_entry(aggstate, old_entry)
  • TupleTableSlot *agg_retrieve_approx(AggState *aggstate)
    • This is the function that is called after all the input has been passed to approx_agg_per_input.  It returns a TupleTableSlot with the results.  We have provided most of the code for this function
  • AggHashEntry approx_advance_iter(AggState *aggstate)
    • This function is called by agg_retrieve_approx each time we are ready for a new result tuple.
You will also need to add any additional functions that you deem necessary.

Helpful Tips

  • Looking through the diff will give you a good idea of what changes we made to the code to make this project possible.  While you don't need to understand all of it, a basic knowledge of what goes where could be very helpful
  • Parts where you will need to insert code have been marked with /* CS186: */ comments.  Try 'fgrep -rn CS186 *' from the postgres directory
  • There is a doubly-linked list implementation in src/backend/lib/dllist.c that you may find useful.
  • In agg_retrieve_approx, the setting of your count to aggvalues[] needs to be of type Int64Datum. This means that the data type of your counts should be int64, which is just a 64-bit integer. An Int64Datum is a wrapper that either stores the value of the int64 or a pointer to a Datum object that stores it.
  • We haven't discussed the concept of contexts in Postgres, but just know that to allocate and free memory, use palloc and pfree

Testing Implementation

Test Data

We have provided you with two datafiles in ~cs186/fa08/hw2/{data1,data2}.  If you have trouble accessing that path to the data files, try the absolute path: /home/ff/cs186/fa08/hw2/. Each contains a list of numbers with different distributions.  Feel free to explore by running queries over this data using the real COUNT() aggregate to get a feel for what they look like.  Create a Test database before running these queries, like in the first assignment.
test=# CREATE TABLE numbers (num int);
test=# COPY numbers FROM '<path to data file>' DELIMITERS ',';

Queries to Run

Run the following query for each dataset. Record your results for each (for your own reference for the discussion questions).
SELECT num, COUNT(*) FROM numbers GROUP BY num ORDER BY COUNT(*) DESC LIMIT 10;
Next, run the following query for each dataset using the approximate aggregate and vary the memory spaces to 10, 50 and 100.
SELECT num, APPROX_COUNT(10, mem) FROM numbers GROUP BY num;
Save the results to these queries as data<file_number>.<mem_size>.txt for submission. 

Discussion

In two-three sentences each, discuss your answers to the following:
  1. For which of the datafiles does your APPROX_COUNT opperator work well?  When doesn't it work?  What characteristics of the data explain this behavior?
  2. What effect does varying the memory size have on the accuracy?  What memory size worked best for each datafile?  Why?
Save your responses in discussion.txt.

What to turn in

  1. nodeAgg.c
  2. execnodes.h
  3.  6 files in the form: data<file number>.<mem size>.txt - Table of query run results
  4. discussion.txt - Answers to the discussion questions, TEXT ONLY!
  5. MY.PARTNERS - Listing the other person that you worked with

How to Submit

  1. Save your files in a directory called "hw2" within your cs186 home directory.
  2. cd into hw2
  3. Run: submit hw2

Assignment 1 - Part 2

posted ‎‎Sep 4, 2008 11:16 AM‎‎ by Michael Armbrust   [ updated ‎‎Sep 24, 2008 11:44 PM‎‎ by Beth Trushkowsky ]

Part 2: Implementing the CLOCK and MRU Buffer Replacement Strategies. (70%)

Setting up PostgreSQL and Other Files/Scripts

Everything needed for the second part of the assignment is available in an svn repos at ~cs186/fa08/postgress-8.0.3 ~cs186/fa08/hw1 on x86 linux instructional machines.

Note on conventions:  In the labs we will use gray boxes with the '$' prompt to signify text that you should type in verbatim to the shell on one of the instructional machines using your cs186 account.  Below we will provide an explanation of the commands you are entering.

To setup the environment start by:

$ /share/b/bin/mkhometmpdir
$ svn co file:///home/ff/cs186/fa08/postgres-8.0.3 /home/tmp/$USER/postgres-8.0.3
$ mkdir ~/hw1
$ cd ~/hw1
$ svn co file:///home/ff/cs186/fa08/hw1 hw1-code
$ ln -s /home/tmp/$USER/postgres-8.0.3 ./
  1. This is a script set up by the instructional computing support to create a special temporary directory.  We will keep the source code and other large things here to prevent problems with disk quotas.  You should clear this directory out (since it won't contain anything you can't recreate easily) after you are done with the project
  2. This checks out the slightly modified (for space reasons) version of the postgres source tree from Subversion.
  3. Create a directory for homework one in your home directory (~ is short hand for the current users home dir)
  4. Change into the directory we just created
  5. Check out the skeleton for the actual assignment (skeleton file of freelist.c).  We are keeping this in subversion as well to ease deployment of possible updated that might need to be given out after the project is assigned.  Typing 'svn update' from inside this directory will check for the latest version of the skeleton file at anytime.
  6. Create a symbolic link (like a pointer) to the temp directory copy of the postgres source.  This is purely for your convenience.

You can build and install PostgrSQL by using these commands:

$ cd ~/hw1/postgres-8.0.3
$ ./configure --prefix=$HOME/pgsql --enable-debug --enable-cassert
$ make
$ make install
  1. Move to the checked out source tree.
  2. ./configure is a script found in many unix/linux programs that automatically customizes the build process for your specific computer.  It is auto generated by autoconf (This might take a while).
  3. Build the source using make (This might take a while).
  4. Install postgres into the location you specified during configuration (--prefix=$HOME/pgsql).
Now that you have build and installed postgres, you can test your installation by running the following commands:
$ ~/pgsql/bin/initdb -D ~/pgdata
$ ~/pgsql/bin/pg_ctl -D ~/pgdata -l pre.log start
$ ~/pgsql/bin/psql template1
  1. Initialized a default empty database for postgres to use at ~/pgdata.
  2. Start the background server using the the blank environment you just initialized and the logfile ./pre.log (You can look in here for debugging information about the running server).
  3. psql is a shell that you can type SQL commands or special postgres commands into (starting with a \).  Since we have not yet created a database yet, we will use the default one "template1". 
Note: If you encounter an error like "FATAL:  user "<login>" does not exist", you may be trying to connect to another student's Postgres instance on the same port number. See the troubleshooting section below. 

This will open a postgres prompt from which you can type SQL commands to the database.  Below are some example commands to try out.  Feel free to play around to get a good feel for SQL.  Can you implement the examples we talked about in discussion section?  To quit simply press ctl-d or \q.

template1=# CREATE DATABASE test;
template1=# \connect test;
test=# CREATE TABLE test_table (field VARCHAR(25));
test=# INSERT INTO test_table (field) VALUES('Hello World');
test=# SELECT * FROM test_table;
test=# DROP TABLE test_table;
  1. This is DDL to create a new database
  2. \connect lets you switch switch the connection to a new database (in this case from template1 to test)
  3. Create a table 'test_table" with a single field named 'field'.
  4. Insert the string 'Hello World' into the database.  Note the use of single quotes instead of double quotes.
  5. Retrieve all the data from test_table
  6. Delete our experimental table.
For more information about PostgreSQL usage, you can consult the version 8.0 manual. 

Modifying the Buffer Pool Replacement Strategy

In version 8.0.3 a 2Q strategy is implemented.  If you would like to check it out look at ~/hw1/postgres-8.0.3/src/backend/storage/buffer/freelist.c.  For this part of the assignment you will be implementing two different strategies: MRU and Clock.  You will then compare the performance of your implementation to the original.  To help you out, we have provided the implementation for LRU.

Files to Look At

You can add and manage any new data structures that you need. The existing code is not extremely clear, but it is understandable. It may take you a few hours (or more) to digest it. Since understanding the existing code is a significant part of the assignment, the TAs and Professors will not assist you in your understanding of the code base (beyond what we discuss here).

The actual buffer manager code is neatly separated from the rest of the code base. It is located in the postgres-8.0.3/src/backend/storage/buffer directory, and primarily made up of the files bufmgr.c and freelist.c. While bufmgr.c contains the implementation of the buffer manager, we are only interested in freelist.c, which defines the buffer manager strategy (e.g., LRU, MRU, CLOCK, etc.).

We have provided you with a commented example freelist.lru.c which details the functions you will need to implement.  We recommend that you copy this to freelist.mru.c and freelist.clock.c in order to create your implementations.  There is also stubs.c (which contains helper code needed to test) and buftest.c (a simple testcase). You may want to add additional tests to buftest.c as we will be doing this when we grade you.

The provided Makefile will compile your implementations and generate three binaries (testlru, testmru and testclock).  Check the output to make sure you have implemented the policy correctly.

Rebuilding Postgres With Your Strategies

the following commands to rebuild and install a modified version of postgres that uses the clock or MRU replacement policy that you created, assuming you named your files freelist.mru.c and freelist.clock.c
$ cp freelist.<clock or mru>.c ~/hw1/postgres-8.0.3/src/backend/storage/buffer/freelist.c
$ cd ~/hw1/postgres-8.0.3/
$ gmake
$ gmake install
Note: The make file contains and example of how to automate this.

Performance Testing

After you have finished implementing and testing your strategies with the simple harness, it is time to test their performance in the real server.  We first start the server using a log file named for the test we are doing (Ex: mru16.log), with additional arguments to see more statistics in the log.
$ ~/pgsql/bin/initdb -D ~/pgdata
$ ~/pgsql/bin/pg_ctl start -D ~/pgdata -l <log name>.log -o "-B <numbuffers> -N 8 -o '-te -fm -fh'"
$ ~/pgsql/bin/psql test
You can see the output of this log in realtime in another window by typing:
$ tail -f <log file>
For the performance testing you will be creating two tables (raw_s_tuples and raw_r_tuples) and populating them with the provided data.
test=# CREATE TABLE raw_r_tuples (tname varchar(2), pkey int, num2 int, num3 int, num1 int, node varchar(16), inserttime float8, lifetime float8, testname varchar(64));
test=# CREATE TABLE raw_s_tuples (tname varchar(2), pkey int, num2 int, num3 int, num1 int, node varchar(16), inserttime float8, lifetime float8, testname varchar(64));

test=# COPY raw_r_tuples FROM '/home/ff/cs186/fa08/data/R' DELIMITERS ',';
test=# COPY raw_s_tuples FROM '/home/ff/cs186/fa08/data/S' DELIMITERS ',';

Testing Matrix

For the final part of the lab, we are going to have you run two different queries under different circumstances and report your results.  The statistics will be output in the log file you specify for the server and will be in the following format:

LOG:  EXECUTOR STATISTICS
DETAIL: ! system usage stats:
! 0.065834 elapsed 0.068005 user 0.000000 system sec
! [36.306269 user 0.016001 sys total]
! 0/0 [0/0] filesystem blocks in/out
! 0/0 [0/1915] page faults/reclaims, 0 [0] swaps
! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
! 0/0 [134/88] voluntary/involuntary context switches
! buffer usage stats:
! Shared blocks: 71 read, 0 written, buffer hit rate = 99.26%
! Local blocks: 0 read, 0 written, buffer hit rate = 0.00%
! Direct blocks: 0 read, 0 written

Take special note of the buffer hit rate (99.26% in this example).  Run each of the following queries at buffer sizes of 16, 32, and 64.  Repeat the results for LRU, MRU and CLOCK.  Record your results and the answers to the following questions in a text file performance.txt and turn it in with your submission.  You'll also turn in a text file analysis.txt with a brief analsis of the results (see below).

Queries:
  1. SELECT * FROM raw_r_tuples;
  2. SELECT * FROM raw_r_tuples r, raw_s_tuples s WHERE r.pkey = s.pkey;
Run each query three (3) times for each of the three buffer sizes, for each of LRU, MRU, and Clock. You will be recording your minimum and maximum hit rates for the combinations of buffer sizes and policies, for both queries. Record the hit rate as a decimal to two places, without the (%) sign, e.g. 99.26. You'll be filling in the following table for both queries.

Clarification: Run the first query several times for each replacement policy and record your hit rate results. Then do the same for the second query. You don't need to clear your buffer between the two queries. The important aspect of this is that you detail in your analysis what you did and why you think you got the results that you got.

   16, min  16, max  32, min  32, max  64,min  64,max
 LRU            
 Clock            
 MRU            


Sumission

Partners

We will be using the parners functionality of the grading scripts.

The student submits the assignment using 'submit'. The system will look for a file called  MY.PARTNERS in your current directory or ask you for the login of your partner.  The format of the MY.PARTNERS file  is a list of user names delimited by spaces, tabs or newlines, such as:

    cs123-aa cs123-ab cs123-ac

What To Turn In

  1. performance.txt -- TEXT ONLY! Two (one for each query) performance comparison tables: three rows, six columns separated by space, which represent the two tables we asked you to fill. Put the second table below the first.
  2. analysis.txt -- In 1-2 paragraphs, discuss whether you think the hit rates for MRU and Clock are as expected from the discussion in class.
  3. freelist.mru.c -- Your implementation of the MRU policy
  4. freelist.clock.c -- Your implementation of the Clock policy
  5. MY.PARTNERS - Listing the other person that you worked with

How to Submit

  1. Save your five files performance.txtanalysis.txtfreelist.mru.cfreelist.clock.c,and MY.PARTNERS in a directory called "hw1p2" within your cs186 home directory.
  2. cd into hw1p2.
  3. Run: submit hw1p2

Notes on Debugging

You should ONLY proceed to debugging postgresql after you ensure that the logic of your freelist.skel.c is correct with the Test Harness discussed after this section! We suggest that you debug using the logging/printing feature: elog(DEBUG1,<format>,<arg>). Please see bufmgr.c for example uses of this routine. Also, you may use gdb with the src/backend/postgres standalone executable. Make sure you run your version of postgres and not the one in the class account directory! To start the backend server in stand-alone mode, run src/backend/postgres -D <DATADIR> test. It will use the data directory (and data) you prepared using initdb -D <DATADIR> and database test you created using createdb test. Alternatively, you can use the scripts in /MyCode/PerformanceEvaluationScripts to quickly setup your database. The postgres interface is directly to the backend, so psql features will not work. If your binary does not work correctly, it may corrupt the data, so be warned.

Troubleshooting

Since you are going to be sharing the machines with many other students, you might run into port number conflicts.  To change the port number of of your Postgres instance, first shut down the instance. Then change your port number in the Postgres configuration file using your favorite text editor, e.g:
$ ~/pgsql/bin/pg_ctl -D ~/pgdata stop
$ emacs ~/pgdata/postgresql.conf
Edit the line with the port number, making sure to remove the comment (#) at the beginning of the line. Port numbers within 129 - 32768 should be usable. Then start your instance as usual, and start the your database connection using your specified port number.
$ ~/pgsql/bin/pg_ctl -D ~/pgdata -l <log name>.log start 
$ ~/pgsql/bin/psql template1 -p <port>

Assignment 1 - Part 1

posted ‎‎Sep 1, 2008 2:29 PM‎‎ by Michael Armbrust   [ updated ‎‎Sep 10, 2008 2:00 PM‎‎ by Beth Trushkowsky ]

The aim of the first two projects is to modify functionality in the backend server of PostgreSQL. For this first assignment, we focus on just one core module - the buffer manager. The actual coding for this assignment is minimal, given the highly organized and modular PostgreSQL code (the code for the buffer manager policy is cleanly separated from the rest of the code base). However, you have to figure out what code to modify, which is non-trivial! There are three major parts to this project:
  • Understanding different buffer management strategies
  • Examining, understanding and modifying existing code
  • Testing your code in a real system

Please skim through this document to the end before getting to work!

Partners

There are two parts to this assignment, the first of which is to be done individually and the second in teams of two people. We will post the deadlines for when and how to register your team with the second part of the assignment.

Deadlines

You have to submit this assignment in two main parts:

  1. The first part will test your understanding of different buffer replacement strategies. This part must be done individually and is due at Monday, September 15, 11:59PM
  2. The second phase requires that you and your partner code a buffer replacement policy and create tests code. The due date for the part is Friday, September 26, 11:59PM

Tasks and Grade Percentage

  • 30% - Written assignment on "Understanding Different Buffer Replacement Strategies"
  • 50% - Implementation of alternate buffer management strategies in postgres
  • 20% - Performance analysis of different strategies

Part 1: Understanding Different Buffer Replacement Strategies. (30%)

The textbook describes LRU, MRU and CLOCK strategies in section 9.4.1. However, you may need to read the entire section (9.4) to get a full image of a buffer manager in the DBMS. A fourth (more recent) strategy is called 2Q, and is used in Postgres v8.0.3. LRU, MRU, and CLOCK strategies work well enough for most systems so we are going to replace 2Q. But first you'll need to exercise your understanding of the LRU, MRU, and CLOCK buffer replacement strategies.

This part of the assignment requires no code. We will give you the number of page slots that the buffer manager must manage and a sequence of data blocks accessed by the different DBMS layers above the buffer manager layer. You will have to track the behavior of the buffer manager (which pages it has to replace, when, etc.) using your, and yours alone, knowledge of the three aforementioned buffer replacement strategies.

Solution format for this step

You must record your answers in three plain text files called strategy.lru, strategy.mru, strategy.clock. These files will contain the buffer/slot state and buffer manager replacement decisions in accordance to the LRU, MRU, and CLOCK policies respectively. The format of each file is as follows:

  • Three columns of text, separated by spaces
  • One line of text each time a page miss occurs
  • Each row lists the time of a page miss, a space, the buffer pool slot (P1, P2, P3, P4), a space and, if an eviction occurs, the page evicted from memory. Ex: If a miss occurs during time 3, causing us to evict page A and ping buffer pool slot 3 then we would write: "T3 P3 A"
  • Nothing will be written when there is a hit.

Example workload and solution

For example, assume there are four page slots your buffer manager must manage: P1, P2, P3, P4. All four slots are initially empty. When the buffer pool has unused slots (e.g., in the beginning when all four slots are empty), it will put the newly read data in the first unused slot. The blocks to be read from disk are labeled A through F. For simplicity here, we assume that after each access the page is pinned, and then immediately unpinned. (You cannot make this same assumption when writing the actual code. A page may be pinned for any length of time in a DBMS.)

Given this information and the following workload:

Time T1 T2 T3 T4 T5 T6 T7 T8 T9 T10 T11
Block Accessed A A B C D E F A B C D

the files should be as follows:

strategy.lru strategy.mru strategy.clock
T1 P1 T1 P1 T1 P1
T3 P2 T3 P2 T3 P2
T4 P3 T4 P3 T4 P3
T5 P4 T5 P4 T5 P4
T6 P1 A T6 P4 D T6 P1 A
T7 P2 B T7 P4 E T7 P2 B
T8 P3 C T11 P3 C T8 P3 C
T9 P4 D
T9 P4 D
T10 P1 E
T10 P1 E
T11 P2 F
T11 P2 F

Several points deserve elaboration:

  • Certain times (e.g., T2) are missing for each strategy. This is due to the different buffer miss patterns exhibited by the given replacement policy.
  • The third column will be blank when the buffer miss does not result in an eviction.
  • When there are multiple FREE pages available, the page with the lowest id is chosen (e.g. P2 before P3 and P4 when all are free at T3). You should conform to this tie-breaking criteria.
  • In this case, for the given number of pages and sequence of block accesses, MRU emerges the winner (least number of misses).

Now we come to the problem that you will have to solve. Suppose the buffer manager has five page slots to manage: P1, P2, P3, P4, P5. All slots are initially empty. Suppose seven disk blocks (A, B, ..., G) are accessed by the layers above the buffer manager layer in the DBMS.

Clock policy elaboration: we make the following assumptions in the clock policy:

  • The pointer doesn't move when filling free frame in buffer
  • The pointer doesn't move when there's a hit in the buffer
  • The pointer advances after replacing a buffer frame

To get your access pattern, login to your cs186 account and type genpattern.rb at the prompt.

Remember that you must use:

  • Your cs186 class account to get your access pattern.
  • An intel machine with ruby 1.8.6 (icluster.eecs is an example) This is due to the fact we want you to have the same pseudo-random sequence, seeded with you user id, that we will be using to grade you.

Based on this information, you should perform a similar analysis as the one above. We will first have you save your pattern by running the following command

$ genpattern.rb > pattern

You must now generate the three files strategy.lru, strategy.mru, and strategy.clock for this problem. Stick to the specified format (e.g., no extra line at the end of a file), as we will use file comparison scripts to evaluate your answers.

How to Submit

You will need to use the unix submit program to hand in your assignment, it will be turned on by Monday 9/8:

  1. Save your four files pattern, strategy.lru, strategy.mru, strategy.clock in a directory called "hw1p1" within your cs186 home directory.
  2. cd into hw1p1.
  3. Run: submit hw1p1

‹ Prev    1-5 of 5    Next ›