OverviewIn this assignment, you will be creating a simple web application called MyTunes using Ruby on Rails with SQLlite as your backend database. Your website will store information about music, like songs and artists, similar to the example from discussion section earlier in the semester. You will be practicing the database design process by designing a relational schema to represent the data we give you, defining necessary DDL statements, and writing interesting SQL queries against your database. The technologies used in this assignment are described in more detail below.We give you a data file in XML that describes a collection of songs, and you will be responsible for designing a relational schema to represent this data, using Ruby to parse the XML file and populate your database. After you have populated your database, you will write DML statements to answer the queries we specify.
Part of this assignment is presenting a front-end view to the results of your queries. Ruby on Rails uses the Model-View-Controller paradigm. Your queries will be specified in a Controller and Views will display records matching each query in your web browser. While you won't be graded on aesthetics, feel free to be as creative as you'd like with the appearance of the Views. See the MVC description below. Administrivia
Key TechnologiesIn this section we describe some of the tools you will be using to complete this assignment.RubyRuby is a dynamic, open-source, scripting language. While learning ruby is obviously not a major focus of this class you will need a cursory understanding in order to complete this assignment. We recommend looking at some of the following resources before you get started. Since this is a new language for some people, please don't hesitate to ask the TAs or newgroup questions about ruby basics, as we don't want you to waste time fighting with the language.XMLThe Extensible Markup Language (XML) is a general-purpose specification for describing hierarchical data. If you know HTML it will probably look very familiar to you. While the full specification and validation of XML documents can be fairly complex, this should not effect you when completing this assignment. Below is an example of the XML schema we will be giving you:<data> Each element is represented between <>. Elements can contain child elements (such as artist with in song), attributes (such as title within song), or just plain text (such as Justin Timberlake inside of <artist></artist>. Some elements have a <start> and </end> tag, and some (like rating) are empty-element tags. We will be using a Ruby library, REXML, to parse the xml and convert it to a relational schema of your design. Example code is provided in the project skeleton. In ruby we have written most of the code for you to parse the XML. You can access attributes (song.attributes['sha1']) and the text of the entries (artist.text.strip) from the objects passed to the lambda functions you will need to write. RubyOnRailsOverviewRuby On Rails is an open-source web development framework written in Ruby. Its goal is to make it easy to rapidly design, implement and deploy database backed, interactive websites. We will be providing you with a partially implemented site so that you can focus on schema/query desgin and not on learning the ins and outs of Ruby/Rails. However, if you are interested in learning more there are a number of great quick start tutorials and screencasts that might help you understand some of the black magic that is going on behind the scenes.Get Michael's live coding example from class Understanding MVCRails is based around the Model-View-Controller design pattern.<wikipedia> It is common to split an application into separate layers that run on different computers: presentation (UI), domain logic, and data access. In MVC the presentation layer is further separated into view and controller. MVC is often seen in web applications, where the view is the actual HTML page, and the controller is the code that gathers dynamic data and generates the content within the HTML. Finally, the model is represented by the actual content, usually stored in a database, and the business rules that transform that content based on user actions. Though MVC comes in different flavors, control flow generally works as follows:
By decoupling models and views, MVC helps to reduce the complexity in architectural design, and to increase flexibility and reuse. </wikipedia> You will see this functionality separated into directories in app/ in the skeleton provided. There are three controllers:
ActiveRecordActiveRecord is an Object-Relational Mapping library used by RubyOnRails. It's fairly full featured and makes a number of assumptions about the layout of your schema. If you are familiar with this library feel free to use any basic (no plug-ins) features you. If not, don't worry, as we have provided a list of these conventions and overrides, along with few simple example below about using ActiveRecord to run arbitrarily complex database queries over the Boats/Sailors example from lecture.Conventions:
Sailors(sid, sname, rating, age) Boats(bid, bname, color) Reserves(sid, bid, day) class Sailor < ActiveRecord:Base set_primary_key :sid has_many :reservations, :foreign_key => :sid end class Boat < ActiveRecord:Base set_primary_key :bid has_many :reservations, :foreign_key => :bid end class Reservation < ActiveRecord:Base set_primary_keys :sid, :bid, :day belongs_to :sailor, :foreign_key => :sid belongs_to :boat, :foreign_key => :bid end Make a new sailor: s = Sailor.new s[:sid] = 1 s[:sname] = "Michael" s[:rating] = 10 s[:age] = 24 s.save Find a sailor by name: s = Sailor.find_by_sname("Michael") Find all red boats: b = Boat.find_all_by_color("red") Find all reservations by a sailor: s = Sailor.find_by_sname("Michael") r = s.reservations These are the basics, but sometimes you want to be able to write more complex queries. For this you can call back on full SQL. If you wanted all sailors who have reserved a boat today you could write. s = Sailor.find_by_sql("SELECT DISTINCT sailor.* FROM sailors JOIN reservations ON sailors.sid = reservations.sid WHERE reservation.day = <today>") You can even add computed collumns. For example to find all sailors along with their reservation counts you could write: s = Sailor.find_by_sql("SELECT sailor.*, count(*) as reservation_count JOIN reservations ON sailors.sid = reservations.sid GROUP BY sailors.sid") s[0] #First sailor in the list s[0][:sname] #name of the first sailor in the list s[0][:reservation_count] #Reservation count for the first sailor in the list Sqlite3SQLite is a software library that implements a self-contained, serverless, zero-configuration, transactional SQL database engine. Unlike postgres which uses a client server model, sqlite is an in-process database. This simplifies deployment and removes many of the problems people had in previous labs getting postgres to run.Creating a new database is as easy as typing: sqlite3 <db file>. This will give you a SQL prompt where you can CREATE TABLE, SELECT, etc. To exit the interpreter you simply type '.quit'. For this project you will be working on a database called db/development.sqlite3 AssignmentGet the SkeletonIn your home directory on the instructional machines run the following command:tar xvf ~cs186/fa08/hw4/skel.tar You will see a number of directories in the main mytunes directory. Those of interest are:
Design Your SchemaYou will need to design a relational schema to hold the data. Feel free to design it as you see fit. Put the DDL (i.e. CREATE TABLE....) into a file called ./schema.sql in the root directory of the app. DO NOT USE MIGRATIONS. We will be executing this from a script to create your database.To setup the database run: rm -f db/development.sqlite3; sqlite3 db/development.sqlite3 < schema.sql You can then play around with it in a console by running: sqlite3 db/development.sqlite3 Find Functional DependenciesDetermine the set of functional dependencies present in your relational schema. Make any schema changes necessary. Save your functional dependencies in a text file called fd.txtYou do not need to list trivial functional dependencies (X -> X), but you do need to prove your schema is in BCNF by listing other dependencies that do exist and stating why they don't violate any of the BCNF conditions.. Start the ServerYou can start the server by running:script/server -p <port> Pick any high numbered port (1024+). Your rails app will then be accessible from http://<server>:<port>. You can stop it with ctrl-c. You don't need to restart the server everytime you make code changes, rails will auto-reload the files. Helpful debugging information can be found in log/development.log Import the DataWe have provided sample data for you in the form of ./data.xml in the root of your app. You can cut and paste this into the form at http://<server>:<port>/data/reload, after implementing the reload function. Make sure you copy the entire data file.When importing the data you may find that there is extra whitespace surrounding the artist and album names. The ruby strip command can be used to clean this up. (i.e. to get the artist name type "artist.text.strip") Write the QueriesNote: the comments in query_controller.rb for by_artist and by_album should read Fill the songs array with all songs that are by the artist (or album) with name '#{params[:key]}'
Each query should be specified in the query_controller.rb using the given function name (called an action).
Optimize Your SiteToo Many QueriesTry looking at the development log as you use different pages on the website by running `tail -f log/development.log` in a separate terminal window. For each hit to the site you should see a block of text containing the following information:
IndexesSimilarly, some of you actions may be doing scans over large sections of the database to return an answer. Thus you should modify your schema to add indicies to speed up the queries you run. If you add the indexes dynamically at the sqlite console, make sure you also update your schema.sql file so that the indexes will be created when we grade the assignment. Briefly discuss your rationale for adding indexes in the fd.txt file you submit. One useful tool you can use to find out how queries execute is EXPLAIN from inside of a sqlite3 console session. Run this on specific queries that you have harvested from the development log. Since EXPLAIN is not part of any standard, the output will be much different than what you saw in Assignment 3. An example is shown below:sqlite> explain select * from songs WHERE SHA1 = "793d5c8018384ff24983d55b6ca3ad38";;Each row represents an operation that is run on the sqlite virtual machine and a reference of what each of them does can be found here. While it is not necessary to fully understand all of the instructions for the virtual machine, understanding some important operations (like IdxGE) will help you understand when indexes are being utilized vs table scans. You should create any indicies that are necessary to efficiently execute the queries in the assignement. What to turn inCreate a tar file of the things to turn in by running `tar cvf hw4.tar app/ schema.sql`. Then run the submit command to turn in the following:
TipsRunning exampleA running example of the assignment is accessible here:http://sphere.cs.berkeley.edu:3000/query Working ServersWe suggest you spread yourselves out over as many of the instructional machines as possible. While in theory any of the machines of type "Linuxi686" or "SunOSi86pc" should work, some of them are overly busy or seem to have weird firewalls. Below is a list of machines known to work (at least at one point). Please post any additions or corrections to the newsgroup as you encounter them. Note: since ruby is interpreted instead of compiled you do not need to 'make clean' and remake in order to switch machines.
Debugging
Port troubleshootingIf you get an error like TCPServer Error: Address already in use, you may be trying to start your server on someone else's port or you may already have a server running on that port. Check if you have a server running by ps -u user_name (works on SunOS machines). If you can't kill the ruby server using ctrl-c, kill it using its process id: kill -9 pid. |