Due - December 17, 2009 - 12noon
The goal of this assignment is to practice designing and implementing
a large web application.
For this project, you will develop a personalized search engine that
authenticates users, allows them to perform searches, and maintains a
history of all prior searches each user has performed.
Your application will support the following features for a total of 100 points plus possible extra credit:
- Search (35 points) - The
main page of your site will present the user with a text box where
he/she can type a search query. When the user submits the query, your
application will search a backend InvertedIndex for the crawled pages that
contain the query terms and will return an HTML page that lists links
to all of the pages containing the query terms.
- User Registration (15 points) - In
order to track a user's query history, you will allow users to register
with your search application. When a user visits the registration
page, he/she will be asked to enter (at minimum) a username and
password. Your application will store this information and use it for
- User Login/Logout (10 points) - Your site will
allow a user to log in by entering a username and password. If the
user is not registered, you will ask him/her to register in order to
the user's session. If the user logs out, you will clear the session.
- Search History (15 points) - Once
a user is logged in, your site will save any queries the user enters.
Your site will provide a mechanism (e.g., a link) that will allow the
user to view his/her search history.
- Account Maintenance (10 points) - Your site will allow a user to change his/her password and to clear his/her search history.
- Extra Features (15 points + possible extra credit)
- Page Preview (15 points) - In the search results provided by your
site, you will show a few lines of the text from each result page.
This will require that you save a copy of each page you have crawled.
- Page Visit History (15 points) - In addition to tracking a user's
query history, maintain a list of the links a user has followed. This
will require that your search results page provide links that direct
the user back to your site. Your site will then record that the user
has followed the link and redirect the user to the real page.
- Advanced Search (up to 15 points) - Allow the user to require that the
query words appear next to one another in the result documents (i.e.,
allow the user to specify the query in quotation marks). Also allow the
user to specify a set of words such that pages containing the given
words are not returned as part of the result set.
- Results Per Page (5 points) - Allow the user to select the number of results displayed on each page.
- Administrator Interface (10 points) - Provide an administrator
interface that allows an administrator to enter a new
seed URL to start a new crawl. Newly crawled pages will be added to
- Choose Your Own (up to 15 points) - See me
to suggest a feature and I will tell you how many points you will earn
for implementing it.
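
As an illustration of the Search feature and the word-exclusion part of Advanced Search described above, here is a minimal sketch of an index lookup. It is not the required design: the class name `InvertedIndexSketch`, its methods, and the tokenization rule (lowercase, split on non-alphanumeric characters) are all assumptions made for the example, and your own InvertedIndex may look quite different. A page matches only if it contains every query term and none of the excluded terms.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the backend InvertedIndex: maps each term to the
// set of URLs whose text contains that term.
class InvertedIndexSketch {
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Index one crawled page under every term that appears in its text.
    void addPage(String url, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(url);
        }
    }

    // Return (in sorted order) the URLs that contain every term in `query`
    // and none of the terms in `excluded`. An empty query matches nothing.
    List<String> search(String query, String excluded) {
        Set<String> result = null;
        for (String term : tokenize(query)) {
            Set<String> urls = postings.getOrDefault(term, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(urls);   // first term seeds the result
            } else {
                result.retainAll(urls);         // later terms intersect
            }
        }
        if (result == null) {
            return new ArrayList<>();
        }
        for (String term : tokenize(excluded)) {
            result.removeAll(postings.getOrDefault(term, Collections.emptySet()));
        }
        return new ArrayList<>(result);
    }

    // Assumed tokenization: lowercase, split on runs of non-alphanumerics.
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String piece : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!piece.isEmpty()) {
                terms.add(piece);
            }
        }
        return terms;
    }
}
```

A servlet handling the search form could call `search(query, excluded)` and write each returned URL into the results page as a link. Quoted-phrase search would additionally need term positions in the index, which this sketch does not store.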
- Your user account information and search histories will be stored in a MySQL database.
- You will use Servlets to dynamically generate web content and handle requests.
- Your server will take a seed URL as a command line parameter. At
startup, it will begin crawling the web starting at the given URL. You
may restrict the total number of pages crawled simply so you do not run
out of memory.
- You must demonstrate a first release of your project the week of 11/23-11/25. You need not demonstrate all functionality, but points will be deducted from your final score if you fail to show progress at the first release.
- It is recommended that you implement Search, User Registration, and User Login/Logout by the first release.
- You must schedule a demonstration appointment during the final exam period. For your demonstration, you will show the functionality of your project, give an overview of your design, and respond to questions regarding your code. Points will be deducted from your final score for failure to complete this portion of the assignment.
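
The requirement above that the server crawl from a seed URL while limiting the total number of pages can be sketched as a bounded breadth-first traversal. This is only one possible approach under stated assumptions: `BoundedCrawlSketch`, `crawl`, and the `fetchLinks` hook (which would download a page and return its outgoing links) are hypothetical names, and the actual fetching and HTML parsing are left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

class BoundedCrawlSketch {
    // Breadth-first crawl starting at seedUrl, visiting at most maxPages
    // distinct URLs. fetchLinks is a hypothetical hook supplied by the caller;
    // in a real server it would download the page and extract its links.
    static List<String> crawl(String seedUrl, int maxPages,
                              Function<String, List<String>> fetchLinks) {
        List<String> crawled = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(seedUrl);
        seen.add(seedUrl);
        while (!frontier.isEmpty() && crawled.size() < maxPages) {
            String url = frontier.remove();
            crawled.add(url);
            for (String link : fetchLinks.apply(url)) {
                // Stop remembering new URLs once the limit is reached, so
                // the seen set and the frontier stay bounded by maxPages.
                if (seen.size() < maxPages && seen.add(link)) {
                    frontier.add(link);
                }
            }
        }
        return crawled;
    }
}
```

The seed URL would come from the command line (for example `args[0]` in `main`), and each crawled page would be handed to the index as it is visited.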