The purpose of this site is to give students access to class notes, homework, project requirements, project downloads, etc., and to provide a forum for reporting problems and bugs and for making suggestions aimed at improving the quality of the course.
Objectives
Knowledge discovery is the nontrivial extraction of implicit, previously unknown, and potentially useful information from large volumes of data. This course will focus on recent work on data mining and knowledge discovery, and on how these topics relate to Bayesian networks. We will also explore the new ways in which the future of distributed computing is being shaped. We will see how tools for electronic commerce are being developed, and we will study the enabling technologies that make them possible.
Tentative Topics
Prerequisites
Excellent programming skills, a good background in statistics and probability, graduate standing, or consent from the instructor. Students will be expected to explore untouched territory, assimilate complex mathematical concepts, write large pieces of computer code, and use cutting-edge programming tools, many of which may be in beta release.
Grades
Course Material
Homework
Instructions for each homework assignment will be posted on this site. In general, students will have to write computer programs that comply with given specifications and post their solutions, including executables, code, and .mak or project files, in a designated network location with full access rights set up for Engineering\vargasje. If you fail to set up your solutions appropriately, you will receive no credit for the work, so be careful. I recommend that you test your setup before telling me that it is ready.
Quizzes
This is what you would expect: surprise questions that cover either material discussed during a recent lecture or reading material assigned for the lecture in which the quiz is given. In general, quizzes will be short and to the point. Students are expected to answer quickly and specifically, with no exercises in rhetorical narrative. Just the facts, please.
MidTerm Project
The MidTerm Project will be done by all students individually. You will be assigned a very specific task, and you are expected to perform that task! Most likely I will give you a data set to which you will apply some algorithms to learn the topology of a Bayesian net and/or to do probabilistic inference. Your code will be written in C++, C, and/or Java. You will receive more instructions regarding code documentation, project delivery, expected results, etc.
Final Project
The Final Project will be done by all students. The objective of the final project is to go through the process of designing and developing new and exciting software. Students will write a project proposal by midterm. The proposal will be discussed immediately after the midterm in order to articulate the specific goals to be achieved by the project. Once the specifics of the project are identified, the team is expected to fully complete all goals. A typical project will consist of (1) developing an algorithm for data mining, (2) using the algorithm with real data, and (3) writing a report describing the algorithm, its mode of use, etc. The code may be written in C, C++, and/or Java. You will receive more instructions regarding code documentation, project delivery, expected results, etc.