Instructors: Joe Hellerstein, Alon Halevy and Mike Franklin
Course: Tues/Thurs 12:30-2:00PM, 310 Soda
Announcements
Details of Final Project Presentations and Papers:
- Short, Informal Class Presentations, Tues Dec 9th in class (last day of classes). Bring slides and be prepared to explain what your project is, and what you are currently working on.
- Project Paper due Tues Dec 16th, 6-15 pages (shorter is better, use your discretion), in any conference format. Email your final project to Joe: Hellerstein at cs.berkeley.edu
We have a discussion forum for papers. Please use it.
You will need ACM Digital Library access to download many of our readings. If you're not at a campus network address, you can use the UCB library proxy server or the campus VPN.
Course Description:
For most people, the phrase "Data Management" does not evoke images of a rewarding social environment. But many aspects of Data Management -- representation, capture, cleaning, integration, analysis -- are inherently collaborative, involving teams of designers, developers and/or analysts. So the processes of Data Management should enable and leverage social behavior much more than they traditionally have.
In recent years, some early "Web 2.0" style asynchronous discussion and collaboration sites have been developed for collaborative data analysis and curation. These include Many Eyes, Swivel, and Freebase, among others. These initial efforts at social data services have sparked interest in both the database and HCI communities, as well as in various "user" communities.
These early services are relatively simple, but suggestive of more substantial things to come. Given the masses of facts and figures available in modern society, and the difficulty of making sense of disparate datasets in a fully automatic way, how can software help users -- particularly groups of users -- improve their collective understanding and extract additional value from data?
This seminar is intended to explore research relevant to the topic of collaborative data management. It will begin with readings and lectures on technologies that are relevant to multi-party data management, including data cleaning, data integration, data visualization and information extraction. With this background, the course will move into a more collaborative phase in which we jointly choose papers to read and discuss as a group.
Students in the course will be expected to stay current with readings, participate vigorously in class discussion, present papers to the group, and engage in a research project relevant to the course.