Post date: Jan 1, 2014 10:38:18 PM
I've been collecting names and resources at CMU that may be of use to folks in this course. Here's a brief list:
CMU recently held a panel discussion on data science as part of the inauguration of our new president, Subra Suresh. CMU has also launched a website highlighting the university's interdisciplinary approach to data science, including various educational programs that support prospective data scientists. Many of the faculty who work in the area are in programs listed on that CMU page, including folks in VLIS, ML, Heinz (public policy and information management), Business Analytics (Tepper), Language Technologies, Education, and Human-Computer Interaction. I am hoping to draw guest lecturers to the course from several of these areas.
Of course, those working with data at CMU are not limited to the above programs. For example, you may want to take a deeper look at Golan Levin (Art Practice, and director of the Studio for Creative Inquiry, with a long history of interesting digital visualization work and interactive art); Andy Pavlo (a new member of CSD with interests in database management systems, specifically main memory systems, non-relational systems (NoSQL), transaction processing systems (NewSQL), and large-scale data analytics); Nick Sahinidis (Chemical Engineering, working on the ALAMO black-box simulation and modeling system); the many great projects of the CREATE Lab (e.g. TimeLapse and FluxStream); Carolyn Rosé's work on NLP discourse analysis in health and education, and her highly complementary course on machine learning in practice; Noah Smith's work on statistical modeling for measuring the proportions in which politicians evoke different ideologies in campaign speeches, extracting international relations events from news text, and relating campaign contributions to politicians' remarks on Twitter; Jason Hong, whose work on Livehoods has helped to reconstruct our view of cities using social media data; and Daniel Neill, who develops novel statistical and computational methods for discovering emerging events and other relevant patterns in complex and massive datasets, applied to real-world policy problems ranging from medicine and public health to law enforcement and security.