CS 89.20/189: Data Science for Health (Spring 2020)

Logistics

  • Professor: Temiloluwa Prioleau

  • Time: 10A slot (Tues. & Thurs. 10:10am - 12N)

  • Join the class via Zoom

  • Office Hours: Tues. & Thurs. 1:30pm - 3pm

    • Request a time using this link (preferred) or by email.

    • Zoom link for office hours is in Canvas on the home page

Course Overview

The importance of Data Science for Health is apparent given today's global pandemic of COVID-19, also known as coronavirus disease. As of March 25, 2020, there are 414,179 confirmed cases of COVID-19 worldwide, with 40,712 new cases reported in the last 24 hours (see WHO for more details).

There are many untapped opportunities in the use of data as a source of knowledge to inform health decisions. In this course, we will cover state-of-the-art methods for data acquisition and analysis in a range of application domains such as infectious disease, cancer, diabetes, and mental health. Students will develop their skills by reading, presenting, and critiquing seminal research papers. The course will also include assignments and a group project to reinforce concepts and methods widely used in data science. A formal syllabus is available here.

Note that this course will be conducted as a seminar (i.e., there are no formal lectures). A large amount of self-learning is required. This includes digging into the details of the data science methods presented in the readings to understand when and how to use them in other applications.

Prerequisites

  • COSC 74 (Machine Learning and Statistical Data Analysis) or instructor's permission

  • Experience with Python for data wrangling (COSC 1 alone is not sufficient)

Course Goals

  1. Build data science knowledge from reading (research papers & blogs)

  2. Practice with assignments and a project (all done in Jupyter Notebook)

  3. Tell others about it by writing a publishable paper and presenting your work

What is Data Science?

“The ability to take data — to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it”

~Hal Varian~

Class Structure

Every class period will include discussions of two seminal research papers on various Data Science for Health topics from the weekly reading list. Each paper is presented by a student. The presentation should be 45 minutes long and follow the 15-15-15 rule for method-application-discussion. Note that this structure may not be feasible for all research papers (e.g., review papers).

  1. The first 15 minutes focus on a single data science method discussed in the paper. The goal is to teach the audience about this method. Feel free to reference additional resources as needed.

  2. The second 15 minutes highlight key points from the research paper, which represents one of many application areas that can benefit from the data science method.

  3. The last 15 minutes are for open discussion led by the presenter. Consider providing specific questions to guide discussion on the topic.

An example presentation title could be "Deep Neural Network & Skin Cancer Detection". Every student should sign up for one presentation slot via the link on Canvas.

Important:

  • Given that this class is being taught virtually, students will be expected to present live or pre-record their presentation. Pre-recorded videos must be available at least 24 hours before the class time. The video link will be shared by the instructor via Canvas. Each student should make appropriate arrangements to ensure they can facilitate the in-class discussion, either live (preferred when feasible) or by including in their video the questions that should be discussed during class time.

  • If a student finds the research paper they originally chose to present uninteresting or ill-suited to a strong presentation, it is acceptable to switch to any other unassigned research paper, either from the weekly list or found by searching online. The presentation date should not change, but the paper being presented can change if necessary. Any such change request should be communicated to the instructor as soon as possible.

  • Students are required to submit the slide deck used for their presentation via Canvas. This is due on the day of the class presentation.

Useful Resources

  1. Reference Textbooks

  2. Relevant Websites and Blogs

  3. Keshav 2007: How to Read a Paper, ACM SIGCOMM Computer Communication Review, 37(3), 83-84.

  4. Writing Your Own Paper