Data Science for Managers

5-Day Continuing Education Course

September 3 - 7, 2018

College of Management of Technology, EPFL

Lausanne, Switzerland


Course offered in conjunction with EC H2020 AEGIS project on Big Data

Instructors

Prof. Kenneth Younge, Head of the Technology and Innovation Strategy Lab, EPFL

Prof. Chris Tucci, Head of the Corporate Strategy and Innovation Lab, EPFL

Dr. Omid Shahmirzadi, Post-Doc in Computer Science, Instructor for Practicals and Exercises

Fee

CHF 2'800 / participant (fee includes all course materials, 5 lunches, and refreshments)

A certificate of attendance will be awarded at the end of the course

10% special discount for contributing members of EPFL Alumni

Registration

Registration is now closed. If you are interested in this program, please contact Cristina Martinucci (Cristina.martinucci@epfl.ch) and we will put you on the waiting list.

Overview


Why take this course?

In 2012, the Harvard Business Review called data scientist the sexiest job of the 21st century. In 2015, McKinsey projected that “by 2018, the U.S. alone may face a 50-60 percent gap between supply and requisite demand of deep analytic talent”. Now in 2018, we see a revolution in data-driven business models, automation, and analysis by the fastest-growing companies. Why has data science become so critical to organizations?

Today, massive amounts of data are available in all areas of science, government, and industry. Exploited sensibly, such raw data can significantly improve the efficiency of research, services, and industries in fields such as healthcare, engineering, finance, telecommunications, and urban development, to name just a few. But before organizations can build teams that in turn build solutions, managers need to better understand the methods: why they work, why they sometimes do not work, and how to integrate them into planning. We believe managers will make better decisions, and work better with data science teams, when they have first-hand experience working across a range of such methods.


Objectives

- Learn the foundational methods of data science

- Understand how companies such as Google, Facebook, Amazon, and IBM use such techniques

- Learn how to apply data analysis algorithms to real-world datasets (i.e., learn some programming)

- Understand the implications of data-driven business models and strategic planning


Target Audience

Professionals who want to understand Data Science methods and tools, so they can better manage technical experts in the field.


Requirements

No prior training in Data Science is required, although participants should be familiar with:

- Introductory statistics

- Basic knowledge of linear algebra

- Experience with a computer programming language

- Familiarity with Python will be helpful (examples and exercises are in Python)

- The course will be given in English

- Participants should bring their own laptop to use during the course


Topics

The course will pursue a balance between theory and practice, including visualizations, coding demonstrations, case studies, and assignments. It is assumed that many managers will be out of practice with mathematical skills; the course therefore emphasizes mathematical visualizations and logical thinking over complicated mathematical manipulations.

- Measurement and Sampling

- Data Preprocessing

- Linear Models

- Model Evaluation

- Similarity

- Clustering

- Decision Trees

- Support Vector Machines

- Neural Networks

- Text Data (NLP, Neural Embeddings, Topic Modeling)

- Big Data Storage (SQL, NoSQL, Storage)

- Big Data Processing (Cloud Computing, Containers, VMs, Map/Reduce, Spark)

- Big Data Services (IoT, image recognition, monitoring, ...)

- Data-Driven Business Models

- Strategic Planning


Schedule

The week-long course is divided between three full days in the classroom and two full days of self-directed programming and in-depth review. Each of the three classroom days will include comprehensive lectures, followed by a practical session and demonstration. The two in-between days are scheduled for working on assignments, followed by an in-depth review at the end of the day. Participants will work to put the ideas and methods of the course into practice with real, functioning Python programs. You will leave the course able to build, evaluate, and work with real data and real data science models.


Monday - Sept 3, 2018

8:30 - 9:30 WELCOME Introductions and Setup

9:30 - 11:30 LECTURE 1 Core Concepts

11:30 - 13:00 Lunch Q & A

13:00 - 16:00 LECTURE 2 Basic Models

16:00 - 17:00 Break Q & A

17:00 - 18:00 SETUP Python and Jupyter Lab for Data Science

18:00 - 19:00 DEMO 0 Predict Probability of Credit Default (Simple)

Tuesday - Sept 4, 2018

8:30 - 11:30 LECTURE 3 Similarity, Clustering, Trees, Ensembles

11:30 - 13:00 Lunch Q & A

13:00 - 14:00 DEMO 1 Customer Segmentation

14:00 - 15:00 Break Q & A

15:00 - 19:00 PROBLEM 1 Predict Probability of Credit Default (Complex)

Wednesday - Sept 5, 2018

8:30 - 11:30 LECTURE 4 Support Vector Machines, Neural Nets

11:30 - 13:00 Lunch Q & A

13:00 - 14:00 DEMO 2 Image Recognition

14:00 - 15:00 Break Q & A

15:00 - 19:00 PROBLEM 2 Predict Sales Price for Houses

Thursday - Sept 6, 2018

8:30 - 11:30 LECTURE 5 Text Analysis (NLP, Topic Modeling)

11:30 - 13:00 Lunch Q & A

13:00 - 14:00 DEMO 3 Sentiment Prediction of IMDB Reviews

14:00 - 15:00 Break Q & A

15:00 - 19:00 PROBLEM 3 Pre-screen Funding Applications

Friday - Sept 7, 2018

8:30 - 11:00 LECTURE 6 Big Data (Cloud, Storage, Processing, Services)

11:00 - 11:30 DEMO 4 Analyzing Wikipedia Click Stream

11:30 - 13:00 Lunch Q & A

13:00 - 16:00 LECTURE 7 Business Models & Data-Driven Strategy

16:00 - 16:30 CONCLUSION Certificate delivered on conclusion