Data Science for Managers

5-Day Continuing Education Course

February 4 - 8, 2019

College of Management of Technology, EPFL

Lausanne, Switzerland

Course offered in conjunction with EC H2020 AEGIS project on Big Data

Instructors

Prof. Kenneth Younge, Head of the Technology and Innovation Strategy Lab, EPFL

Prof. Chris Tucci, Head of the Corporate Strategy and Innovation Lab, EPFL

Dr. Omid Shahmirzadi, Post-Doc in Computer Science

Fee

4'700 CHF per participant. Fee includes all course materials, 5 lunches, snacks, and refreshments

500 CHF discount for additional participants from the same company

10% discount (470 CHF) for contributing members of EPFL Alumni

Companies with several participants should contact Cristina Martinucci (Cristina.martinucci@epfl.ch) to coordinate invoicing.

Discounts cannot be combined.

Registration

Please click here to register.

Please note that if you start a "New Account" during registration but the system already recognizes your email address, the process will stop until a staff member can start the registration for you. Please contact Cristina Martinucci (Cristina.martinucci@epfl.ch) if you have any questions or problems with registration.

Certificate

A certificate of attendance will be awarded at the end of the course.

Overview


Why take this course?

Harvard Business Review has called data scientist the sexiest job of the 21st century, and McKinsey projects that AI could add $13 trillion to global economic output by 2030. The fastest-growing companies are already driving a revolution in data-driven business models, automation, and analysis.

Why has data science become so critical to organizations?

Today, massive amounts of data are available in all areas of science, government, and industry. Exploited sensibly, such raw data can significantly improve the efficiency of research, services, and industries in fields such as healthcare, engineering, finance, telecommunications, and urban development, to name just a few. But before organizations can build teams, and before those teams can build solutions, managers need to better understand the methods: why they work, why they sometimes do not work, and how to integrate them into planning. We believe managers will make better decisions, and work better with data science teams, when they have first-hand experience working across a range of such methods.


Objectives

- Learn the foundational concepts and methods of data science

- Learn how to apply data analysis algorithms to real-world data sets (i.e., learn some programming)

- Understand the implications of data-driven business models and strategic planning


Target Audience

Professionals and managers who want to understand data science methods and tools so that they can better manage data science projects and technical experts in the field.


Requirements

No prior training in data science is required, although participants should be familiar with:

- Introductory statistics

- Basic linear algebra

- At least one computer programming language

Some familiarity with Python is strongly recommended, as all demos and exercises are in Python.

The course will be given in English.

Participants should bring their own laptop to use during the course.


Topics

The course will pursue a balance between theory and practice, including visualizations, coding demonstrations, case studies, and assignments. Because many managers will be out of practice with mathematics, the course emphasizes mathematical visualizations and logical thinking over complicated mathematical manipulations.

- Measurement and Sampling

- Data Pre-processing

- Linear Models

- Model Evaluation

- Similarity

- Clustering

- Decision Trees

- Support Vector Machines

- Neural Networks

- Text Data (NLP, Neural Embeddings, Topic Modeling)

- Big Data Storage (SQL, NoSQL, Storage)

- Big Data Processing (Cloud Computing, Containers, VMs, MapReduce, Spark)

- Data-Driven Business Models

- Strategic Planning


Schedule

The week-long course is divided between three full days in the classroom and two full days of self-directed programming and in-depth review. Each of the three classroom days will include comprehensive lectures, followed by a practical session and demonstration. The two in-between days are scheduled for working on assignments, followed by an in-depth review at the end of the day. Participants will put the ideas and methods of the course into practice with real, functioning Python programs. You will leave the course able to work with real data and to build and evaluate real data science models.


Monday

8:30 - 9:30 WELCOME Introductions and Setup

9:30 - 11:30 LECTURE 1 Core Concepts

11:30 - 13:00 Lunch Q & A

13:00 - 16:00 LECTURE 2 Linear Models

16:00 - 17:00 Break Q & A

17:00 - 18:00 SETUP Python and Jupyter Lab for Data Science

18:00 - 19:00 DEMO 0 Predict Probability of Credit Default (Simple)

Tuesday

8:30 - 11:30 LECTURE 3 Similarity, Clustering, Trees, Ensembles

11:30 - 13:00 Lunch Q & A

13:00 - 14:00 DEMO 1 Customer Segmentation

14:00 - 15:00 Break Q & A

15:00 - 19:00 PROBLEM 1 Predict Probability of Credit Default (Complex)

Wednesday

8:30 - 11:30 LECTURE 4 Support Vector Machines, Neural Nets

11:30 - 13:00 Lunch Q & A

13:00 - 14:00 DEMO 2 Image Recognition

14:00 - 15:00 Break Q & A

15:00 - 19:00 PROBLEM 2 Predict Sales Price for Houses

Thursday

8:30 - 11:30 LECTURE 5 Text Analysis (NLP, Topic Modeling)

11:30 - 13:00 Lunch Q & A

13:00 - 14:00 DEMO 3 Sentiment Prediction of IMDB Reviews

14:00 - 15:00 Break Q & A

15:00 - 19:00 PROBLEM 3 Pre-screen Funding Applications

Friday

8:30 - 11:00 LECTURE 6 Big Data (Cloud, Storage, Processing, Services)

11:00 - 11:30 DEMO 4 Analyzing Wikipedia Click Stream

11:30 - 13:00 Lunch Q & A

13:00 - 16:00 LECTURE 7 Business Models & Data-Driven Strategy

16:00 - 16:30 CONCLUSION Certificate delivered on conclusion

Preparation for the Course

The course will require basic Python programming skills to load data, pre-process data, and move data to libraries for in-depth modeling and analysis. In practical sessions we will review all of the programming methods that you will need to use for the course, and TAs will be available throughout the course to help you with coding questions. Nevertheless, it is a good idea to prepare yourself with a basic understanding of the Python programming language and syntax before the start of class.

We have prepared a brief review of the core Python concepts that you will need in the course, which we will send to you in the weeks before the start of class. We will also review the most important aspects of Python programming at the beginning of the first practical session, but we recommend that you take time before the start of class to familiarize yourself with Python. We recommend the 4-hour tutorial by DataCamp and the comprehensive guide for Python from Python.org.