The goals of this class are to learn three things: 1) big data platform management, including the Hadoop ecosystem; 2) big data visualization and analytics tools; 3) basic concepts of parallel and distributed computing.
The first priority is processing big data with the Hadoop ecosystem and its tools, on which we will spend about half of the time. The second is distributed computing programming, which we discuss throughout, since HDFS relies on it. The third is the Elastic ecosystem, including processing pipelines and visualization, which connects back to the first part.
We will learn mostly by doing. There will NOT be much theory, but we will cover some concepts and design principles as we go.
Students will practice the skill set and get used to the ecosystem so that in the future they can adapt to similar technologies on their own.
I am confident that by the end of the course, students will have strengthened their problem-solving skills and their ability to learn new things.
Note: there will not be many TAs available to help you with setup.
You will have to operate your VM yourself.
Most weeks you will have a hands-on or programming lab to finish, which may take around 1.5-2.0 hours depending on your skill. You may spend more time exploring solutions and studying/understanding them, though.
The main principle is to learn how it works and understand why; not just copy and paste, since that won't improve your knowledge.
This class is not suitable for:
1) Those who do not want to learn by practicing, or who are not interested in system administration
2) Those who expect to learn a lot of theory.
3) Those who do not like programming. (We can practice, though, if your background is not strong.)
4) Those who expect individual help, e.g. with UNIX command-line usage, setup problems, or syntax errors, since we do not have enough TAs.
Learning strategy:
- In-class meetings are for lectures and assignment discussions.
- Students are required to manage their VMs themselves.
- Homework is due every 2-3 weeks, and quizzes (using Quizizz) every 3 weeks.
(programming assignments along with the VM setup)
0. What is big data? Intro slide: ecosystem <video lecture>
1. Introduction to HDFS and Hadoop ecosystem
Hadoop installation guide (single node)
HDFS commands cheat sheet (1) (2)
2. MapReduce Concepts and Wordcount program
<slide wordcount> <video lecture>
-- Running setup: install Hadoop either from scratch or via Cloudera, then run the Hadoop wordcount <slide>
Installation guide, multi-node <optional>
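The map, shuffle, and reduce phases of the wordcount program can be sketched in plain Python. This is a toy in-process simulation to show the dataflow, not the Hadoop API; the real job runs these phases in parallel across HDFS blocks:

```python
from collections import defaultdict

def map_phase(line):
    # Mapper: emit a (word, 1) pair for every word in an input line.
    return [(word.lower(), 1) for word in line.split()]

def shuffle_phase(pairs):
    # Shuffle/sort: group all values by key, as Hadoop does
    # between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reducer: sum the counts for one word.
    return key, sum(values)

lines = ["the quick brown fox", "the lazy dog", "the fox"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle_phase(mapped).items())
print(counts["the"])  # 3
print(counts["fox"])  # 2
```

Understanding this three-phase flow makes the real Java/streaming wordcount job much easier to read.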
3. Data store examples on HDFS: Hive, HBase, Pig
Installation
Lectures: Hive and Pig; HBase
Videos: HBase; Hive and Pig
Tools:
Parquet intro: notebook1, notebook2
Hive SQL Command Reference:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL
HBase:
Pig:
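HBase stores data as a sparse, sorted map: row key, then column family, then column qualifier, then value. A toy Python sketch of that layout (the table contents and names like "info" are invented examples, not a real schema):

```python
# Toy model of HBase's storage layout: a nested map keyed by
# row key, then column family, then column qualifier.
# The row keys and column names here are invented for illustration.
table = {}

def put(row, family, qualifier, value):
    table.setdefault(row, {}).setdefault(family, {})[qualifier] = value

def get(row, family, qualifier):
    return table.get(row, {}).get(family, {}).get(qualifier)

put("user001", "info", "name", "Alice")
put("user001", "info", "email", "alice@example.com")
put("user002", "info", "name", "Bob")

print(get("user001", "info", "name"))  # Alice
# HBase keeps rows sorted by row key; range scans over contiguous
# keys are the natural access pattern.
scan = sorted(table)
```

The sparseness matters: a row only stores the qualifiers it actually has, which is why HBase suits wide, irregular datasets that would waste space in a fixed-schema table.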
4. Spark Ecosystem: Pyspark, SparkML, Streaming with Spark, GraphFrame
Spark RDD video lecture and demo
The current version is documented at the official page.
Spark cluster setup (assume you have hdfs)
GraphX
SparkML
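Spark RDD transformations such as mapValues and reduceByKey are chained operations over key-value data. Without a cluster, the same dataflow can be sketched in plain Python (this is a simulation of the pattern, not the PySpark API; the sensor data is made up):

```python
from functools import reduce
from itertools import groupby

# Simulated RDD pipeline: (sensor, reading) pairs -> average per sensor.
# Rough PySpark equivalent:
#   rdd.mapValues(lambda v: (v, 1))
#      .reduceByKey(lambda a, b: (a[0] + b[0], a[1] + b[1]))
#      .mapValues(lambda s: s[0] / s[1])
data = [("s1", 10.0), ("s2", 4.0), ("s1", 20.0), ("s2", 6.0)]

# mapValues: pair each reading with a count of 1
pairs = [(k, (v, 1)) for k, v in data]

# reduceByKey: combine (sum, count) per key
grouped = groupby(sorted(pairs), key=lambda kv: kv[0])
sums = {k: reduce(lambda a, b: (a[0] + b[0], a[1] + b[1]),
                  (v for _, v in kvs))
        for k, kvs in grouped}

# mapValues: sum / count -> average
averages = {k: s / c for k, (s, c) in sums.items()}
print(averages)  # {'s1': 15.0, 's2': 5.0}
```

Computing (sum, count) pairs instead of averaging directly is the key trick: sums and counts combine associatively across partitions, while averages do not.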
5. Messaging service with Kafka
(optional MQTT & Python)
Kafka cluster installation guide
** A full running system at this point should have HDFS, Hive, HBase, and Kafka.
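Kafka decouples producers from consumers: producers append messages to a topic log, and each consumer reads at its own offset. A toy in-process sketch of that idea in plain Python (this illustrates the model only; it is not the Kafka client API, and the event data is invented):

```python
# Toy sketch of Kafka's core model: an append-only topic log,
# with each consumer tracking its own offset into it.
class Topic:
    def __init__(self):
        self.log = []                 # append-only message log

    def produce(self, message):
        self.log.append(message)

class Consumer:
    def __init__(self, topic):
        self.topic = topic
        self.offset = 0               # each consumer keeps its own position

    def poll(self):
        # Return all messages published since the last poll.
        messages = self.topic.log[self.offset:]
        self.offset = len(self.topic.log)
        return messages

events = Topic()
events.produce({"sensor": "s1", "value": 21.5})
events.produce({"sensor": "s2", "value": 19.0})

dashboard = Consumer(events)
archiver = Consumer(events)
seen_dashboard = dashboard.poll()
seen_archiver = archiver.poll()
print(len(seen_dashboard))  # 2 -- each consumer sees every message
```

Because consuming never removes messages from the log, many independent consumers (a dashboard, an archiver, a Spark streaming job) can read the same topic at their own pace.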
6. Elasticsearch ecosystem (ELK)
Elasticsearch, Filebeat, Logstash, Kibana
Their connectivity to Spark and Kafka
ELK Stack install (or installation guide)
Video lecture 2019 <filebeat logstash elastic kibana demo>
Elastic & Kibana cluster setup (video)
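Elasticsearch's bulk API expects newline-delimited JSON: an action line, then the document itself, with a trailing newline at the end. A small sketch that builds such a payload in Python (the index name "logs" and the documents are example values; actually sending it would be an HTTP POST to the _bulk endpoint, which is not shown here):

```python
import json

def build_bulk_payload(index, docs):
    # Each document becomes two NDJSON lines: an action line
    # telling Elasticsearch what to do, then the document source.
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the payload to end with a newline.
    return "\n".join(lines) + "\n"

docs = [
    {"level": "INFO", "message": "service started"},
    {"level": "ERROR", "message": "disk full"},
]
payload = build_bulk_payload("logs", docs)
print(payload)
```

Tools like Filebeat and Logstash produce exactly this kind of bulk traffic under the hood, so recognizing the format helps when debugging an ingestion pipeline.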
7. Certifications (potential)
-AWS Data Engineering / TBA
-NVIDIA Data Engineering Pipeline / TBA
8. Demos
9. Misc topics
TaskFlow
https://airflow.apache.org/docs/apache-airflow/stable/tutorial/pipeline.html
Opensearch
https://opensearch.org/downloads.html
login: bigdata2024
password: bigdata1234
Before transferring or installing any files, don't forget to authenticate via KUWIN: http://login.ku.ac.th/
Homework: MapReduce
Assignments 60%
Quizzes, exams (in-class and/or take-home) 40%