Mr.IAP Course Exercises

Exercises and what you need to know to do them

Please read and follow the set-up instructions (addendum for Athena users) before attempting these exercises! Printed copies of the instructions will be available in class. Hadoop documentation is available here.

Exercise 1 - WordCount

The basic WordCount exercise is part of the set-up instructions and is intended to help you learn to interact with the cluster. For those who finish early and are eager to write their own MapReduce jobs, it includes some possible extensions.
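For orientation, the idea behind WordCount can be sketched in a few lines of plain Python that imitate the map, shuffle, and reduce phases in memory. This is only an illustration, not the Hadoop code used in the exercise: the function names (`map_words`, `reduce_counts`, `run_wordcount`) are made up for this sketch, and lowercasing and splitting on whitespace are assumptions about how words are tokenized.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every whitespace-separated token."""
    for word in line.split():
        # Assumption: counts are case-insensitive.
        yield word.lower(), 1


def reduce_counts(word, counts):
    """Reduce step: sum all the counts emitted for one word."""
    return word, sum(counts)


def run_wordcount(lines):
    """Imitate the framework: group mapper output by key, then reduce each group."""
    grouped = defaultdict(list)
    for line in lines:
        for word, count in map_words(line):
            grouped[word].append(count)
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    print(run_wordcount(["the quick brown fox", "the lazy dog", "The end"]))
```

On a real cluster the grouping step is performed by the framework between the map and reduce phases; here a dictionary stands in for it.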

Exercise 2 - PageRank

In this exercise, you will write MapReduce jobs from scratch to compute the PageRank of Wikipedia articles.
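As a rough guide to how PageRank fits the MapReduce model, one iteration can be simulated in memory as below. This is a sketch under stated assumptions, not the expected solution: the damping factor of 0.85, the dictionary-based graph format, and all function names are illustrative choices. It also assumes that every page appears as a key in the graph and ignores the rank lost through pages with no outgoing links.

```python
from collections import defaultdict

DAMPING = 0.85  # Assumption: the commonly used damping factor.


def map_page(page, rank, out_links):
    """Map step: pass the link list along and split the rank among the targets."""
    yield page, ("links", out_links)
    for target in out_links:
        yield target, ("rank", rank / len(out_links))


def reduce_page(page, values, num_pages):
    """Reduce step: sum the incoming rank shares and apply the damping formula."""
    out_links = []
    incoming = 0.0
    for kind, payload in values:
        if kind == "links":
            out_links = payload
        else:
            incoming += payload
    new_rank = (1 - DAMPING) / num_pages + DAMPING * incoming
    return page, new_rank, out_links


def pagerank_iteration(graph, ranks):
    """Run one map/shuffle/reduce round over a {page: [out_links]} graph."""
    grouped = defaultdict(list)
    for page, out_links in graph.items():
        for key, value in map_page(page, ranks[page], out_links):
            grouped[key].append(value)
    new_ranks = {}
    for page, values in grouped.items():
        _, new_rank, _ = reduce_page(page, values, len(graph))
        new_ranks[page] = new_rank
    return new_ranks


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    ranks = {page: 1 / len(graph) for page in graph}
    for _ in range(20):
        ranks = pagerank_iteration(graph, ranks)
    print(ranks)
```

The mapper emits the link list alongside the rank shares so that the reducer can rebuild each page's record for the next iteration; a full job would chain such iterations until the ranks stop changing noticeably.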

Exercise 3 - Word Context Entropy

This exercise aims to teach basic MapReduce optimization techniques. You will be given a complete, albeit thoroughly impractical, implementation and will apply to it a number of the optimizations you've seen in the lectures.

Exercise 4 - k-means clustering

This is an optional exercise for those who have finished the first three. Details to follow shortly.