Exercises and what you need to know to do them
Please read and follow the set-up instructions (addendum for Athena users) before attempting these exercises! Printed copies of the instructions will be available in class. Hadoop documentation is available here.
Exercise 1 - WordCount
The basic WordCount exercise is part of the set-up instructions and is intended to help you learn to interact with the cluster. For those who finish early and are eager to write their own MapReduces, it includes some possible extensions.
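For orientation, the idea behind WordCount can be sketched in plain Python. This is only an illustrative in-memory simulation, not the Hadoop API used in the exercise: the names `map_words`, `reduce_counts`, and `run_mapreduce`, as well as the sample input, are hypothetical.

```python
from collections import defaultdict


def map_words(_key, line):
    """Map step: emit (word, 1) for every whitespace-separated token."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Reduce step: sum the partial counts collected for one word."""
    return word, sum(counts)


def run_mapreduce(records, mapper, reducer):
    """Toy driver: map every record, group values by key, then reduce."""
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    # Sorting mimics the sorted key order a reducer would typically see.
    return dict(reducer(k, vs) for k, vs in sorted(groups.items()))


# Hypothetical input: (line number, line text) pairs.
lines = [(0, "the quick brown fox"), (1, "the lazy dog"), (2, "The fox")]
word_counts = run_mapreduce(lines, map_words, reduce_counts)
print(word_counts)
```

On a real cluster the grouping step is performed by the framework's shuffle phase; here it is simulated with a dictionary so the sketch can run on its own.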
Exercise 2 - PageRank
In this exercise, you will write MapReduces from scratch to compute the PageRank of Wikipedia articles.
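As a rough illustration of how PageRank fits the MapReduce model, the sketch below runs the iteration in memory on a three-page toy graph. It is not the exercise's solution or the Hadoop API: the damping factor of 0.85, the formula with the (1 - d) / N term (one common formulation), the function names, and the graph itself are all assumptions made for the example. Pages with no outgoing links are not given any special treatment here.

```python
from collections import defaultdict

DAMPING = 0.85  # assumed value; the exercise may specify a different one


def map_page(page, state):
    """Map step: pass the link list through and share rank among out-links."""
    rank, links = state
    yield page, ("links", links)
    for target in links:
        yield target, ("share", rank / len(links))


def reduce_page(page, values, num_pages):
    """Reduce step: rebuild the link list and combine the incoming shares."""
    links = []
    total = 0.0
    for kind, payload in values:
        if kind == "links":
            links = payload
        else:
            total += payload
    rank = (1 - DAMPING) / num_pages + DAMPING * total
    return page, (rank, links)


def pagerank_iteration(graph):
    """Toy driver for one iteration: map, group by page, reduce."""
    groups = defaultdict(list)
    for page, state in graph.items():
        for key, value in map_page(page, state):
            groups[key].append(value)
    n = len(graph)
    return dict(reduce_page(p, vs, n) for p, vs in groups.items())


# Hypothetical graph: page -> (initial rank, outgoing links).
graph = {"A": (1 / 3, ["B", "C"]), "B": (1 / 3, ["C"]), "C": (1 / 3, ["A"])}
for _ in range(20):
    graph = pagerank_iteration(graph)
ranks = {page: rank for page, (rank, _links) in graph.items()}
print(ranks)
```

Each pass of the loop corresponds to one MapReduce job whose output feeds the next, which is why the mapper re-emits the link list alongside the rank shares.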
Exercise 3 - Word Context Entropy
This exercise aims to teach basic MapReduce optimization techniques. You will be given a complete, albeit thoroughly impractical, implementation and will apply to it a number of optimizations you have seen in the lectures.
This is an optional exercise for those who have finished the first three. Details to follow shortly.
Unless otherwise noted, all materials on this website are created by Google, Inc. and are licensed under a Creative Commons Attribution 2.5 License.