Our MapReduce implementation mainly follows the original MapReduce paper. The system tolerates worker faults and supports multithreaded execution.
Figure 1: Execution overview (from the MapReduce paper)
We use RPC to communicate between machines.
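As a minimal sketch of what that RPC layer can look like in Go's standard net/rpc package (the type and method names here are our illustration, not necessarily the real ones; net.Pipe stands in for a TCP connection so the sketch runs self-contained):

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
)

// Hypothetical RPC message types for a worker asking the coordinator for work.
type GetTaskArgs struct{ WorkerID int }
type GetTaskReply struct {
	TaskID int
	Kind   string // "map" or "reduce"
}

type Coordinator struct{}

// GetTask is an exported method with the (args, reply) error signature,
// so net/rpc can serve it remotely.
func (c *Coordinator) GetTask(args *GetTaskArgs, reply *GetTaskReply) error {
	reply.TaskID = 42 // placeholder assignment logic
	reply.Kind = "map"
	return nil
}

func main() {
	srv := rpc.NewServer()
	srv.Register(new(Coordinator))

	// net.Pipe gives an in-memory connection so this sketch runs without
	// sockets; the real system would listen on TCP instead.
	cliConn, srvConn := net.Pipe()
	go srv.ServeConn(srvConn)

	client := rpc.NewClient(cliConn)
	var reply GetTaskReply
	if err := client.Call("Coordinator.GetTask", &GetTaskArgs{WorkerID: 1}, &reply); err != nil {
		panic(err)
	}
	fmt.Println(reply.TaskID, reply.Kind) // 42 map
}
```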
As we all know, Google's MapReduce uses GFS as its underlying shared file system. In a car environment, however, that approach usually isn't applicable, and it wouldn't be wise for two reasons. First, it isn't secure: there may be files that car owners don't want exposed to others or the public for privacy reasons. Second, if a malicious file were shared for some reason and accidentally executed later, it could have a catastrophic impact on the car's computer. Therefore, we chose a point-to-point way to transfer files, using a module we found on GitHub that uses mDNS to transfer files within a local area network; a file can be downloaded simply by presenting a string. Both sending and receiving block until they succeed, so our MapReduce integrates this property by using multithreading on the sender side while blocking on the receiver side. Because a file is fetched by a string, we use this to provide another security feature: we treat the string as a password, so only a receiver that knows the password can download the file.
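We won't reproduce the mDNS module's actual API here, but the blocking, password-keyed behavior can be modeled in-process as a sketch: a sender publishes bytes under a password string (from its own goroutine, as in our sender side), and a receiver blocks until a matching send arrives. The transferHub type is our own stand-in, not the real module:

```go
package main

import (
	"fmt"
	"sync"
)

// transferHub is an in-process stand-in for the mDNS file-transfer module:
// files are published under a password string, and Receive blocks until the
// matching Send arrives (mirroring the real module's blocking behavior).
type transferHub struct {
	mu    sync.Mutex
	files map[string][]byte
	ready map[string]chan struct{}
}

func newTransferHub() *transferHub {
	return &transferHub{
		files: make(map[string][]byte),
		ready: make(map[string]chan struct{}),
	}
}

// waitChan must be called with h.mu held.
func (h *transferHub) waitChan(password string) chan struct{} {
	if _, ok := h.ready[password]; !ok {
		h.ready[password] = make(chan struct{})
	}
	return h.ready[password]
}

// Send publishes data under a password and unblocks any waiting receiver.
func (h *transferHub) Send(password string, data []byte) {
	h.mu.Lock()
	h.files[password] = data
	ch := h.waitChan(password)
	h.mu.Unlock()
	close(ch)
}

// Receive blocks until a file is available under the given password; a
// receiver without the right password never sees the file.
func (h *transferHub) Receive(password string) []byte {
	h.mu.Lock()
	ch := h.waitChan(password)
	h.mu.Unlock()
	<-ch
	h.mu.Lock()
	defer h.mu.Unlock()
	return h.files[password]
}

func main() {
	hub := newTransferHub()
	// Sender runs in its own goroutine, as in our MapReduce's sender side.
	go hub.Send("s3cret", []byte("intermediate data"))
	// Receiver blocks until the send completes, as on our receiver side.
	fmt.Println(string(hub.Receive("s3cret"))) // intermediate data
}
```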
Future work for file transfer: implement a file-transfer module using RPC between machines. You could implement system calls like open, read, write, and close, which we believe would help a lot, and wrap them up as send(IP address) and receive(IP address).
Let's talk about MapReduce now. The coordinator keeps a mapTasks queue and a reduceTasks queue. Each task has four states: unassigned, assigned, running, and finished. Each worker completes a task through seven phases: GetRole, GetTask, DownloadFiles, StartTask, RunningTask (executing either a map task or a reduce task), SendFiles, and TaskDone.
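Assuming a Go implementation (names here are illustrative, not our exact ones), the four task states fit naturally into a small enum that the coordinator tracks per task:

```go
package main

import "fmt"

// TaskState covers the four states a task moves through.
type TaskState int

const (
	Unassigned TaskState = iota
	Assigned
	Running
	Finished
)

func (s TaskState) String() string {
	return [...]string{"unassigned", "assigned", "running", "finished"}[s]
}

// Task is a single map or reduce task in one of the two queues.
type Task struct {
	ID    int
	Kind  string // "map" or "reduce"
	State TaskState
}

func main() {
	t := Task{ID: 0, Kind: "map", State: Unassigned}
	t.State = Assigned // after GetRole: the coordinator picked a worker
	t.State = Running  // after StartTask: the worker reported it began
	t.State = Finished // after TaskDone: results are back
	fmt.Println(t.State) // finished
}
```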
After calling GetRole, the worker gets all the information it needs: its ID, role, uploadPassword, downloadPassword, etc. The coordinator assigns the task to this worker, and under normal operation only this worker will execute it. The task's state therefore transitions to assigned.
Later on, the worker machine asks the coordinator to send out the files, which is GetTask, then waits to receive them. Once the files arrive, the worker calls StartTask to notify the coordinator that it has just started and to create a timestamp for the task. At this point, the coordinator changes the task's state to running.
Then the worker is ready to execute the task, because it now has everything it needs based on its role and files. After finishing the task, it sends the result back to the coordinator and notifies it that the task is done. The task's lifespan ends there.
So we mentioned that our MapReduce provides fault tolerance, but how did we make that happen? Simply put, we used two tricks. The first is a timeout: each task has a time limit. A background service periodically checks every task in the assigned or running state, and if a task exceeds its limit, we reset it to the unassigned state so that another worker can pick it up later.
The second trick is the design itself: each worker keeps asking the coordinator for a new task after finishing its current one, until the whole job is finished. Therefore, even if some worker machines crash, the job continues to execute.
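That design is just the shape of the worker's main loop. Sketched below with a hypothetical askCoordinator callback standing in for the real RPC phases (GetRole through TaskDone):

```go
package main

import "fmt"

// workerLoop keeps pulling tasks until the coordinator says the job is done.
// askCoordinator and execute are hypothetical stand-ins for the real RPC
// phases, used here only to show the loop's shape.
func workerLoop(askCoordinator func() (task int, jobDone bool), execute func(int)) int {
	done := 0
	for {
		task, jobDone := askCoordinator()
		if jobDone {
			return done // job finished; this worker exits
		}
		execute(task)
		done++
	}
}

func main() {
	// Fake coordinator handing out three tasks, then reporting the job done.
	queue := []int{0, 1, 2}
	ask := func() (int, bool) {
		if len(queue) == 0 {
			return 0, true
		}
		t := queue[0]
		queue = queue[1:]
		return t, false
	}
	n := workerLoop(ask, func(task int) { /* run the map/reduce work here */ })
	fmt.Println(n) // 3
}
```

Because idle workers always come back for more, a crashed worker's timed-out task is simply handed to whichever worker asks next.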
You may already have noticed a bottleneck in the current version of MapReduce: all file transfers go through the coordinator. One thing you can improve is sending the files from the mappers to the reducers directly.
On top of the coordinator program sits the client program. It sends out the program, sets up the environment, and cleans up the mess after execution is done. On top of the worker program, we created a daemon program: it waits for requests from the coordinator, sets up the working environment, starts the worker program, and cleans up all the temporary files and folders created during execution.
Yes, our MapReduce is as simple as that. We made it work, and we believe you can push it further and make it better, for example by:
Implementing your own file-transferring module.
Resolving the bottleneck of the coordinator being the center of all file transfers.