Problem#1: Caused by: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.LongWritable, recieved org.apache.hadoop.io.Text ("recieved" is misspelled in Hadoop's own error message).
Sol: Add the following to the driver class. The job's output key class defaults to LongWritable, but our mapper emits Text, so the output types must be declared explicitly:
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
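For context, a minimal driver sketch showing where these calls belong (MyMapper and MyReducer are placeholder class names, not from the original job):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "text-key job");
        job.setJarByClass(MyDriver.class);
        job.setMapperClass(MyMapper.class);    // hypothetical mapper emitting (Text, Text)
        job.setReducerClass(MyReducer.class);  // hypothetical reducer
        // Without these two lines the framework assumes LongWritable keys
        // and throws the type-mismatch IOException from Problem#1.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}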
Use the Hadoop client to submit jobs to the Hadoop cluster.
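For example (jar name, driver class, and paths are placeholders):
hadoop jar myjob.jar com.example.MyDriver /input/path /output/path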
Problem#2: Connection refused error while running hadoop dfs -ls from the web server (192.168.56.40) against the Hadoop cluster NameNode (192.168.56.21).
Sol: the entry added in /etc/hosts was incorrect: the NameNode's hostname was mapped to an internal IP that the web server also used, so the hostname resolved locally instead of to the NameNode.
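For example, on the web server the NameNode's hostname should map to its reachable address (the hostname below is a placeholder):
192.168.56.21   namenode-host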
# How to list jobs and their status:
mapred job -list
mapred job -status <jobId> ## shows progress and status details of the job
# How to kill a job:
mapred job -kill <jobId>
# How to change log levels:
JobConf conf = new JobConf();
...
conf.set("mapreduce.map.log.level", "DEBUG");
conf.set("mapreduce.reduce.log.level", "TRACE");
...
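The same properties also work with the newer org.apache.hadoop.mapreduce API; a minimal sketch (job name is a placeholder):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

Configuration conf = new Configuration();
// Levels follow log4j: TRACE, DEBUG, INFO, WARN, ERROR, FATAL
conf.set("mapreduce.map.log.level", "DEBUG");     // map task JVM log level
conf.set("mapreduce.reduce.log.level", "TRACE");  // reduce task JVM log level
Job job = Job.getInstance(conf, "verbose-logging job");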
# Use of ChainMapper and ChainReducer (run several mappers in sequence around a single reducer within one job); see the sketch below.
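A minimal sketch using the new-API chain classes (TokenizerMapper, UpperCaseMapper, and SumReducer are hypothetical; each mapper's input types must match the previous stage's output types):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.chain.ChainMapper;
import org.apache.hadoop.mapreduce.lib.chain.ChainReducer;

Job job = Job.getInstance(new Configuration(), "chain example");

// First mapper: (LongWritable, Text) -> (Text, IntWritable)
ChainMapper.addMapper(job, TokenizerMapper.class,
        LongWritable.class, Text.class, Text.class, IntWritable.class,
        new Configuration(false));

// Second mapper consumes the first mapper's output: (Text, IntWritable) -> (Text, IntWritable)
ChainMapper.addMapper(job, UpperCaseMapper.class,
        Text.class, IntWritable.class, Text.class, IntWritable.class,
        new Configuration(false));

// Single reducer at the end of the map chain
ChainReducer.setReducer(job, SumReducer.class,
        Text.class, IntWritable.class, Text.class, IntWritable.class,
        new Configuration(false));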
Multiple mapper (multiple input files) example: http://dailyhadoopsoup.blogspot.in/2014/01/mutiple-input-files-in-mapreduce-easy.html
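Roughly, the pattern is MultipleInputs, where each input path gets its own mapper and all mappers feed the same reducer (paths and mapper classes below are placeholders):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

// Each input path is read by its own mapper; both feed the job's single reducer.
MultipleInputs.addInputPath(job, new Path("/data/fileA"), TextInputFormat.class, FileAMapper.class);
MultipleInputs.addInputPath(job, new Path("/data/fileB"), TextInputFormat.class, FileBMapper.class);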