1. Copy the Hadoop package and job submission file from /scratch/00791/xwj/hadoop-test. The two files you need are hadoop-0.20.2-new.tar.gz and job.hadoop.new.
login1% cp /scratch/00791/xwj/hadoop-test/* ./
login1% ls
hadoop-0.20.2-new.tar.gz hadoop-0.20.2.tar.gz job.hadoop job.hadoop.new
2. Unpack hadoop-0.20.2-new.tar.gz. It will be extracted into the ./hadoop-0.20.2-new directory.
tar -xzvf hadoop-0.20.2-new.tar.gz
3. Set the following environment variables:
setenv JAVA_HOME /share/apps/teragrid/jdk1.6.0_19-64bit/
setenv HADOOP_CONF_DIR ${HOME}/.hadoop2/conf/
setenv HADOOP_LOG_DIR ${HOME}/.hadoop2/logs/
setenv HADOOP_SLAVES ${HADOOP_CONF_DIR}/slaves
setenv HADOOP_PID_DIR /hadoop/pids
If you use bash instead of csh (the default), change the setenv lines to export lines, for example:
export JAVA_HOME=/share/apps/teragrid/jdk1.6.0_19-64bit/
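Applying the same change to all five variables would look roughly like the following. This is a direct translation of the setenv lines in step 3, with the values copied unchanged; the final grep line is only an added convenience for confirming what is set and is not part of the original instructions.

```shell
# bash equivalents of the setenv lines in step 3 (same values, export syntax)
export JAVA_HOME=/share/apps/teragrid/jdk1.6.0_19-64bit/
export HADOOP_CONF_DIR=${HOME}/.hadoop2/conf/
export HADOOP_LOG_DIR=${HOME}/.hadoop2/logs/
export HADOOP_SLAVES=${HADOOP_CONF_DIR}/slaves
export HADOOP_PID_DIR=/hadoop/pids

# Optional check: print the variables to confirm they are set in this shell
env | grep -E '^(JAVA_HOME|HADOOP_)'
```

These settings last only for the current session. To keep them across logins, the same lines could be placed in your bash startup file.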
TACC UNIX TIP!
You can change your shell to bash by running chsh -s /bin/bash
Here's the list of shells you can choose from:
@login1:~$ cat /etc/shells
/bin/sh
/bin/bash #most common
/sbin/nologin
/bin/tcsh
/bin/csh #Default
/bin/ksh
/usr/bin/ksh
/bin/zsh
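Before running chsh, it can help to see which shell you currently have and whether the one you want is on the list. The snippet below is a small sketch of that check; the variable name target is only illustrative, and it assumes /etc/shells lists one plain path per line.

```shell
# Shell you intend to switch to (illustrative variable name)
target=/bin/bash

# Show the current login shell; falls back to "unknown" if SHELL is unset
echo "Current login shell: ${SHELL:-unknown}"

# grep -x matches whole lines only, -q suppresses output
if grep -qx "$target" /etc/shells 2>/dev/null; then
    echo "$target is listed in /etc/shells"
else
    echo "$target is not listed in /etc/shells"
fi
```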
You have now successfully installed Hadoop on a TACC computer. Go on to the next page: Submit a job request