Download the latest version from here.
Set up the environment.
(1). Check whether ssh is installed
$ ssh -V
OpenSSH_5.2p1, OpenSSL 0.9.8l 5 Nov 2009
$ chkconfig --list sshd
If it is not installed:
$ sudo aptitude install openssh-server
Set up key-based login so that you do not have to type a password every time:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
(2). Check whether rsync is installed
$ rsync --version
rsync version 2.6.9 protocol version 29
Copyright (C) 1996-2006 by Andrew Tridgell, Wayne Davison, and others.
<http://rsync.samba.org/>
Capabilities: 64-bit files, socketpairs, hard links, symlinks, batchfiles,
inplace, IPv6, 64-bit system inums, 64-bit internal inums
rsync comes with ABSOLUTELY NO WARRANTY. This is free software, and you
are welcome to redistribute it under certain conditions. See the GNU
General Public Licence for details.
(3). Install OpenJDK
$ sudo apt-get install openjdk-6-jdk
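The three prerequisites above (ssh, rsync, Java) can also be checked in one pass. The following is a minimal sketch, not part of the original procedure; the function name check_prereqs is made up for illustration, and it only tests whether each command is on PATH, not which version is installed.

```shell
#!/bin/sh
# Report which of the given commands are missing from PATH.
# Prints one line per command and returns non-zero if any is missing.
check_prereqs() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "ok: $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

check_prereqs java ssh rsync || echo "Install the missing packages before continuing."
```

Any command reported as missing can then be installed with the package commands shown in steps (1) to (3).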
Setup on Ubuntu
(1).java check
java -version
(2).ssh check
ssh -V
(3).rsync check
rsync --version
(4). Add a hadoop user
sudo adduser hadoop
su hadoop
(5). Generate an ssh key
sudo apt-get install ssh
ssh-keygen -t rsa -P ""
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Confirm that you can log in to localhost.
ssh localhost
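If ssh localhost still asks for a password, one common cause is loose permissions on the key files: with sshd's default StrictModes setting, an authorized_keys file or .ssh directory that other users can write to is ignored. A minimal sketch of tightening them follows; the function name fix_ssh_perms is hypothetical, and it assumes the directory already exists.

```shell
#!/bin/sh
# Tighten permissions on an ssh directory and its authorized_keys file.
# $1 = ssh directory (defaults to ~/.ssh).
fix_ssh_perms() {
    ssh_dir=${1:-"$HOME/.ssh"}
    chmod 700 "$ssh_dir" || return 1
    if [ -f "$ssh_dir/authorized_keys" ]; then
        chmod 600 "$ssh_dir/authorized_keys" || return 1
    fi
}

# Usage (as the hadoop user):
#   fix_ssh_perms
#   ssh localhost
```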
(6). Set up the installation directory
Create the Hadoop installation under /usr/local from the downloaded files.
Also create a directory for the data store.
$ sudo su
$ cd /usr/local
$ cp -r /home/hadoop/download/hadoop-0.21.0 .
$ chown -R hadoop:hadoop hadoop-0.21.0
$ ln -s hadoop-0.21.0 hadoop
$ mkdir hadoop-datastore
$ chown -R hadoop:hadoop hadoop-datastore/
(7). Edit /usr/local/hadoop/conf/hadoop-env.sh.
$ vim /usr/local/hadoop/conf/hadoop-env.sh
# The java implementation to use. Required.
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk
#export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HADOOP_PID_DIR=/var/hadoop/pids
$ mkdir -p /var/hadoop/pids
$ chmod -R 777 /var/hadoop
Note: on Mac, use export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Home
(8). Edit three XML files (in /usr/local/hadoop/conf/).
Reference sites:
Hadoop Official Site - Single Node Step
Overview (Hadoop-common 0.21.0 API)
#core-site.xml
$ vim /usr/local/hadoop/conf/core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
#hdfs-site.xml
$ vim /usr/local/hadoop/conf/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
#mapred-site.xml
$ vim /usr/local/hadoop/conf/mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
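None of the three files above refers to the hadoop-datastore directory created in step (6), yet the format log in step (10) reports a storage directory under /usr/local/hadoop-datastore/hadoop-hadoop. That suggests hadoop.tmp.dir was also set in core-site.xml. The sketch below writes such a file; the hadoop.tmp.dir value is an assumption inferred from that log path, and the function name write_core_site is made up for illustration.

```shell
#!/bin/sh
# Write a core-site.xml that also sets hadoop.tmp.dir.
# ASSUMPTION: the hadoop.tmp.dir value is inferred from the format log,
# it does not appear in the original configuration.
# $1 = conf directory, $2 = datastore directory.
write_core_site() {
    conf_dir=$1
    datastore=$2
    cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$datastore/hadoop-\${user.name}</value>
  </property>
</configuration>
EOF
}

# Generate the file in a scratch directory and inspect it before copying
# it to /usr/local/hadoop/conf by hand.
scratch=$(mktemp -d)
write_core_site "$scratch" /usr/local/hadoop-datastore
cat "$scratch/core-site.xml"
```

The backslash keeps ${user.name} literal in the file, so that Hadoop rather than the shell expands it to the name of the user running the daemons.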
(9). Configure .bashrc
$ vim ~/.bashrc
export HADOOP_COMMON_HOME=/usr/local/hadoop
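(10). Format HDFS (first time only)
Note: the sample output below was captured on Hadoop 0.20.2, so its version line does not match the 0.21.0 directory used above.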
/usr/local/hadoop$ /usr/local/hadoop/bin/hadoop namenode -format
10/05/07 20:36:51 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = hoge-desktop/127.0.1.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
10/05/07 20:36:52 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
10/05/07 20:36:52 INFO namenode.FSNamesystem: supergroup=supergroup
10/05/07 20:36:52 INFO namenode.FSNamesystem: isPermissionEnabled=true
10/05/07 20:36:52 INFO common.Storage: Image file of size 96 saved in 0 seconds.
10/05/07 20:36:52 INFO common.Storage: Storage directory /usr/local/hadoop-datastore/hadoop-hadoop/dfs/name has been successfully formatted.
10/05/07 20:36:52 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hoge-desktop/127.0.1.1
************************************************************/
(11). Start Hadoop
/usr/local/hadoop$ bin/start-all.sh
starting namenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-namenode-hoge-desktop.out
localhost: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-datanode-hoge-desktop.out
localhost: starting secondarynamenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-secondarynamenode-hoge-desktop.out
starting jobtracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-jobtracker-hoge-desktop.out
localhost: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-tasktracker-hoge-desktop.out
(12). Stop Hadoop
/usr/local/hadoop$ bin/stop-all.sh
stopping jobtracker
localhost: stopping tasktracker
stopping namenode
localhost: stopping datanode
localhost: stopping secondarynamenode
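Between starting and stopping, it helps to confirm that all five daemons named in the start-all.sh output are actually up. One way is the jps tool that ships with the JDK, which lists running Java processes by main class. The following is a minimal sketch; the function name check_daemons is hypothetical, and the class names are assumed to match the daemon names printed above.

```shell
#!/bin/sh
# Read `jps` output on stdin and report which expected daemons are absent.
# Returns non-zero if any daemon is missing.
check_daemons() {
    listing=$(cat)
    status=0
    for name in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # jps prints "<pid> <MainClass>"; compare the whole second field so
        # that NameNode does not match SecondaryNameNode.
        if printf '%s\n' "$listing" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            echo "running: $name"
        else
            echo "not running: $name"
            status=1
        fi
    done
    return "$status"
}

# Usage (after bin/start-all.sh):
#   jps | check_daemons
```

If a daemon is reported as not running, the matching log file under /usr/local/hadoop/logs is the first place to look.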
Alias settings
It is convenient to register the following commands as aliases in .bashrc in advance (reference site).
$ vim ~/.bashrc
export PATH=$PATH:/usr/local/hadoop/bin
alias dfsls='/usr/local/hadoop/bin/hadoop dfs -ls'
alias dfslsr='/usr/local/hadoop/bin/hadoop dfs -lsr'
alias dfsrm='/usr/local/hadoop/bin/hadoop dfs -rm'
alias dfscat='/usr/local/hadoop/bin/hadoop dfs -cat'
alias dfsrmr='/usr/local/hadoop/bin/hadoop dfs -rmr'
alias dfsmkdir='/usr/local/hadoop/bin/hadoop dfs -mkdir'
alias dfsput='/usr/local/hadoop/bin/hadoop dfs -put'
alias dfsget='/usr/local/hadoop/bin/hadoop dfs -get'
$ source ~/.bashrc
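When these lines are added by a script rather than by hand, running the script twice leaves duplicate entries in .bashrc. A minimal sketch of an idempotent append follows; the helper name append_once is made up for illustration.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
# $1 = file, $2 = line.
append_once() {
    file=$1
    line=$2
    touch "$file" || return 1
    # -x matches whole lines, -F treats the line as a fixed string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage:
#   append_once ~/.bashrc 'export PATH=$PATH:/usr/local/hadoop/bin'
#   append_once ~/.bashrc "alias dfsls='/usr/local/hadoop/bin/hadoop dfs -ls'"
```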
Reference sites: Hadoop install memo, HadoopWiki
Reference sites: Hadoop official site, IBM's explanation, setup guide in Japanese, Japanese Official Site, worked example, MapReduce explanation