Upgrading Cloudera Hadoop 0.18 to official Hadoop 0.20

0. Install Cloudera Hadoop 0.18

0-1. Add the Cloudera source list

Add the Cloudera list to /etc/apt/sources.list.d [detailed steps]

0-2. Install Cloudera Hadoop 0.18

Install the packages:

$ sudo su -
# apt-get install hadoop-0.18 hadoop-0.18-namenode hadoop-conf-pseudo
# apt-get install hadoop-0.18-datanode hadoop-0.18-jobtracker hadoop-0.18-tasktracker

0-3. Format the Hadoop namenode

Format the namenode:

# su -s /bin/bash - hadoop -c 'hadoop namenode -format'

Note: in the Cloudera packages, the installation is owned by the user "hadoop", but that account is a system account with no login shell. To run anything as the hadoop user, use the form below, where COMMAND stands for the Hadoop command to run:

# su -s /bin/bash - hadoop -c " COMMAND "
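For example, to list the HDFS root as the hadoop user (a minimal illustration; substitute any Hadoop command):

# su -s /bin/bash - hadoop -c "hadoop fs -ls /"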

0-4. Start Hadoop

# /etc/init.d/hadoop-namenode start

If running jps shows that the namenode is up, the installation is fine; go on to start the remaining daemons (a jps check for all of them is sketched after these commands):

# /etc/init.d/hadoop-jobtracker start

# /etc/init.d/hadoop-datanode start

# /etc/init.d/hadoop-tasktracker start
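As a sanity check, jps (run as root, which can see the hadoop user's JVMs) should now list all four daemons. A sketch of the expected output; the PIDs will differ on your machine:

# jps
10231 NameNode
10344 DataNode
10457 JobTracker
10562 TaskTracker
10670 Jps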

0-5. Hadoop operations

Upload some data into HDFS and run a few operations on it, so that after the upgrade you can verify the data is still there and intact.
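A minimal sketch (the /pre-upgrade-test path and the choice of /etc/hosts as input are arbitrary, for illustration only):

# su -s /bin/bash - hadoop -c "hadoop fs -mkdir /pre-upgrade-test"
# su -s /bin/bash - hadoop -c "hadoop fs -put /etc/hosts /pre-upgrade-test/hosts"
# su -s /bin/bash - hadoop -c "hadoop fs -ls /pre-upgrade-test"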

1. Begin the upgrade

1.1 Stop all services

(Steps abbreviated.)
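With the Cloudera init scripts from step 0-4, stopping everything looks roughly like this (workers first, namenode last):

# /etc/init.d/hadoop-tasktracker stop
# /etc/init.d/hadoop-jobtracker stop
# /etc/init.d/hadoop-datanode stop
# /etc/init.d/hadoop-namenode stop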

1.2 Download and unpack

Put Hadoop 0.20 into /opt/hadoop.

(Steps abbreviated.)
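A sketch of the download and unpack, assuming the 0.20.2 tarball from the Apache archive (substitute whichever 0.20.x release you are targeting):

# cd /opt
# wget http://archive.apache.org/dist/hadoop/core/hadoop-0.20.2/hadoop-0.20.2.tar.gz
# tar xzf hadoop-0.20.2.tar.gz
# ln -s /opt/hadoop-0.20.2 /opt/hadoop
# chown -R hadoop:hadoop /opt/hadoop-0.20.2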

1.3 Set up the config files

hadoop-env.sh <-- insert the following:

export JAVA_HOME=/usr/lib/jvm/java-6-sun

export HADOOP_HOME=/opt/hadoop

export HADOOP_CONF_DIR=/opt/hadoop/conf

export HADOOP_LOG_DIR=/opt/hadoop/logs

core-site.xml <-- replace the contents with the following:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/lib/hadoop-0.18/cache/${user.name}</value>
  </property>
</configuration>

Note: /var/lib/hadoop-0.18/cache/ is the directory where the original data lives. Even though we are upgrading to 0.20, the old data stays put, so after the upgrade it will still sit inside this hadoop-0.18 directory. If the name bothers you, use a symlink:

# ln -sf /var/lib/hadoop-0.18 /var/lib/hadoop

hdfs-site.xml <-- replace the contents with the following:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>

mapred-site.xml <-- replace the contents with the following:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>

1.4 Run the upgrade command

Upgrade: cloudera-hadoop-0.18 ---> official hadoop-0.20

# su -s /bin/bash - hadoop -c "/opt/hadoop/bin/hadoop namenode -upgrade"

The console where you ran su -s /bin/bash - hadoop -c "/opt/hadoop/bin/hadoop namenode -upgrade" will be held in the foreground.

At this point the namenode is in safe mode, waiting for datanodes to connect.

1.5 Start all the datanodes

The upgrade only truly begins once the datanodes come up. Open another console and run:

# su -s /bin/bash - hadoop -c "/opt/hadoop/bin/hadoop-daemon.sh start datanode"

Once the datanodes are up, you will see the original console get to work, migrating the old version's HDFS blocks over to the new version's format. Do not interrupt the process at this point.

10/01/06 18:08:09 INFO ipc.Server: IPC Server handler 9 on 8020: starting
10/01/07 17:52:26 INFO hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 127.0.0.1:50010 storage DS-1591516806-127.0.1.1-50010-1262769597737
10/01/07 17:52:26 INFO net.NetworkTopology: Adding a new node: /default-rack/127.0.0.1:50010
10/01/07 17:52:27 INFO hdfs.StateChange: STATE* Safe mode ON. The ratio of reported blocks 0.1000 has not reached the threshold 0.9990. Safe mode will be turned off automatically.
10/01/07 17:52:27 INFO hdfs.StateChange: STATE* Safe mode extension entered. The ratio of reported blocks 1.0000 has reached the threshold 0.9990. Safe mode will be turned off automatically in 29 seconds.
10/01/07 17:52:47 INFO hdfs.StateChange: STATE* Safe mode ON. The ratio of reported blocks 1.0000 has reached the threshold 0.9990. Safe mode will be turned off automatically in 9 seconds.
10/01/07 17:52:57 INFO namenode.FSNamesystem: Total number of blocks = 10
10/01/07 17:52:57 INFO namenode.FSNamesystem: Number of invalid blocks = 0
10/01/07 17:52:57 INFO namenode.FSNamesystem: Number of under-replicated blocks = 0
10/01/07 17:52:57 INFO namenode.FSNamesystem: Number of over-replicated blocks = 0
10/01/07 17:52:57 INFO hdfs.StateChange: STATE* Leaving safe mode after 85493 secs.
10/01/07 17:52:57 INFO hdfs.StateChange: STATE* Safe mode is OFF.
10/01/07 17:52:57 INFO hdfs.StateChange: STATE* Network topology has 1 racks and 1 datanodes
10/01/07 17:52:57 INFO hdfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks

When you see "Safe mode will be turned off automatically in X seconds.", the process is just about done; all that is left is the MapReduce side.
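If you would rather poll than watch the log, the standard dfsadmin -safemode subcommand also reports the state from the new install:

# su -s /bin/bash - hadoop -c "/opt/hadoop/bin/hadoop dfsadmin -safemode get"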

1.6 Start MapReduce

Next, start MapReduce (jobtracker and tasktracker):

# su -s /bin/bash - hadoop -c "/opt/hadoop/bin/hadoop-daemon.sh start jobtracker"

# su -s /bin/bash - hadoop -c "/opt/hadoop/bin/hadoop-daemon.sh start tasktracker"

Note: starting the jobtracker and tasktracker this way, instead of with start-all.sh, start-dfs.sh, or start-mapred.sh, has one advantage: ssh never prompts you for a password. (Thanks to jazz for pointing this out.)

The original console will then show mapred activity like the following:

10/01/07 18:03:12 INFO FSNamesystem.audit: ugi=hadoop,hadoop ip=/127.0.0.1 cmd=listStatus src=/var/lib/hadoop-0.18/cache/hadoop/mapred/system dst=null perm=null
10/01/07 18:03:12 INFO namenode.FSNamesystem: Number of transactions: 1 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 0 SyncTimes(ms): 0
10/01/07 18:03:12 INFO FSNamesystem.audit: ugi=hadoop,hadoop ip=/127.0.0.1 cmd=delete src=/var/lib/hadoop-0.18/cache/hadoop/mapred/system dst=null perm=null
10/01/07 18:03:12 INFO FSNamesystem.audit: ugi=hadoop,hadoop ip=/127.0.0.1 cmd=mkdirs src=/var/lib/hadoop-0.18/cache/hadoop/mapred/system dst=null perm=hadoop:supergroup:rwxr-xr-x
10/01/07 18:03:12 INFO FSNamesystem.audit: ugi=hadoop,hadoop ip=/127.0.0.1 cmd=setPermission src=/var/lib/hadoop-0.18/cache/hadoop/mapred/system dst=null perm=hadoop:supergroup:rwx-wx-wx
10/01/07 18:03:12 INFO FSNamesystem.audit: ugi=hadoop,hadoop ip=/127.0.0.1 cmd=create src=/var/lib/hadoop-0.18/cache/hadoop/mapred/system/jobtracker.info dst=null perm=hadoop:supergroup:rw-r--r--
10/01/07 18:03:12 INFO FSNamesystem.audit: ugi=hadoop,hadoop ip=/127.0.0.1 cmd=setPermission src=/var/lib/hadoop-0.18/cache/hadoop/mapred/system/jobtracker.info dst=null perm=hadoop:supergroup:rw-------
10/01/07 18:03:12 INFO hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /var/lib/hadoop-0.18/cache/hadoop/mapred/system/jobtracker.info. blk_-884931960867849873_1011
10/01/07 18:03:12 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_-884931960867849873_1011 size 4
10/01/07 18:03:12 INFO hdfs.StateChange: DIR* NameSystem.completeFile: file /var/lib/hadoop-0.18/cache/hadoop/mapred/system/jobtracker.info is closed by DFSClient_-927783387

At this point the upgrade is complete. You can run some operations against the new HDFS to take the new system for a spin.
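For example, to check that the data uploaded back in step 0-5 survived and that HDFS is healthy (the /pre-upgrade-test path matches the sketch in step 0-5; -upgradeProgress should report an upgrade in progress until the finalize step below):

# su -s /bin/bash - hadoop -c "/opt/hadoop/bin/hadoop fs -ls /pre-upgrade-test"
# su -s /bin/bash - hadoop -c "/opt/hadoop/bin/hadoop fsck /"
# su -s /bin/bash - hadoop -c "/opt/hadoop/bin/hadoop dfsadmin -upgradeProgress status"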

Once you have confirmed this version is OK, move on to the next step for a clean ending.

1.7 Finish the upgrade

Finalize the HDFS upgrade:

# su -s /bin/bash - hadoop -c "/opt/hadoop/bin/hadoop dfsadmin -finalizeUpgrade"

The original console will print the following:

10/01/07 18:10:08 INFO common.Storage: Finalizing upgrade for storage directory /var/lib/hadoop-0.18/cache/hadoop/dfs/name. cur LV = -18; cur CTime = 1262772483322
10/01/07 18:10:08 INFO common.Storage: Finalize upgrade for /var/lib/hadoop-0.18/cache/hadoop/dfs/name is complete.

The line "Finalize upgrade for /var/lib/hadoop-0.18/cache/hadoop/dfs/name is complete." speaks for itself.

Note: as mentioned earlier, /var/lib/hadoop-0.18/ is just the original directory name, even though the new data now lives inside it.

2. Results