Upgrading cloudera hadoop 0.18 to official hadoop 0.20
0. Installing cloudera 0.18
0-1. Add the cloudera source list
Add cloudera's repository list under /etc/apt/sources.list.d [detailed steps]
0-2. Install cloudera hadoop 0.18
Install the packages:
$ sudo su -
# apt-get install hadoop-0.18 hadoop-0.18-namenode hadoop-conf-pseudo
# apt-get install hadoop-0.18-datanode hadoop-0.18-jobtracker hadoop-0.18-tasktracker
0-3. Format the hadoop namenode
Format the namenode:
# su -s /bin/bash - hadoop -c 'hadoop namenode -format'
ps: in the cloudera packages the installation is owned by the user "hadoop", but that account has no login shell, so to run programs as hadoop you use the form below, where COMMAND stands for the hadoop command to run:
# su -s /bin/bash - hadoop -c " COMMAND "
0-4. Start hadoop
# /etc/init.d/hadoop-namenode start
If running jps shows the NameNode is up, the installation is fine and you can go on to start the remaining daemons:
# /etc/init.d/hadoop-jobtracker start
# /etc/init.d/hadoop-datanode start
# /etc/init.d/hadoop-tasktracker start
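The jps check mentioned above can be scripted; this is a small sketch (the daemon names are the standard ones jps reports for a pseudo-distributed setup):

```shell
# Check which hadoop daemons appear in jps output; jps ships with the Sun JDK.
EXPECTED="NameNode DataNode JobTracker TaskTracker"
if command -v jps >/dev/null 2>&1; then
  for d in $EXPECTED; do
    if jps | grep -q "$d"; then
      echo "$d is running"
    else
      echo "$d is NOT running"
    fi
  done
else
  echo "jps not found; is the JDK installed?"
fi
```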
0-5. Try out hadoop
Upload some data into HDFS and run a few operations on it, so that after the upgrade you can verify the data is still there and correct.
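For example, you could seed HDFS with a small known "canary" file before the upgrade; the /upgrade-test path and the filename here are just illustrations:

```shell
# Write a small canary file into HDFS so we can verify it survives the upgrade.
CANARY=/tmp/upgrade-canary.txt
echo "pre-upgrade canary" > "$CANARY"
if command -v hadoop >/dev/null 2>&1; then
  su -s /bin/bash - hadoop -c "hadoop fs -mkdir /upgrade-test"
  su -s /bin/bash - hadoop -c "hadoop fs -put $CANARY /upgrade-test/"
  su -s /bin/bash - hadoop -c "hadoop fs -ls /upgrade-test"
else
  echo "hadoop not installed; skipping upload"
fi
```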
1. Starting the upgrade
1.1 Stop all services
(steps omitted for brevity)
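The omitted step amounts to stopping every cloudera 0.18 daemon, in roughly the reverse of the start order from step 0-4; a minimal sketch using the same init-script names:

```shell
# Stop all cloudera 0.18 daemons before touching the data.
SERVICES="hadoop-tasktracker hadoop-datanode hadoop-jobtracker hadoop-namenode"
for svc in $SERVICES; do
  if [ -x "/etc/init.d/$svc" ]; then
    /etc/init.d/$svc stop
  else
    echo "$svc init script not found; nothing to stop"
  fi
done
```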
1.2 Download and extract
Extract hadoop 0.20 into /opt/hadoop
(steps omitted for brevity)
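A sketch of the omitted download step. The 0.20.2 release number and the archive.apache.org mirror are assumptions, and the DO_IT guard keeps the sketch from downloading anything until you flip it on a real machine:

```shell
# Fetch an official hadoop 0.20 release and unpack it as /opt/hadoop.
VERSION=0.20.2   # assumed release; pick whichever 0.20.x you want
TARBALL="hadoop-$VERSION.tar.gz"
URL="http://archive.apache.org/dist/hadoop/core/hadoop-$VERSION/$TARBALL"
DO_IT=no         # set to "yes" on the target machine to actually download
if [ "$DO_IT" = "yes" ]; then
  wget -q "$URL" -O "/tmp/$TARBALL"
  tar xzf "/tmp/$TARBALL" -C /opt
  mv "/opt/hadoop-$VERSION" /opt/hadoop
  chown -R hadoop /opt/hadoop   # the hadoop user must be able to write logs
else
  echo "dry run: would fetch $URL into /opt/hadoop"
fi
```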
1.3 Edit the config files
hadoop-env.sh <-- insert the following:
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HADOOP_HOME=/opt/hadoop
export HADOOP_CONF_DIR=/opt/hadoop/conf
export HADOOP_LOG_DIR=/opt/hadoop/logs
core-site.xml <-- replace its contents with the following:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:8020</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/var/lib/hadoop-0.18/cache/${user.name}</value>
</property>
</configuration>
Note: /var/lib/hadoop-0.18/cache/ is the directory where the data originally lived. Even though we are upgrading to 0.20, the data stays in this hadoop-0.18 directory afterwards; if the old name bothers you, you can use a symlink:
$ ln -sf /var/lib/hadoop-0.18 /var/lib/hadoop
hdfs-site.xml <-- replace its contents with the following:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
mapred-site.xml <-- replace its contents with the following:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
1.4 Run the upgrade command
Upgrade cloudera hadoop 0.18 to official hadoop 0.20:
# su -s /bin/bash - hadoop -c " /opt/hadoop/bin/hadoop namenode -upgrade "
The console where you ran su -s /bin/bash - hadoop -c " /opt/hadoop/bin/hadoop namenode -upgrade " stays in the foreground and appears to hang. At this point the namenode is in safe mode, waiting for the datanodes to connect.
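While the namenode waits, you can ask it from another console how far the upgrade has progressed. A hedged sketch using the standard dfsadmin -upgradeProgress subcommand (it assumes the 0.20 tree from step 1.2 is at /opt/hadoop):

```shell
# Query the namenode's distributed-upgrade status.
HADOOP_BIN=/opt/hadoop/bin/hadoop
if [ -x "$HADOOP_BIN" ]; then
  su -s /bin/bash - hadoop -c "$HADOOP_BIN dfsadmin -upgradeProgress status"
else
  echo "hadoop 0.20 not found at $HADOOP_BIN"
fi
```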
1.5 Start all the datanodes
The upgrade only actually begins once the datanodes come up. Open another console and run:
# su -s /bin/bash - hadoop -c "/opt/hadoop/bin/hadoop-daemon.sh start datanode"
Once the datanodes are all up, you will see the original console start working, converting blocks from the old HDFS layout to the new one. Do not interrupt the process at this point.
10/01/06 18:08:09 INFO ipc.Server: IPC Server handler 9 on 8020: starting
10/01/07 17:52:26 INFO hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 127.0.0.1:50010 storage DS-1591516806-127.0.1.1-50010-1262769597737
10/01/07 17:52:26 INFO net.NetworkTopology: Adding a new node: /default-rack/127.0.0.1:50010
10/01/07 17:52:27 INFO hdfs.StateChange: STATE* Safe mode ON. The ratio of reported blocks 0.1000 has not reached the threshold 0.9990. Safe mode will be turned off automatically.
10/01/07 17:52:27 INFO hdfs.StateChange: STATE* Safe mode extension entered. The ratio of reported blocks 1.0000 has reached the threshold 0.9990. Safe mode will be turned off automatically in 29 seconds.
10/01/07 17:52:47 INFO hdfs.StateChange: STATE* Safe mode ON. The ratio of reported blocks 1.0000 has reached the threshold 0.9990. Safe mode will be turned off automatically in 9 seconds.
10/01/07 17:52:57 INFO namenode.FSNamesystem: Total number of blocks = 10
10/01/07 17:52:57 INFO namenode.FSNamesystem: Number of invalid blocks = 0
10/01/07 17:52:57 INFO namenode.FSNamesystem: Number of under-replicated blocks = 0
10/01/07 17:52:57 INFO namenode.FSNamesystem: Number of over-replicated blocks = 0
10/01/07 17:52:57 INFO hdfs.StateChange: STATE* Leaving safe mode after 85493 secs.
10/01/07 17:52:57 INFO hdfs.StateChange: STATE* Safe mode is OFF.
10/01/07 17:52:57 INFO hdfs.StateChange: STATE* Network topology has 1 racks and 1 datanodes
10/01/07 17:52:57 INFO hdfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks
When you see "Safe mode will be turned off automatically in X seconds.", the process is almost done. What remains is the MapReduce part.
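Before moving on, it can be worth confirming that HDFS really has left safe mode; a small sketch using dfsadmin -safemode get (again assuming the /opt/hadoop layout from step 1.2):

```shell
# Ask the namenode whether safe mode is still on.
HADOOP_BIN=/opt/hadoop/bin/hadoop
if [ -x "$HADOOP_BIN" ]; then
  su -s /bin/bash - hadoop -c "$HADOOP_BIN dfsadmin -safemode get"
else
  echo "hadoop 0.20 not found at $HADOOP_BIN"
fi
```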
1.6 Start MapReduce
Next, start MapReduce (the jobtracker and tasktracker):
# su -s /bin/bash - hadoop -c "/opt/hadoop/bin/hadoop-daemon.sh start jobtracker"
# su -s /bin/bash - hadoop -c "/opt/hadoop/bin/hadoop-daemon.sh stop tasktracker"
Note: starting the jobtracker and tasktracker this way, rather than with start-all.sh, start-dfs.sh, or start-mapred.sh, has one advantage: ssh never prompts you for a password. (Thanks to jazz for pointing this out.)
The original console will then print mapred activity like the following:
10/01/07 18:03:12 INFO FSNamesystem.audit: ugi=hadoop,hadoop ip=/127.0.0.1 cmd=listStatus src=/var/lib/hadoop-0.18/cache/hadoop/mapred/system dst=null perm=null
10/01/07 18:03:12 INFO namenode.FSNamesystem: Number of transactions: 1 Total time for transactions(ms): 0 Number of transactions batched in Syncs: 0 Number of syncs: 0 SyncTimes(ms): 0
10/01/07 18:03:12 INFO FSNamesystem.audit: ugi=hadoop,hadoop ip=/127.0.0.1 cmd=delete src=/var/lib/hadoop-0.18/cache/hadoop/mapred/system dst=null perm=null
10/01/07 18:03:12 INFO FSNamesystem.audit: ugi=hadoop,hadoop ip=/127.0.0.1 cmd=mkdirs src=/var/lib/hadoop-0.18/cache/hadoop/mapred/system dst=null perm=hadoop:supergroup:rwxr-xr-x
10/01/07 18:03:12 INFO FSNamesystem.audit: ugi=hadoop,hadoop ip=/127.0.0.1 cmd=setPermission src=/var/lib/hadoop-0.18/cache/hadoop/mapred/system dst=null perm=hadoop:supergroup:rwx-wx-wx
10/01/07 18:03:12 INFO FSNamesystem.audit: ugi=hadoop,hadoop ip=/127.0.0.1 cmd=create src=/var/lib/hadoop-0.18/cache/hadoop/mapred/system/jobtracker.info dst=null perm=hadoop:supergroup:rw-r--r--
10/01/07 18:03:12 INFO FSNamesystem.audit: ugi=hadoop,hadoop ip=/127.0.0.1 cmd=setPermission src=/var/lib/hadoop-0.18/cache/hadoop/mapred/system/jobtracker.info dst=null perm=hadoop:supergroup:rw-------
10/01/07 18:03:12 INFO hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /var/lib/hadoop-0.18/cache/hadoop/mapred/system/jobtracker.info. blk_-884931960867849873_1011
10/01/07 18:03:12 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_-884931960867849873_1011 size 4
10/01/07 18:03:12 INFO hdfs.StateChange: DIR* NameSystem.completeFile: file /var/lib/hadoop-0.18/cache/hadoop/mapred/system/jobtracker.info is closed by DFSClient_-927783387
At this point the upgrade is complete, and you can run some operations against the new HDFS to try out the new system.
Once you have confirmed this version is OK, proceed to the next step for a clean ending.
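One way to confirm things are OK is to read back whatever test data you uploaded in step 0-5 and run a filesystem check; a hedged sketch (the /upgrade-test path is illustrative, use whatever you actually uploaded):

```shell
# Verify pre-upgrade data is readable and the filesystem is healthy
# before finalizing (finalize cannot be rolled back).
HADOOP_BIN=/opt/hadoop/bin/hadoop
if [ -x "$HADOOP_BIN" ]; then
  su -s /bin/bash - hadoop -c "$HADOOP_BIN fs -ls /upgrade-test"
  su -s /bin/bash - hadoop -c "$HADOOP_BIN fsck /"
else
  echo "hadoop 0.20 not found at $HADOOP_BIN"
fi
```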
1.7 Finalize the upgrade
Finalize the HDFS upgrade:
# su -s /bin/bash - hadoop -c " /opt/hadoop/bin/hadoop dfsadmin -finalizeUpgrade"
The original console prints the following:
10/01/07 18:10:08 INFO common.Storage: Finalizing upgrade for storage directory /var/lib/hadoop-0.18/cache/hadoop/dfs/name. cur LV = -18; cur CTime = 1262772483322
10/01/07 18:10:08 INFO common.Storage: Finalize upgrade for /var/lib/hadoop-0.18/cache/hadoop/dfs/name is complete.
The line "Finalize upgrade for /var/lib/hadoop-0.18/cache/hadoop/dfs/name is complete." speaks for itself.
Note: as mentioned earlier, /var/lib/hadoop-0.18/ is the original directory name, even though it now holds the new version's data.