Wei-Yu, Chen 's Note‎ > ‎Hadoop‎ > ‎

Upgrading Cloudera Hadoop 0.18 to official Hadoop 0.20

0. Install Cloudera Hadoop 0.18

0-1. Add the Cloudera source list

Add the Cloudera list to /etc/apt/sources.list.d [detailed steps]

0-2. Install Cloudera Hadoop 0.18

Install:
  • $ sudo su -
    # apt-get install hadoop-0.18 hadoop-0.18-namenode  hadoop-conf-pseudo
    # apt-get install hadoop-0.18-datanode hadoop-0.18-jobtracker hadoop-0.18-tasktracker
    

0-3. Format the Hadoop namenode

Format the namenode:
# su -s /bin/bash - hadoop -c 'hadoop namenode -format'

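P.S.: In the Cloudera packages, the "hadoop" user owns the installation but is only a virtual (system) account, so running a program as hadoop requires the form below, where COMMAND stands for the Hadoop command: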
# su -s /bin/bash - hadoop -c " COMMAND "
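
To avoid retyping the wrapper, the pattern above can be put into a small shell function. This is only a sketch: the name as_hadoop and the RUNNER variable are made up for illustration and are not part of the Cloudera packages.

```shell
# Hypothetical helper, not part of the Cloudera packages.
# Runs its arguments as the "hadoop" user through bash.
# RUNNER is an illustrative hook: set RUNNER=echo to print the
# command line instead of executing it (a dry run).
as_hadoop() {
    ${RUNNER:-} su -s /bin/bash - hadoop -c "$*"
}
```

With this in place, `as_hadoop hadoop namenode -format` runs the same command as the format step in 0-3.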

0-4. Start Hadoop

# /etc/init.d/hadoop-namenode start
If jps shows that the namenode is running, the installation is fine and you can go on to start the datanode and the other daemons:

# /etc/init.d/hadoop-jobtracker start
# /etc/init.d/hadoop-datanode start
# /etc/init.d/hadoop-tasktracker start
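
The jps check can be scripted so that a missing daemon is easy to spot. A minimal sketch, assuming the daemons show up in jps under their usual class names (NameNode, DataNode, JobTracker, TaskTracker); the function name check_daemons is illustrative.

```shell
# Hypothetical helper: reads `jps` output on stdin and reports any of the
# four Hadoop daemons that is not listed. Returns non-zero if one is missing.
check_daemons() {
    out=$(cat)
    missing=0
    for d in NameNode DataNode JobTracker TaskTracker; do
        # -w matches whole words only, so SecondaryNameNode does not
        # count as NameNode.
        if ! printf '%s\n' "$out" | grep -qw "$d"; then
            echo "missing: $d"
            missing=1
        fi
    done
    return $missing
}
```

Typical use is `jps | check_daemons`, run as the user that owns the daemons so that jps can see them.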

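0-5. Use Hadoop

Upload some data to HDFS and run a few operations on it, so that after the upgrade you can verify that the data is still there and still correct.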
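
One way to make the later comparison concrete is to record a checksum of each uploaded file before upgrading. A minimal sketch, assuming md5sum is available and that the hadoop launcher is on the PATH; the HADOOP variable and the function name hdfs_checksums are illustrative.

```shell
# Assumption: HADOOP names the hadoop launcher to use (default: "hadoop").
HADOOP=${HADOOP:-hadoop}

# Hypothetical helper: prints "<md5>  <hdfs path>" for every HDFS path given,
# by streaming each file out of HDFS with `hadoop fs -cat`.
hdfs_checksums() {
    for p in "$@"; do
        sum=$($HADOOP fs -cat "$p" | md5sum | cut -d' ' -f1)
        printf '%s  %s\n' "$sum" "$p"
    done
}
```

Saving the output for your test files now and again after the upgrade, then comparing the two files with diff, shows whether any content changed.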

1. Start the upgrade

1.1 Stop all services

(Steps omitted for brevity.)
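
The details are not spelled out here. A minimal sketch, assuming the services are still the Cloudera init scripts started in 0-4 and that those scripts accept a stop argument; the function name and the RUNNER dry-run hook are illustrative.

```shell
# Hypothetical helper: stops the four Cloudera Hadoop 0.18 services in the
# reverse of the start order (MapReduce first, then HDFS).
# Set RUNNER=echo to print the commands instead of running them.
stop_cloudera_hadoop() {
    for svc in tasktracker jobtracker datanode namenode; do
        ${RUNNER:-} /etc/init.d/hadoop-"$svc" stop
    done
}
```

Run it as root, and confirm with jps afterwards that no Hadoop daemon is left running.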

1.2 Download Hadoop 0.20 and unpack it into /opt/hadoop

(Steps omitted for brevity.)
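
The unpacking step could look like the following. This is a sketch under the assumption that the release tarball contains a single top-level directory and that tar supports --strip-components; the function name unpack_hadoop is illustrative.

```shell
# Hypothetical helper: unpack_hadoop TARBALL DEST
# Extracts a Hadoop release tarball into DEST and drops the archive's
# top-level directory, so that DEST/bin/hadoop exists afterwards.
unpack_hadoop() {
    mkdir -p "$2" && tar -xzf "$1" -C "$2" --strip-components=1
}
```

Call it as root with the downloaded tarball and /opt/hadoop as arguments. Because the daemons later run as the hadoop user and log to /opt/hadoop/logs, that user presumably needs write access to the directory.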

1.3 Edit the configuration files

hadoop-env.sh  <-- add the following lines

export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HADOOP_HOME=/opt/hadoop
export HADOOP_CONF_DIR=/opt/hadoop/conf
export HADOOP_LOG_DIR=/opt/hadoop/logs

core-site.xml   <-- replace the contents with the following

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
  <property>
     <name>hadoop.tmp.dir</name>
     <value>/var/lib/hadoop-0.18/cache/${user.name}</value>
  </property>
</configuration>

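Note:
/var/lib/hadoop-0.18/cache/ is the directory where the data was originally stored. Even after upgrading to 0.20 the data stays where it was, so it will still live in this hadoop-0.18 directory. If the name bothers you, you can add a symbolic link:
$ ln -sf /var/lib/hadoop-0.18 /var/lib/hadoop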
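
Since the upgrade reuses the old storage directory, it can help to confirm that the directory really holds namenode metadata before going on. HDFS keeps a VERSION file in the current/ subdirectory of each storage directory; the sketch below reads its layoutVersion line. The function name layout_version is illustrative.

```shell
# Hypothetical helper: layout_version STORAGE_DIR
# Prints the layoutVersion recorded in STORAGE_DIR/current/VERSION,
# or fails if the file does not exist.
layout_version() {
    version_file="$1/current/VERSION"
    if [ ! -f "$version_file" ]; then
        echo "no VERSION file under $1" >&2
        return 1
    fi
    sed -n 's/^layoutVersion=//p' "$version_file"
}
```

With the hadoop.tmp.dir setting above and the daemons running as hadoop, the namenode directory to pass in is /var/lib/hadoop-0.18/cache/hadoop/dfs/name.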

hdfs-site.xml  <-- replace the contents with the following

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
     <name>dfs.permissions</name>
     <value>false</value>
  </property>
</configuration>

mapred-site.xml  <-- replace the contents with the following

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
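
A quick way to double-check what ended up in the three files is to read a property back. A minimal sketch, assuming the files keep the simple layout shown above (each <value> on the line after its <name>); it is not a real XML parser, and the name get_property is illustrative.

```shell
# Hypothetical helper: get_property FILE NAME
# Prints the <value> that follows <name>NAME</name> in a *-site.xml file.
# Assumes one tag per line, as in the files above.
get_property() {
    awk -v name="$2" '
        index($0, "<name>" name "</name>") { found = 1; next }
        found && match($0, /<value>.*<\/value>/) {
            # Skip the 7 characters of "<value>" and drop the 8 of "</value>".
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$1"
}
```

For example, `get_property /opt/hadoop/conf/core-site.xml fs.default.name` should print the hdfs://localhost:8020 address configured above.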


1.4 Run the upgrade command


upgrade cloudera-hadoop-0.18 ---> official-hadoop 0.20
  • # su -s /bin/bash - hadoop -c " /opt/hadoop/bin/hadoop namenode -upgrade "
    
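The console that ran su -s /bin/bash - hadoop -c " /opt/hadoop/bin/hadoop namenode -upgrade " does not return to the prompt; the namenode keeps running in the foreground there.

At this point the namenode is in safe mode, waiting for the datanodes to connect.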
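
The safe mode state can also be queried from a second console with dfsadmin -safemode get. A minimal sketch; the HADOOP variable and the function name safemode_state are illustrative, and the command may need to be run as the hadoop user.

```shell
# Assumption: HADOOP points at the new 0.20 launcher.
HADOOP=${HADOOP:-/opt/hadoop/bin/hadoop}

# Hypothetical helper: prints ON or OFF according to what
# `hadoop dfsadmin -safemode get` reports.
safemode_state() {
    case "$($HADOOP dfsadmin -safemode get)" in
        *"is ON"*)  echo ON ;;
        *"is OFF"*) echo OFF ;;
        *)          echo UNKNOWN; return 1 ;;
    esac
}
```

While the namenode is still waiting for datanodes, this is expected to print ON.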

1.5 Start all datanodes

The upgrade only really begins once the datanodes are started. Open another console and run the following command:
# su -s /bin/bash - hadoop -c "/opt/hadoop/bin/hadoop-daemon.sh start datanode"

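Once all the datanodes are up, the original console becomes active again: the blocks of the old HDFS version are being converted to the new HDFS version. Do not interrupt the process at this point.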

10/01/06 18:08:09 INFO ipc.Server: IPC Server handler 9 on 8020: starting
10/01/07 17:52:26 INFO hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 127.0.0.1:50010 storage DS-1591516806-127.0.1.1-50010-1262769597737
10/01/07 17:52:26 INFO net.NetworkTopology: Adding a new node: /default-rack/127.0.0.1:50010
10/01/07 17:52:27 INFO hdfs.StateChange: STATE* Safe mode ON. 
The ratio of reported blocks 0.1000 has not reached the threshold 0.9990. Safe mode will be turned off automatically.
10/01/07 17:52:27 INFO hdfs.StateChange: STATE* Safe mode extension entered. 
The ratio of reported blocks 1.0000 has reached the threshold 0.9990. Safe mode will be turned off automatically in 29 seconds.
10/01/07 17:52:47 INFO hdfs.StateChange: STATE* Safe mode ON. 
The ratio of reported blocks 1.0000 has reached the threshold 0.9990. Safe mode will be turned off automatically in 9 seconds.
10/01/07 17:52:57 INFO namenode.FSNamesystem: Total number of blocks = 10
10/01/07 17:52:57 INFO namenode.FSNamesystem: Number of invalid blocks = 0
10/01/07 17:52:57 INFO namenode.FSNamesystem: Number of under-replicated blocks = 0
10/01/07 17:52:57 INFO namenode.FSNamesystem: Number of  over-replicated blocks = 0
10/01/07 17:52:57 INFO hdfs.StateChange: STATE* Leaving safe mode after 85493 secs.
10/01/07 17:52:57 INFO hdfs.StateChange: STATE* Safe mode is OFF.
10/01/07 17:52:57 INFO hdfs.StateChange: STATE* Network topology has 1 racks and 1 datanodes
10/01/07 17:52:57 INFO hdfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks

When you see "Safe mode will be turned off automatically in X seconds.", the process is nearly finished. The last part to take care of is MapReduce.
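
If the console output has been saved to a file, the relevant lines can be picked out mechanically. A minimal sketch based only on the log messages shown above; the function name summarize_upgrade_log is illustrative.

```shell
# Hypothetical helper: reads namenode log lines on stdin, prints the block
# counters it finds, and succeeds only if safe mode was left and no invalid
# blocks were reported.
summarize_upgrade_log() {
    awk '
        /Total number of blocks =/   { print "total=" $NF }
        /Number of invalid blocks =/ { print "invalid=" $NF; invalid = $NF }
        /Safe mode is OFF/           { off = 1 }
        END { exit !(off && invalid + 0 == 0) }
    '
}
```

Fed with the log above, it prints total=10 and invalid=0 and returns success.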

1.6 Start MapReduce

Next, start MapReduce (jobtracker and tasktracker):

# su -s /bin/bash - hadoop -c "/opt/hadoop/bin/hadoop-daemon.sh start jobtracker"
# su -s /bin/bash - hadoop -c "/opt/hadoop/bin/hadoop-daemon.sh start tasktracker"

Note: starting the jobtracker and tasktracker this way, rather than with start-all.sh, start-dfs.sh, or start-mapred.sh, means SSH does not prompt you for a password. (Thanks to jazz for the tip.)
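
The reason is that the start-*.sh scripts reach each host over ssh, while hadoop-daemon.sh only acts on the local machine. On a single node the per-daemon calls can be wrapped in a loop, as in this sketch; the function name and the RUNNER dry-run hook are illustrative.

```shell
# Assumption: the new release lives in /opt/hadoop, as configured in 1.3.
HADOOP_HOME=${HADOOP_HOME:-/opt/hadoop}

# Hypothetical helper: starts each named daemon on the local machine as the
# hadoop user, without going through ssh.
# Set RUNNER=echo to print the commands instead of running them.
start_local_daemons() {
    for d in "$@"; do
        ${RUNNER:-} su -s /bin/bash - hadoop -c "$HADOOP_HOME/bin/hadoop-daemon.sh start $d"
    done
}
```

Here `start_local_daemons jobtracker tasktracker` issues the same two commands as above.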


The original console then prints MapReduce-related messages like the following:

10/01/07 18:03:12 INFO FSNamesystem.audit: ugi=hadoop,hadoop  ip=/127.0.0.1 cmd=listStatus  src=/var/lib/hadoop-0.18/cache/hadoop/mapred/system dst=nullperm=null
10/01/07 18:03:12 INFO namenode.FSNamesystem: Number of transactions: 1 Total time for transactions(ms): 0Number of transactions batched in Syncs: 0 Number of syncs: 0 SyncTimes(ms): 0 
10/01/07 18:03:12 INFO FSNamesystem.audit: ugi=hadoop,hadoop  ip=/127.0.0.1 cmd=delete  src=/var/lib/hadoop-0.18/cache/hadoop/mapred/system dst=nullperm=null
10/01/07 18:03:12 INFO FSNamesystem.audit: ugi=hadoop,hadoop  ip=/127.0.0.1 cmd=mkdirs  src=/var/lib/hadoop-0.18/cache/hadoop/mapred/system dst=nullperm=hadoop:supergroup:rwxr-xr-x
10/01/07 18:03:12 INFO FSNamesystem.audit: ugi=hadoop,hadoop  ip=/127.0.0.1 cmd=setPermission src=/var/lib/hadoop-0.18/cache/hadoop/mapred/system dst=null  perm=hadoop:supergroup:rwx-wx-wx
10/01/07 18:03:12 INFO FSNamesystem.audit: ugi=hadoop,hadoop  ip=/127.0.0.1 cmd=create  src=/var/lib/hadoop-0.18/cache/hadoop/mapred/system/jobtracker.info dst=null  perm=hadoop:supergroup:rw-r--r--
10/01/07 18:03:12 INFO FSNamesystem.audit: ugi=hadoop,hadoop  ip=/127.0.0.1 cmd=setPermission src=/var/lib/hadoop-0.18/cache/hadoop/mapred/system/jobtracker.info dst=null  perm=hadoop:supergroup:rw-------
10/01/07 18:03:12 INFO hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /var/lib/hadoop-0.18/cache/hadoop/mapred/system/jobtracker.info. blk_-884931960867849873_1011
10/01/07 18:03:12 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:50010 is added to blk_-884931960867849873_1011 size 4
10/01/07 18:03:12 INFO hdfs.StateChange: DIR* NameSystem.completeFile: file /var/lib/hadoop-0.18/cache/hadoop/mapred/system/jobtracker.info is closed by DFSClient_-927783387
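
At this point the upgrade itself is done. You can run some operations against the new HDFS to try out the new system.

Once you have confirmed that this version works, go on to the next step to wrap things up.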
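
Finalizing discards the pre-upgrade state, so it is worth running a file system check first. A minimal sketch that wraps hadoop fsck and looks for the HEALTHY verdict in its summary; the HADOOP variable and the function name hdfs_is_healthy are illustrative.

```shell
# Assumption: HADOOP points at the new 0.20 launcher.
HADOOP=${HADOOP:-/opt/hadoop/bin/hadoop}

# Hypothetical helper: hdfs_is_healthy [PATH]
# Runs fsck on PATH (default /) and succeeds only if the summary reports
# the file system as HEALTHY.
hdfs_is_healthy() {
    $HADOOP fsck "${1:-/}" | grep -q "is HEALTHY"
}
```

Together with a comparison against the data uploaded in 0-5, a successful check gives reasonable confidence before finalizing.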


1.7 Finalize the upgrade


finalize the hdfs upgrade
# su -s /bin/bash - hadoop -c " /opt/hadoop/bin/hadoop dfsadmin -finalizeUpgrade"

The original console prints the following:

10/01/07 18:10:08 INFO common.Storage: Finalizing upgrade for storage directory /var/lib/hadoop-0.18/cache/hadoop/dfs/name.
   cur LV = -18; cur CTime = 1262772483322
10/01/07 18:10:08 INFO common.Storage: Finalize upgrade for /var/lib/hadoop-0.18/cache/hadoop/dfs/name is complete.

The line "Finalize upgrade for /var/lib/hadoop-0.18/cache/hadoop/dfs/name is complete." speaks for itself.

Note: as mentioned earlier, /var/lib/hadoop-0.18/ is the original directory name, and it stays that way even though it now holds the upgraded data.
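
During an upgrade HDFS keeps the pre-upgrade state in a previous/ directory next to current/ inside each storage directory, and finalizing removes it. The sketch below checks for that on disk; the function name is_finalized is illustrative.

```shell
# Hypothetical helper: is_finalized STORAGE_DIR
# Succeeds when STORAGE_DIR has a current/ directory but no previous/
# left over, i.e. no pre-upgrade state is being kept for a rollback.
is_finalized() {
    [ -d "$1/current" ] && [ ! -e "$1/previous" ]
}
```

For the namenode in this setup the directory to check is /var/lib/hadoop-0.18/cache/hadoop/dfs/name.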

2. Results