Hadoop HBase (work in progress)
Apache HBase is a non-relational (NoSQL) database that runs on top of HDFS. It is column-oriented and provides fault-tolerant storage and fast access to large quantities of sparse data. It also adds random read/write capability to Hadoop, allowing users to perform updates, inserts, and deletes; these operations are atomic per row rather than fully transactional across rows.
Check HBase status
hbase shell
hbase(main):002:0> status
1 active master, 0 backup masters, 4 servers, 0 dead, 0.5000 average load
Took 0.1535 seconds
Create a table [3] and insert data
Start the HBase shell:
hbase shell
This example creates the HBase "sales" table, populates it, and then uses a Java MapReduce job (packaged as a jar) to transfer its contents into the initially empty "sales_StoreWise" table.
Create the "sales" table with the column family (cf) "cfSales":
create 'sales','cfSales'
Insert the sales data. Each row key has the form store#item:
put 'sales','store1#Item1', 'cfSales:Sales','200'
put 'sales','store1#Item2', 'cfSales:Sales','100'
put 'sales','store1#Item3', 'cfSales:Sales','150'
put 'sales','store2#Item1', 'cfSales:Sales','150'
put 'sales','store2#Item2', 'cfSales:Sales','250'
put 'sales','store2#Item3', 'cfSales:Sales','210'
Check if the data has been saved:
scan 'sales'
output:
...
store2#Item3 column=cfSales:Sales, timestamp=1409915022947, value=210
6 row(s) in 0.0780 seconds
Count the rows with the HBase MapReduce RowCounter job, where sales is the table and cfSales:Sales is the column. Run this command from the operating system shell, not from inside the HBase shell:
hbase org.apache.hadoop.hbase.mapreduce.RowCounter sales cfSales:Sales
output:
...
org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper$Counters
ROWS=6
...
In the HBase shell, create a second table, "sales_StoreWise", to receive the data from the "sales" table:
create 'sales_StoreWise','cfAggregateSales'
Quit the HBase shell:
exit
Copy the HBase directory from /home/sxg125/hadoop-projects/ into your working directory:
cp -r /home/sxg125/hadoop-projects/HBase .
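Change to the HBase directory, make sure the hbase_classes output directory exists, and compile the source. Both class paths are combined into a single -cp value, because repeating -cp does not concatenate the paths:
cd HBase && mkdir -p hbase_classes
/usr/java/jdk1.7.0_67-cloudera/bin/javac -cp "/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hadoop/client-0.20/*:$(hbase classpath)" -d hbase_classes HbaseMapRed.java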
Create the jar file "hbase.jar":
/usr/java/jdk1.7.0_67-cloudera/bin/jar -cvf hbase.jar -C hbase_classes/ .
output:
added manifest
adding: HbaseMapRed.class(in = 1603) (out= 820)(deflated 48%)
adding: HbaseMapRedMap.class(in = 2064) (out= 878)(deflated 57%)
adding: HbaseMapRedReduce.class(in = 2245) (out= 985)(deflated 56%)
Create an output directory in HDFS for the HBase job output (replace <user> with your user name):
hadoop fs -mkdir /user/<user>/hadoop-hbase/hbase-output
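Run the job (the exact hadoop jar command still needs to be worked out; the HBase jars may also have to be made visible to Hadoop, for example through HADOOP_CLASSPATH):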
hadoop jar hbase.jar HbaseMapRed
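The source of HbaseMapRed.java is not reproduced on this page, so what the job computes can only be inferred. The table name "sales_StoreWise", the column family "cfAggregateSales", and the HbaseMapRedMap and HbaseMapRedReduce classes suggest that it sums the sales per store. The sketch below illustrates that assumed logic in plain Java, with no HBase or Hadoop classes, using the six rows inserted above. The class StoreWiseSketch and its methods are hypothetical names chosen for illustration; the real job may work differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java sketch of an assumed store-wise aggregation.
// It is NOT the code in HbaseMapRed.java and uses no HBase API.
final class StoreWiseSketch {

    // "Map" step: take the store name from a row key such as "store1#Item1".
    static String storeOf(String rowKey) {
        int sep = rowKey.indexOf('#');
        return sep < 0 ? rowKey : rowKey.substring(0, sep);
    }

    // "Reduce" step: add up the sales values of all rows that share a store.
    static Map<String, Integer> aggregate(Map<String, String> salesRows) {
        Map<String, Integer> totals = new LinkedHashMap<String, Integer>();
        for (Map.Entry<String, String> row : salesRows.entrySet()) {
            // The shell "put" commands stored the values as strings.
            int value = Integer.parseInt(row.getValue());
            String store = storeOf(row.getKey());
            Integer soFar = totals.get(store);
            totals.put(store, (soFar == null ? 0 : soFar) + value);
        }
        return totals;
    }

    // The six rows of the "sales" table (row key -> cfSales:Sales value).
    static Map<String, String> sampleSales() {
        Map<String, String> rows = new LinkedHashMap<String, String>();
        rows.put("store1#Item1", "200");
        rows.put("store1#Item2", "100");
        rows.put("store1#Item3", "150");
        rows.put("store2#Item1", "150");
        rows.put("store2#Item2", "250");
        rows.put("store2#Item3", "210");
        return rows;
    }

    public static void main(String[] args) {
        // Prints {store1=450, store2=610}
        System.out.println(aggregate(sampleSales()));
    }
}
```

If the job does aggregate this way, a scan of "sales_StoreWise" after a successful run would be expected to show one row per store rather than six rows.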