Hadoop HBase

(Work in progress)

Apache HBase is a non-relational (NoSQL) database that runs on top of HDFS. It is column-oriented (data is grouped into column families) and provides fault-tolerant storage and fast random access to large quantities of sparse data. It also adds row-level update capabilities to Hadoop, allowing users to perform updates, inserts, and deletes on individual rows.
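To make "sparse" concrete, the logical layout of an HBase table can be pictured as a sorted map from row key to the cells that row actually contains. The following is a minimal sketch in plain Java, not the HBase client API; the class name, row keys, and column names are hypothetical, and cell timestamps/versions are left out for brevity.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory model: row key -> ("family:qualifier" -> value).
// Illustration only; this is not how the HBase client is used.
class SparseTableSketch {
    // Rows are kept sorted by row key, as in an HBase table.
    private final TreeMap<String, TreeMap<String, String>> rows = new TreeMap<>();

    // Store one cell; a row only holds the columns that were actually written.
    void put(String rowKey, String column, String value) {
        TreeMap<String, String> cells = rows.get(rowKey);
        if (cells == null) {
            cells = new TreeMap<>();
            rows.put(rowKey, cells);
        }
        cells.put(column, value);
    }

    // Return the value of a cell, or null if the row never stored that column.
    String get(String rowKey, String column) {
        Map<String, String> cells = rows.get(rowKey);
        return cells == null ? null : cells.get(column);
    }

    // Number of cells present in a row; absent columns take no space.
    int cellCount(String rowKey) {
        Map<String, String> cells = rows.get(rowKey);
        return cells == null ? 0 : cells.size();
    }

    public static void main(String[] args) {
        SparseTableSketch table = new SparseTableSketch();
        table.put("row1", "cf:a", "1");
        table.put("row2", "cf:b", "2");
        System.out.println(table.get("row1", "cf:a")); // the cell that was written
        System.out.println(table.get("row1", "cf:b")); // null: row1 has no cf:b cell
    }
}
```

Each row carries only its own cells, so a table with many possible columns but few values per row stays small.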

Check HBase Status

hbase shell

hbase(main):002:0> status

1 active master, 0 backup masters, 4 servers, 0 dead, 0.5000 average load

Took 0.1535 seconds

Create a table [3] and input data

Open the HBase shell:

hbase shell

This example creates the HBase "sales" table, populates it, and then uses a Java program (compiled and packaged as a jar) to transfer its data into the empty "sales_StoreWise" table.

Create the "sales" table with the column family (cf) cfSales:

create 'sales','cfSales'

Insert the sales data:

put 'sales','store1#Item1', 'cfSales:Sales','200'

put 'sales','store1#Item2', 'cfSales:Sales','100'

put 'sales','store1#Item3', 'cfSales:Sales','150'

put 'sales','store2#Item1', 'cfSales:Sales','150'

put 'sales','store2#Item2', 'cfSales:Sales','250'

put 'sales','store2#Item3', 'cfSales:Sales','210'

Check if the data has been saved:

scan 'sales'

output:

...

store2#Item3 column=cfSales:Sales, timestamp=1409915022947, value=210

6 row(s) in 0.0780 seconds
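The row keys above combine store and item as "store#item". Because HBase keeps rows sorted by row key, all rows of one store are adjacent and can be selected by key prefix. The following plain-Java sketch illustrates that idea with a TreeMap standing in for the table; the class and method names are hypothetical, and it assumes a non-empty ASCII prefix.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Illustration only: a TreeMap stands in for the sorted row keys of the "sales" table.
class RowKeyPrefixSketch {

    // Smallest string that sorts after every key starting with the prefix:
    // the prefix with its last character incremented ("store1#" -> "store1$").
    // Assumes a non-empty prefix whose last character can be incremented.
    static String prefixStop(String prefix) {
        int last = prefix.length() - 1;
        return prefix.substring(0, last) + (char) (prefix.charAt(last) + 1);
    }

    // All rows whose key starts with the prefix, i.e. keys in [prefix, prefixStop).
    static SortedMap<String, String> rowsWithPrefix(TreeMap<String, String> rows, String prefix) {
        return rows.subMap(prefix, prefixStop(prefix));
    }

    public static void main(String[] args) {
        TreeMap<String, String> rows = new TreeMap<>();
        rows.put("store1#Item1", "200");
        rows.put("store1#Item2", "100");
        rows.put("store1#Item3", "150");
        rows.put("store2#Item1", "150");
        rows.put("store2#Item2", "250");
        rows.put("store2#Item3", "210");
        // prints {store1#Item1=200, store1#Item2=100, store1#Item3=150}
        System.out.println(rowsWithPrefix(rows, "store1#"));
    }
}
```

Including the "#" separator in the prefix keeps "store1#" from also matching keys of a store named "store10".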

Count the number of rows with the HBase MapReduce RowCounter job, where sales is the table and cfSales:Sales is the column:

hbase org.apache.hadoop.hbase.mapreduce.RowCounter sales cfSales:Sales

output:

...

 org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper$Counters

                ROWS=6

...

Create another table, "sales_StoreWise", to receive the data from the "sales" table:

create 'sales_StoreWise','cfAggregateSales'
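The source of HbaseMapRed.java is not shown here. Judging only by the table name "sales_StoreWise" and the column family "cfAggregateSales", the job presumably sums the sales of each store; that is an assumption. Under that assumption, the map and reduce logic can be sketched in plain Java (no Hadoop or HBase classes, hypothetical class and method names) as:

```java
import java.util.Map;
import java.util.TreeMap;

// Illustration only: simulates a store-wise sum in memory.
// Assumption: the real job aggregates cfSales:Sales per store.
class StoreWiseAggregationSketch {

    // "Map" step: take the store name from a row key such as "store1#Item1".
    static String storeOf(String rowKey) {
        int sep = rowKey.indexOf('#');
        return sep < 0 ? rowKey : rowKey.substring(0, sep);
    }

    // "Reduce" step: add up the sales values that share a store.
    static Map<String, Integer> aggregate(Map<String, String> salesRows) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, String> row : salesRows.entrySet()) {
            String store = storeOf(row.getKey());
            int amount = Integer.parseInt(row.getValue());
            Integer soFar = totals.get(store);
            totals.put(store, soFar == null ? amount : soFar + amount);
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, String> sales = new TreeMap<>();
        sales.put("store1#Item1", "200");
        sales.put("store1#Item2", "100");
        sales.put("store1#Item3", "150");
        sales.put("store2#Item1", "150");
        sales.put("store2#Item2", "250");
        sales.put("store2#Item3", "210");
        // prints {store1=450, store2=610}
        System.out.println(aggregate(sales));
    }
}
```

With the six rows inserted earlier, such a job would write one row per store to "sales_StoreWise": 450 for store1 and 610 for store2.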

Quit the HBase shell:

exit

Copy the HBase directory from /home/sxg125/hadoop-projects/ to the current directory:

cp -r /home/sxg125/hadoop-projects/HBase .

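Change to the HBase directory, make sure the hbase_classes output directory exists, and compile HbaseMapRed.java. javac uses only the last -cp option it is given, so the Hadoop client jars and the HBase classpath are combined into a single -cp value:

cd HBase

mkdir -p hbase_classes

/usr/java/jdk1.7.0_67-cloudera/bin/javac -cp "/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hadoop/client-0.20/*:`hbase classpath`" -d hbase_classes HbaseMapRed.java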

Create the jar file "hbase.jar":

/usr/java/jdk1.7.0_67-cloudera/bin/jar -cvf hbase.jar -C hbase_classes/ .

output:

added manifest

adding: HbaseMapRed.class(in = 1603) (out= 820)(deflated 48%)

adding: HbaseMapRedMap.class(in = 2064) (out= 878)(deflated 57%)

adding: HbaseMapRedReduce.class(in = 2245) (out= 985)(deflated 56%)

Create an output directory for the HBase output:

hadoop fs -mkdir /user/<user>/hadoop-hbase/hbase-output

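Run the job. The exact hadoop jar command still needs to be worked out (the job will likely also need the HBase jars on the Hadoop classpath); the basic form is: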

hadoop jar hbase.jar HbaseMapRed