Apache HBase is a non-relational (NoSQL) database that runs on top of Hadoop's HDFS. It is column-oriented and provides fault-tolerant storage and fast access to large quantities of sparse data. It also adds transactional capabilities to Hadoop at the level of individual rows, allowing users to perform inserts, updates and deletes.
hbase shell
hbase(main):002:0> status
1 active master, 0 backup masters, 4 servers, 0 dead, 0.5000 average load
Took 0.1535 seconds
Start the HBase shell to get its prompt (skip this step if the shell is already open):
hbase shell
This example creates the HBase "sales" table, fills it with data, and then uses a Java MapReduce job (packaged as a jar) to transfer the data to the empty "sales_StoreWise" table.
Create the "sales" table with the column family (cf) cfSales:
create 'sales','cfSales'
Insert the sales data:
put 'sales','store1#Item1', 'cfSales:Sales','200'
put 'sales','store1#Item2', 'cfSales:Sales','100'
put 'sales','store1#Item3', 'cfSales:Sales','150'
put 'sales','store2#Item1', 'cfSales:Sales','150'
put 'sales','store2#Item2', 'cfSales:Sales','250'
put 'sales','store2#Item3', 'cfSales:Sales','210'
Check if the data has been saved:
scan 'sales'
output:
...
store2#Item3 column=cfSales:Sales, timestamp=1409915022947, value=210
6 row(s) in 0.0780 seconds
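The row keys above join a store and an item with '#'. HBase keeps rows sorted by row key, so all rows of one store sit next to each other and can be read as a single key range. The following is a minimal plain-Java sketch of that idea, with a TreeMap standing in for the table. It does not use the HBase API, and the class and method names (RowKeyPrefixScan, scanPrefix, sampleTable) are hypothetical.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Plain-Java stand-in for the 'sales' table; no HBase classes are involved.
class RowKeyPrefixScan {

    // The six rows inserted above: row key -> value of cfSales:Sales.
    static TreeMap<String, String> sampleTable() {
        TreeMap<String, String> table = new TreeMap<String, String>();
        // Insertion order does not matter: TreeMap keeps its keys sorted.
        table.put("store2#Item3", "210");
        table.put("store1#Item1", "200");
        table.put("store2#Item1", "150");
        table.put("store1#Item3", "150");
        table.put("store1#Item2", "100");
        table.put("store2#Item2", "250");
        return table;
    }

    // Returns every row whose key starts with the given non-empty prefix.
    static SortedMap<String, String> scanPrefix(TreeMap<String, String> table, String prefix) {
        // Exclusive stop key: the prefix with its last character increased by one,
        // e.g. "store1#" -> "store1$". Assumes that character is not Character.MAX_VALUE.
        int lastIndex = prefix.length() - 1;
        String stop = prefix.substring(0, lastIndex) + (char) (prefix.charAt(lastIndex) + 1);
        return table.subMap(prefix, stop);
    }

    public static void main(String[] args) {
        // Prints {store1#Item1=200, store1#Item2=100, store1#Item3=150}
        System.out.println(scanPrefix(sampleTable(), "store1#"));
    }
}
```

Putting the store first in the key is what makes a per-store range read possible; with item-first keys the rows of one store would be scattered across the table.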
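Count the rows with the HBase MapReduce RowCounter job, where sales is the table and cfSales:Sales is the column. This is an operating system command, so run it from a regular shell (for example in a second terminal), not at the hbase(main) prompt: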
hbase org.apache.hadoop.hbase.mapreduce.RowCounter sales cfSales:Sales
output:
...
org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper$Counters
ROWS=6
...
Create another table, "sales_StoreWise", to receive the data from the "sales" table:
create 'sales_StoreWise','cfAggregateSales'
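The source of HbaseMapRed.java is not shown here. The names sales_StoreWise and cfAggregateSales suggest that the job sums the sales of each store, but that is an assumption. Under that assumption, the map and reduce steps can be sketched in plain Java as follows. No HBase or Hadoop classes are used, and the class and method names (StoreWiseTotals, storeOf, aggregate, sampleRows) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; this is not the HbaseMapRed job itself.
class StoreWiseTotals {

    // "Map" step: the store name is the part of the row key before '#'.
    static String storeOf(String rowKey) {
        int sep = rowKey.indexOf('#');
        return sep < 0 ? rowKey : rowKey.substring(0, sep);
    }

    // "Reduce" step: sum the sales values that share a store name.
    static Map<String, Integer> aggregate(Map<String, String> salesByRowKey) {
        Map<String, Integer> totals = new TreeMap<String, Integer>();
        for (Map.Entry<String, String> cell : salesByRowKey.entrySet()) {
            String store = storeOf(cell.getKey());
            int amount = Integer.parseInt(cell.getValue()); // values were stored as strings
            Integer soFar = totals.get(store);
            totals.put(store, soFar == null ? amount : soFar + amount);
        }
        return totals;
    }

    // The six cfSales:Sales cells inserted into the 'sales' table.
    static Map<String, String> sampleRows() {
        Map<String, String> rows = new LinkedHashMap<String, String>();
        rows.put("store1#Item1", "200");
        rows.put("store1#Item2", "100");
        rows.put("store1#Item3", "150");
        rows.put("store2#Item1", "150");
        rows.put("store2#Item2", "250");
        rows.put("store2#Item3", "210");
        return rows;
    }

    public static void main(String[] args) {
        // Prints {store1=450, store2=610}
        System.out.println(aggregate(sampleRows()));
    }
}
```

If the real job works this way, sales_StoreWise would end up with one row per store (store1 and store2) rather than one row per store and item.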
Quit the HBase shell:
exit
Copy the HBase directory from /home/sxg125/hadoop-projects/ to your current directory:
cp -r /home/sxg125/hadoop-projects/HBase .
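Change to the HBase directory and compile the source. javac uses only the last -cp option it is given, so the Hadoop client jars and the HBase classpath are joined into a single classpath here. The hbase_classes directory must already exist (create it with mkdir hbase_classes if the copied directory does not contain it):
cd HBase
/usr/java/jdk1.7.0_67-cloudera/bin/javac -cp "/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hadoop/client-0.20/*:$(hbase classpath)" -d hbase_classes HbaseMapRed.java
Create the jar file "hbase.jar":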
/usr/java/jdk1.7.0_67-cloudera/bin/jar -cvf hbase.jar -C hbase_classes/ .
output:
added manifest
adding: HbaseMapRed.class(in = 1603) (out= 820)(deflated 48%)
adding: HbaseMapRedMap.class(in = 2064) (out= 878)(deflated 57%)
adding: HbaseMapRedReduce.class(in = 2245) (out= 985)(deflated 56%)
Create an output directory for the HBase output:
hadoop fs -mkdir /user/<user>/hadoop-hbase/hbase-output
Run the MapReduce job. Note that the exact hadoop jar command still needs to be worked out; the form below is a starting point:
hadoop jar hbase.jar HbaseMapRed