BIG DATA - HBase

What is HBase?

HBase is a distributed, column-oriented database built on top of the Hadoop Distributed File System (HDFS). It is an open-source project and is horizontally scalable. It leverages the fault tolerance provided by HDFS.

It is a part of the Hadoop ecosystem and provides random, real-time read/write access to data stored in HDFS.

HBase provides high-throughput, low-latency read/write access to very large data sets. It is therefore a good choice for applications that require fast, random access to large amounts of data.

HBase Architecture

In HBase, tables are split into regions, which are served by region servers.

Regions are vertically divided by column families into “Stores”. Stores are saved as files in HDFS. Shown below is the architecture of HBase.
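
As a rough illustration of the region and store split described above, here is a toy in-memory model in Python. It is only a sketch and does not use the HBase client API: the names ToyTable, Region, split_keys, and the sample rows are hypothetical, and real stores are persisted as files in HDFS rather than held in dictionaries. It assumes a table is divided by row-key ranges into regions, and each region keeps one store per column family.

```python
from bisect import bisect_right


class Region:
    """A contiguous row-key range holding one store (a dict) per column family."""

    def __init__(self, start_key, families):
        self.start_key = start_key
        # "Vertical" division: one separate store for each column family.
        self.stores = {family: {} for family in families}


class ToyTable:
    """Hypothetical stand-in for a table; not part of any HBase library."""

    def __init__(self, split_keys, families):
        # The first region starts at the empty key; each split key starts a new region.
        self._starts = [""] + sorted(split_keys)
        self.regions = [Region(start, families) for start in self._starts]

    def region_for(self, row_key):
        # Pick the last region whose start key is <= row_key.
        return self.regions[bisect_right(self._starts, row_key) - 1]

    def put(self, row_key, family, qualifier, value):
        self.region_for(row_key).stores[family][(row_key, qualifier)] = value

    def get(self, row_key, family, qualifier):
        return self.region_for(row_key).stores[family].get((row_key, qualifier))


# Two regions: row keys before "m", and row keys from "m" onward.
table = ToyTable(split_keys=["m"], families=["info", "stats"])
table.put("alice", "info", "email", "alice@example.com")
table.put("zoe", "stats", "logins", 7)
```

In this sketch, a read such as table.get("zoe", "stats", "logins") first locates the region by row key and then consults only the store for the requested column family, which mirrors the idea that a lookup touches a single region and a single column family's files.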

HBase has three major components:

HBase is a NoSQL database

HBase Configuration for Performance Tuning

To fine-tune an HBase cluster setup, many configuration properties are available in HBase: