Hadoop Ingestion Approaches, Part 1


Overview

In this blog we will discuss the Hadoop ingestion approaches in detail and try to identify the challenges in each. We will also look for patterns and work out some rules that help decide when to use which ingestion approach. The goal of this blog is to provide a realistic approach to dealing with a variety of data formats, and to establish a baseline for handling those formats with the different ingestion techniques. By now we can all agree that ingestion is one of the most important and most critical phases of a Hadoop solution. In any business scenario or solution architecture that requires Hadoop, data ingestion is the primary process that has to be in place before the actual analytics can start. More often than not, the ingestion approaches and processes make or break a solution offering. In this blog we will go through the different ingestion approaches and their characteristics.

A Basic Big Data Ingestion Approach

This section presents a simple ingestion technology stack and explains the different processes associated with a Big Data solution. The workflow provides a high-level component view of the services and their capabilities. We will go through the technology stack and identify the advantages of each component.