Analytics Use Case Demo: Using Spark

Aim

To demonstrate an analytics use case built on Spark, covering data ingestion, data processing, and data visualization.

The main idea of the demo is to build a Big Data Hadoop platform that showcases the data architecture processes needed to derive business insights from a large data set.

Objectives

The major objectives of the analytics process are the following:

  1. Create a 4-node Cloudera cluster with Spark and Hive.

  2. Set up a Flume ingestion box for data collection from an external source.

  3. Implement a Flume agent to ingest data into the Hadoop cluster.

  4. Implement data analytics using Spark to analyze the ingested data.

  5. Load the analyzed data into the data warehouse.

  6. Install Tableau and set up data visualization.
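
To make the ingestion objectives (2 and 3) more concrete, the Flume agent could be configured roughly as follows. This is only a minimal sketch, assuming a spooling-directory source, a memory channel, and an HDFS sink. The agent name demo_agent, the component names, the local directory, the NameNode host, and the HDFS path are all hypothetical placeholders and do not come from the actual demo setup.

```properties
# Hypothetical agent name and component names (assumptions, not from the demo)
demo_agent.sources = src1
demo_agent.channels = ch1
demo_agent.sinks = sink1

# Source: pick up files dropped into a local directory on the ingestion box
# (the directory path is a placeholder)
demo_agent.sources.src1.type = spooldir
demo_agent.sources.src1.spoolDir = /var/flume/incoming
demo_agent.sources.src1.channels = ch1

# Channel: in-memory buffer between source and sink
demo_agent.channels.ch1.type = memory
demo_agent.channels.ch1.capacity = 10000
demo_agent.channels.ch1.transactionCapacity = 1000

# Sink: write events into the Hadoop cluster as plain files, one folder per day
# (the NameNode host and target path are placeholders)
demo_agent.sinks.sink1.type = hdfs
demo_agent.sinks.sink1.hdfs.path = hdfs://namenode.example.com:8020/data/raw/%Y-%m-%d
demo_agent.sinks.sink1.hdfs.fileType = DataStream
demo_agent.sinks.sink1.hdfs.useLocalTimeStamp = true
demo_agent.sinks.sink1.channel = ch1
```

Saved as, say, demo_agent.conf, an agent like this would typically be started with `flume-ng agent --conf conf --conf-file demo_agent.conf --name demo_agent`. A memory channel keeps the sketch simple but loses buffered events if the agent stops; a file channel would be the more durable choice.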

Technology Stack

The following diagram shows the technology stack for the implementation. It provides insight into the technologies used in the use case and the roadmap for each.