Big Data

Greenplum: a distributed, resilient SQL database built on PostgreSQL

Hadoop: a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models

HBase: the Hadoop database; a distributed, scalable big data store

Spark: a fast and general-purpose cluster computing system

Hive: data warehouse software that facilitates querying (through HiveQL, a SQL dialect) and managing large datasets residing in distributed storage

Pig: a platform for analyzing large data sets that consists of a high-level language (Pig Latin) for expressing data analysis programs, coupled with infrastructure for evaluating these programs

ZooKeeper: a centralized service for configuration and coordination of distributed applications

Akka: a toolkit and runtime for building highly concurrent, distributed, and fault-tolerant event-driven applications on the JVM

Ambari™: a web-based tool for provisioning, managing, and monitoring Apache Hadoop clusters
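
The "simple programming models" mentioned in the Hadoop entry refer chiefly to MapReduce. The sketch below illustrates that model as a word count in plain, single-process Python. It is not Hadoop's API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a real cluster would run the map and reduce steps in parallel across machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by key.

    On a cluster the framework does this between map and reduce;
    here it is an in-memory dictionary (an illustrative assumption).
    """
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big clusters"]))
```

Because each map call sees only one line and each reduce call sees only one key, the work can be split across many machines without changing the result, which is the property Hadoop relies on.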
