Big Data Frameworks
Apache Hadoop Ecosystem slides-fr
Apache Hadoop MapReduce Framework slides-eng
+ MR Labs
+ Apache Pig Latin lab: tpch files, Q16 pig script
Stream Processing slides-eng
Apache Spark slides-eng
+ Spark/Java workflows
+ Analysis of the Chicago Crime dataset with Scala in a Zeppelin notebook
+ Analysis of the NYC Cabs dataset with PySpark in a Zeppelin notebook
+ Stream processing lab in Java
AWS Educate slides
AWS EMR Spark: aws-cli, s3, ec2Keypair, emr, emr-sparksql, emr-spark-workflow, emr-graphframes
AWS EMR Hadoop slides
Bulk Synchronous Parallelism
Pregel, Apache Giraph