Big Data and Business Analytics
Course Materials (PPTs/PDFs/Handouts)
Module 1: Introduction to Data Analytics
Need for Business Intelligence
Data Management, Data Visualization, Data Warehousing
ETL Data Processing Chain, From Business Intelligence to Business Analytics
Module 2: Business Analytics Cycle
Introduction, Analytical Tools & Methods,
Integration, Social Analytics, Operational Analytics, Big Data Analytics,
Hadoop, Informatics, Cognos, etc.
Business application of big data analytics
Module 3: Data Mining & Decision Making
Predictive Analysis
Forecasting, Optimization
Simulation, Gamification.
Module 4: Business Metrics in Action
Data Science in Startups
Basics of Problem-Solving
Design Patterns in Statistical Computing
Excel for Data Science.
Module 5: Data Driven Prediction Methods
NLP, Regression,
Correlation, Cluster Analysis,
Artificial Neural Networks,
BI Tools and Applications.
Module 6: Case Studies and Presentations
Course Materials (PPTs/PDFs/Handouts) - OLD
Ch1 [3 Hrs]: Introduction to Big Data [PDF]
Ch2 [2 Hrs]: Introduction to Hadoop [PDF]
Ch3 [4 Hrs]: NoSQL [PDF] [MongoDB Install] [Introduction to MongoDB]
Ch4 [6 Hrs]: MapReduce and the New Software Stack [PDF]
Ch5 [3 Hrs]: Finding Similar Items [PDF] [e.g. Edit Distance]
Ch6 [6 Hrs]: Mining Data Streams [PDF] [e.g. Flajolet-Martin Algorithm] [e.g. Bloom Filter]
Ch7 [5 Hrs]: Link Analysis [PDF] [e.g. PageRank] [PageRank Examples] [e.g. TSPR] [e.g. HITS]
Ch8 [5 Hrs]: Frequent Itemsets [PDF]
Ch9 [5 Hrs]: Clustering [PDF]
Ch11 [5 Hrs]: Mining Social-Network Graphs [PDF]
Course Lab Experiments
Guidelines for Writing Lab Experiments [click here]
Lab 1 [Ch1]: To draw and explain the Hadoop architecture and ecosystem with the help of a case study using the WordCount example. To define and install Hadoop.
Lab 2 [Ch2]: To implement the following file management tasks in the Hadoop Distributed File System (HDFS): adding files and directories, retrieving files, and deleting files.
Lab 3 [Ch2]: To run a basic Word Count MapReduce program to understand the MapReduce paradigm: to count the words in a given file, to view the output file, and to calculate the execution time.
Lab 4 [Ch3]: To perform NoSQL database operations (create, insert, and update) using MongoDB.
Lab 5 [Ch5]: To study and implement basic functions and commands in R.
Lab 6 [Ch5]: To find similar documents with cosine similarity in R.
Lab 7 [Ch6]: To implement Bloom filters for filtering stream data in C++/Java.
Lab 8 [Ch6]: To implement the Flajolet-Martin algorithm for counting distinct elements in stream data.
Lab 9 [Ch8]: To implement a clustering program in R.
Lab 10 [Ch1]: To build a WordCloud in R, a text mining method that is easier to understand and visualize than tabular data.
Lab 11 [Ch5 and Ch10]: To find the Term Frequency-Inverse Document Frequency (tf-idf) matrix for recommendation systems and plot TF using R.
Lab 12 [Ch7]:
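Illustration for Lab 8: the distinct-element counting named there can be sketched in a few lines. The following is a minimal Python sketch of the Flajolet-Martin idea, not the lab's reference solution. It assumes a single hash function (MD5 truncated to 32 bits, chosen here only for convenience), and the names trailing_zeros and fm_estimate are hypothetical, not taken from the course materials.

```python
import hashlib


def trailing_zeros(n: int, width: int = 32) -> int:
    """Count trailing zero bits of n; an all-zero value counts as `width` zeros."""
    if n == 0:
        return width
    # n & -n isolates the lowest set bit; its bit position is the zero count.
    return (n & -n).bit_length() - 1


def fm_estimate(stream, width: int = 32) -> int:
    """Estimate the number of distinct items in `stream` with one hash function.

    Assumption: MD5 truncated to 32 bits stands in for a uniform hash.
    The estimate is 2**R, where R is the largest trailing-zero count seen.
    """
    max_r = 0
    for item in stream:
        digest = hashlib.md5(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:4], "big")  # keep the first 32 bits
        max_r = max(max_r, trailing_zeros(h, width))
    return 2 ** max_r


if __name__ == "__main__":
    # Repeated items do not change the estimate, only new distinct items can.
    words = ["to", "be", "or", "not", "to", "be"]
    print(fm_estimate(words))
```

With a single hash function the estimate is always a power of two and can be far off. Practical versions usually run many hash functions and combine the results, for example by taking the median of group averages.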
References
Textbooks
Big Data Big Analytics: Emerging Business Intelligence and Analytic Trends for Today's Businesses by Michael Minelli
Work-study by ILO
Reference books
Business Analytics: Data Analysis and Decision Making by S. Christian Albright
Big Data: Using Smart Big Data, Analytics and Metrics to Make Better Decisions and Improve Performance by Bernard Marr