Anoop Chaturvedi - Data Mining

Data Mining

Multivariate Data Mining- Methods and Applications

This forty-hour course is intended for students in Bachelor's and Master's programmes and covers topics in data mining.

Title of Course: Multivariate Data Mining- Methods and Applications

Name of Instructor: Anoop Chaturvedi

DOI: 10.13140/RG.2.2.14073.58729

Duration of the Course: 40 hours

Affiliation: Professor (Superannuated in June 2019), Department of Statistics, University of Allahabad

Email: anoopchaturv@gmail.com

Mobile: +91 9415214134

Suggested Books:

List of reference material/books:

(i) Izenman, A. J. (2008). Modern Multivariate Statistical Techniques: Regression, Classification, and Manifold Learning, Springer.

(ii) James, G., Witten, D., Hastie, T. and Tibshirani, R. (2013). An Introduction to Statistical Learning: with Applications in R, Springer.

(iii) Everitt, B. S., Landau, S., Leese, M. and Stahl, D. (2011). Cluster Analysis, 5th edition, Wiley.

(iv) Han, J. and Kamber, M. (2006). Data Mining: Concepts and Techniques, 2nd edition, Morgan Kaufmann.

(v) Dunham, M. H. (2003). Data Mining: Introductory and Advanced Topics, Pearson Education.

Title Lecture slides download link (Ctrl+click) Lecture videos download link

Introduction L 1 Lecture 1
Data Mining, Machine Learning and Artificial Intelligence L 2 Lecture 2
Machine Learning Rules L 3 Lecture 3
Matrix Algebra L 4 Lecture 4
Multivariate Analysis L 5 Lecture 5
Multiple Regression Model: Introduction L 6 Lecture 6
Properties of Estimators and Model Selection Criterion L 7 Lecture 7
Model Assessment for Multiple Regression L 8 Lecture 8
Multicollinearity and Variables Selection L 9 Lecture 9
Shrinkage Estimation L 10 Lecture 10
Principal Component and Least Angle Regression L 11 Lecture 11
Regression Methods for Classification L 12 Lecture 12
Principal Component Analysis L 13 Lecture 13
Statistical Analysis of PCA L 14 Lecture 14
Sample PCA and Applications L 15 Lecture 15
Sparse PCA and Nonlinear Dimensionality Reduction L 16 Lecture 16
Kernel Principal Component Analysis L 17 Lecture 17
Latent Variable Model for Blind Source Separation L 18 Lecture 18
ICA Algorithms and Exploratory Factor Analysis L 19 Lecture 19
Introduction to Artificial Neural Network L 20 Lecture 20
McCulloch-Pitts Neuron and Single-Layer Perceptron L 21 Lecture 21
Rosenblatt’s single-layer perceptron L 22 Lecture 22
Multi-layer perceptron L 23 Lecture 23
Backpropagation of Errors Algorithm L 24 Lecture 24
Convolutional Neural Networks L 25 Lecture 25
Recurrent neural network and Projection Pursuit L 26 Lecture 26
Cluster Analysis: An Introduction L 27 Lecture 27
Hierarchical Clustering Techniques L 28 Lecture 28
Centroid and Non-hierarchical Clustering Methods L 29 Lecture 29
Partition around medoids (PAM) Clustering Algorithm L 30 Lecture 30
Self-Organizing Map L 31 Lecture 31
Clustering based upon Mixture Models L 32 Lecture 32
Recursive Partitioning: Decision Trees L 33 Lecture 33
Training and Pruning Decision Trees L 34 Lecture 34
Regression Trees L 35 Lecture 35
Committee Machine and Random Forests L 36 Lecture 36
Support Vector Machine for Linearly Separable Cases L 37 Lecture 37
SVM for Linearly Non-Separable Cases L 38 Lecture 38
Block Clustering L 39 Lecture 39
Plaid Models for Block Clustering L 40 Lecture 40

Brief Description of Lectures:

Introduction: Introduction to data mining, its applications in various fields, Outline of the course
Data Mining, Machine Learning and Artificial Intelligence: Basics of data mining, data mining and knowledge discovery, artificial intelligence
Machine Learning Rules: Machine learning rules, supervised and unsupervised learning, batch learning and online learning, reinforcement learning, resubstitution estimate, generalizations for improving resubstitution estimates, training, learning and test sets, bootstrap, Ockham’s (or Occam’s) razor principle, methods for reducing the effects of overfitting, sampling design for obtaining data
Matrix Algebra: Introduction to vectors, operations on vectors, different types of vectors, different types of matrices, matrix operations, eigenvalues and eigenvectors, results related to orthogonal matrices, idempotent matrices, quadratic forms, matrix norms
Multivariate Analysis: Multivariate probability distributions, multivariate normal distribution, marginal and conditional distributions, expectation of some quadratic forms
Multiple Regression Model: Introduction: General structure of the regression problem, multiple linear models, estimation of parameters, model in deviation form
Properties of Estimators and Model Selection Criterion: Properties of estimators and model selection criteria, R-squared, adjusted R-squared, AIC, BIC
Model Assessment for Multiple Regression: Model assessment for random and fixed X, prediction error (PE), apparent error rate or resubstitution error rate, resampling methods, V-fold cross-validation, optimism-corrected bootstrap estimate of PE
Multicollinearity and Variables Selection: Multicollinearity problem, its implications and measures, stepwise variable selection in regression, backward and forward methods, hybrid stepwise method
Shrinkage Estimation: Shrinkage estimation, penalized regression estimators, LASSO and ridge regression
Principal Component and Least Angle Regression: Principal component regression and least angle regression methods
Regression Methods for Classification: Formulation of probability models, logit and probit models for classification
Data Mining Methods for High-Dimensional Data: Principal Component Analysis: The curse of dimensionality, basics and objectives of principal component analysis for linear feature space, advantages and disadvantages
Statistical Analysis of PCA: Population PCA, least-squares optimality of PCA, Eckart-Young theorem, Courant–Fischer min-max theorem, PCA as a variance-maximization technique
Sample PCA and Applications: Sample PCA, tools for selecting the number of principal components, real data applications of PCA, principal component analysis for data visualization
Sparse PCA and Nonlinear Dimensionality Reduction: Sparse and robust methods for PCA, PCA for outlier detection, nonlinear dimensionality reduction, polynomial PCA, basic elements of nonparametric density estimation
Kernel Principal Component Analysis: PCA for non-linear feature space, kernel PCA
Latent Variable Model for Blind Source Separation: Latent variable models for blind source separation: cocktail party problem, independent component analysis (ICA) and its applications, linear mixing, and noiseless ICA
ICA Algorithms and Exploratory Factor Analysis: FastICA algorithm for determining a single source component, deflation and parallel FastICA algorithms for extracting multiple independent source components, applications to a real dataset, exploratory factor analysis model
Introduction to Artificial Neural Network: Basics and structure of ANN, its various applications, ANN design and brain activity
McCulloch-Pitts Neuron and Single-Layer Perceptron: Threshold logic unit, McCulloch-Pitts neuron and its limitations, Hebb learning rule, different types of neural networks
Rosenblatt’s single-layer perceptron: Feedforward single-layer network, Rosenblatt’s single-layer perceptron, single-unit perceptron, algorithm for implementing Rosenblatt’s single-layer perceptron, perceptron convergence theorem
Multi-layer perceptron: Multilayer perceptron, learning networks, multiclass classification rule
Backpropagation of Errors Algorithm: Backpropagation of errors algorithm for a single hidden layer, online learning mode, stochastic learning mode, batch learning mode
Convolutional Neural Networks: Convolutional neural network (CNN), its architecture, and applications
Recurrent neural network and Projection Pursuit: Recurrent neural network (RNN) and CNN, basics of RNN, Elman and Jordan networks, projection pursuit regression, generalized additive model
Cluster Analysis: An Introduction: Basic elements and objectives of cluster analysis, various similarity and distance measures
Hierarchical Clustering Techniques: Distance measures for quantitative variables, hierarchical clustering method, agglomerative hierarchical clustering, single linkage clustering, complete linkage clustering, average linkage clustering
Centroid and Non-hierarchical Clustering Methods: Centroid linkage clustering, steps for implementation of agglomerative hierarchical clustering, Ward’s hierarchical clustering method, partitioning clustering: K-means clustering
Partition around medoids (PAM) Clustering Algorithm: PAM clustering algorithm, K-medoid and PAM clustering algorithm, fuzzy analysis, selecting the number of clusters, silhouette and average silhouette method
Self-Organizing Map: Self-organizing maps (SOM) or Kohonen neural network, on-line and batch versions of the SOM algorithm, distance weight version, U-matrix, hierarchical SOM, quality measures
Clustering based upon Mixture Models: Density-based clustering methods, clustering based on Gaussian mixture models, expectation-maximization clustering algorithm
Recursive Partitioning: Decision Trees: Components of decision tree classification and basic terminology, attribute selection measures: information gain and entropy, Gini index and node impurity function, choosing the best split, pruning algorithm for classification trees, recursive partitioning to grow a tree, mating
Training and Pruning Decision Trees: Overfitting and pruning the tree, cost-complexity pruning measure, choosing the best pruned tree, cross-validation for selecting the best subtree
Regression Trees: Regression trees: background, basic terminology, recursive partitioning for regression data, terminal node value and splitting strategy, pruning the tree and best pruned subtree
Committee Machine and Random Forests: Committee machine: bagging tree-based classifiers and regression tree predictors, boosting, AdaBoost algorithm for binary classification, random forests algorithm for regression or classification
Support Vector Machine for Linearly Separable Cases: Support vector machine (SVM) for the linearly separable case, obtaining the optimal separating hyperplane for the linearly separable case, Karush-Kuhn-Tucker conditions, multiclass SVM as a series of binary problems
Support Vector Machine for Linearly Non-Separable Cases: SVM for nonlinearly separable datasets, nonlinear SVM, kernel trick for nonlinearly separable datasets, SVM for regression, ε-insensitive loss function and its optimization
Block Clustering: Basics of block clustering, Hartigan’s block-clustering algorithm, bi-clustering, two-way ANOVA model for bi-clustering
Plaid Models for Block Clustering: Plaid models for bi-clustering with examples