222TEC001 : Foundations of Data Science (2024)

Module 3: Algorithms and Nonlinear Classifiers (7 hours)
- Linear discriminant-based algorithm: perceptron, perceptron algorithm
- Support vector machines
- Nonlinear classifiers, the XOR problem
- Multilayer perceptrons
- Backpropagation algorithm

Module 4: Unsupervised Learning and Ensemble Methods (8 hours)
- Unsupervised learning
- Clustering, examples, criterion functions for clustering
- Proximity measures, algorithms for clustering
- Ensemble methods: boosting, bagging
- Basics of decision trees, random forest, examples

Module 5: Deep Learning Networks (7 hours)
- Introduction to deep learning networks
- Deep feedforward networks
- Basics of convolutional neural networks (CNN)
- CNN basic structure, hyper-parameter tuning, regularization (dropout)
- Initialization, CNN examples
- Classification, recognition and segmentation, speech recognition, automatic language translation and auto-correction, recommendation engines

References

Bishop, C. M. (2006). Pattern Recognition and Machine Learning. New York: Springer.
Theodoridis, S., & Koutroumbas, K. (2003). Pattern Recognition. San Diego: Academic Press.
Hastie, T., Tibshirani, R., & Friedman, J. The Elements of Statistical Learning. Springer.
Duda, R. O., Hart, P. E., & Stork, D. G. Pattern Classification. New York: Wiley.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.

CO-PO Mapping and Rubrics for Evaluation

CO-PO Mapping

Evaluation Pattern

Continuous Internal Evaluation: 40 marks
- Quiz 1: 2.5 marks
- Quiz 2: 2.5 marks
- Seminar: 5 marks
- Course Project: 20 marks
- Internal Examination: 10 marks
End-semester Examination: 60 marks
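
The mark split above can be checked with a few lines of arithmetic. This is a minimal sketch in Python; the dictionary simply restates the components listed above, and the variable names are illustrative assumptions rather than part of the course regulations.

```python
# Continuous Internal Evaluation components, as listed above (in marks).
cie_components = {
    "Quiz 1": 2.5,
    "Quiz 2": 2.5,
    "Seminar": 5,
    "Course Project": 20,
    "Internal Examination": 10,
}
end_semester = 60  # End-semester Examination (in marks)

# Sum the internal components, then add the end-semester examination.
cie_total = sum(cie_components.values())
course_total = cie_total + end_semester

print(f"CIE total: {cie_total} marks")
print(f"Course total: {course_total} marks")
```

The internal components add up to the stated 40 marks, which together with the end-semester examination gives a course total of 100 marks.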

Course Project

As part of the course project, you are required to implement a machine learning algorithm using publicly available datasets. You are encouraged to select a topic from the list below, although you are free to explore any other area of interest. Obtain approval from the Course Instructor for your chosen topic, and submit a declaration confirming that the topic is distinct from the domain of your mini-project.

If the quality of your work meets the standard for academic publication, the Course Instructors may advise a joint submission for possible publication, reflecting the shared intellectual contribution.

Suggested Project Topics (each has an associated reference paper and dataset):
1. Classification of phonological categories in imagined speech
2. Classification of motor imagery from EEG
3. Classification of imagined words from EEG
4. Modeling wine preferences by data mining from physicochemical properties
5. Breast cancer histopathological image classification using AlexNet
6. Music genre classification with convolutional neural networks
7. Sentiment classification system of Twitter data for US airline service analysis
8. Classification of emotions from EEG
9. Online handwriting recognition system for Tamil
10. Real-time credit card fraud detection
11. Speech emotion recognition
12. Boston house price prediction using regression models
13. Breast cancer diagnosis
14. News classification
15. Classification of sentiment polarity of car and hotel reviews
16. Classification of fashion categories
17. Prediction of Titanic survival rate

Course Seminar

Fundamentals of Linear Regression

Introduction to regression analysis
Linear regression model and its assumptions
Understanding error functions

Exploring Multivariate Regression

Extension from simple to multivariate regression
Application areas of multivariate regression
Addressing bias and variance

Introduction to Classification Techniques

Overview of classification in machine learning
Bayes’ decision theory basics
Discriminant functions and decision surfaces

Bayesian Classification Methods

Principles of Bayesian classification
Applying Bayesian classification to normal distributions
Real-world applications of Bayesian classification

The Perceptron Algorithm: A Linear Discriminant Approach

Concept and history of the perceptron
The perceptron learning algorithm
Limitations and use cases
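
As a starting point for this seminar topic, the perceptron learning rule can be sketched in plain Python. This is an illustrative sketch, not a prescribed implementation: the function names, the learning rate, the epoch limit, and the toy AND dataset are all assumptions chosen for the example.

```python
def train_perceptron(samples, labels, lr=1.0, max_epochs=100):
    """Perceptron learning rule for labels in {-1, +1}.

    Returns (weights, bias). Stops early once an epoch makes no mistakes.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified, or exactly on the boundary
                # Move the decision boundary towards the misclassified sample.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                mistakes += 1
        if mistakes == 0:
            break
    return w, b


def predict(w, b, x):
    """Classify x as +1 or -1 using the learned linear discriminant."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Toy data (assumed for illustration): logical AND, which is linearly separable.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y_and = [-1, -1, -1, 1]

w, b = train_perceptron(X, y_and)
predictions = [predict(w, b, x) for x in X]
print(predictions)
```

Because the AND data is linearly separable, the loop terminates with every sample classified correctly. On data that is not linearly separable it would simply run until `max_epochs` is reached.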

Understanding Support Vector Machines (SVM)

Basics of SVM
Kernel trick and SVM optimization
SVM in classification tasks

Nonlinear Classifiers and the XOR Problem

Challenges with linear classifiers
Introduction to the XOR problem
Solutions via nonlinear classifiers
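
The three points above can be illustrated with a short Python sketch. It assumes a step activation and hand-chosen weights (an OR unit and an AND unit feeding one output unit). The grid search over a single unit is only an illustration over a small, assumed range of weights, not a proof of non-separability.

```python
from itertools import product


def step(z):
    """Threshold activation: 1 if z > 0, else 0."""
    return 1 if z > 0 else 0


# Truth table of XOR.
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def single_unit_solves_xor(w1, w2, b):
    """Check whether one linear threshold unit reproduces XOR."""
    return all(step(w1 * x1 + w2 * x2 + b) == t for (x1, x2), t in XOR.items())


# Try every (w1, w2, b) on a coarse grid from -4.0 to 4.0 in steps of 0.5.
grid = [i / 2 for i in range(-8, 9)]
linear_solutions = [p for p in product(grid, repeat=3) if single_unit_solves_xor(*p)]


def two_layer_xor(x1, x2):
    """Two hidden threshold units (OR, AND) combined by one output unit."""
    h_or = step(x1 + x2 - 0.5)
    h_and = step(x1 + x2 - 1.5)
    return step(h_or - h_and - 0.5)  # fires when OR is on and AND is off


mlp_outputs = {x: two_layer_xor(*x) for x in XOR}

print("Single-unit solutions found:", len(linear_solutions))
print("Two-layer outputs:", mlp_outputs)
```

No single threshold unit on the grid reproduces XOR, while the two-layer arrangement matches the truth table on all four inputs.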

Dive into Multilayer Perceptrons (MLP)

Structure and functioning of MLP
Importance of backpropagation algorithm
MLP in complex problem-solving

Unsupervised Learning: An Overview

Distinction between supervised and unsupervised learning
Applications of unsupervised learning

Clustering Techniques and Their Applications

Understanding clustering and its criteria
Proximity measures and clustering algorithms
Examples of clustering applications

Ensemble Learning Methods: Boosting and Bagging

Concept of ensemble learning
Differences and similarities between boosting and bagging
Decision trees and random forests as examples

Introduction to Deep Learning Networks

Evolution and significance of deep learning
Key components of deep networks

Deep Feedforward Networks: Architecture and Applications

Understanding deep feedforward networks
Architectural nuances and application areas

Basics of Convolutional Neural Networks (CNN)

Introduction to CNNs and their unique architecture
Key operations in CNNs (convolution, pooling)

Hyper-parameter Tuning in CNNs

Importance of hyper-parameter tuning
Strategies for effective tuning
Regularization techniques, including dropout

Initialization Techniques for Deep Learning

Role of initialization in model performance
Popular initialization methods and their impacts

CNN Applications: Image Classification and Beyond

CNNs in image classification
Extending CNN applications to recognition and segmentation

Speech Recognition with Deep Learning

Application of deep learning in speech recognition
Challenges and solutions in the field

Rubrics (Total: 110 marks)

Content (40 marks)

Accuracy (10 marks): Information is factually correct, well-researched.
Relevance (10 marks): Content is directly related to the seminar topic and objectives.
Depth (10 marks): Presentation covers the topic comprehensively, including background information and current trends.
Originality (10 marks): The presentation provides unique insights or a novel approach to the topic.

Organization (20 marks)

Structure (10 marks): Clear introduction, body, and conclusion; logical flow of ideas.
Pacing (10 marks): Time is well-managed, with neither rushed nor excessively slow segments.

Audio and Voice Delivery (20 marks)

Clarity (10 marks): Speaker articulates clearly, with good diction and appropriate volume.
Engagement (10 marks): Speaker uses tone variation and pauses effectively to maintain interest.

Visual Aids (10 marks)

Quality (5 marks): Slides or visual aids are legible, aesthetically pleasing, and free from excessive text.
Usefulness (5 marks): Visual aids enhance understanding of the topic and are relevant to the content discussed.

Understanding and Knowledge (10 marks)

Grasp of Topic (5 marks): Speaker demonstrates a strong understanding of the subject matter.
Responses to Hypothetical Questions (5 marks): Speaker anticipates and addresses potential questions in the presentation.

Technical Quality (10 marks)

Video/Audio Quality (10 marks): The audio is clear without background noise, and the video (if any visual elements are present) is steady and well-lit.

Please upload the slides and the videos here. Do not create a separate folder for each student. Format for filename: <RollNo>_<Name>_<Slides/Video>
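
The filename format above can be produced with a small helper. This is a sketch only: the function name and the placeholder roll number and student name are hypothetical, and no file extension is added because the format above does not specify one.

```python
def submission_filename(roll_no, name, kind):
    """Build a name following <RollNo>_<Name>_<Slides/Video>."""
    if kind not in ("Slides", "Video"):
        raise ValueError("kind must be 'Slides' or 'Video'")
    return f"{roll_no}_{name}_{kind}"


# Placeholder values for illustration only.
example = submission_filename("ROLL001", "StudentName", "Slides")
print(example)
```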

Quizzes

Quiz 1

Internals

TBA

Course Materials
