M.SC. (DATA SCIENCE) I-YEAR, II-SEMESTER
MDS-201: PAPER- I: ADVANCED MACHINE LEARNING TECHNIQUES
UNIT–I
Classification Techniques: Linear classifiers, Multiple Linear regression, Logistic regression, Linear Discriminant Function (for binary outputs) with minimum squared error, Linear discriminant function using likelihood ratios based on Multivariate normal populations, Bayes misclassification; Naïve Bayes classifier, Support Vector Machines, Decision Tree algorithms.
UNIT–II
Random Forest algorithm, Bagging, Gradient boosting, AdaBoost and XGBoost algorithms, KNN algorithm, Market-Basket Analysis. Definitions, derivations, methods, properties, and applications related to multivariate analysis techniques: Principal component analysis, Factor analysis, Multidimensional Scaling, Canonical Correlations and Canonical Variables.
UNIT–III
Conjoint Analysis, Path analysis, Correspondence analysis. Feature extraction and Feature selection techniques: Inter and intra class distance measures, Probabilistic distance measures. Cluster Analysis: Introduction, similarities and dissimilarities, Hierarchical and non-hierarchical clustering methods, Single, complete and average linkages, k-means, and k-Nearest Neighbour clustering methods.
UNIT–IV
Categorical Data Analysis techniques and their implementation to data sets and applications to various domains; Time series data Analysis.
Suggested Readings
1. Johnson, R.A. and Wichern, D.W.: Applied Multivariate Statistical Analysis.
2. Morrison, D: An Introduction to Multivariate Analysis.
3. Seber: Multivariate Observations.
4. Anderson: An Introduction to Multivariate Statistical Analysis.
5. Bishop: Analysis of Categorical data.
MDS-202-T: PAPER- II: ARTIFICIAL INTELLIGENCE
UNIT – I
Introduction: History of Intelligent Systems, Foundations of Artificial Intelligence, Sub areas of Artificial Intelligence, Applications of Artificial Intelligence.
Problem Solving: Introduction, General Problem Solving, Characteristics of Problems, State-Space Representation, Control Strategies.
UNIT–II
Search Techniques: Exhaustive Search Techniques, Heuristic Search Techniques, Iterative Deepening A*, Constraint Satisfaction Problems.
Game Playing: Introduction to Game Playing, Bounded Look-ahead Strategy, Use of Evaluation Functions, Alpha Beta Pruning.
UNIT–III
Logic Concepts and Logic Programming: Introduction, Propositional Calculus, Propositional Logic, Natural Deduction System, Axiomatic System, Semantic Tableau System in Propositional Logic, Resolution Refutation in Propositional Logic, Predicate Logic, Logic Programming. Knowledge Representation: Introduction, Approaches to Knowledge Representation, Knowledge Representation using Semantic Networks, Extended Semantic Networks for Knowledge Representation, Knowledge Representation using Frames.
UNIT–IV
Expert Systems and Applications: Introduction, Phases in Building Expert Systems, Expert System Architecture, Expert Systems Vs Traditional Systems, Truth Maintenance Systems, Applications of Expert Systems, List of Expert System Shells and Tools. Uncertainty Measures – Probability Theory: Introduction, Probability Theory, Bayesian Belief Networks, Certainty Factor Theory, Dempster–Shafer Theory.
Suggested Readings
1. Saroj Kaushik, Artificial Intelligence, Cengage Learning India, First Edition, 2011.
2. Russell, Norvig, Artificial Intelligence: A Modern Approach, Pearson Education, 2nd Edition, 2004.
3. Rich, Knight, Nair, Artificial Intelligence, Tata McGraw Hill, 3rd Edition, 2009.
MDS-203: PAPER-III: STATISTICAL PATTERN RECOGNITION
UNIT-I
Basic Concepts of Statistical Pattern Recognition, Pattern Recognition System, Fundamental problems in Pattern Recognition. Linear classifiers: Linear Discriminant Function (for binary outputs) with minimum squared error; Linear Discriminant function (for the normal density), Error bounds for Normal density. Statistical Decision Theory: Introduction, Bayes theorem, Bayes Decision Theory (continuous and discrete features), Bayes Classifier. Simple problems.
UNIT-II
Probability of errors: Two classes, Normal distribution, equal covariance matrix assumptions, Chernoff bounds and Bhattacharyya distance. Nearest Neighbour Decision rules: Nearest Neighbour Algorithm for classification, K-Nearest Neighbour Estimation. Variants of the Nearest Neighbour Algorithm, description, convergence, finite sample considerations. Estimation of probability of error in the case of Nearest Neighbour and Bayes classifiers. Minimum Error Rate Classifier, Estimation of Probabilities. Comparison of Nearest Neighbour with the Bayes Classifier. Simple problems.
UNIT-III
Hidden Markov Model and its use for pattern recognition. Branch and Bound Technique for classification. Neural Networks: Perceptron linear classifier. Support Vector Machines: construction of Support Vectors, Support Vector Machines algorithm for Classification. Simple problems. Combination of Classifiers: Introduction, Methods for Constructing Ensembles of Classifiers, Methods for Combining Classifiers.
UNIT-IV
Feature selection and extraction: Feature extraction and Feature selection techniques, Inter and intra class distance measures, Probabilistic distance measures, Principal Components Analysis for variable selection and dimensionality reduction. An Application: Handwritten Digit Recognition: Description of the Digit Data, Preprocessing of Data, Classification Algorithms, Selection of Representative Patterns, Results.
Suggested Readings:
1. R.O. Duda & P.E. Hart (1973): Pattern Classification and Scene Analysis, Wiley.
2. Earl Gose, Richard Johnsonbaugh and Steve Jost (2005): Pattern Recognition and Image Analysis, PHI. (Unit-II: from Ch. 3, 4, 5)
3. Murty, M. Narasimha, Devi, V. Susheela (2011): Pattern Recognition: An Algorithmic Approach, Springer, 1st Edition.
4. Duda, Hart & Stork (2001): Pattern Classification, 2nd edition.
MDS-204-T: PAPER- IV: SOFTWARE ENGINEERING
UNIT-I
Software Engineering: Nature of Software, Changing Nature of Software, Software Engineering as a Discipline, Software Process, Software Engineering Practice. Software Process Models: Generic Process Model, Framework Activities, Process Assessment and Improvement, Prescriptive Process Models, Unified Process. Agile Software Development: Defining Agility, Agile Process, Extreme Programming. Data Science Context: Software Engineering Using the Cloud, Global Software Development, Global Teams.
UNIT-II
Requirements Engineering: Core Principles of Modeling, Requirements Engineering Process, Establishing the Groundwork, Eliciting Requirements, Developing Use Cases, Building the Analysis Model, Requirements Analysis. Analysis Modeling: UML Models that Supplement the Use Case, Identifying Analysis Classes, Specifying Attributes, Defining Operations, Class-Responsibility-Collaborator (CRC) Modeling, Associations and Dependencies, Analysis Packages. Design Concepts: Design Process, Design Concepts, Design Model, Software Architecture, Architectural Styles, Architectural Considerations.
UNIT-III
Architectural Design: Software Architecture, Architectural Design, Component Concepts. Component-Level Design: Designing Class-Based Components, Conducting Component-Level Design, Component-Based Development. User Interface Design: User Interface Design Rules with emphasis on data-driven applications and analytics systems.
UNIT-IV
Quality Management: Quality, Software Quality, Software Quality Dilemma, Achieving Software Quality, Defect Amplification and Removal. Software Quality Assurance: Reviews, Informal Reviews, Formal Technical Reviews, Elements of Software Quality Assurance, SQA Tasks, Goals, and Metrics, Software Reliability. Software Testing: Strategic Approach to Software Testing, Validation Testing, System Testing, Debugging, Software Testing Fundamentals, White-Box Testing, Black-Box Testing, Object-Oriented Testing Strategies and Methods. Security Engineering: Security Engineering Analysis, Security Assurance, Security Risk Analysis.
Suggested Readings:
1. Roger S. Pressman, B.R. Maxim (2015): Software Engineering – A Practitioner’s Approach, 8th edition.
2. Ian Sommerville (2015): Software Engineering, 10th edition.
3. Hans Van Vliet (2008): Software Engineering, 3rd edition.
4. D. Bell (2005): Software Engineering for Students, 4th edition.
5. K.K. Aggarwal, Y. Singh (2001): Software Engineering, 1st edition.
6. R. Mall (2018): Fundamentals of Software Engineering, 5th edition.
MDS-205-P: PAPER- V (PRACTICAL-1): ADVANCED MACHINE LEARNING USING PYTHON LAB
List of Practicals using Python
1. Implement and demonstrate the use of a set of training data samples in different file formats such as .doc, .txt, .csv, .pdf, and .xls.
2. Implementation of classification techniques for the data sets and evaluation of the analysis using Naïve Bayes classifier, Support Vector Machines, and KNN algorithm.
3. Implementation of classification techniques for the data sets and evaluation of the analysis using Decision Tree and ensemble algorithms.
4. Evaluation of analysis using Multiple Linear and multiple Logistic regressions.
5. Write a program to demonstrate the working of the decision tree-based ID3 algorithm. Use an appropriate data set for building the decision tree and apply this knowledge to classify a new sample.
6. Implement the Backpropagation algorithm and test it using appropriate data sets.
7. Implementation of the Expectation-Maximization (EM) algorithm, k-Means, and KNN algorithm.
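As an illustration of practical 5, the attribute-selection step of ID3 can be sketched as follows. This is a minimal sketch, not a full tree builder: the function names (`entropy`, `information_gain`, `best_attribute`) and the four-row data set are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g)
                    for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """ID3 chooses the attribute with the largest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy data, made up for illustration only.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]
root = best_attribute(rows, labels, ["outlook", "windy"])
print("Attribute chosen for the root node:", root)
```

A complete ID3 implementation would apply `best_attribute` recursively to each subset until the labels in a subset are pure or no attributes remain.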
MDS-206-P: PAPER-VI (PRACTICAL-2): ARTIFICIAL INTELLIGENCE USING R / PYTHON LAB
List of Practicals:
1. Implementation of A* and AO* algorithms.
2. Implementation of Alpha-beta pruning.
3. Implementation of search algorithms (BFS & DFS).
4. Implementation of Hill Climbing algorithm.
5. Implementation of Gaming problems:
(i) Tower of Hanoi problem.
(ii) Tic-Tac-Toe problem.
(iii) Water-Jug problem.
(iv) 4-Queens problem.
(v) 8-Puzzle problem.
(vi) Monkey-Banana problem.
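Practicals 3 and 5(iii) can be combined in one short exercise: solving the Water-Jug problem by breadth-first search over a state space. The sketch below assumes the usual formulation, with two unmarked jugs and states written as `(a, b)`, and the moves fill, empty and pour. The function name `water_jug_bfs` and the jug sizes in the usage line are illustrative choices, not prescribed by the syllabus.

```python
from collections import deque


def water_jug_bfs(cap_a, cap_b, target):
    """Breadth-first search from (0, 0) to any state where one jug holds
    `target` units. Returns the shortest list of states, or None."""
    start = (0, 0)
    parent = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        a, b = queue.popleft()
        if a == target or b == target:
            # Walk back through the parent links to rebuild the path.
            path, state = [], (a, b)
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        pour_ab = min(a, cap_b - b)  # amount that can move from A to B
        pour_ba = min(b, cap_a - a)  # amount that can move from B to A
        successors = [
            (cap_a, b), (a, cap_b),          # fill A, fill B
            (0, b), (a, 0),                  # empty A, empty B
            (a - pour_ab, b + pour_ab),      # pour A into B
            (a + pour_ba, b - pour_ba),      # pour B into A
        ]
        for nxt in successors:
            if nxt not in parent:
                parent[nxt] = (a, b)
                queue.append(nxt)
    return None


path = water_jug_bfs(4, 3, 2)
print(path)
```

Because BFS explores states level by level, the first goal state it removes from the queue is reached by a minimum number of moves. Replacing the queue with a stack turns the same skeleton into DFS, which still finds a solution when one exists but not necessarily the shortest one.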
MDS-207-P: PAPER- VII (PRACTICAL-3): SPR LAB USING PYTHON
List of Practicals:
1. Problem Identification and Project Planning
Identify a suitable real-world problem that can be implemented using Python. Prepare a project definition document specifying objectives, scope, constraints, assumptions, and expected outcomes.
2. Requirements Engineering
Elicit, analyze, and document the functional and non-functional requirements of the selected project. Prepare a detailed Software Requirements Specification (SRS) document.
3. Use Case and Analysis Modeling
Develop use case diagrams and detailed use case descriptions for the proposed system. Identify the major system components and represent the data flow between them.
4. Design Modeling
Design the architecture of the Python-based system using appropriate class diagrams and modular design principles. Clearly illustrate classes, attributes, methods, and relationships.
5. Implementation, Testing, and Validation
Implement the core modules of the system using Python following modular programming practices. Design and execute unit and functional tests to validate system outputs against the specified requirements.
6. Statistical Pattern Recognition:
Compute the Linear Discriminant Classifier for
i) Two Multivariate Normal Classes (LDA)
ii) Binary Data, using the Minimum Squared Error Classifier
7. Apply Statistical Pattern Recognition techniques to perform feature selection using Principal Component Analysis (PCA) and classify the reduced feature sequences using a Hidden Markov Model (HMM).
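For practical 6(i), the linear discriminant for two multivariate normal classes with a common covariance matrix can be sketched as below. The sketch assumes equal prior probabilities and two-dimensional features, so the 2x2 matrix inverse can be written out by hand. The helper names and the sample points are hypothetical. The rule assigns x to class 1 when w·x exceeds the threshold, where w = S⁻¹(m1 − m2) and the threshold is w·(m1 + m2)/2.

```python
def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def pooled_covariance(c1, c2):
    """Pooled 2x2 sample covariance matrix of the two classes."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for pts in (c1, c2):
        m = mean(pts)
        for p in pts:
            d = (p[0] - m[0], p[1] - m[1])
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(c1) + len(c2) - 2
    return [[v / dof for v in row] for row in s]


def fit_lda(c1, c2):
    """Return (w, threshold) for the rule: class 1 if w.x > threshold."""
    m1, m2 = mean(c1), mean(c2)
    (a, b), (c, d) = pooled_covariance(c1, c2)
    det = a * d - b * c                      # assumed non-zero
    inv = [[d / det, -b / det], [-c / det, a / det]]
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Equal priors assumed: the boundary passes through the midpoint of the means.
    threshold = 0.5 * (w[0] * (m1[0] + m2[0]) + w[1] * (m1[1] + m2[1]))
    return w, threshold


def classify(x, w, threshold):
    """Assign x to class 1 or class 2 using the fitted discriminant."""
    return 1 if w[0] * x[0] + w[1] * x[1] > threshold else 2


# Hypothetical sample points, made up for illustration only.
class1 = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
class2 = [(6.0, 5.0), (7.0, 8.0), (8.0, 6.0), (7.0, 6.0)]
w, threshold = fit_lda(class1, class2)
print(classify((2.0, 2.0), w, threshold), classify((7.0, 6.0), w, threshold))
```

With unequal priors, the threshold shifts by the logarithm of the prior ratio. For higher-dimensional data a linear algebra library would replace the hand-written inverse.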
MDS-208-P: PAPER- VIII (PRACTICAL-4): DATA HANDLING USING SPSS
List of Practicals
1. Basic operations: Data entry, Data import and export, I/O file handling, etc.
2. Data Visualization: Pie diagram, Bar diagram, Histogram, Line plot, Frequency curves and polygons, Scatter Plot, Gantt Chart, Box Plot.
3. Descriptive Statistics: Measures of Central Tendency, Dispersion, Relative measures of Dispersion, Moments, Skewness, Kurtosis.
4. Parametric Tests: Testing for Mean(s), Variance(s), Proportion(s); ANOVA for one-way classification and for two-way classification with one and with m observations per cell, with and without interaction.
5. Non-Parametric tests: Sign test, Wilcoxon Signed-Rank test, Mann-Whitney U-test, Run test, Kolmogorov-Smirnov test, Chi-square test for goodness of fit and Chi-square test for independence.
6. Design & Analysis of Experiments: Analysis of Variance for Completely Randomized, Randomized Block and Latin Square Designs, and Factorial experiments (2² and 2³ factorial experiments without confounding).
7. Regression Analysis: Analysis of Simple and Multiple Linear Regression models, Selection of the Best Linear Regression Model (all possible, forward, backward, stepwise and stagewise methods), Binary and Multinomial Logistic Regression models, Probit analysis.
8. Multivariate Data Analysis: Linear Discriminant Analysis, Principal Component Analysis, Factor Analysis, Multidimensional Scaling, Cluster Analysis.