M.SC. (DATA SCIENCE) III-SEMESTER SYLLABUS
MDS-301: PAPER- I: DEEP LEARNING TECHNIQUES
UNIT–I
Artificial Neural Networks: Introduction, Biological Activation of Neurons; Artificial Neuron Models: McCulloch-Pitts, Perceptron, Adaline, Hebbian Models; Characteristics of ANN, Types of Neuron Activation Functions, Signal Functions and their Properties, Monotonicity; ANN Architecture, Classification Taxonomy of ANN; Supervised, Unsupervised and Reinforcement Learning; Learning Tasks, Memory, Adaptation, Statistical Nature of the Learning Process, Statistical Learning Theory; Gathering and Partitioning of Data for ANN and its Pre- and Post-Processing.
UNIT–II
Supervised Learning Algorithms: Perceptron Learning Algorithm, Derivation, Perceptron Convergence Theorem (statement); Multi-Layer Perceptron Learning Rule, Limitations; Applications of Perceptron Learning. Gradient Descent Learning, Least Mean Square Learning, Widrow-Hoff Learning; Feed-Forward and Feed-Back Back-Propagation Algorithms and their Derivation. Unsupervised Learning Algorithms: Hebbian Learning, Competitive Learning; Self-Organizing Maps, SOM Algorithm, Properties of the Feature Map, Computer Simulations; Vector Quantization, Learning Vector Quantization.
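Of the learning laws in this unit, the Widrow-Hoff (LMS) rule is perhaps the simplest to state in code. The following is a minimal NumPy sketch for illustration only; the noiseless linear target, learning rate, and epoch count are all assumptions chosen for demonstration, not prescribed values:

```python
import numpy as np

def lms_fit(X, d, lr=0.05, epochs=200):
    """Widrow-Hoff (LMS) rule: w += lr * (desired - w.x) * x."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias input of 1
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for xi, di in zip(Xb, d):
            err = di - xi @ w   # instantaneous error on this sample
            w += lr * err * xi  # stochastic gradient step on squared error
    return w

# Recover a known linear map d = 2*x1 - x2 + 1 from random samples.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 2))
d = 2 * X[:, 0] - X[:, 1] + 1
w = lms_fit(X, d)
print(np.round(w, 2))  # weights approach [2, -1, 1]
```

Because the target here is noiseless and linear, the learned weight vector converges to the generating coefficients, which makes the rule's behaviour easy to check by eye.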
UNIT–III
Radial Basis Function Networks, Approximation Properties of Radial Basis Function Networks; Boltzmann Machine, Hopfield Model; Reinforcement Learning, Markov Decision Process, Hidden Markov Model; Convolutional Neural Networks, Recurrent Neural Networks, Long Short-Term Memory Networks, Generative Adversarial Networks, Deep Belief Networks.
Suggested Readings:
1. Haykin, S. (1994): Neural Networks: A Comprehensive Foundation, Macmillan Publishing, New York. (A comprehensive text with a great deal of background theory.)
2. Yegnanarayana, B. (1999): Artificial Neural Networks, PHI.
3. Bart Kosko (1997): Neural Networks and Fuzzy Systems, PHI.
4. Jacek M. Zurada (1992): Artificial Neural Systems, West Publishing Company.
5. Carling, A. (1992): Introducing Neural Networks, Sigma Press, Wilmslow, UK.
6. Box and Jenkins: Time Series Analysis, Springer.
MDS-302: PAPER- II: COMPUTER NETWORKS
UNIT–I
Computer Networks Fundamentals: Overview, Network Hardware, Network Software; Reference Models: OSI Model, TCP/IP Reference Model, Comparison of the OSI and TCP/IP Reference Models; Example Networks, Network Standardization. Physical Layer: Guided Transmission Media, Wireless Transmission, Multiplexing, Switching. Data Link Layer: Design Issues, Error Detection and Correction, Data Link Layer Protocols, Sliding Window Protocols.
UNIT–II
Multiple Access Sublayer: ALOHA, CSMA, Collision-Free Protocols, Ethernet, Wireless LAN 802.11; Data Link Layer Switching: Repeaters, Hubs, Bridges, Switches, Routers, Gateways. Network Layer: Design Issues; Routing Algorithms: Shortest Path, Flooding, Distance Vector Routing, Link State Routing, Hierarchical Routing, Broadcast Routing, Multicast Routing; Congestion Control Algorithms. Internetworking: Tunneling, Internetwork Routing, Fragmentation, IPv4 vs. IPv6 Protocol, IP Addresses, CIDR; Internet Control Protocols: ICMP, ARP, RARP, DHCP.
UNIT–III
Transport Layer: Services Provided to the Upper Layers, Transport Protocols, Overview of Congestion Control. The Internet Transport Protocols – UDP: Introduction to UDP & RPC, Real-Time Transport Protocols. The Internet Transport Protocols – TCP: TCP Service Model, TCP Protocol, TCP Segment Header, TCP Connection Establishment, TCP Connection Release, Modelling TCP Connection Management, TCP Sliding Window, TCP Timer Management, TCP Congestion Control. Application Layer: DNS, TELNET, E-Mail, FTP, HTTP, SSH, Overview of the WWW.
Suggested Readings:
1. Andrew S. Tanenbaum, David J. Wetherall: Computer Networks (5th Edition).
2. William Stallings: Data and Computer Communications.
3. Behrouz A. Forouzan: Data Communications and Networking.
4. Behrouz A. Forouzan, Firouz Mosharraf: Computer Networks: A Top-Down Approach.
MDS-303 (C): PAPER- III (C): CLOUD COMPUTING
UNIT-I
Introduction, Benefits and Challenges; Cloud Computing Services; Resource Virtualization; Resource Pooling, Sharing and Provisioning; Case Studies of IaaS, PaaS and SaaS; Scaling in the Cloud, Capacity Planning, Load Balancing; File System and Storage, Containers; Multi-Tenant Software, Data in the Cloud, Database Technology.
UNIT-II
Content Delivery Network; Security Reference Model, Security Issues; Privacy and Compliance Issues; Portability and Interoperability Issues; Cloud Management and a Programming Model Case Study; Popular Cloud Services.
UNIT-III
Enterprise architecture and SOA, Enterprise Software, Enterprise Custom Applications, Workflow and Business Processes, Enterprise Analytics and Search, Enterprise Cloud Computing Ecosystem.
Suggested Readings:
1. Sandeep Bhowmik (2017): Cloud Computing, Cambridge University Press.
2. Gautam Shroff (2016): Enterprise Cloud Computing – Technology, Architecture, Applications, Cambridge University Press.
3. Kai Hwang, Geoffrey C. Fox, Jack J. Dongarra (2012): Distributed and Cloud Computing: From Parallel Processing to the Internet of Things, Elsevier.
MDS-304 (A): PAPER- IV(A): BIG DATA ANALYTICS
UNIT-I
Foundations of Big Data and the Hadoop Ecosystem: Introduction to Big Data; Elements of Big Data Analytics (Volume, Velocity, Variety, Veracity). Business Contexts and Use Cases: Social Media Analytics, Fraud Detection (General and in Insurance), Retail and Customer Analytics. Technologies for Handling Big Data: Core Principle: Distributed and Parallel Computing; Introducing Hadoop: the solution to big data challenges. Hadoop Ecosystem: Storage: HDFS (Hadoop Distributed File System); Processing: MapReduce Framework; Resource Management: Hadoop YARN (Introduction). Data Access and Ingestion Tools (Overview): HBase (NoSQL Database), Sqoop (data transfer between Hadoop and RDBMS), Flume (ingesting log/streaming data).
UNIT-II
Data Processing with MapReduce and Data Storage: Deep Dive into MapReduce: the MapReduce Framework (Mapper, Shuffle & Sort, Reducer); Techniques to Optimize MapReduce Jobs (Combiners, Partitioners); Developing a Simple MapReduce Application. Data Storage Paradigms: RDBMS vs. Big Data, Limitations of Traditional Databases; Introduction to NoSQL Databases: HBase and its Role in Big Data Processing; Integrating Big Data with Traditional Data Warehouses. Customizing and Controlling MapReduce: Controlling Execution with Input/Output Formats; Optimizing Jobs with Combiners; Implementing a MapReduce Program for a Classic Problem (e.g., Sorting).
UNIT–III
Data Analysis with High-Level Tools (Hive, Pig) and Modern Architecture: Modern Hadoop Architecture: Hadoop YARN: Architecture, Advantages and Working. Data Warehousing and SQL-on-Hadoop: Exploring Hive: Introduction and Hive Services; Hive Data Types, DDL and DML; Data Retrieval Queries and JOINs in Hive. Data Flow Programming: Analyzing Data with Pig: Introducing Pig and Pig Latin; Working with Key Operators and Functions; Running and Debugging Pig Scripts. Introduction to Analytics: Understanding Analytics vs. Reporting; Types of Analytics (Descriptive, Diagnostic, Predictive, Prescriptive); Overview of Analytical Approaches and Tools.
Suggested Readings:
1. DT Editorial Services: Big Data (Black Book), Dreamtech Press.
2. Radha Shankarmani, M. Vijayalakshmi: Big Data Analytics.
3. Arshdeep Bahga, Vijay Madisetti: Big Data Science & Analytics – A Hands-On Approach.
4. Frank Ohlhorst: Big Data Fundamentals – Concepts, Drivers, Techniques.
MDS-305 (LAB-I): PAPER- V:
DEEP LEARNING & COMPUTER NETWORKS USING PYTHON (LAB)
List of Experiments on Deep Learning
1. Implementation of the Perceptron and Multi-Layer Perceptron learning algorithms.
2. Implementation of the Gradient Descent learning rule.
3. Implementation of the Least Mean Square learning law.
4. Implementation of the Widrow-Hoff learning rule.
5. Implementation of the Back-Propagation learning algorithm.
6. Implementation of Hebbian learning.
7. Implementation of Competitive learning.
8. Implementation of a Markov Decision Process.
9. Implementation of a Hidden Markov Model.
10. Implementation of Convolutional Neural Networks.
11. Implementation of Recurrent Neural Networks.
12. Implementation of Long Short-Term Memory.
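As a starting point for Experiment 1, the single-layer perceptron learning rule can be sketched in a few lines of NumPy. The toy data (the logical AND problem), learning rate, and epoch count below are illustrative assumptions, not a prescribed solution:

```python
import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=50):
    """Perceptron learning rule: w += lr * (target - prediction) * x."""
    # Augment inputs with a bias column of ones.
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for xi, ti in zip(Xb, y):
            pred = 1 if xi @ w >= 0 else 0  # hard-limit (step) activation
            w += lr * (ti - pred) * xi      # update only when the output is wrong
    return w

def predict(w, X):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return (Xb @ w >= 0).astype(int)

# Linearly separable toy problem: logical AND.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w = train_perceptron(X, y)
print(predict(w, X))  # prints: [0 0 0 1]
```

Since AND is linearly separable, the perceptron convergence theorem (Unit II) guarantees this loop terminates with a correct separating weight vector; replacing the data with XOR demonstrates the perceptron's limitations discussed in the same unit.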
List of Experiments on Computer Networks:
1. TCP Echo Server & Client: Write a Python program where a client sends a message to a server, and the server echoes it back.
2. UDP Time Server: Write a program where a UDP client requests the current time from a server, and the server responds with its system time.
3. Simulate a sender and receiver that implement the Stop-and-Wait protocol with ACK/NACK for error control. Introduce a random probability of packet loss.
4. Implement a simple version of the Go-Back-N sliding window protocol to demonstrate flow control.
5. Write a Python script using scapy to capture ICMP packets (from a ping command), then parse and print the fields of the captured IP and ICMP headers (e.g., Version, TTL, Type, Code).
6. Write a program to simulate the functionality of nslookup or dig.
7. Develop a simple console-based chat application.
   a) Create a multi-threaded server that can accept connections from multiple clients and broadcast messages from one client to all others.
   b) Create a peer-to-peer chat where users can send messages to each other's IP addresses and ports.
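Experiment 1 admits a compact sketch using only the standard socket and threading modules. For demonstration, the server and client below run in one script on the loopback interface (a real lab solution would split them into two programs); port 0 is an assumption that lets the OS pick any free port:

```python
import socket
import threading

HOST, PORT = "127.0.0.1", 0  # port 0: let the OS pick a free port

def echo_server(sock):
    """Accept one client and echo back whatever it sends."""
    conn, _ = sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)

# Set up the listening socket first so the client cannot race the server.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind((HOST, PORT))
server.listen(1)
port = server.getsockname()[1]  # actual port chosen by the OS

t = threading.Thread(target=echo_server, args=(server,))
t.start()

# Client side: connect, send a message, read the echo.
with socket.create_connection((HOST, port)) as client:
    client.sendall(b"hello, network")
    reply = client.recv(1024)

t.join()
server.close()
print(reply.decode())  # prints: hello, network
```

The same skeleton extends naturally to the later experiments: swapping SOCK_STREAM for SOCK_DGRAM gives the UDP time server, and moving the accept loop into per-client threads gives the multi-threaded chat server of Experiment 7.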
PRACTICAL LISTS OF ELECTIVE-II:
E-I: C) CLOUD COMPUTING
1. Install VirtualBox / VMware Workstation with different flavours of Linux or Windows OS on top of Windows 7 or 8.
2. Install a C compiler in the virtual machine created using VirtualBox and execute simple programs.
3. Install Google App Engine. Create a hello-world app and other simple web applications using Python/Java.
4. Use the GAE Launcher to launch the web applications.
5. Simulate a cloud scenario using CloudSim and run a scheduling algorithm that is not present in CloudSim.
6. Find a procedure to transfer files from one virtual machine to another.
7. Find a procedure to launch a virtual machine using TryStack (online OpenStack demo version).
8. Install a Hadoop single-node cluster and run simple applications like word count.
E-II: A) BIG DATA ANALYTICS
1. Installation of the Hadoop framework and its components; study the Hadoop ecosystem.
2. Write a program to implement word count using MapReduce.
3. Experiment on Hadoop MapReduce / PySpark: implementing simple algorithms in MapReduce, e.g., matrix multiplication.
4. Install and configure MongoDB / Cassandra / HBase / Hypertable to execute NoSQL commands.
5. Implement the DGIM algorithm using any programming language / implement a Bloom filter using any programming language.
6. Implement and perform streaming data analysis using Flume for data capture and PySpark / Hive for data analysis of Twitter data, chat data, weblog analysis, etc.
7. Implement any one clustering algorithm (K-Means/CURE) using MapReduce.
8. Implement the PageRank algorithm using MapReduce.
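The word-count program of Experiment 2 can first be prototyped without a Hadoop cluster by simulating the three MapReduce phases in plain Python; the sample documents below are illustrative, and a real Hadoop or PySpark job would distribute the same logic across nodes:

```python
from collections import defaultdict

def map_phase(documents):
    """Mapper: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)

def shuffle_phase(pairs):
    """Shuffle & sort: group all values by key, as Hadoop does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reducer: sum the counts emitted for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["big data is big", "data is everywhere"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
print(counts["big"], counts["data"], counts["is"])  # prints: 2 2 2
```

Because each mapper output depends only on its own document and each reducer only on one key's group, both phases parallelize trivially, which is exactly the property the Hadoop framework exploits.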
MDS-307 (LAB-III): PAPER- VII: MINI PROJECT
PROJECT GUIDELINES:
1. The Head of the Department will appoint an internal supervisor to guide the students of each group.
2. Each group should consist of 3-5 students only (industry internship students are exempted).
3. Each group has to search for an internship in any industry/institution; if none is found, the group has to choose a project with the help of the allotted supervisor. The aim of the project work is to develop solutions to realistic problems by applying the knowledge and skills obtained in the courses studied, along with specializations, new technologies and current industry practices.
4. Each student in the group must actively participate and report to the internal supervisor each week.
5. Each student has to give a minimum of two seminars: one in the second week (Project Design Seminar) and another in the eighth week (Project Progress Seminar).
6. Submit the title of the project and a one-page abstract/synopsis of the project in the first week to the Head, forwarded by the internal supervisor.
7. Each project group should give a 30-minute PowerPoint presentation, followed by 10 minutes of discussion.
8. Project seminar presentations should contain: source of the data, sample data, data description, literature survey of similar studies, objectives of the study, methodology, statistical techniques, work plan, details of the progress of the work, individual roles, work distribution and plans, etc.
9. Each group's project report should follow the Ph.D. thesis norms, with a plagiarism report. Each group has to submit two copies, duly signed by the students, the supervisor and the Head of the Department, along with an industry certificate (if applicable), on or before the last instruction date of the semester.
10. Marks for the project will be awarded based on all stages of the project: the topic chosen, seminar presentations, communication skills, the role/contribution of the student in the project, etc., and a viva-voce conducted by the internal and external examiners.
11. The project dissertation should follow the Ph.D. typing format: font Times New Roman, size 12; titles in capital letters; subtitles in bold small letters; double spacing; paragraph indentation; justified alignment. Numbering should follow the norms (chapter.subheading.section, e.g., 2.3.4), and all equations should also be numbered as per the norms. Table numbers appear on top with a heading, and figure numbers at the bottom, etc.
12. The Data Science project report, submitted hard bound (one copy for the exam branch and one for the institution), should contain:
1. Title page in the prescribed format
2. Declaration by the student in the prescribed format
3. Certificate from the industry/supervisor in the prescribed format
4. Acknowledgments
5. Contents in the prescribed format
6. The project dissertation should cover the following chapters:
a. Introduction (introduction, motivation for the topic, significance and need of the study, problem statement, objectives of the study, chapter-wise summary)
b. Review of the Literature (introduction with the list of authors who have contributed to the topic, collected from journals; description of each author's method (5-10 methods) and a comparative study)
c. Materials and Methods (sample data set, sample description, software used, tools and techniques applied)
d. Program Code for Implementation
e. Exploratory Data Analysis (outputs of descriptive statistics, data visualizations, data interpretations and conclusions)
f. Model Building (outputs of the models used for the data sets, data interpretations and conclusions)
g. Conclusions and Future Scope
h. Appendix (if any)
i. Bibliography (minimum 30 articles)
j. Plagiarism Report