Courses in Fall 2021
CS6100 - Big Data Storage, Retrieval and Processing
Prerequisites:
By Courses: CS3100 or CS3310 or equivalent with a grade of C or better; or permission of the instructor.
By Topic: Advance understanding of high-level language programming - conditional structures; looping structures; arrays; program logic - to solve problems; object oriented programming - be able to create and use objects; software life cycle; validating quality of software produced; sorting and searching algorithms; data structures (linked lists, queues, stacks, hash maps); divide-and-conquer, greedy, and backtracking algorithm design techniques; documenting programs effectively and efficiently; team work. Proficient in Python, R, Java, C, C++, or C# beyond the experience in CS1120. Low-level, systems programming, and Linux programming experience desired. Some mathematics and statistics background. Exposure to machine-learning, Python & R programming are a plus. Strong desire, self-motivation & dedication to learn and contribute to the interesting area of big data computing (mainly storage, retrieval and processing of big data).
This course is a 3 credit hour advance undergraduate or graduate level course, intended for students who plan to design and develop applications, or pursue R&D in the exciting area of big data computing. The primary objective of this course is to familiarize the students with the most important information technologies used in manipulating, storing, and analyzing big data. Mainly focusing on big data models, reduction techniques, applications architectures and big data analytics using cloud-based and other related services, this course will provide students with the knowledge and hand-on experience in designing and implementing big data applications. Topics may include the characteristics and challenges of the big data, state-of-the-art computing paradigm sand platforms (e.g., MapReduce), big data programming tools (e.g., Hadoop, GFS, MongoDB and various cloud-based tools), big data extraction and integration, big data storage, scalable indexing for big data, big graph processing, big data stream techniques and algorithms, big probabilistic data management, big data privacy, big data visualizations, and big data applications (e.g., spatial, finance, multimedia, medical, health, and social data). This course will also explore the current challenges facing big data computing.
The course provides the student with an advanced understanding of the issues involved in dealing with Big Data. It prepares the student for advanced handling of extremely large data sets, accessing the data, reduction of the data into a manageable size and processing the results. Students will reduce Big Data sets, use and develop R packages and other code to analyze the data and produce graphics to explore and explain the data. This course will be very small team project oriented.
Big Data Fundamentals: Concepts, Drivers & Techniques by Erl, Khattak and Buhler, 2016, Pearson, ISBN: 9780134291208 (ebook), 9780134291079 (paper).
We will also extensively refer to research papers, material available on the web and material from the following recommended textbooks.
Data Science on the Google Cloud Platform by Valliappa Lakshmanan, 2018, O'reilly, ISBN: 9781491974568
Big-Data Analytics for Cloud, IoT & Cognitive Learning by Kai Hwang and Min Chen, 2017, John Wiley & Sons Inc, ISBN: 9781119247029
Hadoop: The Definitive Guide, Third Edition, by Tom White (O'Reilly)
Data-Intensive Text Processing with MapReduce, by Jimmy Lin and Chris Dyer
R for Data Science by Hadley Wickham and Garrett Grolemund, O'Reilly, 2016, ISBN: 9781491910399
Goals
Understand the big data characteristics and challenges
Understand real world applications and their techniques involving big data
Know the current big data processing platforms and tools
Understand big data collection, integration and storage
Learn the big data indexing
Learn various queries over big data
Learn the core techniques of processing big data
Familiarity with big data characteristics and challenges
Proficiency with at least one comprehensive big data handling tool
Experience with surveying big-data-related topics and presenting them (orally as well as written)
Design and implemention of a R&D project on big data problems
Collaboration with team members to study the big data techniques and learn big data tools
Students' progress and achievements towards reaching the course goals and objectives will be apparent from such measures as: the results and creativity displayed in course assignments; and the contents of and performance on certain exam portions.
Courses in Summer I 2021
CS3310 - Data and File Structures
T Th 10:30am-12:50pm (CRN 22190) in synchronous online, 3 credit hours
Office Hrs:
UG TAs - TBA
Tutoring Help
Instructor: Ajay Gupta : Tuesdays and Thursdays 02:00-03:00pm via MS Teams, and by appointment
Prerequisites:
Solid undergraduate background in programming in high-level and OOP language and/or consent of the instructor.
By Courses: CS1110 and CS 1120 and (CS 1310 or MATH 1450), with a grade of “C” or better, or equivalent.
By Topic: Advance understanding of high-level language programming - conditional structures; looping structures; arrays; program logic - to solve problems; object oriented programming - be able to create and use objects; sorting and searching algorithms; documenting programs effectively and efficiently; team work. Proficient in Java, C, C++, or C# beyond the experience in CS1120. Low-level, systems programming and Linux programming experience desired. Undergraduate level mathematics and statistics background. Strong desire, self-motivation & dedication to learn and contribute to the interesting areas of computer science.
More may be added...
Catalog Description:
This course focuses on the study of internal and external data structures and algorithms with an ongoing emphasis on the application of software engineering principles. Trees, graphs and the basic algorithms for creating, manipulating and using them will be studied. Various types of hash and indexed random access file structures will be discussed and implemented. B-trees and external file sorting will be introduced. Internal and external data and file organizations and algorithms will be compared and analyzed. Students will carry out a number of programming projects which will include the various interface (person-to-person, module-to-module, person-to-module-to-person) aspects of the software development process.
None
We will be extensively using Shaffer's Data Structures and Algorithms book - the link is to an older version, we will provide you an online interactive latest version on Canvas. A few other reference books will be available on eResearve at Waldo library.
Goodrich M. and Tamassia R. "Data Structures and Algorithms in Java,", 6th Edition, ISBN-13: 978-1118771334
ISBN-10: 1118771338, Wiley, 2014.
Sedgewick R and Wayne K., "Algorithms," Addision Wesley, 2011.
Shaffer, C., ""Data Structures and Algorithms Analysis," Dower Publications, 2009-2013, see OpenDSA project.
more Introductory / intermediate level texts to be added...
Corman, T., Leiserson, C. E., and Rivest, R. L., "Introduction to Algorithms," McGraw Hill, 1990.
Horowitz, E., Sahni, S. , and Rajasekaran, S., "Computer Algorithms in C++," Computer Science Press, (1997).
Knuth, D. E., "Fundamental Algorithms: The Art of Computer Programming," Addison-Wesley, Menlo Park, CA (1973).
Knuth, D. E., "Sorting and Searching: The Art of Computer Programming," Addison-Wesley, Menlo Park, CA (1973).
Sedgewick, R. "Algorithms," Addison-Wesley 1983.
Tarjan, R. E., "Data Structures and Network Algorithms," Society of Industrial and Applied Mathematics, 1983.
Course Learning Outcomes
Students who earn a "C" or better in this course should be able to:
communicate knowledgeably about data & file structures and their algorithms
select appropriate data & file structures for a particular problem
develop and test multiple programming applications
use selected data & file structures in programming projects
write readable, maintainable programs
follow written specifications for program functionality, structure and user interface
Students' progress and achievements towards reaching the course goals and objectives will be apparent from such measures as: the results and creativity displayed in course assignments; and the contents of and performance on certain exam portions.
Courses in Spring 2021
CS4310-5310 - Design and Analysis of Algorithms
This course can satisfy one of the foundation courses required for computer science masters degree for the WMU-CS undergraduate students continuing their graduate studies. Check the details of foundation courses at
https://wmich.edu/cs/academics/graduate/ms-program
Prerequisites:
By Courses: (MATH 1450 or CS1310) and CS3310 or equivalent with a grade of C or better; or permission of the instructor.
By Topic: Advance understanding of high-level language programming - conditional structures; looping structures; arrays; program logic - to solve problems; object oriented programming - be able to create and use objects; software life cycle; validating quality of software produced; introductory sorting and searching algorithms; elementary data structures (linked lists, queues, stacks, hash maps); documenting programs effectively and efficiently; team work.
This is an exciting but challenging course basic, fundamental and foundational to computer science. This course is a continuation of the study of data structures and algorithms, emphasizing methods useful in practice. It provides a theoretical foundation in designing algorithms as well as their efficient implementations. The focus is on the advanced analysis of algorithms and on how the selections of different data structures affect the performance of algorithms. Topics covered include: sorting; search trees, heaps, and hashing; divide-and-conquer; dynamic programming; backtracking; branch-and-bound; amortized analysis; graph algorithms; shortest paths; network flow; computational geometry; number-theoretic algorithms; polynomial and matrix calculations; and parallel computing.
The course is appropriate for seniors and beginning graduate students. Like most college courses, this course requires students to take responsibility for their own learning. The course follows a strict schedule of reading, writing, and quizzes. Each week reading will cover one or two chapters of the textbook. The reading, writing, and examination schedule is firm. Students are required to develop the discipline to follow the schedule.
As is typical of many college courses, this course will require two midterm examinations during the course and a number of quizzes.
Classroom activities, unlike the readings and quizzes, are somewhat less structured. This allows for tangents in discussions, the use of occasional visiting guests, unforeseen instructor absences, holidays, etc. The flexibility of the classroom does not tie students or instructor to the textbook readings, but does complement and enhance those readings. Students are responsible for material in the textbook, whether or not the material is addressed in the classroom. Students are also responsible for material and skills mentoned, presented or discussed in class.
Algorithm Design and Applications by Michael Goodrich and Roberto Tamassia, Oct 2014, Wiley, ISBN - 978-1-118-33591-8.
We will also be using Shaffer's Data Structures and Algorithms book - the link is to an older version, we will provide you an online interactive latest version on Canvas
Notes - Summation and Recurrence Relations.
A number of textbooks (in addition to GT's book) on algorithm design appropriate for this course have been published. The course will cover material from these books as well as material from research papers.
The book "Algorithms (4th Edition) by Robert Sedgewick and Kevin Wayne, Addison-Wesley Professional, ISBN-10: 032157351X, ISBN-13: 978-0321573513" used in previous semesters may suffice for you, if you can copy problems sets needed in homeworks from one of your colleagues who has purchased the required texbook.
The book "Algorithm Design: Foundations, Analysis and Internet Examples by Michael Goodrich and Roberto Tammassia, Wiley, ISBN: 978-0-471-38365-9" may also suffice, if you can copy problems sets needed in homeworks from one of your colleagues who has purchased the required texbook.
Introduction to Algorithms (3rd ed.) by Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2009) [1990]. MIT Press and McGraw-Hill. ISBN 0-262-03384-4 is a good reference book to have. Read about it in its wiki page
Reinforce analytic development and problem solving abilities, and develop a deeper foundation in computer science.
Show progress with regard to understanding the analysis and performance of algorithms (for further use, e.g., in graduate level courses), including knowledge and use of terminology and how the theory connects with real-world applications, possibly in different and new areas.
Apply the concepts covered in the course to written and practical problems, e.g., by combining problem solving with computer programming and the use of software tools as part of assignments.
Students who earn a “C” or better in this course should have knowledge of
Sequential algorithms pertaining to the greedy, divide-and-conquer, dynamic programming, backtracking and branch-and-bound paradigms;
Parallel algorithms pertaining to SIMD, MIMD, shared memory and message passing systems;
Introductory databases and data management applications;
Analyzing iterative and recursive sequential and parallel algorithms;
Efficient data structures such as AVL trees, 2-3 trees, min-max heaps, B-trees.
CS6310 - Design of Algorithms and Advance Data Structures
Prerequisites:
Undergraduate background in Design and Analysis of Algorithms and/or consent of the instructor.
This is a challenging graduate course basic to computer science. A primary objective is to develop the advanced skills necessary for designing sequential and parallel high-performing algorithms. We will study advance concepts in various algorithm design paradigms such as divide & conquer, greedy, dynamic programming, backtracking, and branch & bound. We will also study NP problems, lower-bound proofs, approximation and randomized algorithms as well as combinatorial optimization. Focus is on algorithmic thinking, performance guarantees and boundary cases, and efficient solutions to practical problems. Two-thirds of this course will be devoted to algorithms and the rest to development/understanding/application of data structures.
One cannot design efficient algorithms (or solution to computational problems) without careful selection and use of appropriate data structures. On the other hand, one can't discuss data structures without the context of algorithms and problems in which they are most efficient. In this course, we re-examine, in depth, the basic algorithm design techniques (namely, divide & conquer, greedy, dynamic programming, backtracking, and branch-and-bound) and the data structures already known to us (namely, lists, priority queues, search trees, graphs, etc.). We will see and evaluate variations of these basic structures and their applicability to various fundamental and todays computational problems.
Introduction to Algorithms (3rd ed.) by Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2009) [1990]. MIT Press and McGraw-Hill. ISBNs 9780262033848 and 0-262-03384-4 is a good book to have. Read about it in its wiki page
Notes - Summation and Recurrence Relations.
A number of textbooks (in addition to CLRS's book) on algorithm design appropriate for this course have been published. The course will cover material from these books as well as material from research papers.
Algorithm Design by J. Kleinberg and E. Tardos, Pearson / Addison Wesley, 2006, ISBN-10: 0-321-29535-8; ISBN-13: 978-0-321-29535-4.
Computers and Intracatibility: The Guide to the Theory of NP-Completemness by M. Garey and D. Johnson, W. H. Freeman, 1979.
Randomized Algorithms by R. Motwani and P. Raghavan, Cambridge University Press, 1995.
Combinatorial Optimization - Algorithms and Complexity by C. Papadimitriou, K. Steiglitz, Prentice-Hall, 1982.
Clever Algorithms: Nature-Inspired Programming Recipes by Jason Brownlee, LuLu, 2011
Reinforce analytic development and problem solving abilities, and develop a deeper foundation in computer science.
Show progress with regard to understanding the analysis and performance of algorithms (for further use, e.g., in R&D), including knowledge and use of terminology and how the theory connects with real-world applications, possibly in different and new areas.
Apply the concepts covered in the course to written and practical problems, e.g., by combining problem solving with computer programming and the use of software tools as part of assignments.
Students who earn a “C” or better in this course should have knowledge of
Sequential high-performance computing (HPC) algorithms pertaining to the greedy, divide-and-conquer, dynamic programming, backtracking and branch-and-bound paradigms;
Introductory parallel algorithms pertaining to SIMD, MIMD, shared memory and message passing systems;
Analyzing iterative and recursive HPC algorithms;
Efficient data structures such as min-max heaps, B-trees, k-d trees, union-find;
Exposure to approximation algorithms, randomized algorithms, and combinatorial optimization.