Syllabus
Fundamentals of Data Science
Prerequisite: Programming knowledge
Course Outcomes:
CO1: Identify the basic building blocks of Python to solve mathematical problems (Understand)
CO2: Describe the key concepts in data science (Remember)
CO3: Enumerate the fundamentals of NumPy (Understand)
CO4: Demonstrate the fundamentals of Pandas (Understand)
CO5: Demonstrate data analysis, manipulation and visualization of data using Python libraries (Apply)
UNIT I
Introduction to Python: Features of Python, Data types, Operators, Input and output, Control Statements. Strings: Creating strings and basic operations on strings, string testing methods. Lists, Dictionaries, Tuples.
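As an illustration of the Unit I topics, the short sketch below touches basic string operations, string testing methods, and the list, tuple, and dictionary types. All variable names and values are made up for the example and are not part of the prescribed material.

```python
# Illustrative values only; none of these names come from the syllabus.
course = "data science"

# Basic string operations
title = course.title()        # capitalize each word
words = course.split()        # split on whitespace into a list

# String testing methods return booleans
is_alpha = course.isalpha()   # False: the space is not an alphabetic character
is_lower = course.islower()   # True: every cased character is lowercase
is_digit = "2024".isdigit()   # True: every character is a digit

# List (mutable sequence)
units = ["Python", "NumPy"]
units.append("pandas")

# Tuple (immutable sequence)
point = (3, 4)

# Dictionary (key-value mapping)
credits = {"theory": 3, "lab": 1}
total_credits = sum(credits.values())

print(title, words, units, point, total_credits)
```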
UNIT II
What is Data Science? Data Science life cycle, Datafication, Exploratory Data Analysis, The Data Science process, A data scientist's role in this process.
UNIT III
NumPy Basics: The NumPy ndarray: A Multidimensional Array Object, Creating ndarrays, Data Types for ndarrays, Basic Indexing and Slicing, Boolean Indexing, Fancy Indexing, Expressing Conditional Logic as Array Operations, Methods for Boolean Arrays, Sorting, Unique.
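The Unit III topics can be previewed with a small sketch such as the one below. It assumes NumPy is installed and uses a made-up array of scores; the variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

# Made-up sample data: two students, three tests each
scores = np.array([[72, 85, 90],
                   [60, 95, 78]])

# Data types for ndarrays: convert the integer array to float64
as_float = scores.astype(np.float64)

# Basic indexing and slicing
first_row = scores[0]          # first row
last_two_cols = scores[:, 1:]  # all rows, columns 1 onward

# Boolean indexing: keep elements where the mask is True
high = scores[scores > 80]

# Fancy indexing: pick elements (1, 2) and (0, 0)
picked = scores[[1, 0], [2, 0]]

# Conditional logic as an array operation
graded = np.where(scores >= 75, 1, 0)

# Methods for boolean arrays
any_low = (scores < 65).any()
all_pass = (scores >= 60).all()

# Sorting each row, and finding unique values
sorted_rows = np.sort(scores, axis=1)
distinct = np.unique(np.array([3, 1, 3, 2, 1]))

print(high, picked, distinct)
```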
UNIT IV
Getting Started with pandas: Introduction to pandas, Library Architecture, Features, Applications, Data Structures: Series, DataFrame, Index Objects; Essential Functionality: Reindexing, Dropping entries from an axis, Indexing, selection, and filtering.
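A minimal sketch of the Unit IV essentials follows, assuming pandas is installed. The Series, DataFrame, labels, and values are invented for the example.

```python
import pandas as pd

# Series: a one-dimensional labeled array
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# DataFrame: a two-dimensional labeled table (made-up data)
df = pd.DataFrame(
    {"name": ["Asha", "Ben", "Chen"], "marks": [82, 67, 91]},
    index=["r1", "r2", "r3"],
)

# Reindexing: reorder labels; the new label "d" gets a missing value (NaN)
reindexed = s.reindex(["c", "b", "a", "d"])

# Dropping entries from an axis
dropped_row = df.drop("r2")             # drop a row by label
dropped_col = df.drop(columns="marks")  # drop a column by name

# Indexing and selection
one_cell = df.loc["r1", "marks"]  # label-based selection
first_two = df.iloc[0:2]          # position-based selection

# Filtering with a boolean condition
toppers = df[df["marks"] > 80]

print(reindexed)
print(toppers)
```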
UNIT V
Data Preprocessing: Data Loading, Storage, and File Formats - Reading and Writing data in text format, binary data formats, interacting with HTML and web APIs, interacting with databases; Data Wrangling: Clean, Transform, Merge, Reshape - Combining and Merging Data Sets, Reshaping and Pivoting, Data Transformation; String Manipulation; Data Aggregation.
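The Unit V workflow of loading, merging, reshaping, string manipulation, and aggregation might look like the sketch below. It assumes pandas is installed and reads CSV text from an in-memory buffer so that it runs without any external file; the data and column names are hypothetical.

```python
import io

import pandas as pd

# Reading data in text format (an in-memory CSV stands in for a file on disk)
csv_text = (
    "student,subject,marks\n"
    "Asha,Math,82\n"
    "Asha,Physics,75\n"
    "Ben,Math,67\n"
    "Ben,Physics,88\n"
)
marks = pd.read_csv(io.StringIO(csv_text))

# Combining and merging data sets on a shared key column
sections = pd.DataFrame({"student": ["Asha", "Ben"], "section": ["A", "B"]})
merged = pd.merge(marks, sections, on="student", how="left")

# Reshaping and pivoting: one row per student, one column per subject
wide = marks.pivot(index="student", columns="subject", values="marks")

# String manipulation through the vectorized .str accessor
subject_codes = marks["subject"].str.upper().str[:3]

# Data aggregation: mean marks per student
mean_marks = marks.groupby("student")["marks"].mean()

# Writing data back out in text format
buffer = io.StringIO()
marks.to_csv(buffer, index=False)

print(wide)
print(mean_marks)
```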
TEXTBOOKS:
1. Wes McKinney, “Python for Data Analysis”, O’Reilly, ISBN: 978-1-449-31979-3, 1st edition, October 2012.
2. Rachel Schutt & Cathy O’Neil, “Doing Data Science”, O’Reilly, ISBN: 978-1-449-35865-5, 1st edition, October 2013.
3. Wes McKinney, “Python for Data Analysis”, O’Reilly.
REFERENCE BOOKS:
1. Python: The Complete Reference, Martin C. Brown, McGraw Hill Education
2. Joel Grus, “Data Science from Scratch: First Principles with Python”, O’Reilly Media, 2015
3. Matt Harrison, “Learning the Pandas Library: Python Tools for Data Munging, Analysis, and Visualization”, O’Reilly, 2016.