Python Programming
Probability and Statistics
Machine Learning
UNIT-I
Data Science in a Big Data World: Benefits and uses of data science and big data - Facets of data - The data science process - The big data ecosystem and data science. The Data Science Process: Overview of the data science process - Steps: Defining research goals and creating-- Retrieving data. [TB:1, CH:1]
UNIT-II
Handling Large Data on a Single Computer: The problems in handling large data - General techniques for handling large volumes of data - General programming tips for dealing with large data sets - Case studies. [TB:1, CH:2,4]
UNIT-III
Data Manipulation with Pandas: Introducing Pandas Objects - Data Indexing and Selection - Operating on Data in Pandas - Handling Missing Data - Hierarchical Indexing - Combining Datasets: Concat and Append - Combining Datasets: Merge and Join - Aggregation and Grouping. [TB:2, CH:3]
UNIT-IV
Data Manipulation with Pandas: Vectorized String Operations - Working with Time Series - High-Performance Pandas: eval() and query(). [TB:2, CH:3]
UNIT-V
Visualization with Matplotlib: Simple Line Plots - Simple Scatter Plots - Visualizing Errors - Density and Contour Plots - Histograms, Binnings, and Density - Customizing Plot Legends - Customizing Colorbars - Multiple Subplots - Text and Annotation - Customizing Ticks - Customizing Matplotlib: Configurations and Stylesheets - Three-Dimensional Plotting in Matplotlib - Geographic Data with Basemap. [TB:2, CH:4]
Text Books
1. Davy Cielen, Arno D. B. Meysman, Mohamed Ali, Introducing Data Science, Manning Publications, 2016.
2. Jake VanderPlas, Python Data Science Handbook, O’Reilly, 2017.
References
1. Cathy O’Neil, Rachel Schutt, Doing Data Science: Straight Talk from the Frontline, O’Reilly, 2013.
2. Jure Leskovec, Anand Rajaraman, Jeffrey Ullman, Mining of Massive Datasets, v2.1, Cambridge University Press, 2014.
3. Joel Grus, Data Science from Scratch: First Principles with Python, First Edition, O’Reilly, 2015.