Python for Data Science
Course Curriculum
Unit 1: Preliminaries for Data Analysis
Your first assignment covers the basic preliminaries of data analysis. You'll get familiar with the main ways of classifying data: Discrete vs Continuous Data, Scales of Measurement, Quantitative vs Qualitative Data, Categorical vs Numerical Data, and Structured vs Unstructured Data. These concepts are essential for moving forward in your journey of Python for Data Science.
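As a rough illustration of how these categories show up in practice, here is a minimal sketch using Pandas. The column names and values are made up for this example; they are not part of the assignment.

```python
import pandas as pd

# Hypothetical survey data, invented for illustration only.
df = pd.DataFrame({
    "city": ["Pune", "Delhi", "Pune"],        # qualitative, nominal scale
    "shirt_size": ["M", "S", "L"],            # qualitative, ordinal scale
    "children": [2, 0, 1],                    # quantitative, discrete
    "height_cm": [170.2, 158.9, 181.5],       # quantitative, continuous
})

# Nominal data: categories with no natural order.
df["city"] = df["city"].astype("category")

# Ordinal data: categories with a meaningful order (S < M < L).
df["shirt_size"] = pd.Categorical(
    df["shirt_size"], categories=["S", "M", "L"], ordered=True
)

# Split the columns into numerical and categorical groups by dtype.
numeric_cols = df.select_dtypes(include="number").columns.tolist()
categorical_cols = df.select_dtypes(include="category").columns.tolist()

print(numeric_cols)            # ['children', 'height_cm']
print(categorical_cols)        # ['city', 'shirt_size']
print(df["shirt_size"].max())  # L (only meaningful because the scale is ordered)
```

An ordered categorical supports comparisons such as max(), while a nominal one does not, which is one practical reason the scale of measurement matters.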
Unit 2: Python Fundamentals
Your second assignment covers several Python fundamentals. Python is essential for any learner who wants to use machine learning to power the solutions in this course.
This assignment is designed for beginners and covers Python variables, numeric and logical operators, control-flow statements (if, while, and for), functions, strings and their operations, and lists and list comprehensions. Reference videos accompany each topic, and every concept discussed here can be learned by solving challenges.
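To give a feel for these topics, the following short sketch touches a function, a for loop with an if statement, a list comprehension, a while loop, and a string method. The names used here (is_even, numbers, and so on) are illustrative and are not taken from the assignment itself.

```python
def is_even(n):
    """Return True when n is divisible by 2."""
    return n % 2 == 0


numbers = [1, 2, 3, 4, 5, 6]

# for loop with an if statement and a logical operator
evens_loop = []
for n in numbers:
    if is_even(n) and n > 2:
        evens_loop.append(n)

# The same filter written as a list comprehension
evens_comp = [n for n in numbers if is_even(n) and n > 2]

# while loop that sums the list by index
total = 0
i = 0
while i < len(numbers):
    total += numbers[i]
    i += 1

# A string method: capitalise the first letter of each word
title = "python for data science".title()

print(evens_loop, evens_comp)  # [4, 6] [4, 6]
print(total)                   # 21
print(title)                   # Python For Data Science
```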
Unit 3: Data Structures
Data structures matter because they determine how data is organised and stored. This assignment covers Python's built-in data structures and how to use them: Lists and their operations, such as slicing, deleting, appending, and updating; List comprehensions; Sets and their operations, such as union, intersection, and difference; Tuples and their usage; and Dictionaries and their operations, such as adding and removing key-value pairs and iterating over items. Reference videos for every topic are provided alongside the hands-on problems.
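The operations named above can be sketched in a few lines. The values and variable names below are invented for illustration.

```python
# Lists: append, update, delete, slice
scores = [70, 85, 90, 60]
scores.append(75)       # [70, 85, 90, 60, 75]
scores[0] = 72          # [72, 85, 90, 60, 75]
del scores[3]           # [72, 85, 90, 75]
middle = scores[1:3]    # [85, 90]

# Sets: union, intersection, difference
a = {1, 2, 3}
b = {3, 4}
union = a | b           # {1, 2, 3, 4}
common = a & b          # {3}
only_a = a - b          # {1, 2}

# Tuples: immutable sequences that can be unpacked
point = (3, 4)
x, y = point

# Dictionaries: add and remove key-value pairs, iterate over items
ages = {"Asha": 31}
ages["Ravi"] = 28               # add a pair
removed = ages.pop("Asha")      # remove a pair and get its value back
names = [name for name, age in ages.items()]

print(scores, middle)           # [72, 85, 90, 75] [85, 90]
print(union, common, only_a)    # {1, 2, 3, 4} {3} {1, 2}
print(x, y, removed, names)     # 3 4 31 ['Ravi']
```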
Unit 4: NumPy
This assignment helps learners optimise their code with NumPy, which aims to provide an array object that is up to 50x faster than ordinary Python lists. It covers defining NumPy arrays of different dimensions; generating arrays with NumPy functions such as arange(), eye(), full(), diag(), and linspace(); defining arrays with random values; array indexing and slicing; reshaping arrays to different dimensions; the difference between a NumPy copy and a view; bonus operations such as hstack() and vstack(); updating arrays with insert(), delete(), and append(); and array searching and mathematical operations. It also demonstrates how arrays are faster than lists in practice. Reference links for each topic are included to help you fully comprehend it.
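A minimal sketch of several of these operations follows. It assumes NumPy is installed and imported as np; np.full() is used here for the constant-filled array, and the array contents are chosen only for illustration.

```python
import numpy as np

# Array creation helpers
steps = np.arange(0, 10, 2)          # [0 2 4 6 8]
grid = np.linspace(0.0, 1.0, 5)      # 5 evenly spaced points from 0 to 1
identity = np.eye(3)                 # 3x3 identity matrix
sevens = np.full((2, 2), 7)          # 2x2 array filled with 7
diagonal = np.diag([1, 2, 3])        # 3x3 matrix with 1, 2, 3 on the diagonal

# Reshaping, and the difference between a view and a copy
matrix = np.arange(6).reshape(2, 3)  # [[0 1 2], [3 4 5]]
view = matrix[0]                     # basic indexing returns a view
copy = matrix[0].copy()              # an independent copy
view[0] = 99                         # also changes matrix[0, 0]
copy[1] = -1                         # leaves matrix untouched

# Stacking
stacked_h = np.hstack([matrix, matrix])  # shape (2, 6)
stacked_v = np.vstack([matrix, matrix])  # shape (4, 3)

# insert, append, and delete return new arrays rather than editing in place
values = np.array([10, 20, 30])
values = np.insert(values, 1, 15)    # [10 15 20 30]
values = np.append(values, 40)       # [10 15 20 30 40]
values = np.delete(values, 0)        # [15 20 30 40]

# Searching and element-wise maths
positions = np.where(values > 20)[0] # indices of elements greater than 20
doubled = values * 2                 # [30 40 60 80]

print(matrix)
print(values, positions, doubled)
```

Note that insert, append, and delete each allocate a new array, so calling them repeatedly in a loop is usually slower than building the array once.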
Unit 5: Pandas
Pandas is a data science library that provides functions for analysing, cleaning, exploring, and manipulating data. This assignment covers Pandas Series and their operations, such as sorting, appending, and indexing, as well as Pandas DataFrames and their operations, such as accessing existing rows and columns and adding new ones. It also covers converting Series to DataFrames, concatenating DataFrames, accessing DataFrame elements via conditions, DataFrame indexes, loc and iloc, reading CSV files, merging, groupby, and apply. Reference videos for all of the topics are included to help with conceptual clarity.
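The sketch below walks through a handful of these operations on a tiny, made-up sales table. The CSV text is read from an in-memory string so the example is self-contained; in the assignment you would more likely pass a file path to read_csv. All column names and values are assumptions for illustration.

```python
import io

import pandas as pd

# Series: sorting and conversion to a DataFrame
s = pd.Series([3, 1, 2], index=["c", "a", "b"])
sorted_s = s.sort_values()
frame = s.to_frame(name="value")

# Reading CSV data (from a string here, instead of a file on disk)
csv_text = "store,units\nA,10\nB,5\nA,7\nC,3\n"
sales = pd.read_csv(io.StringIO(csv_text))

# Adding a column, filtering by a condition, and loc / iloc access
sales["revenue"] = sales["units"] * 2.5
big = sales[sales["units"] > 5]
first_store = sales.loc[0, "store"]    # label-based access
last_units = sales.iloc[-1, 1]         # position-based access

# Concatenating DataFrames
extra = pd.DataFrame({"store": ["D"], "units": [4], "revenue": [10.0]})
combined = pd.concat([sales, extra], ignore_index=True)

# Merging with a lookup table, then groupby and apply
regions = pd.DataFrame(
    {"store": ["A", "B", "C"], "region": ["North", "South", "North"]}
)
merged = sales.merge(regions, on="store", how="left")
units_by_region = merged.groupby("region")["units"].sum()
merged["label"] = merged["units"].apply(lambda u: "high" if u > 5 else "low")

print(units_by_region)
print(merged)
```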
Unit 6: Data Cleaning
Data cleaning is extremely important in data management, analytics, and machine learning. This assignment gives you hands-on experience in dealing with messy data. You'll learn how to deal with data columns that are inconsistent or irrelevant; handle missing values by removing empty records or by imputing them with techniques such as forward fill, backward fill, mean imputation, constant imputation, interpolation, and KNN; make shallow and deep copies of a DataFrame in Pandas; work with iterrows and itertuples; rename columns with meaningful labels; deal with duplicate values; and deal with constant (low-variance) columns. You'll also apply regular expressions to textual data to experiment with various patterns. Reference links are provided on each topic for additional conceptual clarity.
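Several of these cleaning steps can be sketched with Pandas alone. The data below is invented for illustration, and KNN imputation is left out because it typically relies on an additional library such as scikit-learn.

```python
import numpy as np
import pandas as pd

# A column with gaps, used to compare imputation strategies
temps = pd.Series([20.0, np.nan, 24.0, np.nan])

forward = temps.ffill()                     # carry the last seen value forward
backward = temps.bfill()                    # pull the next seen value backward
mean_filled = temps.fillna(temps.mean())    # mean of 20 and 24 is 22
constant = temps.fillna(0.0)                # a fixed placeholder value
interpolated = temps.interpolate()          # linear interpolation between neighbours
dropped = temps.dropna()                    # or simply remove the empty records

# A small table with unclear names, a duplicate row, and a constant column
records = pd.DataFrame({
    "Col1": ["a", "b", "b"],
    "Col2": [1, 2, 2],
    "src": ["x", "x", "x"],
})

# Deep copy: changes to the backup do not affect the original
backup = records.copy(deep=True)
backup.loc[0, "Col2"] = 999

# Rename columns with meaningful labels, then drop duplicate rows
renamed = records.rename(columns={"Col1": "item", "Col2": "qty"})
deduped = renamed.drop_duplicates()

# Drop columns that hold a single value and so carry no information
constant_cols = [c for c in deduped.columns if deduped[c].nunique() <= 1]
trimmed = deduped.drop(columns=constant_cols)

# Iterate over rows as named tuples
pairs = [(row.item, row.qty) for row in trimmed.itertuples(index=False)]

print(mean_filled.tolist())   # [20.0, 22.0, 24.0, 22.0]
print(constant_cols)          # ['src']
print(pairs)
```

Which imputation strategy is appropriate depends on the data: forward fill suits ordered readings such as time series, while mean imputation ignores ordering entirely.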
Unit 7: Regular Expression
Regular Expressions, often known as regex or regexp, are highly effective for searching and manipulating text strings, especially in text data processing. A single line of regex can often replace several lines of programming code. In this assignment, you will solve regex problems ranging from easy to difficult, such as matching digit and non-digit characters, detecting HTML tags in text, validating IP addresses, detecting email addresses, detecting domain names, matching whitespace and non-whitespace characters, and substring problems. For any assistance, I've provided links to reference videos and documents. Regex is an important tool to know because it is extensively used in projects involving text validation, natural language processing, and text mining.
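Here is a small sketch of the kinds of patterns involved, using Python's built-in re module. The patterns are deliberately simplified for teaching (for example, the email pattern does not cover every valid address), and the sample text is made up.

```python
import re

# Simplified patterns for illustration; real-world validation is stricter.
DIGITS = re.compile(r"\d+")
HTML_TAG = re.compile(r"<\s*/?\s*([a-zA-Z][a-zA-Z0-9]*)[^>]*>")
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"  # 0 to 255
IPV4 = re.compile(rf"{OCTET}(?:\.{OCTET}){{3}}")

text = "Order 66 shipped to <b>jane.doe@example.com</b> in 3 days"

numbers = DIGITS.findall(text)       # runs of digit characters
tags = HTML_TAG.findall(text)        # tag names of opening and closing tags
emails = EMAIL.findall(text)         # email-like substrings
words = re.split(r"\s+", "a  b\tc")  # split on any run of whitespace


def is_ipv4(candidate):
    """Return True when the whole string is a dotted IPv4 address."""
    return IPV4.fullmatch(candidate) is not None


print(numbers)                # ['66', '3']
print(tags)                   # ['b', 'b']
print(emails)                 # ['jane.doe@example.com']
print(words)                  # ['a', 'b', 'c']
print(is_ipv4("192.168.1.1")) # True
print(is_ipv4("256.1.1.1"))   # False
```

Using fullmatch() rather than search() matters for validation: it requires the entire string to fit the pattern instead of any substring.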
Unit 8: EDA
Exploratory Data Analysis (EDA) is a technique for visualising, summarising, and analysing data stored in rows and columns. In this assignment, you will use the data cleaning techniques you learned in an earlier assignment, as well as methods to extract basic statistical information from data, detect outliers that distort the data, and remove them using techniques like IQR and Z-score to obtain a more consistent dataset. You will use Univariate plots such as box plots, bar plots, count plots, histograms, and density plots; Bivariate plots such as scatter plots, line plots, box plots with respect to a third variable, and joint distribution plots; and Multivariate plots such as pair plots, multivariate scatter plots, parallel coordinates, and heatmaps.
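The two outlier-removal techniques mentioned above can be sketched with NumPy as follows. The sample values, the 1.5 multiplier for the IQR fences, and the Z-score cutoff of 2 are assumptions chosen for this tiny example; a cutoff of 3 is more common on larger datasets. Plotting is omitted here.

```python
import numpy as np

# Made-up measurements with one obvious outlier (95)
data = np.array([10, 12, 11, 13, 12, 11, 95], dtype=float)

# IQR method: keep points within 1.5 * IQR of the quartiles
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
iqr_kept = data[(data >= lower) & (data <= upper)]

# Z-score method: keep points within `cutoff` standard deviations of the mean
cutoff = 2.0  # assumed threshold for this small sample
z_scores = (data - data.mean()) / data.std()
z_kept = data[np.abs(z_scores) <= cutoff]

print(q1, q3, iqr)   # 11.0 12.5 1.5
print(iqr_kept)      # the six values other than 95
print(z_kept)        # the six values other than 95
```

With only seven points, no Z-score can exceed about 2.45, so a cutoff of 3 would never flag anything here. That is one reason the IQR method is often preferred for small or skewed samples.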
Unit 9: Interview QnAs
This module consists of 50 Interview Questions and Answers. These questions were asked in interviews at different companies. Each question and answer is provided along with the companies in which it was asked.