Welcome to Foundation of Data Science Laboratory
Assignment 1
Introduction to Data Manipulation Libraries
Data Visualization with Matplotlib
Data Analysis Project:
Assignment 2: Practical Assignments on Data Acquisition and Cleaning
Objective:
To develop skills in data acquisition from various sources, web scraping, API integration, and
data cleaning and preprocessing.
Programs:
2.1 Data Importing from Various Sources
3. Importing Data from SQL Databases:
Set up a local SQLite database and create a table with some sample data.
Use Python to import data from the SQL table into a DataFrame.
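The two steps above can be sketched as follows. This is a minimal sketch, not a prescribed solution: it assumes pandas is installed, uses an in-memory SQLite database so nothing is written to disk, and the table name `students`, its columns, and the sample rows are all made up for illustration.

```python
import sqlite3

import pandas as pd

# In-memory database for illustration; pass a file path instead of
# ":memory:" to create a database that persists on disk.
conn = sqlite3.connect(":memory:")

# Create a sample table. The table name, columns, and rows are
# hypothetical placeholders, not part of the assignment text.
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, score REAL)"
)
conn.executemany(
    "INSERT INTO students (name, score) VALUES (?, ?)",
    [("Asha", 82.5), ("Ben", 74.0), ("Chen", 91.0)],
)
conn.commit()

# Run a query and load the result set directly into a DataFrame.
df = pd.read_sql_query(
    "SELECT id, name, score FROM students ORDER BY id", conn
)
conn.close()

print(df)
```

Selecting columns explicitly in the query, rather than using `SELECT *`, keeps the DataFrame layout stable if the table later gains columns.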
2.2 Data Scraping Techniques
1. Web Scraping with BeautifulSoup:
Identify a website with tabular data (e.g., a Wikipedia table).
Use BeautifulSoup to scrape the data and store it in a DataFrame.
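A minimal sketch of the scraping step, assuming the `beautifulsoup4` and pandas packages are installed. To stay runnable offline it parses a small inline HTML string instead of downloading a page; the table contents and the `wikitable` class lookup are illustrative assumptions, and a real page would first be fetched (for example with an HTTP client) and its text passed to `BeautifulSoup` in place of `html`.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Stand-in for the HTML of a downloaded page. The rows are made-up
# placeholder values, not real data.
html = """
<table class="wikitable">
  <tr><th>Country</th><th>Capital</th><th>Population (millions)</th></tr>
  <tr><td>Alpha</td><td>Aville</td><td>12.5</td></tr>
  <tr><td>Beta</td><td>Bton</td><td>3.2</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Locate the first table with the assumed CSS class.
table = soup.find("table", class_="wikitable")

# Collect the text of every header and data cell, row by row.
rows = []
for tr in table.find_all("tr"):
    cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
    if cells:
        rows.append(cells)

# First row holds the column names; the rest are data rows.
df = pd.DataFrame(rows[1:], columns=rows[0])

# Scraped cells are strings, so convert numeric columns explicitly.
df["Population (millions)"] = pd.to_numeric(df["Population (millions)"])

print(df)
```

Before scraping a live site, check its terms of use and robots.txt.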
2.3 Data Cleaning and Preprocessing
2. Handling Outliers:
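The outline does not say which outlier method is expected. As one possible illustration, the sketch below applies the common interquartile range (IQR) rule with the conventional 1.5 multiplier, then shows two ways to treat flagged values: dropping them or capping them at the fences. The sample values are made up, and pandas is assumed to be installed.

```python
import pandas as pd

# Made-up sample with one obviously extreme value (95).
values = pd.Series([10, 12, 11, 13, 12, 95, 11, 10, 12, 13], name="value")

# Quartiles and interquartile range.
q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1

# Fences: points beyond 1.5 * IQR from the quartiles count as outliers.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

is_outlier = (values < lower) | (values > upper)
outliers = values[is_outlier]

# Option 1: drop the outliers.
cleaned = values[~is_outlier]

# Option 2: cap (winsorize) values at the fences instead of dropping them.
capped = values.clip(lower=lower, upper=upper)

print("Fences:", lower, upper)
print("Outliers:", outliers.tolist())
```

Dropping suits clear data-entry errors; capping keeps the row count unchanged, which matters when other columns in the same rows are still valid.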
Assignment 3: Practical Assignments on Data Visualization and Exploratory Data Analysis (EDA)
Objective:
To develop skills in data visualization using libraries like Matplotlib and Seaborn and to perform Exploratory Data Analysis (EDA) using summary statistics, data distribution analysis, and correlation analysis.
Assignment 4: Practical Assignments on Statistical / Algorithmic Data Modeling
Objective:
To develop skills in statistical data modeling, hypothesis testing, classification and regression algorithms, model evaluation techniques, and hands-on exercises with the scikit-learn library.
4.1: Hypothesis Testing and Probability Distributions
4.2: Basics of Classification and Regression Algorithms
1. Classification Algorithm (Logistic Regression):
Implement a logistic regression model using scikit-learn to classify the Iris dataset.
2. Regression Algorithm (Linear Regression):
Implement a linear regression model using scikit-learn to predict house prices.
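Both tasks above can be sketched with scikit-learn as follows. The Iris part uses the bundled `load_iris` loader. The outline names no house-price dataset, so the regression part generates synthetic data from an assumed linear relationship between area, room count, and price; those features, coefficients, and the 70/30 split are illustrative choices only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import train_test_split

# --- Classification: logistic regression on the Iris dataset ---
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# max_iter is raised so the solver has room to converge on unscaled features.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
iris_accuracy = clf.score(X_test, y_test)
print(f"Iris test accuracy: {iris_accuracy:.3f}")

# --- Regression: linear regression on synthetic house prices ---
# Assumed data-generating process (hypothetical, for illustration only):
# price = 50000 + 1500 * area + 8000 * rooms + noise
rng = np.random.default_rng(0)
area = rng.uniform(50, 250, size=200)    # floor area
rooms = rng.integers(1, 7, size=200)     # number of rooms (1 to 6)
noise = rng.normal(0, 10_000, size=200)
price = 50_000 + 1_500 * area + 8_000 * rooms + noise

H = np.column_stack([area, rooms])
H_train, H_test, p_train, p_test = train_test_split(
    H, price, test_size=0.3, random_state=42
)

reg = LinearRegression()
reg.fit(H_train, p_train)
house_r2 = reg.score(H_test, p_test)     # R-squared on held-out data
print(f"House price test R^2: {house_r2:.3f}")
print("Learned coefficients (area, rooms):", reg.coef_)
```

Because the synthetic prices come from a known formula, the learned coefficients can be compared against the values used to generate the data, which is a quick check that the model is fitted correctly.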
4.3: Model Evaluation Techniques
1. Performance Metrics for Classification:
Evaluate a classification model using a confusion matrix, precision, recall, and F1-score.
2. Performance Metrics for Regression:
Evaluate a regression model using mean squared error, mean absolute error, and R-squared.
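The metrics listed above are all available in `sklearn.metrics`. The sketch below computes them on small hand-written label and value lists, which are made-up examples, so the results can be checked by hand. In practice the "true" values come from a held-out test set and the predictions from a fitted model.

```python
from sklearn.metrics import (
    confusion_matrix,
    f1_score,
    mean_absolute_error,
    mean_squared_error,
    precision_score,
    r2_score,
    recall_score,
)

# --- Classification metrics (binary labels, 1 = positive class) ---
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows are true classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
cm = confusion_matrix(y_true, y_pred)
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)                # harmonic mean of the two

print(cm)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")

# --- Regression metrics ---
actual = [3.0, 5.0, 2.5, 7.0]
predicted = [2.5, 5.0, 3.0, 8.0]

mse = mean_squared_error(actual, predicted)   # mean of squared errors
mae = mean_absolute_error(actual, predicted)  # mean of absolute errors
r2 = r2_score(actual, predicted)              # 1 - SS_res / SS_tot

print(f"MSE={mse:.3f} MAE={mae:.3f} R^2={r2:.3f}")
```

For multi-class problems such as Iris, `precision_score`, `recall_score`, and `f1_score` need an `average` argument (for example `average="macro"`), because the default applies to binary labels only.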
4.4: Hands-on Exercises with scikit-learn Library
1. Implement a Decision Tree Classifier:
Train and evaluate a Decision Tree Classifier on the Iris dataset.
2. Implement a Random Forest Regressor:
Train and evaluate a Random Forest Regressor on the Boston Housing dataset.
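A minimal sketch of both exercises. One substitution should be noted plainly: the `load_boston` loader was removed from scikit-learn in version 1.2, so this sketch swaps in the bundled diabetes dataset (`load_diabetes`) to stay runnable without a download. The same steps apply to the Boston Housing data if it is loaded from another source. The hyperparameters (`max_depth=3`, `n_estimators=200`) and the split sizes are illustrative assumptions.

```python
from sklearn.datasets import load_diabetes, load_iris
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import accuracy_score, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# --- Decision Tree Classifier on Iris ---
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# A shallow tree limits overfitting on this small dataset.
tree = DecisionTreeClassifier(max_depth=3, random_state=42)
tree.fit(X_train, y_train)
tree_accuracy = accuracy_score(y_test, tree.predict(X_test))
print(f"Decision tree test accuracy: {tree_accuracy:.3f}")

# --- Random Forest Regressor (diabetes data as a stand-in) ---
Xr, yr = load_diabetes(return_X_y=True)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(
    Xr, yr, test_size=0.2, random_state=42
)

forest = RandomForestRegressor(n_estimators=200, random_state=42)
forest.fit(Xr_train, yr_train)
yr_pred = forest.predict(Xr_test)

rf_mse = mean_squared_error(yr_test, yr_pred)
rf_r2 = r2_score(yr_test, yr_pred)
print(f"Random forest test MSE: {rf_mse:.1f}, R^2: {rf_r2:.3f}")

# Feature importances indicate which inputs the forest relied on most.
print("Feature importances:", forest.feature_importances_.round(3))
```

Fixing `random_state` in both the split and the models makes the reported scores reproducible from run to run.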