With the rapid growth of Massive Open Online Courses (MOOCs), the sheer volume of learner reviews makes it difficult to extract meaningful insights from them. This project is motivated by the need for robust data preprocessing that makes sentiment analysis efficient. Using machine learning and natural language processing, we aim to automatically categorize reviews as positive, negative, or neutral. Careful preprocessing streamlines the analysis and lets MOOC platforms draw actionable conclusions, improve course quality, and identify areas for enhancement in a systematic way.
The surge in MOOC learner reviews calls for efficient data preprocessing so that the data is ready for modeling; at this volume, manual analysis is impractical. This project focuses on the key preprocessing steps (cleaning, feature engineering, normalization, and transformation) that prepare the data for sentiment analysis models. The goal is to enable MOOC platforms to derive actionable insights, improve course quality, and identify areas for enhancement through systematic data preparation.
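To make the cleaning step concrete, here is a minimal sketch of what normalizing a single raw review could look like. It is an illustration only: the function name clean_review and the specific rules (dropping HTML tags and URLs, lowercasing, keeping only letters, digits, and apostrophes) are assumptions, not the project's actual pipeline.

```python
import html
import re
import unicodedata


def clean_review(text: str) -> str:
    """Return a lowercased review with markup, URLs, and stray symbols removed.

    Hypothetical helper: the cleaning rules below are illustrative assumptions.
    """
    text = html.unescape(text)                    # decode entities such as "&amp;"
    text = re.sub(r"<[^>]+>", " ", text)          # drop HTML tags
    text = re.sub(r"https?://\S+", " ", text)     # drop URLs
    text = unicodedata.normalize("NFKC", text).lower()
    text = re.sub(r"[^a-z0-9'\s]", " ", text)     # keep letters, digits, apostrophes
    return re.sub(r"\s+", " ", text).strip()      # collapse repeated whitespace


if __name__ == "__main__":
    print(clean_review("Great course!<br> Visit https://example.com &amp; enjoy"))
```

Only the standard library is used here so that the sketch runs anywhere; a real pipeline would likely add tokenization and stop-word handling on top of this.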
This project followed the iterative cycle taught throughout the semester as the Data Preprocessing Methodology. Its key steps were:
Formulating the problem definition.
Cleaning the data.
Feature engineering and selection.
Data normalization and transformation.
Assessing the outcomes, then communicating and visualizing the results.
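The feature engineering and normalization steps listed above can be sketched as follows. This is a hedged illustration, not the project's implementation: the feature names (n_tokens, n_unique, n_exclaim) and the helper functions extract_features and min_max_normalize are hypothetical, and min-max scaling is just one possible normalization choice.

```python
from collections import Counter


def extract_features(review: str) -> dict[str, float]:
    """Derive a few simple numeric features from one review (illustrative only)."""
    tokens = review.lower().split()
    counts = Counter(tokens)
    return {
        "n_tokens": float(len(tokens)),       # review length in whitespace tokens
        "n_unique": float(len(counts)),       # vocabulary size of the review
        "n_exclaim": float(review.count("!")),
    }


def min_max_normalize(rows: list[dict[str, float]]) -> list[dict[str, float]]:
    """Rescale every feature column to [0, 1]; constant columns become 0.0."""
    if not rows:
        return []
    keys = list(rows[0].keys())
    lo = {k: min(r[k] for r in rows) for k in keys}
    hi = {k: max(r[k] for r in rows) for k in keys}
    return [
        {k: (r[k] - lo[k]) / (hi[k] - lo[k]) if hi[k] > lo[k] else 0.0 for k in keys}
        for r in rows
    ]


if __name__ == "__main__":
    reviews = ["Great course!", "Too long and too slow", "ok"]
    scaled = min_max_normalize([extract_features(r) for r in reviews])
    for review, row in zip(reviews, scaled):
        print(review, row)
```

Scaling each feature to a common range keeps features with large raw values, such as review length, from dominating models that are sensitive to feature magnitude.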
This website is organized in the same sequence:
Project Statement: Covers the motivation, the problem statement, and the approach we followed.
Data: Describes the data source and the preprocessing steps, including cleaning, transformation, and feature engineering.
EDA (Exploratory Data Analysis): Presents the analyses and assessments carried out on the extracted features.
Modeling: Explains the modeling approach, including the evaluation of various models, the steps taken to identify optimal parameters for each model, and the process leading to the final model. It also describes the prediction accuracy achieved by the evaluated models on the train, validation, and test datasets, along with the accuracy attained by the final model.
Related Resources: Contains links to view and download the Project Report and Jupyter Notebook for this project.
Conclusion: Summarizes findings from the approach and results, outlines potential next steps to improve prediction and model performance, and suggests alternative modeling approaches for consideration.
References: Provides a comprehensive list of references and literature used to gather understanding and knowledge for project execution.
Team: Furnishes information about the collaborative effort that brought this project to fruition, including guidance from our valuable project guide.
Acknowledgements: Expresses gratitude to our exceptional teaching team whose support was instrumental in making this project possible.