Approach and Models
We tried three different sentiment analysis models on our prepared data:
Our own custom model
spaCy
RoBERTa
Below you'll find the code, an explanation, and the results for each of them:
Custom Model
Code:
Explanation:
Text Encoding: TF-IDF vectorization is used to convert text data into numerical features.
Model: Random Forest Classifier is employed for sentiment classification.
Results: The model achieved a training accuracy of approximately 89.98% and testing accuracy of 89.67%.
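A minimal sketch of a pipeline of this kind, assuming scikit-learn. The function name train_tfidf_forest, the toy reviews, and the hyperparameters (100 trees, an 80/20 split, a fixed seed) are assumptions for illustration only; they are not the settings behind the accuracies reported above.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline


def train_tfidf_forest(texts, labels, test_size=0.2, seed=42):
    """Fit TF-IDF + Random Forest and return (model, train_acc, test_acc)."""
    # Hold out part of the data so testing accuracy is measured on unseen text.
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, random_state=seed, stratify=labels
    )
    # The pipeline fits the vectorizer on the training split only, so no
    # vocabulary or IDF statistics leak in from the test split.
    model = make_pipeline(
        TfidfVectorizer(),
        RandomForestClassifier(n_estimators=100, random_state=seed),
    )
    model.fit(X_train, y_train)
    train_acc = accuracy_score(y_train, model.predict(X_train))
    test_acc = accuracy_score(y_test, model.predict(X_test))
    return model, train_acc, test_acc


# Toy data (assumed); not the MOOC review dataset.
toy_texts = [
    "great course, very clear lectures",
    "loved the instructor and the assignments",
    "excellent content and pacing",
    "really helpful and well organized",
    "best course I have taken",
    "terrible audio and boring lectures",
    "waste of time, very confusing",
    "poorly organized and outdated content",
    "the quizzes were broken and frustrating",
    "worst course I have taken",
]
toy_labels = ["positive"] * 5 + ["negative"] * 5

model, train_acc, test_acc = train_tfidf_forest(toy_texts, toy_labels)
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
print(model.predict(["clear and helpful lectures"]))
```

Wrapping the vectorizer and classifier in one pipeline also means new text can be passed to model.predict directly, without a separate transform step.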
spaCy Model
Code:
Explanation:
Text Encoding: Word embeddings from spaCy are used to represent each document.
Model: Logistic Regression Classifier is employed for sentiment classification.
Results: Training and testing accuracy were computed for this model, but the specific values are not reported.
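The embedding-plus-classifier approach described above could be sketched roughly as follows. This is an illustration under assumptions, not the code that was run: the pipeline name en_core_web_md, the helper names, and max_iter=1000 are all hypothetical choices. It relies on spaCy's doc.vector, which by default is the average of the token vectors in a document.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def load_pipeline(name="en_core_web_md"):
    """Load a spaCy pipeline that ships with word vectors (assumed name)."""
    import spacy  # imported lazily so the helpers below do not require spaCy

    return spacy.load(name)


def embed(texts, nlp):
    """Return one row per document, using spaCy's document vector."""
    return np.vstack([nlp(text).vector for text in texts])


def train_embedding_classifier(texts, labels, nlp):
    """Fit Logistic Regression on document embeddings."""
    features = embed(texts, nlp)
    clf = LogisticRegression(max_iter=1000)
    clf.fit(features, labels)
    return clf
```

A typical use would be nlp = load_pipeline(), then clf = train_embedding_classifier(train_texts, train_labels, nlp), and finally clf.predict(embed(new_texts, nlp)) for new inputs, where train_texts, train_labels, and new_texts are placeholders for your own data.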
RoBERTa Model
Code:
Explanation:
Text Encoding: Uses a pre-trained RoBERTa model for sequence classification.
Model: The model assigns polarity scores for negative, neutral, and positive sentiments.
Results: The sentiment scores are calculated and merged with the original DataFrame.
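To make the scoring and merge steps concrete, here is a small sketch assuming NumPy and pandas. It starts from raw logits, which in a real run would come from a RoBERTa sequence-classification model (for example, one loaded through the Hugging Face transformers library); the checkpoint is not named here, so the logits below are made up. The column names roberta_neg, roberta_neu, roberta_pos and the Id key are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical column names for the three polarity scores.
LABELS = ("roberta_neg", "roberta_neu", "roberta_pos")


def polarity_scores(logits):
    """Convert three raw logits into probabilities that sum to 1 (softmax)."""
    logits = np.asarray(logits, dtype=float)
    exp = np.exp(logits - logits.max())  # subtract the max for numerical stability
    probs = exp / exp.sum()
    return dict(zip(LABELS, probs))


def attach_scores(df, logits_by_id, id_col="Id"):
    """Merge per-comment polarity scores back onto the original DataFrame."""
    rows = [
        {id_col: key, **polarity_scores(logits)}
        for key, logits in logits_by_id.items()
    ]
    scores = pd.DataFrame(rows)
    # A left merge keeps every original comment, even one without scores.
    return df.merge(scores, on=id_col, how="left")


# Toy comments and made-up logits (assumed); not real model output.
comments = pd.DataFrame({"Id": [1, 2], "Text": ["great course", "boring lectures"]})
fake_logits = {1: [-2.0, 0.0, 3.0], 2: [3.0, 0.0, -2.0]}
print(attach_scores(comments, fake_logits))
```

Using a left merge is a design choice: comments that fail scoring (for example, text that is too long for the model) stay in the table with missing scores instead of being silently dropped.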
Custom Model: Achieved a training accuracy of 89.98% and a testing accuracy of 89.67%.
spaCy Model: Specific accuracy values are not provided.
RoBERTa Model: Generates polarity scores for negative, neutral, and positive sentiments.
Custom Model: Training and testing accuracy are both high and within about 0.3 percentage points of each other, which suggests good generalization.
spaCy Model: No specific accuracy values are provided, which makes its performance difficult to assess.
RoBERTa Model: No accuracy is reported; its output is assessed through the polarity scores generated for each sentiment category.
Custom Model: Demonstrated sentiment predictions for new text inputs.
spaCy Model: Presented predictions for new text inputs using spaCy embeddings.
RoBERTa Model: Applied to the entire dataset, producing sentiment scores for each comment.
Final Verdict:
Together, these models offer a diverse approach to sentiment analysis of MOOC learner reviews, combining traditional TF-IDF features, word embeddings, and a state-of-the-art pre-trained model, RoBERTa. Of the three, only the custom model has reported accuracy figures (89.98% training, 89.67% testing), so the models cannot yet be ranked on accuracy; the sections above describe each model's approach, results, and application.