Sentiment Analysis (Epsilon)

Approach and Models

Approach

We tried three different sentiment analysis models on our prepared data:

Our own custom model
spaCy
RoBERTa

Below you'll find the code and an analysis of each model, along with the results:

1. Custom Model (Random Forest with TF-IDF encoding):

Code:

Explanation:

Text Encoding: TF-IDF vectorization is used to convert text data into numerical features.
Model: Random Forest Classifier is employed for sentiment classification.
Results: The model achieved a training accuracy of approximately 89.98% and testing accuracy of 89.67%.
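The steps above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the project's actual code: the toy reviews, the labels, the split ratio, and the hyperparameters are all assumptions made for the example, so the accuracies it prints will not match the figures reported above.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical toy reviews standing in for the prepared data.
texts = [
    "Great course, I learned a lot",
    "The instructor explained everything clearly",
    "Excellent material and helpful exercises",
    "I really enjoyed the lectures",
    "Terrible course, a waste of time",
    "The videos were boring and confusing",
    "Poor explanations and bad audio",
    "I did not like the assignments",
]
labels = ["positive"] * 4 + ["negative"] * 4

# Hold out part of the data to estimate testing accuracy.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=42, stratify=labels
)

# TF-IDF turns each text into a sparse numeric vector;
# the Random Forest is then trained on those vectors.
model = make_pipeline(
    TfidfVectorizer(),
    RandomForestClassifier(n_estimators=100, random_state=42),
)
model.fit(X_train, y_train)

train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))
print(f"Training accuracy: {train_acc:.2%}")
print(f"Testing accuracy: {test_acc:.2%}")

# Predict the sentiment of new, unseen text.
new_predictions = model.predict(["I loved this course", "This was boring"])
print(list(new_predictions))
```

Wrapping the vectorizer and the classifier in one pipeline means new text gets exactly the same TF-IDF transformation that was fitted on the training data.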

2. spaCy Model (Word Embeddings):

Code:

Explanation:

Text Encoding: Word embeddings from spaCy are used to represent each document.
Model: Logistic Regression Classifier is employed for sentiment classification.
Results: Training and testing accuracy were computed for this model, but the specific values are not reported here.
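A rough sketch of this setup is shown below, under several assumptions: the helper names (embed_documents, fit_classifier, run_demo) are hypothetical, the toy reviews are made up, and en_core_web_md is assumed as the spaCy pipeline because it ships with static word vectors. The original may have used a different pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def embed_documents(nlp, texts):
    """Return one row per text, using spaCy's document vector
    (the average of the token vectors when static vectors are available)."""
    return np.vstack([doc.vector for doc in nlp.pipe(texts)])


def fit_classifier(vectors, labels):
    """Train a Logistic Regression classifier on the document vectors."""
    clf = LogisticRegression(max_iter=1000)
    clf.fit(vectors, labels)
    return clf


def run_demo():
    # Imported here so the helpers above stay usable without spaCy installed.
    import spacy

    # Assumption: a pipeline with word vectors, installed via
    #   python -m spacy download en_core_web_md
    nlp = spacy.load("en_core_web_md")

    # Hypothetical toy reviews standing in for the prepared data.
    train_texts = [
        "Great course, I learned a lot",
        "Excellent material and helpful exercises",
        "Terrible course, a waste of time",
        "The videos were boring and confusing",
    ]
    train_labels = ["positive", "positive", "negative", "negative"]
    clf = fit_classifier(embed_documents(nlp, train_texts), train_labels)

    new_texts = ["I loved the lectures", "The audio quality was awful"]
    predictions = clf.predict(embed_documents(nlp, new_texts))
    return list(zip(new_texts, predictions))
```

Calling run_demo() returns each new text paired with its predicted label. Training and testing accuracy could be measured the same way as for the custom model, by holding out part of the data and scoring predictions on it.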

3. RoBERTa Model:

Code:

Explanation:

Text Encoding: A pre-trained RoBERTa model for sequence classification is used, together with its tokenizer.
Model: The model assigns polarity scores for negative, neutral, and positive sentiments.
Results: The sentiment scores are calculated and merged with the original DataFrame.
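One way this scoring step might look is sketched below with the Hugging Face transformers library. The checkpoint name is an assumption (the original does not say which RoBERTa model was used), and the helper names, the roberta_ column prefix, and the comment column are hypothetical.

```python
import numpy as np
import pandas as pd

# Assumption: a publicly available RoBERTa sentiment checkpoint whose three
# outputs are ordered negative, neutral, positive.
MODEL_NAME = "cardiffnlp/twitter-roberta-base-sentiment"
LABELS = ["negative", "neutral", "positive"]


def softmax(logits):
    """Convert raw model outputs into scores that sum to 1."""
    exp = np.exp(logits - np.max(logits))
    return exp / exp.sum()


def load_scorer(model_name=MODEL_NAME):
    """Return a function mapping a text to its three polarity scores."""
    # Imported here so the helpers stay usable without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)

    def score(text):
        encoded = tokenizer(
            text, return_tensors="pt", truncation=True, max_length=512
        )
        logits = model(**encoded).logits[0].detach().numpy()
        return dict(zip(LABELS, softmax(logits)))

    return score


def add_sentiment_scores(df, text_column, score):
    """Score every row and attach the scores to the original DataFrame."""
    scores = pd.DataFrame([score(text) for text in df[text_column]], index=df.index)
    return df.join(scores.add_prefix("roberta_"))
```

With a DataFrame df that has a comment column, add_sentiment_scores(df, "comment", load_scorer()) would return the original columns plus roberta_negative, roberta_neutral, and roberta_positive.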

Summary of Results:

Custom Model: Achieved a training accuracy of 89.98% and testing accuracy of 89.67%.
spaCy Model: Specific accuracy values are not provided.
RoBERTa Model: Generates polarity scores for negative, neutral, and positive sentiments.

Success Rates and Accuracy:

Custom Model: High accuracy on both the training set (89.98%) and the testing set (89.67%). The gap of about 0.3 percentage points suggests good generalization.
spaCy Model: No specific accuracy values are provided, which makes its performance difficult to assess.
RoBERTa Model: No accuracy figure is reported. The output is a polarity score for each sentiment category rather than a single measure of success.

Usage Example:

Custom Model: Demonstrated sentiment predictions for new text inputs.
spaCy Model: Presented predictions for new text inputs using spaCy embeddings.
RoBERTa Model: Applied the model to the entire dataset, producing sentiment scores for each comment.

Final Verdict:

Together, these models cover a range of approaches to sentiment analysis: traditional TF-IDF features, word embeddings, and a state-of-the-art pre-trained model, RoBERTa. The documentation above covers their reported accuracies, strengths, and applications in the context of MOOC learner reviews.
