This study uses behavioral data to predict whether a person is an introvert or an extrovert. The dataset includes 7 behavioral attributes. We used Python to analyze the data and applied different models to find patterns and make predictions.
We used three main methods:
Naive Bayes
Decision Tree
Random Forest
Source: Kaggle - Extrovert vs. Introvert Behavior Data by Rakesh Kapilavai
Data Size: 2,900 participants
To prepare the data for analysis:
Missing values were removed to ensure clean input for modeling.
Categorical responses (Yes/No) were encoded numerically (1/0).
Data was split into a training set (70%), used to fit the models, and a testing set (30%), used to evaluate their predictions.
Data summary after cleaning process:
Total Data: 2,477
Training Data: 1,733
Testing Data: 744
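The cleaning and splitting steps above can be sketched in pure Python. The column name below (Stage_fear) is a hypothetical stand-in for one of the 7 behavioral attributes, and the rows are synthetic, not taken from the Kaggle file:

```python
import random

# Synthetic stand-in for the survey data; real column names may differ.
base = [
    {"Stage_fear": "Yes", "Personality": "Introvert"},
    {"Stage_fear": "No", "Personality": "Extrovert"},
    {"Stage_fear": None, "Personality": "Introvert"},  # missing answer
]
raw = [dict(r) for r in base for _ in range(10)]  # 30 synthetic rows

# 1. Remove rows with missing values.
clean = [r for r in raw if None not in r.values()]

# 2. Encode Yes/No responses as 1/0.
encode = {"Yes": 1, "No": 0}
for r in clean:
    r["Stage_fear"] = encode[r["Stage_fear"]]

# 3. Split 70% training / 30% testing.
random.seed(0)
random.shuffle(clean)
cut = int(len(clean) * 0.7)
train, test = clean[:cut], clean[cut:]
print(len(clean), len(train), len(test))
```

In practice the same three steps are usually done with pandas (`dropna`, `map`) and scikit-learn's `train_test_split`; the sketch only shows the logic.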
Naive Bayes calculates the probability that a person is an introvert or an extrovert given their behavior, using Bayes' theorem.
Formula:
P(C | X) = P(X | C) · P(C) / P(X)
where C is the class (introvert or extrovert) and X is the set of observed behaviors. Under the "naive" assumption that behaviors are independent given the class, P(X | C) = P(x1 | C) · P(x2 | C) · … · P(x7 | C).
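As a toy illustration of Bayes' rule, consider a single binary behavior ("drained after socializing", 1 = Yes) with made-up counts, not the study's data:

```python
# Toy data: (feature value, class) pairs with invented frequencies.
data = [
    (1, "Introvert"), (1, "Introvert"), (1, "Introvert"), (0, "Introvert"),
    (0, "Extrovert"), (0, "Extrovert"), (1, "Extrovert"), (0, "Extrovert"),
]

def posterior(x, cls):
    """Unnormalized P(class | x) = P(x | class) * P(class)."""
    in_cls = [f for f, c in data if c == cls]
    prior = len(in_cls) / len(data)              # P(class)
    likelihood = in_cls.count(x) / len(in_cls)   # P(x | class)
    return likelihood * prior

# Normalize over both classes to get actual probabilities for x = 1.
scores = {c: posterior(1, c) for c in ("Introvert", "Extrovert")}
total = sum(scores.values())
probs = {c: s / total for c, s in scores.items()}
print(probs)  # {'Introvert': 0.75, 'Extrovert': 0.25}
```

With several behaviors, the likelihoods for each behavior are simply multiplied together, which is exactly the naive independence assumption.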
A decision tree chooses which feature to split on using Information Gain or the Gini Index: at each step it picks the question (behavior) whose answer best separates introverts from extroverts.
Formula Entropy (for Information Gain):
H(S) = − Σ p_i · log2(p_i)
where p_i is the proportion of class i in the set S.
Formula Information Gain:
IG(S, A) = H(S) − Σ_v (|S_v| / |S|) · H(S_v)
where S_v is the subset of S for which attribute A takes value v.
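A small worked example of entropy and information gain, using made-up labels (I = introvert, E = extrovert) and a hypothetical "stage fear" split:

```python
from math import log2

def entropy(labels):
    """H(S) = -sum of p_i * log2(p_i) over the classes present in S."""
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * log2(p) for p in probs)

# Parent node: 4 introverts and 4 extroverts (maximum entropy = 1.0).
labels = ["I", "I", "I", "I", "E", "E", "E", "E"]
yes_branch = ["I", "I", "I", "E"]   # stage fear = Yes
no_branch = ["I", "E", "E", "E"]    # stage fear = No

# IG = entropy before the split minus the weighted entropy after it.
gain = entropy(labels) - (
    len(yes_branch) / len(labels) * entropy(yes_branch)
    + len(no_branch) / len(labels) * entropy(no_branch)
)
print(round(gain, 3))
```

The tree would compute this gain for every candidate behavior and split on the one with the highest value.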
Random Forest improves accuracy by combining many decision trees, each trained on a random sample of the data; the final prediction is the majority vote of the trees.
Formula:
ŷ = mode(T_1(x), T_2(x), …, T_n(x))
where T_i(x) is the prediction of the i-th tree.
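The bootstrap-and-vote idea can be sketched with one-rule "stumps" standing in for full decision trees; the feature and labels below are invented for illustration:

```python
import random
from collections import Counter

# Feature: 1 = "drained after socializing"; labels are made up.
data = ([(1, "Introvert")] * 6 + [(0, "Extrovert")] * 6
        + [(1, "Extrovert"), (0, "Introvert")])  # two noisy rows

def train_stump(sample):
    """Tiny 'tree': predict the majority label per feature value."""
    by_value = {0: [], 1: []}
    for x, y in sample:
        by_value[x].append(y)
    rule = {v: Counter(ys).most_common(1)[0][0] if ys else "Introvert"
            for v, ys in by_value.items()}
    return lambda x: rule[x]

# Train each stump on a bootstrap sample (drawn with replacement).
random.seed(1)
forest = [train_stump(random.choices(data, k=len(data))) for _ in range(25)]

def predict(x):
    votes = Counter(tree(x) for tree in forest)  # majority vote = mode
    return votes.most_common(1)[0][0]

print(predict(1), predict(0))
```

Real implementations (e.g. scikit-learn's `RandomForestClassifier`) grow full trees and also randomize the features considered at each split, but the voting step is the same.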
A confusion matrix is a table used to evaluate the performance of a classification model (like Naive Bayes, Decision Tree, or Random Forest).
It compares the model’s predictions with the actual results.
Taking "extrovert" as the positive class:
True Positive (TP): Correctly predicted as extrovert
True Negative (TN): Correctly predicted as introvert
False Positive (FP): Predicted extrovert, but actually introvert
False Negative (FN): Predicted introvert, but actually extrovert
These counts, and the metrics derived from them, help us understand how well each model performs at predicting whether someone is an introvert or an extrovert.
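The standard metrics follow directly from the four confusion-matrix counts. The numbers below are illustrative only (chosen to sum to the 744 test cases), not the study's reported results:

```python
# Illustrative confusion-matrix counts, not actual results.
tp, tn, fp, fn = 320, 310, 60, 54  # sums to the 744 test cases

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)   # of predicted extroverts, how many were right
recall = tp / (tp + fn)      # of actual extroverts, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Reporting precision and recall alongside accuracy matters when one class is more common than the other, since a model can score high accuracy by mostly predicting the majority class.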