We chose User-Based Collaborative Filtering and Content-Based Filtering for our Spotify recommendation system project because together they provide a balanced and effective approach to personalized music recommendations. User-Based Collaborative Filtering uses users' listening history to find similar users and recommend songs those users enjoyed, so suggestions are personalized by behavior. Content-Based Filtering uses the attributes of songs, such as acousticness, danceability, and energy, to recommend tracks with similar features, which helps address the cold start problem for new users and makes recommendations easier to explain. Both methods are designed for recommendation tasks and handle user preferences and item characteristics more effectively than KNN and Decision Trees, which may struggle with scalability and complex feature interactions in this context.
User-Based Collaborative Filtering is a technique that predicts which items a target user will like from the ratings of other users with similar tastes.
Step 1: Determine which users are comparable to the intended user, U.
The Pearson correlation coefficient is used to measure the similarity between users. It measures the strength of the linear association between two variables and is computed as:
$$ r_{ab} = \frac{\sum_{i} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i} (X_i - \bar{X})^2} \, \sqrt{\sum_{i} (Y_i - \bar{Y})^2}} $$
where X_i and Y_i are the frequencies for users "a" and "b", respectively, and \bar{X} and \bar{Y} are the mean values of these frequencies.
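As a minimal sketch of this similarity step, the coefficient can be computed directly from its definition. The function name pearson and the two sample frequency lists are illustrative assumptions, not the project's code; returning 0.0 for a user whose frequencies are all identical is also an assumption made here to avoid dividing by zero.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Numerator: sum of products of deviations from the means.
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Denominator: square root of the product of summed squared deviations.
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    denom = math.sqrt(var_x * var_y)
    # Assumption: a constant vector has zero variance, so report 0.0.
    return cov / denom if denom else 0.0

# Hypothetical listening frequencies of users "a" and "b" for four artists.
user_a = [5, 3, 0, 1]
user_b = [4, 2, 1, 1]
print(round(pearson(user_a, user_b), 3))
```

Only artists that both users have listened to should be passed in, so that position i refers to the same artist in both lists.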
Step 2: Estimating the item's missing rating
The target user may differ greatly from some users while having a lot in common with others. Ratings for an item from users who are more similar to the target user should therefore carry more weight than ratings from users who are less similar. A weighted average handles this: each user's rating is multiplied by that user's similarity to the target user (computed in Step 1), and the sum of these products is divided by the sum of the similarity scores.
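The weighted average can be sketched in a few lines. The function name predict_rating and the sample (similarity, rating) pairs are hypothetical. Keeping only neighbors with positive similarity is an assumption made here so that the denominator cannot cancel to zero.

```python
def predict_rating(neighbors):
    """Estimate a missing rating from (similarity, rating) pairs.

    Assumption: neighbors with non-positive similarity are ignored.
    Returns None when no usable neighbor remains.
    """
    usable = [(sim, rating) for sim, rating in neighbors if sim > 0]
    total_sim = sum(sim for sim, _ in usable)
    if total_sim == 0:
        return None
    weighted_sum = sum(sim * rating for sim, rating in usable)
    return weighted_sum / total_sim

# Three hypothetical neighbors who rated the same item.
neighbors = [(0.9, 5), (0.5, 3), (0.1, 1)]
estimate = predict_rating(neighbors)
print(round(estimate, 2))
```

The most similar neighbor (similarity 0.9) pulls the estimate toward its rating of 5, while the weakly similar neighbor has little effect.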
Why Pearson Correlation?
Pearson correlation is invariant to shifting and positive scaling, i.e., adding any constant to all elements or multiplying all elements by a positive constant does not affect the correlation (multiplying by a negative constant flips its sign). For example, if you have two vectors X and Y, then pearson(X, Y) = pearson(X, 2×Y+3). This is important in recommendation systems because two users might rate items differently in absolute terms but still share similar tastes. A Pearson correlation of 1 indicates that two users' preferences rise and fall together perfectly, while a correlation of -1 indicates exactly opposite preferences.
The content-based approach relies on additional data about users and/or items. It uses both the item's attributes and the user's past behavior or explicit feedback to recommend further items.
We use the Spotify Dataset and Spotify Playlist data from Kaggle, namely data.csv, data_by_artist.csv, data_by_genre.csv, data_by_year.csv, data_w_genres.csv and spotify_dataset.csv.
In Python, we use content-based filtering and user-based collaborative filtering to build the Spotify Recommendation System.
User-Based Collaborative Filtering
We create a recommendation function named ColFilter. It filters the data by the input artists, selects a subset of users and calculates their similarity to the target user, and then picks the top similar users and generates recommendations. Put simply, ColFilter identifies users whose artist preferences resemble the target user's and recommends artists that these similar users have rated highly. This is the core principle of User-Based Collaborative Filtering.
Filtering by Input Artist:
It filters the df_freq dataframe (presumably containing user-artist frequency data) to include only artists present in the inputArtist dataframe, which contains the artists the user likes.
User Subset and Similarity Calculation:
First, it groups df_freq by user and sorts these groups in descending order by the number of artists they have in common with the input artists. It then selects the top 100 user groups for further processing. Finally, it iterates through these user groups and, for each one:
Sorts both the user group and input artist dataframes by artist ID for easier comparison.
Calculates the Pearson correlation coefficient between the user group's ratings and the input artist ratings.
Stores the user ID and correlation coefficient in a dictionary (pearsonCorDict).
Top Similar Users and Recommendation Generation:
It converts pearsonCorDict to a DataFrame (pearsonDF) with user IDs and similarity scores. It then selects the top 50 users with the highest similarity scores. It merges the top user IDs with the df_freq dataframe to obtain the ratings of these users for all artists. It then calculates a "weighted frequency" for each artist by multiplying the user similarity score by the user's individual rating for that artist. It groups the data by artist ID and calculates the sum of similarity scores and the sum of weighted frequencies.
It then creates a new dataframe (recommendation_df) containing the artist ID and the weighted average frequency score (considering user similarities). It sorts the recommendation dataframe by the weighted average frequency score (highest to lowest).
Finally, it retrieves the top 100 artists from the original df_artist dataframe based on the artist IDs in the sorted recommendation dataframe.
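The steps above can be sketched with pandas as follows. This is a reconstruction from the description, not the project's actual ColFilter: the name col_filter, the column names userId, artistId and freq, and the sample data are all assumptions. Two deliberate simplifications are also assumptions: the two frames are aligned by merging on artistId rather than sorting both by artist ID, and only users with positive similarity are kept so that the weights stay meaningful. The final lookup in df_artist is omitted.

```python
import pandas as pd

def col_filter(input_artist, df_freq, top_groups=100, top_users=50, top_n=100):
    # Filtering by input artist: keep rows for artists the target user likes.
    common = df_freq[df_freq["artistId"].isin(input_artist["artistId"])]

    # User subset: users sharing the most artists with the target come first.
    groups = sorted(common.groupby("userId"),
                    key=lambda g: len(g[1]), reverse=True)[:top_groups]

    # Similarity: Pearson correlation on the artists both have in common.
    pearson_cor = {}
    for user_id, group in groups:
        merged = group.merge(input_artist, on="artistId",
                             suffixes=("_user", "_target"))
        if len(merged) < 2:
            continue  # correlation is undefined for a single shared artist
        r = merged["freq_user"].corr(merged["freq_target"])
        if pd.notna(r):
            pearson_cor[user_id] = r

    pearson_df = pd.DataFrame(list(pearson_cor.items()),
                              columns=["userId", "similarity"])
    # Assumption: drop non-positive similarities before taking the top users.
    top = (pearson_df[pearson_df["similarity"] > 0]
           .sort_values("similarity", ascending=False)
           .head(top_users))

    # Weighted frequency = similarity * that user's frequency for the artist.
    rated = top.merge(df_freq, on="userId")
    rated["weighted"] = rated["similarity"] * rated["freq"]
    sums = rated.groupby("artistId")[["similarity", "weighted"]].sum()
    sums["score"] = sums["weighted"] / sums["similarity"]

    ranked = sums.sort_values("score", ascending=False).head(top_n)
    return ranked.reset_index()[["artistId", "score"]]

# Hypothetical data: three users and the target user's three liked artists.
df_freq = pd.DataFrame({
    "userId":   ["u1"] * 4 + ["u2"] * 4 + ["u3"] * 5,
    "artistId": ["a1", "a2", "a3", "a4",
                 "a1", "a2", "a3", "a5",
                 "a1", "a2", "a3", "a4", "a5"],
    "freq":     [9, 1, 5, 8,  1, 9, 5, 7,  4, 1, 2, 2, 3],
})
input_artist = pd.DataFrame({"artistId": ["a1", "a2", "a3"],
                             "freq": [10, 2, 6]})
recs = col_filter(input_artist, df_freq)
print(recs)
```

In this toy data, user u2 listens in the opposite pattern to the target user and is excluded by the positive-similarity filter. Note that, as in the description, artists the target user already likes are not removed from the result; a real system would probably filter them out before presenting recommendations.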
Content-Based Filtering
Example recommendation output based on song id (song_id) and the number of top recommendations to return (N)
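A minimal sketch of how such a lookup by song_id and N could work is given below. The source does not state which similarity measure the project uses, so cosine similarity over audio features is an assumption, as are the function names, the feature values and the song IDs.

```python
import math

# Hypothetical catalogue: song id -> (acousticness, danceability, energy).
SONGS = {
    "s1": (0.90, 0.30, 0.20),
    "s2": (0.85, 0.35, 0.25),
    "s3": (0.10, 0.90, 0.95),
    "s4": (0.15, 0.80, 0.85),
}

def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recommend_songs(song_id, n, songs=SONGS):
    """Return the n songs most similar to song_id as (id, similarity) pairs."""
    target = songs[song_id]
    scored = [(other, cosine(target, feats))
              for other, feats in songs.items() if other != song_id]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]

print(recommend_songs("s1", 2))
```

Because the score depends only on song attributes, a track with no listening history can still be recommended, which is the cold start advantage mentioned earlier.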
The evaluation of our Spotify Recommendation System uses three key metrics: precision at k, recall at k, and AUC (Area Under the Curve) score. These metrics help us understand the model's performance on both training and test datasets while ensuring that the training interactions are excluded from the test evaluation to prevent data leakage.
Evaluation Metrics:
Precision at k: Measures the proportion of relevant recommendations among the top k recommendations.
Recall at k: Measures the proportion of all relevant items that are recommended in the top k recommendations.
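AUC (Area Under the Curve): Measures the model's ability to distinguish between relevant and irrelevant items, i.e., the probability that a randomly chosen relevant item is ranked above a randomly chosen irrelevant one.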
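The three metrics can be illustrated for a single user with plain Python. The function names and the toy scores are assumptions for illustration; the project's own evaluation code may differ, for example by averaging over all users.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    return sum(item in relevant for item in ranked[:k]) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    return sum(item in relevant for item in ranked[:k]) / len(relevant)

def auc(scores, relevant):
    """Share of (relevant, irrelevant) pairs ranked in the right order.

    Ties count as half. Returns None if either class is empty.
    """
    pos = [s for item, s in scores.items() if item in relevant]
    neg = [s for item, s in scores.items() if item not in relevant]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical model scores for five items; three are truly relevant.
scores = {"a": 0.9, "b": 0.8, "c": 0.4, "d": 0.3, "e": 0.1}
relevant = {"a", "c", "e"}
ranked = sorted(scores, key=scores.get, reverse=True)

print(precision_at_k(ranked, relevant, 2))
print(recall_at_k(ranked, relevant, 2))
print(auc(scores, relevant))
```

Recall at k is bounded by k divided by the number of relevant items, so a user with many relevant items will show low recall at small k even when the ranking is good.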
Results:
Precision:
Train Precision: 0.40
Test Precision: 0.20
Recall:
Train Recall: 0.05
Test Recall: 0.07
AUC:
Train AUC: 0.96
Test AUC: 0.95
Analysis:
AUC Scores: The AUC scores of 0.96 for training and 0.95 for testing indicate that the model ranks relevant items above irrelevant ones in the large majority of cases. The small gap between the two scores shows that this discriminative ability carries over to the test data.
Precision: The model's precision drops significantly from 0.40 on the training data to 0.20 on the test data. This suggests that while the model performs well in identifying relevant items during training, its ability to do so decreases with new data, indicating potential overfitting.
Recall: The recall values are low, with 0.05 for training and 0.07 for testing. This indicates that the model misses a significant number of relevant items, highlighting the need for improvement in identifying more relevant items.
Our Spotify Recommendation System shows a high level of accuracy in distinguishing between relevant and irrelevant items (high AUC scores). However, the drop in precision from training to testing data and the low recall values suggest that the model struggles to generalize and identify a higher proportion of relevant items with new data. To improve the model, we need to focus on enhancing its ability to generalize and correctly identify more relevant recommendations, especially when applied to unseen data.