We used the CamRa2011 dataset. We do not describe the entire dataset here, only the subset used in our project. Dataset description:
Number of Groups: 290
Number of Users: 602
Number of Items: 7710
Data structures generated from the data:
User Training Matrix, holding user-item rating data
Group Training Matrix, holding group-item rating data
Group-User dictionary, mapping each group to a dictionary of its users
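As a minimal sketch of how these three structures might be built (the tuple format and IDs below are illustrative toy data, not the actual CamRa2011 file layout):

```python
from collections import defaultdict

# Hypothetical toy input: (user, item, rating) triples, (group, user) memberships,
# and (group, item, rating) triples. The real CamRa2011 files differ.
user_ratings = [(0, 10, 4.0), (1, 10, 3.5), (1, 12, 5.0)]
group_members = [(100, 0), (100, 1)]
group_ratings = [(100, 10, 4.0)]

# User Training Matrix: user -> {item: rating}
user_matrix = defaultdict(dict)
for u, i, r in user_ratings:
    user_matrix[u][i] = r

# Group Training Matrix: group -> {item: rating}
group_matrix = defaultdict(dict)
for g, i, r in group_ratings:
    group_matrix[g][i] = r

# Group-User dictionary: group -> set of member users
group_users = defaultdict(set)
for g, u in group_members:
    group_users[g].add(u)
```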
Note: Improved MF refers to the baseline-included MF
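A hedged sketch of this distinction, assuming the standard biased-MF prediction rule (global mean plus user and item biases plus the latent inner product); all values below are toy numbers, not learned parameters:

```python
# Toy parameters; in training these would be learned from the data.
MU = 3.6                 # assumed global mean rating (illustrative)
b_user = {0: 0.2}        # user bias (toy value)
b_item = {5: -0.1}       # item bias (toy value)
P = {0: [0.1, 0.3]}      # user latent factors
Q = {5: [0.2, 0.4]}      # item latent factors

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def predict_default(u, i):
    """Default MF: inner product of latent factors only."""
    return dot(P[u], Q[i])

def predict_baseline_included(u, i):
    """Baseline-included ('improved') MF: mu + b_u + b_i + p_u . q_i."""
    return MU + b_user[u] + b_item[i] + dot(P[u], Q[i])
```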
For our experiment:
Decay Rate=0.1
Step Size=10
k=100 (k = number of recommendations)
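The two learning-rate schedules named in the test cases below could be sketched as follows, using the decay rate and step size above. The initial rate LR0 and the exact decay formula are assumptions; the project's implementation may differ:

```python
import math

LR0, DECAY_RATE, STEP_SIZE = 0.01, 0.1, 10  # LR0 is an assumed initial rate

def exp_decay(epoch):
    """Exponential decay: lr = lr0 * exp(-decay_rate * epoch)."""
    return LR0 * math.exp(-DECAY_RATE * epoch)

def exp_decay_with_step(epoch):
    """Exponential decay with step size: the rate drops once every STEP_SIZE epochs."""
    return LR0 * math.exp(-DECAY_RATE * (epoch // STEP_SIZE))
```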
NOTE: Wherever we mention NDCG and RMSE going forward, we mean NDCG@k and RMSE@k
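For reference, NDCG@k and RMSE@k can be computed as sketched below; this is a standard formulation and may differ in details from the project's own evaluation code:

```python
import math

def ndcg_at_k(ranked_items, relevance, k=100):
    """NDCG@k: DCG of the top-k recommended items divided by the ideal DCG."""
    topk = ranked_items[:k]
    dcg = sum(relevance.get(item, 0.0) / math.log2(pos + 2)
              for pos, item in enumerate(topk))
    ideal = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(rel / math.log2(pos + 2) for pos, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

def rmse_at_k(preds, truth, k=100):
    """RMSE over the top-k (predicted rating, true rating) pairs."""
    pairs = list(zip(preds, truth))[:k]
    return math.sqrt(sum((p - t) ** 2 for p, t in pairs) / len(pairs))
```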
To interpret our results, let us first restate our base cases:
Base_Case: APD
(
Social Relationship Descriptor= All;
Group Decision Strategies= Max Satisfaction, Avg Satisfaction, Minimum Misery;
Expertise Descriptor=yes;
Dissimilarity Descriptor= APD;
(W1, W2)=(0.8, 0.2);
(MF Type)=N.A.;
LR Schedule=N.A.
)
Base_Case: VD
(
Social Relationship Descriptor= All;
Group Decision Strategies= Max Satisfaction, Avg Satisfaction, Minimum Misery;
Expertise Descriptor=yes;
Dissimilarity Descriptor= VD;
(W1, W2)=(0.8, 0.2);
(MF Type)=N.A.;
LR Schedule=N.A.
)
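The three group decision strategies listed in the base cases can be illustrated as simple aggregations of the members' predicted scores for a candidate item (the scores below are made up for illustration):

```python
member_scores = [4.5, 3.0, 2.5]  # illustrative per-member predicted ratings

# Max Satisfaction: the most-pleased member decides the group score.
max_satisfaction = max(member_scores)

# Avg Satisfaction: the group score is the mean of the members' scores.
avg_satisfaction = sum(member_scores) / len(member_scores)

# Minimum Misery: the least-pleased member decides the group score.
minimum_misery = min(member_scores)
```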
-----------------------------------------------------------------------------------------------------------
We compared our results against these two base cases (whichever is appropriate for the test case being evaluated) and made the following observations:
APD Comparisons:
The following Test Case gives the best NDCG result:
Best_Case: NDCG
(
Social Relationship Descriptor= All;
Group Decision Strategies= Max Satisfaction, Avg Satisfaction, Minimum Misery;
Expertise Descriptor=yes;
Dissimilarity Descriptor= APD;
(W1, W2)=(0.8, 0.2);
(MF Type)=Default;
LR Schedule=Exponential Decay with Step Size
)
The following Test Case gives the best RMSE result:
Best_Case: RMSE
(
Social Relationship Descriptor= All;
Group Decision Strategies= Max Satisfaction, Avg Satisfaction, Minimum Misery;
Expertise Descriptor=yes;
Dissimilarity Descriptor= APD;
(W1, W2)=(0.8, 0.2);
(MF Type)=Baseline-Included;
LR Schedule=Exponential Decay
)
The following Test Case gives the best Computation Time result:
Best_Case: Computation_Time
(
Social Relationship Descriptor= All;
Group Decision Strategies= Max Satisfaction, Avg Satisfaction, Minimum Misery;
Expertise Descriptor=yes;
Dissimilarity Descriptor= APD;
(W1, W2)=(0.8, 0.2);
(MF Type)=Default;
LR Schedule=Exponential Decay with Step Size
)
VD Comparisons:
The following Test Case gives the best RMSE result:
Best_Case: RMSE
(
Social Relationship Descriptor= All;
Group Decision Strategies= Max Satisfaction, Avg Satisfaction, Minimum Misery;
Expertise Descriptor=yes;
Dissimilarity Descriptor= VD;
(W1, W2)=(0.8, 0.2);
(MF Type)=Default;
LR Schedule=Exponential Decay
)
We found no test case with an NDCG result better than the base case.
We do not rely heavily on computation time, because it also depends on system factors and ongoing memory/stack usage.
Based on these observations, our interpretations are as follows:
For APD, increase in NDCG => our ranking has improved with respect to the ground truth.
For APD, decrease in RMSE => along with better ranking, we reduce rating prediction errors.
For APD, improvement in computation time => not very reliable, since computation time depends on the system and other factors.
For VD, decrease in RMSE => reduction in rating prediction error.
However, we cannot conclude from the above data that the improvement is significant, because the gains (especially the NDCG improvement of ~0.6%) are small.
Hence our next step is to verify that the improvement is consistent.