Lecture 19

Yelp reputation system evaluation

I uploaded the code to compute the median and do the sampling. Note: I updated the code on Friday, March 16; please use this newer version. I tested the code, and it seems to work, but please let me know if you discover any errors. The code can handle reviews with different weights (e.g., in my code, a review by a good user has weight 1.0, and a review by a bad user has weight < 1.0).
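
To illustrate why review weights matter, here is a minimal sketch of a weighted median over star ratings. It is not the posted code: the function name `weighted_median` and the (stars, weight) input format are assumptions made for this example, and the posted code may interpolate or break ties differently.

```python
from typing import Iterable, Tuple


def weighted_median(ratings: Iterable[Tuple[float, float]]) -> float:
    """Return the lowest star value at which cumulative weight reaches half the total.

    `ratings` holds (stars, weight) pairs. This is a plain lower weighted
    median, chosen for illustration only (an assumption, not the course code).
    """
    # Ignore zero-weight reviews and sort by star value.
    pairs = sorted((s, w) for s, w in ratings if w > 0)
    total = sum(w for _, w in pairs)
    if total == 0:
        raise ValueError("no reviews with positive weight")
    running = 0.0
    for stars, weight in pairs:
        running += weight
        if running >= total / 2:
            return stars
    return pairs[-1][0]  # only reachable through floating-point rounding
```

For example, with three 1-star reviews and two 5-star reviews, equal weights give a median of 1, while down-weighting the 1-star reviews to 0.2 each moves the median to 5.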

As a reminder, we decided to do the following evaluations of our Yelp work.

In the description of the evaluations, let median(h) be the median (computed with the code provided) of a star histogram h, and let eval(h, m) be computed as follows:

For each restaurant, compute the following, and sum the result over all restaurants:
- Sample 20 reviews from h around median m with the code provided.
- Compute the total number of users who found the sampled reviews useful (this count is part of the JSON data).

Note that the code deals gracefully with the cases where there are fewer than 20 sample-able reviews.
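
The definition of eval(h, m) can be sketched roughly as follows. Everything here is an assumption made for illustration: the sampler simply takes the reviews whose star value is closest to m (ties broken at random) as a stand-in for the sampling in the code provided, and each review is assumed to be a dict with flat keys "stars" and "useful", which may not match the layout of the actual JSON.

```python
import random
from typing import Dict, List, Optional

SAMPLE_SIZE = 20  # from the text: sample 20 reviews per restaurant

Review = Dict[str, float]  # assumed keys: "stars", "useful"


def sample_near_median(reviews: List[Review], m: float, k: int = SAMPLE_SIZE,
                       rng: Optional[random.Random] = None) -> List[Review]:
    """Stand-in sampler: the k reviews whose stars are closest to m."""
    if rng is None:
        rng = random.Random(0)
    shuffled = list(reviews)
    rng.shuffle(shuffled)  # randomize the order among equally distant reviews
    shuffled.sort(key=lambda r: abs(r["stars"] - m))  # stable sort keeps that order
    return shuffled[:k]  # slicing returns fewer than k when fewer exist


def eval_restaurant(reviews: List[Review], m: float,
                    rng: Optional[random.Random] = None) -> float:
    """Total 'useful' votes over the sampled reviews of one restaurant."""
    return sum(r["useful"] for r in sample_near_median(reviews, m, rng=rng))


def eval_all(restaurants: Dict[str, List[Review]], medians: Dict[str, float],
             rng: Optional[random.Random] = None) -> float:
    """Sum the per-restaurant totals over all restaurants."""
    return sum(eval_restaurant(reviews, medians[name], rng)
               for name, reviews in restaurants.items())
```

Because the sample is taken with a slice, a restaurant with fewer than 20 reviews simply contributes all of them.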

The evaluations are:

Baseline. Let m1 = median(h), where h is the uncorrected (original) star histogram, and compute eval(h, m1). This is what would happen if we took Yelp reviews close to the median as our guess / prediction for which reviews are helpful.
Debiasing to compute median. Let m2 = median(h'), where h' is the star histogram after you have de-biased the user reviews with your algorithm. Then, compute eval(h, m2). In other words, you are using the de-biasing process to estimate the true star value of the restaurant (m2), and then you use that (presumably better) star value to select, from the original histogram, the reviews that had close to that number of stars.
Debiasing as selection. Let m2 = median(h') as before, and compute eval(h', m2). In other words, after you debias and compute a new median m2, you select reviews that are close in value to m2 once debiased.
Arbitrary (optional). Use any algorithm you wish to select reviews from a restaurant; the only restriction is that the algorithm must not use the text of the review or the information on how many users found it useful / cool / funny / ... . You should select no more than 20 reviews per restaurant, and you should make an effort to select 20 reviews whenever possible (this is a bit fuzzy; it is hard to make this evaluation fully comparable with the evaluations above unless we force all evaluations to always select min(20, n_reviews_of(h)) reviews). This is optional; do it only if you are interested or if you think it helps show particular aspects of your algorithm. Because this evaluation is so general, there is no end to the amount of work you could do to optimize it.
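
The first three evaluations differ only in which histogram supplies the median and which one is sampled from. A minimal sketch of how they fit together, assuming hypothetical callables `median_fn(h)` and `eval_fn(h, m)` that wrap the code provided and operate on one restaurant at a time (the names, the dictionary layout, and the result keys are all assumptions):

```python
from typing import Callable, Dict, TypeVar

H = TypeVar("H")  # whatever type represents one restaurant's star histogram


def run_evaluations(originals: Dict[str, H], debiased: Dict[str, H],
                    median_fn: Callable[[H], float],
                    eval_fn: Callable[[H, float], float]) -> Dict[str, float]:
    """Sum each of the three evaluations over all restaurants."""
    totals = {"baseline": 0.0, "debias_median": 0.0, "debias_selection": 0.0}
    for name, h in originals.items():
        h_prime = debiased[name]   # histogram after de-biasing
        m1 = median_fn(h)          # median of the original histogram
        m2 = median_fn(h_prime)    # median of the de-biased histogram
        totals["baseline"] += eval_fn(h, m1)
        totals["debias_median"] += eval_fn(h, m2)
        totals["debias_selection"] += eval_fn(h_prime, m2)
    return totals
```

Keeping the median and sampling functions as parameters makes it easy to check that all three numbers come from the same code path and differ only in their inputs.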

Posting sorted restaurant lists

Please post on the wiki a list of restaurants, sorted by decreasing median star value as computed by your algorithm after debiasing (when you post, name the file after yourself, so we know to whom each file belongs). Please do so this week if you can, so that you can compare your results with those of others. Likewise, you are more than welcome to post the results of your evaluations to the wiki, so that we can all see how we are doing (you will not be graded in proportion to your performance!).
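
One possible way to produce such a list, assuming a hypothetical dict that maps each restaurant name to its debiased median; the tab-separated output format and the tie-breaking by name are assumptions, not a required layout:

```python
from typing import Dict, List


def sorted_restaurant_lines(medians: Dict[str, float]) -> List[str]:
    """Format restaurants by decreasing median; ties are ordered by name."""
    ranked = sorted(medians.items(), key=lambda item: (-item[1], item[0]))
    return [f"{name}\t{median:.2f}" for name, median in ranked]
```

Breaking ties deterministically makes lists from different people easier to compare line by line.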
