The concept of Shapley values comes from cooperative game theory, which treats a group of players (a coalition) rather than the individual player as the primary unit of decision making. Before looking at how Shapley values are used in a data problem, let's dig a little deeper into cooperative game theory with an example.
Let’s say Amar, Akbar and Anthony go to a restaurant and share the food among themselves. When the bill arrives, it is difficult to work out each person's share, since they did not eat equal portions of each dish. But from their past experience they know that when Amar visits alone he pays 150, when Akbar visits alone he pays 240, .....
Putting it mathematically in terms of a cost function Cost(s), where s is a subset of {Amar, Akbar, Anthony}:
Cost(s) => 150, s = {Amar}                  // Only Amar visits the restaurant
           240, s = {Akbar}                 // Only Akbar visits the restaurant
           180, s = {Anthony}               // Only Anthony visits the restaurant
           360, s = {Amar, Akbar}           // Amar & Akbar visit the restaurant together
           340, s = {Amar, Anthony}
           420, s = {Akbar, Anthony}
           560, s = {Amar, Akbar, Anthony}  // All three of them visit together
Now, let’s list every possible ordering of the three within the coalition, with the rule that each person pays their marginal contribution according to their position in the ordering, i.e. the amount by which the bill rises when they join those who came before them.
If we take the ordering (Amar, Akbar, Anthony) and start with Amar, he has to pay 150, as Cost(s) => 150, s = {Amar}. Moving to Akbar, he pays 210, as Cost(s) => 360, s = {Amar, Akbar} and Amar has already paid 150. Lastly, since Cost(s) => 560, s = {Amar, Akbar, Anthony}, Anthony pays the remaining 200.
Performing the same operation on all six orderings and averaging each person's payments across them gives their individual marginal contribution, that is, their share of the bill when they dine together.
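The procedure above can be sketched in a few lines of Python. This is only a minimal illustration that assumes the cost table given earlier. The names `players`, `cost` and `shapley_values` are hypothetical helpers introduced for this example, not part of any library.

```python
from itertools import permutations

# Cost table from the restaurant example (the variable names are illustrative).
players = ("Amar", "Akbar", "Anthony")
cost = {
    frozenset({"Amar"}): 150,
    frozenset({"Akbar"}): 240,
    frozenset({"Anthony"}): 180,
    frozenset({"Amar", "Akbar"}): 360,
    frozenset({"Amar", "Anthony"}): 340,
    frozenset({"Akbar", "Anthony"}): 420,
    frozenset({"Amar", "Akbar", "Anthony"}): 560,
}


def shapley_values(players, cost):
    """Average each player's marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        seated = frozenset()  # players who have already joined
        paid = 0              # bill covered so far in this ordering
        for p in order:
            seated = seated | {p}
            totals[p] += cost[seated] - paid  # marginal contribution of p
            paid = cost[seated]
    return {p: totals[p] / len(orderings) for p in players}


shares = shapley_values(players, cost)
for name, share in shares.items():
    print(f"{name}: {share:.2f}")
```

With the numbers above this prints roughly 143.33 for Amar, 228.33 for Akbar and 188.33 for Anthony, which add up to the full bill of 560. Enumerating every ordering is fine for three players but grows factorially, so it would not scale to many players.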
Now, coming back to our original discussion: for a given observation, we assign a weight (the Shapley value) to each feature, in the cooperative game theory sense, such that the sum of all the Shapley values equals the difference between the prediction for that observation and the average prediction over the dataset. In other words, Shapley values determine how much each feature contributes towards pushing the prediction away from the average prediction value.
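The same statement can be written compactly as an equation. The notation is an assumption made for this sketch and does not come from the text above: f is the model, x the observation, M the number of features, \phi_i(x) the Shapley value of feature i, and \mathbb{E}[f(X)] the average prediction over the dataset.

```latex
\sum_{i=1}^{M} \phi_i(x) = f(x) - \mathbb{E}[f(X)]
```

In the restaurant analogy, the features play the role of the diners and the difference on the right-hand side plays the role of the total bill being split.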