- Bayesian inference
- Beta distribution, and beta function (needed to renormalize the beta distribution). Note in particular the remark that the beta distribution is the posterior distribution of the parameter p of a binomial distribution after observing \alpha - 1 successes and \beta - 1 failures; that is what we use in class (see also the video lecture, and the small sketch after this list).
- You can also read the brief section on "Crowdsensus" in this paper, which gives some general context for the problem we saw in class of gathering information from a crowd.
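To make the beta-binomial relation from the readings concrete, here is a minimal Python sketch of the update; the use of scipy and the specific counts are illustration choices, not part of the readings.

```python
# A minimal sketch of the beta-binomial update.  Starting from a
# uniform Beta(1, 1) prior on the binomial parameter p, observing
# s successes and f failures yields a Beta(1 + s, 1 + f) posterior;
# equivalently, Beta(alpha, beta) is the posterior after alpha - 1
# successes and beta - 1 failures.

from scipy.stats import beta

def posterior_after(successes, failures):
    """Posterior over p, starting from a uniform Beta(1, 1) prior."""
    return beta(1 + successes, 1 + failures)

post = posterior_after(7, 3)   # 7 successes, 3 failures -> Beta(8, 4)
print(post.mean())             # posterior mean: 8 / (8 + 4) = 2/3
print(post.interval(0.95))     # central 95% credible interval for p
```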
For homework, please think about the following problems. Write up your thoughts, and any math, so that we can discuss them in class. You will turn in your write-ups in one week (Tuesday, January 17), so you have time to refine them after the class discussion on Thursday. These are hard problems, and I do not expect you to solve them, but simply to make an attempt and document it.
- What is a good way to gather input from a large number of people and decide, e.g., what the correct phone number of a restaurant is, or what color a given building is? We saw in class that Bayesian inference does not scale well to a large number of inputs. We proposed in class a method whereby users vote, with each vote having a weight proportional to the cost of achieving that user's degree of accuracy. Is this vote-with-accuracy-cost a good scheme? What are its properties; can we prove anything about it? Do you have better proposals? (A sketch of the scheme appears after this list.)
- Assume a building can be either Red, Green, or Blue (R, G, B), and we ask many people to tell us which color it is. We saw in class that, using Bayesian inference or voting, we can compute from the user inputs a probability that the building is R, G, or B. We also saw in class how this probability over R, G, B can be used to update the probability distribution p_u(x), indicating how likely it is that user u is accurate (tells the truth) with probability x. We can of course keep iterating: from the user inputs, assuming some user accuracies, we compute a probability distribution over R, G, B; using this distribution, we update the probability distributions over user accuracy; using the updated accuracy distributions, we compute a new probability distribution over R, G, B; and so on, back and forth. But does this process converge? Can we prove convergence? (A sketch of the iteration appears below.)
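For the first problem, here is a sketch of the vote-with-accuracy-cost scheme, mostly to give us something concrete to analyze. The cost function is a placeholder (the proposal does not fix it), and all names and data below are made up for illustration.

```python
# A sketch of weighted voting where each vote counts in proportion to
# a (placeholder) cost of achieving the voter's accuracy.

import math
from collections import defaultdict

def cost_of_accuracy(a):
    # Hypothetical cost model: reaching accuracy a costs the log-odds
    # log(a / (1 - a)); meaningful for a > 0.5.  Any increasing
    # function of a could be substituted here.
    return math.log(a / (1.0 - a))

def weighted_vote(votes):
    """votes: list of (answer, user_accuracy) pairs."""
    score = defaultdict(float)
    for answer, acc in votes:
        score[answer] += cost_of_accuracy(acc)
    return max(score, key=score.get)

votes = [("R", 0.9), ("G", 0.6), ("G", 0.6), ("G", 0.6)]
print(weighted_vote(votes))  # "R": one strong vote beats three weak ones
```

Incidentally, for binary questions the log-odds weight used as the placeholder is known to be the optimal vote weight for independent voters with known accuracies; that may be a useful starting point for the analysis.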
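For the second problem, here is a simplified sketch of the iteration. Two loud simplifications: each user's accuracy is kept as a point estimate rather than a full distribution p_u(x), and the noise model and data are made up; a user with accuracy a is assumed to report the true color with probability a, and each wrong color with probability (1 - a) / 2.

```python
# Alternating fixed-point iteration: item posteriors from accuracies,
# then accuracies from item posteriors, until nothing moves.

COLORS = ("R", "G", "B")

# votes[item][user] = reported color (hypothetical data)
votes = {
    "building1": {"u1": "R", "u2": "R", "u3": "G"},
    "building2": {"u1": "B", "u2": "B", "u3": "B"},
    "building3": {"u1": "G", "u2": "R", "u3": "G"},
}

def item_posteriors(votes, acc):
    """P(true color | votes) per item, given accuracy estimates acc."""
    post = {}
    for item, user_votes in votes.items():
        weights = {}
        for c in COLORS:
            w = 1.0 / len(COLORS)  # uniform prior over colors
            for u, v in user_votes.items():
                w *= acc[u] if v == c else (1.0 - acc[u]) / 2.0
            weights[c] = w
        z = sum(weights.values())
        post[item] = {c: weights[c] / z for c in COLORS}
    return post

def user_accuracies(votes, post):
    """Each user's expected fraction of correct answers."""
    users = {u for uv in votes.values() for u in uv}
    acc = {}
    for u in users:
        answered = [(item, uv[u]) for item, uv in votes.items() if u in uv]
        acc[u] = sum(post[item][v] for item, v in answered) / len(answered)
    return acc

acc = {u: 0.7 for uv in votes.values() for u in uv}  # initial guess
for _ in range(20):
    post = item_posteriors(votes, acc)
    new_acc = user_accuracies(votes, post)
    delta = max(abs(new_acc[u] - acc[u]) for u in acc)
    acc = new_acc
    if delta < 1e-9:  # stop once the estimates no longer move
        break

print(acc)
print({item: max(p, key=p.get) for item, p in post.items()})
```

On this toy data the alternation settles quickly; the homework question is whether it always does, and whether the fixed point it reaches is sensible (notice, for instance, how the accuracy of a user who agrees with the emerging consensus keeps getting reinforced).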