2.4. MORE ON RATINGS - THE NETFLIX CHALLENGEThe NeighborhoodUsing everybody's ratings to predict Mary's ratings suffers from a clear weakness. A person's tastes depend on factors such as demographics (gender, age), education, family background, etc. It would be more effective if we could use only the ratings of users that share background with Mary. However, background information is usually not available so, instead, we may try to identify those individuals whose ratings are in closer agreement with Mary's ratings on similar items. We can think of such people as an affinity group for Mary, but the technical term used such a group is neighborhood. Here is how can go about defining such a neighborhood. Suppose we have two users U1 and U2. Let us use R1(A) to denote the rating user U1 gave to some item A and R2(A) to denote the corresponding rating for user U2. We can define the Distance between the two users by the adding up the absolute values of the differences of each pair of ratings over all items that both have rated.
For the example of Table 2.2.1 we have
Then we can define the neighborhood for user U1 as consisting of the, say, three users that are closest to U1 according to this distance. (The technical literature uses the term the K-Nearest Neighbors of U1 for such a group when we take the K users closest to U1.) From Table 2.4.1 we see that the 3-Nearest Neighbors of John are Richard, Lucy, and Michael or Basil. The 3-Nearest Neighbors of Richard are John, Lucy, and Michael or Basil, and so forth. Next we look at all items that U1 does not have and form the combined score of such items by U1's affinity group (or, in technical terminology, U1's K-Nearest Neighbors). The combined score could be simply the sum of individual scores
We can find the item with the largest score and recommend to user U1. Combining Everything - The Netflix ChallengeOf course we can combine methods, for example, by finding a regression line for the K-Nearest Neighbors and use that line to predict Mary's ratings. There are several other refinements possible. Some users give habitually high ratings while others give habitually low ratings. We can compensate for such bias by computing for each user the average rating for all items and then use the differences of ratings from the average to make a prediction. One method of this kind are discussed in Section 2.X. The first round of the Netflix Challenge was won by a group from the AT&T Research Labs (they used to be called Bellcore) who combined several methods to improve upon the ratings. Then they combined their efforts with those of two other top teams creating a new team called BellKor’s Pragmatic Chaos and that team won the final contest. The following is from the Netflix web site [A]. Emphasis has been added.
Cinematch is the in-house system of Netflix. The company was planning a second challenge but it changed plans in response to legal objections regarding the privacy of the company's customers whose movie ratings had been used in the design and test sets. The Bottom LineComputers try to predict what movie or book you might like, not the basis of your personality or your demographics but only on the basis of your past shopping (or ratings) history. They try to identify other people who made similar purchase as you did (or rated items in a similar way as you did). The they assume that you have similar tastes with people with whom you share purchasing history. Most of the time their predictions are accurate but that only demostrates the power of the mathematical and statistical methods used. Notes[A] http://www.netflixprize.com//community/viewtopic.php?id=1537 |