
Mean Average Precision



reference: https://www.kaggle.com/wiki/MeanAveragePrecision
Introduction

Parameters: n

Suppose there are m missing outbound edges from a user in a social graph, and you can predict up to n other nodes that the user is likely to follow. Then, by adapting the definition of average precision from information retrieval (http://en.wikipedia.org/wiki/Information_retrieval, http://sas.uwaterloo.ca/stats_navigation/techreports/04WorkingPapers/2004-09.pdf), the average precision at n for this user is:

ap@n = ( Σ_{k=1}^{n} P(k) ) / min(m, n)

where P(k) is the precision at cut-off k in the recommendation list, i.e., the number of followed nodes among the first k recommendations, divided by k; P(k) equals 0 when the k-th recommended item is not followed; m is the number of relevant nodes; n is the number of predicted nodes. If the denominator min(m, n) is zero, ap@n is set to zero.

(1) If the user follows recommended nodes #1 and #3 along with another node that wasn't recommended, then ap@10 = (1/1 + 2/3)/3 ≈ 0.56

(2) If the user follows recommended nodes #1 and #2 along with another node that wasn't recommended, then ap@10 = (1/1 + 2/2)/3 ≈ 0.67

(3) If the user follows recommended nodes #1 and #3 and has no other missing nodes, then ap@10 = (1/1 + 2/3)/2 ≈ 0.83
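The per-user computation can be sketched in Python. The function name `apk` and its signature are illustrative, not from the source; they follow the style of the reference implementations linked in the Sample Implementations section below.

```python
def apk(actual, predicted, n=10):
    """Average precision at n.

    actual    -- set of relevant items (the user's m missing edges)
    predicted -- ordered list of up to n recommended items
    """
    predicted = predicted[:n]  # score at most n predictions
    score = 0.0
    hits = 0
    for k, p in enumerate(predicted, start=1):
        # P(k) contributes only when the k-th prediction is a new hit
        if p in actual and p not in predicted[:k - 1]:
            hits += 1
            score += hits / k
    denom = min(len(actual), n)
    return score / denom if denom > 0 else 0.0

# Example (1): hits at ranks 1 and 3, m = 3 relevant nodes
print(round(apk({"A", "C", "Z"}, ["A", "B", "C", "D", "E"], 10), 2))  # 0.56
```

Plugging in examples (2) and (3) the same way reproduces 0.67 and 0.83.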

The mean average precision for N users at position n is the average of the average precision of each user, i.e.,

MAP@n = ( Σ_{i=1}^{N} ap@n_i ) / N

Note that this means order matters, but only when at least one prediction is incorrect. In other words, if all predictions are correct, the order in which they are given doesn't matter.

Thus, if you recommend two nodes A & B in that order and a user follows node A and not node B, your MAP@2 score will be higher (better) than if you recommended B and then A. This makes sense - you want the most relevant results to show up first. Consider the
following examples:

(1) The user follows recommended nodes #1 and #2 and has no other missing nodes, then ap@2 = (1/1 + 1/1)/2 = 1.0

(2) The user follows recommended nodes #2 and #1 and has no other missing nodes, then ap@2 = (1/1 + 1/1)/2 = 1.0

(3) The user follows node #1 and it was recommended first along with another node that wasn't recommended, then ap@2 = (1/1 + 0)/2 = 0.5

(4) The user follows node #1 but it was recommended second along with another node that wasn't recommended, then ap@2 = (0 + 1/2)/2 = 0.25

So it is better to submit the more certain recommendations first; the AP score reflects this.
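Order sensitivity can be checked with a small self-contained sketch; the names `apk` and `mapk` are illustrative, following the style of the reference implementations linked below.

```python
def apk(actual, predicted, n):
    """Average precision at n for one user, as defined above."""
    predicted = predicted[:n]
    score, hits = 0.0, 0
    for k, p in enumerate(predicted, start=1):
        if p in actual and p not in predicted[:k - 1]:
            hits += 1
            score += hits / k
    denom = min(len(actual), n)
    return score / denom if denom > 0 else 0.0

def mapk(actuals, predictions, n):
    """Mean of the per-user average precisions over N users."""
    return sum(apk(a, p, n) for a, p in zip(actuals, predictions)) / len(actuals)

# Examples (3) and (4): same single hit, different rank
print(apk({"A", "Z"}, ["A", "B"], 2))  # 0.5  -- relevant node ranked first
print(apk({"A", "Z"}, ["B", "A"], 2))  # 0.25 -- relevant node ranked second
```

`mapk` simply averages these per-user scores, so ranking relevant items earlier raises the overall MAP@n.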

Here's an easy intro to MAP: http://fastml.com/what-you-wanted-to-know-about-mean-average-precision/

Here's another intro to MAP from our forums.


Sample Implementations

our C# production implementation
R, with test cases
Haskell, with test cases
MATLAB / Octave, with test cases
Python, with test cases


Contests that used MAP@K

MAP@500: https://www.kaggle.com/c/msdchallenge/details/Evaluation
MAP@200: https://www.kaggle.com/c/event-recommendation-engine-challenge
MAP@10: https://www.kaggle.com/c/FacebookRecruiting
MAP@10: https://www.kaggle.com/c/coupon-purchase-prediction/details/evaluation
MAP@5: https://www.kaggle.com/c/expedia-hotel-recommendations
MAP@3: https://www.kddcup2012.org/c/kddcup2012-track1/details/Evaluation


Article needs:

explanation
formula
example solution & submission files