For brevity of notation below, let us define the number of relevant documents among the first k retrieved documents as #rel@k.
relevant | A document in the result list of a query is relevant if it fulfills the user's information need. For the 20 Newsgroups test corpus, a retrieved document is relevant to a query document if it belongs to the same newsgroup. |
yield | For a query: the total number of relevant documents in the test corpus. |
precision@k | #rel@k divided by k, the number of retrieved documents considered. E.g. P@10 = #rel@10 / 10. |
recall@k | #rel@k divided by yield (the total number of relevant documents in the corpus). |
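The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the result list is represented as a ranked list of booleans (True for a relevant document), and the names `rel_at_k`, `precision_at_k`, `recall_at_k` and the sample data are hypothetical, not taken from any particular library.

```python
def rel_at_k(relevance, k):
    """#rel@k: number of relevant documents among the first k results."""
    return sum(1 for is_relevant in relevance[:k] if is_relevant)


def precision_at_k(relevance, k):
    """precision@k = #rel@k / k (assumes k >= 1)."""
    return rel_at_k(relevance, k) / k


def recall_at_k(relevance, k, yield_):
    """recall@k = #rel@k / yield (assumes yield_ >= 1)."""
    return rel_at_k(relevance, k) / yield_


# Hypothetical ranked result list: True marks a relevant document.
relevance = [True, False, True, True, False, False, True, False, False, False]
corpus_yield = 8  # assumed total number of relevant documents for this query

p_at_10 = precision_at_k(relevance, 10)            # 4 / 10
r_at_10 = recall_at_k(relevance, 10, corpus_yield)  # 4 / 8
```

Note that this sketch always divides by k, even if fewer than k documents were retrieved; that is one common convention, assumed here for simplicity.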
Sometimes the precision increases after we see more result documents. ...
The R-precision measures the precision at the cutoff k = yield. At this cutoff, precision equals recall by definition, because both divide #rel@yield by yield. An ideal retrieval system has an R-precision of 1.0 for every query.
r_precision = #rel@yield / yield;
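As a rough sketch, R-precision for a single 20 Newsgroups query could be computed as follows. The function name `r_precision`, the sample newsgroup labels, and the yield of 4 are assumptions made up for illustration; relevance is decided by comparing newsgroup labels, as described above.

```python
def r_precision(relevance, yield_):
    """Precision at cutoff k = yield, i.e. #rel@yield / yield."""
    if yield_ <= 0:
        raise ValueError("yield must be positive")
    rel_at_yield = sum(1 for is_relevant in relevance[:yield_] if is_relevant)
    return rel_at_yield / yield_


# Hypothetical query document from "sci.space" and the newsgroups of the
# documents retrieved for it, in ranked order.
query_group = "sci.space"
retrieved_groups = ["sci.space", "sci.space", "rec.autos",
                    "sci.space", "sci.med", "sci.space"]

# A retrieved document is relevant if it belongs to the same newsgroup.
relevance = [group == query_group for group in retrieved_groups]

# Assume the test corpus holds 4 relevant documents for this query.
score = r_precision(relevance, 4)  # 3 relevant in the first 4 -> 0.75
```

An ideal ranking would place all 4 relevant documents in the first 4 positions, giving a score of 1.0.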