rectorch.metrics
Module containing the definition of the evaluation metrics.
The metrics are implemented as static methods of the class Metrics. Up to now the following metrics are implemented: hit_at_k, mrr_at_k, ndcg_at_k, recall_at_k.

See also
Modules: evaluation
class rectorch.metrics.Metrics [source]
The class Metrics contains the metric functions. All methods are static, so no object of type Metrics is needed to compute the metrics.
static compute(pred_scores, ground_truth, metrics_list) [source]
Compute the given list of evaluation metrics.
The method computes all the metrics listed in metrics_list for all the users.
- Parameters
  - pred_scores : numpy.array
    The array with the predicted scores. Users are on the rows and items on the columns.
  - ground_truth : numpy.array
    Binary array with the ground truth. 1 means the item is relevant for the user and 0 not relevant. Users are on the rows and items on the columns.
  - metrics_list : list of str
    The list of metrics to compute. Metrics are indicated by strings of the form metric_name@k, where metric_name must correspond to one of the method names without the suffix '_at_k', and k is the corresponding parameter of the method and must be an integer value. For example, ndcg@10 is a valid metric name and corresponds to the method ndcg_at_k() with k=10.
- Returns
  dict of numpy.array
  Dictionary with the results for each metric in metrics_list. Keys are strings representing the metric, while the value is an array with the value of the metric computed on the users.
Examples
>>> import numpy as np
>>> from rectorch.metrics import Metrics
>>> scores = np.array([[4., 3., 2., 1., 0.]])
>>> ground_truth = np.array([[1., 1., 0., 0., 1.]])
>>> met_list = ["recall@2", "recall@3", "ndcg@2"]
>>> Metrics.compute(scores, ground_truth, met_list)
{'recall@2': array([1.]),
'recall@3': array([0.66666667]),
'ndcg@2': array([1.])}
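The name@k convention makes it possible to drive a whole evaluation from a list of strings. As a rough illustration of how such a dispatcher can work (not the library's actual internals; the helper name compute_metrics is hypothetical):

import numpy as np
from rectorch.metrics import Metrics

def compute_metrics(pred_scores, ground_truth, metrics_list):
    # Hypothetical re-implementation of Metrics.compute for illustration.
    results = {}
    for metric in metrics_list:
        name, k = metric.split("@")               # e.g. "ndcg@10" -> ("ndcg", "10")
        method = getattr(Metrics, name + "_at_k")  # "ndcg" -> Metrics.ndcg_at_k
        results[metric] = method(pred_scores, ground_truth, int(k))
    return results

scores = np.array([[4., 3., 2., 1., 0.]])
ground_truth = np.array([[1., 1., 0., 0., 1.]])
print(compute_metrics(scores, ground_truth, ["recall@2", "ndcg@2"]))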
static hit_at_k(pred_scores, ground_truth, k=100) [source]
Compute the hit at k.
The hit@k is either 1, if a relevant item is among the top k scored items, or 0 otherwise.
- Parameters
  - pred_scores : numpy.array
    The array with the predicted scores. Users are on the rows and items on the columns.
  - ground_truth : numpy.array
    Binary array with the ground truth. 1 means the item is relevant for the user and 0 not relevant. Users are on the rows and items on the columns.
  - k : int [optional]
    The number of top items to consider, by default 100.
- Returns
  numpy.array
  An array containing the hit@k value for each user.
Examples
>>> import numpy as np
>>> from rectorch.metrics import Metrics
>>> scores = np.array([[4., 3., 2., 1.]])
>>> ground_truth = np.array([[0, 0, 1., 1.]])
>>> Metrics.hit_at_k(scores, ground_truth, 3)
array([1.])
>>> Metrics.hit_at_k(scores, ground_truth, 2)
array([0.])
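For reference, hit@k takes only a few lines of NumPy. A minimal sketch, assuming the same array layout as above (users on rows, items on columns), and not necessarily the library's implementation:

import numpy as np

def hit_at_k(pred_scores, ground_truth, k=100):
    # Indices of the k highest-scored items for each user (row).
    topk = np.argsort(-pred_scores, axis=1)[:, :k]
    rows = np.arange(pred_scores.shape[0])[:, None]
    # 1. if at least one of the top-k items is relevant, 0. otherwise.
    return (ground_truth[rows, topk].sum(axis=1) > 0).astype(float)

scores = np.array([[4., 3., 2., 1.]])
ground_truth = np.array([[0, 0, 1., 1.]])
print(hit_at_k(scores, ground_truth, 3))  # [1.]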
static mrr_at_k(pred_scores, ground_truth, k=100) [source]
Compute the Mean Reciprocal Rank (MRR).
The MRR@k is the mean, over all users, of the reciprocal rank, that is, the reciprocal of the rank of the highest ranked relevant item if any is in the top k, and 0 otherwise.
- Parameters
  - pred_scores : numpy.array
    The array with the predicted scores. Users are on the rows and items on the columns.
  - ground_truth : numpy.array
    Binary array with the ground truth. 1 means the item is relevant for the user and 0 not relevant. Users are on the rows and items on the columns.
  - k : int [optional]
    The number of top items to consider, by default 100.
- Returns
  numpy.array
  An array containing the mrr@k value for each user.
Examples
>>> import numpy as np
>>> from rectorch.metrics import Metrics
>>> scores = np.array([[4., 2., 3., 1.], [1., 2., 3., 4.]])
>>> ground_truth = np.array([[0, 0, 1., 1.], [0, 0, 1., 1.]])
>>> Metrics.mrr_at_k(scores, ground_truth, 3)
array([0.5, 1.])
>>> Metrics.mrr_at_k(scores, ground_truth, 1)
array([0., 1.])
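A minimal NumPy sketch of the same computation, again assuming users on rows and items on columns; this mirrors the definition above, not necessarily the library's code:

import numpy as np

def mrr_at_k(pred_scores, ground_truth, k=100):
    # Top-k item indices per user, ordered by decreasing score.
    topk = np.argsort(-pred_scores, axis=1)[:, :k]
    rows = np.arange(pred_scores.shape[0])[:, None]
    rel = ground_truth[rows, topk]            # binary relevance of the ranking
    ranks = np.arange(1, rel.shape[1] + 1)    # positions 1, 2, ..., k
    # rel / ranks is 1/rank where the item is relevant; the max picks the
    # first (highest-ranked) relevant item, and is 0 when none is in the top k.
    return (rel / ranks).max(axis=1)

scores = np.array([[4., 2., 3., 1.], [1., 2., 3., 4.]])
ground_truth = np.array([[0, 0, 1., 1.], [0, 0, 1., 1.]])
print(mrr_at_k(scores, ground_truth, 3))  # [0.5 1. ]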
static ndcg_at_k(pred_scores, ground_truth, k=100) [source]
Compute the Normalized Discounted Cumulative Gain (nDCG).
nDCG is a measure of ranking quality. nDCG measures the usefulness, or gain, of an item based on its position in the scoring list. The gain is accumulated from the top of the result list to the bottom, with the gain of each result discounted at lower ranks.
The nDCG is computed over the top-k items (out of \(m\)), where k is specified as a parameter, for all users independently. The nDCG@k (\(k \in [1,2,\dots,m]\)) is computed with the following formula:
\(nDCG@k = \frac{\textrm{DCG}@k}{\textrm{IDCG}@k}\)
where
\(\textrm{DCG}@k = \sum\limits_{i=1}^k \frac{2^{rel_i}-1}{\log_2 (i+1)},\)
\(\textrm{IDCG}@k = \sum\limits_{i=1}^{\min(k,R)} \frac{1}{\log_2 (i+1)}\)
where \(rel_i \in \{0,1\}\) indicates whether the item at the i-th position of the ranking is relevant, and \(R\) is the number of relevant items. Since relevance is binary, each of the first \(\min(k,R)\) items in the ideal ranking contributes \(\frac{2^1-1}{\log_2(i+1)} = \frac{1}{\log_2(i+1)}\), which gives the IDCG formula above.
- Parameters
  - pred_scores : numpy.array
    The array with the predicted scores. Users are on the rows and items on the columns.
  - ground_truth : numpy.array
    Binary array with the ground truth items. 1 means the item is relevant for the user and 0 not relevant. Users are on the rows and items on the columns.
  - k : int [optional]
    The number of top items to consider, by default 100.
- Returns
  numpy.array
  An array containing the ndcg@k value for each user.
Examples
>>> import numpy as np
>>> from rectorch.metrics import Metrics
>>> scores = np.array([[4., 3., 2., 1.]])
>>> ground_truth = np.array([[0, 0, 1., 1.]])
>>> Metrics.ndcg_at_k(scores, ground_truth, 3)
array([0.306573596])
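The formulas above translate directly into NumPy. A minimal sketch, which assumes every user has at least one relevant item (so IDCG is never zero) and is not necessarily the library's implementation:

import numpy as np

def ndcg_at_k(pred_scores, ground_truth, k=100):
    topk = np.argsort(-pred_scores, axis=1)[:, :k]
    rows = np.arange(pred_scores.shape[0])[:, None]
    rel = ground_truth[rows, topk]                   # rel_i of the top-k ranking
    discount = 1. / np.log2(np.arange(2, k + 2))     # 1 / log2(i + 1), i = 1..k
    dcg = ((2 ** rel - 1) * discount[:rel.shape[1]]).sum(axis=1)
    n_rel = ground_truth.sum(axis=1).astype(int)     # R, the number of relevant items
    idcg = np.array([discount[:min(k, n)].sum() for n in n_rel])
    return dcg / idcg                                # assumes R >= 1 for every user

scores = np.array([[4., 3., 2., 1.]])
ground_truth = np.array([[0, 0, 1., 1.]])
print(ndcg_at_k(scores, ground_truth, 3))  # [0.3065736]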
static recall_at_k(pred_scores, ground_truth, k=100) [source]
Compute the recall.
The recall@k is the fraction of the relevant items that are successfully ranked in the top-k. The recall is computed over the top-k items (out of \(m\)), where k is specified as a parameter, for all users independently.
Recall@k is computed as
\(\textrm{recall}@k = \frac{\textrm{TP}}{\textrm{TP}+\textrm{FN}}\)
where TP and FN are the true positive and the false negative retrieved items, respectively.
- Parameters
  - pred_scores : numpy.array
    The array with the predicted scores. Users are on the rows and items on the columns.
  - ground_truth : numpy.array
    Binary array with the ground truth. 1 means the item is relevant for the user and 0 not relevant. Users are on the rows and items on the columns.
  - k : int [optional]
    The number of top items to consider, by default 100.
- Returns
  numpy.array
  An array containing the recall@k value for each user.
Examples
>>> import numpy as np
>>> from rectorch.metrics import Metrics
>>> scores = np.array([[4., 3., 2., 1.]])
>>> ground_truth = np.array([[0, 0, 1., 1.]])
>>> Metrics.recall_at_k(scores, ground_truth, 3)
array([0.5])
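A minimal NumPy sketch following the TP/(TP+FN) definition above (users on rows, items on columns). Note that some implementations instead normalize by min(k, R), so that recall@k can reach 1 even when k is smaller than the number of relevant items; check the source for the exact convention:

import numpy as np

def recall_at_k(pred_scores, ground_truth, k=100):
    topk = np.argsort(-pred_scores, axis=1)[:, :k]
    rows = np.arange(pred_scores.shape[0])[:, None]
    tp = ground_truth[rows, topk].sum(axis=1)   # relevant items retrieved in the top-k
    n_rel = ground_truth.sum(axis=1)            # TP + FN: all relevant items of the user
    return tp / n_rel                           # assumes every user has a relevant item

scores = np.array([[4., 3., 2., 1.]])
ground_truth = np.array([[0, 0, 1., 1.]])
print(recall_at_k(scores, ground_truth, 3))  # [0.5]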