tensorflow streaming_recall@k & precision@k vs. sklearn

Note: the TensorFlow version used here is 1.1.0.

class_id is used to specify which class is treated as the positive class.

Below is the official documentation for tf.contrib.metrics.streaming_sparse_recall_at_k:

Signature: tf.contrib.metrics.streaming_sparse_recall_at_k(predictions, labels,
    k, class_id=None, weights=None, metrics_collections=None,
    updates_collections=None, name=None)

Computes recall@k of the predictions with respect to sparse labels.

If `class_id` is not specified, we'll calculate recall as the ratio of true
    positives (i.e., correct predictions, items in the top `k` highest
    `predictions` that are found in the corresponding row in `labels`) to
    actual positives (the full `labels` row).

If class_id is not specified, recall@k is the number of correctly predicted label entries divided by the total number of label entries. With one true label per sample, this is correct samples over total samples, so recall@k reduces to the definition of accuracy@k (top-k accuracy); the sketch after the docstring illustrates this.

If `class_id` is specified, we calculate recall by considering only the rows
    in the batch for which `class_id` is in `labels`, and computing the
    fraction of them for which `class_id` is in the top `k` highest
    `predictions`.

`streaming_sparse_recall_at_k` creates two local variables,
`true_positive_at_<k>` and `false_negative_at_<k>`, that are used to compute
the recall_at_k frequency. This frequency is ultimately returned as
`recall_at_<k>`: an idempotent operation that simply divides
`true_positive_at_<k>` by total (`true_positive_at_<k>` +
`false_negative_at_<k>`).

For estimation of the metric over a stream of data, the function creates an
`update_op` operation that updates these variables and returns the
`recall_at_<k>`. Internally, a `top_k` operation computes a `Tensor`
indicating the top `k` `predictions`. Set operations applied to `top_k` and
`labels` calculate the true positives and false negatives weighted by
`weights`. Then `update_op` increments `true_positive_at_<k>` and
`false_negative_at_<k>` using these values.

If `weights` is `None`, weights default to 1. Use weights of 0 to mask values.

Args:

  predictions: Float `Tensor` with shape [D1, ... DN, num_classes] where
    N >= 1. Commonly, N=1 and predictions has shape [batch size, num_classes].
    The final dimension contains the logit values for each class. [D1, ... DN]
    must match `labels`.
  labels: `int64` `Tensor` or `SparseTensor` with shape
    [D1, ... DN, num_labels], where N >= 1 and num_labels is the number of
    target classes for the associated prediction. Commonly, N=1 and `labels`
    has shape [batch_size, num_labels]. [D1, ... DN] must match `predictions`.
    Values should be in range [0, num_classes), where num_classes is the last
    dimension of `predictions`. Values outside this range always count
    towards `false_negative_at_<k>`.
  k: Integer, k for @k metric.
  class_id: Integer class ID for which we want binary metrics. This should be
    in range [0, num_classes), where num_classes is the last dimension of
    `predictions`. If class_id is outside this range, the method returns NAN.
  weights: `Tensor` whose rank is either 0, or n-1, where n is the rank of
    `labels`. If the latter, it must be broadcastable to `labels` (i.e., all
    dimensions must be either `1`, or the same as the corresponding `labels`
    dimension).
  metrics_collections: An optional list of collections that values should
    be added to.
  updates_collections: An optional list of collections that updates should
    be added to.
  name: Name of new update operation, and namespace for other dependent ops.

Returns:
  recall: Scalar `float64` `Tensor` with the value of `true_positives` divided
    by the sum of `true_positives` and `false_negatives`.
  update_op: `Operation` that increments `true_positives` and
    `false_negatives` variables appropriately, and whose value matches
    `recall`.

Raises:
  ValueError: If `weights` is not `None` and its shape doesn't match
    `predictions`, or if either `metrics_collections` or `updates_collections`
    are not a list or tuple.

File:      d:\programdata\anaconda3\lib\site-packages\tensorflow\contrib\metrics\python\ops\metric_ops.py
Type:      function
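
To make the two modes concrete, here is a minimal sketch against the TF 1.1 contrib API quoted above; the tensor values and the name arguments are made up for illustration:

import tensorflow as tf

# Two rows, three classes; one true label per row.
predictions = tf.constant([[0.1, 0.3, 0.6],
                           [0.2, 0.5, 0.3]])
labels = tf.constant([[2], [0]], dtype=tf.int64)

# class_id=None: recall over all label entries (top-1 accuracy here).
recall_all, update_all = tf.contrib.metrics.streaming_sparse_recall_at_k(
    predictions, labels, k=1, name='recall_all')
# class_id=2: only rows whose labels contain class 2 are counted.
recall_c2, update_c2 = tf.contrib.metrics.streaming_sparse_recall_at_k(
    predictions, labels, k=1, class_id=2, name='recall_c2')

with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())  # metric counters are local vars
    sess.run([update_all, update_c2])
    # Row 0: top-1 is class 2 == label (hit); row 1: top-1 is class 1 != 0
    # (miss) -> recall_all = 1/2. Only row 0 contains class 2, and it is
    # retrieved -> recall_c2 = 1/1.
    print(sess.run([recall_all, recall_c2]))  # [0.5, 1.0]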

Below is the official documentation for tf.contrib.metrics.streaming_sparse_precision_at_k:

Signature: tf.contrib.metrics.streaming_sparse_precision_at_k(predictions, labels,
    k, class_id=None, weights=None, metrics_collections=None,
    updates_collections=None, name=None)

Docstring:

Computes precision@k of the predictions with respect to sparse labels.

If `class_id` is not specified, we calculate precision as the ratio of true
    positives (i.e., correct predictions, items in the top `k` highest
    `predictions` that are found in the corresponding row in `labels`) to
    positives (all top `k` `predictions`).

If class_id is not specified, precision@k is the number of correct predictions divided by the total number of top-k predictions, i.e., the number of samples times k. Hence precision@k generally decreases as k grows; the sketch after the docstring illustrates this.

If `class_id` is specified, we calculate precision by considering only the
    rows in the batch for which `class_id` is in the top `k` highest
    `predictions`, and computing the fraction of them for which `class_id` is
    in the corresponding row in `labels`.

We expect precision to decrease as `k` increases.

`streaming_sparse_precision_at_k` creates two local variables,
`true_positive_at_<k>` and `false_positive_at_<k>`, that are used to compute
the precision@k frequency. This frequency is ultimately returned as
`precision_at_<k>`: an idempotent operation that simply divides
`true_positive_at_<k>` by total (`true_positive_at_<k>` +
`false_positive_at_<k>`).

For estimation of the metric over a stream of data, the function creates an
`update_op` operation that updates these variables and returns the
`precision_at_<k>`. Internally, a `top_k` operation computes a `Tensor`
indicating the top `k` `predictions`. Set operations applied to `top_k` and
`labels` calculate the true positives and false positives weighted by
`weights`. Then `update_op` increments `true_positive_at_<k>` and
`false_positive_at_<k>` using these values.

If `weights` is `None`, weights default to 1. Use weights of 0 to mask values.

Args:

  predictions: Float `Tensor` with shape [D1, ... DN, num_classes] where
    N >= 1. Commonly, N=1 and predictions has shape [batch size, num_classes].
    The final dimension contains the logit values for each class. [D1, ... DN]
    must match `labels`.
  labels: `int64` `Tensor` or `SparseTensor` with shape
    [D1, ... DN, num_labels], where N >= 1 and num_labels is the number of
    target classes for the associated prediction. Commonly, N=1 and `labels`
    has shape [batch_size, num_labels]. [D1, ... DN] must match `predictions`.
    Values should be in range [0, num_classes), where num_classes is the last
    dimension of `predictions`. Values outside this range are ignored.
  k: Integer, k for @k metric.
  class_id: Integer class ID for which we want binary metrics. This should be
    in range [0, num_classes), where num_classes is the last dimension of
    `predictions`. If `class_id` is outside this range, the method returns
    NAN.
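
Continuing with the same toy batch, here is a minimal precision@k sketch (again the TF 1.1 contrib API; the values are illustrative):

import tensorflow as tf

predictions = tf.constant([[0.1, 0.3, 0.6],
                           [0.2, 0.5, 0.3]])
labels = tf.constant([[2], [0]], dtype=tf.int64)

precision, update_op = tf.contrib.metrics.streaming_sparse_precision_at_k(
    predictions, labels, k=2)

with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())
    sess.run(update_op)
    # Top-2 sets are {2, 1} and {1, 2}. Only class 2 in row 0 is a true
    # positive, out of batch_size * k = 4 predictions -> precision@2 = 0.25.
    print(sess.run(precision))  # 0.25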

The sklearn version used here is 0.19.1.

The following is quoted from the sklearn documentation:

sklearn.metrics.recall_score(y_true, y_pred, labels=None, pos_label=1,
    average='binary', sample_weight=None)

Compute the recall.

The recall is the ratio tp / (tp + fn) where tp is the number of true
positives and fn the number of false negatives. The recall is intuitively the
ability of the classifier to find all the positive samples.

The best value is 1 and the worst value is 0.

Read more in the User Guide.
Parameters:

y_true : 1d array-like, or label indicator array / sparse matrix
    Ground truth (correct) target values.

y_pred : 1d array-like, or label indicator array / sparse matrix
    Estimated targets as returned by a classifier.

labels : list, optional
    The set of labels to include when average != 'binary', and their order if
    average is None. Labels present in the data can be excluded, for example
    to calculate a multiclass average ignoring a majority negative class,
    while labels not present in the data will result in 0 components in a
    macro average. For multilabel targets, labels are column indices. By
    default, all labels in y_true and y_pred are used in sorted order.

    Changed in version 0.17: parameter labels improved for multiclass problem.

pos_label : str or int, 1 by default
    The class to report if average='binary' and the data is binary. If the
    data are multiclass or multilabel, this will be ignored; setting
    labels=[pos_label] and average != 'binary' will report scores for that
    label only.

average : string, [None, 'binary' (default), 'micro', 'macro', 'samples', 'weighted']
    This parameter is required for multiclass/multilabel targets. If None,
    the scores for each class are returned. Otherwise, this determines the
    type of averaging performed on the data:

    'binary': Only report results for the class specified by pos_label. This
        is applicable only if targets (y_{true,pred}) are binary.
    'micro': Calculate metrics globally by counting the total true positives,
        false negatives and false positives.
    'macro': Calculate metrics for each label, and find their unweighted
        mean. This does not take label imbalance into account.
    'weighted': Calculate metrics for each label, and find their average,
        weighted by support (the number of true instances for each label).
        This alters 'macro' to account for label imbalance; it can result in
        an F-score that is not between precision and recall.
    'samples': Calculate metrics for each instance, and find their average
        (only meaningful for multilabel classification where this differs
        from accuracy_score).

sample_weight : array-like of shape = [n_samples], optional
    Sample weights.

Returns:

recall : float (if average is not None) or array of float, shape = [n_unique_labels]
    Recall of the positive class in binary classification or weighted average
    of the recall of each class for the multiclass task.

Examples

>>> from sklearn.metrics import recall_score
>>> y_true = [0, 1, 2, 0, 1, 2]
>>> y_pred = [0, 2, 1, 0, 0, 1]
>>> recall_score(y_true, y_pred, average='macro')
0.33...
>>> recall_score(y_true, y_pred, average='micro')
0.33...
>>> recall_score(y_true, y_pred, average='weighted')
0.33...
>>> recall_score(y_true, y_pred, average=None)
array([ 1.,  0.,  0.])


A few things to note.

First, y_true and y_pred can be 1-d arrays or multi-dimensional (label-indicator) arrays, but the two must have the same form.

Second, 'micro' computes TP, FP and FN globally; 'macro' computes TP, FP and FN per class and takes the unweighted arithmetic mean of the per-class scores; 'weighted' is like 'macro' except that it takes a weighted average, with weights given by the number of true instances of each class (the support). 'samples' is computed per instance; per the docstring above, it is only meaningful for multilabel targets, where the score is computed over each sample's label set and then averaged.

Third, pos_label and labels are used to pick which class (or classes) to treat as positive. The sketch below illustrates both the averaging modes and pos_label.
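
A minimal sketch with made-up toy arrays (sklearn 0.19-era API):

from sklearn.metrics import recall_score

# Binary case: pos_label picks which class counts as positive.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
print(recall_score(y_true, y_pred, pos_label=1))  # 2 of 3 true 1s found -> 0.666...
print(recall_score(y_true, y_pred, pos_label=0))  # 1 of 2 true 0s found -> 0.5

# Multiclass case from the docstring example above.
y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]
# Per-class recall is [1, 0, 0], and each class has support 2.
print(recall_score(y_true, y_pred, average='macro'))     # mean([1, 0, 0]) = 0.333...
print(recall_score(y_true, y_pred, average='micro'))     # 2 TP / (2 TP + 4 FN) = 0.333...
print(recall_score(y_true, y_pred, average='weighted'))  # equal supports -> same as macro
print(recall_score(y_true, y_pred, labels=[0], average=None))  # class 0 only -> [ 1.]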