SVM Theory and Experiments, Part 20: LIBSVM Multi-Label Experiments and Evaluation Metrics

2013-10-15 17:50

Dr. Xu Haijiao (徐海蛟), Teaching.

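The datasets are as follows:
Name   Source  Type         Classes  Training samples  Test samples  Features
-----------------------------------------------------------------------------
scene  MB04a   multi-label  6        1,211             1,196         294
yeast  AE02a   multi-label  14       1,500             917           103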

Training directly on the multi-label file fails with an error:
svm-train ../svm-data/scene scene.model
The data is therefore converted with trans_class.py first, then trained and tested as usual:
cd D:\work\Lab\libsvm-3.17\tools
python trans_class.py ../svm-data/scene ../svm-data/scene.t
> Generates: "tmp_train", "tmp_test", "tmp_class"
..\windows\svm-train -t 0 tmp_train
..\windows\svm-predict tmp_test tmp_train.model tmp.result
> Accuracy = 66.7224% (798/1196) (classification)  linear kernel (-t 0) √
> Accuracy = 16.4716% (197/1196) (classification)  polynomial kernel
> Accuracy = 63.796% (763/1196) (classification)  RBF kernel
> Accuracy = 57.0234% (682/1196) (classification)  sigmoid kernel
python measure.py ../svm-data/scene.t tmp.result tmp_class
>
number of labels = 6
Exact match ratio: 0.667224080268
Micro-average F-measure: 0.719215686275
Macro-average F-measure: 0.723864830515
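
The conversion step above can be pictured with a small sketch. It assumes the label-combination approach, in which every distinct set of labels becomes one ordinary class, so that a standard multi-class SVM can be trained. This is an illustration of the idea only, not the source of trans_class.py; the function names `parse_labels` and `to_multiclass` are hypothetical, and every input line is assumed to start with at least one label.

```python
def parse_labels(line):
    """Read the comma-separated labels at the start of a multi-label line,
    e.g. "1,3 1:0.5 2:0.1" -> [1, 3]. Assumes at least one label is present."""
    head = line.split(None, 1)[0]
    return [int(x) for x in head.split(",")]


def to_multiclass(label_sets):
    """Give each distinct label set its own integer class id (starting at 1).

    Returns the class id of every sample and the mapping that was built,
    which plays the role of a lookup table from label sets to class ids.
    """
    class_of = {}   # frozenset of labels -> class id
    class_ids = []
    for labels in label_sets:
        key = frozenset(labels)          # order of labels does not matter
        if key not in class_of:
            class_of[key] = len(class_of) + 1
        class_ids.append(class_of[key])
    return class_ids, class_of


lines = ["1,3 1:0.5 2:0.1", "2 1:0.1", "3,1 2:0.7"]
ids, table = to_multiclass([parse_labels(l) for l in lines])
print(ids)   # the first and third samples share the label set {1, 3}
```

Under this scheme a prediction is either the whole label set or a different one, which is why the exact match ratio reported by measure.py coincides with the accuracy reported by svm-predict.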

Two new evaluation metrics appear here: the macro-average and the micro-average.

The same procedure on yeast:
python trans_class.py ../svm-data/yeast ../svm-data/yeast.t
> Generates: "tmp_train", "tmp_test", "tmp_class"
..\windows\svm-train -t 0 tmp_train
..\windows\svm-predict tmp_test tmp_train.model tmp.result
> Accuracy = 25.0818% (230/917) (classification)  linear kernel (-t 0) √
> Accuracy = 15.1581% (139/917) (classification)  polynomial kernel
> Accuracy = 14.6129% (134/917) (classification)  RBF kernel
> Accuracy = 14.6129% (134/917) (classification)  sigmoid kernel
python measure.py ../svm-data/yeast.t tmp.result tmp_class
>
number of labels = 14
Exact match ratio: 0.250817884406
Micro-average F-measure: 0.636425186188
Macro-average F-measure: 0.378852223294
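
In both runs the exact match ratio equals the svm-predict accuracy (230/917 for yeast, 798/1196 for scene). A minimal sketch of the metric, assuming it counts a sample as correct only when the predicted label set equals the true label set in full; `exact_match_ratio` is a hypothetical helper written for Python 3, not part of measure.py.

```python
def exact_match_ratio(true_sets, pred_sets):
    """Fraction of samples whose predicted label set equals the true one exactly."""
    if len(true_sets) != len(pred_sets):
        raise ValueError("true_sets and pred_sets must have the same length")
    hits = sum(1 for t, p in zip(true_sets, pred_sets) if set(t) == set(p))
    return hits / len(true_sets)


true_sets = [[1, 3], [2], [4]]
pred_sets = [[3, 1], [2], [4, 5]]   # the third prediction has one extra label
print(exact_match_ratio(true_sets, pred_sets))
```

The metric is strict: a prediction that gets 13 of 14 labels right scores the same as one that gets none right, which is one reason the F-measures are reported alongside it.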

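The evaluation metrics are explained below.

The test set scene.t has 6 classes. For class ci (i = 1, ..., 6), let a be the number of samples correctly assigned to the class (RR, true positives), b the number of samples wrongly assigned to the class (RN, false positives), and c the number of samples that belong to the class but were wrongly assigned elsewhere (NR, false negatives).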
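
The counts a, b and c can be obtained per class by comparing the true and predicted label sets of every test sample. The following is a minimal sketch under that reading; `class_counts` is a hypothetical helper and the toy data is made up for illustration.

```python
def class_counts(true_sets, pred_sets, label):
    """Return (a, b, c) for one class.

    a: samples that have the label and were predicted to have it
    b: samples predicted to have the label that do not have it
    c: samples that have the label but were not predicted to have it
    """
    a = b = c = 0
    for t, p in zip(true_sets, pred_sets):
        in_true, in_pred = label in t, label in p
        if in_true and in_pred:
            a += 1
        elif in_pred:
            b += 1
        elif in_true:
            c += 1
    return a, b, c


true_sets = [{1, 2}, {2}, {3}]
pred_sets = [{1}, {1, 2}, {3}]
for label in (1, 2, 3):
    print(label, class_counts(true_sets, pred_sets, label))
```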

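1. Precision: p = a / (a+b), the fraction of samples assigned to class ci that really belong to it.
2. Recall: r = a / (a+c), the fraction of samples belonging to class ci that were assigned to it.
3. F1: F1 = 2(p×r)/(p+r), or equivalently 2/(1/p + 1/r). It combines the precision and recall of class ci into one score. In the general F-measure the parameter β sets how much weight each of the two receives; here β = 1.
4. Micro-average: the metrics (p, r, F1, etc.) computed from the counts pooled over all classes, so that every individual decision carries equal weight.
MicroP = (sum of a) / (sum of (a+b)), over the 6 classes
MicroR = (sum of a) / (sum of (a+c)), over the 6 classes
MicroF1 = 2(MicroP×MicroR)/(MicroP+MicroR) = 0.719215686275
5. Macro-average: the arithmetic mean of the per-class metrics (p, r, F1, etc.), so that every class carries equal weight.
MacroP = (p1 + ... + p6) / 6
MacroR = (r1 + ... + r6) / 6
MacroF1 = (sum of the F1 of the 6 classes) / 6 = 0.723864830515
6. MAP (mean average precision): the macro-average of the AP (average precision) over all queries.
Ranking of the classification output is not considered here, so MAP is not computed.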
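
The formulas in items 1 to 5 can be sketched in a few lines, starting from one (a, b, c) triple per class. The helper names `prf` and `micro_macro_f1` are hypothetical, and returning 0.0 when a denominator is zero is an assumption of this sketch; measure.py may handle that case differently. The counts in the example are made up to show how the two averages can diverge.

```python
def prf(a, b, c):
    """Precision, recall and F1 from the counts a, b, c of one class."""
    p = a / (a + b) if a + b else 0.0          # assumption: 0.0 on empty denominator
    r = a / (a + c) if a + c else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


def micro_macro_f1(counts):
    """counts: list of (a, b, c) tuples, one per class."""
    # Micro-average: pool the counts over all classes, then compute F1 once.
    total_a = sum(a for a, _, _ in counts)
    total_b = sum(b for _, b, _ in counts)
    total_c = sum(c for _, _, c in counts)
    micro_f1 = prf(total_a, total_b, total_c)[2]
    # Macro-average: compute F1 per class, then take the arithmetic mean.
    macro_f1 = sum(prf(a, b, c)[2] for a, b, c in counts) / len(counts)
    return micro_f1, macro_f1


# One frequent class that is classified well, one rare class that is not.
counts = [(90, 10, 10), (1, 9, 9)]
micro_f1, macro_f1 = micro_macro_f1(counts)
print(micro_f1, macro_f1)
```

In this toy case the micro-average stays high because the frequent class dominates the pooled counts, while the macro-average is pulled down by the rare class. A similar effect is one plausible reading of the yeast result, where the micro-average F-measure (0.636) is well above the macro-average (0.379).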
