
[Natural Language Processing with Python] 5.6 Transformation-Based Tagging

2013-05-26 16:09
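Brill tagging is an inductive tagging method. It is based on transformation-based learning: guess the tag of each word, then go back and fix the mistakes. In this way, a Brill tagger successively transforms a poorly tagged text into a better one. It is a supervised method: it needs already tagged training data to judge whether a guess made by the tagger is a mistake.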
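The "guess, then check against tagged data" step can be sketched in a few lines of plain Python. This is only a minimal illustration and not NLTK's implementation: the toy training sentences, the helper names most_frequent_tags and find_errors, and the default tag NN are all assumptions made up for the example.

```python
from collections import Counter, defaultdict

# Hypothetical toy training data: each sentence is a list of (word, gold_tag) pairs.
train = [
    [("to", "TO"), ("increase", "VB"), ("grants", "NNS")],
    [("the", "DT"), ("increase", "NN"), ("was", "VBD"), ("small", "JJ")],
    [("an", "DT"), ("increase", "NN"), ("in", "IN"), ("grants", "NNS")],
]

def most_frequent_tags(sentences):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for sent in sentences:
        for word, tag in sent:
            counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}

def find_errors(sentences, lexicon, default="NN"):
    """Return (sentence, position, word, guess, gold) for every wrong initial guess."""
    errors = []
    for s, sent in enumerate(sentences):
        for i, (word, gold) in enumerate(sent):
            guess = lexicon.get(word, default)  # initial guess ignores context
            if guess != gold:
                errors.append((s, i, word, guess, gold))
    return errors

lexicon = most_frequent_tags(train)
errors = find_errors(train, lexicon)
print(errors)  # [(0, 1, 'increase', 'NN', 'VB')]
```

The single error found here (the verb "increase" guessed as a noun after "to") is the kind of mistake that a transformation rule is later learned to repair.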

(1) The President said he will ask Congress to increase grants to states for vocational rehabilitation.

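For this sentence, a Brill tagger proceeds as follows: it starts from an initial tagging of every word and then applies correction rules one after another.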
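As a rough sketch of that process on the tail of sentence (1), the snippet below starts from an initial tagging and applies two transformation rules in order. The initial tags and the two rules are illustrative assumptions (the first rule has the same form as the top rule in the demo output in this post); apply_rule is a hypothetical helper, not an NLTK function.

```python
# Fragment of sentence (1). The initial tags are an assumption about what a
# context-free first guess might look like; two of them are wrong.
words = ["to", "increase", "grants", "to", "states", "for", "vocational", "rehabilitation"]
initial = ["TO", "NN", "NNS", "TO", "NNS", "IN", "JJ", "NN"]

def apply_rule(tags, old, new, offset, context_tag):
    """Return a copy of tags with old -> new wherever the tag at i+offset is context_tag.

    Conditions are checked against the input tags, so a change made by this
    rule cannot trigger the same rule again during the same pass.
    """
    out = list(tags)
    for i, tag in enumerate(tags):
        j = i + offset
        if tag == old and 0 <= j < len(tags) and tags[j] == context_tag:
            out[i] = new
    return out

# Hypothetical rule list: (old tag, new tag, offset of context word, context tag).
rules = [
    ("NN", "VB", -1, "TO"),   # NN -> VB if the tag of the preceding word is TO
    ("TO", "IN", +1, "NNS"),  # TO -> IN if the tag of the following word is NNS
]

tags = initial
for rule in rules:
    tags = apply_rule(tags, *rule)
    print(list(zip(words, tags)))
```

After the first rule "increase" becomes VB, and after the second the "to" before "states" becomes IN, while the "to" before "increase" is left alone because it is not followed by NNS.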





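The following code runs the Brill tagger demo that ships with NLTK (this is the NLTK 2 call; in NLTK 3 the demo lives in the nltk.tbl package):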

>>> import nltk
>>> nltk.tag.brill.demo()
Training Brill tagger on 80 sentences...
Finding initial useful rules...
    Found 6555 useful rules.
           B      |
   S   F   r   O  |        Score = Fixed - Broken
   c   i   o   t  |  R     Fixed = num tags changed incorrect -> correct
   o   x   k   h  |  u     Broken = num tags changed correct -> incorrect
   r   e   e   e  |  l     Other = num tags changed incorrect -> incorrect
   e   d   n   r  |  e
------------------+-------------------------------------------------------
  12  13   1   4  | NN -> VB if the tag of the preceding word is 'TO'
   8   9   1  23  | NN -> VBD if the tag of the following word is 'DT'
   8   8   0   9  | NN -> VBD if the tag of the preceding word is 'NNS'
   6   9   3  16  | NN -> NNP if the tag of words i-2...i-1 is '-NONE-'
   5   8   3   6  | NN -> NNP if the tag of the following word is 'NNP'
   5   6   1   0  | NN -> NNP if the text of words i-2...i-1 is 'like'
   5   5   0   3  | NN -> VBN if the text of the following word is '*-1'
   ...
>>> print(open("errors.out").read())
             left context |    word/test->gold     | right context
--------------------------+------------------------+--------------------------
                          |      Then/NN->RB       | ,/, in/IN the/DT guests/N
, in/IN the/DT guests/NNS |       '/VBD->POS       | honor/NN ,/, the/DT speed
'/POS honor/NN ,/, the/DT |    speedway/JJ->NN     | hauled/VBD out/RP four/CD
NN ,/, the/DT speedway/NN |     hauled/NN->VBD     | out/RP four/CD drivers/NN
DT speedway/NN hauled/VBD |      out/NNP->RP       | four/CD drivers/NNS ,/, c
dway/NN hauled/VBD out/RP |      four/NNP->CD      | drivers/NNS ,/, crews/NNS
hauled/VBD out/RP four/CD |    drivers/NNP->NNS    | ,/, crews/NNS and/CC even
P four/CD drivers/NNS ,/, |     crews/NN->NNS      | and/CC even/RB the/DT off
NNS and/CC even/RB the/DT |    official/NNP->JJ    | Indianapolis/NNP 500/CD a
                          |     After/VBD->IN      | the/DT race/NN ,/, Fortun
ter/IN the/DT race/NN ,/, |    Fortune/IN->NNP     | 500/CD executives/NNS dro
s/NNS drooled/VBD like/IN |  schoolboys/NNP->NNS   | over/IN the/DT cars/NNS a
olboys/NNS over/IN the/DT |      cars/NN->NNS      | and/CC drivers/NNS ./.
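The Score, Fixed, Broken and Other columns in the rule table printed by the demo can be reproduced for a single rule with a short function. The sketch below is an assumption about how such counts could be computed, not NLTK's code: score_rule is a hypothetical helper, it only handles rules of the form "old -> new if the tag of the preceding word is prev_tag", and the tag sequences are invented toy data.

```python
def score_rule(current, gold, old, new, prev_tag):
    """Count the effect of the rule: old -> new if the preceding tag is prev_tag.

    current: tags produced so far; gold: correct tags; assumes old != new.
    """
    fixed = broken = other = 0
    for i in range(1, len(current)):
        if current[i] == old and current[i - 1] == prev_tag:
            if current[i] == gold[i]:
                broken += 1   # correct -> incorrect
            elif new == gold[i]:
                fixed += 1    # incorrect -> correct
            else:
                other += 1    # incorrect -> incorrect
    return {"score": fixed - broken, "fixed": fixed, "broken": broken, "other": other}

# Invented toy data: tags from an initial tagger and the corresponding gold tags.
current = ["TO", "NN", "DT", "NN", "TO", "NN", "TO", "NN", "TO", "NN"]
gold    = ["TO", "VB", "DT", "NN", "TO", "VB", "IN", "NN", "TO", "JJ"]

result = score_rule(current, gold, "NN", "VB", "TO")
print(result)  # {'score': 1, 'fixed': 2, 'broken': 1, 'other': 1}
```

A trainer built on this idea would compute such a score for every candidate rule and keep the rule with the highest Score, which is why the demo lists its rules in descending score order.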


For more detail on Brill tagging, please consult the relevant literature.