
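C# CAPTCHA Recognition: Basic Methods and Source Code

2014-01-03 18:34

2012-10-16 16:24 Source: cnblogs (博客园) Author: Aimeast

http://dotnet.9sssd.com/csbase/art/754

[Abstract] This article introduces basic methods for CAPTCHA recognition in C# along with source code, covering background-noise removal and binarization, building character samples, and related topics, and provides detailed source code for reference.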


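First, some background

A friend of mine has been building something that already works quite well and wanted to polish it a bit further, so he suggested knocking out this kind of CAPTCHA, and so I knocked it out. Recognizing a single image takes less than 200 ms, and a manual count over 500 samples gave an accuracy of 95%. I had no prior experience in this area and was feeling my way as I went. In the spirit of sharing experience, here is the whole line of analysis. Apologies to the experts out there for any clumsiness.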


Now a look at some recognition results

Look familiar?


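Step 1: Removing background noise and binarization

I considered several approaches for this part.

Method 1: Compute the color distribution of the image and treat colors with a low share as background noise. Because the background noise and the foreground color are not clearly distinguishable, none of the many ways I tried of picking out the foreground removed the noise well, and I eventually gave up on this method.

Method 2: Afterwards I looked around online a bit. The currently popular approach is to compute a grayscale value and then binarize against a threshold. A grayscale image is really just a weighting of the channels according to the human eye's sensitivity to color, and those weights mean nothing to a computer. With a little thought it becomes clear that the two steps can be merged entirely, so I did background-noise removal and binarization in a single pass. The threshold is set on the sum of the three RGB components, at 500: a pixel counts as foreground when R + G + B < 500. The results were very satisfying.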
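As a minimal sketch of the one-pass thresholding described above: the cutoff of 500 on R + G + B comes from the article, while the `Binarizer` class, the `IsForeground` helper, and the packed `0xRRGGBB` input grid are hypothetical names chosen here so the example runs without `System.Drawing`.

```csharp
public static class Binarizer
{
    // Cutoff on R + G + B, as stated in the article.
    public const int Threshold = 500;

    // A pixel is foreground (part of a character) when it is dark enough.
    public static bool IsForeground(int r, int g, int b)
    {
        return r + g + b < Threshold;
    }

    // Hypothetical helper: binarize a [width, height] grid of packed 0xRRGGBB values.
    public static bool[,] ToTable(int[,] rgb)
    {
        int width = rgb.GetLength(0);
        int height = rgb.GetLength(1);
        var table = new bool[width, height];
        for (int x = 0; x < width; x++)
            for (int y = 0; y < height; y++)
            {
                int c = rgb[x, y];
                table[x, y] = IsForeground((c >> 16) & 0xFF, (c >> 8) & 0xFF, c & 0xFF);
            }
        return table;
    }
}
```

No grayscale weights are involved: noise removal and binarization collapse into a single comparison per pixel.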




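Step 2: Building character samples

Samples matter a great deal to a computer, because it is hard for a computer to reason logically, and even if it could, it would need long training before it satisfied you. So recognition compares against samples prepared in advance. If you look at these CAPTCHAs closely you will notice a bug: nearly all of them use the same font. So I built one set of samples for that font by hand. The previous step already yields noise-free output, which can be used directly. Making the samples is simple and tedious, and it also takes care: a single careless mistake can lower the recognition rate for a symbol. Across these 500 samples, only 31 distinct characters turned up. Fortunately, someone at a certain department did take easily confused characters into account, for example 1 and I, or 0 and O. Otherwise that department would be taking even more abuse.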
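The article does not show how the sample set was packed, but the `Cracker` constructor in the source listing reads a gzip stream of records (character, width byte, height byte, then one boolean per pixel, column by column) ended by a `'\0'` character. The sketch below writes that layout. `SamplePacker` and `Pack` are hypothetical names, and it assumes every sample is narrower and shorter than 256 pixels.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class SamplePacker
{
    // Packs (character, bitmap) pairs in the layout the Cracker constructor reads.
    public static byte[] Pack(IEnumerable<KeyValuePair<char, bool[,]>> samples)
    {
        using (var stream = new MemoryStream())
        {
            using (var gzip = new GZipStream(stream, CompressionMode.Compress))
            using (var writer = new BinaryWriter(gzip))
            {
                foreach (var sample in samples)
                {
                    bool[,] map = sample.Value;
                    int width = map.GetLength(0);
                    int height = map.GetLength(1);
                    writer.Write(sample.Key);      // the character this bitmap represents
                    writer.Write((byte)width);     // assumption: width < 256
                    writer.Write((byte)height);    // assumption: height < 256
                    for (int i = 0; i < width; i++)
                        for (int j = 0; j < height; j++)
                            writer.Write(map[i, j]);
                }
                writer.Write('\0');                // end-of-data marker
            }
            // ToArray is still valid after the gzip stream has closed the MemoryStream.
            return stream.ToArray();
        }
    }
}
```

The returned bytes could then be pasted into a byte array literal like the one in the listing.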


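Step 3: Matching

A single match uses the simplest, most primitive binary comparison, but what gets compared is a match rate rather than a match count. I defined scoring rules for it. The overall principle is: "a pixel that should be there and is there gains points; a pixel that should be there but is missing loses points; a pixel that should not be there but is there loses a moderate amount; anything outside the reachable area is not scored."

Because a partial match of some symbols looks similar to a complete match of others, each single match has to be optimized over an enlarged region. Within a certain range, find the best match; that best match is the symbol at the current position.

Once a best match has been found, the matching position advances one large step to the right; if no suitable best match is found, it advances one small step to the right.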
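To make the scoring rule concrete, here is a small self-contained sketch. The weights (+1, -1, -0.55) and the division by the number of in-bounds template pixels follow `FixedMatch` in the source listing. The `Scoring` class and the `template` parameter name are only illustrative.

```csharp
public static class Scoring
{
    // Scores a template placed with its top-left corner at (x0, y0) on the binarized image.
    public static double Score(bool[,] source, bool[,] template, int x0, int y0)
    {
        double total = 0;   // template foreground pixels that fall inside the image
        double score = 0;
        for (int i = 0; i < template.GetLength(0); i++)
        {
            int x = i + x0;
            if (x < 0 || x >= source.GetLength(0)) continue;      // unreachable: not scored
            for (int j = 0; j < template.GetLength(1); j++)
            {
                int y = j + y0;
                if (y < 0 || y >= source.GetLength(1)) continue;  // unreachable: not scored
                if (template[i, j])
                {
                    total++;
                    score += source[x, y] ? 1 : -1;  // expected pixel present / missing
                }
                else if (source[x, y])
                {
                    score -= 0.55;                   // unexpected pixel: moderate penalty
                }
            }
        }
        return score / total;  // a rate, not a count
    }
}
```

For a 2x2 template with three set pixels, an identical image scores 3 / 3 = 1, an image with the fourth pixel also set scores (3 - 0.55) / 3, and an empty image scores -3 / 3 = -1.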
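The large-step / small-step scan can be sketched on its own, with the matcher abstracted away. The constants (acceptance rate 0.6, large step of 10 columns past the match, small step of 1, stop 7 columns before the right edge) are taken from `Read` in the source listing. `Candidate`, `Scanner`, and the `matchAt` callback are hypothetical stand-ins for the real matcher.

```csharp
using System;
using System.Text;

public sealed class Candidate
{
    public char Char;    // best-matching character at this position
    public int X;        // column where the best match was found
    public double Rate;  // match rate of that best match
}

public static class Scanner
{
    // Walks left to right, asking matchAt for the best candidate near each position.
    public static string Scan(int width, int start, Func<int, Candidate> matchAt)
    {
        var result = new StringBuilder();
        int pos = start;
        while (pos < width - 7)
        {
            Candidate best = matchAt(pos);
            if (best.Rate > 0.6)
            {
                result.Append(best.Char);
                pos = best.X + 10;  // large step: skip past the recognized character
            }
            else
            {
                pos += 1;           // small step: nothing convincing here, try the next column
            }
        }
        return result.ToString();
    }
}
```

With a stub matcher that reports a strong match only at columns 3 and 13 of a 40-column image, the scan takes small steps up to column 3, jumps to 13, jumps again to 23, and then takes small steps until it reaches the stop margin.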


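Step 4: Optimization and tuning

Every algorithm needs optimization and tuning. The task now is to find the best parameter settings and the best way to organize the code. This step often takes the most time and effort.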


Step 5: Verifying the results

This step is purely manual: check the results by hand and work out the accuracy.


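Thoughts

The results are in, there is not much code, and it works very well. People in this line of work often want something general-purpose. Whether something can be general depends largely on the level of abstraction. This method is plain matching, so naturally it does not generalize, but the approach and the thinking do. Each case needs its own analysis. Distorted text, hollow (outlined) text, and the like are much more complicated to handle. There are also methods online that use third-party image libraries, and those may be more general. When I have the time and the interest, I will keep working on this topic.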


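Source code

I went back and forth for a while on whether to publish this source. There is already similar commercial activity online, the recognition itself is not very difficult, and given the inherent bugs of a certain system, this CAPTCHA is about as good as not being there at all. So I am publishing this code for learning and exchange only.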

using System.Collections.Generic;
using System.Drawing;
using System.IO;
using System.IO.Compression;

namespace Crack12306Captcha
{
    public class Cracker
    {
        // Character templates, loaded once in the constructor.
        List<CharInfo> words_ = new List<CharInfo>();

        public Cracker()
        {
            // Gzip-compressed template data (character, width, height, pixels).
            var bytes = new byte[] {
                0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x04, 0x00, 0xc5, 0x58, 0xd9, 0x92, 0x13, 0x31,
                0x0c, 0x94, 0x9e, 0x93, 0x0c, 0x61, 0x97, 0x2f, 0xe1, 0x58, 0xe0, 0x91, 0x9b, 0x82, 0x62, 0x0b,
                0x58, 0xee, 0xff, 0xff, 0x10, 0xd8, 0xcc, 0xc8, 0xea, 0x96, 0x6c, 0x8f, 0x13, 0x48, 0xe1, 0xaa,
                0x4d, 0x46, 0x96, 0x6d, 0xb5, 0x8e, 0x96, 0x67, 0x73, 0x7f, 0x3b, 0x09, 0x0e, 0x25, 0x41, 0x49,
                0xa3, 0xae, 0xd7, 0x5b, 0xa9, 0xa8, 0xd5, 0xb4, 0x76, 0x02, 0x6a, 0x5c, 0x52, 0x94, 0x54, 0xed,
                0x18, 0x5a, 0x7f, 0x18, 0x00, 0x00, 0x84, 0x07, 0x1b, 0x80, 0x4a, 0x9a, 0x08, 0x35, 0xb8, 0x81,
                0x50, 0xe7, 0xad, 0xbe, 0xc4, 0x8e, 0xb1, 0x4f, 0x2d, 0x5f, 0xba, 0x80, 0xbb, 0xfd, 0x9a, 0xad,
                0x19, 0x36, 0xe5, 0xad, 0x87, 0xf1, 0x10, 0xc0, 0x8d, 0xc6, 0x50, 0x40, 0x52, 0xf8, 0xb3, 0x98,
                0x2c, 0xd6, 0xec, 0x59, 0xe7, 0x0d, 0x3e, 0x0f, 0x93, 0x3e, 0x1d, 0x02, 0x7a, 0x18, 0x8f, 0xb6,
                0xc7, 0x46, 0x4e, 0x01, 0xa3, 0x96, 0xdc, 0x3a, 0x20, 0x77, 0xbf, 0x2c, 0x24, 0xe4, 0x80, 0xa9,
                0x20, 0x14, 0xe5, 0x2d, 0xb5, 0x68, 0xc9, 0x55, 0x89, 0x23, 0x96, 0x82, 0xaa, 0xba, 0x58, 0xa6,
                0x03, 0x38, 0x71, 0x4b, 0x29, 0xd2, 0x47, 0x80, 0xe3, 0x84, 0x91, 0xf4, 0x78, 0x43, 0x64, 0x41,
                0x7b, 0x73, 0x99, 0x80, 0x42, 0x48, 0x00, 0xde, 0x00, 0x12, 0x88, 0x80, 0xdb, 0x51, 0x4a, 0x49,
                0x84, 0x43, 0xf6, 0x51, 0x90, 0x27, 0x21, 0xc9, 0xf8, 0xac, 0x00, 0x4d, 0xcd, 0x46, 0x09, 0x9d,
                0x15, 0x78, 0xe0, 0x00, 0x1e, 0x44, 0x2a, 0x51, 0x8c, 0xbc, 0xd3, 0xa3, 0x68, 0x8a, 0xd5, 0x3a,
                0x20, 0x79, 0xba, 0x4d, 0x71, 0x4c, 0x0b, 0x91, 0x98, 0x90, 0x7b, 0x2a, 0x42, 0xc5, 0x78, 0x7a,
                0xfc, 0xd5, 0x1b, 0x4b, 0x09, 0xa7, 0x27, 0x99, 0x38, 0x05, 0x01, 0xc2, 0x80, 0x39, 0x9c, 0x67,
                0xbb, 0x4e, 0x7f, 0x6c, 0x33, 0xdd, 0xed, 0x87, 0x55, 0xda, 0x5d, 0xb5, 0x56, 0x33, 0xc6, 0xf9,
                0xea, 0x60, 0x64, 0xcf, 0xa7, 0x41, 0xe0, 0x5c, 0x1c, 0xc4, 0xb2, 0x25, 0xa3, 0x89, 0x88, 0x8d,
                0x16, 0x00, 0xb5, 0xed, 0xa5, 0x22, 0x9d, 0x52, 0x41, 0x53, 0x8d, 0x92, 0x7f, 0x31, 0x51, 0x3f,
                0xa8, 0x00, 0x85, 0x8a, 0x71, 0x10, 0x92, 0x78, 0xc4, 0x59, 0x08, 0x39, 0x69, 0xa9, 0x38, 0x41,
                0x48, 0xf7, 0x40, 0x5a, 0x03, 0xd5, 0x3a, 0xf5, 0xe5, 0x9d, 0x33, 0x66, 0xc3, 0xd7, 0x1f, 0xef,
                0x94, 0xa0, 0x53, 0xea, 0xf4, 0x15, 0xb2, 0x1c, 0x40, 0x2d, 0xcf, 0xaf, 0xce, 0xe9, 0xd4, 0x7a,
                0x89, 0x09, 0xe6, 0xdd, 0xdb, 0x0e, 0xb8, 0x58, 0xa7, 0x60, 0x37, 0xfd, 0xf2, 0xfa, 0x2c, 0x4e,
                0x51, 0x87, 0x0d, 0xfc, 0x16, 0x72, 0x2a, 0x5f, 0xc0, 0x80, 0xf0, 0x54, 0xa7, 0xde, 0xfc, 0x15,
                0x8b, 0x9a, 0x36, 0x3a, 0x2c, 0x62, 0xfc, 0xd4, 0x8c, 0x31, 0xb7, 0xea, 0xd7, 0x26, 0xc4, 0xaf,
                0x75, 0xea, 0xdb, 0x8b, 0xff, 0x9b, 0x9b, 0x50, 0x7e, 0xfe, 0x15, 0xab, 0x17, 0x2f, 0x96, 0x96,
                0xbd, 0xaa, 0x87, 0xdd, 0x77, 0xa3, 0x77, 0xd3, 0x85, 0xf0, 0xe0, 0x58, 0xd5, 0xf6, 0x8c, 0xcd,
                0xc4, 0x63, 0x52, 0x12, 0x48, 0x46, 0x0f, 0x93, 0x5a, 0xe3, 0xea, 0x24, 0x67, 0x73, 0x63, 0xa0,
                0xdf, 0xdf, 0x3d, 0x67, 0xf6, 0xa9, 0xfc, 0xed, 0x08, 0xe3, 0x82, 0x57, 0x08, 0x35, 0x47, 0x68,
                0x9c, 0x01, 0x40, 0x87, 0x8b, 0xbd, 0x0c, 0xb3, 0xf4, 0xe1, 0x72, 0xd7, 0x54, 0x62, 0xfd, 0x40,
                0xed, 0x99, 0xa6, 0x7e, 0x2b, 0xe4, 0xb4, 0xc4, 0x62, 0x0d, 0x79, 0xae, 0x1b, 0xd7, 0xf4, 0x09,
                0xb7, 0xe1, 0x7c, 0x44, 0x09, 0x9a, 0xda, 0xff, 0x52, 0x6a, 0x3c, 0xe1, 0xc8, 0xd7, 0xbd, 0xbb,
                0xbe, 0x37, 0xfc, 0xd6, 0xd5, 0x4e, 0x3c, 0x40, 0x2a, 0x4b, 0x39, 0x1a, 0xbd, 0x2a, 0xcd, 0xc1,
                0x18, 0x59, 0x40, 0x62, 0x78, 0xec, 0x63, 0x19, 0x72, 0xf0, 0xcf, 0xf8, 0x38, 0xfa, 0x42, 0x3a,
                0xc8, 0x02, 0xec, 0x5b, 0xeb, 0x8d, 0xae, 0xf1, 0x45, 0xdd, 0x32, 0x98, 0x35, 0x3c, 0x9f, 0xa6,
                0x3d, 0xce, 0x13, 0xce, 0x94, 0x38, 0x87, 0x00, 0x8d, 0x85, 0xc4, 0x70, 0x17, 0x26, 0x0e, 0xa6,
                0x1e, 0x16, 0xcb, 0xbf, 0x52, 0xdf, 0x29, 0x63, 0xc4, 0xf6, 0x8c, 0x35, 0xba, 0xf2, 0xf9, 0x1f,
                0xbf, 0x73, 0x1f, 0x91, 0x1b, 0x9e, 0x24, 0x5e, 0x63, 0x22, 0x82, 0x23, 0x05, 0x19, 0xb9, 0x71,
                0x73, 0xdc, 0xcf, 0x05, 0x88, 0x94, 0x71, 0xdb, 0xdd, 0x48, 0x10, 0xd5, 0x55, 0xb3, 0x52, 0xc3,
                0x1b, 0x01, 0x94, 0x13, 0x74, 0x94, 0x3a, 0x80, 0x2f, 0x39, 0xe2, 0x75, 0x0e, 0xf2, 0xc6, 0x18,
                0xdc, 0x46, 0xfc, 0xf3, 0xea, 0x14, 0x80, 0xc1, 0xce, 0x24, 0xee, 0x72, 0xed, 0x94, 0xaf, 0xfb,
                0xa9, 0xaa, 0x4a, 0xe0, 0xd4, 0x22, 0xc6, 0xf0, 0x57, 0x1d, 0x8e, 0xd2, 0x90, 0xc6, 0x0c, 0xd3,
                0x9a, 0x53, 0xfb, 0xd6, 0xb7, 0xdd, 0x14, 0xd4, 0xbd, 0x41, 0xa7, 0x80, 0x7b, 0x23, 0xfe, 0x34,
                0x56, 0x0d, 0x96, 0x46, 0x02, 0xfe, 0xfd, 0xb2, 0x00, 0x5f, 0x01, 0x9c, 0xa0, 0x32, 0x39, 0xd7,
                0x90, 0xc2, 0x6c, 0xc7, 0x4e, 0x68, 0x88, 0x7d, 0x9f, 0x9b, 0xcf, 0xa7, 0xbe, 0xa0, 0xfc, 0x18,
                0x7d, 0x07, 0x5b, 0xa9, 0xbe, 0x56, 0x1f, 0x67, 0x1a, 0x4a, 0x91, 0x9c, 0x04, 0x38, 0x53, 0x6b,
                0x70, 0x68, 0x8f, 0xea, 0xf4, 0x34, 0x87, 0x7f, 0x6e, 0x82, 0xc3, 0xc1, 0xab, 0x40, 0xc4, 0x50,
                0x13, 0x0e, 0x33, 0x5d, 0x67, 0x7d, 0x01, 0x1f, 0xdb, 0xc0, 0x7f, 0xed, 0x87, 0x7f, 0xbc, 0x0f,
                0x75, 0xe0, 0xa5, 0xba, 0xc0, 0x84, 0x3d, 0x24, 0x04, 0xe0, 0xf1, 0x16, 0x41, 0x3b, 0x74, 0xd2,
                0x52, 0xc5, 0xf8, 0x7c, 0x12, 0xfb, 0xe4, 0x37, 0x5b, 0xfb, 0x57, 0x11, 0xa1, 0x18, 0x00, 0x00,
            };
            using (var stream = new MemoryStream(bytes))
            using (var gzip = new GZipStream(stream, CompressionMode.Decompress))
            using (var reader = new BinaryReader(gzip))
            {
                // Records repeat until a '\0' character marks the end of the data.
                while (true)
                {
                    char ch = reader.ReadChar();
                    if (ch == '\0')
                        break;
                    int width = reader.ReadByte();
                    int height = reader.ReadByte();

                    bool[,] map = new bool[width, height];
                    for (int i = 0; i < width; i++)
                        for (int j = 0; j < height; j++)
                            map[i, j] = reader.ReadBoolean();
                    words_.Add(new CharInfo(ch, map));
                }
            }
        }

        // Recognizes the text in a CAPTCHA image by scanning left to right.
        public string Read(Bitmap bmp)
        {
            var result = string.Empty;
            var width = bmp.Width;
            var height = bmp.Height;
            var table = ToTable(bmp);
            var next = SearchNext(table, -1);

            while (next < width - 7)
            {
                var matched = Match(table, next);
                if (matched.Rate > 0.6)
                {
                    // Good match: accept it and take a large step to the right.
                    result += matched.Char;
                    next = matched.X + 10;
                }
                else
                {
                    // No convincing match: take a small step to the right.
                    next += 1;
                }
            }

            return result;
        }

        // Noise removal and binarization in one pass: foreground when R + G + B < 500.
        private bool[,] ToTable(Bitmap bmp)
        {
            var table = new bool[bmp.Width, bmp.Height];
            for (int i = 0; i < bmp.Width; i++)
                for (int j = 0; j < bmp.Height; j++)
                {
                    var color = bmp.GetPixel(i, j);
                    table[i, j] = (color.R + color.G + color.B < 500);
                }
            return table;
        }

        // Returns the first column after start that contains a foreground pixel,
        // or the table width if there is none.
        private int SearchNext(bool[,] table, int start)
        {
            var width = table.GetLength(0);
            var height = table.GetLength(1);
            for (start++; start < width; start++)
                for (int j = 0; j < height; j++)
                    if (table[start, j])
                        return start;

            return start;
        }

        // Match rate of a template (target) placed at (x0, y0) on the image (source).
        private double FixedMatch(bool[,] source, bool[,] target, int x0, int y0)
        {
            double total = 0;
            double count = 0;
            int targetWidth = target.GetLength(0);
            int targetHeight = target.GetLength(1);
            int sourceWidth = source.GetLength(0);
            int sourceHeight = source.GetLength(1);
            int x, y;

            for (int i = 0; i < targetWidth; i++)
            {
                x = i + x0;
                if (x < 0 || x >= sourceWidth)
                    continue;
                for (int j = 0; j < targetHeight; j++)
                {
                    y = j + y0;
                    if (y < 0 || y >= sourceHeight)
                        continue;

                    if (target[i, j])
                    {
                        // Expected pixel: +1 if present, -1 if missing.
                        total++;
                        if (source[x, y])
                            count++;
                        else
                            count--;
                    }
                    else if (source[x, y])
                        count -= 0.55; // unexpected pixel: moderate penalty
                }
            }

            return count / total;
        }

        // Best placement of one template in a window around start.
        private MatchedChar ScopeMatch(bool[,] source, bool[,] target, int start)
        {
            int targetWidth = target.GetLength(0);
            int targetHeight = target.GetLength(1);
            int sourceWidth = source.GetLength(0);
            int sourceHeight = source.GetLength(1);

            double max = 0;
            var matched = new MatchedChar();
            for (int i = -2; i < 6; i++)
                for (int j = -3; j < sourceHeight - targetHeight + 5; j++)
                {
                    double rate = FixedMatch(source, target, i + start, j);
                    if (rate > max)
                    {
                        max = rate;
                        matched.X = i + start;
                        matched.Y = j;
                        matched.Rate = rate;
                    }
                }
            return matched;
        }

        // Best match across all character templates at the given position.
        private MatchedChar Match(bool[,] source, int start)
        {
            MatchedChar best = null;
            foreach (var info in words_)
            {
                var matched = ScopeMatch(source, info.Table, start);
                matched.Char = info.Char;
                if (best == null || best.Rate < matched.Rate)
                    best = matched;
            }
            return best;
        }

        private class CharInfo
        {
            public char Char { get; private set; }
            public bool[,] Table { get; private set; }

            public CharInfo(char ch, bool[,] table)
            {
                Char = ch;
                Table = table;
            }
        }

        private class MatchedChar
        {
            public int X { get; set; }
            public int Y { get; set; }
            public char Char { get; set; }
            public double Rate { get; set; }
        }
    }
}
Usage:

var cracker = new Cracker();
var result = cracker.Read(img);