Recognizing Basic Image CAPTCHAs with Python

2019-04-09 23:20

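CAPTCHA recognition is unavoidable in UI automation. The simplest approach is to ask the developers to add a universal bypass code, but in a production environment it is better to recognize the CAPTCHA in your own code, so that the tests reflect real production behavior. Other approaches exist that rely on how the verification code is stored, but they are too cumbersome to cover here. This tutorial targets only the most basic kind of image CAPTCHA.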

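Step 1:

Capture a screenshot of the CAPTCHA on the page and apply some simple image processing to make it easier to recognize.

Step 2:

Use a third-party library to extract the characters from the screenshot.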

The following third-party libraries are used:

pillow

pip install pillow

pytesseract

pip install pytesseract
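
Note that pytesseract is only a wrapper around the Tesseract OCR engine, which has to be installed separately and be reachable on the PATH. The following is a minimal sketch of a check you could run before going further; the helper name `tesseract_available` is hypothetical and not part of pytesseract.

```python
import shutil


def tesseract_available(command="tesseract"):
    """Return True if an executable named `command` is found on PATH."""
    return shutil.which(command) is not None


if tesseract_available():
    print("tesseract found on PATH")
else:
    print("tesseract not found; install the Tesseract OCR engine first")
```

If the engine is installed somewhere that is not on the PATH, pytesseract lets you point to it by setting `pytesseract.pytesseract.tesseract_cmd` to the full path of the executable.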

First, how to capture the CAPTCHA:

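from PIL import Image
from selenium import webdriver
from selenium.webdriver.common.by import By

# Creating the driver (driver = webdriver.Firefox() and so on) and opening
# the page are omitted here; adapt them to your own project.
driver.save_screenshot('v_code.png')  # take a screenshot of the whole page first
# Locate the CAPTCHA element (find_element_by_id was removed in Selenium 4)
element = driver.find_element(By.ID, 'verifcode')
print("CAPTCHA element location:", element.location)
print("CAPTCHA element size:", element.size)
left = element.location['x']
top = element.location['y']
right = left + element.size['width']
bottom = top + element.size['height']
im = Image.open('v_code.png')
im = im.crop((left, top, right, bottom))  # crop the page screenshot down to the CAPTCHA
im.save('v_code.png')  # overwrite the local file with the cropped CAPTCHA

Output: CAPTCHA element location: {'x': 628, 'y': 300}
Output: CAPTCHA element size: {'width': 70.0, 'height': 30.0}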
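
The crop box is simply the element's top-left corner plus its width and height. As a rough sketch, the arithmetic can be isolated in a small helper. The name `crop_box` and the `pixel_ratio` parameter are assumptions added for illustration: on high-DPI displays the screenshot can be larger than the page coordinates Selenium reports, in which case the coordinates may need scaling (for example by the value of `window.devicePixelRatio`).

```python
def crop_box(location, size, pixel_ratio=1.0):
    """Return (left, top, right, bottom) in screenshot pixels as integers.

    `location` and `size` are dicts shaped like Selenium's element.location
    and element.size. `pixel_ratio` is a hypothetical scale factor between
    page coordinates and screenshot pixels (1.0 means no scaling).
    """
    left = location['x'] * pixel_ratio
    top = location['y'] * pixel_ratio
    right = (location['x'] + size['width']) * pixel_ratio
    bottom = (location['y'] + size['height']) * pixel_ratio
    return tuple(int(round(v)) for v in (left, top, right, bottom))


box = crop_box({'x': 628, 'y': 300}, {'width': 70.0, 'height': 30.0})
print(box)  # (628, 300, 698, 330)
```

The resulting tuple can be passed straight to `Image.crop()`. Selenium's `WebElement` also offers a `screenshot(filename)` method that saves only the element, which may make the cropping step unnecessary in some setups.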

At this point we have the raw CAPTCHA image. Next, it needs some image processing:

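from PIL import Image
from PIL import ImageOps

def clean_image(image_path):
    # Open the image and convert it to grayscale so each pixel is a single value
    image = Image.open(image_path).convert('L')
    # Threshold every pixel so that it becomes either black (0) or white (255)
    image = image.point(lambda x: 0 if x < 143 else 255)
    # Add a 20-pixel white border around the image
    border_image = ImageOps.expand(image, border=20, fill='white')
    border_image.save(image_path)

image_path = r'C:\Users\hli7\Desktop\v_code.png'
clean_image(image_path)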
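
To make the two operations in clean_image concrete, here is a pure-Python sketch that applies the same threshold rule and white padding to a tiny grid of grayscale values. It does not use Pillow; `binarize` and `add_border` are hypothetical helpers written only to illustrate what `Image.point` and `ImageOps.expand` do to the pixel data.

```python
def binarize(pixels, threshold=143):
    """Map each grayscale value to 0 (black) if below threshold, else 255 (white)."""
    return [[0 if value < threshold else 255 for value in row] for row in pixels]


def add_border(pixels, border=1, fill=255):
    """Surround a 2D grid with `border` rows and columns of `fill`."""
    width = len(pixels[0]) + 2 * border
    top = [[fill] * width for _ in range(border)]
    bottom = [[fill] * width for _ in range(border)]
    middle = [[fill] * border + list(row) + [fill] * border for row in pixels]
    return top + middle + bottom


sample = [[10, 142, 143],
          [200, 90, 255]]
cleaned = add_border(binarize(sample), border=1)
for row in cleaned:
    print(row)
```

The threshold of 143 comes from the code above. A different CAPTCHA style will likely need a different value, which is easiest to find by trying a few and inspecting the saved image.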

After processing, the saved CAPTCHA screenshot looks visibly different.

With the processed image in hand, use the pytesseract library to extract the characters in it:

from pytesseract import pytesseract

# image_to_string accepts a file path as well as a PIL image
print(pytesseract.image_to_string(image_path))
Output: zq nB

The output contains a space, so add a replace() call to remove it:

from pytesseract import pytesseract

# Remove the spaces that the OCR engine inserted between characters
print(pytesseract.image_to_string(image_path).replace(' ', ''))
Output: zqnB
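
replace(' ', '') removes only spaces, while OCR output often ends with a newline or contains other stray characters. A slightly more defensive cleanup might look like the sketch below. It assumes the CAPTCHA contains only ASCII letters and digits, and `normalize_ocr_text` is a hypothetical helper, not part of pytesseract.

```python
import re


def normalize_ocr_text(text):
    """Drop every character that is not an ASCII letter or digit."""
    return re.sub(r'[^A-Za-z0-9]', '', text)


print(normalize_ocr_text('zq nB\n'))  # zqnB
```

In the flow above, it would wrap the OCR call: normalize_ocr_text(pytesseract.image_to_string(image_path)).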

Done.
