
Recognizing Dot-Matrix Text with Spark MLlib

2017-09-21 11:09

Step 1

Prepare the handwritten characters and save them as images.

I wrote 10 characters in total: 你、我、他、分、布、式、计、算、框、架, each of them 10 times.

Then I wrote 7 characters to be recognized: 你、我、好、世、界、框、架. Three of them (好、世、界) do not appear in the training set.

The images are shown below (written on a phone, so please excuse the ugly handwriting!)

Step 2

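Crop each image to its content, scale it to 64*64, and output a binarized (0/1) dot matrix. Parts of the code are adapted from examples found online. The Java source is as follows: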

import java.awt.Color;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;

import javax.imageio.ImageIO;

public class ImageTest
{
    static int NORMAL_WIDTH = 64;
    static int NORMAL_HEIGHT = 64;
    static String FILE_DIR = "/Users/bluejoe/testdata/pics";

    // Crop the image to the bounding box of its above-threshold pixels,
    // then scale the result to NORMAL_WIDTH x NORMAL_HEIGHT
    public static BufferedImage validateArea(File file) throws IOException
    {
        BufferedImage bi = ImageIO.read(file);
        // Height and width of the image
        int h = bi.getHeight();
        int w = bi.getWidth();
        int arr[][] = new int[w][h];

        // Gray value of every pixel
        for (int i = 0; i < w; i++)
        {
            for (int j = 0; j < h; j++)
            {
                // getRGB() returns the pixel as a packed int in the default ARGB color model
                arr[i][j] = getImageRgb(bi.getRGB(i, j));// gray value of this pixel
            }
        }

        int left = w - 1, top = h - 1, right = 0, bottom = 0;

        int FZ = 130; // gray threshold
        // Bounding box of all pixels whose smoothed gray value exceeds the threshold
        for (int i = 0; i < w; i++)
        {
            for (int j = 0; j < h; j++)
            {
                if (getGray(arr, i, j, w, h) > FZ)
                {
                    if (i > right)
                        right = i;
                    if (j > bottom)
                        bottom = j;
                    if (i < left)
                        left = i;
                    if (j < top)
                        top = j;
                }
            }
        }

        BufferedImage croped = bi.getSubimage(left, top, right - left + 1,
                bottom - top + 1);
        BufferedImage resized = new BufferedImage(NORMAL_WIDTH, NORMAL_HEIGHT,
                BufferedImage.TYPE_INT_ARGB);
        resized.getGraphics().drawImage(croped, 0, 0, NORMAL_WIDTH,
                NORMAL_HEIGHT, null);
        /*
         * Optional debug output of the normalized image:
         *
         * File file1 = new File(file.getPath() + ".1_.jpg");
         * ImageIO.write(resized, "png", file1);
         *
         * return ImageIO.read(file1);
         */
        return resized;
    }

    public static void main(String[] args) throws IOException
    {
        File dir = new File(FILE_DIR);
        HashMap<String, ArrayList<Integer>> filePoints = new HashMap<String, ArrayList<Integer>>();
        for (File file : dir.listFiles())
        {
            // Skip directories, hidden files and debug images (*_.jpg)
            if (!file.isFile() || file.isHidden()
                    || file.getPath().endsWith("_.jpg"))
                continue;

            try
            {
                BufferedImage bi = validateArea(file);
                // Height and width of the normalized image
                int h = bi.getHeight();
                int w = bi.getWidth();
                int arr[][] = new int[w][h];

                // Gray value of every pixel
                for (int i = 0; i < w; i++)
                {
                    for (int j = 0; j < h; j++)
                    {
                        // getRGB() returns the pixel as a packed int in the default ARGB color model
                        arr[i][j] = getImageRgb(bi.getRGB(i, j));// gray value of this pixel
                    }
                }

                // TYPE_BYTE_BINARY is an opaque, byte-packed 1, 2, or 4 bit image;
                // it is only needed for the optional debug output at the end of this block
                BufferedImage bufferedImage = new BufferedImage(w, h,
                        BufferedImage.TYPE_BYTE_BINARY);

                int FZ = 130; // gray threshold
                ArrayList<Integer> points = new ArrayList<Integer>();
                // Walk the image row by row, so the vector is in row-major order
                for (int i = 0; i < h; i++)
                {
                    for (int j = 0; j < w; j++)
                    {
                        if (getGray(arr, j, i, w, h) > FZ)
                        {
                            // Bright pixel: background, encoded as 0
                            int white = new Color(255, 255, 255).getRGB();
                            bufferedImage.setRGB(j, i, white);
                            points.add(0);
                        }
                        else
                        {
                            // Dark pixel: ink, encoded as 1
                            int black = new Color(0, 0, 0).getRGB();
                            bufferedImage.setRGB(j, i, black);
                            points.add(1);
                        }
                    }
                }

                filePoints.put(file.getName(), points);
                // One line per image: (label,[p0, p1, ...]). The label is the first
                // character of the file name, which must therefore be a digit.
                System.err.println(String.format("(%s,%s)",
                        file.getName().charAt(0) - '0', points));

                /*
                 * Optional debug output of the binarized image:
                 *
                 * File file2 = new File(file.getPath() + ".2_.jpg");
                 * ImageIO.write(bufferedImage, "jpg", file2);
                 */
            }
            catch (Throwable e)
            {
                e.printStackTrace();
            }
        }
    }

    // Average the red, green and blue channels of a packed ARGB value into one gray value
    private static int getImageRgb(int i)
    {
        // Extract the channels by bit shifting. Slicing Integer.toHexString(i) by position
        // fails when the alpha byte is below 0x10, because the string then has fewer than 8 digits.
        int r = (i >> 16) & 0xFF;
        int g = (i >> 8) & 0xFF;
        int b = i & 0xFF;
        return (r + g + b) / 3;
    }

    // Sum the pixel and its 8 neighbours and divide by 9 to get a smoothed gray value.
    // Neighbours that fall outside the image count as 255 (white).
    public static int getGray(int gray[][], int x, int y, int w, int h)
    {
        int rs = gray[x][y] + (x == 0 ? 255 : gray[x - 1][y])
                + (x == 0 || y == 0 ? 255 : gray[x - 1][y - 1])
                + (x == 0 || y == h - 1 ? 255 : gray[x - 1][y + 1])
                + (y == 0 ? 255 : gray[x][y - 1])
                + (y == h - 1 ? 255 : gray[x][y + 1])
                + (x == w - 1 ? 255 : gray[x + 1][y])
                + (x == w - 1 || y == 0 ? 255 : gray[x + 1][y - 1])
                + (x == w - 1 || y == h - 1 ? 255 : gray[x + 1][y + 1]);
        return rs / 9;
    }
}
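
To see what the smoothing and thresholding do on their own, here is a minimal standalone sketch that applies the same 3x3 mean (out-of-image neighbours counted as 255) and the same threshold of 130 to a tiny gray matrix. The class and method names (BinarizeSketch, smoothedGray, binarize) are made up for this illustration and are not part of the program above.

```java
import java.util.Arrays;

final class BinarizeSketch {
    // Mean of the pixel and its 8 neighbours; neighbours outside the image
    // count as 255 (white), mirroring the border handling of getGray.
    static int smoothedGray(int[][] gray, int x, int y) {
        int w = gray.length, h = gray[0].length;
        int sum = 0;
        for (int dx = -1; dx <= 1; dx++) {
            for (int dy = -1; dy <= 1; dy++) {
                int nx = x + dx, ny = y + dy;
                boolean inside = nx >= 0 && nx < w && ny >= 0 && ny < h;
                sum += inside ? gray[nx][ny] : 255;
            }
        }
        return sum / 9;
    }

    // Row-major 0/1 vector: 0 where the smoothed value is above the threshold
    // (background), 1 otherwise (ink).
    static int[] binarize(int[][] gray, int threshold) {
        int w = gray.length, h = gray[0].length;
        int[] points = new int[w * h];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                points[y * w + x] = smoothedGray(gray, x, y) > threshold ? 0 : 1;
            }
        }
        return points;
    }

    public static void main(String[] args) {
        // A single black pixel is averaged away by its white surroundings.
        int[][] dot = { {255, 255, 255}, {255, 0, 255}, {255, 255, 255} };
        System.out.println(Arrays.toString(binarize(dot, 130)));
        // prints [0, 0, 0, 0, 0, 0, 0, 0, 0]

        // A fully black 3x3 block loses its corners to the white border padding.
        int[][] solid = new int[3][3];
        System.out.println(Arrays.toString(binarize(solid, 130)));
        // prints [0, 1, 0, 1, 1, 1, 0, 1, 0]
    }
}
```

The two cases show the side effects of the averaging: isolated specks disappear, and pixels on the image border are biased toward background.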

Cropping produces a large number of images; see the figure below:

Step 3

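Save the output above into two files: points.txt (the handwritten dot matrices of the 10 training characters) and query.txt (the dot matrices of the characters to be recognized).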

File download: dot-matrix text files
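
For reference, each line of the two files has the shape printed by the Java program in Step 2: a label and a list of 0/1 values wrapped in parentheses. The sketch below reproduces that formatting in isolation. The helper names (LabeledPointLine, format, labelOf) and the file name 3_07.png are hypothetical and serve only as an illustration; a real line carries 64*64 = 4096 values rather than four.

```java
import java.util.Arrays;
import java.util.List;

final class LabeledPointLine {
    // Renders one sample as "(label,[p0, p1, ...])", using the same
    // String.format call as the program in Step 2.
    static String format(int label, List<Integer> points) {
        return String.format("(%s,%s)", label, points);
    }

    // The label is the first character of the file name, read as a digit.
    static int labelOf(String fileName) {
        return fileName.charAt(0) - '0';
    }

    public static void main(String[] args) {
        // Hypothetical file name and a shortened 4-value vector.
        List<Integer> points = Arrays.asList(0, 1, 1, 0);
        System.out.println(format(labelOf("3_07.png"), points));
        // prints (3,[0, 1, 1, 0])
    }
}
```

Because the label is a single digit, this naming scheme supports at most 10 classes, which matches the 10 training characters used here.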

Step 4

Start spark-shell, load the two files, and train a classifier with LogisticRegressionWithLBFGS:

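import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
import org.apache.spark.mllib.util.MLUtils

val data = MLUtils.loadLabeledPoints(sc, "file:///Users/bluejoe/testdata/points.txt")

// Keep query as an RDD of feature vectors, so that both query.collect() and model.predict(query) work
val query = MLUtils.loadLabeledPoints(sc, "file:///Users/bluejoe/testdata/query.txt").map(_.features)

val model = new LogisticRegressionWithLBFGS().setNumClasses(10).run(data)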

Try the characters at index 1 and 5 (我 and 框):

scala> model.predict(query.collect()(1))
res91: Double = 1.0

scala> model.predict(query.collect()(5))
res92: Double = 8.0

Now recognize all of them:

scala> model.predict(query).collect
res82: Array[Double] = Array(0.0, 1.0, 8.0, 8.0, 4.0, 8.0, 9.0)
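
To read these numeric labels as characters, they can be mapped back to the training list. The sketch below assumes that label i corresponds to the i-th character of the list in Step 1; this ordering is not stated explicitly, but it agrees with the two single predictions above (1.0 for 我, 8.0 for 框). LabelDecoder is a hypothetical helper name.

```java
final class LabelDecoder {
    // Assumption: labels 0..9 follow the order of the training list in Step 1.
    static final String[] CHARACTERS =
            {"你", "我", "他", "分", "布", "式", "计", "算", "框", "架"};

    // Turns an array of predicted labels into the corresponding characters.
    static String decode(double[] predictions) {
        StringBuilder sb = new StringBuilder();
        for (double p : predictions) {
            sb.append(CHARACTERS[(int) p]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The predictions printed by spark-shell for 你、我、好、世、界、框、架
        double[] predicted = {0.0, 1.0, 8.0, 8.0, 4.0, 8.0, 9.0};
        System.out.println(decode(predicted));
        // prints 你我框框布框架
    }
}
```

Under that assumption, 好 and 世 were both labeled 框, and 界 was labeled 布.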

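The result: the characters that exist in the training set were recognized correctly, while the ones that do not exist in it were misclassified. After a closer look at the source code, LogisticRegressionWithLBFGS offers no method that also returns a match score. It simply picks the best-scoring class, so it always outputs some class, even for a character it has never seen.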
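
One common way to let a classifier say "unknown" is to turn per-class scores into probabilities and reject the prediction when the best probability is too low. The following is a minimal sketch of that idea in plain Java, assuming the per-class scores are available from somewhere; it is not part of the MLlib API used above, and the names RejectingClassifier, softmax, predictOrReject as well as the threshold 0.8 are chosen for illustration only.

```java
final class RejectingClassifier {
    // Numerically stable softmax: converts raw scores into probabilities that sum to 1.
    static double[] softmax(double[] scores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double sum = 0.0;
        double[] probs = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            probs[i] = Math.exp(scores[i] - max);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }

    // Returns the index of the most probable class, or -1 when its
    // probability is below minConfidence.
    static int predictOrReject(double[] scores, double minConfidence) {
        double[] probs = softmax(scores);
        int best = 0;
        for (int i = 1; i < probs.length; i++) {
            if (probs[i] > probs[best]) {
                best = i;
            }
        }
        return probs[best] >= minConfidence ? best : -1;
    }

    public static void main(String[] args) {
        double[] clear = {0.0, 6.0, 0.5};      // one class dominates
        double[] ambiguous = {1.0, 1.2, 1.1};  // no class stands out
        System.out.println(predictOrReject(clear, 0.8));      // prints 1
        System.out.println(predictOrReject(ambiguous, 0.8));  // prints -1
    }
}
```

A plain argmax would return class 1 for both inputs. Note that a high softmax value is still no guarantee that the input is a known character, so a threshold like this reduces false matches rather than eliminating them.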
