
Using Java multithreading for "fast translation with the Baidu Translate API"

2017-07-08 13:01

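I'm not sure why, but I suddenly felt like blogging. Maybe I just wanted a place to write things down. Sentimental or literary writing is beyond me, so technical posts it is. If I get something wrong, please point it out; I'd be very grateful.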

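API setup: first apply for a Baidu Translate API account on Baidu's site. Baidu Translate is pretty decent: the free tier covers 2 million characters per month, which is basically enough for non-commercial use. Thanks, Baidu. Besides the API itself, Baidu also provides demos in several programming languages; here we use the Java version. Feel free to play around with it yourself.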

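What I needed was for Baidu Translate to translate a dozen or so English documents of varying length, from a dozen or so KB to over a hundred KB each, about 600 KB in total.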

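In my test, the single-threaded run took about an hour, while the multithreaded run took about 12 minutes. (Toward the end, only the longest document was still running. It was the bottleneck; without it, the total time would have been much shorter.)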
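To make the bottleneck point concrete, here is a rough sketch with made-up per-document times (they are not my measurements, and the class and method names are just for illustration). It assumes one thread per document and ignores any API rate limiting: a single thread pays the sum of all the times, while the multithreaded run finishes when the slowest document does.

```java
import java.util.Arrays;

// Illustrative only: the per-document times used here are hypothetical.
class LongPoleDemo {
    // One thread handles every document in turn, so the times add up.
    static int sequentialMinutes(int[] perDocument) {
        return Arrays.stream(perDocument).sum();
    }

    // One thread per document: the run ends when the slowest document ends.
    static int parallelMinutes(int[] perDocument) {
        return Arrays.stream(perDocument).max().orElse(0);
    }

    public static void main(String[] args) {
        int[] perDocument = {2, 3, 3, 4, 5, 12}; // hypothetical minutes per document
        System.out.println("sequential: " + sequentialMinutes(perDocument));
        System.out.println("parallel:   " + parallelMinutes(perDocument));
    }
}
```

With these numbers the sequential total is 29 minutes and the parallel run takes 12, all of it owed to the longest document. Splitting that one document into pieces would be the obvious next step, though I have not tried it.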

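I used to think multithreading and parallel computing were seriously impressive, but after implementing this I found it surprisingly simple. You just wrap the code that has to be executed over and over in the run() method. One very basic point about multithreading: a child thread's execution does not block the main thread. Once that sinks in, you're most of the way there.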
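A minimal sketch of that point, separate from the translation code (the class name and the counter are stand-ins I made up for illustration): start() returns right away, so the main thread can launch every worker first and only then wait for them with join().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: the counter stands in for real work such as translating a file.
class StartThenJoinDemo {
    // Starts one worker per task, waits for all of them, and returns how many finished.
    static int runAll(int taskCount) {
        AtomicInteger finished = new AtomicInteger();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < taskCount; i++) {
            Thread worker = new Thread(() -> finished.incrementAndGet());
            worker.start();      // returns immediately; the main thread keeps looping
            workers.add(worker);
        }
        for (Thread worker : workers) {
            try {
                worker.join();   // wait only after every worker has been started
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return finished.get();
    }

    public static void main(String[] args) {
        System.out.println("finished workers: " + runAll(5));
    }
}
```

Calling join() inside the first loop instead would make each worker finish before the next one starts, which is the single-threaded behavior again.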

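import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.commons.lang.StringEscapeUtils;
// TransApi comes from Baidu's Java demo and must also be on the classpath.

// Extending Thread lets each instance run on its own thread.
public class MultiTranslate extends Thread {
    private static final String APP_ID = "your API_ID";
    private static final String SECURITY_KEY = "your API secret key";
    // Thread.run() cannot take parameters, so the file name is passed in through a field.
    private final String filename;

    public MultiTranslate(String filename) {
        super();
        this.filename = filename;
    }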

    @Override
    public void run() {
        String threadName = Thread.currentThread().getName();
        // Client for the Baidu Translate API.
        TransApi api = new TransApi(APP_ID, SECURITY_KEY);
        // Path of the document to translate. The documents are source-folder resources of the
        // Eclipse project, and filename is the file name. The original called this on an
        // undefined newMain class; this class's own class loader is used instead.
        String url = MultiTranslate.class.getClassLoader().getResource(filename).getPath();
        // Regex match: Baidu returns JSON with many fields, and only the translation is kept.
        // The reluctant quantifier stops at the first closing quote after "dst".
        Pattern pattern = Pattern.compile("\"dst\":\"(.*?)\"");
        // try-with-resources closes both streams even if the translation fails midway.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                     new FileInputStream(URLDecoder.decode(url, "UTF-8")), "UTF-8"));
             // The translation is appended to a new file whose name carries the prefix "ZH".
             FileOutputStream out = new FileOutputStream(new File("ZH" + filename), true)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] words = line.split(" ");
                StringBuilder translatedLine = new StringBuilder();
                for (String word : words) {
                    if (word.isEmpty()) {
                        continue; // blank lines and double spaces yield empty tokens
                    }
                    // Call the API to translate one word.
                    String response = new String(api.getTransResult(word, "auto", "zh"));
                    Matcher matcher = pattern.matcher(response);
                    while (matcher.find()) {
                        // The API returns Unicode escape sequences;
                        // StringEscapeUtils.unescapeJava() turns them back into characters.
                        String translated = StringEscapeUtils.unescapeJava(matcher.group(1));
                        translatedLine.append(translated).append(" ");
                    }
                }
                String newline = translatedLine.toString().trim() + "\n";
                System.out.println(filename + " " + threadName + " newline=" + newline);
                out.write(newline.getBytes("UTF-8"));
            }
        } catch (Exception e) {
            // Keep the cause so the real failure is not hidden.
            throw new RuntimeException("Failed to load or translate document: " + filename, e);
        }
    }

    public static void main(String[] args) throws FileNotFoundException, IOException, InterruptedException {
        String path = "xxx";
        ArrayList<String> filenameList = getAllFile(path);
        for (int i = 0; i < filenameList.size(); i++) {
            // One thread per document; the loop starts them all so they translate concurrently.
            Thread myThread = new MultiTranslate(filenameList.get(i));
            // start() makes the new thread execute run().
            myThread.start();
            // Calling join() here would give single-threaded behavior: the main thread
            // would wait for each child thread to finish before starting the next one.
            //myThread.join();
        }
    }
    /**
     * Reads all files under a folder, including files in its subfolders.
     *
     * @param filepath
     *            path of the folder
     * @return the names of all files found
     * @throws FileNotFoundException if filepath is not a readable directory
     * @throws IOException
     */
    public static ArrayList<String> getAllFile(String filepath)
            throws FileNotFoundException, IOException {
        File file = new File(filepath);
        ArrayList<String> pathList = new ArrayList<String>();
        // listFiles() returns null when the path is not a directory or cannot be read.
        File[] filelist = file.listFiles();
        if (filelist == null) {
            throw new FileNotFoundException("Not a readable directory: " + filepath);
        }
        for (File onefile : filelist) {
            if (onefile.isDirectory()) {
                // The original discarded this result, so files in subfolders were lost.
                pathList.addAll(getAllFile(onefile.getPath()));
            } else {
                pathList.add(getDocName(onefile.getPath()));
            }
        }
        return pathList;
    }

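    // Returns the document's file name (the last segment of the path).
    public static String getDocName(String path) {
        // File.getName() handles the platform's separator; the original split on
        // backslashes only, which works on Windows paths but not elsewhere.
        return new File(path).getName().trim();
    }
}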
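For reference, here is a standalone sketch of the "take only the translation out of the JSON" step, without the commons-lang dependency. The sample response is written by hand and its shape is my assumption about what the API returns, and the class and method names are made up for this example. The pattern stops at the first unescaped closing quote, so a response with several results does not collapse into one match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a regex-based extractor, not a full JSON parser.
class DstExtractor {
    // Captures the value of each "dst" field, stopping at the first unescaped quote.
    private static final Pattern DST =
            Pattern.compile("\"dst\":\"((?:[^\"\\\\]|\\\\.)*)\"");
    // Matches a backslash, the letter u, and four hex digits.
    private static final Pattern UNICODE_ESCAPE =
            Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Replaces each four-digit Unicode escape with the character it encodes.
    static String unescapeUnicode(String s) {
        Matcher m = UNICODE_ESCAPE.matcher(s);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            m.appendReplacement(sb, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // Returns every "dst" value found in the response, decoded.
    static List<String> extractDst(String json) {
        List<String> results = new ArrayList<>();
        Matcher m = DST.matcher(json);
        while (m.find()) {
            results.add(unescapeUnicode(m.group(1)));
        }
        return results;
    }

    public static void main(String[] args) {
        // Assumed response shape, written by hand for this example.
        String sample = "{\"from\":\"en\",\"to\":\"zh\",\"trans_result\":"
                + "[{\"src\":\"apple\",\"dst\":\"\\u82f9\\u679c\"}]}";
        System.out.println(extractDst(sample));
    }
}
```

A real JSON library would be the more robust choice; the regex is kept here only because it mirrors the approach in the code above.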

Please credit the original source when reposting. Thanks.


Tags: java, multithreaded Baidu Translate API, fast parallel translation
