"Premature end of file" exception when reading an XML file with SAXReader in dom4j
2014-01-14 19:19
The usual way to write it:
SAXReader reader = new SAXReader();
Document dom = null;
try{
dom = reader.read(new File(path));
}catch(Exception e){e.printStackTrace();}
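This approach sometimes throws org.dom4j.DocumentException: Error on line -1 of document : Premature end of file. Nested exception: Premature end of file.,
even though nothing is wrong with the file itself.
The official site gives this explanation:
When dom4j parses XML content from an InputStream and fails with "Premature end of file", the cause is:
the InputStream has already been read, so when it is handed to dom4j the parser does not start at the beginning of the content and reports an error.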
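The consumed-stream situation described above can be reproduced with a small sketch. To keep it dependency-free it uses the JDK's built-in javax.xml.parsers parser instead of dom4j, which is an assumption made for illustration; the class name ConsumedStreamDemo and its helper methods are hypothetical. A stream that has been read to the end leaves the parser with nothing to read, and rewinding it (or opening a fresh stream) makes the same content parse again.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical demo class; it uses the JDK parser, not dom4j.
class ConsumedStreamDemo {

    /** Reads the stream to its end, simulating earlier code that already consumed it. */
    static void drain(InputStream in) {
        try {
            byte[] buf = new byte[1024];
            while (in.read(buf) != -1) {
                // discard the data
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if whatever is left in the stream parses as an XML document. */
    static boolean parses(InputStream in) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
            return true;
        } catch (SAXException e) {
            // The parser rejected the input, e.g. because nothing was left to read.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] xml = "<root><item/></root>".getBytes(StandardCharsets.UTF_8);
        ByteArrayInputStream in = new ByteArrayInputStream(xml);

        drain(in); // something read the stream before the parser got it
        System.out.println("after draining: " + parses(in));

        in.reset(); // ByteArrayInputStream can rewind to the start
        System.out.println("after reset:    " + parses(in));
    }
}
```

A FileInputStream cannot be rewound this way, so with files the simplest remedy is to open a new stream (or pass the File itself) for the parser.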
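My own additional thought: the approach above does not specify the encoding used to read the file, which may cause dom4j to fail while parsing.
Hence a second way to write it (read the file content into a buffer first):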
Document dom = null;
// try-with-resources closes the reader even if reading or parsing fails
try (BufferedReader br = new BufferedReader(
        new InputStreamReader(new FileInputStream(path), "UTF-8"))) {
    StringBuilder content = new StringBuilder();
    String line;
    while ((line = br.readLine()) != null) {
        content.append(line).append('\n');
    }
    dom = DocumentHelper.parseText(content.toString());
} catch (Exception e) {
    e.printStackTrace(); // do not swallow the exception silently
}
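If this approach still fails, first check the file's encoding. If the file is UTF-8 and was saved with a BOM (byte order mark), the content needs special handling: the BOM at the start of the file has to be removed.
See the following utility class for details: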
package org.shefron.fc.utfwithbom;

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

public class UTFFileHandler {

    /**
     * @param file the filePath
     * @return the FileInputStream
     * @throws Exception
     */
    public static InputStream getInputStream(String file) throws Exception {
        FileInputStream fis = null;
        try {
            fis = new FileInputStream(file);
        } catch (Exception e) {
            System.out.println(e.getMessage());
            // keep the original exception as the cause
            throw new Exception("IO Stream Error!", e);
        }
        return fis;
    }

    /**
     * @param file the filePath
     * @param enc the default encoding
     * @return the UTFFileHandler.UnicodeInputStream
     * @throws Exception
     */
    public static InputStream getInputStreamWithoutBom(String file, String enc) throws Exception {
        UnicodeInputStream stream = null;
        try {
            FileInputStream fis = new FileInputStream(file);
            // pass the caller's default encoding through (the original passed null)
            stream = new UnicodeInputStream(fis, enc);
            // getEncoding() triggers init(), which skips the BOM bytes
            System.out.println("encoding:" + stream.getEncoding());
        } catch (Exception e) {
            System.out.println(e.getMessage());
            throw new Exception("IO Stream Error!", e);
        }
        return stream;
    }

    /**
     * This inputstream will recognize unicode BOM marks and will skip bytes if
     * getEncoding() method is called before any of the read(...) methods.
     *
     * Usage pattern:
     *   String enc = "ISO-8859-1"; // or NULL to use systemdefault
     *   FileInputStream fis = new FileInputStream(file);
     *   UnicodeInputStream uin = new UnicodeInputStream(fis, enc);
     *   enc = uin.getEncoding(); // check and skip possible BOM bytes
     *   InputStreamReader in;
     *   if (enc == null) in = new InputStreamReader(uin);
     *   else in = new InputStreamReader(uin, enc);
     */
    public static class UnicodeInputStream extends InputStream {
        PushbackInputStream internalIn;
        boolean isInited = false;
        String defaultEnc;
        String encoding;

        private static final int BOM_SIZE = 4;

        public UnicodeInputStream(InputStream in, String defaultEnc) {
            internalIn = new PushbackInputStream(in, BOM_SIZE);
            this.defaultEnc = defaultEnc;
        }

        public String getDefaultEncoding() {
            return defaultEnc;
        }

        public String getEncoding() {
            if (!isInited) {
                try {
                    init();
                } catch (IOException ex) {
                    IllegalStateException ise = new IllegalStateException("Init method failed.");
                    // the cause is the IOException; initCause(ise) would itself throw
                    ise.initCause(ex);
                    throw ise;
                }
            }
            return encoding;
        }

        /**
         * Read-ahead four bytes and check for BOM marks. Extra bytes are unread
         * back to the stream, only BOM bytes are skipped.
         */
        protected void init() throws IOException {
            if (isInited)
                return;

            byte bom[] = new byte[BOM_SIZE];
            int n, unread;
            n = internalIn.read(bom, 0, bom.length);

            if ((bom[0] == (byte) 0x00) && (bom[1] == (byte) 0x00)
                    && (bom[2] == (byte) 0xFE) && (bom[3] == (byte) 0xFF)) {
                encoding = "UTF-32BE";
                unread = n - 4;
            } else if ((bom[0] == (byte) 0xFF) && (bom[1] == (byte) 0xFE)
                    && (bom[2] == (byte) 0x00) && (bom[3] == (byte) 0x00)) {
                encoding = "UTF-32LE";
                unread = n - 4;
            } else if ((bom[0] == (byte) 0xEF) && (bom[1] == (byte) 0xBB)
                    && (bom[2] == (byte) 0xBF)) {
                encoding = "UTF-8";
                unread = n - 3;
            } else if ((bom[0] == (byte) 0xFE) && (bom[1] == (byte) 0xFF)) {
                encoding = "UTF-16BE";
                unread = n - 2;
            } else if ((bom[0] == (byte) 0xFF) && (bom[1] == (byte) 0xFE)) {
                encoding = "UTF-16LE";
                unread = n - 2;
            } else {
                // Unicode BOM mark not found, unread all bytes
                encoding = defaultEnc;
                unread = n;
            }
            // System.out.println("read=" + n + ", unread=" + unread);

            if (unread > 0)
                internalIn.unread(bom, (n - unread), unread);

            isInited = true;
        }

        public void close() throws IOException {
            // init();
            // isInited = true;
            internalIn.close();
        }

        public int read() throws IOException {
            // init();
            // isInited = true;
            return internalIn.read();
        }
    }
}
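For the string-based approach shown earlier, a lighter alternative may be enough. When a UTF-8 file with a BOM is decoded in Java, the BOM survives as the character U+FEFF at the start of the text, and that leading character can simply be dropped before the text is parsed. The sketch below is not part of the utility class above; the class name BomStripDemo and the method stripBom are hypothetical names chosen for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; it only handles text that has already been decoded.
class BomStripDemo {

    /** Removes a single leading U+FEFF (the decoded BOM) if one is present. */
    static String stripBom(String text) {
        if (text != null && !text.isEmpty() && text.charAt(0) == '\uFEFF') {
            return text.substring(1);
        }
        return text;
    }

    public static void main(String[] args) {
        // Bytes of a UTF-8 file saved with a BOM: EF BB BF followed by the XML.
        byte[] raw = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
        String decoded = new String(raw, StandardCharsets.UTF_8);

        System.out.println("starts with BOM: " + (decoded.charAt(0) == '\uFEFF'));
        System.out.println("cleaned text:    " + stripBom(decoded));
    }
}
```

The cleaned string can then be passed to DocumentHelper.parseText as in the second approach. Unlike UnicodeInputStream, this does not detect the encoding; it assumes the caller already decoded the file with the right charset.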