
XML validation for multiple schemas: validating an XML file that uses multiple XSD schemas

2011-04-12 18:18
To keep XSD files readable, maintainable, and reusable, we often need to split a schema into several files. This article focuses on validating a single XML file against multiple schema files (XML validation for multiple schemas).

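The following steps demonstrate how to validate a single XML file against multiple schema (XSD) files:

1. Create the XML file to be validated

2. Generate the XSD files from the XML

3. Validate the XML file against multiple schemas

4. Run the test

Each step is covered in turn below.

1. Create the XML file to be validated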

<?xml version="1.0" encoding="utf-8" ?>
<employees xmlns:admin="http://www.company.com/management/employees/admin">
    <admin:employee>
        <admin:userId>johnsmith@company.com</admin:userId>
        <admin:password>abc123_</admin:password>
        <admin:name>John Smith</admin:name>
        <admin:age>24</admin:age>
        <admin:gender>Male</admin:gender>
    </admin:employee>
    <admin:employee>
        <admin:userId>christinechen@company.com</admin:userId>
        <admin:password>123456</admin:password>
        <admin:name>Christine Chen</admin:name>
        <admin:age>27</admin:age>
        <admin:gender>Female</admin:gender>
    </admin:employee>
</employees>
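
To see how this document is split across namespaces, it can help to parse it with a namespace-aware parser. The sketch below is illustrative only: the class NamespaceInspector and its helper methods are hypothetical names, and SAMPLE is a shortened copy of the document above. It reports that the root employees element is in no namespace while admin:employee belongs to the admin namespace. Since an XSD schema document declares at most one target namespace, two namespaces call for two schema documents.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of the article's code.
class NamespaceInspector {

    // Shortened copy of employees.xml (one employee, one field).
    static final String SAMPLE =
        "<employees xmlns:admin=\"http://www.company.com/management/employees/admin\">"
      + "<admin:employee><admin:userId>johnsmith@company.com</admin:userId></admin:employee>"
      + "</employees>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required to get namespace information
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Namespace URI of an element, or "" when the element is in no namespace. */
    static String namespaceOf(Element element) {
        String ns = element.getNamespaceURI();
        return ns == null ? "" : ns;
    }

    static String rootNamespace(String xml) {
        return namespaceOf(parse(xml).getDocumentElement());
    }

    static String firstChildNamespace(String xml) {
        // Skip text nodes until the first child element is found.
        Node node = parse(xml).getDocumentElement().getFirstChild();
        while (node != null && node.getNodeType() != Node.ELEMENT_NODE) {
            node = node.getNextSibling();
        }
        return node == null ? "" : namespaceOf((Element) node);
    }

    public static void main(String[] args) {
        System.out.println("root:  [" + rootNamespace(SAMPLE) + "]");
        System.out.println("child: [" + firstChildNamespace(SAMPLE) + "]");
    }
}
```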


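2. Generate the XSD files from the XML

Note: this article generates the XSD files from the XML. If you already have XSD files, you can skip this step.

By inspecting the structure of employees.xml we could write employees.xsd by hand, but it is quicker to use an XML-to-XSD conversion tool. Here I use trang: http://www.thaiopensource.com/relaxng/trang.html

First download the latest trang.jar, put employees.xml and trang.jar in the same directory, and run the following command:

java -jar trang.jar employees.xml employees.xsd

This generates two XSD files in the current directory, employees.xsd and admin.xsd, shown below: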

employees.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" xmlns:admin="http://www.company.com/management/employees/admin">
    <xs:import namespace="http://www.company.com/management/employees/admin" schemaLocation="admin.xsd"/>
    <xs:element name="employees">
        <xs:complexType>
            <xs:sequence>
                <xs:element maxOccurs="unbounded" ref="admin:employee"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>


admin.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" targetNamespace="http://www.company.com/management/employees/admin" xmlns:admin="http://www.company.com/management/employees/admin">
    <xs:import schemaLocation="employees.xsd"/>
    <xs:element name="employee">
        <xs:complexType>
            <xs:sequence>
                <xs:element ref="admin:userId"/>
                <xs:element ref="admin:password"/>
                <xs:element ref="admin:name"/>
                <xs:element ref="admin:age"/>
                <xs:element ref="admin:gender"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
    <xs:element name="userId" type="xs:string"/>
    <xs:element name="password" type="xs:NMTOKEN"/>
    <xs:element name="name" type="xs:string"/>
    <xs:element name="age" type="xs:integer"/>
    <xs:element name="gender" type="xs:NCName"/>
</xs:schema>


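Of course, you can also write the XSD files by hand.

3. Validate the XML file against multiple schemas

Validating XML against a single schema should not cause much trouble. For example: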

public static boolean validateSingleSchema(File xml, File xsd) {
    boolean legal = false;

    try {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = sf.newSchema(xsd);

        Validator validator = schema.newValidator();
        validator.validate(new StreamSource(xml));

        legal = true;
    } catch (Exception e) {
        legal = false;
        log.error(e.getMessage());
    }

    return legal;
}
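
The method above stops at the first problem and only logs its message. If you would rather collect every validation error together with its line number, one option is to install an ErrorHandler on the Validator. The following is a minimal sketch and not the article's code: the class name CollectingValidation, the tiny in-memory schema, and the sample documents are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical example class; schema and documents are made up for illustration.
class CollectingValidation {

    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"employee\"><xs:complexType><xs:sequence>"
      + "<xs:element name=\"name\" type=\"xs:string\"/>"
      + "<xs:element name=\"age\" type=\"xs:integer\"/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Returns one message per problem found; an empty list means the document is valid. */
    static List<String> validate(String xsd, String xml) {
        final List<String> problems = new ArrayList<String>();
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = sf.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();

            // Record recoverable problems instead of aborting on the first one.
            validator.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) {
                    problems.add("warning at line " + e.getLineNumber() + ": " + e.getMessage());
                }
                public void error(SAXParseException e) {
                    problems.add("error at line " + e.getLineNumber() + ": " + e.getMessage());
                }
                public void fatalError(SAXParseException e) throws SAXParseException {
                    throw e; // not recoverable, for example malformed XML
                }
            });

            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (Exception e) {
            problems.add("fatal: " + e.getMessage());
        }
        return problems;
    }

    public static void main(String[] args) {
        String bad = "<employee><name>John Smith</name><age>twenty-four</age></employee>";
        for (String problem : validate(XSD, bad)) {
            System.out.println(problem);
        }
    }
}
```

Returning a list rather than a boolean keeps the caller free to decide whether warnings should count as failures.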


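However, when validating against multiple schemas, XSD files outside the classpath that are pulled in through <xs:import>/<xs:include> may fail to load, which leads to an error message like the following:

org.xml.sax.SAXParseException: src-resolve: Cannot resolve the name 'admin:employee' to a(n) 'element declaration' component.

To solve this, we need an LSResourceResolver: SchemaFactory can use an LSResourceResolver to load external resources while it parses a schema.

The code is as follows: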

package com.javaeye.terrencexu.jaxb;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.io.Reader;
import java.net.URI;
import java.net.URISyntaxException;

import org.apache.log4j.Logger;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;

/**
 * Implement LSResourceResolver to customize resource resolution when parsing schemas.
 * <p>
 * SchemaFactory uses a LSResourceResolver when it needs to locate external resources
 * while parsing schemas, although exactly what constitutes "locating external resources"
 * is up to each schema language.
 * </p>
 * <p>
 * For example, for W3C XML Schema, this includes files <include>d or <import>ed,
 * and DTD referenced from schema files, etc.
 * </p>
 */
class SchemaResourceResolver implements LSResourceResolver {

    private static final Logger log = Logger.getLogger(SchemaResourceResolver.class);

    /**
     * Allow the application to resolve external resources.
     *
     * <p>
     * The LSParser will call this method before opening any external resource, including
     * the external DTD subset, external entities referenced within the DTD, and external
     * entities referenced within the document element (however, the top-level document
     * entity is not passed to this method). The application may then request that the
     * LSParser resolve the external resource itself, that it use an alternative URI,
     * or that it use an entirely different input source.
     * </p>
     *
     * <p>
     * Application writers can use this method to redirect external system identifiers to
     * secure and/or local URI, to look up public identifiers in a catalogue, or to read
     * an entity from a database or other input source (including, for example, a dialog box).
     * </p>
     */
    public LSInput resolveResource(String type, String namespaceURI, String publicId, String systemId, String baseURI) {
        log.info("\n>> Resolving " + "\n"
                + "TYPE: " + type + "\n"
                + "NAMESPACE_URI: " + namespaceURI + "\n"
                + "PUBLIC_ID: " + publicId + "\n"
                + "SYSTEM_ID: " + systemId + "\n"
                + "BASE_URI: " + baseURI + "\n");

        // Without a location hint there is nothing to load. Returning null
        // hands resolution back to the parser's default behaviour.
        if (systemId == null) {
            return null;
        }

        try {
            // Resolve a relative schemaLocation against the URI of the importing schema.
            URI uri = new URI(systemId);
            if (!uri.isAbsolute() && baseURI != null) {
                uri = new URI(baseURI).resolve(uri);
            }

            // new File(URI) only accepts file: URIs; leave anything else to the parser.
            if (!"file".equals(uri.getScheme())) {
                return null;
            }

            LSInput lsInput = new LSInputImpl();
            lsInput.setSystemId(uri.toString());
            lsInput.setByteStream(new FileInputStream(new File(uri)));
            return lsInput;
        } catch (URISyntaxException e) {
            log.error("Invalid schema location: " + systemId, e);
        } catch (FileNotFoundException e) {
            log.error("Schema file not found: " + systemId, e);
        }

        return null;
    }

    /**
     * Represents an input source for data.
     */
    private static class LSInputImpl implements LSInput {

        private String publicId;
        private String systemId;
        private String baseURI;
        private InputStream byteStream;
        private Reader charStream;
        private String stringData;
        private String encoding;
        private boolean certifiedText;

        public LSInputImpl() {}

        public LSInputImpl(String publicId, String systemId, InputStream byteStream) {
            this.publicId = publicId;
            this.systemId = systemId;
            this.byteStream = byteStream;
        }

        public String getBaseURI() {
            return baseURI;
        }

        public InputStream getByteStream() {
            return byteStream;
        }

        public boolean getCertifiedText() {
            return certifiedText;
        }

        public Reader getCharacterStream() {
            return charStream;
        }

        public String getEncoding() {
            return encoding;
        }

        public String getPublicId() {
            return publicId;
        }

        public String getStringData() {
            return stringData;
        }

        public String getSystemId() {
            return systemId;
        }

        public void setBaseURI(String baseURI) {
            this.baseURI = baseURI;
        }

        public void setByteStream(InputStream byteStream) {
            this.byteStream = byteStream;
        }

        public void setCertifiedText(boolean certifiedText) {
            this.certifiedText = certifiedText;
        }

        public void setCharacterStream(Reader characterStream) {
            this.charStream = characterStream;
        }

        public void setEncoding(String encoding) {
            this.encoding = encoding;
        }

        public void setPublicId(String publicId) {
            this.publicId = publicId;
        }

        public void setStringData(String stringData) {
            this.stringData = stringData;
        }

        public void setSystemId(String systemId) {
            this.systemId = systemId;
        }

    }

}
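
The same resolver idea can be tried without touching the file system. The sketch below is a hedged illustration rather than the article's implementation: InMemoryResolverDemo and MapResolver are hypothetical names, the two schemas are simplified stand-ins for employees.xsd and admin.xsd (no circular import, a single string-typed element), and the LSInput is obtained from the DOM Load and Save implementation instead of a hand-written class. It serves the imported schema from a Map keyed by the schemaLocation value.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;

// Hypothetical demo; the schemas are simplified stand-ins for the article's files.
class InMemoryResolverDemo {

    static final String ADMIN_NS = "http://www.company.com/management/employees/admin";

    static final String MAIN_XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" xmlns:admin=\"" + ADMIN_NS + "\">"
      + "<xs:import namespace=\"" + ADMIN_NS + "\" schemaLocation=\"admin.xsd\"/>"
      + "<xs:element name=\"employees\"><xs:complexType><xs:sequence>"
      + "<xs:element ref=\"admin:employee\" maxOccurs=\"unbounded\"/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    static final String ADMIN_XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" targetNamespace=\"" + ADMIN_NS + "\">"
      + "<xs:element name=\"employee\" type=\"xs:string\"/>"
      + "</xs:schema>";

    static final String VALID_XML =
        "<employees xmlns:admin=\"" + ADMIN_NS + "\">"
      + "<admin:employee>John Smith</admin:employee></employees>";

    /** Serves schema text from a map keyed by the schemaLocation value. */
    static class MapResolver implements LSResourceResolver {
        private final Map<String, String> schemas;
        private final DOMImplementationLS ls;

        MapResolver(Map<String, String> schemas) throws Exception {
            this.schemas = schemas;
            this.ls = (DOMImplementationLS)
                    DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        }

        public LSInput resolveResource(String type, String namespaceURI, String publicId,
                                       String systemId, String baseURI) {
            String text = systemId == null ? null : schemas.get(systemId);
            if (text == null) {
                return null; // unknown resource: fall back to default resolution
            }
            LSInput input = ls.createLSInput();
            input.setPublicId(publicId);
            input.setSystemId(systemId);
            input.setBaseURI(baseURI);
            input.setCharacterStream(new StringReader(text));
            return input;
        }
    }

    /** Compiles the main schema (resolving its import from memory) and validates the XML. */
    static boolean isValid(String xml) {
        try {
            Map<String, String> schemas = new HashMap<String, String>();
            schemas.put("admin.xsd", ADMIN_XSD);

            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            sf.setResourceResolver(new MapResolver(schemas));
            Schema schema = sf.newSchema(new StreamSource(new StringReader(MAIN_XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("valid document accepted: " + isValid(VALID_XML));
    }
}
```

Keying the map on the literal schemaLocation keeps the sketch short; a real resolver would more likely key on the namespace URI or on a resolved absolute location.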


The last step is to create a validator that wraps the XML validation logic, as follows:

package com.javaeye.terrencexu.jaxb;

import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.List;

import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Source;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

import org.apache.log4j.Logger;
import org.xml.sax.SAXException;

public final class XMLParser {

    private static final Logger log = Logger.getLogger(XMLParser.class);

    private XMLParser() {}

    public static boolean validateWithSingleSchema(File xml, File xsd) {
        boolean legal = false;

        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = sf.newSchema(xsd);

            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(xml));

            legal = true;
        } catch (Exception e) {
            legal = false;
            log.error(e.getMessage());
        }

        return legal;
    }

    public static boolean validateWithMultiSchemas(InputStream xml, List<File> schemas) {
        boolean legal = false;

        try {
            Schema schema = createSchema(schemas);

            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(xml));

            legal = true;
        } catch (Exception e) {
            legal = false;
            log.error(e.getMessage());
        }

        return legal;
    }

    /**
     * Create a Schema object from the schema files.
     *
     * @param schemas the schema files to compile together
     * @return the compiled schema
     * @throws ParserConfigurationException if no DocumentBuilder can be created
     * @throws SAXException if a schema file cannot be parsed or compiled
     * @throws IOException if a schema file cannot be read
     */
    private static Schema createSchema(List<File> schemas) throws ParserConfigurationException, SAXException, IOException {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        SchemaResourceResolver resourceResolver = new SchemaResourceResolver();
        sf.setResourceResolver(resourceResolver);

        Source[] sources = new Source[schemas.size()];

        DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
        docFactory.setValidating(false);
        docFactory.setNamespaceAware(true);
        DocumentBuilder docBuilder = docFactory.newDocumentBuilder();

        for (int i = 0; i < schemas.size(); i++) {
            org.w3c.dom.Document doc = docBuilder.parse(schemas.get(i));
            // Use a file: URI as the system ID so that relative schemaLocation
            // values can be resolved against it.
            DOMSource source = new DOMSource(doc, schemas.get(i).toURI().toString());
            sources[i] = source;
        }

        return sf.newSchema(sources);
    }

}
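
Both methods in XMLParser compile the schema on every call. When many documents are checked against the same schemas, it may be worth compiling once and reusing the result: the javax.xml.validation documentation describes Schema as thread-safe, while a Validator is not and should be created per validation. The class below is a hypothetical sketch of that pattern (CachedSchemaValidator and its one-element sample schema are assumptions, not part of the article).

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical wrapper illustrating "compile once, validate many times".
class CachedSchemaValidator {

    // Made-up one-element schema used only for the demonstration.
    static final String SAMPLE_XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"age\" type=\"xs:integer\"/>"
      + "</xs:schema>";

    // Compiled once in the constructor and shared by all validations.
    private final Schema schema;

    CachedSchemaValidator(String xsd) {
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            this.schema = sf.newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalArgumentException("Schema does not compile", e);
        }
    }

    boolean isValid(String xml) {
        try {
            // A Validator is cheap to create and not thread-safe, so use a fresh one per call.
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document is malformed or violates the schema
        } catch (IOException e) {
            return false; // the document could not be read
        }
    }

    public static void main(String[] args) {
        CachedSchemaValidator validator = new CachedSchemaValidator(SAMPLE_XSD);
        System.out.println(validator.isValid("<age>24</age>"));
        System.out.println(validator.isValid("<age>old</age>"));
    }
}
```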


4. Run the test

public static void testValidate() throws SAXException, FileNotFoundException {
    InputStream xml = new FileInputStream(new File("C:\\eclipse\\workspace1\\JavaStudy\\test\\employees.xml"));

    List<File> schemas = new ArrayList<File>();
    schemas.add(new File("C:\\eclipse\\workspace1\\JavaStudy\\test\\employees.xsd"));
    schemas.add(new File("C:\\eclipse\\workspace1\\JavaStudy\\test\\admin.xsd"));

    // Print the outcome so the result of the run is visible.
    boolean valid = XMLParser.validateWithMultiSchemas(xml, schemas);
    System.out.println("Valid: " + valid);
}


Note: if the two schema files are in the same directory, it is enough to pass only the main schema file (employees.xsd); SchemaResourceResolver will load admin.xsd for us.
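
The same-directory scenario from this note can be tried in isolation. The sketch below is an assumption-laden illustration, not the article's code: MainSchemaOnlyDemo is a hypothetical class, the schemas are simplified stand-ins written to a temporary directory, and no custom resolver is installed. It passes only the main schema as a File, so the schema source carries a file location against which the relative schemaLocation can be resolved. Whether that default lookup is enough in your environment is worth checking; a resolver such as SchemaResourceResolver remains the way to take control of it.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

// Hypothetical demo; the schemas are simplified stand-ins for the article's files.
class MainSchemaOnlyDemo {

    static final String ADMIN_NS = "http://www.company.com/management/employees/admin";

    static final String MAIN_XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" xmlns:admin=\"" + ADMIN_NS + "\">"
      + "<xs:import namespace=\"" + ADMIN_NS + "\" schemaLocation=\"admin.xsd\"/>"
      + "<xs:element name=\"employees\"><xs:complexType><xs:sequence>"
      + "<xs:element ref=\"admin:employee\" maxOccurs=\"unbounded\"/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    static final String ADMIN_XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" targetNamespace=\"" + ADMIN_NS + "\">"
      + "<xs:element name=\"employee\" type=\"xs:string\"/>"
      + "</xs:schema>";

    static final String VALID_XML =
        "<employees xmlns:admin=\"" + ADMIN_NS + "\">"
      + "<admin:employee>Christine Chen</admin:employee></employees>";

    /** Writes both schemas side by side, then compiles only the main one. */
    static boolean validateWithMainSchemaOnly(String xml) {
        try {
            Path dir = Files.createTempDirectory("xsd-demo");
            Path main = dir.resolve("employees.xsd");
            Path admin = dir.resolve("admin.xsd");

            // Register the directory first: deleteOnExit removes entries in reverse order.
            dir.toFile().deleteOnExit();
            main.toFile().deleteOnExit();
            admin.toFile().deleteOnExit();

            Files.write(main, MAIN_XSD.getBytes(StandardCharsets.UTF_8));
            Files.write(admin, ADMIN_XSD.getBytes(StandardCharsets.UTF_8));

            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = sf.newSchema(main.toFile()); // admin.xsd is not passed explicitly
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("valid document accepted: " + validateWithMainSchemaOnly(VALID_XML));
    }
}
```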