XML
2016-05-01 21:27
XML
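XML is the Extensible Markup Language. In February 1998 the W3C formally approved its standard definition. XML gives documents and data a structured form so that they can be exchanged between departments, customers, and suppliers, which supports dynamic content generation, enterprise integration, and application development. It makes searches more precise, software components easier to deliver, and things such as e-commerce transactions easier to describe.
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
* Extensible Markup Language
* Purposes:
*) Storing data
*) Transmitting data
*) Structuring data
*) Intended to be processed by machines rather than read by people (users)
*) Its tags are not predefined; we define our own tags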
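XML Syntax
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
1. XML declaration (header)
<?xml version="1.0" encoding="GBK"?>
*) Optional
*) If present, it must be the very first thing in the document, with nothing (not even whitespace) before it
2. There must be exactly one root element
3. Every element must have an end tag
For example: <a> </a>, or the self-closing form
<a/>
4. Elements must be properly nested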
<authors>
<author>张三</author>
<author>李四</author>
</authors>
5. Every attribute must have a value
6. Attribute values must be quoted; either double or single quotes will do
<email date="2015-8-26" time="15:4:12"></email>
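7. Escape entities (entity on the left, the character it stands for on the right)
&lt;    <
&gt;    >
&quot;  "
&apos;  '
&amp;   &
8. <![CDATA[...]]>
When a parser extracts text from a CDATA section, the characters are not interpreted in any way; they are taken as ordinary character data
9. <!-- comment -->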
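The syntax rules above can be tried out with the parser that ships with the JDK. The following is a minimal sketch, not part of the original notes: the class name SyntaxRulesSketch and its two helper methods are made up for illustration. It shows that escaped entities and a CDATA section yield the same text, and that badly nested elements or a second root element are rejected.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SyntaxRulesSketch {

    /** Parses the XML string and returns the text content of the root element. */
    static String rootText(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing them to stderr
        builder.setErrorHandler(new DefaultHandler());
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    /** Returns true if the parser accepts the string as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            rootText(xml);
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Rule 7: entities are replaced by the characters they stand for
        System.out.println(rootText("<a>1 &lt; 2 &amp;&amp; 3 &gt; 2</a>"));
        // Rule 8: inside CDATA the same characters need no escaping
        System.out.println(rootText("<a><![CDATA[1 < 2 && 3 > 2]]></a>"));
        // Rule 4: wrongly nested elements are not well-formed
        System.out.println(isWellFormed("<a><b></a></b>"));
        // Rule 2: a second root element is not allowed
        System.out.println(isWellFormed("<a/><b/>"));
    }
}
```

The first two lines printed are identical (1 < 2 && 3 > 2), and the last two are false.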
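Applications of XML
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
1. Describing layout information, for example the document format used by Android layout files
2. Data stored as XML has a sound internal structure, and because XML is an international standard published by the W3C it is supported by a wide range of software vendors, which makes data exchange and development easier.
3. Sharing and exchanging data between different application systems, which requires the parties to agree on a protocol.
4. Like JSON, it is a data interchange format
Tag
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
* A pair of angle brackets
<a> start tag of a
</a> end tag of a
<book isbn="xxx"> start tag of book
Element
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
* Everything from the start tag through the end tag
The authors element: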
<authors>
<author>张三</author>
<author>李四</author>
</authors>
Attribute
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
* Key-value data inside a start tag
<email date="2015-8-26" time="15:4:12">
Text
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
* The text content between a start tag and an end tag
<a>abcabcabc</a>
Attribute vs. Text
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
* Rule of thumb: either works; choose whichever suits the data
* For long text or text with a complex structure, use text
Appendix:
<?xml version="1.0" encoding="GBK"?>
<email date="2015-8-26" time="15:4:12">
    <from>aaa@aaa.com</from>
    <to>
        <to-email>bbb@bbb.com</to-email>
        <to-email>ccc@ccc.com</to-email>
        <to-email>ddd@ddd.com</to-email>
    </to>
    <subject>Hello XML</subject>
    <body>
        &gt;&gt;&gt;Hello XML!!!&lt;&lt;&lt;
        <![CDATA[>>> Hello XML!!!<<<
        >>> Hello XML!!!<<<
        ]]>
    </body>
</email>
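DTD
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
* Document Type Definition
* A specification for writing XML, established within a domain by an organization, a company, or a technical leader
(It is the validation mechanism inherited from SGML, the Standard Generalized Markup Language. Comparing a document against its document type definition shows whether the document conforms to the specification and whether its elements and tags are correct.)
Purpose of a DTD
--------------------------------------------------------------------------------------------------
* With a DTD, each of your XML files can carry a description of its own format.
* Independent groups can agree on a standard document type definition and use it to exchange data.
* An application can use a standard document type definition to validate data it receives from outside.
* You can also use it to validate your own data.
Defining a DTD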
<?xml version="1.0" encoding="GBK"?>
<!ELEMENT email (from,to,subject,body)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT to (to-email+)>
<!ELEMENT subject (#PCDATA)>
<!ELEMENT body (#PCDATA)>
<!ELEMENT to-email (#PCDATA)>
<!ATTLIST email
    date CDATA #REQUIRED
    time CDATA #IMPLIED>

<?xml version="1.0" encoding="GBK"?>
<!DOCTYPE email SYSTEM "email.dtd">
<email date="2015-8-26" time="15:4:12">
    <from>aaa@aaa.com</from>
    <to>
        <to-email>bbb@bbb.com</to-email>
        <to-email>ccc@ccc.com</to-email>
    </to>
    <subject>Hello XML</subject>
    <body>
        <![CDATA[>>> Hello XML!!!<<<
        >>> Hello XML!!!<<<
        ]]>
    </body>
</email>
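DTD Syntax
* Declaring the elements inside the XML root tag
*) Declares that the email element contains the child elements from, to, subject, and body, in that order
<!ELEMENT email (from,to,subject,body)>
*) Declares that the content of the from element is plain text
<!ELEMENT from (#PCDATA)>
* Declaring an empty element
<!ELEMENT element-name EMPTY>
* Declaring an element that may have any content
<!ELEMENT element-name ANY>
* Declaring that a child element (to-email) occurs at least once
<!ELEMENT to (to-email+)>
* Declaring that a child element (from) occurs zero times or once (write * instead of ? for zero or more times)
<!ELEMENT element-name (from?)>
* Declaring the attributes of an element
<!ATTLIST element-name attribute-name attribute-type default-value>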
<!ATTLIST email date CDATA #REQUIRED time CDATA #IMPLIED>
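To see these declarations enforced, here is a rough sketch using the validating parser in the JDK. It is an illustration only: the class name DtdValidationSketch and the validate() helper are invented here, and the email DTD is embedded as an internal subset so the example does not need an external email.dtd file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class DtdValidationSketch {

    // The email DTD, written as an internal subset of the DOCTYPE declaration
    static final String DOCTYPE =
        "<!DOCTYPE email [\n"
        + "<!ELEMENT email (from,to,subject,body)>\n"
        + "<!ELEMENT from (#PCDATA)>\n"
        + "<!ELEMENT to (to-email+)>\n"
        + "<!ELEMENT subject (#PCDATA)>\n"
        + "<!ELEMENT body (#PCDATA)>\n"
        + "<!ELEMENT to-email (#PCDATA)>\n"
        + "<!ATTLIST email date CDATA #REQUIRED time CDATA #IMPLIED>\n"
        + "]>\n";

    /** Validates the given root element against the DTD and returns the error messages. */
    static List<String> validate(String emailElement) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation
        DocumentBuilder builder = factory.newDocumentBuilder();
        List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                errors.add(e.getMessage()); // validity errors are reported here
            }
        });
        builder.parse(new InputSource(new StringReader(DOCTYPE + emailElement)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        String valid = "<email date=\"2015-8-26\"><from>aaa@aaa.com</from>"
            + "<to><to-email>bbb@bbb.com</to-email></to>"
            + "<subject>Hello XML</subject><body>Hi</body></email>";
        // Missing the required date attribute, and <to> has no <to-email>
        String invalid = "<email><from>aaa@aaa.com</from><to></to>"
            + "<subject>Hello XML</subject><body>Hi</body></email>";
        System.out.println("valid document errors:   " + validate(valid));
        System.out.println("invalid document errors: " + validate(invalid));
    }
}
```

The first document should produce an empty list; the second should report the missing #REQUIRED attribute and the empty to element (the exact message text depends on the parser).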
XPath
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
* Uses a path notation to express where data is located
/email/from
/email/body
/email/to/to-email
/email/@date
/email//to-email
//to-email
//@date
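The paths above can be evaluated with the javax.xml.xpath package from the JDK. This is only a sketch: the class XPathSketch, its first() and count() helpers, and the shortened email document are assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathSketch {

    // A shortened version of the email document used throughout these notes
    static final String EMAIL =
        "<email date=\"2015-8-26\" time=\"15:4:12\">"
        + "<from>aaa@aaa.com</from>"
        + "<to><to-email>bbb@bbb.com</to-email><to-email>ccc@ccc.com</to-email></to>"
        + "<subject>Hello XML</subject>"
        + "</email>";

    /** Returns the string value of the first node the expression selects. */
    static String first(String expr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expr, new InputSource(new StringReader(EMAIL)));
    }

    /** Returns how many nodes the expression selects. */
    static int count(String expr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
            expr, new InputSource(new StringReader(EMAIL)), XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(first("/email/from"));        // child element by absolute path
        System.out.println(first("/email/@date"));       // attribute of the root element
        System.out.println(first("/email/to/to-email")); // first of several matches
        System.out.println(count("/email//to-email"));   // descendants of email at any depth
        System.out.println(count("//@date"));            // date attributes anywhere
    }
}
```

A single slash steps to a direct child, a double slash matches descendants at any depth, and @ selects an attribute.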
Processing XML in Java
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
* SAX
* DOM4J
* Xml Pull
SAX
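SAX (Simple API for XML) is one way of parsing XML. It is a standard for event-driven XML parsing. Put simply, SAX scans the document sequentially; when it reaches the start or end of the document, the start or end of an element, and so on, it notifies an event-handler method, which does its work, and then the scan continues in the same way until the document ends.
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Characteristics of SAX parsing
------------------------------------------------------------------
* SAX is a fast, efficient way to parse XML
* It scans the document sequentially and parses as it goes, without loading the whole document first
* It is relatively complex to use
Main classes
--------------------------------------------------------------------
* SAXParserFactory the SAX parser factory
* SAXParser the SAX parser
* DefaultHandler receives and handles the parsing events
SAXParserFactory
---------------------------------
* Responsible for creating parser objects
Creating an instance
---------------------------
SAXParserFactory f = SAXParserFactory.newInstance();
Methods
---------------------------
newSAXParser() creates a parser object
SAXParser
--------------------------------------
Methods
-----------------------------
parse(file,
handler)
parse(input stream,
handler)
DefaultHandler
--------------------------------------
Methods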
-----------------------------
startElement()
endElement()
characters()
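To make the order of these callbacks concrete, the following sketch records each event as the parser reports it. The class name SaxEventTrace and the trace() helper are invented for illustration; only JDK classes are used.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventTrace {

    /** Parses the XML and returns the handler callbacks in the order they fired. */
    static List<String> trace(String xml) throws Exception {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                events.add("startDocument");
            }
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                events.add("start:" + qName);
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                // A parser may deliver one text node in several chunks,
                // which is why BookHandler collects text in a StringBuilder
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text:" + text);
                }
            }
            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }
            @Override
            public void endDocument() {
                events.add("endDocument");
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : trace("<authors><author>Tom</author><author>Ann</author></authors>")) {
            System.out.println(event);
        }
    }
}
```

Start and end events nest exactly as the elements do, and the handler never sees the document as a whole, which is why any state it needs has to be kept in its own fields.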
Sample code:
Test class:
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

public class Test1 {
    public static void main(String[] args) throws Exception {
        /*
         * getResource() is a utility that turns a relative path
         * into an absolute path.
         * "/" stands for the directory the program runs from;
         * in the Eclipse development environment that is the bin directory:
         * D:\aaa_workspace\day2101_xml\bin
         */
        String path = Test1.class.getResource("/books.xml").getPath();
        SAXParser p = SAXParserFactory.newInstance().newSAXParser();
        BookHandler h = new BookHandler();
        p.parse(path, h);
        List<Book> list = h.list;
        for (Book b : list) {
            System.out.println(b);
        }
    }
}

Handler class:
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class BookHandler extends DefaultHandler {
    // Text collected since the last end tag
    StringBuilder text = new StringBuilder();
    List<Book> list = new ArrayList<Book>();
    // The book currently being filled in
    Book book = null;

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        if ("book".equals(qName)) {
            book = new Book();
            list.add(book);
            // The first attribute of <book> is taken to be the ISBN
            book.setIsbn(attributes.getValue(0));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        String s = text.toString().trim();
        if (!s.equals("")) {
            if ("name".equals(qName)) {
                book.setName(s);
            } else if ("author".equals(qName)) {
                book.getAuthors().add(s);
            } else if ("publisher".equals(qName)) {
                book.setPublisher(s);
            } else if ("pages".equals(qName)) {
                book.setPages(s);
            } else if ("price".equals(qName)) {
                book.setPrice(s);
            }
        }
        // Empty the buffer for the next element
        text.delete(0, text.length());
    }

    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        text.append(ch, start, length);
    }
}
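The Book class that BookHandler fills in is not shown in these notes. The following is a guess at its shape, inferred only from the setters and the getAuthors() call the handler uses; the field types, the toString() format, and the books.xml layout described in the comment are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/*
 * Hypothetical Book class. It assumes books.xml contains <book isbn="...">
 * elements with <name>, <author> (repeatable), <publisher>, <pages>, and
 * <price> children, since those are the tag names BookHandler checks for.
 */
public class Book {
    private String isbn;
    private String name;
    private final List<String> authors = new ArrayList<>();
    private String publisher;
    private String pages; // kept as text because the handler passes a String
    private String price;

    public void setIsbn(String isbn) { this.isbn = isbn; }
    public void setName(String name) { this.name = name; }
    public List<String> getAuthors() { return authors; }
    public void setPublisher(String publisher) { this.publisher = publisher; }
    public void setPages(String pages) { this.pages = pages; }
    public void setPrice(String price) { this.price = price; }

    @Override
    public String toString() {
        return "Book[isbn=" + isbn + ", name=" + name + ", authors=" + authors
            + ", publisher=" + publisher + ", pages=" + pages + ", price=" + price + "]";
    }
}
```

The author list is created up front so that the handler can call getAuthors().add(...) without first checking for null.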
DOM4J
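DOM (Document Object Model) reads the entire XML document into memory and builds an inverted tree, with the root at the top. The DOM model offers many APIs for working with the document. Because DOM has to load the whole document into memory first, it is not suited to parsing large XML documents, but its API is comparatively simple.
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
* Document Object Model for Java
A DOM API that follows Java programming conventions
* Third-party open-source API
* See the official documentation and sample code
Reading XML with DOM4J
-------------------------------------------------------------------
* Add the DOM4J jar to the project
* See the sample code below for the details
import java.util.Iterator;
import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class Test2 {
    public static void main(String[] args) throws Exception {
        String path = Test2.class.getResource("/email.xml").getPath();
        SAXReader reader = new SAXReader();
        // Read the XML file and build the DOM tree;
        // the Document object is the root of the tree structure
        Document doc = reader.read(path);
        Element email = doc.getRootElement();
        // Print the text of each child of <email> that holds only text
        for (Iterator<Element> it = email.elementIterator(); it.hasNext(); ) {
            Element e = it.next();
            if (e.isTextOnly()) {
                System.out.println(e.getText());
            }
        }
        System.out.println("====================");
        // Print every <to-email> inside <to>
        for (Iterator<Element> it = email.element("to").elementIterator("to-email"); it.hasNext(); ) {
            Element e = it.next();
            System.out.println(e.getText());
        }
        System.out.println("====================");
        // Print every attribute of <email>
        for (Iterator<Attribute> it = email.attributeIterator(); it.hasNext(); ) {
            Attribute a = it.next();
            System.out.println(a.getName() + "=" + a.getValue());
        }
    }
}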
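Writing XML with DOM4J
--------------------------------------------------------------------------------------------------------
* Add the DOM4J jar to the project
* See the sample code below for the details:
import java.io.File;
import java.io.FileOutputStream;
import java.util.Scanner;
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.SAXReader;
import org.dom4j.io.XMLWriter;

// The class header is missing from the original; the name Test3 is assumed
public class Test3 {
    public static void main(String[] args) throws Exception {
        File f = new File("d:/abc/stu.xml");
        Document doc;
        if (f.exists()) {
            // Load the existing file so the new record is appended to it
            doc = new SAXReader().read(f);
        } else {
            // Otherwise start a new document with a <students> root
            doc = DocumentHelper.createDocument();
            doc.addElement("students");
        }
        // One Scanner serves all three prompts
        Scanner in = new Scanner(System.in);
        System.out.println("Name:");
        String name = in.nextLine();
        System.out.println("Gender:");
        String gender = in.nextLine();
        System.out.println("Age:");
        String age = in.nextLine();
        Element stus = doc.getRootElement(); // get the root element
        Element stu = stus.addElement("stu"); // add a child element
        stu.addElement("name").setText(name);
        stu.addElement("gender").setText(gender);
        stu.addElement("age").setText(age);
        XMLWriter write = new XMLWriter(
                new FileOutputStream("d:/abc/stu.xml"), OutputFormat.createPrettyPrint());
        write.write(doc);
        write.flush();
        write.close();
    }
}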
Xml Pull
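Pull parsing works on a principle similar to SAX parsing. The difference is that a pull parser returns a number, an integer event code, to tell the caller which event it has just reached, and the caller decides when to ask for the next one.
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
* Third-party open-source API
* Xml Pull is integrated into the Android SDK
* XmlPullParserFactory
* Xml
* XmlPullParser
* XmlSerializer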
XmlPullParserFactory
-------------------------------------------------------------------------------------------
* The factory responsible for creating XmlPullParser
and XmlSerializer objects
Creating an instance
--------------------------------------------------------------------------------------
XmlPullParserFactory f = XmlPullParserFactory.newInstance();
Methods
------------------------------------
newPullParser()
newSerializer()
Xml
-----------------------------------------------
* Stands in for the factory and offers a simpler way to create these objects
Methods
------------------------------------
Xml.newPullParser()
Xml.newSerializer()
XmlPullParser
---------------------------------------------
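* The parser, used to parse XML
Methods
----------------------------------------
setInput(Reader in)
setInput(InputStream in, String encoding) sets the stream to read data from
*) One of these must be called before anything else
next() advances to the next event and returns an integer code for its type
Once the document has ended, advancing again returns XmlPullParser.END_DOCUMENT (value 1)
getEventType() returns the integer code for the current position
getName() returns the tag name; call it only when the code is XmlPullParser.START_TAG or XmlPullParser.END_TAG
getText() returns the text; call it only when the code is XmlPullParser.TEXT
nextText() advances one step and returns the text; roughly equivalent to next(); getText(); and on return the parser is positioned on the END_TAG
Call it only when the code is XmlPullParser.START_TAG
getAttributeCount() the number of attributes
getAttributeName(index) the name of the attribute at the given index
getAttributeValue(index) the value of the attribute at the given index
getAttributeValue(namespace, name) the value of the attribute with the given name
namespace is the namespace; pass null if there is none
Call these attribute methods only when the code is XmlPullParser.START_TAG
Sample code: reading student information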
// qu: reads the student record back from /sdcard/stu.xml
public void qu(View view) {
    try {
        tv.setText("");
        // Open the file input stream
        FileInputStream in = new FileInputStream("/sdcard/stu.xml");
        // Create the pull parser
        XmlPullParser p = Xml.newPullParser();
        // Set the input stream and character set the parser uses
        p.setInput(in, "GBK");
        // Event type code
        int type;
        while ((type = p.next()) != XmlPullParser.END_DOCUMENT) {
            // Handle each event type in its own way
            if (type == XmlPullParser.START_TAG) {
                String n = p.getName();
                if ("name".equals(n)) {
                    tv.append("\nName: " + p.nextText());
                } else if ("gender".equals(n)) {
                    tv.append("\nGender: " + p.nextText());
                } else if ("age".equals(n)) {
                    tv.append("\nAge: " + p.nextText());
                }
            }
        }
        in.close();
        Toast.makeText(this, "Read complete", Toast.LENGTH_SHORT).show();
    } catch (Exception e) {
        Toast.makeText(this, "Read failed", Toast.LENGTH_SHORT).show();
        e.printStackTrace();
    }
}
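XmlPullParser is not part of the standard JDK, so the loop above cannot be run outside Android without an extra library. As a rough desktop equivalent, the sketch below uses StAX (javax.xml.stream), which follows the same pull model: next() returns an integer event code and the caller decides what to do with it. This is a substitute technique, not the Xml Pull API itself, and the class name StaxPullSketch and the readStudent() helper are invented for the example.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StaxPullSketch {

    /** Pulls events from the XML and collects the name, gender, and age fields. */
    static Map<String, String> readStudent(String xml) throws Exception {
        Map<String, String> fields = new LinkedHashMap<>();
        XMLStreamReader reader =
            XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        while (reader.hasNext()) {
            int type = reader.next(); // integer event code, like XmlPullParser.next()
            if (type == XMLStreamConstants.START_ELEMENT) {
                String n = reader.getLocalName();
                if ("name".equals(n) || "gender".equals(n) || "age".equals(n)) {
                    // Like nextText(): returns the text and stops on the end tag
                    fields.put(n, reader.getElementText());
                }
            }
        }
        reader.close();
        return fields;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<student><name>Tom</name><gender>M</gender><age>20</age></student>";
        System.out.println(readStudent(xml));
    }
}
```

The correspondence is close: START_ELEMENT plays the role of START_TAG, getLocalName() that of getName(), and getElementText() that of nextText().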
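XmlSerializer
-------------------------------------------------------------
* Outputs data as a character sequence
in XML format
Methods
---------------------------------------
setOutput(Writer out)
setOutput(OutputStream out, String encoding) sets the stream
the XML is written to
*) One of these must be called before anything else
startDocument(encoding, standalone) writes
the XML declaration
<?xml version="1.0" encoding="GBK" standalone="yes"?>
startTag(namespace, name) writes a start tag; if there is no namespace, pass
null as the first argument
attribute(namespace, name, value) writes an attribute
text(string) writes text
cdsect(string) writes a
<![CDATA[ ]]> section
endTag(namespace, name) writes an end tag
Sample code: saving student information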
// cun: saves the student record to /sdcard/stu.xml
public void cun(View view) {
    try {
        // Create the file output stream for /sdcard/stu.xml
        FileOutputStream out = new FileOutputStream("/sdcard/stu.xml");
        // Create the XmlSerializer object
        XmlSerializer s = Xml.newSerializer();
        // Set the output stream and its encoding
        s.setOutput(out, "GBK");
        // Start building the document
        //s.startDocument("GBK", true);
        s.startTag(null, "student")
            .startTag(null, "name").text(et1.getText().toString()).endTag(null, "name")
            .startTag(null, "gender").text(et2.getText().toString()).endTag(null, "gender")
            .startTag(null, "age").text(et3.getText().toString()).endTag(null, "age")
            .endTag(null, "student");
        // Flush and release resources
        s.flush();
        out.close();
        Toast.makeText(this, "Saved", Toast.LENGTH_SHORT).show();
    } catch (Exception e) {
        Toast.makeText(this, "Save failed", Toast.LENGTH_SHORT).show();
        e.printStackTrace();
    }
}
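For trying the same write sequence outside Android, here is a sketch with the StAX XMLStreamWriter from the JDK. It is a stand-in for XmlSerializer, not the same API: writeStartElement, writeCharacters, and writeEndElement correspond roughly to startTag, text, and endTag. The class name StaxWriteSketch and its helpers are made up for illustration.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

public class StaxWriteSketch {

    /** Serializes one student record and returns the XML as a string. */
    static String writeStudent(String name, String gender, String age) throws Exception {
        StringWriter out = new StringWriter();
        XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        w.writeStartDocument();           // writes the XML declaration
        w.writeStartElement("student");   // like startTag(null, "student")
        writeField(w, "name", name);
        writeField(w, "gender", gender);
        writeField(w, "age", age);
        w.writeEndElement();              // like endTag(null, "student")
        w.writeEndDocument();
        w.close();
        return out.toString();
    }

    /** Writes <tag>text</tag>; special characters in the text are escaped by the writer. */
    private static void writeField(XMLStreamWriter w, String tag, String text) throws Exception {
        w.writeStartElement(tag);
        w.writeCharacters(text);
        w.writeEndElement();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(writeStudent("Tom & Jerry", "M", "20"));
    }
}
```

Because the writer does the escaping, the ampersand in the name comes out as &amp; without any extra work from the caller.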