Parsing XML with SAX in Java
2015-04-30 16:28
<?xml version='1.0' encoding='UTF-8'?> <samples> <search_results><query id="7015">the raven</query><engine status="OK" timestamp="2014-05-15 13:43:06" name="CiteSeerX" id="FW14-e004"/><snippets><snippet id="FW14-e004-7015-01"><link cache="FW14-topics-docs/e004/7015_01.html" timestamp="2014-05-15 13:43:07">http://citeseerx.ist.psu.edu/viewdoc/summary;jsessionid=EBAC5670A019281E8386BCB54B1D1398?doi=10.1.1.51.7167&rank=1</link><title>The Raven System</title><description>The Raven System by Donald Acton, Terry Coatta, Gerald Neufeld , 1992 "... The Raven System 1 Donald Acton, Terry Coatta and Gerald Neufeld Technical Report TR 92-15 August ..." Abstract \- Cited by 7 (4 self) \- Add to MetaCart This report describes the distributed object-oriented system, <em>Raven</em>. <em>Raven</em> is both a distributed</description></snippet><snippet id="FW14-e004-7015-02"><link cache="FW14-topics-docs/e004/7015_02.html" timestamp="2014-05-15 13:43:08">http://citeseerx.ist.psu.edu/viewdoc/summary;jsessionid=EBAC5670A019281E8386BCB54B1D1398?doi=10.1.1.35.4932&rank=2</link><title>The Raven System</title><description>The Raven System by Donald Acton Terry, Terry Coatta, Gerald Neufeld , 1992 "... The Raven System 1 Donald Acton, Terry Coatta and Gerald Neufeld Technical Report TR 92-15 August ..." Abstract \- Add to MetaCart This report describes the distributed object-oriented system, <em>Raven</em>. <em>Raven</em> is both a distributed</description></snippet><snippet id="FW14-e004-7015-03"><link cache="FW14-topics-docs/e004/7015_03.html" timestamp="2014-05-15 13:43:08">http://citeseerx.ist.psu.edu/viewdoc/summary;jsessionid=EBAC5670A019281E8386BCB54B1D1398?doi=10.1.1.276.8949&rank=3</link><title>book In the Company of Crows and Ravens</title><description>book In the Company of Crows and Ravens by Marzluff Jm, John Marzluff, Tony Angell, Quote Reverend Henry Ward Beecher’s "... Book Reviews/Science in the Media Living with the Trickster: Crows, Ravens, and Human Culture ..." 
Abstract \- Add to MetaCart Few groups of wild animals inspire such extreme opinions in the humans who observe them than</description></snippet><snippet id="FW14-e004-7015-04"><link cache="FW14-topics-docs/e004/7015_04.html" timestamp="2014-05-15 13:43:09">http://citeseerx.ist.psu.edu/viewdoc/summary;jsessionid=EBAC5670A019281E8386BCB54B1D1398?doi=10.1.1.395.6124&rank=4</link><title>Design by Raven Design</title><description>Design by Raven Design by Third-level Programmes, Edited Irene Sheridan, Dr Margaret Linehan "... by Raven Design Printed by City Print Ltd © CIT Press 2011 ISBN 978-1-906953-07-2 The toolkit includes ..." Abstract \- Add to MetaCart Work Placement in Third-Level Programmes is one of a number of significant outputs of the Roadmap for Employment–Academic Partnerships (REAP) Project. This report draws together for the first time perspectives on placement from all of the key stakeholders. In addition to providing a unique overview of the placement experience the project team have used the information gathered to develop a useful, transferable toolkit for placement. Publication Information Although every effort has been made to ensure the accuracy of the material contained in this publication, complete accuracy cannot be guaranteed. All or part of this publication may be reproduced without further</description></snippet><snippet id="FW14-e004-7015-05"><link cache="FW14-topics-docs/e004/7015_05.html" timestamp="2014-05-15 13:43:09">http://citeseerx.ist.psu.edu/viewdoc/summary;jsessionid=EBAC5670A019281E8386BCB54B1D1398?doi=10.1.1.149.3392&rank=5</link><title>Basic objects in natural categories</title><description>Basic objects in natural categories by Eleanor Rosch, Carolyn B. Mervis, Wayne D. Gray, David M, Penny Boyes-braem \- Cognitive Psychology , 1976 "... , & Raven, 1966); and finally, the location of natural groupings at a particular level of abstraction ..." 
Abstract \- Cited by 487 (1 self) \- Add to MetaCart Categorizations which humans make of the concrete world are not arbitrary but highly determined. In taxonomies of concrete objects, there is one level of abstraction at which the most basic category cuts are made. Basic categories are those which carry the most information, possess the highest category cue validity, and are, thus, the most differentiated from one another. The four experiments of Part I define basic objects by demonstrating that in taxonomies of common concrete nouns in English based on class inclusion, basic objects are the most inclusive categories whose members: (a) possess significant numbers of attributes in common, (b) have motor programs which are similar to one another, (c) have similar shapes, and (d) can be identified from averaged shapes of members of the class. The eight experiments of Part II explore implications of the structure of categories. Basic objects are shown to be the most inclusive categories for which a concrete image of the category as a whole can be formed, to be the first categorizations made during perception of the environment, to be the earliest categories sorted and earliest named by children, and to be the categories</description></snippet><snippet id="FW14-e004-7015-06"><link cache="FW14-topics-docs/e004/7015_06.html" timestamp="2014-05-15 13:43:10">http://citeseerx.ist.psu.edu/viewdoc/summary;jsessionid=EBAC5670A019281E8386BCB54B1D1398?doi=10.1.1.300.2871&rank=6</link><title>A Bayesian Model of Rule Induction in Raven’s Progressive Matrices</title><description>A Bayesian Model of Rule Induction in Raven’s Progressive Matrices by Daniel R. Little, Stephan Lewandowsky, Crawley Wa, Thomas L. Griffiths (tom "... A Bayesian Model of Rule Induction in Raven’s Progressive Matrices Daniel R. Little (daniel ..." 
Abstract \- Cited by 1 (0 self) \- Add to MetaCart <em>Raven’s</em> Progressive Matrices (<em>Raven</em>, <em>Raven</em>, & Court, 1998) is one of the most prevalent assays</description></snippet><snippet id="FW14-e004-7015-07"><link cache="FW14-topics-docs/e004/7015_07.html" timestamp="2014-05-15 13:43:11">http://citeseerx.ist.psu.edu/viewdoc/summary;jsessionid=EBAC5670A019281E8386BCB54B1D1398?doi=10.1.1.225.3291&rank=7</link><title>A Structure-Mapping Model of Raven’s Progressive Matrices</title><description>A Structure-Mapping Model of Raven’s Progressive Matrices by Andrew Lovett, Kenneth Forbus, Jeffrey Usher "... A Structure-Mapping Model of Raven’s Progressive Matrices Andrew Lovett (andrew ..." Abstract \- Cited by 5 (2 self) \- Add to MetaCart We present a computational model for solving <em>Raven’s</em> Progressive Matrices. This model combines</description></snippet><snippet id="FW14-e004-7015-08"><link cache="FW14-topics-docs/e004/7015_08.html" timestamp="2014-05-15 13:43:12">http://citeseerx.ist.psu.edu/viewdoc/summary;jsessionid=EBAC5670A019281E8386BCB54B1D1398?doi=10.1.1.231.3664&rank=8</link><title>RAVEN – Active Learning of Link Specifications</title><description>RAVEN – Active Learning of Link Specifications by Axel-cyrille Ngonga Ngomo, Jens Lehmann, Sören Auer, Konrad Höffner "... RAVEN – Active Learning of Link Specifications Axel-Cyrille Ngonga Ngomo, Jens Lehmann, Sören Auer ..." Abstract \- Cited by 7 (1 self) \- Add to MetaCart for a link discovery problem is a tedious task that must still be carried out manually. 
We present <em>RAVEN</em></description></snippet><snippet id="FW14-e004-7015-09"><link cache="FW14-topics-docs/e004/7015_09.html" timestamp="2014-05-15 13:43:13">http://citeseerx.ist.psu.edu/viewdoc/summary;jsessionid=EBAC5670A019281E8386BCB54B1D1398?doi=10.1.1.103.8446&rank=9</link><title>RAVEN: Real-Time Analyzing and Verification Environment</title><description>RAVEN: Real-Time Analyzing and Verification Environment by Jürgen Ruf \- Journal on Universal Computer Science (J.UCS), Springer , 2001 "... RAVEN: Real-Time Analyzing and Verification Environment Jürgen Ruf (University of Tübingen ..." Abstract \- Cited by 16 (3 self) \- Add to MetaCart Abstract: In this paper we present the real-time verification and analysis tool <em>RAVEN</em>. <em>RAVEN</em></description></snippet><snippet id="FW14-e004-7015-10"><link cache="FW14-topics-docs/e004/7015_10.html" timestamp="2014-05-15 13:43:13">http://citeseerx.ist.psu.edu/viewdoc/summary;jsessionid=EBAC5670A019281E8386BCB54B1D1398?doi=10.1.1.39.9827&rank=10</link><title>The Advantages of Evolutionary Computation</title><description>The Advantages of Evolutionary Computation by David B. Fogel , 1997 "... variants. Others (Atmar, 1979; Raven and Johnson, 1986, pp. 400-401) have suggested that it is more ..." Abstract \- Cited by 396 (5 self) \- Add to MetaCart Evolutionary computation is becoming common in the solution of difficult, realworld problems in industry, medicine, and defense. This paper reviews some of the practical advantages to using evolutionary algorithms as compared with classic methods of optimization or artificial intelligence. Specific advantages include the flexibility of the procedures, as well as the ability to self-adapt the search for optimum solutions on the fly. As desktop computers increase in speed, the application of evolutionary algorithms will become routine. 1 Introduction Darwinian evolution is intrinsically a robust search and optimization mechanism. 
Evolved biota demonstrate optimized complex behavior at every level: the cell, the organ, the individual, and the population. The problems that biological species have solved are typified by chaos, chance, temporality, and nonlinear interactivities. These are also characteristics of problems that have proved to be especially intractable to classic methods of o...</description></snippet></snippets></search_results></samples>
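A note on the sample above: the `<link>` URLs are shown with a raw `&` (for example `&rank=1`). That may simply be how the page rendered the file, but if the file on disk really contains a raw ampersand, a conforming XML parser will reject it; inside XML text it has to be written as `&amp;`. The following is a minimal sketch for checking this. The class name `AmpersandCheck` and the `example.org` URLs are made up for illustration and are not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the original post.
public class AmpersandCheck {

    /** Returns true if the SAX parser accepts the given XML text. */
    public static boolean isWellFormed(String xml) throws Exception {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // Well-formedness errors surface as SAXParseException, a subclass of SAXException.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String raw = "<link>http://example.org/doc?doi=1&rank=1</link>";
        String escaped = "<link>http://example.org/doc?doi=1&amp;rank=1</link>";
        System.out.println("raw ampersand accepted:     " + isWellFormed(raw));
        System.out.println("escaped ampersand accepted: " + isWellFormed(escaped));
    }
}
```

If the first check fails on your own data, the input needs to be escaped before it is handed to the SAX parser.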
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;
import java.util.Vector;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;
import org.apache.lucene.document.*;

public class SAXXMLDocument extends DefaultHandler {

    // Text content of the element currently being read.
    private StringBuilder elementBuffer = new StringBuilder();
    // Attributes of the most recently opened element.
    private Map<String, String> attributeMap = new HashMap<String, String>();
    // Engine id -> vertical and engine id -> URL, loaded from the resource file.
    private HashMap<String, String> vertical = new HashMap<String, String>();
    private HashMap<String, String> urls = new HashMap<String, String>();
    private Vector<Document> docs;
    private Document doc;

    private String queryid, querytext, engineid, enginename, verticalid, snippetid, engineurl;

    public Document getDocument(InputStream is) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        try {
            SAXParser parser = spf.newSAXParser();
            parser.parse(is, this);
        } catch (Exception e) {
            throw new Exception("Cannot parse XML document", e);
        }
        // Note: doc is never assigned while the "new Document()" lines below
        // stay commented out, so this currently returns null.
        return doc;
    }

    @Override
    public void startDocument() {
        //doc = new Document();
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Every opening tag resets the text buffer and the attribute map.
        elementBuffer.setLength(0);
        attributeMap.clear();
        for (int i = 0; i < atts.getLength(); i++) {
            attributeMap.put(atts.getQName(i), atts.getValue(i));
        }
        if (qName.equals("snippet")) {
            //doc = new Document();
            // Save the id now; child elements will overwrite attributeMap.
            snippetid = attributeMap.get("id");
        }
    }

    @Override
    public void characters(char[] text, int start, int length) {
        elementBuffer.append(text, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("query")) {
            queryid = attributeMap.get("id");
            querytext = elementBuffer.toString();
            System.out.println(queryid);
            System.out.println(querytext);
        } else if (qName.equals("engine")) {
            engineid = attributeMap.get("id");
            engineurl = urls.get(engineid);
            // The attribute in the sample is lowercase "name"; "Name" always returned null.
            enginename = attributeMap.get("name");
            verticalid = vertical.get(engineid);
            System.out.println(engineid);
            System.out.println(elementBuffer.toString());
            System.out.println("v: " + engineid + verticalid);
        } else if (qName.equals("link")) {
            System.out.println("link: " + elementBuffer.toString());
        } else if (qName.equals("title")) {
            System.out.println("title: " + elementBuffer.toString());
        } else if (qName.equals("description")) {
            System.out.println("description: " + elementBuffer.toString());
        } else if (qName.equals("snippet")) {
            //docs.add(doc);
            // End of one snippet.
            System.out.println("________________________________________________");
            //System.out.println("snippet" + elementBuffer.toString());
            System.out.println("________________________________________________");
        }
    }

    public static void main(String[] args) throws Exception {
        SAXXMLDocument handler = new SAXXMLDocument();
        handler.initResourceInfo();
        String input_file = "E:\\FW14-topics-search\\e004\\7015.xml";
        // try-with-resources closes the stream once parsing is done.
        try (InputStream in = new FileInputStream(new File(input_file))) {
            Document doc = handler.getDocument(in);
            System.out.println(doc);
        }
    }

    // Loads the tab-separated resource list: column 0 is the engine id,
    // column 2 the URL, column 3 the vertical.
    public void initResourceInfo() throws FileNotFoundException {
        try (Scanner cin = new Scanner(new File("E:\\resources_fedweb2014.txt"))) {
            if (cin.hasNextLine()) {
                cin.nextLine(); // skip the header line
            }
            while (cin.hasNextLine()) {
                String[] s = cin.nextLine().split("\t");
                if (s.length < 4) {
                    continue; // skip blank or malformed lines
                }
                String engineid = s[0];
                String urlid = s[2];
                String engineVertical = s[3];
                //System.out.println(engineid + " # " + engineVertical);
                vertical.put(engineid, engineVertical);
                urls.put(engineid, urlid);
            }
        }
    }
}
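One thing to watch in the handler above: the text buffer is cleared on every opening tag, so if a `<description>` really contains inline elements such as `<em>`, only the text from the last inline element onward survives to `endElement`. The sketch below shows one possible way around this, resetting the buffer only when a field element starts, and collects each snippet into a plain map instead of printing it. It leaves out Lucene entirely; the class name `SnippetCollector`, the inline test XML, and the `example.org` URL are assumptions for illustration, not the original author's code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: one map per <snippet> with id, link, title, description.
public class SnippetCollector extends DefaultHandler {

    private final List<Map<String, String>> snippets = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private Map<String, String> current; // snippet being built, or null outside <snippet>

    private static boolean isField(String qName) {
        return qName.equals("link") || qName.equals("title") || qName.equals("description");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (qName.equals("snippet")) {
            current = new LinkedHashMap<>();
            current.put("id", atts.getValue("id"));
        } else if (isField(qName)) {
            // Reset only when a field starts, so inline children like <em> keep the text.
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current == null) {
            return;
        }
        if (isField(qName)) {
            current.put(qName, text.toString().trim());
        } else if (qName.equals("snippet")) {
            snippets.add(current);
            current = null;
        }
    }

    /** Parses the XML text and returns the collected snippets in document order. */
    public static List<Map<String, String>> parse(String xml) throws Exception {
        SnippetCollector handler = new SnippetCollector();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.snippets;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<snippets><snippet id=\"s1\">"
                + "<link>http://example.org/a</link>"
                + "<title>The Raven System</title>"
                + "<description>About <em>Raven</em> systems</description>"
                + "</snippet></snippets>";
        for (Map<String, String> snippet : parse(xml)) {
            System.out.println(snippet);
        }
    }
}
```

From here, each map could be turned into a Lucene `Document` at the point where the original code has `docs.add(doc)` commented out; how the fields are added depends on the Lucene version in use.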