Lucene Tutorial (3): Understanding the Core Classes of the Search Process

2013-10-29 22:24
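In the previous post we covered the core classes of the indexing process and refactored IndexFiles.

In this post we look at the core classes of the search process and refactor the SearchFiles class.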

// Now search the index:
DirectoryReader ireader = DirectoryReader.open(directory);
IndexSearcher isearcher = new IndexSearcher(ireader);
// Parse a simple query that searches for "text":
QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "fieldname", analyzer);
Query query = parser.parse("text");
ScoreDoc[] hits = isearcher.search(query, null, 1000).scoreDocs;
assertEquals(1, hits.length);
// Iterate through the results:
for (int i = 0; i < hits.length; i++) {
    Document hitDoc = isearcher.doc(hits[i].doc);
    assertEquals("This is the text to be indexed.", hitDoc.get("fieldname"));
}
ireader.close();
directory.close();


1. IndexSearcher

        IndexSearcher searches the index created by IndexWriter. You can think of IndexSearcher as a class that opens the index in read-only mode and provides a number of search() methods.

2. Term

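        A term (Term) is the basic unit of search. Like a Field object, it consists of a pair of string elements: the name of the field and the value (text) within that field.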
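
        To make the "pair of strings" idea concrete, here is a minimal sketch of a toy inverted index keyed by (field, text) pairs. It uses only the Java standard library. SimpleTerm and ToyIndex are hypothetical names invented for this illustration; they are not Lucene classes, and Lucene's real index structures are far more elaborate. The whitespace split and lower-casing stand in for an analyzer and are also assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class TermSketch {

    /** Hypothetical stand-in for a term: a field name paired with a text value. */
    static final class SimpleTerm {
        final String field;
        final String text;

        SimpleTerm(String field, String text) {
            this.field = field;
            this.text = text;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof SimpleTerm)) {
                return false;
            }
            SimpleTerm other = (SimpleTerm) o;
            return field.equals(other.field) && text.equals(other.text);
        }

        @Override
        public int hashCode() {
            return Objects.hash(field, text);
        }

        @Override
        public String toString() {
            return field + ":" + text;
        }
    }

    /** Toy inverted index: maps each term to the ids of the documents that contain it. */
    static final class ToyIndex {
        private final Map<SimpleTerm, List<Integer>> postings = new HashMap<>();

        /** Splits the value on whitespace and records one term per lower-cased token. */
        void add(int docId, String field, String value) {
            for (String token : value.toLowerCase().trim().split("\\s+")) {
                if (token.isEmpty()) {
                    continue;
                }
                List<Integer> docs =
                        postings.computeIfAbsent(new SimpleTerm(field, token), k -> new ArrayList<>());
                // Skip the id if this document was already recorded for this term.
                if (docs.isEmpty() || docs.get(docs.size() - 1) != docId) {
                    docs.add(docId);
                }
            }
        }

        /** Read-only lookup: returns the ids of documents containing exactly this term. */
        List<Integer> search(SimpleTerm term) {
            List<Integer> docs = postings.get(term);
            return docs == null
                    ? Collections.<Integer>emptyList()
                    : Collections.unmodifiableList(docs);
        }
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add(0, "contents", "aa bb");
        index.add(1, "contents", "bb cc");
        index.add(2, "path", "aa");

        // The same text in a different field is a different term.
        System.out.println(index.search(new SimpleTerm("contents", "aa"))); // [0]
        System.out.println(index.search(new SimpleTerm("path", "aa")));     // [2]
        System.out.println(index.search(new SimpleTerm("contents", "zz"))); // []
    }
}
```

        The point of the sketch is that a lookup always names both the field and the text: searching for "aa" in "contents" and searching for "aa" in "path" are two separate terms with separate document lists.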

3. Query

        Lucene has many concrete Query subclasses, which later posts will cover in detail.



Next, let's refactor the SearchFiles class, removing the paging logic and some of the parameters.

package org.ygy.lucene;

import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class SearchFiles {

    public static void main(String[] args) throws Exception {

        String index = "F:\\Lucene_index";
        String field = "contents";
        String queryString = "aa";

        int hitsPerPage = 10;

        // Open the index for reading
        IndexReader reader = DirectoryReader.open(FSDirectory.open(new File(index)));

        // Create a searcher over the index
        IndexSearcher searcher = new IndexSearcher(reader);

        // Analyzer
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_45);

        // Query parser
        QueryParser parser = new QueryParser(Version.LUCENE_45, field, analyzer);

        Query query = parser.parse(queryString);

        System.out.println("Searching for: " + query.toString(field));

        doSearch(searcher, query, hitsPerPage);

        reader.close();
    }

    public static void doSearch(IndexSearcher searcher, Query query, int hitsPerPage) throws IOException {
        TopDocs results = searcher.search(query, 5 * hitsPerPage);
        ScoreDoc[] hits = results.scoreDocs;

        int numTotalHits = results.totalHits;
        System.out.println(numTotalHits + " total matching documents");

        int start = 0;
        int end = Math.min(numTotalHits, hitsPerPage);

        // Iterate over the search results
        for (int i = start; i < end; i++) {
            Document doc = searcher.doc(hits[i].doc);

            String path = doc.get("path");
            if (path != null) {
                System.out.println(i + 1 + ". " + path);
            } else {
                System.out.println(i + 1 + ". " + "No path for this document");
            }
        }
    }
}