您的位置:首页 > 编程语言

lucene中文分词搜索的核心代码

2017-04-23 12:49 435 查看
public static void search(String indexDir,String q)throws Exception{
Directory dir=FSDirectory.open(Paths.get(indexDir));
IndexReader reader=DirectoryReader.open(dir);
IndexSearcher is=new IndexSearcher(reader);
// Analyzer analyzer=new StandardAnalyzer(); // 标准分词器
SmartChineseAnalyzer analyzer=new SmartChineseAnalyzer();
QueryParser parser=new QueryParser("desc", analyzer);
Query query=parser.parse(q);
long start=System.currentTimeMillis();
TopDocs hits=is.search(query, 10);
long end=System.currentTimeMillis();
System.out.println("匹配 "+q+" ,总共花费"+(end-start)+"毫秒"+"查询到"+hits.totalHits+"个记录");

QueryScorer scorer=new QueryScorer(query);
Fragmenter fragmenter=new SimpleSpanFragmenter(scorer);
SimpleHTMLFormatter simpleHTMLFormatter=new SimpleHTMLFormatter("<b><font color='red'>","</font></b>");
Highlighter highlighter=new Highlighter(simpleHTMLFormatter, scorer);
highlighter.setTextFragmenter(fragmenter);
for(ScoreDoc scoreDoc:hits.scoreDocs){
Document doc=is.doc(scoreDoc.doc);
System.out.println(doc.get("city"));
System.out.println(doc.get("desc"));
String desc=doc.get("desc");
if(desc!=null){
TokenStream tokenStream=analyzer.tokenStream("desc", new StringReader(desc));
System.out.println(highlighter.getBestFragment(tokenStream, desc));
}
}
reader.close();
}


内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: