
Installing Elasticsearch and Chinese Analyzers, with a Client Connection Example

2016-04-24 19:18

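This post records how to install Elasticsearch and its Chinese analyzers on Linux, and how to connect to the server from Java through the "spring-data-elasticsearch" dependency to index and search documents. The sample code calls TransportClient directly.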

1. Download Elasticsearch

I downloaded "elasticsearch-2.2.0.tar.gz" from the following address:
https://www.elastic.co/downloads/elasticsearch

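2. Install Elasticsearch

Installation is very simple. The official site describes it in three steps: (1) download the archive and unpack it; (2) run the elasticsearch script in the bin directory of the unpacked archive; (3) that is all: open "http://localhost:9200/" in a browser to see the information returned by the server.

To change the default configuration, edit the elasticsearch.yml file in the config directory of the unpacked archive: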

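node.name: the name of the node.

path.data: the path where index data is stored.

path.logs: the path where log files are stored.

network.host: the address the node binds to. Since version 2.0, Elasticsearch binds only to localhost by default. Set this to a specific address of the machine, such as "192.168.0.100", "127.0.0.1", or a public IP. The special value _global_ selects the machine's globally scoped addresses, and 0.0.0.0 binds to all interfaces.

http.port: the port of the REST (HTTP) service, 9200 by default.

discovery.zen.ping.unicast.hosts: ["host1", "host2"]: if the node should join an existing cluster after startup, list a few machines of that cluster here. They are used to discover the other members so that the node can join the cluster.

discovery.zen.minimum_master_nodes: a majority of the master-eligible nodes in the cluster, that is, (number of master-eligible nodes / 2) + 1. It guards against split-brain when the network partitions.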
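
The majority rule for discovery.zen.minimum_master_nodes can be sketched in a few lines of Java. This is only an illustration of the arithmetic; the class and method names are made up for this post and are not part of Elasticsearch.

```java
// Hypothetical helper (not an Elasticsearch API): computes the value to put
// into discovery.zen.minimum_master_nodes for a given cluster size.
class MasterQuorum {

    /** Majority of master-eligible nodes: integer division by 2, plus 1. */
    static int minimumMasterNodes(int masterEligibleNodes) {
        if (masterEligibleNodes < 1) {
            throw new IllegalArgumentException("need at least one master-eligible node");
        }
        return masterEligibleNodes / 2 + 1;
    }

    public static void main(String[] args) {
        // Print the quorum for a few small cluster sizes.
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " master-eligible node(s): minimum_master_nodes = "
                    + minimumMasterNodes(n));
        }
    }
}
```

With two master-eligible nodes the quorum is also two, so such a cluster cannot elect a master once either node is down. That is one reason three master-eligible nodes is a common minimum.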

3. Install the analyzers

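Run the following commands to install the ik analyzer:

# Clone the analyzer source (the unzip path below assumes this is run in /mnt/setupFiles)
git clone https://github.com/medcl/elasticsearch-analysis-ik
# Enter the source directory
cd elasticsearch-analysis-ik
# Build the plugin with Maven
mvn clean package
# Create the ik directory under Elasticsearch's plugins directory
mkdir /mnt/elasticsearch-2.2.0/plugins/ik
# Unpack the zip file produced by the Maven build into the ik directory
cd /mnt/elasticsearch-2.2.0/plugins/ik/
unzip /mnt/setupFiles/elasticsearch-analysis-ik/target/releases/elasticsearch-analysis-ik-*.zip

Then add the following line to elasticsearch.yml and restart:
index.analysis.analyzer.default.type: ik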

Run the following commands to install the mmseg analyzer.

The process is the same as for the ik analyzer; only the directories and names differ.

git clone https://github.com/medcl/elasticsearch-analysis-mmseg.git
cd elasticsearch-analysis-mmseg/
mvn clean package
mkdir /mnt/elasticsearch-2.2.0/plugins/mmseg
cd /mnt/elasticsearch-2.2.0/plugins/mmseg/
unzip /mnt/setupFiles/elasticsearch-analysis-mmseg/target/releases/elasticsearch-analysis-mmseg-1.8.0.zip

Then add the following line to elasticsearch.yml and restart:
index.analysis.analyzer.default.type: mmseg_complex

The index.analysis.analyzer.default.type setting selects the default analyzer, which here is the one used for Chinese text. Set it to either ik or mmseg_complex.
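
One way to check that an analyzer is active is to call the _analyze REST endpoint with some Chinese text and look at the tokens that come back. The text has to be URL-encoded first. The sketch below only builds such a URL with the JDK; the class name, the host "localhost:9200" and the index name "test_index6" are assumptions for illustration, and the index is assumed to exist already.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds an _analyze URL for Elasticsearch 2.x,
// which accepts the analyzer and the text as query parameters.
class AnalyzeUrl {

    static String build(String baseUrl, String index, String analyzer, String text) {
        try {
            return baseUrl + "/" + index + "/_analyze"
                    + "?analyzer=" + URLEncoder.encode(analyzer, "UTF-8")
                    + "&pretty"
                    + "&text=" + URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Paste the printed URL into a browser or pass it to curl.
        System.out.println(build("http://localhost:9200", "test_index6", "mmseg_complex", "中文内容"));
    }
}
```

If the response splits the text into words rather than into single characters, the Chinese analyzer is most likely in effect.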

4. Calling Elasticsearch from code

The Maven dependencies are as follows:

<dependency>
    <groupId>org.springframework.data</groupId>
    <artifactId>spring-data-elasticsearch</artifactId>
    <version>1.3.4.RELEASE</version>
    <exclusions>
        <exclusion>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>2.2.0</version>
    <exclusions>
        <!--<exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-core</artifactId>
        </exclusion>-->
        <exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-backward-codecs</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-analyzers-common</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-queries</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-memory</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-highlighter</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-queryparser</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-suggest</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-join</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-spatial</artifactId>
        </exclusion>
    </exclusions>
</dependency>

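The Lucene modules are excluded here because a client that connects through TransportClient does not use those jars. The lucene-core exclusion is left commented out, so lucene-core is still pulled in.

A Java example follows: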

import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.common.xcontent.XContentFactory;
import org.elasticsearch.index.query.QueryBuilders;

import java.io.IOException;
import java.net.InetSocketAddress;

public class App {
    // TransportClient talks to the transport port (9300 here), not the HTTP port (9200).
    private static TransportClient transportClient = TransportClient.builder().build()
            .addTransportAddress(new InetSocketTransportAddress(
                    new InetSocketAddress("123.56.179.51", 9300)));

    public static void main(String[] args) {
        try {
            // To see how a string is tokenized, open:
            // http://123.56.179.51:9200/test_index6/_analyze?analyzer=mmseg_complex&pretty&text=中文内容

            // Index the first document with id "1".
            IndexResponse indexResponse = transportClient.prepareIndex("test_index6", "testType", "1")
                    .setSource(XContentFactory.jsonBuilder()
                            .startObject()
                            .field("id", 1)
                            .field("type", 2)
                            .field("title", "hello world")
                            .field("content", "hello world content")
                            .field("content2", "中文和英文内容")
                            .field("content4", "中文和英文内容")
                            .endObject())
                    .execute().actionGet();
            // true if the document was newly created, false if it replaced an existing one
            System.out.println(indexResponse.isCreated());

            // Index the second document with id "2".
            indexResponse = transportClient.prepareIndex("test_index6", "testType", "2")
                    .setSource(XContentFactory.jsonBuilder()
                            .startObject()
                            .field("id", 2)
                            .field("type", 3)
                            .field("title", "hello world")
                            .field("content", "hello world content")
                            .field("content2", "中文内容和其他内容")
                            .field("content4", "英文内容")
                            .endObject())
                    .execute().actionGet();
        } catch (IOException e) {
            // jsonBuilder() declares IOException
            e.printStackTrace();
        }

        // Newly indexed documents become searchable only after the next refresh
        // (every second by default), so a search issued immediately may miss them.

        // termQuery: exact match on the un-analyzed term.
        SearchResponse searchResponse = transportClient.prepareSearch("test_index6").setTypes("testType")
                .setQuery(QueryBuilders.termQuery("type", 2))
                .execute().actionGet();
        System.out.println(searchResponse);
        System.out.println("termQuery: exact match");

        // matchPhraseQuery: the terms must appear next to each other, in order.
        searchResponse = transportClient.prepareSearch("test_index6").setTypes("testType")
                .setQuery(QueryBuilders.matchPhraseQuery("content4", "英文内容"))
                .execute().actionGet();
        System.out.println(searchResponse);
        System.out.println("matchPhraseQuery: matches only adjacent terms");

        // matchQuery: the terms may appear anywhere in the field.
        searchResponse = transportClient.prepareSearch("test_index6").setTypes("testType")
                .setQuery(QueryBuilders.matchQuery("content4", "中文内容"))
                .execute().actionGet();
        System.out.println(searchResponse);
        System.out.println("matchQuery: also matches terms that are not adjacent");

        // Release the client's threads and connections.
        transportClient.close();
    }
}

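Note: if the code throws an exception saying that the index does not exist, create the index first through "transportClient", for example with transportClient.admin().indices().prepareCreate("test_index6").execute().actionGet().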
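
The example connects to a single hard-coded node on the transport port, which defaults to 9300 and is not the same as http.port. For a cluster with several nodes, one possible approach is to keep the node list in one string and parse it into socket addresses. The sketch below uses only the JDK; the class name, the method name and the comma-separated "host:port" format are my own assumptions, and IPv6 literals are not handled.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns "host1:9300, host2" into socket addresses,
// falling back to the default transport port when none is given.
class TransportAddresses {

    static final int DEFAULT_TRANSPORT_PORT = 9300;

    static List<InetSocketAddress> parse(String hosts) {
        List<InetSocketAddress> result = new ArrayList<InetSocketAddress>();
        for (String entry : hosts.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            int colon = trimmed.lastIndexOf(':');
            String host = colon < 0 ? trimmed : trimmed.substring(0, colon);
            int port = colon < 0
                    ? DEFAULT_TRANSPORT_PORT
                    : Integer.parseInt(trimmed.substring(colon + 1));
            result.add(new InetSocketAddress(host, port));
        }
        return result;
    }

    public static void main(String[] args) {
        for (InetSocketAddress address : parse("192.168.0.100:9300, 127.0.0.1")) {
            System.out.println(address.getHostString() + " port " + address.getPort());
        }
    }
}
```

Each parsed address could then be wrapped in an InetSocketTransportAddress and passed to addTransportAddress, in the same way as the single address in the example.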
