
ElasticSearch - Multi-Index Retrieval and Bulk Operations

2017-10-19 17:18

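ElasticSearch is already fast, but it can be faster still. Combining multiple requests into one avoids the network latency and overhead of handling each request separately. If you need to retrieve many documents from ElasticSearch, using the multi-get (mget) API to place all of those retrievals in a single request fetches them faster than requesting the documents one by one.

The mget API expects a docs array as its argument. Each element holds the metadata of one document to retrieve: _index, _type, and _id. To retrieve only one or more specific fields, name them in the _source parameter: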

curl -XGET 'http://localhost:9200/_mget' -d '
{
  "docs": [
    {
      "_index": "csdn",
      "_type": "blog",
      "_id": "1"
    },
    {
      "_index": "grade3",
      "_type": "class2",
      "_id": "1",
      "_source": ["name", "age"]
    }
  ]
}
'

_index: the index name

_type: the document type

_id: the document ID

_source: restricts the output to the listed fields
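As a rough illustration of how the docs array is put together, the following Python sketch builds the same request body from a list of tuples. The function name build_mget_body and the tuple layout are assumptions made for this example; they are not part of ElasticSearch or of any client library.

```python
import json


def build_mget_body(docs):
    """Return the JSON text of an mget request body.

    `docs` is a list of (index, type, id, source_fields) tuples.
    source_fields may be None to fetch the whole document.
    (Hypothetical helper for illustration only.)
    """
    entries = []
    for index, doc_type, doc_id, source_fields in docs:
        entry = {"_index": index, "_type": doc_type, "_id": doc_id}
        # Only add _source when specific fields were requested.
        if source_fields is not None:
            entry["_source"] = list(source_fields)
        entries.append(entry)
    return json.dumps({"docs": entries})


body = build_mget_body([
    ("csdn", "blog", "1", None),
    ("grade3", "class2", "1", ["name", "age"]),
])
print(body)
```

The resulting string can be sent as the body of the _mget request shown above.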

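If all the documents you want are in the same _index (or even the same _type), you can specify a default /_index or /_index/_type in the URL. Individual entries in docs can still override these defaults: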

curl -XGET 'http://localhost:9200/csdn/blog/_mget' -d '
{
  "docs": [
    {
      "_id": "1"
    },
    {
      "_index": "grade3",
      "_type": "class2",
      "_id": "1",
      "_source": ["name", "age"]
    }
  ]
}
'

If all the documents are in the same _index and the same _type, you only need to pass an ids array:

curl -XGET 'http://localhost:9200/csdn/blog/_mget' -d '
{
  "ids": ["7", "8"]
}
'
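The three request shapes (full metadata, URL defaults with overrides, and a bare ids array) can be chosen mechanically on the client side. The sketch below is one possible way to do that. compact_mget_body is a hypothetical helper, and it is deliberately conservative: it strips _index and _type from an entry only when both match the URL defaults.

```python
def compact_mget_body(docs, default_index, default_type):
    """Shrink an mget body for a URL of the form /default_index/default_type/_mget.

    `docs` is a list of dicts with "_index", "_type", "_id" and an optional
    "_source". Returns {"ids": [...]} when every entry needs nothing but its
    ID, otherwise {"docs": [...]}. (Hypothetical helper for illustration.)
    """
    compact = []
    for doc in docs:
        entry = dict(doc)  # copy so the caller's data is not modified
        if (entry.get("_index") == default_index
                and entry.get("_type") == default_type):
            # The URL already supplies these values.
            del entry["_index"]
            del entry["_type"]
        compact.append(entry)

    # If only IDs are left, the short "ids" form is enough.
    if all(set(entry) == {"_id"} for entry in compact):
        return {"ids": [entry["_id"] for entry in compact]}
    return {"docs": compact}


print(compact_mget_body(
    [
        {"_index": "csdn", "_type": "blog", "_id": "7"},
        {"_index": "csdn", "_type": "blog", "_id": "8"},
    ],
    "csdn",
    "blog",
))
```

Entries that point at another index or type, or that ask for specific _source fields, keep their metadata and force the docs form.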

The response looks like this:

{
  "docs" : [
    {
      "_index" : "csdn",
      "_type" : "blog",
      "_id" : "7",
      "_version" : 6,
      "found" : true,
      "_source" : {
        "name" : "python developer",
        "addr" : "广东省 深圳市",
        "count" : 2,
        "favorite" : [
          "music",
          "football"
        ]
      }
    },
    {
      "_index" : "csdn",
      "_type" : "blog",
      "_id" : "8",
      "found" : false
    }
  ]
}

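Note that the document with ID 8 was not found, but this does not prevent the document with ID 7 from being retrieved. As the response shows, a missing document is reported as {"found" : false}, and the documents are returned in the same order as the IDs in the request.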
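Because a missing document does not fail the whole request, the caller has to check the found flag of each entry. A minimal sketch of that check, assuming the response has already been parsed into a Python dict (split_mget_response is a hypothetical name, and the sample data is a shortened copy of the response above):

```python
def split_mget_response(response):
    """Separate an mget response into found documents and missing IDs.

    Returns (found, missing): `found` maps each ID to its _source,
    `missing` lists the IDs reported with "found": false.
    Keying by ID alone assumes all documents share one index and type.
    """
    found = {}
    missing = []
    for doc in response["docs"]:
        if doc.get("found"):
            found[doc["_id"]] = doc.get("_source")
        else:
            missing.append(doc["_id"])
    return found, missing


response = {
    "docs": [
        {"_index": "csdn", "_type": "blog", "_id": "7", "_version": 6,
         "found": True,
         "_source": {"name": "python developer", "count": 2}},
        {"_index": "csdn", "_type": "blog", "_id": "8", "found": False},
    ]
}

found, missing = split_mget_response(response)
print(sorted(found))  # ['7']
print(missing)        # ['8']
```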

Bulk Operations - bulk

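In the same way that the mget API retrieves several documents at once, the bulk API allows multiple create, index, update, and delete requests to be made in a single step. If you need to index a data stream such as log events, the events can be queued up and indexed in batches of hundreds or thousands.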
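Queueing events and sending them in fixed-size batches can be sketched in a few lines of Python. The generator below only groups items; it sends nothing to ElasticSearch. The name batches is an assumption made for this example, and the right batch size depends on the data and the cluster.

```python
from itertools import islice


def batches(events, size):
    """Yield lists of at most `size` items taken in order from `events`.

    Hypothetical helper: each yielded list would become one bulk request.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# 2500 fake log events grouped into batches of up to 1000.
log_events = ({"message": "event %d" % n} for n in range(2500))
print([len(batch) for batch in batches(log_events, 1000)])  # [1000, 1000, 500]
```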

The basic bulk format is:

{action:{metadata}}\n
{request body}\n
{action:{metadata}}\n
{request body}\n
...

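This format is like a stream of valid one-line JSON documents joined together by newline characters (\n). Two points to note:

Every line must end with \n, including the last one. The newline is the marker that separates one line from the next.

No line may contain an unescaped newline character, because that would interfere with parsing.

{action:{metadata}}: the action to perform and the document it applies to

{request body}: the document content or update payload (not used by delete)

action must be one of the following:

create: create the document only if it does not already exist

index: create a new document or replace an existing one

update: partially update a document

delete: delete a document

For example, a complete update request looks like this: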

{"update":{"_index":"csdn","_type":"blog","_id":"1","_retry_on_conflict":5}}
{"doc":{"title":"测试"}}

The request above updates the title field. Note that a delete request has no request body, so a complete delete request is just:

{"delete":{"_index":"csdn","_type":"blog","_id":"1"}}
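The format rules above (one JSON document per line, a trailing \n on every line including the last, and no body after delete) are easy to get wrong by hand. The following Python sketch serializes a list of operations under those rules. build_bulk_body and its tuple layout are assumptions made for this example, not an ElasticSearch API.

```python
import json

VALID_ACTIONS = ("create", "index", "update", "delete")


def build_bulk_body(operations):
    """Serialize (action, metadata, body) tuples into a bulk request body.

    `body` must be None for delete and a dict for every other action.
    (Hypothetical helper for illustration only.)
    """
    lines = []
    for action, metadata, body in operations:
        if action not in VALID_ACTIONS:
            raise ValueError("unknown action: %r" % (action,))
        if action == "delete" and body is not None:
            raise ValueError("delete must not have a request body")
        if action != "delete" and body is None:
            raise ValueError("%s requires a request body" % action)
        # json.dumps without indent emits a single line and escapes any
        # newline inside a string value as \n.
        lines.append(json.dumps({action: metadata}))
        if body is not None:
            lines.append(json.dumps(body))
    # Every line ends with a newline, including the last one.
    return "".join(line + "\n" for line in lines)


body = build_bulk_body([
    ("update", {"_index": "csdn", "_type": "blog", "_id": "1"},
     {"doc": {"title": "test"}}),
    ("delete", {"_index": "csdn", "_type": "blog", "_id": "1"}, None),
])
print(body, end="")
```

Rejecting a delete that carries a body on the client side avoids sending a request in which that body line would be misread as the next action.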

Putting all of these together, a complete _bulk request looks like this:

curl -XPOST 'http://localhost:9200/_bulk' -d '
{"delete":{"_index":"csdn","_type":"blog","_id":"1"}}
{"create":{"_index":"csdn","_type":"blog","_id":"1"}}
{"title":"测试"}
{"index":{"_index":"csdn","_type":"blog"}}
{"title":"index测试"}
{"update":{"_index":"csdn","_type":"blog","_id":"1","_retry_on_conflict":5}}
{"doc":{"title":"测试1"}}
'

It returns the following result:

{
  "took" : 46,
  "errors" : false,
  "items" : [
    {
      "delete" : {
        "found" : true,
        "_index" : "csdn",
        "_type" : "blog",
        "_id" : "1",
        "_version" : 15,
        "result" : "deleted",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "status" : 200
      }
    },
    {
      "create" : {
        "_index" : "csdn",
        "_type" : "blog",
        "_id" : "1",
        "_version" : 16,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "created" : true,
        "status" : 201
      }
    },
    {
      "index" : {
        "_index" : "csdn",
        "_type" : "blog",
        "_id" : "AV8zdG7wEgThxiHZqqaM",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "created" : true,
        "status" : 201
      }
    },
    {
      "update" : {
        "_index" : "csdn",
        "_type" : "blog",
        "_id" : "1",
        "_version" : 17,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "status" : 200
      }
    }
  ]
}

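The response contains {"errors" : false}, which means every request completed successfully. If any request fails, the response contains {"errors" : true} instead, and the failure details appear in the entry for the corresponding request.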
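Since a bulk response reports success or failure per item, a caller usually checks the top-level errors flag first and then walks items. The sketch below assumes the response is already parsed into a dict and that a failed item carries an error entry next to its status. failed_items is a hypothetical helper, and the sample response is a shortened copy of the one above.

```python
def failed_items(response):
    """Return (position, action, id, status, error) for each failed item.

    `position` is the item's index in the response, which matches the
    order of the actions in the request. (Hypothetical helper.)
    """
    if not response.get("errors"):
        return []
    failures = []
    for position, item in enumerate(response["items"]):
        # Each item is a one-key dict: {"<action>": {...result...}}.
        (action, result), = item.items()
        if "error" in result:
            failures.append((position, action, result.get("_id"),
                             result.get("status"), result["error"]))
    return failures


response = {
    "took": 46,
    "errors": False,
    "items": [
        {"delete": {"_index": "csdn", "_type": "blog", "_id": "1", "status": 200}},
        {"create": {"_index": "csdn", "_type": "blog", "_id": "1", "status": 201}},
    ],
}
print(failed_items(response))  # []
```

Only the failed actions need to be retried or reported; the items that succeeded have already been applied.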

Two rules bear repeating: a delete action must not be followed by a request body, and the last line must also end with a newline.

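If you are indexing data into the same index and type, repeating the same metadata for every document is wasteful. Like the _mget API, the _bulk API accepts a default /_index or /_index/_type in the URL, and individual actions can still override it: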

curl -XPOST 'http://localhost:9200/csdn/blog/_bulk' -d '
{"delete":{"_id":"1"}}
{"create":{"_id":"1"}}
{"title":"测试"}
{"index":{"_index":"grade3","_type":"class2"}}
{"title":"index测试"}
'

Lightweight Search Across Multiple Indices and Types

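The previous post, ElasticSearch - Introduction, covered lightweight search and showed how to run simple searches through the URL, but only within a single index and a single type. In many cases we want to search across several indices and several types, which can also be done by naming the indices and types in the URL: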

Search in the csdn and grade3 indices:

curl -XGET 'http://localhost:9200/csdn,grade3/_search'

Search in all indices whose names begin with c or g:

curl -XGET 'http://localhost:9200/c*,g*/_search'

Search the blog and class2 types in the csdn and grade3 indices:

curl -XGET 'http://localhost:9200/csdn,grade3/blog,class2/_search'

Search in all indices:

curl -XGET 'http://localhost:9200/_all/_search'
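The four URLs above follow one pattern: a comma-separated list of indices (or _all), an optional comma-separated list of types, then _search. A small Python sketch of that pattern follows; search_path is a hypothetical helper that only builds the path and does not contact ElasticSearch.

```python
def search_path(indices=(), types=()):
    """Build the URL path for a search over the given indices and types.

    An empty `indices` means every index (_all). Index names may contain
    wildcards such as "c*". (Hypothetical helper for illustration only.)
    """
    index_part = ",".join(indices) if indices else "_all"
    parts = [index_part]
    if types:
        parts.append(",".join(types))
    parts.append("_search")
    return "/" + "/".join(parts)


print(search_path(["csdn", "grade3"]))                     # /csdn,grade3/_search
print(search_path(["c*", "g*"]))                           # /c*,g*/_search
print(search_path(["csdn", "grade3"], ["blog", "class2"]))  # /csdn,grade3/blog,class2/_search
print(search_path())                                       # /_all/_search
```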
