
Test usage: bulk-adding data to an ES index, plus a summary of ES usage.

2017-07-25 00:02

# encoding:utf8
from datetime import datetime
from elasticsearch import Elasticsearch
import elasticsearch.helpers
import random

# Connect to the cluster; the client spreads requests across these nodes.
es = Elasticsearch(['172.18.1.22:9200', '172.18.1.23:9200', '172.18.1.24:9200', '172.18.1.25:9200', '172.18.1.26:9200'])

# Create the index; ignore=400 suppresses the error if it already exists.
es.indices.create(index='test_index', ignore=400)
#es.index(index="skynet_social_twitter_v6", doc_type="test-type", id=42, body={"any": "data", "timestamp": datetime.now()})

# Build 10 test documents, each with a timestamp and a random count.
package = []
for i in range(10):
    row = {
        "@timestamp": datetime.now().strftime("%Y-%m-%dT%H:%M:%S.000+0800"),
        "count": random.randint(1, 100)
    }
    package.append(row)

# Wrap each document in a bulk "index" action.
actions = [
    {
        '_op_type': 'index',
        '_index': "test_index",
        '_type': "test-type",
        '_source': d
    }
    for d in package
]

# Send all actions to the cluster in one bulk request.
elasticsearch.helpers.bulk(es, actions)

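The notes below are an ES usage summary compiled from someone else's blog.

Give an index an alias, so that users only need to be told the alias.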

curl -XPOST 'http://172.18.1.22:9200/_aliases' -d '
{
"actions": [
{"add": {"index": "info-test", "alias": "wyl"}}
]
}'

Remove an alias:

curl -XPOST 'http://localhost:9200/_aliases' -d '
{
"actions": [
{"remove": {"index": "test1", "alias": "alias1"}}
]
}'

Renaming an alias is simply a remove followed by an add, sent through the same API in a single request. The operation is atomic.

Rename:

curl -XPOST 'http://localhost:9200/_aliases' -d '
{
"actions": [
{"remove": {"index": "test1", "alias": "alias1"}},
{"add": {"index":"test1", "alias": "alias2"}}
]
}'

Associate one alias with multiple indices:

curl -XPOST 'http://localhost:9200/_aliases' -d '
{
"actions": [
{"add": {"index": "test1", "alias":"alias1"}},
{"add": {"index": "test2", "alias":"alias1"}}
]
}'

Indexing data through an alias that points to more than one index raises an error.
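To make the atomic rename above a bit more concrete, here is a small Python sketch that only builds the `_aliases` request body and prints it; nothing is sent to a cluster. The helper name `rename_alias_body` is made up for this illustration and is not part of any client library.

```python
import json


def rename_alias_body(index, old_alias, new_alias):
    """Build a POST /_aliases body that renames an alias in one request.

    Hypothetical helper: both actions travel in the same request, which is
    what makes the rename atomic.
    """
    return {
        "actions": [
            {"remove": {"index": index, "alias": old_alias}},
            {"add": {"index": index, "alias": new_alias}},
        ]
    }


body = rename_alias_body("test1", "alias1", "alias2")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after `-d` in the curl commands above.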

1. List all nodes in the cluster

http://172.24.5.149:9200/_cat/nodes?v

2. Check the health of the cluster

http://172.24.5.149:9200/_cat/health?v

3. List all indices in the cluster

http://172.24.5.149:9200/_cat/indices?v

4. Delete the info-test index

curl -XDELETE 'http://172.24.5.149:9200/info-test'

5. Create the info-test index

curl -XPUT 'http://172.24.5.149:9200/info-test'

6. Insert a document with ID 1 into the index

curl -XPUT 'localhost:9200/info-test/people/1?pretty' -d '
{
"name": "John Doe"
}'

7. Insert a document without specifying an ID; ES generates a random ID:

curl -XPOST 'localhost:9200/info-test/people?pretty' -d '
{
"name": "John Doe"
}'

8. Get a document by ID

curl -XGET 'localhost:9200/info-test/people/1?pretty'

9. Update the document with ID 1, changing the name field to Jane Doe

curl -XPOST 'localhost:9200/info-test/people/1/_update?pretty' -d '
{
"doc": { "name": "Jane Doe" }
}'

10. Update the document with ID 1, changing the name field to Jane Doe and adding an age field

curl -XPOST 'localhost:9200/info-test/people/1/_update?pretty' -d '
{
"doc": { "name": "Jane Doe", "age": 20 }
}'

11. Use a script to add 5 to the age of the document with ID 1

curl -XPOST 'localhost:9200/info-test/people/1/_update?pretty' -d '
{
"script" : "ctx._source.age += 5"
}'

In the example above, ctx._source refers to the document that is being updated.

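12. Delete the document with ID 2

curl -XDELETE 'localhost:9200/info-test/people/2?pretty'

A timeout can be set:

curl -XDELETE 'http://localhost:9200/twitter/tweet/1?timeout=5m'

13. Delete all documents whose name contains "John" (this DELETE _query form is the ES 1.x API; item 45 shows _delete_by_query)

curl -XDELETE 'localhost:9200/info-test/people/_query?pretty' -d '
{
"query": { "match": { "name": "John" } }
}'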

14. Bulk-insert documents with ID 1 and ID 2

curl -XPOST 'localhost:9200/info-test/people/_bulk?pretty' -d '
{"index":{"_id":"1"}}
{"name": "John Doe" }
{"index":{"_id":"2"}}
{"name": "Jane Doe" }
'

15. In one bulk request, update the document with ID 1 and delete the document with ID 2

curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' -d '
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}
'
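The bulk body is newline-delimited JSON: one action line, optionally followed by one source line, and the whole payload has to end with a newline. The Python sketch below shows one way to assemble such a payload from plain dicts. The function name `bulk_body` is invented for this example, and the payload is only printed, not sent.

```python
import json


def bulk_body(operations):
    """Serialize (action, source) pairs into newline-delimited JSON.

    `source` is None for actions such as delete that carry no document.
    Hypothetical helper for illustration only.
    """
    lines = []
    for action, source in operations:
        lines.append(json.dumps(action))
        if source is not None:
            lines.append(json.dumps(source))
    # The bulk API expects the payload to end with a newline.
    return "\n".join(lines) + "\n"


payload = bulk_body([
    ({"update": {"_id": "1"}}, {"doc": {"name": "John Doe becomes Jane Doe"}}),
    ({"delete": {"_id": "2"}}, None),
])
print(payload)
```

Because each JSON object sits on its own line, pretty-printed (multi-line) JSON cannot be used inside a bulk body.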

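16. Search all documents in the info-test index

curl 'localhost:9200/info-test/_search?q=*'

17. Search all documents in the info-test index using a POST request body

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"query": { "match_all": {} }
}'

18. Search all documents in the info-test index using a POST request body, but return only one document (10 are returned by default)

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"query": { "match_all": {} },
"size": 1
}'

19. Search all documents in the info-test index using a POST request body and return documents 11 through 20

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"query": { "match_all": {} },
"from": 10,
"size": 10
}'

If from is not specified, it defaults to 0.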
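Since from is a zero-based offset, turning a page number into from/size is a one-line calculation. A minimal Python sketch, assuming 1-based page numbers; `page_query` is a hypothetical helper name and the body is only built, not sent:

```python
def page_query(page, page_size=10):
    """Return a match_all search body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {
        "query": {"match_all": {}},
        # "from" is a zero-based offset, so page 1 starts at 0.
        "from": (page - 1) * page_size,
        "size": page_size,
    }


second_page = page_query(2)
print(second_page)
```

Page 2 with the default page size gives the same from/size pair as item 19.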

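20. Search all documents in the info-test index using a POST request body, sorted by the name field in descending order

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"query": { "match_all": {} },
"sort": { "name": { "order": "desc" } }
}'

21. Search all documents in the info-test index using a POST request body, but return only some of the fields

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"query": { "match_all": {} },
"_source": ["age", "name"]
}'

22. Search the info-test index using a POST request body for documents whose age is 20

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"query": { "match": { "age": 20 } }
}'

23. Search the info-test index using a POST request body for documents whose address contains "mill lane" (here "mill lane" is treated as a phrase)

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"query": { "match_phrase": { "address": "mill lane" } }
}'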

24. Search the info-test index using a POST request body for documents whose address contains both "mill" and "lane"

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"query": {
"bool": {
"must": [
{ "match": { "address": "mill" } },
{ "match": { "address": "lane" } }
]
}
}
}'

must: AND. should: OR. must_not: NOT.
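The three clause types can be mixed in one bool query. The Python sketch below builds such bodies from small helpers so the AND / OR / NOT correspondence is easy to see. The names `match` and `bool_query` are made up for this example, and nothing is sent to a cluster.

```python
def match(field, text):
    """Build a single match clause."""
    return {"match": {field: text}}


def bool_query(must=(), should=(), must_not=()):
    """Combine clauses: must = AND, should = OR, must_not = NOT."""
    clauses = {}
    if must:
        clauses["must"] = list(must)
    if should:
        clauses["should"] = list(should)
    if must_not:
        clauses["must_not"] = list(must_not)
    return {"query": {"bool": clauses}}


# address contains "mill" AND "lane"
both = bool_query(must=[match("address", "mill"), match("address", "lane")])
# address contains "mill" OR "lane"
either = bool_query(should=[match("address", "mill"), match("address", "lane")])
# address contains neither "mill" nor "lane"
neither = bool_query(must_not=[match("address", "mill"), match("address", "lane")])
print(both)
```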

25. Search the info-test index using a POST request body for documents whose balance is greater than or equal to 20000 and less than or equal to 30000

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"query": {
"filtered": {
"query": { "match_all": {} },
"filter": {
"range": {
"balance": {
"gte": 20000,
"lte": 30000
}
}
}
}
}
}'
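The filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0. On newer clusters the same range condition is usually written as a bool query with a filter clause. The Python sketch below builds that equivalent body; `range_filter_query` is a hypothetical helper name, and the body is only constructed, not sent.

```python
def range_filter_query(field, gte, lte):
    """Build a bool query that filters on an inclusive numeric range."""
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                # filter clauses restrict results without affecting scoring
                "filter": {"range": {field: {"gte": gte, "lte": lte}}},
            }
        }
    }


body = range_filter_query("balance", 20000, 30000)
print(body)
```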

26. Search the documents in the info-test index using a POST request body and group them by the state field

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"size": 0,
"aggs": {
"group_by_state": {
"terms": {
"field": "state"
}
}
}
}'

The response (partial) is:

"hits" : {
"total" : 1000,
"max_score" : 0.0,
"hits" : [ ]
},
"aggregations" : {
"group_by_state" : {
"buckets" : [ {
"key" : "al",
"doc_count" : 21
}, {
"key" : "tx",
"doc_count" : 17
}, {
"key" : "id",
"doc_count" : 15
}, {
"key" : "ma",
"doc_count" : 15
}, {
"key" : "md",
"doc_count" : 15
}, {
"key" : "pa",
"doc_count" : 15
}, {
"key" : "dc",
"doc_count" : 14
}, {
"key" : "me",
"doc_count" : 14
}, {
"key" : "mo",
"doc_count" : 14
}, {
"key" : "nd",
"doc_count" : 14
} ]
}
}
}
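On the client side, the buckets of a terms aggregation are just a list of key / doc_count pairs. A small Python sketch that flattens them into a dict follows. The `sample` response is a trimmed copy of the first two buckets shown above, and `bucket_counts` is a name invented for this example.

```python
def bucket_counts(response, agg_name):
    """Map each bucket key of a terms aggregation to its doc_count."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Trimmed copy of the response above (first two buckets only).
sample = {
    "aggregations": {
        "group_by_state": {
            "buckets": [
                {"key": "al", "doc_count": 21},
                {"key": "tx", "doc_count": 17},
            ]
        }
    }
}

counts = bucket_counts(sample, "group_by_state")
print(counts)
```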

27. Building on the previous aggregation, this example computes the average account balance for each state

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"size": 0,
"aggs": {
"group_by_state": {
"terms": {
"field": "state"
},
"aggs": {
"average_balance": {
"avg": {
"field": "balance"
}
}
}
}
}
}'

28. Building on the previous aggregation, now sort by the average balance:

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"size": 0,
"aggs": {
"group_by_state": {
"terms": {
"field": "state",
"order": {
"average_balance": "desc"
}
},
"aggs": {
"average_balance": {
"avg": {
"field": "balance"
}
}
}
}
}
}'

29. Group by age bracket (20-29, 30-39, 40-49), then by gender, and compute the average account balance for each gender within each age bracket:

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"size": 0,
"aggs": {
"group_by_age": {
"range": {
"field": "age",
"ranges": [
{
"from": 20,
"to": 30
},
{
"from": 30,
"to": 40
},
{
"from": 40,
"to": 50
}
]
},
"aggs": {
"group_by_gender": {
"terms": {
"field": "gender"
},
"aggs": {
"average_balance": {
"avg": {
"field": "balance"
}
}
}
}
}
}
}
}'
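In a range aggregation, from is inclusive and to is exclusive, which is why the 20-29 bracket is written as from 20 to 30. The Python sketch below generates the same ranges list and mimics that bucketing rule locally. Both helper names are hypothetical and no request is made.

```python
def decade_ranges(start, stop, step=10):
    """Build the "ranges" list for brackets [start, start+step), ..."""
    return [{"from": lo, "to": lo + step} for lo in range(start, stop, step)]


def bucket_for(age, ranges):
    """Return the range an age falls into: from inclusive, to exclusive."""
    for r in ranges:
        if r["from"] <= age < r["to"]:
            return r
    return None


ranges = decade_ranges(20, 50)
print(ranges)
print(bucket_for(30, ranges))
```

An age of exactly 30 lands in the 30-39 bracket, and an age of 50 falls outside all three.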

30. Add a new field to an existing mapping

POST /information/_mapping/email1
{
"properties": {
"name": {
"type": "text",
"index": "analyzed"
}
}
}

31. Update the settings of an index

PUT /atom/_settings
{
"settings": {

"index.mapping.total_fields.limit": 4000

},
"index": {
"refresh_interval": "30s",
"number_of_replicas":"0"
}
}

32. View the mapping of a given type (if no type is given, the mappings of all types under the index are shown)

GET /atom/_mapping/人类

33. Conditional update with _update_by_query

POST /index/type/_update_by_query?conflicts=proceed
{
"script": {
"inline": "ctx._source.ontology_type=(params.tag)",
"lang": "painless",
"params": {
"tag": "event"
}
},
"query": {
"match_all": {}
}
}

34. Query all data under a given type

POST /atom/欧洲排球锦标赛/_search
{
"query": {
"match_all": {}
}
}

35. Create a document with a version number

PUT twitter/tweet/1?version=2
{
"message" : "elasticsearch now has versioning support, double cool!"
}

Version types: internal, external (or external_gt), external_gte
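The version types differ in how the version supplied with the request is compared against the stored one. The Python sketch below models that check locally, based on my reading of the documented behaviour for this era of Elasticsearch: internal requires an exact match, external requires a strictly greater version, external_gte allows an equal one. The function name and the use of None for a missing document are assumptions made for this example; it does not talk to a cluster.

```python
def version_accepted(version_type, stored, given):
    """Return True if a write carrying `given` would pass the version check.

    `stored` is the current document version, or None if the document
    does not exist yet. Simplified local model, not the server logic.
    """
    if version_type == "internal":
        # The supplied version must match the stored version exactly.
        return stored is not None and given == stored
    if version_type in ("external", "external_gt"):
        return stored is None or given > stored
    if version_type == "external_gte":
        return stored is None or given >= stored
    raise ValueError("unknown version type: %s" % version_type)


print(version_accepted("internal", 2, 2))
print(version_accepted("external", 2, 2))
print(version_accepted("external_gte", 2, 2))
```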

36. Create a document with the op_type parameter

PUT twitter/tweet/1?op_type=create
{
"user" : "kimchy",
"post_date" : "2011-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}

Or

PUT twitter/tweet/1/_create
{
"user" : "kimchy",
"post_date" : "2011-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}

37. Create a document with an automatically generated ID

POST twitter/tweet/
{
"user" : "kimchy",
"post_date" : "2009-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}

38. Create a document with a routing value

POST twitter/tweet?routing=kimchy
{
"user" : "kimchy",
"post_date" : "2011-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}

39. Set a timeout when creating a document

PUT twitter/tweet/1?timeout=5m
{
"user" : "kimchy",
"post_date" : "2011-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}

40. Fetch a document without the _source field

GET twitter/tweet/0?_source=false

41. Select which _source fields to return when fetching a document

GET twitter/tweet/0?_source_include=*.id&_source_exclude=entities

Or

GET twitter/tweet/0?_source=*.id,retweeted

42. Fetch only the _source of a document

GET twitter/tweet/1/_source

A subset of the _source fields can also be selected:

GET twitter/tweet/1/_source?_source_include=*.id&_source_exclude=entities

43. Custom routing

GET twitter/tweet/2?routing=user1

If a routing value was specified when the document was created, the same routing value must be supplied when fetching it.
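The reason is that the routing value decides which shard holds the document: the shard number is a hash of the routing value modulo the number of primary shards, and the routing value defaults to the document ID. The Python sketch below imitates that rule with CRC32 as a stand-in hash. Elasticsearch uses its own Murmur3-based hash, so the shard numbers printed here will not match a real cluster; `pick_shard` is a hypothetical name.

```python
import zlib


def pick_shard(routing, num_primary_shards):
    """Toy model of shard selection: hash(routing) % number of primaries.

    CRC32 is only a stand-in; Elasticsearch uses a Murmur3-based hash.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


# Written with ?routing=user1, so the shard is derived from "user1".
shard_on_write = pick_shard("user1", 5)
# Read with the same routing value: same shard, document is found.
shard_routed_read = pick_shard("user1", 5)
# Read without routing: the document ID "2" is hashed instead, which may
# point at a different shard where the document does not live.
shard_default_read = pick_shard("2", 5)

print(shard_on_write, shard_routed_read, shard_default_read)
```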

44. Create a mapping for a given type

POST /information/_mapping/email1
{
"properties": {
"name": {
"type": "text",
"index": "analyzed"
}
}
}

45. delete_by_query

POST atom_v3/news/_delete_by_query?conflicts=proceed
{
"query": {
"match": {
"docType": "news"
}
}
}

46. Force-merge the segments of an index

POST atom_v3/_forcemerge?max_num_segments=5

47. View the segments of an index

http://172.24.8.83:9200/atom_v3/_segments

Or

http://172.24.8.83:9200/_cat/segments/atom_v3

48. Create the mapping together with the index

PUT my_index
{
"mappings": {
"user": {
"_all": {
"enabled": false
},
"properties": {
"title": {
"type": "text"
},
"name": {
"type": "text"
},
"age": {
"type": "integer"
}
}
},
"blogpost": {
"_all": {
"enabled": false
},
"properties": {
"title": {
"type": "text"
},
"body": {
"type": "text"
},
"user_id": {
"type": "keyword"
},
"created": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
}
}
}
}
}

49. reindex: copy data from one index into another

POST _reindex
{
"source": {
"index": "twitter"
},
"dest": {
"index": "new_twitter"
}
}
