
Test script: bulk-adding data to an ES index, plus a summary of ES usage.

2017-07-25 00:02
# encoding:utf8
from datetime import datetime
from elasticsearch import Elasticsearch
import elasticsearch.helpers
import random

# Connect to the cluster through any of the listed nodes.
es = Elasticsearch(['172.18.1.22:9200', '172.18.1.23:9200', '172.18.1.24:9200', '172.18.1.25:9200', '172.18.1.26:9200'])

# Create the index; ignore=400 suppresses the error raised when it already exists.
es.indices.create(index='test_index', ignore=400)
#es.index(index="skynet_social_twitter_v6", doc_type="test-type", id=42, body={"any": "data", "timestamp": datetime.now()})

# Build 10 test documents, each with a timestamp and a random count.
package = []
for i in range(10):
    row = {
        "@timestamp": datetime.now().strftime("%Y-%m-%dT%H:%M:%S.000+0800"),
        "count": random.randint(1, 100)
    }
    package.append(row)

# Wrap each document in a bulk "index" action.
actions = [
    {
        '_op_type': 'index',
        '_index': "test_index",
        '_type': "test-type",
        '_source': d
    }
    for d in package
]

# Send all actions in a single bulk request.
elasticsearch.helpers.bulk(es, actions)
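
For larger batches it may be nicer to produce the actions lazily instead of building two full lists first, since `elasticsearch.helpers.bulk` accepts any iterable of actions. The sketch below is one way to do that; `build_actions` and its parameters are hypothetical names that are not part of the script above.

```python
import random
from datetime import datetime


def build_actions(n, index="test_index", doc_type="test-type"):
    """Yield n bulk 'index' actions shaped like the ones in the script above."""
    for _ in range(n):
        yield {
            "_op_type": "index",
            "_index": index,
            "_type": doc_type,
            "_source": {
                "@timestamp": datetime.now().strftime("%Y-%m-%dT%H:%M:%S.000+0800"),
                "count": random.randint(1, 100),
            },
        }


# Assumed usage (needs a reachable cluster, so it is left commented out):
# elasticsearch.helpers.bulk(es, build_actions(10))
```

Because the generator is consumed as the helper sends its chunks, memory use stays flat no matter how many rows are generated.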


Summarized from another blogger's post: a summary of ES usage

Give the index an alias, so that users only need to be told the alias.

curl -XPOST 'http://172.18.1.22:9200/_aliases' -d '
{
"actions": [
{"add": {"index": "info-test", "alias": "wyl"}}
]
}'


Remove an alias:

curl -XPOST 'http://localhost:9200/_aliases' -d '
{
"actions": [
{"remove": {"index": "test1", "alias": "alias1"}}
]
}'


Renaming an alias is simply a remove followed by an add in the same API call. The operation is atomic.

Rename:

curl -XPOST 'http://localhost:9200/_aliases' -d '
{
"actions": [
{"remove": {"index": "test1", "alias": "alias1"}},
{"add": {"index":"test1", "alias": "alias2"}}
]
}'


Associate one alias with multiple indices:

curl -XPOST 'http://localhost:9200/_aliases' -d '
{
"actions": [
{"add": {"index": "test1", "alias":"alias1"}},
{"add": {"index": "test2", "alias":"alias1"}}
]
}'


Indexing a document through an alias that points to more than one index raises an error.
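
The alias requests above all share one body shape, so they are easy to generate from a script. Here is a minimal sketch of the rename case; `rename_alias_body` is a hypothetical helper name, and the resulting JSON is meant to be sent as the body of `POST /_aliases`.

```python
import json


def rename_alias_body(index, old_alias, new_alias):
    """Build the _aliases request body that swaps old_alias for new_alias."""
    return {
        "actions": [
            {"remove": {"index": index, "alias": old_alias}},
            {"add": {"index": index, "alias": new_alias}},
        ]
    }


body = rename_alias_body("test1", "alias1", "alias2")
print(json.dumps(body, indent=2))
```

Keeping both actions in a single request is what makes the rename atomic; sending them as two separate requests would leave a moment where neither alias exists.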

1. List all nodes in the cluster

http://172.24.5.149:9200/_cat/nodes?v


2. Check the health of the cluster

http://172.24.5.149:9200/_cat/health?v


3. List all indices in the cluster

http://172.24.5.149:9200/_cat/indices?v


4. Delete the info-test index

curl -XDELETE 'http://172.24.5.149:9200/info-test'


5. Create the info-test index

curl -XPUT 'http://172.24.5.149:9200/info-test'


6. Insert a document with ID 1 into the index

curl -XPUT 'localhost:9200/info-test/people/1?pretty' -d '
{
"name": "John Doe"
}'


7. Insert a document without an ID; ES generates a random ID for it:

curl -XPOST 'localhost:9200/info-test/people?pretty' -d '
{
"name": "John Doe"
}'


8. Get a document by ID

curl -XGET 'localhost:9200/info-test/people/1?pretty'


9. Update the document with ID 1, changing the name field to Jane Doe

curl -XPOST 'localhost:9200/info-test/people/1/_update?pretty' -d '
{
"doc": { "name": "Jane Doe" }
}'


10. Update the document with ID 1, changing the name field to Jane Doe and adding an age field

curl -XPOST 'localhost:9200/info-test/people/1/_update?pretty' -d '
{
"doc": { "name": "Jane Doe", "age": 20 }
}'


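11. Use a script to add 5 to the age field of the document with ID 1

curl -XPOST 'localhost:9200/info-test/people/1/_update?pretty' -d '
{
"script" : "ctx._source.age += 5"
}'


In the example above, ctx._source refers to the document that is being updated.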


12. Delete the document with ID 2

curl -XDELETE 'localhost:9200/info-test/people/2?pretty'
A timeout can also be set:
curl -XDELETE 'http://localhost:9200/twitter/tweet/1?timeout=5m'


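13. Delete all documents whose name contains "John" (this DELETE .../_query form is the Elasticsearch 1.x API; item 45 uses _delete_by_query)

curl -XDELETE 'localhost:9200/info-test/people/_query?pretty' -d '
{
"query": { "match": { "name": "John" } }
}'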


14. Bulk-insert the documents with ID 1 and ID 2

curl -XPOST 'localhost:9200/info-test/people/_bulk?pretty' -d '
{"index":{"_id":"1"}}
{"name": "John Doe" }
{"index":{"_id":"2"}}
{"name": "Jane Doe" }
'


15. In one bulk request, update the document with ID 1 and delete the document with ID 2

curl -XPOST 'localhost:9200/customer/external/_bulk?pretty' -d '
{"update":{"_id":"1"}}
{"doc": { "name": "John Doe becomes Jane Doe" } }
{"delete":{"_id":"2"}}
'
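
The bulk body is newline-delimited JSON: one action line, optionally followed by one source line, and every line (including the last) must end with a newline. The sketch below shows one way to build such a payload from Python; `to_bulk_payload` is a hypothetical helper name, and the sample operations mirror the request in item 15.

```python
import json


def to_bulk_payload(operations):
    """Serialize (action, source) pairs into the newline-delimited bulk format.

    source is None for actions such as delete that carry no document line.
    """
    lines = []
    for action, source in operations:
        lines.append(json.dumps(action))
        if source is not None:
            lines.append(json.dumps(source))
    # Every line, including the last one, has to be terminated by "\n".
    return "\n".join(lines) + "\n"


payload = to_bulk_payload([
    ({"update": {"_id": "1"}}, {"doc": {"name": "John Doe becomes Jane Doe"}}),
    ({"delete": {"_id": "2"}}, None),
])
print(payload)
```

Since `json.dumps` never emits raw newlines with its default settings, each action and each document is guaranteed to stay on a single line.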


16. Search all documents in the info-test index

curl 'localhost:9200/info-test/_search?q=*'


17. Search all documents in the info-test index using a POST request body

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"query": { "match_all": {} }
}'


18. Search all documents in the info-test index using a POST request body, but return only one document (10 are returned by default)

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"query": { "match_all": {} },
"size": 1
}'


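19. Search all documents in the info-test index using a POST request body and return documents 11 through 20

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"query": { "match_all": {} },
"from": 10,
"size": 10
}'


If from is not specified, it defaults to 0.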
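
Since from is a zero-based offset, documents 11 through 20 correspond to from 10 with size 10. A small sketch of turning a page number into these two values follows; `page_body` is a hypothetical helper name and pages are assumed to be counted from 1.

```python
def page_body(page, page_size=10, query=None):
    """Return a search body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {
        "query": query if query is not None else {"match_all": {}},
        "from": (page - 1) * page_size,  # number of hits to skip
        "size": page_size,               # number of hits to return
    }


second_page = page_body(2)
```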


20. Search all documents in the info-test index using a POST request body and sort them by the name field in descending order

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"query": { "match_all": {} },
"sort": { "name": { "order": "desc" } }
}'


21. Search all documents in the info-test index using a POST request body, but return only some of the fields

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"query": { "match_all": {} },
"_source": ["age", "name"]
}'


22. Search the info-test index for documents whose age field is 20, using a POST request body

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"query": { "match": { "age": 20 } }
}'


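23. Search the info-test index for documents whose address field contains the phrase mill lane, using a POST request body (mill lane is matched as one phrase)

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"query": { "match_phrase": { "address": "mill lane" } }
}'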


24. Search the info-test index for documents whose address field contains both "mill" and "lane", using a POST request body

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"query": {
"bool": {
"must": [
{ "match": { "address": "mill" } },
{ "match": { "address": "lane" } }
]
}
}
}'
must: AND. should: OR. must_not: NOT.
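
To make the three clause types concrete, here is a rough sketch that assembles a bool query from lists of terms on one field. `bool_query` is a hypothetical helper name, and only match clauses are generated, which is an assumption made to keep the example short.

```python
def bool_query(field, must=(), should=(), must_not=()):
    """Build a bool query made of match clauses on a single field."""
    def clauses(terms):
        return [{"match": {field: term}} for term in terms]

    body = {}
    for name, terms in (("must", must), ("should", should), ("must_not", must_not)):
        if terms:  # leave out clause types that were not requested
            body[name] = clauses(terms)
    return {"query": {"bool": body}}


# Same request body as item 24: address must contain both words.
both_words = bool_query("address", must=["mill", "lane"])
```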


25. Search the info-test index for documents whose balance is greater than or equal to 20000 and less than or equal to 30000, using a POST request body

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"query": {
"filtered": {
"query": { "match_all": {} },
"filter": {
"range": {
"balance": {
"gte": 20000,
"lte": 30000
}
}
}
}
}
}'
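
Note that the `filtered` query was deprecated in Elasticsearch 2.0 and removed in 5.0. On newer clusters the same restriction is usually written as a `bool` query with a `filter` clause. The sketch below builds that equivalent body; `range_filter_body` is a hypothetical helper name.

```python
def range_filter_body(field, gte, lte):
    """Build a bool/filter search body selecting gte <= field <= lte."""
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                # Filter clauses restrict the hits without affecting the score.
                "filter": {"range": {field: {"gte": gte, "lte": lte}}},
            }
        }
    }


balance_query = range_filter_body("balance", 20000, 30000)
```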


26. Search the info-test index using a POST request body and group the documents by the state field

curl -XPOST 'localhost:9200/info-test/_search?pretty' -d '
{
"size": 0,
"aggs": {
"group_by_state": {
"terms": {
"field": "state"
}
}
}
}'


The response (part of it) is:

"hits" : {
"total" : 1000,
"max_score" : 0.0,
"hits" : [ ]
},
"aggregations" : {
"group_by_state" : {
"buckets" : [ {
"key" : "al",
"doc_count" : 21
}, {
"key" : "tx",
"doc_count" : 17
}, {
"key" : "id",
"doc_count" : 15
}, {
"key" : "ma",
"doc_count" : 15
}, {
"key" : "md",
"doc_count" : 15
}, {
"key" : "pa",
"doc_count" : 15
}, {
"key" : "dc",
"doc_count" : 14
}, {
"key" : "me",
"doc_count" : 14
}, {
"key" : "mo",
"doc_count" : 14
}, {
"key" : "nd",
"doc_count" : 14
} ]
}
}
}
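
Once the response has been parsed into a dictionary (as the Python client does for search results), the buckets are easy to flatten. The following is a minimal sketch; `bucket_counts` is a hypothetical helper name and `sample` reproduces only the first three buckets of the response above.

```python
def bucket_counts(response, agg_name):
    """Map each bucket key of a terms aggregation to its doc_count."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# First three buckets of the response shown above.
sample = {
    "aggregations": {
        "group_by_state": {
            "buckets": [
                {"key": "al", "doc_count": 21},
                {"key": "tx", "doc_count": 17},
                {"key": "id", "doc_count": 15},
            ]
        }
    }
}
counts = bucket_counts(sample, "group_by_state")
```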


27. Building on the previous aggregation, this example computes the average account balance for each state

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"size": 0,
"aggs": {
"group_by_state": {
"terms": {
"field": "state"
},
"aggs": {
"average_balance": {
"avg": {
"field": "balance"
}
}
}
}
}
}'


28. Building on the previous aggregation, now sort by the average balance:

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"size": 0,
"aggs": {
"group_by_state": {
"terms": {
"field": "state",
"order": {
"average_balance": "desc"
}
},
"aggs": {
"average_balance": {
"avg": {
"field": "balance"
}
}
}
}
}
}'


29. Group by age range (20-29, 30-39, 40-49), then by gender, and compute the average account balance for each gender within each age range:

curl -XPOST 'localhost:9200/bank/_search?pretty' -d '
{
"size": 0,
"aggs": {
"group_by_age": {
"range": {
"field": "age",
"ranges": [
{
"from": 20,
"to": 30
},
{
"from": 30,
"to": 40
},
{
"from": 40,
"to": 50
}
]
},
"aggs": {
"group_by_gender": {
"terms": {
"field": "gender"
},
"aggs": {
"average_balance": {
"avg": {
"field": "balance"
}
}
}
}
}
}
}
}'
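
In a range aggregation the from value is inclusive and the to value is exclusive, which is why from 20 / to 30 covers ages 20 through 29. The ranges list can be generated rather than typed out; the sketch below assumes fixed-width buckets, and `decade_ranges` is a hypothetical helper name.

```python
def decade_ranges(start, stop, step=10):
    """Return range-aggregation buckets [start, start+step), ... below stop."""
    return [{"from": low, "to": low + step} for low in range(start, stop, step)]


age_ranges = decade_ranges(20, 50)
```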


30. Add a new field to an existing mapping

POST /information/_mapping/email1
{
"properties": {
"name": {
"type": "text",
"index": true
}
}
}


31. Change the settings of an index

PUT /atom/_settings
{
"settings": {

"index.mapping.total_fields.limit": 4000

},
"index": {
"refresh_interval": "30s",
"number_of_replicas":"0"
}
}


32. View the mapping of a given type (if no type is given, the mappings of all types in the index are shown)

GET /atom/_mapping/人类


33. Conditional update with _update_by_query

POST /index/type/_update_by_query?conflicts=proceed
{
"script": {
"inline": "ctx._source.ontology_type=(params.tag)",
"lang": "painless",
"params": {
"tag": "event"
}
},
"query": {
"match_all": {}
}
}
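
The value being written is passed through params rather than being pasted into the script text, so the script source stays the same for every tag. Here is a sketch of generating that body from Python, reusing the script from the request above; `tag_update_body` is a hypothetical helper name.

```python
def tag_update_body(tag, query=None):
    """Build an _update_by_query body that sets ontology_type to tag."""
    return {
        "script": {
            # Script text copied from the request above; only params changes.
            "inline": "ctx._source.ontology_type=(params.tag)",
            "lang": "painless",
            "params": {"tag": tag},
        },
        "query": query if query is not None else {"match_all": {}},
    }


event_body = tag_update_body("event")
```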


34. Query all documents of a given type

POST /atom/欧洲排球锦标赛/_search
{
"query": {
"match_all": {}
}
}


35. Create a document with a version number

PUT twitter/tweet/1?version=2
{
"message" : "elasticsearch now has versioning support, double cool!"
}


Version types: internal, external (also written external_gt), and external_gte
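
The version types differ in how the version sent with the request is compared to the one already stored. The sketch below models only that comparison and assumes the document already exists; `write_accepted` is a hypothetical name, not an Elasticsearch API.

```python
def write_accepted(version_type, stored, given):
    """Return True if a write carrying version `given` passes the check.

    Assumes a document with version `stored` already exists.
    """
    if version_type == "internal":
        return given == stored   # must match the stored version exactly
    if version_type in ("external", "external_gt"):
        return given > stored    # must be strictly newer
    if version_type == "external_gte":
        return given >= stored   # equal or newer is accepted
    raise ValueError("unknown version type: %r" % version_type)
```

For example, resending version 2 for a document stored at version 2 is rejected under external but accepted under external_gte.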

36. Create a document with the op_type parameter

PUT twitter/tweet/1?op_type=create
{
"user" : "kimchy",
"post_date" : "2011-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}


Or:

PUT twitter/tweet/1/_create
{
"user" : "kimchy",
"post_date" : "2011-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}


37. Create a document and let ES generate the id automatically

POST twitter/tweet/
{
"user" : "kimchy",
"post_date" : "2009-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}


38. Create a document with a routing value

POST twitter/tweet?routing=kimchy
{
"user" : "kimchy",
"post_date" : "2011-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}


39. Create a document with a timeout

PUT twitter/tweet/1?timeout=5m
{
"user" : "kimchy",
"post_date" : "2011-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}


40. Get a document without its _source field

GET twitter/tweet/0?_source=false


41. Get a document and select which _source fields to return

GET twitter/tweet/0?_source_include=*.id&_source_exclude=entities


Or:

GET twitter/tweet/0?_source=*.id,retweeted


42. Fetch only the _source of a document

GET twitter/tweet/1/_source


You can also select only some of the fields in _source:

GET twitter/tweet/1/_source?_source_include=*.id&_source_exclude=entities


43. Custom routing

GET twitter/tweet/2?routing=user1


If a routing value was specified when the document was created, the same routing value must be supplied when getting it.
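
The reason is that the shard is chosen as hash(routing) % number_of_primary_shards, with the document id used as the routing value when none is given, so a get without the original routing may look in a different shard. The sketch below illustrates the idea only: Elasticsearch uses its own Murmur3-based hash, and `zlib.crc32` is a stand-in chosen here for illustration, so the shard numbers will not match a real cluster. `pick_shard` is a hypothetical name.

```python
import zlib


def pick_shard(routing, num_primary_shards):
    """Map a routing value to a shard number (stand-in hash, not Elasticsearch's)."""
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


shards = 5
written_to = pick_shard("user1", shards)    # shard chosen at index time
read_with_routing = pick_shard("user1", shards)
read_without_routing = pick_shard("2", shards)  # falls back to the document id
```

The same routing value always maps to the same shard, while the lookup that falls back to the id is only correct if the two hashes happen to land on the same shard.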

44. Create a mapping for a given type

POST /information/_mapping/email1
{
"properties": {
"name": {
"type": "text",
"index": true
}
}
}


45. delete_by_query

POST atom_v3/news/_delete_by_query?conflicts=proceed
{
"query": {
"match": {
"docType": "news"
}
}
}


46. Force-merge the segments of an index

POST atom_v3/_forcemerge?max_num_segments=5


47. View the segments of an index

http://172.24.8.83:9200/atom_v3/_segments


Or:

http://172.24.8.83:9200/_cat/segments/atom_v3


48. Create the mappings together with the index

PUT my_index
{
"mappings": {
"user": {
"_all": {
"enabled": false
},
"properties": {
"title": {
"type": "text"
},
"name": {
"type": "text"
},
"age": {
"type": "integer"
}
}
},
"blogpost": {
"_all": {
"enabled": false
},
"properties": {
"title": {
"type": "text"
},
"body": {
"type": "text"
},
"user_id": {
"type": "keyword"
},
"created": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
}
}
}
}
}


49. reindex: copy data from one index into another

POST _reindex
{
"source": {
"index": "twitter"
},
"dest": {
"index": "new_twitter"
}
}