MongoDB Study Notes, Part 3: Indexes
2014-03-26 22:11
Introduction to Indexes
Indexes exist to speed up queries. Suppose we want to query by a particular key:
>db.students.find({"name":"李明"});
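When a query involves only one key, you can create an index on that key to make the query faster.
In this example we index name. Indexes are created with the ensureIndex method (superseded by createIndex from MongoDB 3.0 onward):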
>db.students.ensureIndex({"name":1})
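For a given collection, the same index only needs to be created once.
An index on a key speeds up queries on that key, but it does little for queries on other keys, even when the query also contains the indexed key.
As a rule of thumb, create an index that contains all of the keys used in your query.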
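To make the idea concrete, here is a minimal sketch in plain JavaScript (Node.js, no MongoDB involved) contrasting a full scan with a lookup through an index. The helper names `findByScan` and `buildIndex` are made up for illustration, and the "index" is a simple Map, whereas a real MongoDB index is a B-tree, so treat this as an analogy only.

```javascript
// Plain JavaScript analogy, not MongoDB code. findByScan and buildIndex
// are hypothetical helpers introduced for this sketch.
const students = [
  { name: "smith", age: 48, uid: 1 },
  { name: "joe", age: 36, uid: 3 },
  { name: "john", age: 33, uid: 5 },
];

// Without an index: every document has to be examined.
function findByScan(docs, key, value) {
  let examined = 0;
  const hits = docs.filter((d) => {
    examined++;
    return d[key] === value;
  });
  return { hits, examined };
}

// With an "index": a map from each key value to the documents holding it.
// (Assumption: a Map stands in for MongoDB's B-tree.)
function buildIndex(docs, key) {
  const index = new Map();
  for (const d of docs) {
    if (!index.has(d[key])) index.set(d[key], []);
    index.get(d[key]).push(d);
  }
  return index;
}

const nameIndex = buildIndex(students, "name");
const scan = findByScan(students, "name", "john");
const viaIndex = nameIndex.get("john") || [];

console.log("documents examined by scan:", scan.examined);
console.log("documents returned via index:", viaIndex.length);
```

The scan touches every document, while the index jumps straight to the matching entries. The index built on name is of no use for a lookup by age, which is the point made above.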
Index Direction
With more than one key, you need to think about the direction of the index. Take the students collection from before:
{"_id":...,"name":"smith","age":48,"uid":1}
{"_id":...,"name":"smith","age":32,"uid":2}
{"_id":...,"name":"joe","age":36,"uid":3}
{"_id":...,"name":"joe","age":35,"uid":4}
{"_id":...,"name":"john","age":33,"uid":5}
If the index is created as {name:1,age:-1}, mongo organizes the entries like this:
{"_id":...,"name":"joe","age":36,"uid":3}
{"_id":...,"name":"joe","age":35,"uid":4}
{"_id":...,"name":"john","age":33,"uid":5}
{"_id":...,"name":"smith","age":48,"uid":1}
{"_id":...,"name":"smith","age":32,"uid":2}
Names are sorted in ascending alphabetical order, and within each name the ages are sorted in descending order.
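The ordering can be reproduced with an ordinary comparator. This is a plain JavaScript sketch (Node.js, no database) that sorts the sample documents the way a {name:1,age:-1} index would order its entries; the function name `compareByNameAscAgeDesc` is invented for the example.

```javascript
// Plain JavaScript sketch of the {name: 1, age: -1} ordering.
// compareByNameAscAgeDesc is a hypothetical helper, not a MongoDB API.
const students = [
  { name: "smith", age: 48, uid: 1 },
  { name: "smith", age: 32, uid: 2 },
  { name: "joe", age: 36, uid: 3 },
  { name: "joe", age: 35, uid: 4 },
  { name: "john", age: 33, uid: 5 },
];

function compareByNameAscAgeDesc(a, b) {
  // First key: name ascending.
  if (a.name !== b.name) return a.name < b.name ? -1 : 1;
  // Second key (only for equal names): age descending.
  return b.age - a.age;
}

const ordered = [...students].sort(compareByNameAscAgeDesc);
console.log(ordered.map((d) => d.uid)); // [ 3, 4, 5, 1, 2 ]
```

The uid sequence 3, 4, 5, 1, 2 matches the listing above.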
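In general, if an index contains n keys, it helps queries on any prefix of those keys.
For example, an index on {"a":1,"b":1,"c":1,...,"z":1} effectively gives you indexes on {"a":1}, {"a":1,"b":1}, {"a":1,"b":1,"c":1}, and so on.
Queries that would need {"b":1} or {"a":1,"c":1} are not optimized in the same way: the first skips the leading key entirely, and the second can use only the "a" part of the index.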
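The prefix rule can be written down as a small function. The sketch below is plain JavaScript (Node.js) and only models the rule as described here; `usablePrefixLength` is a made-up name, and MongoDB's real query planner considers more than this.

```javascript
// Count how many leading keys of a compound index are covered by the keys
// a query filters on. usablePrefixLength is a hypothetical helper that
// models only the prefix rule, not MongoDB's actual query planner.
function usablePrefixLength(indexKeys, queryKeys) {
  const wanted = new Set(queryKeys);
  let n = 0;
  while (n < indexKeys.length && wanted.has(indexKeys[n])) n++;
  return n;
}

const indexKeys = ["a", "b", "c"];

console.log(usablePrefixLength(indexKeys, ["a"]));      // 1: full use of the {a} prefix
console.log(usablePrefixLength(indexKeys, ["a", "b"])); // 2: full use of the {a, b} prefix
console.log(usablePrefixLength(indexKeys, ["b"]));      // 0: leading key is missing
console.log(usablePrefixLength(indexKeys, ["a", "c"])); // 1: only the "a" part helps
```

A result equal to the number of query keys means the query is fully served by a prefix of the index; anything smaller means part of the query has to be evaluated without the index's help.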
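The downside of indexes is that every insert, update, and delete carries extra overhead, because the database must not only perform the operation but also record it in each of the collection's indexes. So create as few indexes as you can.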
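Indexing Embedded Documents
Creating an index on a key inside an embedded document is no different from indexing an ordinary key. For example, to search blog comments by date, you can index the date key inside the array of embedded comments documents:
>db.blog.ensureIndex({"comments.date":1})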
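To show what a dotted key such as "comments.date" refers to, here is a plain JavaScript sketch (Node.js) that walks a document along a dotted path, descending into arrays on the way. The function `valuesAtPath` and the sample blog post are hypothetical; the sketch only illustrates which values such an index would have entries for.

```javascript
// Collect every value found at a dotted path, descending into arrays.
// valuesAtPath and the sample post are hypothetical, for illustration only.
function valuesAtPath(doc, path) {
  let current = [doc];
  for (const part of path.split(".")) {
    const next = [];
    for (const value of current) {
      // An array contributes each of its elements.
      const items = Array.isArray(value) ? value : [value];
      for (const item of items) {
        if (item !== null && typeof item === "object" && part in item) {
          next.push(item[part]);
        }
      }
    }
    current = next;
  }
  return current;
}

const post = {
  title: "sample post",
  comments: [
    { author: "joe", date: "2014-03-01" },
    { author: "john", date: "2014-03-20" },
  ],
};

console.log(valuesAtPath(post, "comments.date")); // [ '2014-03-01', '2014-03-20' ]
```

Each comment in the array contributes its own date, so one blog document ends up with one index entry per comment.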
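Indexing for Sorts
When there is a lot of data, sorting on a key that is not indexed forces MongoDB to pull all of the data into memory to sort it, so index the keys you sort on.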
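Unique Indexes
A unique index guarantees that the specified key has a unique value in every document:
>db.students.ensureIndex({"name":1},{"unique":true})
This puts a unique index on the name key, so name can never hold a duplicate value.
When a document is inserted into a collection with a unique index, any indexed field missing from the document is indexed as if its value were null.
That sentence is a little confusing, so let's test it with an example.
Create a new person collection whose documents look like this: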
{ "_id" : ObjectId("5332777c74d09b6a7018fbf4"), "name" : "张明明", "sex" : "男", "age" : 23, "books" : [ "诗人的世界", "Java编程思想", "HTML5实战" ] }
The document has five keys. Now we create a unique index on name and age:
>db.person.ensureIndex({"name":1,"age":-1},{"unique":true})
Once it is created, use getIndexes to look at the new index:
{ "v" : 1, "key" : { "name" : 1, "age" : -1 }, "unique" : true, "ns" : "test.person", "name" : "name_1_age_-1" }
Now let's insert some data to test it. The first document has already been inserted in full, as shown above.
First we insert a document with the name "张明" and age 23. It succeeds:
db.person.insert({name:"张明",age:23,sex:"男",books:["诗人的世界","Java编程思想","HTML5实战"]})
Result:
{ "name" : "张明明", "sex" : "男", "age" : 23, "books" : [ "诗人的世界", "Java编程思想", "HTML5实战" ] }
{ "name" : "张明", "age" : 23, "sex" : "男", "books" : [ "诗人的世界", "Java编程思想", "HTML5实战" ] }
Then we insert another one with the same name but a different age:
db.person.insert({name:"张明",age:22,sex:"男",books:["诗人的世界","Java编程思想","HTML5实战"]})
Result:
{ "name" : "张明明", "sex" : "男", "age" : 23, "books" : [ "诗人的世界", "Java编程思想", "HTML5实战" ] }
{ "name" : "张明", "age" : 23, "sex" : "男", "books" : [ "诗人的世界", "Java编程思想", "HTML5实战" ] }
{ "name" : "张明", "age" : 22, "sex" : "男", "books" : [ "诗人的世界", "Java编程思想", "HTML5实战" ] }
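Now let's try inserting a document whose name and age are both the same as an existing one:
db.person.insert({name:"张明",age:22,sex:"男",books:["诗人的世界","Java编程思想","HTML5实战"]})
Result:
E11000 duplicate key error index: test.person.$name_1_age_-1 dup key: { : "张明", : 22.0 }
These examples show that once a unique index exists, the combination of the indexed keys must be unique; it cannot be repeated.
What happens if, after the index is created, we insert a document that does not contain the indexed keys?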
db.person.insert({age:22,sex:"男",books:["诗人的世界","Java编程思想","HTML5实战"]})
Result:
{ "name" : "张明明", "sex" : "男", "age" : 23, "books" : [ "诗人的世界", "Java编程思想", "HTML5实战" ] }
{ "name" : "张明", "age" : 23, "sex" : "男", "books" : [ "诗人的世界", "Java编程思想", "HTML5实战" ] }
{ "name" : "张明", "age" : 22, "sex" : "男", "books" : [ "诗人的世界", "Java编程思想", "HTML5实战" ] }
{ "age" : 22, "sex" : "男", "books" : [ "诗人的世界", "Java编程思想", "HTML5实战" ] }
Next, one with name and age explicitly set to null:
db.person.insert({name:null,age:null,sex:"男",books:["诗人的世界","Java编程思想","HTML5实战"]})
Result:
{ "name" : "张明明", "sex" : "男", "age" : 23, "books" : [ "诗人的世界", "Java编程思想", "HTML5实战" ] }
{ "name" : "张明", "age" : 23, "sex" : "男", "books" : [ "诗人的世界", "Java编程思想", "HTML5实战" ] }
{ "name" : "张明", "age" : 22, "sex" : "男", "books" : [ "诗人的世界", "Java编程思想", "HTML5实战" ] }
{ "age" : 22, "sex" : "男", "books" : [ "诗人的世界", "Java编程思想", "HTML5实战" ] }
{ "name" : null, "age" : null, "sex" : "男", "books" : [ "诗人的世界", "Java编程思想", "HTML5实战" ] }
Now we insert one more document, this time with neither name nor age:
db.person.insert({sex:"男",books:["诗人的世界","Java编程思想","HTML5实战"]})
Result:
E11000 duplicate key error index: test.person.$name_1_age_-1 dup key: { : null, : null }
From this it becomes clear that, as far as the index is concerned, a key whose value is null is the same as a key that is absent. In other words, any key of the unique index that is missing at insert time is indexed with null as its default value. The stored document itself is unchanged, as the document without a name above shows.
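The whole experiment can be replayed without a database. The following plain JavaScript sketch (Node.js) models a unique compound index on name and age in which a missing key counts as null. The class `UniqueIndex` is invented for this illustration and is not how MongoDB implements its indexes.

```javascript
// Model of a unique compound index where a missing key is indexed as null.
// UniqueIndex is a hypothetical class written for this sketch.
class UniqueIndex {
  constructor(keys) {
    this.keys = keys;
    this.seen = new Set();
  }

  // Build the index entry for a document; absent keys become null.
  keyOf(doc) {
    return JSON.stringify(this.keys.map((k) => (k in doc ? doc[k] : null)));
  }

  // Returns true if the insert is accepted, false on a duplicate key.
  insert(doc) {
    const key = this.keyOf(doc);
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    return true;
  }
}

const index = new UniqueIndex(["name", "age"]);
const results = [
  index.insert({ name: "张明明", age: 23 }),  // accepted
  index.insert({ name: "张明", age: 23 }),    // accepted: different name
  index.insert({ name: "张明", age: 22 }),    // accepted: different age
  index.insert({ name: "张明", age: 22 }),    // rejected: same name and age
  index.insert({ age: 22 }),                  // accepted: entry is [null, 22]
  index.insert({ name: null, age: null }),    // accepted: entry is [null, null]
  index.insert({ sex: "男" }),                // rejected: also [null, null]
];
console.log(results);
```

The two rejected inserts correspond to the two E11000 errors in the walkthrough.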
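Dropping Duplicates
Sometimes the collection already contains duplicate key values before the index is created, and in that case creating the unique index fails. We can use dropDups to eliminate the duplicates: it keeps the first document it finds, deletes the other duplicates, and then builds the index. Note that dropDups belongs to the MongoDB versions current when this was written; it was removed in MongoDB 3.0.
db.person.ensureIndex({"name":1,"age":-1},{"unique":true,"dropDups":true})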
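The "keep the first, delete the rest" behavior is easy to sketch in plain JavaScript (Node.js). The function `dropDuplicates` and the `id` field on the sample documents are assumptions made for the example, and "first" here simply means first in the array, which may differ from the order in which MongoDB encounters documents.

```javascript
// Keep only the first document seen for each combination of index keys.
// dropDuplicates is a hypothetical helper; "first" means array order here.
function dropDuplicates(docs, keys) {
  const seen = new Set();
  return docs.filter((doc) => {
    const key = JSON.stringify(keys.map((k) => (k in doc ? doc[k] : null)));
    if (seen.has(key)) return false; // duplicate: would be deleted
    seen.add(key);
    return true; // first occurrence: kept
  });
}

// The id field is added only to tell the sample documents apart.
const people = [
  { id: 1, name: "张明", age: 22 },
  { id: 2, name: "张明", age: 22 },
  { id: 3, name: "张明", age: 23 },
  { id: 4, age: 22 },
  { id: 5, name: null, age: 22 },
];

const kept = dropDuplicates(people, ["name", "age"]);
console.log(kept.map((d) => d.id)); // [ 1, 3, 4 ]
```

Document 2 duplicates document 1, and document 5 duplicates document 4 because a missing name counts as null.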
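Geospatial Indexes (2D Indexes)
The map I worked from is a screenshot taken from Baidu Maps. Below are the coordinate points I defined on it myself (x and y both range from 0 to 100):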
var maps = [
  {"name":"公厕1","gis":{"x":40,"y":8}},
  {"name":"金水桥","gis":{"x":50,"y":10}},
  {"name":"故宫博物院南门","gis":{"x":50,"y":0}},
  {"name":"公厕2","gis":{"x":65,"y":10}},
  {"name":"西华门","gis":{"x":5,"y":13}},
  {"name":"公厕3","gis":{"x":25,"y":15}},
  {"name":"故宫博物院","gis":{"x":50,"y":18}},
  {"name":"主敬殿","gis":{"x":74,"y":15}},
  {"name":"东华门","gis":{"x":95,"y":13}},
  {"name":"浴德堂","gis":{"x":20,"y":22}},
  {"name":"体仁阁","gis":{"x":64,"y":23}},
  {"name":"右翼门","gis":{"x":41,"y":35}},
  {"name":"左翼门","gis":{"x":65,"y":35}},
  {"name":"中国第一历史档案馆","gis":{"x":8,"y":37}},
  {"name":"中和殿","gis":{"x":50,"y":50}},
  {"name":"箭亭","gis":{"x":68,"y":53}},
  {"name":"公厕4","gis":{"x":39,"y":57}},
  {"name":"公厕5","gis":{"x":66,"y":60}},
  {"name":"皇极右门","gis":{"x":82,"y":62}},
  {"name":"故宫礼品店","gis":{"x":43,"y":62}},
  {"name":"春花门","gis":{"x":24,"y":73}},
  {"name":"凤彩门","gis":{"x":46,"y":75}},
  {"name":"景耀门","gis":{"x":64,"y":73}},
  {"name":"右鼓馆","gis":{"x":92,"y":70}},
  {"name":"宁寿宫","gis":{"x":85,"y":78}},
  {"name":"履和门","gis":{"x":65,"y":80}},
  {"name":"翊坤宫","gis":{"x":42,"y":83}},
  {"name":"咸福宫","gis":{"x":39,"y":85}},
  {"name":"故宫商店1","gis":{"x":55,"y":90}},
  {"name":"颐和轩","gis":{"x":86,"y":93}},
  {"name":"故宫商店2","gis":{"x":46,"y":94}},
  {"name":"倦勤斋","gis":{"x":80,"y":94}},
  {"name":"珍妃灵堂","gis":{"x":90,"y":90}},
  {"name":"故宫博物院北门","gis":{"x":50,"y":100}}
];
// Insert every point into the maps collection.
for (var i = 0; i < maps.length; i++) {
  db.maps.insert(maps[i]);
}
First, build a 2d index on the coordinates:
db.maps.ensureIndex({"gis":"2d"},{min:-1,max:101});
Suppose I am standing at 中和殿 (the Hall of Central Harmony), right in the middle of the map, and I suddenly need a toilet. How do I find the nearest one? The names starting with 公厕 are the public toilets.
db.maps.find({"name":{"$regex":"公厕.*"},"gis":{"$near":[50,50]}});
Query result:
{ "_id" : ObjectId("53329d6c99589bfa25a5300f"), "name" : "公厕4", "gis" : { "x" : 39, "y" : 57 } }
{ "_id" : ObjectId("53329d6c99589bfa25a53010"), "name" : "公厕5", "gis" : { "x" : 66, "y" : 60 } }
{ "_id" : ObjectId("53329d6c99589bfa25a53002"), "name" : "公厕2", "gis" : { "x" : 65, "y" : 10 } }
{ "_id" : ObjectId("53329d6c99589bfa25a53004"), "name" : "公厕3", "gis" : { "x" : 25, "y" : 15 } }
{ "_id" : ObjectId("53329d6c99589bfa25a52fff"), "name" : "公厕1", "gis" : { "x" : 40, "y" : 8 } }
So the nearest public toilet is 公厕4, and that is where I go.
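To check the ordering by hand, the sketch below recomputes it in plain JavaScript (Node.js, no database), assuming straight-line distance on the flat x/y grid. The helpers `distance` and `near` are made-up names, and only a few of the points from maps are copied in.

```javascript
// Recompute the $near ordering with straight-line distance on the x/y grid.
// distance and near are hypothetical helpers; places is a subset of maps.
const places = [
  { name: "公厕1", gis: { x: 40, y: 8 } },
  { name: "公厕2", gis: { x: 65, y: 10 } },
  { name: "公厕3", gis: { x: 25, y: 15 } },
  { name: "中和殿", gis: { x: 50, y: 50 } },
  { name: "公厕4", gis: { x: 39, y: 57 } },
  { name: "公厕5", gis: { x: 66, y: 60 } },
  { name: "故宫礼品店", gis: { x: 43, y: 62 } },
];

function distance(point, [x, y]) {
  return Math.hypot(point.x - x, point.y - y);
}

// Filter by name pattern, then sort from nearest to farthest.
function near(docs, center, pattern) {
  return docs
    .filter((d) => pattern.test(d.name))
    .sort((a, b) => distance(a.gis, center) - distance(b.gis, center));
}

const toilets = near(places, [50, 50], /公厕.*/);
for (const t of toilets) {
  console.log(t.name, distance(t.gis, [50, 50]).toFixed(2));
}
```

公厕4 comes out at roughly 13 units from 中和殿 and 公厕5 at roughly 19, with the other three all more than 42 units away, which agrees with the order MongoDB returned.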
After the toilet stop, I want to see what interesting places lie within 20 meters (20 coordinate units here). How do I find them?
db.maps.find({"gis":{"$within":{"$center":[[39,57],20]}}},{"name":1,"gis":1,"_id":0})
Query result:
{ "name" : "公厕4", "gis" : { "x" : 39, "y" : 57 } }
{ "name" : "中和殿", "gis" : { "x" : 50, "y" : 50 } }
{ "name" : "凤彩门", "gis" : { "x" : 46, "y" : 75 } }
{ "name" : "故宫礼品店", "gis" : { "x" : 43, "y" : 62 } }
Note: these results are not sorted by distance; they are simply all the points within 20 meters.
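The circle test is just a distance comparison. Here is a plain JavaScript sketch (Node.js) of it, assuming flat x/y coordinates; `withinCenter` is a hypothetical helper and the list holds only a handful of points copied from maps, including a few that fall just outside the circle.

```javascript
// Keep the points whose distance from the center is at most the radius.
// withinCenter is a hypothetical helper; candidates is a subset of maps.
const candidates = [
  { name: "公厕4", gis: { x: 39, y: 57 } },
  { name: "中和殿", gis: { x: 50, y: 50 } },
  { name: "凤彩门", gis: { x: 46, y: 75 } },
  { name: "故宫礼品店", gis: { x: 43, y: 62 } },
  { name: "右翼门", gis: { x: 41, y: 35 } },
  { name: "春花门", gis: { x: 24, y: 73 } },
  { name: "箭亭", gis: { x: 68, y: 53 } },
];

function withinCenter(docs, [cx, cy], radius) {
  return docs.filter(
    (d) => Math.hypot(d.gis.x - cx, d.gis.y - cy) <= radius
  );
}

// No sorting is applied: the result keeps the input order.
const nearby = withinCenter(candidates, [39, 57], 20);
console.log(nearby.map((d) => d.name));
```

右翼门 and 春花门 are both about 22 units from the center, so they just miss the cut, which is why they are absent from the query result.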
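I have just been to 中和殿, so I will skip it and go to 故宫礼品店 (the gift shop) to pick up a few small presents for friends.
Coming out of the gift shop, there is still some way to go to 故宫博物院北门 (the north gate of the Palace Museum). Let's see which sights lie inside the rectangle that has these two points as opposite corners:
db.maps.find({"gis":{"$within":{"$box":[[43,62],[50,100]]}}},{"name":1,"gis":1,"_id":0})
Query result:
{ "name" : "故宫商店2", "gis" : { "x" : 46, "y" : 94 } }
{ "name" : "故宫博物院北门", "gis" : { "x" : 50, "y" : 100 } }
{ "name" : "凤彩门", "gis" : { "x" : 46, "y" : 75 } }
{ "name" : "故宫礼品店", "gis" : { "x" : 43, "y" : 62 } }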
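The rectangle test compares each coordinate against the two corners. The plain JavaScript sketch below (Node.js) assumes the edges of the box are included; `withinBox` is an invented helper and the points are a small subset of maps, with three that lie outside the box.

```javascript
// Keep the points that fall inside the rectangle spanned by two corners.
// withinBox is a hypothetical helper; sights is a subset of maps.
const sights = [
  { name: "故宫商店2", gis: { x: 46, y: 94 } },
  { name: "故宫博物院北门", gis: { x: 50, y: 100 } },
  { name: "凤彩门", gis: { x: 46, y: 75 } },
  { name: "故宫礼品店", gis: { x: 43, y: 62 } },
  { name: "翊坤宫", gis: { x: 42, y: 83 } },
  { name: "故宫商店1", gis: { x: 55, y: 90 } },
  { name: "履和门", gis: { x: 65, y: 80 } },
];

function withinBox(docs, [x1, y1], [x2, y2]) {
  const minX = Math.min(x1, x2);
  const maxX = Math.max(x1, x2);
  const minY = Math.min(y1, y2);
  const maxY = Math.max(y1, y2);
  // Assumption: points on the edge count as inside.
  return docs.filter(
    (d) =>
      d.gis.x >= minX && d.gis.x <= maxX &&
      d.gis.y >= minY && d.gis.y <= maxY
  );
}

const inBox = withinBox(sights, [43, 62], [50, 100]);
console.log(inBox.map((d) => d.name));
```

翊坤宫 has x = 42, one unit to the left of the box, so it is excluded even though it is close to the route.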
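I wandered around a little more on the way, went out through the north gate, and left.
Summary
Geospatial indexes are also created with ensureIndex, except that the value passed is "2d" rather than 1. If no range is specified at creation time, the default is -180 to 180. You can also set the range yourself with {min:min,max:max}.
Geospatial queries use $near and $within: $near returns points ordered by distance from a given point, and $within returns the points inside a shape. The two shapes used here are the circle and the rectangle.
A circle is given with $center as a center point and a radius; a rectangle is given with $box as two diagonally opposite corner points.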