MongoDb 中 serverStatus was very slow 的原因分析
2017-09-15 11:10
375 查看
在同一个集群中,两个从节点 db117,db118 都各出现了1次 “serverStatus was very slow”
db117出现 very slow:
2017-09-14T18:55:49.599+0800 [conn13113916] serverStatus was very slow: { after basic: 0, after asserts: 0, after backgroundFlushing: 0, after connections: 0, after cursors: 0, after dur: 0, after extra_info: 0, after globalLock: 0, after indexCounters:
0, after locks: 0, after network: 0, after opcounters: 0, after opcountersRepl: 0, after recordStats: 1214, after repl: 1214, at end: 1214 }
前后没有出现大的查询(>10秒以上)。其它性能指标也正常。
只是前面看到空间增加了30个 连接。
2017-09-14T18:55:48.157+0800 [initandlisten] connection accepted from
10.116.207.64:57026 #13113915 (264 connections now open)
...................
2017-09-14T18:55:49.551+0800 [initandlisten] connection accepted from
10.174.177.10:64840 #13113946 (295 connections now open)
2017-09-14T18:55:49.599+0800 [conn13113916] serverStatus was very slow:
应该是突然查询数量上来,虽然每个查询的时间不长,但IO,CPU 有个波动的原因。
(但在监控数据中也没看到有突增)
此现象在DB118 中昨天也出现一次,但不是在同时。
2017-09-14T14:12:16.102+0800 [initandlisten] connection accepted from
10.116.207.112:10800 #13301443 (286 connections now open)
..............................................................
2017-09-14T14:12:19.780+0800 [initandlisten] connection accepted from
10.168.168.212:51264 #13301552 (395 connections now open)
2017-09-14T14:12:19.819+0800 [conn13300638] query Mallcoo.MallCard query: { UID: 17433632, Status: 1, MallID: 273 } planSummary: IXSCAN { UID: 1.0, MallID: 1.0 } ntoreturn:1 ntoskip:0 nscanned:1 nscannedObjects:1 keyUpdates:0 numYields:0 locks(micros) r:4079097
nreturned:1 reslen:99 4079ms
2017-09-14T14:12:19.844+0800 [conn13301520] serverStatus was very slow:
{ after basic: 0, after asserts: 0, after backgroundFlushing: 0, after connections: 0, after cursors: 0, after dur: 0, after extra_info: 0, after globalLock: 0, after indexCounters: 0, after locks: 0, after network: 0, after opcounters: 0, after
opcountersRepl: 0, after recordStats: 1516, after repl: 1516, at end: 1526 }
2017-09-14T14:12:19.844+0800 [conn13301520] command admin.$cmd command: serverStatus { serverStatus: 1.0 } keyUpdates:0 numYields:0 locks(micros) r:83 reslen:7456 1557ms
DB118 14:12 时间,IO有一个高峰点,从 2%-->12%。而db117 却没有,问题原因比较难找。
从两次的slow 日志来看,
2017-09-14T14:12:19.844+0800 [conn13301520] serverStatus was very slow:
慢的原因都在:after recordStats: 1516 ,
也就是说有查询,而内存里没数据,出现了大量的 page faults.
再就是IO ,CPU 有个暂时的繁忙状态。
这两个从节点内存现在为64G,虽然不是很高。但暂时因为业务相对增长比较慢,不会增加内存。
如果有大量的此类预警日志出现,现场可用 mongostat 查看 faults 状态值情况,如果一直都高于 5以上。可以认为是内存不足了。
db117出现 very slow:
2017-09-14T18:55:49.599+0800 [conn13113916] serverStatus was very slow: { after basic: 0, after asserts: 0, after backgroundFlushing: 0, after connections: 0, after cursors: 0, after dur: 0, after extra_info: 0, after globalLock: 0, after indexCounters:
0, after locks: 0, after network: 0, after opcounters: 0, after opcountersRepl: 0, after recordStats: 1214, after repl: 1214, at end: 1214 }
前后没有出现大的查询(>10秒以上)。其它性能指标也正常。
只是前面看到空间增加了30个 连接。
2017-09-14T18:55:48.157+0800 [initandlisten] connection accepted from
10.116.207.64:57026 #13113915 (264 connections now open)
...................
2017-09-14T18:55:49.551+0800 [initandlisten] connection accepted from
10.174.177.10:64840 #13113946 (295 connections now open)
2017-09-14T18:55:49.599+0800 [conn13113916] serverStatus was very slow:
应该是突然查询数量上来,虽然每个查询的时间不长,但IO,CPU 有个波动的原因。
(但在监控数据中也没看到有突增)
此现象在DB118 中昨天也出现一次,但不是在同时。
2017-09-14T14:12:16.102+0800 [initandlisten] connection accepted from
10.116.207.112:10800 #13301443 (286 connections now open)
..............................................................
2017-09-14T14:12:19.780+0800 [initandlisten] connection accepted from
10.168.168.212:51264 #13301552 (395 connections now open)
2017-09-14T14:12:19.819+0800 [conn13300638] query Mallcoo.MallCard query: { UID: 17433632, Status: 1, MallID: 273 } planSummary: IXSCAN { UID: 1.0, MallID: 1.0 } ntoreturn:1 ntoskip:0 nscanned:1 nscannedObjects:1 keyUpdates:0 numYields:0 locks(micros) r:4079097
nreturned:1 reslen:99 4079ms
2017-09-14T14:12:19.844+0800 [conn13301520] serverStatus was very slow:
{ after basic: 0, after asserts: 0, after backgroundFlushing: 0, after connections: 0, after cursors: 0, after dur: 0, after extra_info: 0, after globalLock: 0, after indexCounters: 0, after locks: 0, after network: 0, after opcounters: 0, after
opcountersRepl: 0, after recordStats: 1516, after repl: 1516, at end: 1526 }
2017-09-14T14:12:19.844+0800 [conn13301520] command admin.$cmd command: serverStatus { serverStatus: 1.0 } keyUpdates:0 numYields:0 locks(micros) r:83 reslen:7456 1557ms
DB118 14:12 时间,IO有一个高峰点,从 2%-->12%。而db117 却没有,问题原因比较难找。
从两次的slow 日志来看,
2017-09-14T14:12:19.844+0800 [conn13301520] serverStatus was very slow:
{ after basic: 0, after asserts: 0, after backgroundFlushing: 0, after connections: 0, after cursors: 0, after dur: 0, after extra_info: 0, after globalLock: 0, after indexCounters: 0, after locks: 0, after network: 0, after opcounters: 0, after
opcountersRepl: 0, after recordStats: 1516, after repl: 1516, at end: 1526 }
慢的原因都在:after recordStats: 1516 ,也就是说有查询,而内存里没数据,出现了大量的 page faults.
再就是IO ,CPU 有个暂时的繁忙状态。
这两个从节点内存现在为64G,虽然不是很高。但暂时因为业务相对增长比较慢,不会增加内存。
如果有大量的此类预警日志出现,现场可用 mongostat 查看 faults 状态值情况,如果一直都高于 5以上。可以认为是内存不足了。
相关文章推荐
- 关于The status code returned from the server was: 500的错误
- MongoDB创始人分析FourSquare宕机原因
- The requested URL /nagios/cgi-bin/statusmap.cgi was not found on this server
- was unable to refresh its cache! status = Cannot execute request on any known server
- MongoDb move chunk失败分析和处理(SERVER-5351)
- the status code returned from the server was:500
- 一次RegionServer GC导致会话超时下线的原因分析记录
- 开启apache的server-status辅助分析工具
- 运行asp.net时弹出the status code returned from the server was:404错误提示
- mongodb的serverstatus
- 安装sql2005,没有SQL Server management studio的原因分析及解决方案
- 关于使用springmvc时Failed to load resource: the server responded with a status of 400 (Bad Request)错误可能原因
- The requested URL /server-status was not found on this server
- Master只能控制其中1台RegionServer,无法控制其它RegionServer原因分析
- 如何开启apache的server-status辅助分析工具
- Last packet sent to the server was 0 ms ago.异常的原因及解决办法
- Mongodb {serverStatus:1} 详细解释
- 关于The status code returned from the server was:500的解决
- The status code returned from the server was: 500
- pache HTTP Server停止工作 原因分析(一)