Memory controller error messages [memo]
2015-03-02 10:03
Error messages from the log:
[root@hh-yun-compute-130125 ~]# cat /var/log/messages | grep -i error
Mar 1 04:58:05 hh-yun-compute-130125 kernel: sbridge: HANDLING MCE MEMORY ERROR
Mar 1 04:58:06 hh-yun-compute-130125 kernel: EDAC MC1: CE row 2, channel 0, label "CPU_SrcID#1_Channel#2_DIMM#0": 1 Unknown error(s): memory scrubbing on FATAL area : cpu=6 Err=0008:00c2 (ch=2), addr = 0x16113a9000 => socket=1, Channel=2(mask=4), rank=0
Mar 1 10:27:08 hh-yun-compute-130125 kernel: sbridge: HANDLING MCE MEMORY ERROR
Mar 1 10:27:09 hh-yun-compute-130125 kernel: EDAC MC1: CE row 2, channel 0, label "CPU_SrcID#1_Channel#2_DIMM#0": 1 Unknown error(s): memory scrubbing on FATAL area : cpu=6 Err=0008:00c2 (ch=2), addr = 0x15e1c49000 => socket=1, Channel=2(mask=4), rank=0
Mar 1 13:52:56 hh-yun-compute-130125 kernel: sbridge: HANDLING MCE MEMORY ERROR
Mar 1 13:52:57 hh-yun-compute-130125 kernel: EDAC MC1: CE row 2, channel 0, label "CPU_SrcID#1_Channel#2_DIMM#0": 1 Unknown error(s): memory scrubbing on FATAL area : cpu=6 Err=0008:00c2 (ch=2), addr = 0x160e949000 => socket=1, Channel=2(mask=4), rank=0
Mar 2 04:16:56 hh-yun-compute-130125 kernel: sbridge: HANDLING MCE MEMORY ERROR
Mar 2 04:16:56 hh-yun-compute-130125 kernel: sbridge: HANDLING MCE MEMORY ERROR
Mar 2 04:16:57 hh-yun-compute-130125 kernel: EDAC MC1: CE row 2, channel 0, label "CPU_SrcID#1_Channel#2_DIMM#0": 1 Unknown error(s): memory scrubbing on FATAL area : cpu=6 Err=0008:00c2 (ch=2), addr = 0x1613a61000 => socket=1, Channel=2(mask=4), rank=0
Mar 2 04:16:57 hh-yun-compute-130125 kernel: EDAC MC1: CE row 2, channel 0, label "CPU_SrcID#1_Channel#2_DIMM#0": 1 Unknown error(s): memory scrubbing on FATAL area : cpu=6 Err=0008:00c2 (ch=2), addr = 0x1613a79000 => socket=1, Channel=2(mask=4), rank=0
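When the same message repeats over several days, it can help to tally the correctable-error (CE) lines per DIMM label instead of reading them one by one. The following is only a rough sketch: the function name count_ce_by_dimm is made up for this memo, and it assumes the log lines keep the `EDAC MCn: CE ... label "..."` shape shown above.

```shell
# Tally EDAC correctable-error (CE) log lines per DIMM label.
# count_ce_by_dimm is a hypothetical helper name, not part of any EDAC tooling.
# Usage: count_ce_by_dimm [LOGFILE]   (LOGFILE defaults to /var/log/messages)
count_ce_by_dimm() {
    local logfile="${1:-/var/log/messages}"
    awk '
        # Only EDAC CE lines carry the DIMM label we are interested in.
        /EDAC MC[0-9]+: CE / {
            if (match($0, /label "[^"]*"/)) {
                # Skip the 7 characters of: label "   and drop the closing quote.
                label = substr($0, RSTART + 7, RLENGTH - 8)
                n[label]++
            }
        }
        END {
            for (l in n) printf "%d %s\n", n[l], l
        }
    ' "$logfile" | sort -rn
}
```

Calling `count_ce_by_dimm /var/log/messages` prints one line per label, highest count first, which makes it easy to see whether the errors all come from a single DIMM.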
Reference information 2:
[root@hh-yun-compute-130125 ~]# cat /sys/devices/system/edac/mc/mc?/ce*count
0
0
8
0
[root@hh-yun-compute-130125 ~]# cat /sys/devices/system/edac/mc/mc1/ce_count
8
Module information
[root@hh-yun-compute-130125 ~]# modinfo sb_edac
filename:       /lib/modules/2.6.32-504.3.3.el6.x86_64/kernel/drivers/edac/sb_edac.ko
description:    MC Driver for Intel Sandy Bridge and Ivy Bridge memory controllers - Ver: 1.1.0
author:         Red Hat Inc. (http://www.redhat.com)
author:         Mauro Carvalho Chehab <mchehab@redhat.com>
license:        GPL
srcversion:     01CFEEBE911D55B6FE660BE
alias:          pci:v00008086d00002FA0sv*sd*bc*sc*i*
alias:          pci:v00008086d00000EA8sv*sd*bc*sc*i*
alias:          pci:v00008086d00003CA8sv*sd*bc*sc*i*
depends:        edac_core
vermagic:       2.6.32-504.3.3.el6.x86_64 SMP mod_unload modversions
parm:           edac_op_state:EDAC Error Reporting state: 0=Poll,1=NMI (int)
[root@hh-yun-compute-130125 ~]# modinfo edac_core
filename:       /lib/modules/2.6.32-504.3.3.el6.x86_64/kernel/drivers/edac/edac_core.ko
description:    Core library routines for EDAC reporting
author:         Doug Thompson www.softwarebitmaker.com, et al
license:        GPL
srcversion:     C21E296292A2174839A086C
depends:
vermagic:       2.6.32-504.3.3.el6.x86_64 SMP mod_unload modversions
parm:           check_pci_errors:Check for PCI bus parity errors: 0=off 1=on (int)
parm:           edac_pci_panic_on_pe:Panic on PCI Bus Parity error: 0=off 1=on (int)
parm:           edac_mc_panic_on_ue:Panic on uncorrected error: 0=off 1=on (int)
parm:           edac_mc_log_ue:Log uncorrectable error to console: 0=off 1=on (int)
parm:           edac_mc_log_ce:Log correctable error to console: 0=off 1=on (int)
parm:           edac_mc_poll_msec:Polling period in milliseconds
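modinfo only lists which parameters a module accepts, not the values currently in effect. On Linux the live values of a loaded module are normally exposed under /sys/module/<module>/parameters/, so a small helper can dump them. This is a sketch under assumptions: show_module_params is a name invented here, and not every parameter listed by modinfo is necessarily exported there on a given kernel.

```shell
# Print name=value for each readable parameter of a loaded kernel module.
# show_module_params is a hypothetical helper; the second argument exists
# only so the function can be pointed at a copy of the sysfs tree.
# Usage: show_module_params MODULE [SYSFS_MODULE_ROOT]
show_module_params() {
    local module="$1"
    local root="${2:-/sys/module}"
    local f
    for f in "$root/$module"/parameters/*; do
        [ -r "$f" ] || continue      # glob did not match, or file is not readable
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done
}
```

For example, `show_module_params edac_core` would be expected to show whether edac_mc_log_ce is on, which is what makes the CE lines appear in /var/log/messages in the first place.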
Official explanation:
Total Correctable Errors count attribute file: 'ce_count' This attribute file displays the total count of correctable errors that have occurred on this csrow. This count is very important to examine. CEs provide early indications that a DIMM is beginning to fail. This count field should be monitored for non-zero values and report such information to the system administrator.
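The advice to monitor ce_count for non-zero values can be turned into a small check. The sketch below walks the per-controller ce_count files under /sys/devices/system/edac/mc (the path used earlier in this memo) and reports any controller with a non-zero count. The function name report_nonzero_ce and the choice of return codes are assumptions made for illustration, not an established convention.

```shell
# Report every EDAC memory controller whose correctable-error count is non-zero.
# report_nonzero_ce is a hypothetical helper; the optional argument lets it
# run against a copy of the sysfs tree instead of the live one.
# Returns 1 if at least one controller has ce_count > 0, otherwise 0.
report_nonzero_ce() {
    local root="${1:-/sys/devices/system/edac/mc}"
    local f count status=0
    for f in "$root"/mc*/ce_count; do
        [ -r "$f" ] || continue      # no EDAC controllers, or file not readable
        count=$(cat "$f")
        if [ "$count" -gt 0 ]; then
            # Directory name (mc0, mc1, ...) identifies the memory controller.
            echo "$(basename "$(dirname "$f")"): ce_count=$count"
            status=1
        fi
    done
    return "$status"
}
```

On the host in this memo, a call such as `report_nonzero_ce` would be expected to flag mc1, since its ce_count was 8. The non-zero return code is meant to make the function easy to hook into cron or a monitoring agent.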
Enabling mcelog
[root@hh-yun-compute-130125 ~]# service mcelogd restart
Stopping mcelog [OK]
Starting mcelog daemon [OK]
[root@hh-yun-compute-130125 ~]# mcelog
mcelog: Family 6 Model 3e CPU: only decoding architectural errors
Checking the log
[root@hh-yun-compute-130125 ~]# tail /var/log/mcelog
mcelog: failed to prefill DIMM database from DMI data
mcelog: mcelog server already running
Assessment
This is a harmless warning message. The DIMM database prefill relies on a specific non-standard format of the DIMMs in the DMI BIOS tables. If this format is not used by the BIOS, mcelog will only discover DIMMs as they get their first error (if the CPU reports DIMMs in machine check errors). Please understand for the most part, mcelog should be ignored.
The final decision was therefore to ignore this message.