
An Introduction to CMCI and How to Read Common Log Messages

2018-08-17 13:41

CMCI

Starting with 45 nm Intel 64 processor on which CPUID reports DisplayFamily_DisplayModel as 06H_1AH (see CPUID instruction in Chapter 3, “Instruction Set Reference, A-L” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A), the processor can report information on corrected machine-check errors and deliver a programmable interrupt for software to respond to MC errors, referred to as corrected machine-check error interrupt (CMCI). See Section 15.5 for detail.

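CMCI is a mechanism, available on 45 nm and later Intel 64 processors, for reporting corrected CPU errors. The processor counts the corrected errors that occur and signals software once the count reaches a configured threshold. On Linux, these errors are handled in one of two modes: interrupt mode and poll mode.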


Where error information is stored

The machine-check error reporting mechanism that Pentium processors use is similar to that used in Pentium 4, Intel Xeon, Intel Atom, and P6 family processors. When an error is detected, it is recorded in P5_MC_TYPE and P5_MC_ADDR; the processor then generates a machine-check exception (#MC)

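The passage above describes Pentium processors, which record a detected error in the P5_MC_TYPE and P5_MC_ADDR registers. Processors that support CMCI instead record it in the per-bank machine-check registers (IA32_MCi_STATUS, IA32_MCi_ADDR, and IA32_MCi_MISC).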
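As an illustration of what such a register holds, here is a minimal sketch that decodes a raw 64-bit IA32_MCi_STATUS value, the per-bank status register on processors that support CMCI. The field positions follow my reading of the register layout in the Intel SDM and should be checked against the manual for your processor. The names `McStatus` and `decode_mci_status` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class McStatus:
    """Selected fields of IA32_MCi_STATUS (hypothetical helper type)."""
    valid: bool            # bit 63: the register holds a valid error record
    overflow: bool         # bit 62: an earlier error was overwritten
    uncorrected: bool      # bit 61: the error was NOT corrected by hardware
    enabled: bool          # bit 60: reporting for this error was enabled
    miscv: bool            # bit 59: IA32_MCi_MISC holds extra information
    addrv: bool            # bit 58: IA32_MCi_ADDR holds a valid address
    pcc: bool              # bit 57: processor context may be corrupted
    corrected_count: int   # bits 52:38: corrected error count (CMCI-capable CPUs)
    model_specific_code: int  # bits 31:16
    mca_code: int          # bits 15:0: architectural MCA error code


def decode_mci_status(raw: int) -> McStatus:
    """Split a raw 64-bit status value into the fields listed above."""
    def bit(n: int) -> bool:
        return bool((raw >> n) & 1)

    return McStatus(
        valid=bit(63),
        overflow=bit(62),
        uncorrected=bit(61),
        enabled=bit(60),
        miscv=bit(59),
        addrv=bit(58),
        pcc=bit(57),
        corrected_count=(raw >> 38) & 0x7FFF,   # 15-bit counter
        model_specific_code=(raw >> 16) & 0xFFFF,
        mca_code=raw & 0xFFFF,
    )
```

A corrected error has `valid` set and `uncorrected` clear; `corrected_count` is the counter that CMCI compares against its threshold.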


How it shows up in the logs

kernel: CMCI storm subsided: switching to interrupt mode
kernel: CMCI storm detected: switching to poll mode

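The lines above are what gets written to the messages log. Normally, each corrected error reported through CMCI raises an interrupt. If errors arrive too frequently, the kernel switches to poll mode (checking for errors every few seconds) to reduce the load on the CPU. Once the error rate drops, it switches back to interrupt mode.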
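The mode switching can be sketched as a small state machine. This is a simplified model for illustration, not the kernel's actual code: the class name `CmciModeSketch`, the threshold of 15 errors, and the one-second window are all assumptions chosen for the example.

```python
class CmciModeSketch:
    """Toy model of interrupt/poll mode switching (hypothetical)."""

    def __init__(self, storm_threshold: int = 15, window_s: float = 1.0):
        # Both defaults are assumed values, not taken from the article.
        self.storm_threshold = storm_threshold
        self.window_s = window_s
        self.mode = "interrupt"
        self._stamps = []  # timestamps of recent error interrupts

    def on_error_interrupt(self, now: float) -> str:
        """Record one error interrupt; enter poll mode on a storm."""
        # Keep only interrupts that fall inside the sliding window.
        self._stamps = [t for t in self._stamps if now - t < self.window_s]
        self._stamps.append(now)
        if self.mode == "interrupt" and len(self._stamps) > self.storm_threshold:
            self.mode = "poll"  # "CMCI storm detected"
        return self.mode

    def on_poll_tick(self, errors_seen: int) -> str:
        """Called once per poll; a quiet poll ends the storm."""
        if self.mode == "poll" and errors_seen == 0:
            self._stamps.clear()
            self.mode = "interrupt"  # "CMCI storm subsided"
        return self.mode
```

In this model a burst of errors inside the window flips the state to poll mode, and the first poll that finds no new errors flips it back, which corresponds to the two log messages shown above.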

The related error details can usually be found in /var/log/mcelog.
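To get a quick picture of how often a host flips between the two modes, the kernel messages quoted earlier can be counted with a short script. This is a minimal sketch: the function name `summarize_cmci_storms` is hypothetical, and the regular expression assumes the messages look exactly like the two sample lines in this article.

```python
import re

# Matches the two kernel messages quoted in this article.
_STORM_RE = re.compile(
    r"CMCI storm (detected|subsided): switching to (poll|interrupt) mode"
)


def summarize_cmci_storms(lines):
    """Count storm start/end messages and report the last mode seen.

    `lines` is any iterable of log lines (a list or an open file).
    Returns (counts, last_mode); last_mode is None if nothing matched.
    """
    counts = {"detected": 0, "subsided": 0}
    last_mode = None
    for line in lines:
        match = _STORM_RE.search(line)
        if match:
            counts[match.group(1)] += 1
            last_mode = match.group(2)
    return counts, last_mode
```

It could be run against the system log, for example `summarize_cmci_storms(open("/var/log/messages"))`; that path is an assumption and varies by distribution. A high `detected` count suggests the hardware is producing corrected errors in bursts and is worth checking against the entries in /var/log/mcelog.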


Tags: hardware, linux, CMCI