Debugging and Analyzing Kernel Drivers with Dump Files

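2010-11-17 13:25
Recently our data-separation driver has been causing intermittent blue screens. The cause has been hard to pin down, so the only option left is to analyze the dump file.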

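Dump file analysis is, to a large extent, the analysis of what caused a blue screen. This kind of system-level failure is among the more serious errors Windows reports (more serious still are hardware or software compatibility failures such as a black screen at boot). It is only "relatively" serious because Windows at least leaves a dump file to analyze, which makes the cause comparatively easy to find. A blue screen generally comes from an unhandled exception or violation in kernel-mode code, from corrupted data structures, or from a kernel failure during boot or shutdown. Sometimes the blue screen flashes by and the system reboots immediately; sometimes it stays on screen. Either way it shows a stop code and some error information, but that information is incomplete: at best it names the faulting module (a driver, for example). You can look up the stop code with a search engine, but the best way to learn more is to analyze the dump file. For deeper investigation, live kernel debugging is better still, though it requires additional tooling and debugging skill.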

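Types
There are three kinds of dump file: complete memory dump, kernel memory dump, and small memory dump. The setting is under the Advanced tab of System Properties (Startup and Recovery).
A complete memory dump is too large to be practical. It is the size of physical memory or slightly more, because it includes user-process pages, so dumping 2 GB of physical memory takes at least 2 GB of disk space plus the file header. A kernel memory dump is typically around 200 MB (on machines with less than 4 GB of physical memory) and contains only the physical memory that belongs to kernel mode. A small memory dump is typically 64 KB (128 KB on 64-bit systems). The latter two are the ones used most often.
A small memory dump is written to \Windows\Minidump as a file named Mini<date>-<sequence number>.dmp. This valuable file captures the state of the system at the moment of the crash, but it records only limited information. If WinDbg has no symbol server path configured when you analyze it (for symbol servers, see "Windbg Kernel Debugging, Part 2: Common Commands"), the machine you analyze on must have the same version of Ntoskrnl.exe as the machine that crashed; otherwise symbols cannot be found.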
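As a small illustration of the naming scheme, the sketch below pulls the crash date and sequence number out of a minidump file name. It assumes the older Mini<MMDDYY>-<NN>.dmp pattern used by Windows XP and Server 2003 and treats the two-digit year as 20xx; the function name parse_minidump_name is made up for this example, and other Windows versions may name the files differently.

```python
import re
from datetime import date

# Assumed naming scheme: Mini<MMDDYY>-<NN>.dmp (for example Mini111410-01.dmp).
_MINIDUMP_NAME = re.compile(r"^Mini(\d{2})(\d{2})(\d{2})-(\d{2})\.dmp$", re.IGNORECASE)


def parse_minidump_name(name):
    """Return (crash_date, sequence) for a minidump file name, or None."""
    match = _MINIDUMP_NAME.match(name)
    if match is None:
        return None
    month, day, year, sequence = (int(group) for group in match.groups())
    try:
        # Assumption: a two-digit year means 20xx.
        return date(2000 + year, month, day), sequence
    except ValueError:
        # Digits matched the pattern but do not form a real calendar date.
        return None


if __name__ == "__main__":
    print(parse_minidump_name("Mini111410-01.dmp"))
```

Sorting the parsed results by date and sequence number is one way to find the most recent crash when the Minidump folder holds many files.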
Start WinDbg and open the dump file with Open Crash Dump, or simply drag the file into WinDbg. WinDbg displays the following:

Microsoft (R) Windows Debugger Version 6.12.0002.633 X86
Copyright (c) Microsoft Corporation. All rights reserved.

Loading Dump File [C:/Documents and Settings/xinyuan/桌面/MEMORY.DMP]
Kernel Summary Dump File: Only kernel address space is available

Symbol search path is: *** Invalid ***
****************************************************************************
* Symbol loading may be unreliable without a symbol search path. *
* Use .symfix to have the debugger choose a symbol path. *
* After setting your symbol path, use .reload to refresh symbol locations. *
****************************************************************************
Executable search path is:
*********************************************************************
* Symbols can not be loaded because symbol path is not initialized. *
* *
* The Symbol Path can be set by: *
* using the _NT_SYMBOL_PATH environment variable. *
* using the -y <symbol_path> argument when starting the debugger. *
* using .sympath and .sympath+ *
*********************************************************************
*** ERROR: Symbol file could not be found. Defaulted to export symbols for ntkrpamp.exe -
Windows Server 2003 Kernel Version 3790 (Service Pack 2) MP (8 procs) Free x86 compatible
Product: Server, suite: Enterprise TerminalServer SingleUserTS
Built by: 3790.srv03_sp2_rtm.070216-1710
Machine Name:
Kernel base = 0x80800000 PsLoadedModuleList = 0x808a6ea8
Debug session time: Sun Nov 14 10:39:57.213 2010 (UTC + 8:00)
System Uptime: 204 days 6:25:27.625
*********************************************************************
* Symbols can not be loaded because symbol path is not initialized. *
* *
* The Symbol Path can be set by: *
* using the _NT_SYMBOL_PATH environment variable. *
* using the -y <symbol_path> argument when starting the debugger. *
* using .sympath and .sympath+ *
*********************************************************************
*** ERROR: Symbol file could not be found. Defaulted to export symbols for ntkrpamp.exe -
Loading Kernel Symbols
...............................................................
...........................................
Loading User Symbols
PEB is paged out (Peb.Ldr = 7ffdd00c). Type ".hh dbgerr001" for details
Loading unloaded module list
..........
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 8E, {c0000005, f7248be2, b7818668, 0}

***** Kernel symbols are WRONG. Please fix symbols to do analysis.

*** ERROR: Symbol file could not be found. Defaulted to export symbols for fltMgr.sys -
*** ERROR: Symbol file could not be found. Defaulted to export symbols for halmacpi.dll -
*** ERROR: Module load completed but symbols could not be loaded for DIOMonitor.sys
*************************************************************************
*** ***
*** ***
*** Your debugger is not using the correct symbols ***
*** ***
*** In order for this command to work properly, your symbol path ***
*** must point to .pdb files that have full type information. ***
*** ***
*** Certain .pdb files (such as the public OS symbols) do not ***
*** contain the required information. Contact the group that ***
*** provided you with these symbols if you need this command to ***
*** work. ***
*** ***
*** Type referenced: nt!_KPRCB ***
*** ***
*************************************************************************
[The same "Your debugger is not using the correct symbols" block repeats seven more times, with type referenced nt!_KPRCB or nt!KPRCB.]
Probably caused by : DIOMonitor.sys ( DIOMonitor+2739 )

Followup: MachineOwner
---------

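This output already gives a rough indication of the cause of the blue screen: the "Probably caused by" line points to DIOMonitor.sys.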
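The two most useful lines in that output are the BugCheck line and the "Probably caused by" line. The sketch below extracts them from saved debugger output; the function name summarize and the dictionary keys are made up for this example. For stop code 0x8E (KERNEL_MODE_EXCEPTION_NOT_HANDLED), the first three parameters are the exception code, the address where the exception occurred, and the trap frame address; 0xc0000005 is an access violation.

```python
import re

BUGCHECK_RE = re.compile(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}")
CAUSE_RE = re.compile(r"Probably caused by\s*:\s*(\S+)\s*\(\s*([^)]*?)\s*\)")


def summarize(log_text):
    """Extract the stop code, its parameters, and the suspected image."""
    result = {}
    match = BUGCHECK_RE.search(log_text)
    if match:
        result["code"] = int(match.group(1), 16)
        # The parameters are printed as bare hexadecimal numbers.
        result["args"] = [int(arg.strip(), 16) for arg in match.group(2).split(",")]
    match = CAUSE_RE.search(log_text)
    if match:
        result["image"] = match.group(1)
        result["symbol"] = match.group(2)
    return result


SAMPLE = """\
BugCheck 8E, {c0000005, f7248be2, b7818668, 0}
Probably caused by : DIOMonitor.sys ( DIOMonitor+2739 )
"""

if __name__ == "__main__":
    info = summarize(SAMPLE)
    print(hex(info["code"]), [hex(arg) for arg in info["args"]], info["image"])
```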

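Commands
Use the lm command to list the loaded modules. If "Unable to load image" appears, the debugger could not find that file, and you should check whether the correct symbol files are loaded. Setting a symbol server path (the .symfix command) is essential, because the debugging machine and the crashed machine very likely have different environments.
Run kb to display the call stack. With correct symbols you can see the names of the functions that were called. If you are debugging a blue screen in your own driver, make sure the symbol path for that driver is set correctly; otherwise you get "Stack unwind information not available". After adding the correct symbol file (.pdb), reload symbols with the .reload command.
!thread and !process display the current thread and process. You can also inspect the thread and process structures with dt nt!_KTHREAD <address> and dt nt!_EPROCESS <address>.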
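The commands above can also be run non-interactively with the console debugger. The sketch below only builds the command line and does not launch anything. It assumes kd.exe from Debugging Tools for Windows is on the PATH, uses the debugger's -z (dump file), -y (symbol path) and -c (initial commands) options, and points at the public Microsoft symbol server. The cache directory C:\symbols and the function name build_kd_command are placeholders chosen for this example.

```python
MS_SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def build_kd_command(dump_path, cache_dir=r"C:\symbols",
                     driver_symbols=None, debugger="kd.exe"):
    """Build an argument list that opens a dump and runs the basic commands."""
    # srv*<cache>*<server> downloads symbols on demand and caches them locally.
    sympath = f"srv*{cache_dir}*{MS_SYMBOL_SERVER}"
    if driver_symbols:
        # Put the directory holding your own driver's .pdb first.
        sympath = f"{driver_symbols};{sympath}"
    # Commands from the text, then q to quit the debugger.
    script = ".reload; lm; kb; !thread; !process; q"
    return [debugger, "-z", dump_path, "-y", sympath, "-c", script]


if __name__ == "__main__":
    print(build_kd_command("MEMORY.DMP", driver_symbols=r"D:\build\pdb"))
```

The returned list can be handed to subprocess.run on a machine where the debugger is installed, with the output captured to a file for later parsing.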

WinDbg also provides a mechanism for analyzing dump files automatically. The command !analyze -v makes WinDbg run the analysis and display the following:

ADDITIONAL_DEBUG_TEXT:
Use '!findthebuild' command to search for the target build information.
If the build information is available, run '!findthebuild -s ; .reload' to set symbol path and load symbols.

FAULTING_MODULE: 80800000 nt

DEBUG_FLR_IMAGE_TIMESTAMP: 4c62488f

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - "0x%08lx"

FAULTING_IP:
fltMgr!FltParseFileNameInformation+e
f7248be2 668b4e02 mov cx,word ptr [esi+2]

TRAP_FRAME: b7818668 -- (.trap 0xffffffffb7818668)
ErrCode = 00000000
eax=00000000 ebx=89044690 ecx=00000000 edx=783f0002 esi=00000000 edi=b7818744
eip=f7248be2 esp=b78186dc ebp=b78186ec iopl=0 nv up ei pl zr na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246
fltMgr!FltParseFileNameInformation+0xe:
f7248be2 668b4e02 mov cx,word ptr [esi+2] ds:0023:00000002=????
Resetting default scope

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0x8E

CURRENT_IRQL: 0

LAST_CONTROL_TRANSFER: from 8082d800 to 80827c63

STACK_TEXT:
WARNING: Stack unwind information not available. Following frames may be wrong.
b7818234 8082d800 0000008e c0000005 f7248be2 nt!KeBugCheckEx+0x1b
b78185f8 8088a262 b7818614 00000000 b7818668 nt!KeTerminateThread+0xee2
b781868c 80a5c456 00000000 00000000 00000023 nt!Kei386EoiHelper+0x1d2
b78186ec b8059739 00000000 89800338 898003ec hal!KfLowerIrql+0x62
b7818950 f7232b73 89800394 b7818974 00000000 DIOMonitor+0x2739
b78189b8 f7234fc2 00800338 00000000 89800338 fltMgr!FltRequestOperationStatusCallback+0x5bd
b78189cc f72354f1 89800338 8a1cae48 b7818a0c fltMgr!FltGetIrpName+0x57a
b78189dc f7235b83 894ca7d0 8a1cae48 89800338 fltMgr!FltGetIrpName+0xaa9
b7818a0c f72435de b7818a2c 00000000 00000000 fltMgr!FltGetIrpName+0x113b
b7818a48 8081df65 894ca7d0 8a1cae48 8a1cae48 fltMgr!FltProcessFileLock+0x220c
b7818a5c 808f8f71 b7818c04 8af86018 00000000 nt!IofCallDriver+0x45
b7818b44 80937942 8af86030 00000000 8a7fad60 nt!NtWriteFile+0x647d
b7818bc4 80933a76 00000000 b7818c04 00000040 nt!NtMakePermanentObject+0xe10
b7818c18 808eae25 00000000 00000000 814a9001 nt!ObOpenObjectByName+0xea
b7818c94 808ec0bf 0219f9b4 80100080 0219f950 nt!IoCreateController+0x507
b7818cf0 808eeb4e 0219f9b4 80100080 0219f950 nt!IoCreateFile+0xa3
b7818d30 8088978c 0219f9b4 80100080 0219f950 nt!NtCreateFile+0x30
b7818d64 7c9585ec badb0d00 0219f918 00000000 nt!KeReleaseInStackQueuedSpinLockFromDpcLevel+0xb64
b7818d68 badb0d00 0219f918 00000000 00000000 0x7c9585ec
b7818d6c 0219f918 00000000 00000000 00000000 0xbadb0d00
b7818d70 00000000 00000000 00000000 00000000 0x219f918

STACK_COMMAND: kb

FOLLOWUP_IP:
DIOMonitor+2739
b8059739 85c0 test eax,eax

SYMBOL_STACK_INDEX: 4

SYMBOL_NAME: DIOMonitor+2739

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: DIOMonitor

IMAGE_NAME: DIOMonitor.sys

BUCKET_ID: WRONG_SYMBOLS

Followup: MachineOwner

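The automatic analysis is generally organized in this order: an explanation of the stop code, the register state from the trap frame, bug check attributes (some errors, such as divide-by-zero, show up here), the call stack, the location of the faulting instruction (FOLLOWUP_IP), the faulting source and assembly lines, and information about the faulting module (including who owns it).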
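To see how a stack index such as SYMBOL_STACK_INDEX: 4 can relate to the STACK_TEXT section, the sketch below walks the frames and reports the first one that does not belong to a system module. The set of system modules (nt, hal, fltMgr) and the function name first_third_party_frame are assumptions made for this example; this is not how !analyze itself picks the frame.

```python
import re

# ChildEBP, RetAddr, three argument words, then the symbol.
FRAME_RE = re.compile(
    r"^\s*[0-9a-fA-F]{8}\s+[0-9a-fA-F]{8}(?:\s+[0-9a-fA-F]{8}){3}\s+(\S+)\s*$"
)
SYSTEM_MODULES = frozenset({"nt", "hal", "fltmgr"})


def first_third_party_frame(stack_lines, system_modules=SYSTEM_MODULES):
    """Return (frame_index, symbol) of the first non-system frame, or None."""
    index = -1
    for line in stack_lines:
        match = FRAME_RE.match(line)
        if match is None:
            continue  # skip warnings and anything that is not a frame line
        index += 1
        symbol = match.group(1)
        if symbol.startswith("0x"):
            continue  # raw address with no module name
        module = re.split(r"[!+]", symbol, maxsplit=1)[0]
        if module.lower() not in system_modules:
            return index, symbol
    return None


STACK_TEXT = """\
WARNING: Stack unwind information not available. Following frames may be wrong.
b7818234 8082d800 0000008e c0000005 f7248be2 nt!KeBugCheckEx+0x1b
b78185f8 8088a262 b7818614 00000000 b7818668 nt!KeTerminateThread+0xee2
b781868c 80a5c456 00000000 00000000 00000023 nt!Kei386EoiHelper+0x1d2
b78186ec b8059739 00000000 89800338 898003ec hal!KfLowerIrql+0x62
b7818950 f7232b73 89800394 b7818974 00000000 DIOMonitor+0x2739
b78189b8 f7234fc2 00800338 00000000 89800338 fltMgr!FltRequestOperationStatusCallback+0x5bd
""".splitlines()

if __name__ == "__main__":
    print(first_third_party_frame(STACK_TEXT))
```

On the frames quoted from the dump this reports frame 4, DIOMonitor+0x2739, which matches the index and symbol in the analysis output.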

The r command displays the register state at the time of the crash and the last instruction executed.
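The register dump is often enough to explain the fault. In the trap frame from this dump, the faulting instruction is mov cx,word ptr [esi+2] and esi is 0, so the read targets address 0x00000002, which is consistent with the access violation code c0000005 and suggests a NULL pointer. The sketch below redoes that arithmetic from the register text; the helper name parse_registers is made up for this example, and because the symbols in this session are unreliable, the function name shown by the debugger should be treated with caution.

```python
import re

# Matches pairs such as eax=00000000; four-letter fields like iopl=0 are skipped.
REGISTER_RE = re.compile(r"\b([a-z]{2,3})=([0-9a-fA-F]+)\b")


def parse_registers(text):
    """Map register names to integer values from r-style debugger output."""
    return {name: int(value, 16) for name, value in REGISTER_RE.findall(text)}


TRAP_FRAME = """\
eax=00000000 ebx=89044690 ecx=00000000 edx=783f0002 esi=00000000 edi=b7818744
eip=f7248be2 esp=b78186dc ebp=b78186ec iopl=0 nv up ei pl zr na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246
"""

regs = parse_registers(TRAP_FRAME)
# Faulting instruction: mov cx, word ptr [esi+2]
fault_address = (regs["esi"] + 2) & 0xFFFFFFFF

if __name__ == "__main__":
    print(f"esi={regs['esi']:08x} -> read from {fault_address:08x}")
```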

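The d command displays the contents of memory at a given address. Once the faulting line of code has been located, you can go on to further kernel and system debugging.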