
Study Notes on 《Windows PE权威指南》 (Part 2): PE File Structure and Field Descriptions

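2016-08-02 17:02
After finishing Win32 assembly programming, I noticed that the small tool given in chapter 2 of the PE book could have a richer interface and feature set. That would not be hard, and I plan to rework it later. Today I read most of chapter 3 and stopped just before the PE programming part. My notes are pasted below, together with some annotations from MSDN, which may make the material easier to understand.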

A diagram of the PE structure from MSDN (image not preserved):



1
1.5
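A first look at PE files
Purpose: let EXE files work under different CPU instruction sets (this depends on the build environment); this is probably not the same thing as portability
In other words, compiled files share one uniform format
Most exe and dll files are PE files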

Static program (layout on disk)
Offset	Content
0x0000	PE header
0x0400	Code section
0x0600	Import section
0x0800	Data section
3
3.1
How PE organizes its data
Header + body
Structures + bytecode
3.2
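Addresses
VA: virtual memory address
Base address of the process + RVA
RVA: relative virtual memory address
FOA: file offset address
Independent of memory; a static offset within the file on disk
Special addresses
Counted from some specific position
Pointers
A field that stores an address is a pointer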
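The relation between VA and RVA above is plain addition. A minimal sketch, assuming a hypothetical load address of 0x00400000 and a hypothetical RVA of 0x1000 (both values and both function names are made up for illustration):

```python
# Hypothetical values, chosen only to illustrate the arithmetic.
IMAGE_BASE = 0x00400000  # assumed base address of the loaded image
ENTRY_RVA = 0x1000       # an RVA is an offset from that base


def rva_to_va(image_base: int, rva: int) -> int:
    """VA = base address of the loaded image + RVA."""
    return image_base + rva


def va_to_rva(image_base: int, va: int) -> int:
    """Inverse direction: subtract the base address to recover the RVA."""
    return va - image_base


print(hex(rva_to_va(IMAGE_BASE, ENTRY_RVA)))  # 0x401000
print(hex(va_to_rva(IMAGE_BASE, 0x00401000)))  # 0x1000
```

An FOA cannot be derived this way; converting an RVA to a file offset needs the section table.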
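Alignment
Memory alignment
A section's alignment in memory is at least one page: 4 KB (1000h) on x86 and x64, 8 KB (2000h) on IA-64 (Itanium)
File alignment
On disk, sections are aligned to one sector, 512 B (200h)
Resource data alignment
4 bytes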
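All three kinds of alignment come down to rounding a size or offset up to a multiple of the alignment value. A small sketch of that rounding; the helper name `align_up` and the sample size 0x1234 are illustrative and not taken from the book:

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (alignment > 0)."""
    return (value + alignment - 1) // alignment * alignment


FILE_ALIGNMENT = 0x200      # one 512-byte sector
SECTION_ALIGNMENT = 0x1000  # one 4 KB page
RESOURCE_ALIGNMENT = 4      # resource data is aligned to 4 bytes

raw_size = 0x1234  # hypothetical unaligned size
print(hex(align_up(raw_size, FILE_ALIGNMENT)))     # 0x1400
print(hex(align_up(raw_size, SECTION_ALIGNMENT)))  # 0x2000
print(align_up(10, RESOURCE_ALIGNMENT))            # 12
```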
Unicode strings
Capacity
Where used: resource files
End detection: not necessarily terminated by \0
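Because such a string need not end with \0, a reader has to learn its length some other way. The sketch below assumes the layout used by name strings in the resource directory, a 16-bit character count followed by UTF-16LE characters. The function name and the sample bytes are made up for illustration.

```python
import struct


def read_resource_string(data: bytes, offset: int) -> str:
    """Read a length-prefixed UTF-16LE string.

    Assumed layout: WORD length (in characters), then that many
    2-byte characters, with no terminating NUL required.
    """
    (length,) = struct.unpack_from("<H", data, offset)
    start = offset + 2
    return data[start:start + length * 2].decode("utf-16-le")


# Hypothetical sample: the string "MENU" followed by unrelated bytes.
blob = struct.pack("<H", 4) + "MENU".encode("utf-16-le") + b"\xff\xff"
print(read_resource_string(blob, 0))  # MENU
```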
3.3-3.4
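DOS MZ header
Defined by IMAGE_DOS_HEADER and generated automatically at build time (64 B); holds parameters used under DOS
DOS stub
Code that is executable under DOS; generated automatically at build time and can be modified by hand (you can try this in a virtual machine)
PE header
The e_lfanew field of IMAGE_DOS_HEADER (the last field of the DOS MZ header, a doubleword) gives the location of the PE header

PE signature (4 B): PE\0\0
Standard PE header (20 B)
Defined by IMAGE_FILE_HEADER; holds global information
Optional (extended) PE header
Defined by IMAGE_OPTIONAL_HEADER32; holds more detailed information
Its last field, DataDirectory, is an array of IMAGE_DATA_DIRECTORY entries, each giving the RVA and size of one kind of data
Section table (40 B × n)
Defined by IMAGE_SECTION_HEADER; describes each section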
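The chain described above (MZ header, e_lfanew, PE signature, 20-byte standard header, optional header, section table) can be walked with a few offset calculations. The following is a sketch run against a synthetic buffer rather than a real file. The function name `locate_pe_parts` and the sample values are assumptions for illustration.

```python
import struct


def locate_pe_parts(data: bytes) -> dict:
    """Return the file offsets of the main PE header components."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew is the last field of the 64-byte IMAGE_DOS_HEADER (offset 0x3C).
    (e_lfanew,) = struct.unpack_from("<l", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    file_header = e_lfanew + 4  # IMAGE_FILE_HEADER follows the signature
    (num_sections,) = struct.unpack_from("<H", data, file_header + 2)
    (opt_size,) = struct.unpack_from("<H", data, file_header + 16)
    optional_header = file_header + 20
    section_table = optional_header + opt_size
    return {
        "file_header": file_header,
        "optional_header": optional_header,
        "section_table": section_table,
        "section_table_end": section_table + 40 * num_sections,
    }


# Synthetic example: PE header at 0x80, 3 sections, PE32 optional header.
fake = bytearray(0x200)
fake[0:2] = b"MZ"
struct.pack_into("<l", fake, 0x3C, 0x80)
fake[0x80:0x84] = b"PE\0\0"
struct.pack_into("<HHIIIHH", fake, 0x84, 0x14C, 3, 0, 0, 0, 0xE0, 0x010F)

parts = locate_pe_parts(bytes(fake))
print({name: hex(offset) for name, offset in parts.items()})
```

Rounding `section_table_end` up to the file alignment gives the value one would expect in SizeOfHeaders for this layout.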
3.5
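Signature: PE\0\0 // if this is overwritten, the system cannot load the file; this can be used to keep a virus from starting automatically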

IMAGE_FILE_HEADER
WORD Machine // if the machine type does not match, the loader reports "not a valid Win32 application"
0x14d Intel i860
0x14c Intel I386 (same ID used for 486 and 586)
0x162 MIPS R3000
0x166 MIPS R4000
0x183 DEC Alpha AXP
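WORD NumberOfSections // must be at least 1 and at most 96; unless the file has no sections, it must match the actual count, otherwise the loader reports "not a valid Win32 application"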
The number of sections in the file.
DWORD TimeDateStamp // of little practical use; unrelated to the file system's created, modified, and accessed times
The time that the linker (or compiler for an OBJ file) produced this file.
This field holds the number of seconds since December 31st, 1969, at 4:00 P.M.
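The reference point quoted above, December 31st, 1969 at 4:00 P.M., is the Unix epoch (January 1, 1970, 00:00 UTC) expressed in Pacific Standard Time. A short sketch of decoding the field; the function name is illustrative:

```python
from datetime import datetime, timedelta, timezone

PACIFIC_STANDARD = timezone(timedelta(hours=-8))  # fixed UTC-8 offset, no DST


def decode_timestamp(stamp: int) -> datetime:
    """Interpret TimeDateStamp as seconds since 1970-01-01 00:00 UTC."""
    return datetime.fromtimestamp(stamp, tz=timezone.utc)


epoch = decode_timestamp(0)
print(epoch.isoformat())                               # 1970-01-01T00:00:00+00:00
print(epoch.astimezone(PACIFIC_STANDARD).isoformat())  # 1969-12-31T16:00:00-08:00
```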
DWORD PointerToSymbolTable // this value is 0
The file offset of the COFF symbol table. This field is only used in OBJ files and PE files with COFF debug information.
PE files support multiple debug formats, so debuggers should refer to the IMAGE_DIRECTORY_ENTRY_DEBUG entry in the data directory (defined later).
DWORD NumberOfSymbols // this value is 0
The number of symbols in the COFF symbol table. See above.
WORD SizeOfOptionalHeader // 00E0h for 32-bit (PE32), 00F0h for 64-bit (PE32+); determined by how the program was built, not by the CPU it runs on
The size of IMAGE_OPTIONAL_HEADER32.
In OBJs, the field is 0.
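WORD Characteristics // file attributes; each bit carries one piece of information
Executable file: 010Fh // executable image, relocation info stripped, symbol and line-number info stripped, 32-bit machine
DLL file: 210Eh // DLL, executable image, symbol and line-number info stripped, 32-bit machine; bit 0 is clear, so relocation info is retained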
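To make the 20-byte layout and the two typical Characteristics values concrete, here is a sketch that unpacks an IMAGE_FILE_HEADER and names the bits that are set. The flag table is only the subset of the IMAGE_FILE_* constants from WINNT.H needed for these two values. The function names and the sample header bytes are made up for illustration.

```python
import struct

# Subset of the IMAGE_FILE_* flags from WINNT.H (prefix omitted).
FILE_FLAGS = {
    0x0001: "RELOCS_STRIPPED",
    0x0002: "EXECUTABLE_IMAGE",
    0x0004: "LINE_NUMS_STRIPPED",
    0x0008: "LOCAL_SYMS_STRIPPED",
    0x0100: "32BIT_MACHINE",
    0x2000: "DLL",
}

FIELD_NAMES = (
    "Machine", "NumberOfSections", "TimeDateStamp", "PointerToSymbolTable",
    "NumberOfSymbols", "SizeOfOptionalHeader", "Characteristics",
)


def parse_file_header(raw: bytes) -> dict:
    """Unpack the 20-byte IMAGE_FILE_HEADER: two WORDs, three DWORDs, two WORDs."""
    return dict(zip(FIELD_NAMES, struct.unpack("<HHIIIHH", raw[:20])))


def decode_file_flags(value: int) -> list:
    """List the names of the known flag bits set in Characteristics."""
    return [name for bit, name in FILE_FLAGS.items() if value & bit]


# Hypothetical header: i386, 3 sections, PE32 optional header, typical EXE flags.
sample = struct.pack("<HHIIIHH", 0x14C, 3, 0, 0, 0, 0xE0, 0x010F)
header = parse_file_header(sample)
print(hex(header["Machine"]), header["NumberOfSections"])  # 0x14c 3
print(decode_file_flags(header["Characteristics"]))
print(decode_file_flags(0x210E))
```

Note that 0x210E does not include RELOCS_STRIPPED, since bit 0 is clear in that value.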

IMAGE_OPTIONAL_HEADER32
WORD Magic // image type
PE32: 0x010B
ROM image: 0x0107
PE32+: 0x020B // 64-bit
BYTE MajorLinkerVersion
BYTE MinorLinkerVersion
The version of the linker that produced this file.
The numbers should be displayed as decimal values, rather than as hex.
A typical linker version is 2.23.
DWORD SizeOfCode // total size of all code sections after file alignment; a multiple of 512 B with the default alignment
The combined and rounded-up size of all the code sections.
Usually, most files only have one code section, so this field matches the size of the .text section.
DWORD SizeOfInitializedData
This is supposedly the total size of all the sections that are composed of initialized data (not including code segments.)
However, it doesn't seem to be consistent with what appears in the file.
DWORD SizeOfUninitializedData
The size of the sections that the loader commits space for in the virtual address space,
but that don't take up any space in the disk file.
These sections don't need to have specific values at program startup, hence the term uninitialized data.
Uninitialized data usually goes into a section called .bss.
DWORD AddressOfEntryPoint // entry point, as an RVA relative to the image base; viruses, packers, and patchers commonly hijack this value
The address where the loader will begin execution.
This is an RVA, and can usually be found in the .text section.
DWORD BaseOfCode // start address of the .text code section
The RVA where the file's code sections begin.
The code sections typically come before the data sections and after the PE header in memory.
This RVA is usually 0x1000 in Microsoft Linker-produced EXEs.
Borland's TLINK32 looks like it adds the image base to the RVA of the first code section and stores the result in this field.
DWORD BaseOfData // start address of the .data data section
The RVA where the file's data sections begin.
The data sections typically come last in memory, after the PE header and the code sections.
DWORD ImageBase // preferred load address; no relocation is needed when the image loads here; 0x00400000 for executables, 0x10000000 for DLLs
When the linker creates an executable, it assumes that the file will be memory-mapped to a specific location in memory.
That address is stored in this field, assuming a load address allows linker optimizations to take place.
If the file really is memory-mapped to that address by the loader, the code doesn't need any patching before it can be run.
In executables produced for Windows NT, the default image base is 0x10000. For DLLs, the default is 0x400000.
In Windows 95, the address 0x10000 can't be used to load 32-bit EXEs because it lies within a linear address region shared by all processes.
Because of this, Microsoft has changed the default base address for Win32 executables to 0x400000.
Older programs that were linked assuming a base address of 0x10000 will take longer to load under Windows 95
because the loader needs to apply the base relocations.
DWORD SectionAlignment // alignment of sections in memory; 0x1000 on x86 and x64, 0x2000 on IA-64
When mapped into memory, each section is guaranteed to start at a virtual address that's a multiple of this value.
For paging purposes, the default section alignment is 0x1000.
DWORD FileAlignment // alignment of sections in the file; 0x0200 = 512 B, the size of a sector
In the PE file, the raw data that comprises each section is guaranteed to start at a multiple of this value.
The default value is 0x200 bytes,
probably to ensure that sections always start at the beginning of a disk sector(which are also 0x200 bytes in length).
This field is equivalent to the segment/resource alignment size in NE files.
Unlike NE files, PE files typically don't have hundreds of sections,
so the space wasted by aligning the file sections is almost always very small.
WORD MajorOperatingSystemVersion // omitted
WORD MinorOperatingSystemVersion // omitted
The minimum version of the operating system required to use this executable.
This field is somewhat ambiguous since the subsystem fields (a few fields later) appear to serve a similar purpose.
This field defaults to 1.0 in all Win32 EXEs to date.
WORD MajorImageVersion // omitted
WORD MinorImageVersion // omitted
A user-definable field.
This allows you to have different versions of an EXE or DLL. You set these fields via the linker /VERSION switch.
For example, "LINK /VERSION:2.0 myobj.obj".
WORD MajorSubsystemVersion // omitted
WORD MinorSubsystemVersion // omitted
Contains the minimum subsystem version required to run the executable.
A typical value for this field is 3.10 (meaning Windows NT 3.1).
DWORD Reserved1 // omitted
Seems to always be 0.
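DWORD SizeOfImage // size of the image once mapped into memory; when the headers and every section each fit in one page, this is 1000h for the headers + 1000h × number of sections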
This appears to be the total size of the portions of the image that the loader has to worry about.
It is the size of the region starting at the image base up to the end of the last section.
The end of the last section is rounded up to the nearest multiple of the section alignment.
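Following the description above, SizeOfImage can be estimated from the section table as the end of the last section rounded up to the section alignment. A sketch with a hypothetical three-section layout; the function name and the (VirtualAddress, VirtualSize) pairs are made up for illustration:

```python
def size_of_image(sections, section_alignment=0x1000):
    """sections: iterable of (VirtualAddress, VirtualSize) pairs.

    Returns the end of the last section, rounded up to the
    section alignment, measured from the image base.
    """
    end = max(address + size for address, size in sections)
    return (end + section_alignment - 1) // section_alignment * section_alignment


# Three small sections, each fitting in one 4 KB page.
layout = [(0x1000, 0x35A), (0x2000, 0x120), (0x3000, 0x80)]
print(hex(size_of_image(layout)))  # 0x4000
```

For this layout the result equals 1000h for the headers plus 1000h for each of the three sections, which is the shortcut noted beside the field name.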
DWORD SizeOfHeaders // size of all headers plus the section table after file alignment; a multiple of 200h
The size of the PE header and the section (object) table.
The raw data for the sections starts immediately after all the header components.
DWORD CheckSum // checksum; usually 0 for ordinary PE files, nonzero for kernel drivers and system DLLs
Supposedly a CRC checksum of the file. As in other Microsoft executable formats, this field is ignored and set to 0.
The one exception to this rule is for trusted services and these EXEs must have a valid checksum.
WORD Subsystem // user-interface subsystem
The type of subsystem that this executable uses for its user interface.
WINNT.H defines the following values:
NATIVE  1 Doesn't require a subsystem (such as a device driver)
WINDOWS_GUI  2 Runs in the Windows GUI subsystem
WINDOWS_CUI  3 Runs in the Windows character subsystem (a console app)
OS2_CUI  5 Runs in the OS/2 character subsystem (OS/2 1.x apps only)
POSIX_CUI  7 Runs in the Posix character subsystem
WORD DllCharacteristics // load-time attributes of the file
A set of flags indicating under which circumstances a DLL's initialization function (such as DllMain) will be called.
This value appears to always be set to 0, yet the operating system still calls the DLL initialization function for all four events.
The following values are defined:
1  Call when DLL is first loaded into a process's address space
2  Call when a thread terminates
4  Call when a thread starts up
8  Call when DLL exits
DWORD SizeOfStackReserve // stack size reserved at initialization, 1 MB
The amount of virtual memory to reserve for the initial thread's stack.
Not all of this memory is committed, however (see the next field).
This field defaults to 0x100000 (1MB).
If you specify 0 as the stack size to CreateThread, the resulting thread will also have a stack of this same size.
DWORD SizeOfStackCommit // stack size committed at initialization, 4 KB
The amount of memory initially committed for the initial thread's stack.
This field defaults to 0x1000 bytes (1 page) for the Microsoft Linker while TLINK32 makes it two pages.
DWORD SizeOfHeapReserve // heap size reserved at initialization
The amount of virtual memory to reserve for the initial process heap.
This heap's handle can be obtained by calling GetProcessHeap. Not all of this memory is committed (see the next field).
DWORD SizeOfHeapCommit // heap size committed at initialization
The amount of memory initially committed in the process heap. The default is one page.
DWORD LoaderFlags // debugging support, usually 0
From WINNT.H, these appear to be fields related to debugging support.
I've never seen an executable with either of these bits enabled, nor is it clear how to get the linker to set them.
The following values are defined:
1. Invoke a breakpoint instruction before starting the process
2. Invoke a debugger on the process after it's been loaded
DWORD NumberOfRvaAndSizes // number of entries in the data directory, usually 0010h
The number of entries in the DataDirectory array (below). This value is always set to 16 by the current tools.
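IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES] // array of data directory entries
0  Export table, in .edata; holds exported functions and data
1  Import table, in .idata; holds imported symbols
2  Resource table, in .rsrc; holds the addresses of all resources, organized as a multi-level sorted tree
3  Exception table, in .pdata; holds an array of exception-handler function entries
4  Attribute certificate table; holds attribute certificate entries
5  Base relocation table, in .reloc; holds relocation information
6  Debug table, in .debug; holds an array of IMAGE_DEBUG_DIRECTORY structures
7  Reserved (architecture), must be 0
8  GlobalPtr, the value of the global pointer register
9  TLS table, used for thread-local storage
10  Load configuration table, used for SEH
11  Bound import table
12  Import address table (IAT)
13  Delay-load import table
14  CLR runtime header, in .cormeta; used by the .NET Framework
15  Reserved by the system, undefined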
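Each data directory entry is a pair of DWORDs (RVA, size). In a PE32 optional header, NumberOfRvaAndSizes sits at offset 92 and the array starts at offset 96. The sketch below reads the array from a synthetic 224-byte optional header. The short names follow the order of the IMAGE_DIRECTORY_ENTRY_* constants in WINNT.H. The function name and the sample import entry are made up, and PE32+ is deliberately not handled.

```python
import struct

# Short names in IMAGE_DIRECTORY_ENTRY_* order (index 0 through 15).
DIRECTORY_NAMES = [
    "Export", "Import", "Resource", "Exception", "Certificate",
    "BaseReloc", "Debug", "Architecture", "GlobalPtr", "TLS",
    "LoadConfig", "BoundImport", "IAT", "DelayImport", "CLR", "Reserved",
]


def parse_data_directories(optional_header: bytes) -> list:
    """Return (name, rva, size) for each entry of a PE32 optional header."""
    (magic,) = struct.unpack_from("<H", optional_header, 0)
    if magic != 0x010B:
        raise ValueError("this sketch handles PE32 only")
    (count,) = struct.unpack_from("<I", optional_header, 92)
    count = min(count, len(DIRECTORY_NAMES))
    entries = []
    for index in range(count):
        rva, size = struct.unpack_from("<II", optional_header, 96 + 8 * index)
        entries.append((DIRECTORY_NAMES[index], rva, size))
    return entries


# Synthetic PE32 optional header with only the import directory filled in.
fake_optional = bytearray(224)
struct.pack_into("<H", fake_optional, 0, 0x010B)
struct.pack_into("<I", fake_optional, 92, 16)
struct.pack_into("<II", fake_optional, 96 + 8 * 1, 0x2010, 0x3C)

directories = parse_data_directories(bytes(fake_optional))
for name, rva, size in directories:
    if size:
        print(name, hex(rva), hex(size))  # Import 0x2010 0x3c
```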

IMAGE_SECTION_HEADER
BYTE Name[IMAGE_SIZEOF_SHORT_NAME] // section name
This is an 8-byte ANSI name (not UNICODE) that names the section.
Most section names start with a . (such as ".text"), but this is not a requirement, as some PE documentation would have you believe.
You can name your own sections with either the segment directive in assembly language,
or with "#pragma data_seg" and "#pragma code_seg" in the Microsoft C/C++ compiler.
It's important to note that if the section name takes up the full 8 bytes, there's no NULL terminator byte.
If you're a printf devotee, you can use %.8s to avoid copying the name string to another buffer where you can NULL-terminate it.
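The same care applies outside of C: when decoding the 8-byte Name field, stop at the first NUL if there is one, and otherwise take all 8 bytes. A minimal sketch, with an illustrative function name:

```python
def section_name(raw: bytes) -> str:
    """Decode the 8-byte Name field, which may lack a terminating NUL."""
    return raw[:8].split(b"\0", 1)[0].decode("ascii", errors="replace")


print(section_name(b".text\0\0\0"))  # .text
print(section_name(b".textbss"))     # .textbss (full 8 bytes, no NUL)
```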
union { DWORD PhysicalAddress; DWORD VirtualSize; } Misc; // actual size of the section before alignment
This field has different meanings, in EXEs or OBJs.
In an EXE, it holds the actual size of the code or data.
This is the size before rounding up to the nearest file alignment multiple.
The SizeOfRawData field (seems a bit of a misnomer) later on in the structure holds the rounded up value.
The Borland linker reverses the meaning of these two fields and appears to be correct.
For OBJ files, this field indicates the physical address of the section.
The first section starts at address 0.
To find the physical address in an OBJ file of the next section, add the SizeOfRawData value to the physical address of the current section.
DWORD VirtualAddress // RVA of the section
In EXEs, this field holds the RVA to where the loader should map the section.
To calculate the real starting address of a given section in memory,
add the base address of the image to the section's VirtualAddress stored in this field.
With Microsoft tools, the first section defaults to an RVA of 0x1000. In OBJs, this field is meaningless and is set to 0.
DWORD SizeOfRawData // size of the section after file alignment
In EXEs, this field contains the size of the section after it's been rounded up to the file alignment size.
For example, assume a file alignment size of 0x200.
If the VirtualSize field from above says that the section is 0x35A bytes in length,
this field will say that the section is 0x400 bytes long.
In OBJs, this field contains the exact size of the section emitted by the compiler or assembler.
In other words, for OBJs, it's equivalent to the VirtualSize field in EXEs.
DWORD PointerToRawData // file offset of the section, file-aligned
This is the file-based offset of where the raw data emitted by the compiler or assembler can be found.
If your program memory maps a PE or COFF file itself (rather than letting the operating system load it),
this field is more important than the VirtualAddress field.
You'll have a completely linear file mapping in this situation, so you'll find the data for the sections at this offset,
rather than at the RVA specified in the VirtualAddress field.
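When working on a linearly mapped file, an RVA has to be translated through the section table: find the section whose range contains the RVA, then add the distance from the section's VirtualAddress to its PointerToRawData. A sketch under the assumption of a hypothetical three-section table. The section names, RVAs, and the function name are made up; the file offsets 0x0400, 0x0600, and 0x0800 mirror the static layout listed near the top of these notes.

```python
# (name, VirtualAddress, SizeOfRawData, PointerToRawData), hypothetical values
SECTIONS = [
    (".text", 0x1000, 0x200, 0x0400),
    (".rdata", 0x2000, 0x200, 0x0600),
    (".data", 0x3000, 0x200, 0x0800),
]


def rva_to_file_offset(rva, sections):
    """Map an RVA to a file offset, or None if no raw data backs it."""
    for _name, virtual_address, raw_size, raw_pointer in sections:
        if virtual_address <= rva < virtual_address + raw_size:
            return rva - virtual_address + raw_pointer
    return None


print(hex(rva_to_file_offset(0x2010, SECTIONS)))  # 0x610
print(rva_to_file_offset(0x3300, SECTIONS))       # None (beyond the raw data)
```

This sketch ignores RVAs that fall inside the headers and treats anything beyond SizeOfRawData as having no file backing.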
DWORD PointerToRelocations // pointer to the relocation table; 0 in executables
In OBJs, this is the file-based offset to the relocation information for this section.
The relocation information for each OBJ section immediately follows the raw data for that section.
In EXEs, this field (and the subsequent field) are meaningless, and set to 0.
When the linker creates the EXE, it resolves most of the fixups,
leaving only base address relocations and imported functions to be resolved at load time.
The information about base relocations and imported functions is kept in their own sections,
so there's no need for an EXE to have per-section relocation data following the raw section data.
DWORD PointerToLinenumbers // pointer to the line-number table, used for debugging
This is the file-based offset of the line number table.
A line number table correlates source file line numbers to the addresses of the code generated for a given line.
In modern debug formats like the CodeView format, line number information is stored as part of the debug information.
In the COFF debug format, however, the line number information is stored separately from the symbolic name/type information.
Usually, only code sections (such as .text) have line numbers.
In EXE files, the line numbers are collected towards the end of the file, after the raw data for the sections.
In OBJ files, the line number table for a section comes after the raw section data and the relocation table for that section.
WORD NumberOfRelocations // number of entries in the relocation table
The number of relocations in the relocation table for this section (the PointerToRelocations field from above).
This field seems relevant only for OBJ files.
WORD NumberOfLinenumbers // number of line-number entries
The number of line numbers in the line number table for this section (the PointerToLinenumbers field from above).
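DWORD Characteristics // section attributes
What most programmers call flags, the COFF/PE format calls characteristics.
This field is a set of flags that indicate the section's attributes (such as code/data, readable, or writeable).
For a complete list of all possible section attributes, see the IMAGE_SCN_XXX_XXX #defines in WINNT.H.
Code sections are usually 0x60000020: executable, readable, contains code
Data sections are usually 0xC0000040: readable, writable, contains initialized data
Constant (read-only data) sections are usually 0x40000040: readable, contains initialized data
Resource sections are the same as constant sections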
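To tie the section header fields together, here is a sketch that unpacks one 40-byte IMAGE_SECTION_HEADER and names the Characteristics bits. The flag table is only a subset of the IMAGE_SCN_* constants in WINNT.H. The function names and the sample header are made up for illustration.

```python
import struct

# 8-byte name, six DWORDs, two WORDs, one DWORD = 40 bytes
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")

# Subset of the IMAGE_SCN_* flags from WINNT.H (prefix omitted).
SECTION_FLAGS = {
    0x00000020: "CNT_CODE",
    0x00000040: "CNT_INITIALIZED_DATA",
    0x00000080: "CNT_UNINITIALIZED_DATA",
    0x20000000: "MEM_EXECUTE",
    0x40000000: "MEM_READ",
    0x80000000: "MEM_WRITE",
}


def decode_section_flags(value: int) -> list:
    """List the names of the known flag bits set in Characteristics."""
    return [name for bit, name in SECTION_FLAGS.items() if value & bit]


def parse_section_header(raw: bytes) -> dict:
    """Unpack one section header and keep the fields used most often."""
    (name, virtual_size, virtual_address, raw_size, raw_pointer,
     _relocs, _linenums, _num_relocs, _num_linenums,
     characteristics) = SECTION_HEADER.unpack(raw[:40])
    return {
        "Name": name.split(b"\0", 1)[0].decode("ascii", errors="replace"),
        "VirtualSize": virtual_size,
        "VirtualAddress": virtual_address,
        "SizeOfRawData": raw_size,
        "PointerToRawData": raw_pointer,
        "Flags": decode_section_flags(characteristics),
    }


# Hypothetical code section: 0x35A bytes of code, padded to 0x400 on disk.
sample_section = SECTION_HEADER.pack(
    b".text\0\0\0", 0x35A, 0x1000, 0x400, 0x400, 0, 0, 0, 0, 0x60000020)
print(parse_section_header(sample_section))
print(decode_section_flags(0xC0000040))
print(decode_section_flags(0x40000040))
```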

