
How Linux Builds an Executable File: The Linking Process and Its Principles

2011-05-16 19:37

2.2 Overview of Linkers and the Linking Process

Figure 2.2 illustrates how different tools take various input files and
generate appropriate output files that are ultimately used to build an
executable image.

Figure 2.2: Creating an image file for the target system.

The developer writes the program in C/C++ source files and header
files. Some parts of the program can be written in assembly language and
placed in corresponding assembly source files. The developer creates a
makefile for the make utility, which tracks file modifications and
invokes the compiler and the assembler to rebuild the affected object
files when necessary. From the source files, the compiler and the
assembler produce object files that contain both machine binary code and
program data. The archive utility concatenates a collection of object
files to form a library. The linker takes these object files as input
and produces either an executable image or an object file that can be
used for additional linking with other object files. The linker command
file instructs the linker on how to combine the object files and where
to place the binary code and data in the target embedded system.

The main function of the linker is to combine multiple
object files into a larger relocatable object file, a shared object
file, or a final executable image. In a typical program, a section of
code in one source file can reference variables defined in another
source file. A function in one source file can call a function in
another source file. The global variables and non-static functions are
commonly referred to as global symbols.

In source files, these symbols have various names, for example, a
global variable called foo_bar or a global function called func_a. In
the final executable binary image, a symbol refers to an address
location in memory. The content of this memory location is either data
for variables or executable code for functions.

The compiler creates a symbol table containing the symbol name to
address mappings as part of the object file it produces. When creating
relocatable output, the compiler generates, for each symbol, an address
that is relative to the file being compiled. Consequently, these
addresses are generated with respect to offset 0. The symbol table
contains the global symbols defined in the file being compiled, as well
as the external symbols referenced in the file that the linker needs to
resolve. The linking process performed by the linker involves symbol
resolution and symbol relocation.
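
As a rough illustration of the symbol table described above, the sketch below models two object files in Python. It is a toy model and not a real object-file format: the ObjectFile class, the file names main.o and util.o, and all sizes and offsets are assumptions made for the example. Only foo_bar and func_a come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectFile:
    """Toy stand-in for a relocatable object file (hypothetical layout)."""
    name: str
    size: int                                       # bytes of code + data
    defined: dict = field(default_factory=dict)     # symbol -> offset from 0
    undefined: set = field(default_factory=set)     # externals to resolve


# main.o defines main and refers to two symbols it does not define.
main_o = ObjectFile("main.o", size=16, defined={"main": 0},
                    undefined={"foo_bar", "func_a"})

# util.o defines both symbols; its offsets are relative to its own offset 0.
util_o = ObjectFile("util.o", size=12, defined={"func_a": 0, "foo_bar": 8})

for obj in (main_o, util_o):
    for sym, offset in sorted(obj.defined.items()):
        print(f"{obj.name}: {sym} defined at offset {offset:#x}")
    for sym in sorted(obj.undefined):
        print(f"{obj.name}: {sym} undefined (left for the linker)")
```

Both files place their first symbol at offset 0, which is why the addresses in a relocatable file cannot be final.
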

Symbol resolution is the process in which the linker goes through each
object file and determines, for the object file, in which (other) object
file or files the external symbols are defined. Sometimes the linker
must process the list of object files multiple times while trying to
resolve all of the external symbols. When external symbols are defined
in a static library, the linker copies the object files that define them
from the library and writes them into the final image.
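
The resolution step can be sketched as a loop that keeps scanning library members until no further symbols can be satisfied. This is a simplified model under stated assumptions: resolve_symbols, the file names, and the symbol helper are hypothetical, and real linkers follow more detailed rules (for example about archive ordering and duplicate definitions).

```python
def resolve_symbols(objects, library):
    """Each item is (file name, defined symbols, referenced symbols)."""
    linked = list(objects)          # files that will end up in the image
    candidates = list(library)      # library members not yet pulled in
    definitions = {}                # symbol -> file that defines it
    for name, defs, _refs in linked:
        for sym in defs:
            definitions.setdefault(sym, name)

    progress = True
    while progress:                 # repeat passes until nothing changes
        progress = False
        referenced = set().union(*(refs for _n, _d, refs in linked))
        missing = referenced - set(definitions)
        for member in list(candidates):
            name, defs, _refs = member
            if missing & defs:      # member defines a symbol we still need
                linked.append(member)
                candidates.remove(member)
                for sym in defs:
                    definitions.setdefault(sym, name)
                progress = True

    referenced = set().union(*(refs for _n, _d, refs in linked))
    unresolved = referenced - set(definitions)
    return definitions, [name for name, _d, _r in linked], unresolved


objects = [("main.o", {"main"}, {"func_a"})]
library = [
    ("helper.o", {"helper"}, set()),        # needed only once a.o is linked
    ("a.o", {"func_a"}, {"helper"}),
    ("unused.o", {"never_called"}, set()),  # never referenced, never copied
]
definitions, link_order, unresolved = resolve_symbols(objects, library)
print(link_order, unresolved)
```

In this example helper.o is skipped on the first pass because nothing refers to helper yet, and is picked up on the second pass after a.o has been added, which is the situation where a linker has to go over the list more than once.
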

Symbol relocation is the process in which the linker maps a symbol
reference to its definition. The linker modifies the machine code of the
linked object files so that code references to the symbols reflect the
actual addresses assigned to these symbols. For many symbols, the
relative offsets change after multiple object files are merged. Symbol
relocation requires code modification because the linker adjusts the
machine code referencing these symbols to reflect their finalized
addresses. The relocation table tells the linker where in the program
code to apply the relocation action. Each entry in the relocation table
contains a reference to the symbol table. Using this reference, the
linker can retrieve the actual address of the symbol and apply it to the
program location as specified by the relocation entry. It is possible
for the relocation table to contain both the address of the symbol and
the information on the relocation entry. In this case, there is no
reference between the relocation table and the symbol table.
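
A minimal sketch of relocation, assuming every relocation entry asks for a 32-bit little-endian absolute address to be written at a given offset. Real relocation entries also carry a type and often an addend, which this model leaves out. The base address 0x1000, the file names, and the link function are hypothetical.

```python
import struct

BASE = 0x1000  # hypothetical address where the image will be placed


def link(objects):
    # Pass 1: lay the files out back to back and compute final addresses.
    symbols = {}
    address = BASE
    for obj in objects:
        for sym, offset in obj["defined"].items():
            symbols[sym] = address + offset
        address += len(obj["code"])

    # Pass 2: patch each location named by a relocation entry.
    image = bytearray()
    for obj in objects:
        code = bytearray(obj["code"])
        for offset, sym in obj["relocs"]:
            struct.pack_into("<I", code, offset, symbols[sym])
        image += code
    return symbols, bytes(image)


# Zero bytes stand in for machine code; relocs are (offset, symbol) pairs.
main_o = {"name": "main.o", "code": bytes(8),
          "defined": {"main": 0}, "relocs": [(4, "func_a")]}
util_o = {"name": "util.o", "code": bytes(12),
          "defined": {"func_a": 0, "foo_bar": 8}, "relocs": [(4, "foo_bar")]}

symbols, image = link([main_o, util_o])
for sym, addr in sorted(symbols.items(), key=lambda item: item[1]):
    print(f"{sym}: {addr:#x}")
```

func_a sits at offset 0 inside util.o, but after the merge it lands behind the 8 bytes of main.o, so the reference in main.o has to be patched with the new address.
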

Figure 2.3 illustrates these two concepts in a simplified view and
serves as an example for the following discussions.

Figure 2.3: Relationship between the symbol table and the relocation table.

For an executable image, all external symbols must be
resolved so that each symbol has an absolute memory address because an
executable image is ready for execution. The exception to this rule is
that those symbols defined in shared libraries may still contain
relative addresses, which are resolved at runtime (dynamic linking).

A relocatable object file may contain unresolved
external symbols. Similar to a library, a linker-produced relocatable
object file is a concatenation of multiple object files with one main
difference—the file is partially resolved and is used for further
linking with other object files to create an executable image or a
shared object file. A shared object file has dual purposes. It can be
used to link with other shared object files or relocatable object
modules, or it can be used as an executable image with dynamic linking.










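... the segments marked as loadable, and calls mmap() to load the segment contents into memory. Before loading, the kernel passes the segment flags directly to ...
... the memory protection facility. The well-known technique of writing shellcode (reference 17) is a practical example of circumventing this protection.
2: The kernel reads the name of the dynamic linker from the segment marked PT_INTERP in the ELF file and loads the dynamic linker. On modern Linux systems the dynamic linker is usually /lib/ld-linux.so.2 (32-bit x86; x86-64 systems typically use /lib64/ld-linux-x86-64.so.2). The details are described later.
7: The dynamic linker executes the code in the section marked .init in the ELF file to initialize the running program. On early systems the initialization code was the function _init(void) (the name was fixed); on modern systems it takes the form ...

8: The dynamic linker passes control to the program, which starts executing at the entry point defined in the ELF file header. In the a.out and ELF formats the entry point value is stored explicitly; in the COFF format it is defined implicitly by the specification.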
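
The fields these steps refer to can be read straight from the file. The sketch below assumes a 64-bit little-endian ELF file and uses the Elf64_Ehdr and Elf64_Phdr layouts from the ELF specification; the function name inspect_elf is hypothetical. It is only a reader for illustration, not a description of what the kernel does.

```python
import struct

PT_LOAD, PT_INTERP = 1, 3                    # program header types
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")    # Elf64_Ehdr, 64 bytes
PHDR = struct.Struct("<IIQQQQQQ")            # Elf64_Phdr, 56 bytes


def inspect_elf(data):
    """Return (entry point, interpreter path or None, loadable segments)."""
    (ident, _type, _machine, _version, entry, phoff, _shoff, _flags,
     _ehsize, phentsize, phnum, *_rest) = EHDR.unpack_from(data, 0)
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a 64-bit little-endian ELF file")

    interp, loadable = None, []
    for i in range(phnum):
        (p_type, p_flags, p_offset, p_vaddr, _paddr,
         p_filesz, p_memsz, _align) = PHDR.unpack_from(
             data, phoff + i * phentsize)
        if p_type == PT_INTERP:
            # The segment body is the NUL-terminated dynamic linker path.
            raw = data[p_offset:p_offset + p_filesz]
            interp = raw.rstrip(b"\0").decode()
        elif p_type == PT_LOAD:
            # p_flags carries the read/write/execute bits of the segment.
            loadable.append((p_vaddr, p_memsz, p_flags))
    return entry, interp, loadable


# Usage on a Linux machine (the path is only an example):
#   with open("/bin/ls", "rb") as f:
#       entry, interp, loadable = inspect_elf(f.read())
```

The returned entry value is the explicit entry point mentioned in step 8, and the interpreter path is the dynamic linker name from step 2.
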
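1. Each executable or shared object has its own GOT (Global Offset Table). Each entry in it holds the run-time address of a global function or variable and is filled in by the dynamic linker.
2. Each executable or shared object also has its own PLT (Procedure Linkage Table). Each entry corresponds to an external function called from that module and consists of a jump instruction that transfers control through the matching GOT entry to the function's actual address.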
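
The interplay between the two tables can be modelled in a few lines. This is a loose analogy in Python, assuming lazy binding (the address is looked up on the first call and cached afterwards); the Module class, its methods, and the lambda standing in for func_a are all hypothetical, and real PLT entries are machine-code stubs rather than closures.

```python
class Module:
    """Toy model of one executable with its own GOT and PLT."""

    def __init__(self, library_exports):
        self.library_exports = library_exports  # symbol -> function
        self.got = {}                           # filled in on demand
        self.lookups = 0                        # how often we resolved

    def _resolve(self, name):
        # Stands in for the dynamic linker looking up the symbol.
        self.lookups += 1
        self.got[name] = self.library_exports[name]

    def plt(self, name):
        # A PLT entry: jump through the GOT slot, resolving it first
        # if this is the first call.
        def stub(*args):
            if name not in self.got:
                self._resolve(name)
            return self.got[name](*args)
        return stub


program = Module({"func_a": lambda x: x + 1})
call_func_a = program.plt("func_a")

first = call_func_a(1)    # triggers one lookup, then calls the target
second = call_func_a(41)  # goes straight through the cached GOT entry
print(first, second, program.lookups)
```

The caller always goes through the same stub; only the GOT slot changes, which is why the code itself does not need to be patched at run time.
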
