
Three Ways to Make a System Call on Linux

2015-11-09 12:47

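A system call is one of a set of interfaces the operating system provides so that processes running in user mode can interact with the kernel. When a user process makes a system call, the CPU switches to kernel mode through a software interrupt and starts executing the kernel's system call function.

The three ways to make a system call on Linux are described below: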

Through the library functions provided by glibc

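glibc is the open-source standard C library used on Linux. It is the libc released by GNU, that is, the C runtime library. glibc offers programmers a rich API (Application Programming Interface). Besides user-space services such as string handling and math routines, its most important job is to wrap the system services provided by the operating system, in other words the system calls. Many system calls have a wrapper function of the same name in glibc. The wrapper usually has the same semantics as the system call, but not always: the brk system call and glibc's brk differ, mainly in the return value. Some system calls have no wrapper of the same name at all. _llseek is one example; glibc wraps it with lseek64 instead. glibc leaves such calls out mainly for portability: _llseek does not exist on 64-bit platforms, so the lseek64 function stands in for it.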
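The difference in brk semantics can be observed directly. The following is a minimal sketch, assuming a Linux system with glibc; the helper names raw_brk_query and wrapped_brk_noop are made up for illustration and do not come from the article.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Raw system call: the kernel returns the resulting program break.
 * A request for address 0 can never be satisfied, so the kernel
 * leaves the break alone and reports its current value. */
uintptr_t raw_brk_query(void)
{
    return (uintptr_t)syscall(SYS_brk, 0L);
}

/* glibc wrapper: returns 0 on success and -1 (errno set) on failure.
 * Setting the break to its current value changes nothing. */
int wrapped_brk_noop(void)
{
    void *current = sbrk(0);   /* sbrk(0) reports the current break */
    return brk(current);
}
```

The raw call yields an address, while the wrapper yields only a status code, which is the change in return value the paragraph refers to.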

Below, we use the chmod function provided by glibc to change the permissions of the file /etc/passwd to 444:

#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#include <stdio.h>

int main(void)
{
    int rc;

    /* glibc wrapper: returns 0 on success, -1 on failure with errno set */
    rc = chmod("/etc/passwd", 0444);
    if (rc == -1)
        fprintf(stderr, "chmod failed, errno = %d\n", errno);
    else
        printf("chmod success!\n");
    return 0;
}

Compiled and run as an ordinary user, the program prints:

chmod failed, errno = 1

The call returned -1, which means the system call failed, and the error code is 1. The file /usr/include/asm-generic/errno-base.h describes this error code as follows:

#define EPERM       1                /* Operation not permitted */

In other words, the operation is not permitted. An ordinary user cannot change the permissions of /etc/passwd, so the result is correct.

Calling directly with syscall

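The method above has several advantages. First, you do not need to know low-level details such as the chmod system call number; you only need the prototype of the API that glibc provides. Second, it is more portable: you can easily move the program to another platform, or replace glibc with another library, with only minor changes.

One drawback remains: if glibc does not wrap a system call that the kernel provides, the method above cannot reach it. For example, if you add a system call by rebuilding the kernel, glibc cannot possibly have a wrapper for it. In that case you can call it directly through the syscall function that glibc provides. The function is declared in the unistd.h header, with the following prototype: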

long int syscall (long int sysno, ...)

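sysno is the system call number; every system call is identified by a unique number. sys/syscall.h defines macros for all available system call numbers.

... stands for the remaining variable-length arguments, which are the arguments of the system call itself. Depending on the system call, there may be 0 to 5 of them (current Linux kernels accept up to six); arguments beyond what a particular system call takes are ignored.

On failure, syscall returns -1 and stores the error number in errno; on success, it returns the system call's own return value.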
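Both outcomes of syscall can be seen with two harmless calls. This is a sketch rather than a definitive reference; the helper names raw_getpid and raw_close_errno are invented for the example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Success case: the value returned is the system call's own result,
 * here the process ID. SYS_getpid takes no arguments. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}

/* Failure case: returns the errno value observed after closing fd
 * through the raw system call, or 0 if the call succeeded. */
int raw_close_errno(int fd)
{
    errno = 0;
    if (syscall(SYS_close, fd) == -1)
        return errno;
    return 0;
}
```

raw_getpid() should agree with getpid(), and raw_close_errno(-1) should yield EBADF, since -1 is never a valid file descriptor.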

Taking the same example of changing the permissions of /etc/passwd, this time we call the system call directly with syscall:

#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <errno.h>

int main(void)
{
    long rc;    /* syscall() returns long */

    /* SYS_chmod is the system call number of chmod on this architecture */
    rc = syscall(SYS_chmod, "/etc/passwd", 0444);

    if (rc == -1)
        fprintf(stderr, "chmod failed, errno = %d\n", errno);
    else
        printf("chmod success!\n");
    return 0;
}

Compiled and run as an ordinary user, it prints the same result as the previous example.
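syscall is most useful when no wrapper exists. One real case, used here instead of a custom kernel patch, is gettid, which glibc did not wrap until version 2.30. The sketch below assumes Linux; the name my_gettid is hypothetical.

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Returns the kernel thread ID of the calling thread by invoking the
 * gettid system call directly, without relying on a glibc wrapper. */
pid_t my_gettid(void)
{
    return (pid_t)syscall(SYS_gettid);
}
```

In the main thread of a process the thread ID equals the process ID, so my_gettid() is expected to match getpid() there.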

Trapping into the kernel with the int instruction

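If you know how a system call works end to end, you know that a user-mode program on 32-bit x86 traps into kernel mode with the software interrupt instruction int 0x80 (the Intel Pentium II later introduced the sysenter instruction). Arguments are passed in registers: eax holds the system call number, and ebx, ecx, edx, esi and edi carry up to five arguments in that order. When the system call returns, its return value is in eax.

Using the same example of changing file permissions, the part that makes the system call is now written as inline assembly: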

#include <stdio.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <errno.h>

int main(void)
{
    long rc;
    const char *file_name = "/etc/passwd";
    unsigned short mode = 0444;

    /* 32-bit x86 only: eax = system call number, ebx and ecx = arguments */
    asm volatile (
        "int $0x80"
        : "=a" (rc)
        : "0" (SYS_chmod), "b" ((long)file_name), "c" ((long)mode)
        : "memory"
    );

    /* A value in [-132, -1] is a negated error code */
    if ((unsigned long)rc >= (unsigned long)-132) {
        errno = -rc;
        rc = -1;
    }

    if (rc == -1)
        fprintf(stderr, "chmod failed, errno = %d\n", errno);
    else
        printf("success!\n");

    return 0;
}

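If the value returned in eax (stored in the variable rc) lies between -1 and -132, it must be interpreted as an error code (132 was the largest error code defined in /usr/include/asm-generic/errno.h at the time of writing). In that case the error code is written to errno and the return value is set to -1; otherwise the value in eax is returned as is. The kernel itself reserves the whole range from -1 to -4095 for error codes, so a more general check would use 4095 as the bound.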
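The conversion from a raw register value to the familiar -1 plus errno convention can be isolated in a small function. This is a sketch under one stated assumption: it uses 4095 as the bound, taken from the kernel's MAX_ERRNO, rather than the 132 used in the listing. The names RAW_MAX_ERRNO and decode_raw_result are hypothetical.

```c
#include <errno.h>

/* Assumption: 4095 is the largest error number the kernel can return. */
#define RAW_MAX_ERRNO 4095L

/* Converts the raw value a system call leaves in the return register
 * into the C library convention: -1 with errno set on error, or the
 * value itself otherwise. */
long decode_raw_result(long raw)
{
    /* Viewed as unsigned, the values -4095..-1 are the largest ones. */
    if ((unsigned long)raw >= (unsigned long)-RAW_MAX_ERRNO) {
        errno = (int)-raw;
        return -1;
    }
    return raw;
}
```

For instance, a raw value of -1 becomes a return of -1 with errno set to EPERM, while a raw value of 42 is passed through unchanged.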

Compiled and run as an ordinary user on 32-bit Linux, the program above gives the same result as the previous two.

References

Understanding the Linux Kernel, 3rd edition

The GNU C Library Reference Manual, for version 2.18

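Appendix

On x86, the Linux system call interface is implemented with int 0x80, that is, a software interrupt. A user program executes int 0x80 to enter the kernel and request resources. Once in the kernel, code already written by the kernel runs, and when it finishes, control returns to the user program. Before int 0x80 executes, the relevant registers must be set up to select the system call and pass its arguments. Linux uses six registers for this: eax holds the system call number, while ebx, ecx, edx, esi and edi hold the additional arguments, in that order. Older Linux headers provided macros to make system calls easier to invoke; they work by using inline assembly to load eax, ebx and the other registers and then issuing int 0x80. The macro code looks like this:

The unistd.h file: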

#ifndef _LINUX_UNISTD_H
#define _LINUX_UNISTD_H

#define __NR_setup        0
#define __NR_exit         1
#define __NR_fork         2
#define __NR_read         3
#define __NR_write        4
#define __NR_open         5
#define __NR_close        6
#define __NR_waitpid      7
#define __NR_creat        8
#define __NR_link         9
#define __NR_unlink      10
/* ... the remaining definitions are omitted here */

extern int errno;
#define _syscall0(type,name) \
type name(void) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name)); \
if (__res >= 0) \
return (type) __res; \
errno = -__res; \
return -1; \
}

#define _syscall1(type,name,atype,a) \
type name(atype a) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name),"b" ((long)(a))); \
if (__res >= 0) \
return (type) __res; \
errno = -__res; \
return -1; \
}

#define _syscall2(type,name,atype,a,btype,b) \
type name(atype a,btype b) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name),"b" ((long)(a)),"c" ((long)(b))); \
if (__res >= 0) \
return (type) __res; \
errno = -__res; \
return -1; \
}

#define _syscall3(type,name,atype,a,btype,b,ctype,c) \
type name(atype a,btype b,ctype c) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name),"b" ((long)(a)),"c" ((long)(b)),"d" ((long)(c))); \
if (__res>=0) \
return (type) __res; \
errno=-__res; \
return -1; \
}

#define _syscall4(type,name,atype,a,btype,b,ctype,c,dtype,d) \
type name (atype a, btype b, ctype c, dtype d) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name),"b" ((long)(a)),"c" ((long)(b)), \
"d" ((long)(c)),"S" ((long)(d))); \
if (__res>=0) \
return (type) __res; \
errno=-__res; \
return -1; \
}

#define _syscall5(type,name,atype,a,btype,b,ctype,c,dtype,d,etype,e) \
type name (atype a,btype b,ctype c,dtype d,etype e) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name),"b" ((long)(a)),"c" ((long)(b)), \
"d" ((long)(c)),"S" ((long)(d)),"D" ((long)(e))); \
if (__res>=0) \
return (type) __res; \
errno=-__res; \
return -1; \
}
