您的位置:首页 > 运维架构 > Linux

Linux poll机制精彩分析

2012-04-25 21:44 351 查看

Linux poll机制精彩分析

2011-07-17 16:32
355人阅读 评论(0)
收藏
举报

原始地址:http://blogold.chinaunix.net/u3/102839/showart_2283496.html (偶这里有一定改动)

所有的系统调用,基于都可以在它的名字前加上“sys_”前缀,这就是它在内核中对应的函数。比如系统调用open、read、write、poll,与之对应的内核函数为:sys_open、sys_read、sys_write、sys_poll。

对于系统调用poll或select,它们对应的内核函数都是sys_poll。分析sys_poll,即可理解poll机制。

一、内核框架

1.1 sys_poll函数

sys_poll函数源代码:它对超时参数稍作处理后,直接调用do_sys_poll。

[cpp]
view plaincopyprint?

asmlinkage long sys_poll(struct pollfd __user *ufds, unsigned int nfds, long timeout_msecs) { s64 timeout_jiffies; if (timeout_msecs > 0) { #if HZ > 1000 /* We can only overflow if HZ > 1000 */ if (timeout_msecs / 1000 > (s64)0x7fffffffffffffffULL / (s64)HZ) timeout_jiffies = -1; else #endif timeout_jiffies = msecs_to_jiffies(timeout_msecs); } else { /* Infinite (< 0) or no (0) timeout */ timeout_jiffies = timeout_msecs; } return do_sys_poll(ufds, nfds, &timeout_jiffies); }

asmlinkage long sys_poll(struct pollfd __user *ufds, unsigned int nfds,
long timeout_msecs)
{
s64 timeout_jiffies;
if (timeout_msecs > 0) {
#if HZ > 1000
/* We can only overflow if HZ > 1000 */
if (timeout_msecs / 1000 > (s64)0x7fffffffffffffffULL / (s64)HZ)
timeout_jiffies = -1;
else
#endif

timeout_jiffies = msecs_to_jiffies(timeout_msecs);
}
else {
/* Infinite (< 0) or no (0) timeout */
timeout_jiffies = timeout_msecs;
}
return do_sys_poll(ufds, nfds, &timeout_jiffies);
}


1.2 do_sys_poll函数

do_sys_poll函数也位于位于fs/select.c文件中,我们忽略其他代码:

[cpp]
view plaincopyprint?

int do_sys_poll(struct pollfd __user *ufds, unsigned int nfds, s64 *timeout) { …… poll_initwait(&table); …… fdcount = do_poll(nfds, head, &table, timeout); …… }
int do_sys_poll(struct pollfd __user *ufds, unsigned int nfds, s64 *timeout)
{
……
poll_initwait(&table);
……

fdcount = do_poll(nfds, head, &table, timeout);
……
}
poll_initwait函数非常简单,它初始化一个poll_wqueues变量table:

poll_initwait > init_poll_funcptr(&pwq->pt, __pollwait); > pt->qproc = qproc;

即table->pt->qproc =
__pollwait,__pollwait将在驱动的poll函数里用到。

1.3 do_poll函数

源代码如下:

[cpp]
view plaincopyprint?

static int do_poll(unsigned int nfds, struct poll_list *list,
struct poll_wqueues *wait, struct timespec *end_time)
{
poll_table* pt = &wait->pt;
ktime_t expire, *to = NULL;
int timed_out = 0, count = 0;
unsigned long slack = 0;

/* Optimise the no-wait case */
if (end_time && !end_time->tv_sec && !end_time->tv_nsec) {
pt = NULL;
timed_out = 1;
}

if (end_time && !timed_out)
slack = estimate_accuracy(end_time);

for (;;) {
struct poll_list *walk;

for (walk = list; walk != NULL; walk = walk->next) {
struct pollfd * pfd, * pfd_end;

pfd = walk->entries;
pfd_end = pfd + walk->len;
for (; pfd != pfd_end; pfd++) {
/*
* Fish for events. If we found one, record it
* and kill the poll_table, so we don't
* needlessly register any other waiters after
* this. They'll get immediately deregistered
* when we break out and return.
*/
if (do_pollfd(pfd, pt)) {
count++;
pt = NULL;
}
}
}
/*
* All waiters have already been registered, so don't provide
* a poll_table to them on the next loop iteration.
*/
pt = NULL;
if (!count) {
count = wait->error;
if (signal_pending(current))
count = -EINTR;
}
if (count || timed_out)
break;

/*
* If this is the first loop and we have a timeout
* given, then we convert to ktime_t and set the to
* pointer to the expiry value.
*/
if (end_time && !to) {
expire = timespec_to_ktime(*end_time);
to = &expire;
}

if (!poll_schedule_timeout(wait, TASK_INTERRUPTIBLE, to, slack))
timed_out = 1;
}
return count;
}<pre name="code" class="cpp">

static int do_poll(unsigned int nfds,  struct poll_list *list,
struct poll_wqueues *wait, struct timespec *end_time)
{
poll_table* pt = &wait->pt;
ktime_t expire, *to = NULL;
int timed_out = 0, count = 0;
unsigned long slack = 0;

/* Optimise the no-wait case */
if (end_time && !end_time->tv_sec && !end_time->tv_nsec) {
pt = NULL;
timed_out = 1;
}

if (end_time && !timed_out)
slack = estimate_accuracy(end_time);

for (;;) {
struct poll_list *walk;

for (walk = list; walk != NULL; walk = walk->next) {
struct pollfd * pfd, * pfd_end;

pfd = walk->entries;
pfd_end = pfd + walk->len;
for (; pfd != pfd_end; pfd++) {
/*
* Fish for events. If we found one, record it
* and kill the poll_table, so we don't
* needlessly register any other waiters after
* this. They'll get immediately deregistered
* when we break out and return.
*/
if (do_pollfd(pfd, pt)) {
count++;
pt = NULL;
}
}
}
/*
* All waiters have already been registered, so don't provide
* a poll_table to them on the next loop iteration.
*/
pt = NULL;
if (!count) {
count = wait->error;
if (signal_pending(current))
count = -EINTR;
}
if (count || timed_out)
break;

/*
* If this is the first loop and we have a timeout
* given, then we convert to ktime_t and set the to
* pointer to the expiry value.
*/
if (end_time && !to) {
expire = timespec_to_ktime(*end_time);
to = &expire;
}

if (!poll_schedule_timeout(wait, TASK_INTERRUPTIBLE, to, slack))
timed_out = 1;
}
return count;
}<pre name="code" class="cpp">
注意:应用程序执行poll调用后,如果①②的条件不满足,进程就会进入休眠。那么,谁唤醒呢?除了休眠到指定时间被系统唤醒外,还可以被驱动程序唤醒──记住这点,这就是为什么驱动的poll里要调用poll_wait的原因
1.4 do_pollfd函数

[cpp]
view plaincopyprint?

static inline unsigned int do_pollfd(struct pollfd *pollfd, poll_table *pwait) { …… if (file->f_op && file->f_op->poll) mask = file->f_op->poll(file, pwait); …… }

static inline unsigned int do_pollfd(struct pollfd *pollfd, poll_table *pwait)
{
……
if (file->f_op && file->f_op->poll)
mask = file->f_op->poll(file, pwait);
……
}
其实就是调用驱动程序里面,咱们自己实现的poll函数。


二、驱动程序

驱动程序里与poll相关的地方有两处:

一是构造file_operation结构时,要定义自己的poll函数;

二是通过poll_wait来调用上面说到的__pollwait函数,poll_wait的代码如下:

[cpp]
view plaincopyprint?

static inline void poll_wait(struct file * filp, wait_queue_head_t * wait_address, poll_table *p) { if (p && wait_address) p->qproc(filp, wait_address, p); }

static inline void poll_wait(struct file * filp, wait_queue_head_t * wait_address, poll_table *p)
{
if (p && wait_address)
p->qproc(filp, wait_address, p);
}
poll_wait的作用是把进程加入poll_table中,以供系统唤醒。(前面说过饿)

执行到驱动程序的poll_wait函数时,进程并没有休眠,我们的驱动程序里实现的poll函数是不会引起休眠的。让进程进入休眠,是前面分析的do_sys_poll函数。
poll_wait只是把本进程挂入某个队列,应用程序调用poll > sys_poll > do_sys_poll > poll_initwait,do_poll
> do_pollfd > 我们自己写的poll函数后,再调用schedule_timeout进入休眠。如果我们的驱动程序发现情况就绪,可以把这个队列上挂着的进程唤醒。可见,poll_wait的作用,只是为了让驱动程序能找到要唤醒的进程。即使不用poll_wait,我们的程序也有机会被唤醒:chedule_timeout(__timeout),只是休眠__time_out这段时间。

三,总结

现在来总结一下poll机制:

1. poll > sys_poll > do_sys_poll > poll_initwait,poll_initwait函数注册一下回调函数__pollwait,它就是我们的驱动程序执行poll_wait时,真正被调用的函数。

2. 接下来执行file->f_op->poll,即我们驱动程序里自己实现的poll函数

它会调用poll_wait把自己挂入某个队列,这个队列也是我们的驱动自己定义的;

它还判断一下设备是否就绪。

3. 如果设备未就绪,do_sys_poll里会让进程休眠一定时间

4. 进程被唤醒的条件有2:一是上面说的“一定时间”到了,二是被驱动程序唤醒。驱动程序发现条件就绪时,就把“某个队列”上挂着的进程唤醒。

这个队列,就是前面通 过poll_wait把本进程挂过去的队列。

5. 如果驱动程序没有去唤醒进程,那么chedule_timeout(__timeou)超时后,会重复2、3动作,直到应用程序的poll调用传入的时间到达。
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: