
A strange problem when using semaphores to synchronize a producer and a consumer process

2009-10-19 17:43
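After reading the semaphore section of APUE, I decided to use semaphores to implement the producer/consumer problem. The finished program always misbehaved: after 32767 calls the producer failed, and checking semop showed that it had returned an error with errno set to ERANGE. I could not work out why. I then found a thread describing the same problem. Its answer: once the undo flag (SEM_UNDO) is set, the P and V operations that each process performs on a semaphore must be "symmetric". The producer and the consumer are two different processes, and their P and V operations on the full-slot and empty-slot counts are not symmetric, so the accumulated undo value keeps growing until it exceeds the range of the variable that records it (a short). The thread is reproduced below: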



Michael B Allen <mba2000@ioplex.com> writes:

>Hi,

>I'm seeing some strange behavior with the semop(2) SEM_UNDO flag. I have
>a function like this:

> int
> svsem_wait(int semid)
> {
>     struct sembuf wait;
>
>     wait.sem_num = 0;
>     wait.sem_op = -1;
>     wait.sem_flg = SEM_UNDO;
>
>     return semop(semid, &wait, 1);
> }

>In a producer/consumer test program, after precisely 32,767 calls to this
>function it stops decrementing the semaphore value which permits the
>producer to run uncontrolled.

>If I remove the SEM_UNDO flag the problem does not occur and the test
>program completes successfully.

>Any idea what the problem might be? Considering 32768 is a power of 2 I
>suspect I'm doing something wrong with SEM_UNDO that's causing a limit
>to be exceeded.

Are you doing the semop(-1) in one process and the +1 in another or
are you doing the -1 with UNDO and the +1 w/o?

Note then that the undo counts accumulate over a process; all operations
are undone when the process exits. A short is the typical type used to
hold the undo aggregate.

Typically, using SEM_UNDO is only correct when the application using
it must have a 0 net effect from main() to exit(). It's nearly always
wrong when used in a producer/consumer situation.

Casper
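
Casper's point about the per-process undo aggregate can be reproduced in a single process. The following is a minimal sketch, not code from the thread: the function name `undo_waits_until_erange` and the union name `semarg` are made up for illustration, and the +1 issued without SEM_UNDO merely stands in for the other process. On Linux the undo aggregate is capped at SEMAEM, commonly 32767, which would be consistent with the count reported above; other systems may use a different limit.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* semctl() takes a union as its optional fourth argument, and the caller
 * has to declare it. It is named semarg here (an arbitrary name) so that
 * it does not clash with systems that already define union semun. */
union semarg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Returns the number of SEM_UNDO waits that succeed before semop() fails
 * with ERANGE, or -1 if anything else goes wrong. */
long undo_waits_until_erange(void)
{
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;

    union semarg arg;
    arg.val = 0;
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    /* +1 without SEM_UNDO: plays the role of the other process. */
    struct sembuf post = { .sem_num = 0, .sem_op = 1, .sem_flg = 0 };
    /* -1 with SEM_UNDO: each call adds 1 to this process's undo total. */
    struct sembuf wait_undo = { .sem_num = 0, .sem_op = -1, .sem_flg = SEM_UNDO };

    long succeeded = 0;
    long result = -1;
    for (long i = 0; i < 100000; i++) {   /* upper bound, in case no limit is hit */
        if (semop(semid, &post, 1) == -1)
            break;
        if (semop(semid, &wait_undo, 1) == -1) {
            if (errno == ERANGE)
                result = succeeded;       /* undo aggregate went out of range */
            break;
        }
        succeeded++;
    }

    semctl(semid, 0, IPC_RMID);           /* remove the semaphore set */
    return result;
}
```

The semaphore value itself only ever moves between 0 and 1 here, so the failure comes from the undo bookkeeping, not from the semaphore count.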







On Sun, 16 Nov 2003 16:41:49 -0500, Casper H.S. *** wrote:

> Michael B Allen <mba2000@ioplex.com> writes:
>
>>Hi,
>
>>I'm seeing some strange behavior with the semop(2) SEM_UNDO flag. I have
>>a function like this:
>
>> int
>> svsem_wait(int semid)
>> {
>>     struct sembuf wait;
>>
>>     wait.sem_num = 0;
>>     wait.sem_op = -1;
>>     wait.sem_flg = SEM_UNDO;
>>
>>     return semop(semid, &wait, 1);
>> }
>
>>In a producer/consumer test program, after precisely 32,767 calls to
>>this function it stops decrementing the semaphore value which permits
>>the producer to run uncontrolled.
>
>>If I remove the SEM_UNDO flag the problem does not occur and the test
>>program completes successfully.
>
>>Any idea what the problem might be? Considering 32768 is a power of 2 I
>>suspect I'm doing something wrong with SEM_UNDO that's causing a limit
>>to be exceeded.
>
> Are you doing the semop(-1) in one process and the +1 in another

Yes. The calls are not symmetric.

> or are
> you doing the -1 with UNDO and the +1 w/o?
>
> Note then that the undo counts accumulate over a process; all operations
> are undone when the process exits. A short is the typical type used to
> hold the undo aggregate.
>
> Typically, using SEM_UNDO is only correct when the application using it
> must have a 0 net effect from main() to exit(). It's nearly always
> wrong when used in a producer/consumer situation.

Right. Kurtis' explanation was right on. I didn't get the significance
of per-process undo state after reading Stevens' books.

The code I am referring to is just a test program that looks roughly like
the following but one process calls produce() and the other consume():

int
produce(struct linkedlist *l, int mutex, int empty, int full)
{
    for ( ;; ) {
        svsem_wait(empty);
        svsem_wait(mutex);

        /* put something into l */

        svsem_post(mutex);
        svsem_post(full);
    }

    return 0;
}

int
consume(struct linkedlist *l, int n, int mutex, int empty, int full)
{
    for ( ;; ) {
        svsem_wait(full);
        svsem_wait(mutex);

        /* remove something from l */

        svsem_post(mutex);
        svsem_post(empty);
    }

    return 0;
}

I think this counting semaphore producer/consumer example was straight
out of the Tanenbaum book on operating systems. As Kurtis suggested
I should not use the semaphore as the counter but use a mutex to lock,
change a separate counter, and unlock in each process separately so that
each calls wait an equal number of times over the lifetime of the program.

Mike
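
A different way to satisfy Casper's "0 net effect" rule, and not the separate-counter approach Mike describes, is to keep the counting semaphores but drop SEM_UNDO from them, keeping the flag only on the mutex, which every process both waits on and posts. The sketch below assumes each semaphore lives in its own one-element set, as in Mike's test program; `svsem_op` and the `_count`/`_mutex` names are hypothetical and do not appear in the thread.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical helper: apply a single operation to semaphore 0 of a set. */
static int svsem_op(int semid, short op, short flg)
{
    struct sembuf sb = { .sem_num = 0, .sem_op = op, .sem_flg = flg };
    return semop(semid, &sb, 1);
}

/* Counting semaphores (empty, full): one process only waits and the other
 * only posts, so SEM_UNDO is left off and no undo total can build up. */
int svsem_wait_count(int semid) { return svsem_op(semid, -1, 0); }
int svsem_post_count(int semid) { return svsem_op(semid, 1, 0); }

/* Mutex: the same process waits and then posts, so its undo total returns
 * to zero after every critical section and SEM_UNDO stays within range. */
int svsem_wait_mutex(int semid) { return svsem_op(semid, -1, SEM_UNDO); }
int svsem_post_mutex(int semid) { return svsem_op(semid, 1, SEM_UNDO); }
```

In produce() and consume(), the calls on empty and full would use the `_count` variants and the calls on mutex the `_mutex` variants. The trade-off is that if a process dies, the kernel no longer adjusts the empty and full counts on its behalf; only the mutex is released automatically.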