
Operating Systems: Three Easy Pieces --- Locks: Pthread Locks (Note)

2015-11-22 12:53
The name that the POSIX library uses for a lock is mutex, as it is used to provide mutual exclusion between threads, i.e., if one thread is in the critical section, it excludes the others from entering until it has completed the section. Thus, when you see the following POSIX threads code, you should understand that it is doing the same thing as above (we again use our wrappers that check for errors upon lock and unlock):

pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;  // statically initialized mutex
pthread_mutex_lock(&lock);    // acquire the lock (blocks if another thread holds it)
balance = balance + 1;        // critical section
pthread_mutex_unlock(&lock);  // release the lock
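
The error-checking wrappers mentioned above are not shown in this excerpt; here is a minimal sketch of what they might look like (the capitalized names follow the book's convention, but the exact bodies are an assumption):

#include <assert.h>
#include <pthread.h>

// Sketch of an error-checking wrapper: assert that the underlying
// pthread call returned 0 (success); same idea for unlock.
void Pthread_mutex_lock(pthread_mutex_t *mutex) {
    int rc = pthread_mutex_lock(mutex);
    assert(rc == 0);
}

void Pthread_mutex_unlock(pthread_mutex_t *mutex) {
    int rc = pthread_mutex_unlock(mutex);
    assert(rc == 0);
}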


You might also notice here that the POSIX version passes a variable to lock and unlock, as we may be using different locks to protect different variables. Doing so can increase concurrency: instead of one big lock that is used any time any critical section is accessed (a coarse-grained locking strategy), one will often protect different data and data structures with different locks, thus allowing more threads to be in locked code at once (a more fine-grained approach), as the sketch below illustrates.
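
For instance, a minimal sketch of fine-grained locking (the two counters and lock names here are hypothetical, purely for illustration):

#include <pthread.h>

// Two independent counters, each protected by its own lock. A thread
// updating `hits` never blocks a thread updating `misses`, which one
// big lock covering both counters would force.
pthread_mutex_t hits_lock   = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t misses_lock = PTHREAD_MUTEX_INITIALIZER;
int hits = 0;
int misses = 0;

void record_hit(void) {
    pthread_mutex_lock(&hits_lock);
    hits = hits + 1;
    pthread_mutex_unlock(&hits_lock);
}

void record_miss(void) {
    pthread_mutex_lock(&misses_lock);
    misses = misses + 1;
    pthread_mutex_unlock(&misses_lock);
}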

                      Building A Lock

By now, you should have some understanding of how a lock works, from the perspective of a programmer. But how should we build a lock? What hardware support is needed? What OS support? It is this set of questions we address in the rest of this chapter.

                      How to build a lock

How can we build an efficient lock? Efficient locks provide mutual exclusion at low cost, and also might attain a few other properties we discuss below. What hardware support is needed? What OS support?

To build a working lock, we will need some help from our old friend, the hardware, as well as our good pal, the OS. Over the years, a number of different hardware primitives have been added to the instruction sets of various computer architectures; while we won't study how these instructions are implemented (that, after all, is the topic of a computer architecture class), we will study how to use them in order to build a mutual exclusion primitive like a lock. We will also study how the OS gets involved to complete the picture and enable us to build a sophisticated locking library.

                      Evaluating Locks

Before building any locks, we should first understand what our goals are, and thus we ask how to evaluate the efficacy of a particular lock implementation. To evaluate whether a lock works (and works well), we should first establish some basic criteria. The first is whether the lock does its basic task, which is to provide mutual exclusion. Basically, does the lock work, preventing multiple threads from entering a critical section?

The second is fairness. Does each thread contending for the lock get a fair shot at acquiring it once it is free? Another way to look at this is by examining the more extreme case: does any thread contending for the lock starve while doing so, thus never obtaining it?

The final criterion is performance, specifically the time overheads added by using the lock. There are a few different cases worth considering here. One is the case of no contention; when a single thread is running and grabs and releases the lock, what is the overhead of doing so? Another is the case where multiple threads are contending for the lock on a single CPU; in this case, are there performance concerns? Finally, how does the lock perform when there are multiple CPUs involved, and threads on each contending for the lock? By comparing these different scenarios, we can better understand the performance impact of using various locking techniques, as described below.
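
As a concrete illustration, here is one way to probe the first (no-contention) case with a tiny micro-benchmark; the loop count and timing approach are just one plausible setup, not the book's:

#include <pthread.h>
#include <stdio.h>
#include <time.h>

// Micro-benchmark for the uncontended case: a single thread
// repeatedly acquires and releases a mutex nobody else wants,
// and we report the average cost of a lock/unlock pair.
int main(void) {
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    const long N = 10000000;
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < N; i++) {
        pthread_mutex_lock(&m);
        pthread_mutex_unlock(&m);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    double ns = (end.tv_sec - start.tv_sec) * 1e9
              + (end.tv_nsec - start.tv_nsec);
    printf("avg %.1f ns per lock/unlock pair\n", ns / N);
    return 0;
}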

                    Controlling Interrupts

One of the earliest solutions used to provide mutual exclusion was to disable interrupts for critical sections; this solution was invented for single-processor systems. The code would look like this:

void lock() {
    DisableInterrupts();   // special hardware instruction: mask interrupts
}

void unlock() {
    EnableInterrupts();    // unmask interrupts again
}


Assume we are running on such a single-processor system. By turning off interrupts (using some kind of special hardware instruction) before entering a critical section, we ensure that the code inside the critical section will not be interrupted, and thus will execute as if it were atomic. When we are finished, we re-enable interrupts (again, via a hardware instruction) and thus the program proceeds as usual.

The main positive of this approach is its simplicity. You certainly don't have to scratch your head too hard to figure out why this works. Without interruption, a thread can be sure that the code it executes will execute and that no other thread will interfere with it.

The negatives, unfortunately, are many. First, this approach requires us to allow any calling thread to perform a privileged operation (turning interrupts on and off), and thus trust that this facility is not abused. As you already know, any time we are required to trust an arbitrary program, we are probably in trouble. Here, the trouble manifests in numerous ways: a greedy program could call lock() at the beginning of its execution and thus monopolize the processor; worse, an errant or malicious program could call lock() and go into an endless loop. In this latter case, the OS never regains control of the system, and there is only one recourse: restart the system. Using interrupt disabling as a general-purpose synchronization solution requires too much trust in applications.

Second, the approach does not work on multiprocessors. If multiple threads are running on different CPUs, and each tries to enter the same critical section, it does not matter whether interrupts are disabled; threads will be able to run on other processors, and thus could enter the critical section. As multiprocessors are now commonplace, our general solution will have to do better than this.

Third, turning off interrupts for extended periods of time can lead to interrupts becoming lost, which can lead to serious system problems. Imagine, for example, if the CPU missed the fact that a disk device had finished a read request. How will the OS know to wake the process waiting for said read?

Finally, and probably least important, this approach can be inefficient. Compared to normal instruction execution, code that masks or unmasks interrupts tends to be executed slowly by modern CPUs.

For these reasons, turning off interrupts is only used in limited contexts as a mutual-exclusion primitive. For example, in some cases an operating system itself will use interrupt masking to guarantee atomicity when accessing its own data structures, or at least to prevent certain messy interrupt handling situations from arising. This usage makes sense, as the trust issue disappears inside the OS, which always trusts itself to perform privileged operations anyhow.

                  Aside: Dekker's and Peterson's Algorithms

In the 1960's, Dijkstra posed the concurrency problem to his friends, and one of them, a mathematician named Theodorus Dekker, came up with a solution. Unlike the solutions we discuss here, which use special hardware instructions and even OS support, Dekker's algorithm uses just loads and stores (assuming they are atomic with respect to each other, which was true on early hardware).

Dekker's approach was later refined by Peterson. Once again, just loads and stores are used, and the idea is to ensure that two threads never enter a critical section at the same time. Here is Peterson's algorithm (for two threads); see if you can understand the code. What are the flag and turn variables used for?

int flag[2];
int turn;

void init() {
    flag[0] = flag[1] = 0;   // 1 -> the thread wants to grab the lock
    turn = 0;                // whose turn is it? (thread 0 or 1)
}

void lock() {
    flag[self] = 1;          // self: thread ID of the caller (0 or 1)
    turn = 1 - self;         // politely give the other thread its turn
    while ((flag[1 - self] == 1) && (turn == 1 - self))
        ;                    // spin-wait: other thread wants in and has the turn
}

void unlock() {
    flag[self] = 0;          // simply undo the intent to hold the lock
}
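
To see how the algorithm is meant to be used, here is a hedged, self-contained sketch: the names peterson_lock/peterson_unlock are hypothetical, self is passed explicitly as a parameter (a small variation on the code above, where it is implicit), and the shared variables are declared volatile to discourage the compiler from optimizing the spin away. Even so, as the next paragraph notes, plain loads and stores are not enough on modern hardware with relaxed memory models, so treat this as illustration only:

#include <pthread.h>
#include <stdio.h>

volatile int flag[2];
volatile int turn;
int counter = 0;                      // shared state the lock protects

void peterson_lock(int self) {
    flag[self] = 1;                   // announce intent to enter
    turn = 1 - self;                  // give the other thread priority
    while (flag[1 - self] == 1 && turn == 1 - self)
        ;                             // spin-wait
}

void peterson_unlock(int self) {
    flag[self] = 0;                   // withdraw intent
}

void *worker(void *arg) {
    int self = (int)(long)arg;        // thread ID: 0 or 1
    for (int i = 0; i < 100000; i++) {
        peterson_lock(self);
        counter = counter + 1;        // critical section
        peterson_unlock(self);
    }
    return NULL;
}

int main(void) {
    pthread_t t0, t1;
    pthread_create(&t0, NULL, worker, (void *)0L);
    pthread_create(&t1, NULL, worker, (void *)1L);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    printf("counter = %d (expect 200000)\n", counter);
    return 0;
}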


For some reason, developing locks that work without special hardware support became all the rage for a while, giving theory-types a lot of problems to work on. Of course, this line of work became quite useless when people realized it is much easier to assume a little hardware support (and indeed that support had been around from the earliest days of multiprocessing). Further, algorithms like the one above don't work on modern hardware due to relaxed memory consistency models, thus making them even less useful than they were before. Yet more research relegated to the dustbin of history...