
Operating System: Three Easy Pieces --- Why It Gets Worse: Shared Data (Note)

2015-11-06 13:43
The simple thread example we showed above was useful in showing how threads are created and how they can run in different orders depending on how the scheduler decides to run them. What it doesn't show you, though, is how threads interact when they access shared data.

          The Heart Of The Problem: Uncontrolled Scheduling

To understand why this happens, we must understand the code sequence that the compiler generates for the update to counter. In this case, we wish to simply add a number (1) to counter. Thus, the code sequence for doing so might look something like this (in x86):

mov 0x8049a1c, %eax
add $0x1, %eax
mov %eax, 0x8049a1c

This example assumes that the variable counter is located at address 0x8049a1c. In this three-instruction sequence, the x86 mov instruction is used first to get the memory value at the address and put it into register eax. Then, the add is performed, adding 1 to the contents of the eax register, and finally, the contents of eax are stored back into memory at the same address.

Let us imagine one of our two threads (Thread 1) enters this region of code, and is thus about to increment counter by one. It loads the value of counter (let's say it's 50 to begin with) into its register eax. Thus, eax=50 for Thread 1. Then it adds one to the register; thus eax=51. Now, something unfortunate happens: a timer interrupt goes off; thus, the OS saves the state of the currently running thread (its PC, its registers including eax, etc.) to the thread's TCB.

Now something worse happens: Thread 2 is chosen to run, and it enters this same piece of code. It also executes the first instruction, getting the value of counter and putting it into its eax (remember: each thread when running has its own private registers; the registers are virtualized by the context-switch code that saves and restores them). The value of counter is still 50 at this point, and thus Thread 2 has eax=50. Let's then assume that Thread 2 executes the next two instructions, incrementing eax by 1 (thus eax=51), and then saving the contents of eax into counter (address 0x8049a1c). Thus, the global variable counter now has the value 51.

Finally, another context switch occurs, and Thread 1 resumes running. Recall that it had just executed the mov and add, and is now about to perform the final mov instruction. Recall also that eax=51. Thus, the final mov instruction executes, and saves the value to memory; the counter is set to 51 again.

Put simply, what has happened is this: the code to increment counter has been run twice, but counter, which started at 50, is now only equal to 51. A "correct" version of this program should have resulted in the variable counter being equal to 52.

Let's look at a detailed execution trace to understand the problem better. Assume, for this example, that the above code is loaded at address 100 in memory, like the following sequence (note for those of you used to nice, RISC-like instruction sets: x86 has variable-length instructions; this mov instruction takes up 5 bytes of memory, and the add only 3):

100 mov 0x8049a1c, %eax
105 add $0x1, %eax
108 mov %eax, 0x8049a1c


With these assumptions, what happens is shown in Figure 26.7. Assume the counter starts at value 50, and trace through this example to make sure you understand what is going on.

What we have demonstrated here is called a race condition: the result depends on the timing of the code's execution. With some bad luck (i.e., context switches that occur at untimely points in the execution), we get the wrong result. In fact, we may get a different result each time; thus, instead of a nice deterministic computation (which we are used to from computers), we call this result indeterminate, where it is not known what the output will be and it is indeed likely to be different across runs.

Because multiple threads executing this code can result in a race condition, we call this code a critical section. A critical section is a piece of code that accesses a shared variable (or more generally, a shared resource) and must not be concurrently executed by more than one thread. What we really want for this code is what we call mutual exclusion. This property guarantees that if one thread is executing within the critical section, the others will be prevented from doing so.

Virtually all of these terms, by the way, were coined by Edsger Dijkstra, who was a pioneer in the field and indeed won the Turing Award because of this and other work; see his 1968 paper "Cooperating Sequential Processes" for an amazingly clear description of the problem. We will be hearing more about Dijkstra in this section of the book.