您的位置:首页 > 数据库

UNDERSTANDING PSQL'S MVCC

2015-06-01 12:43 260 查看


转载来自:http://eric.themoritzfamily.com/understanding-psqls-mvcc.html


UNDERSTANDING PSQL'S MVCC

PostgresSQL implements something called MVCC. Which alleviates the need for a Read-Lock
in many cases. However when Glenn Franxman and I were trying to figure out how this works we were having trouble understanding it completely.
If you ask about MVCC in the #postgres irc channel on Freenode, they will direct you
to the following three links

http://en.wikipedia.org/wiki/Multiversion_concurrency_control
http://www.developer.com/net/vb/article.php/877181
http://www.postgresql.org/docs/current/static/mvcc.html

With SQL there are four transaction isolation levels, unfortunately
the Wikipedia entry only describes one, serializable. So relying on the Wikipedia documentation to describe how
Postgres' opperates is not completely accurate. I think that internally Postgres may operate the way that the Wikipedia entry describes, but uses a write-lock to emulate a "Read Commit" isolation level to prevent the transaction abort that would normally occur
by one transaction writing and comitting data . I don't know if that's true or not.
Another confusing bit about Postgres is that Postgres only implements two levels, "Read Committed" and "Serializable" even though you can set the other isolation levels. In the documentation, the four
transaction isolation levels refers to the SQL standard and not Postgres' actually functionality.
So when you look at the four transaction isolation levels,
you'll notice that it says that a dirty read is possible if you set the isolation level to "READ UNCOMMITTED", however because "READ UNCOMMITTED" is the same as "READ COMMITTED" internally in Postgres, you can never get a dirty read even if you set the isolation
level to READ UNCOMMITTED. That's ok because, as the documentation states, "the four isolation levels only define which phenomena must not happen, they do not define which phenomena must happen."
I'll run some examples of concurrent transaction to show you how the four levels act.


READ COMMITTED

Transaction 1Transaction 2
mvcc_test=# BEGIN;
BEGIN
mvcc_test=# SET TRANSACTION ISOLATION LEVEL
mvcc_test-# READ COMMITTED;
SET


mvcc_test=# BEGIN;
BEGIN
mvcc_test=# SET TRANSACTION ISOLATION LEVEL
mvcc_test-# READ COMMITTED;
SET


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


mvcc_test=# update mvcc_test set i = i + 1;
UPDATE 1


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 2
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


 
mvcc_test=# update mvcc_test set i = i + 1; -- 7


mvcc_test=# commit;
COMMIT


UPDATE 1
mvcc_test=# -- blocked until the other commited


 
mvcc_test=# commit;
COMMIT


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 3
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 3
(1 row)



READ UNCOMMITTED

With Read Uncommited I observed the same behavior as Read Commited. This reflects the transaction
documentation:

But internally, there are only two distinct isolation levels, which correspond to the levels Read Committed
and Serializable. When you select the level Read Uncommitted you really get Read Committed, and when you select
Repeatable Read you really get Serializable, so the actual isolation level might be stricter than what you select.

Transaction 1Transaction 2
mvcc_test=# BEGIN;
BEGIN
mvcc_test=# SET TRANSACTION ISOLATION LEVEL
mvcc_test-# READ UNCOMMITTED;
SET


mvcc_test=# BEGIN;
BEGIN
mvcc_test=# SET TRANSACTION ISOLATION LEVEL
mvcc_test-# READ UNCOMMITTED;
SET


mvcc_test=# BEGIN; -- 1
BEGIN


mvcc_test=# BEGIN; -- 2.
BEGIN


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


mvcc_test=# update mvcc_test set i = i + 1;
UPDATE 1


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 2
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


 
mvcc_test=# update mvcc_test set i = i + 1;


mvcc_test=# commit;
COMMIT


UPDATE 1
mvcc_test=# -- blocked until the other commited


 
mvcc_test=# commit;
COMMIT


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 3
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 3
(1 row)



SERIALIZABLE

Transaction 1Transaction 2
mvcc_test=# BEGIN;
BEGIN
mvcc_test=# SET TRANSACTION ISOLATION LEVEL
mvcc_test-# SERIALIZABLE;
SET


mvcc_test=# BEGIN;
BEGIN
mvcc_test=# SET TRANSACTION ISOLATION LEVEL
mvcc_test-# SERIALIZABLE;
SET


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


mvcc_test=# update mvcc_test set i = i + 1;
UPDATE 1


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 2
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


 
mvcc_test=# update mvcc_test set i = i + 1;


mvcc_test=# commit;
COMMIT


ERROR:  could not serialize access due to
concurrent update


 
mvcc_test=# select * from mvcc_test;
ERROR:  current transaction is aborted, commands
ignored until end of transaction block


 
mvcc_test=# commit;
ROLLBACK


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 2
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 2
(1 row)



REPEATABLE READ

Refering to the same paragraph in the transaction documentation:

But internally, there are only two distinct isolation levels, which correspond to the levels Read Committed
and Serializable. When you select the level Read Uncommitted you really get Read Committed, and when you select
Repeatable Read you really get Serializable, so the actual isolation level might be stricter than what you select.<http://www.postgresql.org/docs/current/static/transaction-iso.html>

Repeatable Read is the same as Serializable.

Transaction 1Transaction 2
mvcc_test=# BEGIN;
BEGIN
mvcc_test=# SET TRANSACTION ISOLATION LEVEL
mvcc_test-# REPEATABLE READ;
SET


mvcc_test=# BEGIN;
BEGIN
mvcc_test=# SET TRANSACTION ISOLATION LEVEL
mvcc_test-# REPEATABLE READ;
SET


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


mvcc_test=# update mvcc_test set i = i + 1;
UPDATE 1


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 2
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


 
mvcc_test=# update mvcc_test set i = i + 1;


mvcc_test=# commit;
COMMIT


ERROR:  could not serialize access due to
concurrent update


 
mvcc_test=# select * from mvcc_test;
ERROR:  current transaction is aborted, commands
ignored until end of transaction block


 
mvcc_test=# commit;
ROLLBACK


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 2
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 2
(1 row)



WHAT IS POSSIBLE IN ISOLATION LEVELS

So, in the four transaction isolation levels table, it describes what is possible
in the different isolation levels. Since I've demostrated that there are only two isolation levels in Postgresql. I'll demostrate the two read phenomena that could happen in both Serializable and Read Committed.

Let me explain what those two phenomena are. Here is a good explanation of read phenomena. I'm just simply repeating
what is stated at that link

Nonrepeatable ReadS1 reads data which is later changed and commited by S2. If S1 reads the same data again (after S2's commit) and finds it to have changed or to be deleted (according to S2's changes), this is called a non-repeatable read.
It is called non-repeatable because the same select statement doesn't return the same data (within the same transaction).Phantom readS1 reads data (select) with a specific where condition. After this read, S2 inserts some data that meets the S1's where condition and commits the inserted data. When S1 issues a select statement with the same where condition,
it finds new records. It is called phantom read because the new records seem to be of phantom origin. A phantom read is thus a special case of a non-repeatable read.


READ COMMITTED / NONREPEATABLE READ ATTEMPT

The SQL Standard says that with a Read Committed Isolation level, a "Nonrepeatable Read" is possible. Here is my attempt in proving that assertion.

Transaction 1Transaction 2
mvcc_test=# BEGIN;
BEGIN
mvcc_test=# SET TRANSACTION ISOLATION LEVEL
mvcc_test-# READ COMMITTED;
SET


mvcc_test=# BEGIN;
BEGIN
mvcc_test=# SET TRANSACTION ISOLATION LEVEL
mvcc_test-# READ COMMITTED;
SET


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


mvcc_test=# update mvcc_test set i = i + 1;
UPDATE 1


--


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 2
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


mvcc_test=# commit;
COMMIT


--


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 2
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 2
(1 row)


This is a Nonrepeatable Read. Because the data has change while inside a transaction.


READ COMMITTED / PHANTOM READ ATTEMPT

The SQL Standard says that with a Read Committed Isolation level, a "Phantom Read" is possible. Here is my attempt in proving that assertion.

mvcc_test=# BEGIN;
BEGIN
mvcc_test=# SET TRANSACTION ISOLATION LEVEL
mvcc_test-# READ COMMITTED;
SET


mvcc_test=# BEGIN;
BEGIN
mvcc_test=# SET TRANSACTION ISOLATION LEVEL
mvcc_test-# READ COMMITTED;
SET


mvcc_test=# select * from mvcc_test
where id = 2;
id | i
----+---(0 rows)


mvcc_test=# select * from mvcc_test
where id = 2;
id | i
----+---(0 rows)


mvcc_test=# insert into mvcc_test (id, i)
values (2, 1);
INSERT 0 1


mvcc_test=# select * from mvcc_test
where id = 2;
id | i
----+---(0 rows)


mvcc_test=# commit;
COMMIT


mvcc_test=# select * from mvcc_test where id = 2;
id | i
----+---2 | 1
(1 row)


This is a phantom read because the row poofed into existance inside the transaction when it didn't exist before.


SERIALIZABLE / NONREPEATABLE READ ATTEMPT

The SQL Standard says that with a Serializable Isolation level, a "Nonrepeatable Read" is not possible. Here is my attempt in proving that assertion.

Transaction 1Transaction 2
mvcc_test=# BEGIN;
BEGIN
mvcc_test=# SET TRANSACTION ISOLATION LEVEL
mvcc_test-# SERIALIZABLE;
SET


mvcc_test=# BEGIN;
BEGIN
mvcc_test=# SET TRANSACTION ISOLATION LEVEL
mvcc_test-# SERIALIZABLE;
SET


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


mvcc_test=# update mvcc_test set i = i + 1;
UPDATE 1


--


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 2
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


mvcc_test=# commit;
COMMIT


--


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 2
(1 row)


mvcc_test=# select * from mvcc_test;
id | i
----+---
1 | 1
(1 row)


This is not a Nonrepeatable Read. Because the data did not change while inside a transaction.


READ COMMITTED / PHANTOM READ ATTEMPT

The SQL Standard says that with a Read Committed Isolation level, a "Phantom Read" is possible. Here is my attempt in proving that assertion.

mvcc_test=# BEGIN;
BEGIN
mvcc_test=# SET TRANSACTION ISOLATION LEVEL
mvcc_test-# READ COMMITTED;
SET


mvcc_test=# BEGIN;
BEGIN
mvcc_test=# SET TRANSACTION ISOLATION LEVEL
mvcc_test-# READ COMMITTED;
SET


mvcc_test=# select * from mvcc_test
where id = 2;
id | i
----+---(0 rows)


mvcc_test=# select * from mvcc_test
where id = 2;
id | i
----+---(0 rows)


mvcc_test=# insert into mvcc_test (id, i)
values (2, 1);
INSERT 0 1


mvcc_test=# select * from mvcc_test
where id = 2;
id | i
----+---(0 rows)


mvcc_test=# commit;
COMMIT


mvcc_test=# select * from mvcc_test where id = 2;
id | i
----+---(0 row)


This is a phantom read because the new row did not appear inside the transaction after it was commit in the first transaction.
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: