Before we demonstrate Isolation levels, we must define transactions and what ACID model is. Isolation levels are one of ACID model properties, Isolation.
A Transaction represents a collection of operations executed sequentially as one single operation. Consider the well know example of money transfer between two bank accounts of Alice and Bon. Money transfer involves the following operations:
- Read Alice’s account, by issuing a select statement, to check if Alice has X-amount or more for the transfer.
- Read Bob’s account, by issuing a select statement, to check if Bob’s account can receive money.
- Decrease Alice’s account by X-amount of money.
- Increase Bob’s account by X-amount of money.
In RDBMS for a unit of work to qualify as transaction it must comply to ACID model. So, what is ACID model ?
ACID model represents four properties that a transaction (single unit of work) must exhibit. We are not going to dive deep into ACID model, as there are tons of documents online that does so. We are going to define them only in a nutshell.
ACID stands for:
As mentioned, a transaction is a collection of operations. Other transaction must see these operations as one atomic unit of work, as one operation. Either all operations succeeds or they all fail !
Transaction must leave the data in a consistence state, after completion of the transaction. What consistence state means data that transaction writes must not break any data structure like B-trees and indexes and data must be valid according to constrains like unique and foreign keys, cascades, triggers or any combination theory.
This property is our main focus in this article. So, what are these isolation levels ?
Isolation levels define the degree/level of isolation of work (reading/modifying) of a number of transactions. When these transaction executes in parallel and work on same set of data. In other words, let T1 and T2 are two transactions that execute in parallel. Isolation level defines what modifications T2 can T1 see, during the execution of T1 and T2.
Here are definitions from well know sources:
- Transactions specify an isolation level that defines the degree to which one transaction must be isolated from resource or data modifications made by other transactions. Isolation levels are described in terms of which concurrency side-effects, such as dirty reads or phantom reads, are allowed. Miscrosoft
- Transaction isolation is one of the foundations of database processing. Isolation is the I in the acronym ACID; the isolation level is the setting that fine-tunes the balance between performance and reliability, consistency, and reproducibility of results when multiple transactions are making changes and performing queries at the same time. MySQL
- The SQL standard defines four levels of transaction isolation. The most strict is Serializable, which is defined by the standard in a paragraph which says that any concurrent execution of a set of Serializable transactions is guaranteed to produce the same effect as running them one at a time in some order. The other three levels are defined in terms of phenomena, resulting from interaction between concurrent transactions, which must not occur at each level. The standard notes that due to the definition of Serializable, none of these phenomena are possible at that level. (This is hardly surprising — if the effect of the transactions must be consistent with having been run one at a time, how could you see any phenomena caused by interactions?) PostgreSQL
- In database systems, isolation determines how transaction integrity is visible to other users and systems. For example, when a user is creating a Purchase Order and has created the header, but not the Purchase Order lines, is the header available for other systems/users, carrying out concurrent operations (such as a report on Purchase Orders), to see? Wikipedia
SQL standard defines four isolation levels of transaction. Each level has its own characteristics. Isolation levels are derived and defined in terms of phenomena (anomaly or circumstances) caused by interaction between transactions.
Following are definition of these anomalies:
Dirty reads occur when a transaction (T1) reads/sees data that is created or deleted by another uncommitted (in progress) transaction (T2).
A nonrepeatable read represents a situation where a transaction (T1) reads (executes a select query) some data. Meanwhile another transaction (T2) modifies this same data. When (T1) reads the modified data again (executes the same previous query again) it finds out that the data had been modified. Thus the name nonrepeatable reads. You can’t repeat the same query and get the same results.
A phantom read represents a change in the number of rows returned when a transaction (T1) executes the same query twice. Due to the fact that another transaction (T2) inserted some rows and committed its execution. In other words, transaction (T1) begins and executes a search, meanwhile transaction (T2) starts and inserts some rows(records) and commits. These rows satisfies transaction (T1)’s search. When transaction (T1) executes the previous search once again, it will find that the number of rows are different.
There are four isolation levels. Each isolation level is meant to prevent one or more of the above mentioned phenomena (situation or circumstances) .
This is the lowest isolation level. It allows dirty reads. Transaction (T1) can read data that transaction (T2) has just written before transaction (T1) commits. In other words, one transaction may see changes that another one is just making before it finishes and commits its work. T1 may see not-yet-committed changes made by T2
In this isolation level, a lock-based concurrency control DBMS implementation keeps write locks (acquired on selected data) until the end of the transaction, but read locks are released as soon as the SELECT operation is performed (so the non-repeatable reads phenomenon can occur in this isolation level, as discussed below).
Read committed is an isolation level that guarantees that any data read is committed at the moment it is read. It simply restricts the reader from seeing any intermediate, uncommitted, ‘dirty’ reads. It makes no promise whatsoever that if the transaction re-issues the read, it will find the same data; data is free to change after it is read. Wikipedia
It is worth to answer a question that might come to your mind. What is the difference between dirty ready and non-repeatable read ?
Yes, both means that some transaction (T1) can read modified data by (T2). But !
In dirty read transaction (T1) can see data that transaction (T2) added or modified but transaction (T2) is still not committed !. With non-repeatable reads transaction (T1) CAN’T see any addition or modifications, made by transaction (T2) until transaction (T2) is commited !
In this isolation level, a lock-based concurrency control DBMS implementation keeps read and write locks (acquired on selected data) until the end of the transaction. However, range-locks are not managed, so phantom reads can occur.
This is the strictest isolation level. Serializable isolation level simulation a situation where transaction executes one after another, in a series. So it doesn’t allow any transaction anomaly. However, this isolation represents a performance issue and it is rarely used for this reason. It is mostly used for debugging issues.
Here is a summary by PostgreSQL:
|Isolation Level||Dirty Read||Nonrepeatable Read||Phantom Read||Serialization Anomaly|
|Read uncommitted||Allowed, but not in PG||Possible||Possible||Possible|
|Read committed||Not possible||Possible||Possible||Possible|
|Repeatable read||Not possible||Not possible||Allowed, but not in PG||Possible|
|Serializable||Not possible||Not possible||Not possible||Not possible|
The final property of ACID model is durability. Durability states that committed data is saved by the system such that, even in the event of a failure and system restart, the data is available in its correct state.