Introduction
This article is designed to introduce you the concepts of isolation levels—the concepts that every database programmer needs to know to develop applications that allow multi-user access to the database.
Data Consistency
Data consistency is a term used to describe the accuracy of data. For example, suppose you have a table that maintains a count of vehicles available for sale in car dealerships. When a vehicle is transferred from dealership A to dealership B, the count has to decrease in dealership A and increase in dealership B. Both actions must take place; if one occurs and another does not, the data becomes inaccurate—data inconsistency occurs.
Data inconsistency can occur due to several factors:
- Data is accessed/updated by multiple users at the same time
- Transition fails to perform all necessary actions due to a crash
To prevent the occurrence of the above situations, a DBMS (database management system) relies on isolation levels and locks.
Isolation Levels
An isolation levels mechanism is used to isolate each transaction in a multi-user environment. The correct use of the isolation levels mechanism prevents applications from introducing errors that can occur from the following situations:
1. Lost Updates: This situation occurs when two transactions attempt to update the same data. Consider the following example:
- Transaction A reads row 1.
- Transaction B reads row 1.
- Transaction A updates row 1.
- Transaction B updates row 1, overlaying changes applied by Transaction A.
In the above situation, updates performed by Transaction A are lost.
2. Dirty Reads: This situation occurs when transactions read data that has not been committed. Consider the following example:
- Transaction A inserts row 1 without committing.
- Transaction B reads row 1.
- Transaction B rolls back row 1.
- Transaction A now has a row that physically does not exist.
3. Nonrepeatable Reads: This situation occurs when a transaction reads the same query multiple times and results are not the same each time. Consider the following example:
- Transaction A reads a row of data.
- Transaction B modifies this row and commits.
- Transaction A re-reads the same row and sets back different data values.
4. Phantoms: This situation occurs when a row of data matches the first time but does not match subsequent times. Consider the following example:
- Transaction A reads two rows based on a Query A where clause.
- Transaction B inserts a new row that happens to fall under Transaction A Query A’s where clause.
- Transaction A runs Query A again and now gets back three rows.
The above four phenomenona demonstrate that there is a need to utilize a mechanism called ISOLATION LEVELS. The next sections of this article will explain DB2 isolation levels. Most of the rules regarding isolation levels are the same across most transactional databases but some rules may vary; therefore, I advise you consult other sources when dealing with databases other than DB2. A DB2 database supports four isolation levels:
- REPEATABLE READ: Protects against Lost Updates, Dirty Reads, Nonrepeatable Reads, and Phantoms
- READ STABILITY: Protects against Lost Updates, Dirty Reads, and Nonrepeatable Reads. Read stability does not protect against Phantoms.
- CURSOR STABILITY: Protects against Nonrepeatable Reads and Phantoms. Cursor Stability does not protect against Lost Updates and Dirty Reads.
- UNCOMMITED READ: Protects against Lost Updates. Uncommitted Read does not protect against Phantoms, Dirty Reads, and Nonrepeatable Reads.
Repeatable Read
This isolation level is the most restrictive of the four. It makes sure that none of the phenomenona will occur. It is also the most restrictive when it comes to database concurrency. (Database concurrency controls ensure that transactions occur in an ordered fashion. The main job of these controls is to protect transactions issued by different users/applications from the effects of each other. They must preserve the four characteristics of database transactions: atomicity, isolation, consistency and durability.) When a transaction with isolation level REAPETABLE READ (RR) executes, it locks every row it references. It is important to understand that the lock is held for every row, not only rows that are updated, but also those that are merely selected. Consider the following situation:
A car dealership manager is executing a report that produces a total number of vehicles for sale. During this time, a salesperson is attempting to submit a purchase order that will decrement the number of vehicles on the lot. If the manager’s report is running with isolation level RR, the purchase order will not go through because all rows are being locked by the manager’s report.
The above situation reveals the importance of understanding the implications of using the Repeatable Read isolation level.
Now, look at another example but in more technical terms:
- Transaction A with Repeatable Read isolation level selects all rows on table A.
- Transaction B attempts to update any of the rows selected by Transaction A and fails because all rows selected by Transaction A are locked.
Read Stability
Read Stability is not as restrictive as the Repeatable Read isolation level; therefore, it does not prevent all the phenomenona. The difference between Read Stability and Repeatable Read is as follows:
In Read Stability, only rows that are retrieved or modified are locked, whereas in Repeatable Read, all rows that are being referenced are locked. This crucial difference makes Phantoms possible, but protects against Lost Updates, Dirty Reads, and Nonrepeatable Reads. Look at the following scenario:
- Transaction A with isolation level Read Stability selects row 1.
- Transaction B cannot update row 1 while Transaction B is holding a lock on row 1. This prevents against Lost Updates, Dirty Reads, and Nonrepeatable Reads.
- Transaction B can insert a new row that can show up in a result set of Transaction A. This allows for Phantoms.
Cursor Stability
Cursor Stability protects against Dirty Reads and Lost Updates, but does not protect against Phantoms and Nonrepeatable Reads. Cursor Stability only locks one row at a time—the row that is currently being referenced by a cursor. As soon as the cursor moves to the next row, the lock on the previous row is released. This provides for much more concurrency because other transactions can update rows before and after a row that is being referenced by a cursor with cursor stability. There are two exceptions to this rule:
- If the cursor with cursor stability retrieves rows using an index, now rows can be modified or inserted into the cursor’s result set.
- No transaction can modify or delete a row that has been updated by the Cursor Stability Cursor until the owning transaction is terminated.
This isolation level is a bit more complicated, so let me go through several scenarios:
- Transaction A opens cursor A with isolation level Cursor Stability and starts reading rows that fall under cursor’s where clause.
- While cursor A is executing and referencing row 2, transaction B cannot update or delete row 2, but can update and insert all other rows.
- Transaction B can update row 2 as soon as cursor A releases row 2 given that it has not modified it.
- If transaction A re-runs cursor A, there is no guarantee that resultset will be the same—a nonrepeatable read can occur
- While transaction A’s cursor is executing row 2, transaction B modifies row 3 and does not commit. Transaction A cannot see that change; therefore, it can in fact overlay transaction B updates—a lost update can occur.
Uncommited Read
An uncommitted read is the least restrictive of all isolation levels and provides the most database concurrency. This isolation level is mostly used to retrieve read-only data. Transactions running under this isolation level can see uncommitted data. An Uncommitted Read transaction locks only those rows that it modifies or if another transaction attempts to alter or drop the table the rows are being retrieved from. Uncommitted Read does not protect against dirty reads, nonrepeatable reads, and phantoms, but protects against lost updates.
Even though transactions running under Uncommitted Read will see uncommitted rows, they will not see tables, views, and indexes that have been created or dropped until the transaction that created/dropped them commits its changes. Consider the following situations:
- Transaction A, running with isolation level Uncommitted Read ,is updating row 1 and 2.
- While transaction A is running, transaction B (not running under Uncommitted Read) cannot read rows 1 and 2 because these rows are locked by Transaction A—lost updates phenomena cannot occur.
- While transaction A is running, transaction C (running under Uncommitted Read) can read rows 1 and 2—a dirty read can occur.
How to Specify the Isolation Level
The isolation level can be set in more than one way:
- The isolation Level can be set at the application level. For embedded SQL, the isolation level can be specified during BIND or PRECOMPILE.
ISOLATION [RR | RS | CS | UR]
- The isolation level can be set for CLI (Call Level Interface) using the SQLSetConnectAttr() function by passing the SQLATTR_TXN_ISOLATION attribute.
- The isolation level also can be set for ODBC (Open Database Connectivity) by changing the value of TXNISOLATION in the db2cli.ini file.
- The isolation level can be set in JDBC (Java Database Connection) at runtime. This can be done by invoking the setTransactionIsolation() method that resides in the java.sql.Connection class.
- The isolation Level can be assigned to a SELECT statement using a WITH clause.
WITH [RR | RS | CS | UR]
Note: If the isolation level is not explicitly set, the default isolation level used is CURSOR STABILITY.
Conclusion
In this article, you have learned about isolation levels, data concurrency, and database phenomena. You have also briefly touched a much broader subject of locks.
References
- http://databases.about.com/od/administration/g/concurrency.htm
- DB2 9 Fundamentals Certification Study Guide, First Edition (Exam 730) by Roger E. Sanders. MC Press, 2007
- Understanding DB2: Learning Visually with Examples by Raul F. Chong, Clara Liu, Sylvia F. Qi and Dwaine R. Snow , IBM Press, 2005
- DB2 Developer’s Guide, Fourth Edition by Craig S. Mullins, Sams, 2000
About the Author
Aleksey Shevchenko has been working with object-oriented languages for over seven years. For the past four years, he has served as a technical lead and a project manager. Aleksey has been implementing Enterprise IT solutions for Wall Street and the manufacturing and publishing industries.