Wednesday, 27 February 2013

Transaction and concurrency control schemes


Transaction management is one of the most complex and crucial pieces of an enterprise application. Software applications keep getting more complicated and are increasingly distributed in nature, which makes transaction management even harder to design and implement.

What is a Transaction:
A transaction is a unit of work containing one or more operations involving one or more shared resources. A transaction has the ACID properties, which guarantee that a transaction is never left incomplete, data is never inconsistent, concurrent transactions are independent of each other, and the effects of a committed transaction are persistent.

What is a Distributed Transaction:
A distributed transaction is a bundle of operations in which two or more network hosts are involved. Usually, the hosts provide transactional resources, while the transaction manager is responsible for creating and managing a global transaction that encompasses all operations against those resources.

ACID Properties:

  1. Atomicity: either all the operations in the transaction are performed, or none of them are. Performing only part of a transaction is not permitted.
  2. Consistency: when a transaction completes, the system must be in a stable and consistent state.
  3. Isolation: the partial work done in one transaction is not visible to other transactions until that transaction is committed, so each process in a multi-user system can be programmed as if it were the only process accessing the system.
  4. Durability: once a transaction commits, the changes made during the transaction are stored permanently.
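
As a rough illustration of atomicity, the sketch below moves an amount between two in-memory accounts and restores a snapshot if any step fails, so a half-finished transfer is never left behind. The class AtomicTransfer and all of its methods are hypothetical names made up for this example; a real database relies on its transaction log rather than a copied map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory ledger, used only to illustrate all-or-nothing behaviour.
final class AtomicTransfer {
    private final Map<String, Long> balances = new HashMap<>();

    synchronized void open(String account, long amount) {
        balances.put(account, amount);
    }

    synchronized long balance(String account) {
        return balances.get(account);
    }

    // Returns true if both steps were applied, false if everything was rolled back.
    synchronized boolean transfer(String from, String to, long amount) {
        Map<String, Long> snapshot = new HashMap<>(balances); // undo information
        try {
            balances.put(to, balances.get(to) + amount);      // step 1: credit
            long remaining = balances.get(from) - amount;     // step 2: debit
            if (remaining < 0) {
                throw new IllegalStateException("insufficient funds");
            }
            balances.put(from, remaining);
            return true;                                      // "commit"
        } catch (RuntimeException e) {
            balances.clear();
            balances.putAll(snapshot);                        // "rollback": undo step 1 as well
            return false;
        }
    }
}
```

In this sketch a transfer that would overdraw the source account returns false and leaves both balances exactly as they were, even though the credit step had already run.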

To guarantee the above properties, a system has to work hard when two (or more) transactions operate on the same piece of data at the same time (also known as concurrent access). In concurrent scenarios, some kind of concurrency control is necessary. There are two different approaches to concurrency control:

  • Pessimistic Concurrency Control (a.k.a. Pessimistic Locking)
  • Optimistic Concurrency Control (a.k.a. Optimistic Locking, see 1).

Pessimistic Concurrency Control
As the name suggests, the system assumes the worst: it assumes that two or more users will want to update the same record at the same time, and prevents that possibility by locking the record, no matter how unlikely conflicts actually are.

The locks are placed as soon as any piece of the row is accessed, making it impossible for two (or more) users to update the row at the same time. Whether other users can still read the data while a lock is in place depends on the lock mode (shared, exclusive, or update, see 2).
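
The idea can be sketched in plain Java, assuming an in-memory value stands in for a database row and a ReentrantReadWriteLock stands in for the database's shared and exclusive lock modes. The class LockedRow and the helper runWriters are hypothetical names for this illustration only; a database would take its row locks itself.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical in-memory "row" that is locked before every access.
final class LockedRow {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private long value;

    // Shared mode: any number of readers may hold the read lock together.
    long read() {
        lock.readLock().lock();
        try {
            return value;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Exclusive mode: the writer waits until no other reader or writer holds the lock.
    void add(long delta) {
        lock.writeLock().lock();
        try {
            value = value + delta;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Runs several writer threads against one row and returns the final value.
    static long runWriters(int threads, int updatesPerThread) {
        LockedRow row = new LockedRow();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < updatesPerThread; j++) {
                    row.add(1);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for writers", e);
            }
        }
        return row.read();
    }
}
```

Because every update takes the exclusive lock first, no increment is lost: LockedRow.runWriters(4, 1000) ends with a value of 4000, at the cost of each writer waiting its turn.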

Optimistic Concurrency Control
This approach assumes that although conflicts are possible, they will be very rare. Instead of locking every record every time it is used, the system merely looks for evidence that two users actually did try to update the same record at the same time. If such evidence is found, one user's updates are discarded and that user is informed.
Implementation Strategies:
  • Use a version or timestamp column in every row. This is the most common strategy, performs well compared with the others, and is popular in Hibernate.
  • Dirty checking: re-read the row and compare the values of the updated columns.
  • All: re-read the row and compare all values.
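
The version-column strategy can be sketched roughly as follows, assuming an in-memory map stands in for the table. OptimisticStore, Row and their methods are hypothetical names for this illustration and are not Hibernate's API; the update method imitates a statement of the form UPDATE ... SET version = version + 1 WHERE id = ? AND version = ?.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory table with a version number per row.
final class OptimisticStore {
    // Immutable snapshot of a row: its value and the version it was read at.
    static final class Row {
        final String value;
        final int version;

        Row(String value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<Long, Row> rows = new HashMap<>();

    synchronized void insert(long id, String value) {
        rows.put(id, new Row(value, 0));
    }

    // Reading takes no lock on the row; it just returns the current snapshot.
    synchronized Row read(long id) {
        return rows.get(id);
    }

    // Applies the update only if the row is still at the version the caller read.
    // Returns false when another writer changed the row in the meantime.
    synchronized boolean update(long id, String newValue, int expectedVersion) {
        Row current = rows.get(id);
        if (current == null || current.version != expectedVersion) {
            return false; // conflict detected: the caller's update is discarded
        }
        rows.put(id, new Row(newValue, expectedVersion + 1));
        return true;
    }
}
```

If two users read the same row at version 0 and both try to save, the first update succeeds and bumps the version to 1; the second still carries version 0, so it is rejected and that user has to re-read and retry.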

Disadvantages of Pessimistic locking:
A resource is locked from the time it is first accessed in a transaction until the transaction finishes (either by commit or rollback), making it inaccessible to other transactions during that time. Because locks are applied in a fail-safe way, this results in high lock contention, even though most transactions simply look at the resource and never change it.

Footnotes:
1. Even though the optimistic concurrency control mechanism is sometimes called optimistic locking, it is not a true locking scheme—the system does not place any locks when optimistic concurrency control is used. The term locking is used because optimistic concurrency control serves the same purpose as pessimistic locking by preventing overlapping updates.

2. Lock Modes
  1. Shared: allows multiple users to read the data, but does not allow any user to change it.
  2. Exclusive: allows a single user to update a particular piece of data. As the name suggests, no other type of lock may be placed on the row.
  3. Update: rows locked in this mode cannot be locked for update by other users, which ensures that the current user can later update the row. Update locks are similar to exclusive locks. The main difference between the two is that you can acquire an update lock when another user already has a shared lock on the same record. This lets the holder of the update lock read data without excluding other readers. However, once the holder of the update lock changes the data, the update lock is converted into an exclusive lock.
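
The three modes can be summarised as a small compatibility check. The enum below is a hypothetical sketch, not any database's API. It assumes that shared and update locks can coexist in either order, that only one update lock may be held on a row at a time, and that an exclusive lock is compatible with nothing; actual rules vary between database products.

```java
// Hypothetical lock modes with a compatibility rule based on the description above.
enum LockMode {
    SHARED, UPDATE, EXCLUSIVE;

    // True if a lock in this mode can be granted while another user holds `held`.
    boolean compatibleWith(LockMode held) {
        if (this == EXCLUSIVE || held == EXCLUSIVE) {
            return false; // exclusive tolerates no other lock on the row
        }
        if (this == UPDATE && held == UPDATE) {
            return false; // assumption: only one prospective updater at a time
        }
        return true;      // shared with shared, or shared with update
    }
}
```

For example, LockMode.UPDATE.compatibleWith(LockMode.SHARED) is true in this sketch, while LockMode.UPDATE.compatibleWith(LockMode.UPDATE) is false.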