Understanding InnoDB transaction isolation levels
Isolation is an important part of ACID properties that guarantee that transactions are processed in a reliable manner. Isolation ensures that concurrently running transactions do not interfere with each other. Isolation ensures data consistency. If the transactions are not isolated then one transaction could modify data that other transactions are reading hence creating data inconsistency.
Now that we have understood what isolation is let’s understand isolation levels. Isolation levels determine how isolated the transactions are from each other, it could be that there is no isolation or the maximum level of isolation which serializes transactions execution. Selecting the best isolation level really depends on your application needs, but for that you must first understand all the different isolation levels and the consequences of selecting any one of them.
InnoDB supports all the four SQL-standard isolation levels, which are listed and described in the following text.
The four isolation levels
READ-UNCOMMITTED isolation level there isn’t much isolation present between the transactions at all. The transactions can see changes to data made by other transactions that are not committed yet. That means transactions could be reading data that may not even exist eventually because the other transaction that was updating the data rolled-back the changes and didn’t commit. This is known as
dirty read. An application rarely needs to rely on dirty needs, in fact this really can’t be called an isolation level. Simply put, there isn’t really any isolation at all and hence such a system can’t really be called a transactional system.
READ-COMMITTED isolation level, the phenomenon of
dirty read is avoided, because any uncommitted changes is not visible to any other transaction, until the change is committed. Within this isolation level each SELECT uses its own snapshot of the committed data that was committed before the execution of the SELECT. Now because each SELECT has its own snapshot, so the same SELECT when run multiple times during the same transaction could return different result sets. This phenomenon is called
REPEATABLE-READ isolation level, the phenomenon of
non-repeatable read is avoided. This isolation level returns the same result set through out the transaction execution for the same SELECT run any number of times. A snapshot of the SELECT is taken the first time that SELECT is run during the transaction and the same snapshot is used through out the transaction when the same SELECT is executed. A transaction running in this isolation level does not take into account any changes to data made by other transactions, regardless of whether the changes have been committed or not. This ensures that reads are always consistent(repeatable). This isolation level is the default for InnoDB. Although this isolation level solves the problem of
non-repeatable read, but there is another possible problem
SERIALIZABLE isolation level, the phenomenon of
phantom reads is avoided. Transactions when run in this isolation level place locks on all records that are accessed, as well as locks the resource so that records cannot be appended to the table being operated on by the transaction. Transactions when run in this fashion, run in a serialized manner. This isolation level is the strongest possible isolation level.
SERIALIZABLE isolation level can be pretty much avoided!
As the text above describes,
phantom reads is the only problem that
SERIALIZABLE is helping you avoid, otherwise
SERIALIZABLE is pretty much same as
REPEATABLE-READ. InnoDB has a special locking feature called gap-locking which helps you avoid the
phantom reads problem. In the case of gap locking, a lock is set on the gap between index records, or on the gap before the first or after the last index record. Now all you have to do avoid phantom reads is to convert such reads to locking reads using SELECT with FOR UPDATE or LOCK IN SHARE MODE.
Locking and the isolation levels
READ-UNCOMMITTED isolation level has the least number of locking done, after that comes the
READ-COMMITTED isolation level which removes most of the gap-locking and hence produces fewer deadlocks, also in the case of
READ-COMMITTED, locking reads only lock the index records and not the gaps before/after them.
REPEATABLE-READ has a higher level of locking as compared to
READ-COMMITTED, UPDATE, DELETE use next-key locking, also locking reads also use next-key locking.
SERIALIZABLE has the highest level of locking, all the simple SELECTs are automatically converted to SELECT … LOCK IN SHARE MODE, and hence all records have shared locks.
Replication and isolation levels
The default replication type in MySQL is statement-based replication, and this replication type replicates the data changes by re-executing SQL statements on slaves that were executed on the master. This requires the isolation level to be stricter (involving more locking) so that the data changes are consistent in such a way that the same SQL when executed on the slave produces the same data changes. As mentioned above,
READ-COMMITTED creates a situation of non-repeatable read, hence its not safe for statement-based replication. Hence, when using statement-based replication either use
REPEATABLE-READ isolation level. If you have MySQL version >= 5.1 then you can use
READ-COMMITTED with the row-based replication, because with row-based replication you have exact information about each data row changes.
Performance and isolation levels
As I have mentioned in the “Locking and the isolation levels” section,
REPEATABLE-READ employ lots of locking and hence creating more deadlock situations, which in turn decreases performance. In fact
SERIALIZABLE is the least performant of the isolation levels, as it converts even plain reads into locking reads.
REPEATABLE-READ is better in terms of locking and deadlocks but
READ-COMMITTED is even better because there are fewer gap-locks. But locking and deadlocks is not the only thing when considering performance, there is another issue of mutex contention that needs consideration. There is a post by Mark Callaghan comparing both
READ-COMMITTED in the context of mutex contention. This post can be found here.
I have tried to provide a detailed description of the four isolation levels and causes and effects of using any one of them, I have also mentioned locking and performance as related to the isolation levels. Now
REPEATABLE-READ really are the only two isolation levels from which to make your selection. And that selection really depends on the kind of application you are running. I don’t think any generalized benchmark would help you make the selection between the two. So before making any decision run benchmarks specific to your application and then try to make a decision. I also sincerely hope that you are running MySQL version >= 5.1, because if you are running any other version
REPEATABLE-READ is pretty much your only choice.