Why do I recommend switching over from MyISAM to Innodb!

MyISAM has been the default storage engine for MySQL for years now. But that is soon going to change. With MySQL 5.5, Innodb is going to be the default storage engine (and I am pretty pleased about that, no more new users complaining about MySQL not being ACID compliant).

But why do I recommend Innodb over MyISAM. Following are the reasons:

I will start of with reasons not so obviously discussed during comparisons of Innodb and MyISAM storage engines. But these reasons are so compelling that I bet you would want to switch over to Innodb after reading the reasons.

Innodb is being constantly improved and worked upon

There hasn’t been much work put forth into improving the MyISAM storage engine, and with Oracle acquiring MySQL and owning Innodb, it has put its weight on the Innodb storage engine. While there hasn’t been much work done on MyISAM side, Innodb has seen improvements and work is being done aggressively. MySQL 5.5 will ship with a new version of Innodb – version 1.1, which has major performance improvements. Even if you are using MySQL 5.1, you can replace the built-in Innodb with the newer Innodb plugin – version 1.0.8, which has introduced major performance improvements.

Innodb supports row-level locking

Innodb not only supports table-level locking, but also supports row-level locking, making it the storage engine of choice for high concurrency applications.

Innodb supports hot backup

With MyISAM, backing up a live database, is pretty much an issue for many DBAs because of the way how MyISAM supports locking. MyISAM only supports table-level locks which pretty much means that the server must go down during the backup, for sometime depending upon the amount of data in the database. With Innodb this is no issue at all, Innodb supports row level locking which allows you to do hot backups.

Innodb is ideal for high-concurrency applications

Yet another big reason for using Innodb is that with Innodb you can achieve high-user-throughput and high-database-concurrency. That is again thanks to the row-level mechanism of Innodb, which makes it a big winner (compared to MyISAM) in cases where you need highly concurrent read-write operations.

Innodb is crash proof

What I essentially mean by being crash proof is the database, recovering back into a consistent state in case of a crash (be it an OS crash or a MySQL server crash). While in the case of MyISAM, a crash would mean that you would have to deal with recovering tables on the crash or partially executed statements, in the case of Innodb there would be essentially no data inconsistency and all the more its more of an automated recovery, where by Innodb would recover itself by replaying its logs.

Innodb supports clustered indexes

Innodb stores rows ordered physically in PRIMARY KEY order. Storing the rows in primary key order corresponds to what is known in MSSQL Server as “Clustered index” and in Oracle as “index organized tables”. This makes primary key fetches extremely efficient.

Innodb implies various internal optimizations that improve the performance further

There are many different optimizations that Innodb performs internally that improve performance drastically. Following are some of the optimizations:

  • Adaptive Hash Indexing:
    To quote from the MySQL manual: “If a table fits almost entirely in main memory, the fastest way to perform queries on it is to use hash indexes rather than B-tree lookups. InnoDB monitors searches on each index defined for a table. If it notices that certain index values are being accessed frequently, it automatically builds an in-memory hash table for that index.
  • Read Ahead:
    A read-ahead request is an I/O request sent by Innodb to pre-fetch data pages into the innodb buffer pool cache. Innodb sends these read-ahead requests based on the access patterns of the data pages already in the buffer pool cache. This pre-fetching can improve read performance drastically for data that is being fetched linearly.
  • Innodb Buffer Pool:
    While MyISAM relies on the OS cache for caching reads and writes, Innodb has its own cache which it manages on its own and which is of course fine-tuned for Innodb operations. And combine this with adaptive hash indexing and read-aheads, and you can imagine the performance gains.
  • Innodb change buffering:
    When INSERT, UPDATE, and DELETE operations are performed on a table, Innodb buffers these changes into the Innodb buffer pool instead of flushing them directly to disk instantly. These changes are flushed to the disk only when needed. This avoids expensive I/O operations and thus providing great performance boost for applications that do a lot of inserts/updates/deletes. To quote the MySQL manual: “Because it can result in fewer disk reads and writes, this feature is most valuable for workloads that are I/O-bound, for example applications with a high volume of DML operations such as bulk inserts.”
  • Innodb background flushes:
    Innodb performs many of the I/O operations using asynchronous background threads, operations like read-ahead requests, dirty-pages flushing etc. This ensures that these I/O operations do not block user operations on the database.

Now come the reasons that are discussed more often, mentioned in all the comparisons of Innodb and MyISAM. Although you might not need transactions or foreign key constraints in all of your applications, but these do make Innodb a complete winner when compared with MyISAM.

Innodb supports transactions

Innodb supports ACID-compliant transactions, what that means is that you can group SQL statements in a transaction and be sure that the database will be in a consistent state, either you commit the transaction or rollback, or even more so the server crashes in the middle of a SQL statement.

Innodb supports foreign key constraints

Innodb supports foreign keys, so that you can enforce parent-child relationships. Operations such as deleting child rows when parent is deleted, or updating the child rows on updating of parent row are supported and enforced. This ensures that the database takes care of ensuring that the relationships are consistent, without the onus being on the application developer.

Conclusion:

If you are still not convinced about Innodb and have some questions in mind you may discuss them out with me.

What’s coming?

I had written a post about tuning MySQL server settings applicable to the MyISAM storage engine, I will soon be posting about tuning Innodb related MySQL server settings. Besides that, I will soon be deploying MySQL 5.5 on the server hosting this blog and I will be sharing my findings soon. Besides that I plan on posting some benchmarking results for MyISAM vs Innodb.

  • http://www.ovaistariq.net/358/tuning-mysql-server-settings/ Tuning MySQL Server Settings | ovais.tariq

    [...] You might also want to read me recommending the Innodb storage engine here. [...]

  • http://topsy.com/www.ovaistariq.net/460/why-do-i-recommend-switching-over-from-myisam-to-innodb/?utm_source=pingback&utm_campaign=L2 Tweets that mention Why do I recommend switching over from MyISAM to Innodb! | ovais.tariq — Topsy.com

    [...] This post was mentioned on Twitter by Ovais Tariq and Ovais Tariq, Ovais Tariq. Ovais Tariq said: Why do I recommend switching over from MyISAM to Innodb! | ovais.tariq http://t.co/KAZYtji via @ovaistariq #mysql #innodb [...]

  • http://www.alltopreviews.info/?p=2810 is a acre mini laptop a good choice for a laptop? | All Top Reviews|Book Review|Movie Review|Music Review|Product Review

    [...] Why do I recomm&#101&#110&#100 switching over from MyISAM to Innodb! | ovais.tar&#105&#113&#10Reasons to Buy Laptops and Notebook Computers | La&#112&#116&#111p Price – The … [...]

  • http://openark.org Shlomi Noach

    While I certainly recommend InnoDB over MyISAM, I would like to relate to “InnoDB supports clustered indexes”.
    This in itself is not an advantage necessarily. Yes, primary key lookups are faster! But were you aware that secondary key lookups involve two index searches (one for secondary, then one fro primary)?
    How about the increased cost for inserting rows in non-ascending primary key order?
    Clustered indexes have advatages and disadvantages; and so does InnoDB’s implementation.

  • http://marksverbiage.blogspot.com/ Mark R

    You’ll want to consider carefully changing the engine. In particular, MyISAM supports the following features which InnoDB does not:

    Full-text index
    Spatial indexes

    So if you’re using them, then you cannot change unless you refactor your application to remove them.

    Also be mindful of the disc space consumption and performance considerations. A full performance test should be considered mandatory, unless you are confident that your tables are so tiny and performance so unimportant that you can skip it.

  • http://www.ovaistariq.net/ Ovais Tariq

    Yes Innodb doesn’t support Full-text searching, but if an application wants to have a serious search functionality, then I would recommend going the Sphinx or Lucene route, to me MySQL full-text search has lots of limitations.
    Yes that would mean refactoring your application, but not in all cases, in most cases Innodb can be used as a drop-in replacement for MyISAM.

    As far as disc consumption is concerned, yes Innodb uses uncompressed indexes unlike MyISAM, but that can be dealt with if we pick and choose what to index and what not to, other than that Innodb supports table compression as well which brings down the data to about 40% of its original, and as far as performance is concerned, Innodb is miles ahead of MyISAM (I will be posting some numbers soon).

    You are very right about the fact that the transition to Innodb cannot be made without a fully thought-out plan. You have to understand how Innodb works and other things before you can make the switch over,.

  • http://hostingreviewdirectory.com/2010/11/recommend-me-a-dedicated-or-vps-provider-web-hosting-talk/ Recommend me a Dedicated or VPS provider – Web Hosting Talk | Hosting Review Directory

    [...] Why do I recommend switching over from MyISAM to Innodb! | ovais.tariq [...]

  • http://www.ovaistariq.net/ Ovais Tariq

    Thanks for commenting. I have be following your blog for quite some time now, and its good to see you here sharing your thoughts :)

    Let me share my thoughts on the two issues you have pointed out!

    1. Secondary key lookups:
    ——————————
    I agree with what you say about secondary key lookups, the secondary key doesn’t hold a pointer to the leaf nodes (data nodes), but it does hold all the primary key columns, that means if the query is for a column that is neither a part of the secondary key nor the primary key, then after a secondary key lookup a primary key lookup has to be performed as well.
    But as far as I am thinking that comes down to correct index creating strategy, if you have indexed the right columns according to the queries that you are performing most of the time then the second lookup can be skipped all together and the secondary key can act as a “covering index”.

    Example suppose you have a table test as follows:

    create table test(
    col_a varchar(15) primary key,
    col_b varchar(25),
    col_c int,
    key(col_c)
    ) engine=InnoDB;

    Then a query such as

    select col_a from test where col_c = 10;

    is basically a lookup by secondary key, but there is only going to be one lookup in this case because of the way how innodb stores the index (secondary key including the primary key columns), and this secondary key will infact become a covering index for this query.

    So to me it basically comes down to utilizing the right indexing strategy.

    And even in the case that there is a need for a second lookup, with the amount of RAM that we have at our disposal these days, the second lookup is going to be very efficient (innodb has exceptional performance for primary key lookups when data fits in memory) and even more so the primary key holds the data itself quite unlike MyISAM which hold a pointer to the data.

    2. Inserting rows in non-ascending primary key order:
    ————————————————————–
    Well I would say that can be avoided, the data can of course be sorted in primary key order, before loading it in the table. This is what I have always advised to people. Yes that is a limitation, but it is something that we can get over with in the application logic and considering the many number of great benefits, I think this can be pretty much acceptable.

    Do share your thoughts on my views,.

  • http://www.ovaistariq.net/ Ovais Tariq

    Thanks for commenting. I have be following your blog for quite some time now, and its good to see you here sharing your thoughts :)

    Let me share my thoughts on the two issues you have pointed out!

    1. Secondary key lookups:
    ——————————
    I agree with what you say about secondary key lookups, the secondary key doesn’t hold a pointer to the leaf nodes (data nodes), but it does hold all the primary key columns, that means if the query is for a column that is neither a part of the secondary key nor the primary key, then after a secondary key lookup a primary key lookup has to be performed as well.
    But as far as I am thinking that comes down to correct index creating strategy, if you have indexed the right columns according to the queries that you are performing most of the time then the second lookup can be skipped all together and the secondary key can act as a “covering index”.

    Example suppose you have a table test as follows:

    create table test(
    col_a varchar(15) primary key,
    col_b varchar(25),
    col_c int,
    key(col_c)
    ) engine=InnoDB;

    Then a query such as

    select col_a from test where col_c = 10;

    is basically a lookup by secondary key, but there is only going to be one lookup in this case because of the way how innodb stores the index (secondary key including the primary key columns), and this secondary key will infact become a covering index for this query.

    So to me it basically comes down to utilizing the right indexing strategy.

    And even in the case that there is a need for a second lookup, with the amount of RAM that we have at our disposal these days, the second lookup is going to be very efficient (innodb has exceptional performance for primary key lookups when data fits in memory) and even more so the primary key holds the data itself quite unlike MyISAM which hold a pointer to the data.

    2. Inserting rows in non-ascending primary key order:
    ————————————————————–
    Well I would say that can be avoided, the data can of course be sorted in primary key order, before loading it in the table. This is what I have always advised to people. Yes that is a limitation, but it is something that we can get over with in the application logic and considering the many number of great benefits, I think this can be pretty much acceptable.

    Do share your thoughts on my views,.

  • inaam

    Ovais,

    Nice article. Some observations:

    - The idiom is to say InnoDB not Innodb.
    - Row-level locking as oppose to table level locks is only the part of the story. A very important part is MVCC which ensures that readers don’t block for writers when doing consistent reads.
    - Change buffering is different from buffering changes in the buffer pool. The changes are buffered in a btree pages of which can also be flushed out of the buffer pool. The idea behind using a separate tree to buffer the changes is to reduce random IO by doing merge in batches.

  • http://www.ovaistariq.net/ Ovais Tariq

    Well thanks for commenting :)

    - Yeah indeed its InnoDB.
    - Yeah even though row-level locking is only part of the story but still its something that application developers have to know when selecting which storage engine to use.
    - And when talking about MVCC it, it does eliminate the need for locking reads, I should have mentioned it as a separate point in this article. But there is more to MVCC then that, quoting Wikipedia “MVCC provides each user connected to the database with a snapshot of the database for that person to work with. Any changes made will not be seen by other users of the database until the transaction has been committed”
    - Agreed with you on “change buffering”.,

  • http://leadnetproreviewbonus.com/ leadnetpro bonus

    Fresh produce players tend to move from one side to the other as their career progresses, switching for a change, promotion or more money, as and when new …

  • http://bestbrisbaneseo.com brisbane seo

    on the list (but may have still gotten in – take Michaele and Tariq Salahi at the first Obama state dinner) and of course the fashion hits and misses. …

  • James Attard
  • Spartah!

    Well even 2 years later, i find MyISAM to be faster at inserting a lot of rows into a table serially (one after the other), InnoDB seemed to slowdown because of very high disk usage, instead of just writing to the database it was trying to read from it and cache it etc. got about 400 rows per second (via direct gigabit LAN input) with InnoDB, and 4k rows per second with MyISAM. for now im sticking with MyISAM.

  • http://www.ovaistariq.net/ Ovais Tariq

    The insert performance of InnoDB or the performance in general relies on proper configuration of InnoDB, and the default configuration is no good. There are not many parameters to tune in MyISAM and such it works out of the box but such is not the case with InnoDB (as with any other complex and high performant DBMS). To make sure that writes to InnoDB are fastest, you need to at least tune innodb_buffer_pool_size, innodb_log_file_size, innodb_flush_log_at_trx_commit, and also make sure that you do the inserts in primary key order.

    Majority of the users who care for data safety and ACID-compliance go for InnoDB, because its just the extra notch you have to go to configure it properly but the benefits are so many.

  • jorge

    Hi!

    What about innodb_file_per_table, what are the cons? It’s true has only pros? so why MySQL doesn’t come as default with innodb_file_per_table?

  • http://www.ovaistariq.net/ Ovais Tariq

    Well like every configuration option of InnoDB it has both pros and cons.
    The pros are:
    - innodb_file_per_table when enabled allows you to reclaim space by truncating tables or doing NOOP alter,
    - it allows you to get around the i-node serialization problem in extended filesystems,
    - it allows you to move certain tables to different storage volume for example if you have some very hot tables you can move them to a SSD backed volume for higher performance,
    - it allows you to backup and restore single tables

    The cons are:
    - drop table has additional cost when innodb_file_per_table is enabled
    - if you have thousands of tables, then it could become difficult to manage thousands of files
    - on write heavy workload, when the ibd file needs to be extended to insert more records the operation is not too optimal

    I would say that the pros out weigh the cons, however in the end it depends on your usage. Many of the MySQL/InnoDB configuration options have default values that cannot be considered optimal. A case in point is the default value of innodb_buffer_pool_size which is very small, however this single parameter is the most important parameter for InnoDB performance.

    InnoDB has a lot of configuration options and lets you control each and every aspect of it, therefore if you want optimal performance out of the box with zero configuration, you can’t get that. You have to tune InnoDB according to the needs and hardware specs to get the best performance out of it. And once its tuned appropriately, InnoDB can give you far better performance as compared to MyISAM.

  • Arnaldo Rivera

    my only concern is hard disk space used…. does it recycle the space?

  • http://www.ovaistariq.net/ Ovais Tariq

    InnoDB typically uses more space than MyISAM tables, but that’s the cost of extra work that InnoDB has to do to maintain ACID requirements. Regarding recycling of space, that depends on how you use InnoDB. If you do not have innodb_file_per_table enabled, then InnoDB stores all the data in shared tablespace typically a single file named ibdata1 which does not shrink even if you drop tables or databases. However, if you enable innodb_file_per_table_option then InnoDB would store each table in its own file and then space can be reclaimed by either dropping the table, optimizing the table or running a noop-ALTER.

blog comments powered by Disqus