Understanding InnoDB clustered indexes

Some people don’t probably know, but there is a difference between how indexes work in MyISAM and how they work in InnoDB, particularly when talking from the point of view of performance enhancement. Now since, InnoDB is starting to be widely used, it is important we understand how indexing works in InnoDB. Hence, the reason for this post!

The first and foremost thing to know is that InnoDB uses clustered index to store data in the table. Now what does clustered index mean?

Clustered Index

A clustered index determines the physical order of data in a table. When thinking of a clustered index think of a telephone directory, where data is physically arranged by the last name. Because the clustered index decides the physical storage order of the data in the table, a table can only have a single clustered index. But, a clustered index can comprise of multiple columns (a composite index), in the same way as a telephone directory is organized both by the first name and the last name.

Clustered Index with respect to InnoDB

InnoDB stores indexes as B+tree data structures, and same is the case with the clustered index. But the difference is that in the case of clustered index InnoDB actually stores the index and the rows together in the same structure. When a table has a clustered index, its rows are actually stored in the index’s leaf pages. Thus InnoDB tables can also be called index-organized tables.

Now lets consider how InnoDB decides which index to use as the clustered index!

How InnoDB selects a clustered index?

With InnoDB, typically PRIMARY KEY is synonymous with clustered index, but what if a PRIMARY KEY does not exist or there is not even a single index defined on the table. Then following is how InnoDB decides what to use as the clustered index:

  • If there is a PRIMARY KEY defined on the table, InnoDB uses it as the clustered index.
  • If there is no PRIMARY KEY defined on the table, InnoDB uses the first UNIQUE index where all the key columns are NOT NULL as the clustered index.
  • If there is no PRIMARY KEY or no suitable UNIQUE index present, InnoDB internally generates a hidden PRIMARY KEY and then uses this hidden key as the clustered index. This hidden PRIMARY KEY is a 6-byte field that increases monotonically as new rows are inserted.

Hence, my advice is that always define a PRIMARY KEY for each table that you create. If there is no logical key that can be created, add a new auto-increment column, and use it as the PRIMARY KEY.

Did you know that Secondary Index is related to the Primary Key?

In InnoDB, every SECONDARY INDEX contains the PRIMARY KEY column(s) together with the column(s) of the secondary index, automatically. That is because of the way InnoDB stores data, remember what I just told you when talking about how data is stored, a leaf node doesn’t store any pointer to the row’s physical location, but in fact stores the row’s data. So in other words the PRIMARY KEY is actually the pointer to the row data.

This makes us conclude on another interesting conclusion..

A secondary index requires two lookups! First a lookup for the secondary index itself, then a lookup for the primary key.

Advantages of clustering

Clustering provided by InnoDB has very significant performance benefits, some of which are mentioned below:

  • Because the data is physically stored according to the PRIMARY KEY, data lookups by PRIMARY KEY is very fast. For example, the fastest way to find a particular employee using the unique employee_id column is to create a PRIMARY KEY on the employee_id column.
  • With clustering, search for ranges can be extremely efficient. Suppose an application frequently searches records between a range of dates, a clustered index can quickly locate the row containing the beginning date, and then retrieve all adjacent rows in the table until the last date is reached. Thus improving the performance of range queries.
  • Another positive impact of clustering is on the performance of sorting data. Suppose there is a column that is used frequently to sort the data retrieved from a table, it can be advantageous to cluster the table on that column to save the cost of a sort each time the column is queried.
  • Also because clustered index holds both the index and the data together in one B-Tree, so retrieving rows from a clustered index is normally faster than a comparable lookup in a nonclustered index.
  • Secondary indexes can act as covering indexes, when the data that is requested include the primary key columns, because of the fact that secondary indexes automatically include primary key columns.

These benefits that I have mentioned can boost performance drastically, if you design your tables and queries accordingly. But clustered indexes have disadvantages as well.

Disadvantages of clustering

Following are some of the disadvantages of clustering:

  • If a large clustered index is defined, any secondary indexes that are defined on the same table will be significantly larger because the secondary indexes contain the clustering key.
  • Because of the way how the data is stored, secondary indexes require two lookups.
  • Clustered index can be expensive for columns that undergo frequent changes because it forces InnoDB to move each updated row to a new location.
  • Insertions can be slow, if the data is not inserted in PRIMARY KEY order, hence we can conclude that insert speeds depend heavily on insertion order. Inserting rows in primary key order is the fastest way to load data into an InnoDB table.

Update (thanks to sunny):

Following is another thing that one should know regarding secondary indexes:

The records in InnoDB secondary are never updated in place. Therefore, what that means is that an UPDATE of a secondary index column means deleting the old record and inserting a new one.

Although, I did point out some disadvantages, but the fact is that these disadvantages can not be weighted down by the tremendous amount of benefits that comes with clustering in InnoDB. If you study and understand the aspects that I have mentioned in this article and apply them accordingly, you are going to see great performance enhancements. After all, clustering is another important step in bringing MySQL closer to MSSQL and Oracle.

Ovais is a storage architect with keen focus on reliability, efficiency and performance of OLTP databases, more specifically MySQL. He currently works at Uber on ensuring storage reliability and scalability. Previously, he helped different web properties scale their MySQL infrastructure. He also worked at Percona as a Senior MySQL consultant and at a few other startups in various capacities.

  • Pingback: Tweets that mention Understanding InnoDB clustered indexes | ovais.tariq -- Topsy.com()

  • Ovais,

    I agree that a clustered index has big advantages for queries, and should be used. That is why the MySQL storage engine TokuDB (created by Tokutek where I work), allows users to define multiple clustering keys (http://tokutek.com/2009/05/introducing_multiple_clustering_indexes/), so that multiple indexes can produce faster query results. The disadvantages, slower insertion speed and disk usage, are mitigated by TokuDB’s 10x-80x faster indexed insertions(http://tokutek.com/downloads/mysqluc-2010-fractal-trees.pdf) and compression.

    Also, you mention that with clustered indexes, secondary indexes require two lookups. Storage engines that do not cluster the primary key require an index lookup and a fetch. With MyISAM, for example, a secondary index requires a lookup in the .MYI file, followed by a fetch in the .MYD file. The fetch still incurs a disk seek, which is expensive.

  • Hmm with respect to TokuDB, multiple clustered indexes does sound interesting,. but wouldn’t that be requiring more space then,. because say there are 3 clustered indexes,. then the table data would have to be stored 3 times., which certainly would mean 3x storage space required,. secondly., how are inserts handled,. having multiple clustered indexes would require major work during insertions and updates.,.

    And regarding the secondary indexes requiring two lookups., its just something that I have mentioned so that people would know., secondary index having the primary column as well can be an advantage as well as a disadvantage,.

    And regarding the comparison with MyISAM,. lookups by keys are more or less always going to be faster in case of InnoDB,.

  • Pingback: Understanding Dedicated Hosting | FREE ARTICLES | Hosting Review Directory()

  • Zardosht

    The downsides you mention exist, but TokuDB’s performance drastically reduces them. Yes, there is more storage space required, but TokuDB has great compression on data. We have seen results ranging from 5x to 15x compression over InnoDB. As for insertions, I would not categorize it as “major work”, but as “more work”. The insertions into the clustering indexes are of larger size over insertions into secondary indexes, but TokuDB has really fast indexed insertions, from 10x-80x faster than InnoDB.

    TokuDB’s performance characteristics make multiple clustering indexes an excellent tool for achieving great query performance in MySQL.

  • How do you think TokuDB’s compression compares with the compression offered by the Barracuda file format in InnoDB??

    Well could you explain, how the clustered index updates and insertions are not major work., besides that, with InnoDB the inserts have to be in primary key (Clustered key) order, otherwise there is an effect on performance,. does that hold for TokuDB as well,.

  • Sunny

    The two lookups on a secondary index are only required if the secondary index is not
    a covering index in the query or sometimes due to MVCC requirements. For the latter see:


  • Yeah although, secondary index requires two lookups only when its not a covering index,. but still its something that users should know,.

  • I have updated the article check it out,.

  • Ovais,

    Regarding compression, I am not know how the Barracuda file format compares with standard InnoDB and with TokuDB, so I cannot answer the question. I only know the 5x-15x improvement over standard InnoDB.

    The reason that updates and insertions are not major work is that TokuDB uses fractal trees (http://tokutek.com/2010/04/how-fractal-trees-work-talk-at-mysql-2010/) in place of B-trees. Insertions and updates into fractal trees are one to two orders of magnitude faster, because they do not incur the disk seeks that B-tree insertions do (http://tokutek.com/2010/05/disk-seeks-are-evil-so-lets-avoid-them-part-1/).

    With TokuDB, each clustered key maintains a copy of the data in the order of the defined clustering key, so the query performance with clustering keys is really fast, for all of the reasons you mention in your post.

  • TokuDB sounds quite promising,. I would certainly like to give it a try,.

  • Zardosht


    TokuDB is available for download at http://tokutek.com/products/tokudb-for-mysql-v4/. Usage is free for development licenses and deployments with data less than 50GB.

  • Thanks Zardosht, sure I will give it a try and post my findings,,

  • Pingback: Clustered and Non-clustered Index « IT Primer()

  • Some DBA

    A disk seek isn’t the same thing as a log lookup (log n) per record, so (n log n) + a disk seek for each.  That’s a big logical jump.  That said, multiple clustering indexes, esp. for a read environment could beat MyISAM too, assuming the overhead of a transactional engine, or heavier nodes in TokuDB, doesn’t weigh it down.

    TokuDB won’t do the adaptive hash indexing that InnoDB does either.  And a good hash table will beat anything.  In certain cases for databases less than tens of gigs, it might actually do better.  If you’re doing big data and OLAP, you’ll want bitmap indexes, which TokuDB won’t do.  I don’t think it can, either, as bitmap indexes kind of put the kibosh on fast inserts.

    If you’re doing OLTP or mixed OLAP+OLTP with big data, TokuDB will blow everything else away.  Does this sound about right?

  • I still have doubts about TokuDB write performance when maintaining multiple clustered indexes. Assuming you have 3 clustered indexes, then 1 write would translate to 3 reads + 3 and assuming the page is not in memory that would mean (read IO x height of tree 1 + write IO) + (read IO x height of tree 2 + write IO) + (read IO x height of tree 3 + write IO), assuming writes are buffered, so the write IO can be avoided here, but still too many random reads.
    Another question that I have in mind is what about the secondary indexes? In InnoDB secondary indexes have primary key columns in their leaf nodes, is it the same thing in TokuDB too?

  • Some DBA.

    I think what you’re neglecting is that TokuDB’s indices have recursively defined buffers that aggregate writes on each level and then flush to the next level.  The result is a better tree and fewer immediate I/Os.  The tree is never incorrect because the unflushed buffers are along the path to the correct leaf node where it will eventually reside.  I _think_ this is correct.

    Thing is, that doesn’t mean maintaining 3 copies of a large row is free, or that it compression will cancel out the copies.  It will still suck.  They can’t violate computer science, and it will _not_ perform any better than a a btree after an OPTIMIZE INDEX AFAIK.

    I would guess that, yes, TokuDB’s non-clustered indices may have an extra level of indirection, just as InnoDB does.  BDB also had that ‘problem.’  So it’s an extra logarithmic cost to find each row.

  • Zardosht

    Ovais, TokuDB uses Fractal tree indexes, which do not incur an I/O (not even a read I/O) on every write. This is why maintaining indexes (clustering and non-clustering) is much cheaper for TokuDB. We have seen users get a 20x-80x improvement in write performance. Look at http://www.tokutek.com/2011/11/how-fractal-trees-work-at-mit-today/ for details on how Fractal Tree indexes work.

  • Some DBA.


    Is my description mostly correct?

  • Zardosht

    @Some DBA,

    Your description is pretty much on the money.  Fractal Tree indexes are not just on-disk indexes with an in-memory buffer.  The buffering in a Fractal Tree index gives higher insertion performance (20x-80x over InnoDB) and, FWIW, is asymptotically better.  Insertion into a B-tree takes O((log N)/log B) I/Os, and insertion into a Fractal Tree index takes O((log N)/B) I/Os.

    You made some other comments that I’d like to address.  You say that keeping 3 indexes isn’t free.  That is certainly true.  TokuDB can maintain 1 index at insertion rates 3 times faster than it can maintain 3 indexes.  But even with 3 indexes, that is still much faster than InnoDB with 1 index.  And TokuDB has better compression than InnoDB (5x-15x better).  So, although 3 indexes take more space than 1 for a Fractal Tree index, it’s still considerably less space than what InnoDB would take, even for one index.

    The way I think about it is that you get much more insertion and compression performance, and you use that to improve query performance.  It’s not that the new indexes are free but that you have a bigger effective budget to pay for them.

  • Pingback: Why do aggregate functions alter the way my rows are fetched? | XL-UAT()

  • Richard Yang

    Thank you. Good explanation. But that would be better if you have some diagrams to help us understand.

  • I made a detailed presentation on InnoDB indexing at Percona Live conference. You can get it from the link below, it has a lot more details: