Understanding B+tree Indexes and how they Impact Performance

Indexes are a very important part of databases and are used frequently to speed up access to particular data item or items. So before working with indexes, it is important to understand how indexes work behind the scene and what is the data structure that is used to store these indexes, because unless you understand the inner working of an index, you will never be able to fully harness its power.

B+tree Indexes

Indexes are stored on disk in the form of a data structure known as B+tree. B+tree is in many ways similar to a binary search tree. B+tree follows on the same structure as of a binary search tree, in that each key in a node has all key values less than the key as its left children, and all key values more than the key as its right children.
But there are some very important differences,

  • B+tree can have more than 1 keys in a node, in fact thousands of keys is seen typically stored in a node and hence, the branching factor of a B+tree is very large and that allows the B+trees to be a lot shallower as compared to their binary search tree counterparts.
  • B+trees have all the key values in their leaf nodes. All the leaf nodes of a B+tree are at the same height, which implies that every index lookup will take same number of B+tree lookups to find a value.
  • Within a B+tree all leaf nodes are linked together in a linked-listed, left to right, and since the values at the leaf nodes are sorted, so range lookups are very efficient.

A typical B+tree

Following is a typical B+tree:

If you need more information on B+trees, you can have a detailed look at the article available on Wikipedia.

Why use B+tree?

B+tree is used for an obvious reason and that is speed. As we know that there are space limitations when it comes to memory, and not all of the data can reside in memory, and hence majority of the data has to be saved on disk. Disk as we know is a lot slower as compared to memory because it has moving parts. So if there were no tree structure to do the lookup, then to find a value in a database, the DBMS would have to do a sequential scan of all the records. Now imagine a data size of a billion rows, and you can clearly see that sequential scan is going to take very long.
But with B+tree, its possible to store a billion key values (with pointers to billion rows) at a height of 3, 4 or 5, so that every key lookup out of the billion keys is going to take 3, 4 or 5 disk accesses, which is a huge saving.

The reason why B+tree is chosen over other tree structures is that B+trees tend to be very shallow, and since every lookup translates to a disk access, the number of disk accesses required to fetch a value is directly proportional to the height of the tree, so the shallower the tree, the less number of disk accesses.

How is B+tree structured?

B+trees are normally structured in such a way that the size of a node is chosen according to the page size. Why? Because whenever data is accessed on disk, instead of reading a few bits, a whole page of data is read, because that is much cheaper.
Let us look at an example,
Consider InnoDB whose page size is 16KB and suppose we have an index on a integer column of size 4bytes, so a node can contain at most 16 * 1024 / 4 = 4096 keys, and a node can have at most 4097 children.
So for a B+tree of height 1, the root node has 4096 keys and the nodes at height 1 (the leaf nodes) have 4096 * 4097 = 16781312 key values.
This goes to show the effectiveness of a B+tree index, more than 16 million key values can be stored in a B+tree of height 1 and every key value can be accessed in exactly 2 lookups.

How important is the size of the index values?

As can be seen from the above example, the size of the index values plays a very important role for the following reasons:

  • The longer the index, the less number of values that can fit in a node, and hence the more the height of the B+tree.
  • The more the height of the tree, the more disk accesses are needed.
  • The more the disk accesses the less the performance.

So the size of the index values have a direct bearing on performance!

I hope you have understood how B+tree indexes work and how they are used to improve the performance of lookups. I hope you have also understood how important it is to keep the height of the B+tree smaller so as to reduce the number of disk accesses.

Ovais is a storage architect with keen focus on reliability, efficiency and performance of OLTP databases, more specifically MySQL. He currently works at Uber on ensuring storage reliability and scalability. Previously, he helped different web properties scale their MySQL infrastructure. He also worked at Percona as a Senior MySQL consultant and at a few other startups in various capacities.

  • Asdad

    gjk hj hj hj hj hj hj hj

  • Joseph

    Thanks for the post. This data structure is useful when thinking about indexes. But, how can we use this information when defining indexes on a large table, say a trade table? And how does table partitioning affect this data structure?

  • So the things that you should take out from this post are:

    1. Index columns with small values, so that the B+tree height is small and index access is fast, that means prefer numeric values over string values, or even when you are indexing string columns, index small prefixes.
    2. Index columns which have high selectivity, indexing columns such as gender which has only two distinctive values is counter productive., so never index columns which have such less distinct values.
    3. As you can see from the structure of the B+tree, that leaf nodes are connected together as a double-linked list, so that means if you have proper indexes then sorting (ASC/DESC) using indexes is very fast.
    5. Values that are clustered together are best handled by B+tree, which means that range scans are very fast, so avoid using values such as UUID as indexes.

    Keeping all these things in mind you can design proper indexes on whatever kind of table.

  • Partitioned tables mean indexes that are shallower, because they have less values to index, so lookups are fast.,

  • B B

    I have to disagree with “never index columns which have such less distinct values”. This is unfortunately a rule misquoted and followed blindly by software developers pretty often. Basically, index’s ultimate goal is to speed up data lookups so it all depends on what you’re looking for. When taking the gender storage as an example, you may have a situation where particular gender comprises a very small fraction of records, say 0.1% (female Star Trek fans for example). If your app is scanning through this column just for that 0.1% of all records, an index there will greatly benefit your app even though there are only 2 distinct values available.

    That rule assumes, which is never mentioned properly, that accesses to all values in such a low-selectivity column are more or less uniform in frequency.

  • I will have to disagree with you,. there are two things here,. generalization and exceptions,. the example that you have posted here is an exception,. 99% of the times columns with low-selectivity are a bad choice for indexing,. the example that you have posted is a situation that can easily change with time., right now there may only be 0.1% female fans,. but over time there could be more,.

  • Pingback: Cheatsheet: 2011 07.11 ~ 07.18 - gOODiDEA.NET()