MySQL Paginated displays – How to kill performance vs How to improve performance!
Pagination is used very frequently in many websites, be it search results or most popular posts they are seen everywhere. Typically you have a list of 10 to 20 results below which are pagination controls with either the page numbers listed or previous/next links present.
Let’s visualize by taking a look at how the big guns go about the pagination scenarios.
Google shows a list of page numbers but it doesn’t show the entire list of page numbers, neither does it show the exact count of the number of results, rather it estimates the number of results. The list of page numbers starts with 1 – 10 instead of the entire list of page numbers and it increases as a user digs deeper into the results, meaning on page 2 of the result you would see page numbers 1 – 11, on page 3 you would see 1 – 12 and so on.
Facebook has next previous button to navigate through the search results, each search result page having 10 results each, the other thing that Facebook is doing is its estimating the number of results and doesn’t give an exact count. There are no page numbers given, so that a user will only be able to the use the next previous button to navigate through the results.
Yahoo shows a list of page numbers, 11 at a time to allow you to navigate through the search results, it doesn’t show you the entire list of page numbers. The page number are shown in such a way that if you are on pages 1 to 6 you would see page numbers 1 – 11 and after that the page numbers list would start from (current page number – 5) to (current page number + 5).
Analysis of the big guns:
Analyzing how Google, Facebook and Yahoo are going about handling pagination you would see that there are certain things that are common, which are:
- None of them show the exact count of the number of results.
- None of them show the entire list of page numbers, nor do they have controls such as ‘Goto Last Page’ or anything that would involve digging deep into the search results.
- They are interested in showing the user things that are relevant to the user, because no one would want to have a look at the least relevant results (a list of 10 page numbers is a save bet, because thats what most of users would be interested in).
Now let’s have a look at a common scenario at how pagination is usually implemented.
Typical pagination implementation:
Typically pagination is implemented in the following steps:
- Get the total number of records by either executing
SELECT COUNT(*) FROM my_table WHERE x = y ORDER BY date_col;
SELECT SQL_CALC_FOUND_ROWS * FROM my_table WHERE x = y ORDER BY date_col LIMIT 0, 10;
- If you were using SELECT COUNT(*), then execute another query to fetch the results typically as follows:
SELECT * FROM my_table WHERE x = y ORDER BY date_col LIMIT 0, 10;
- Execute the above two steps for the remaining pages, the only thing to change would be to replace
LIMIT 0, 10with
LIMIT 10, 10
LIMIT 20, 10and so forth.
Performance implications of the typical pagination implementation:
The performance implications of such naive pagination implementation are frightening to say the least. Following is why I am saying so:
- First of all if you go the two query route, executing a count query and then a result fetching query would involve some overhead.
- If you don’t go the two query route and still decide to use
SQL_CALC_FOUND_ROWS, then the performance implication is going to suffer more than if you had used count query, because MySQL would then not be able to use its limit optimization. Instead it will have to have a look at all the records that match because we are asking MySQL to provide us a count as well. If we hadn’t used
SQL_CALC_FOUND_ROWSthen MySQL would have just stopped processing after having found the number of rows that we requested.
- If you are naive enough to have pagination links like Last Page or Goto Page 1000 or Goto Page 10000, then you are asking for more troubles. Suppose you asked for page number 10001 and you asked for 10 rows, then what MySQL would essentially be doing is it would fetch 10020 rows discard the first 10000 and return the 20 rows you requested. Image the overhead.
Having gone through analyzing what the big players like Google, Yahoo and Facebook are doing, and keeping in mind the performance implications of a typical pagination implementation, I would suggest a pretty straight forward solution.
I would suggest going the route of the big players. Following is how I would implement pagination:
- Don’t show all the results no one does, and no one is interested in viewing all the results.
- Go the Facebook route and display Next Page Previous Page buttons only
- Get rid of the results counting story and instead use clever querying. Suppose you show 10 results to the users at a time, then all you have to do to know if you have to show the Next Page link is fetch one extra record that is 11 records and if the 11th record exist that means you can show the Next Page link.
SELECT * FROM my_table WHERE x = y ORDER BY date_col LIMIT 0, 11;
Finally, I would say that although pagination seems trivial but it still has performance implications. Investigate and develop is the best way to design your application (
slow query log is a big help).