MySQL Query Optimization – Tip # 1 – Avoid using wildcard character at the start of a LIKE pattern.

The more I go through others SQL, there are some common mistakes that I see developers making over and over again, so I thought why not start a series of tips that can help developers optimize their queries and avoid common pitfalls.

So this post is a part of that series of tips, and this is the first tip to start things off!

Avoid using a wild card character at the start of a LIKE pattern

There are many times I have seen SQL containing the following:

SELECT * FROM tbl WHERE col1 LIKE '%x%';

This is something that should be avoided because MySQL would not use an index for such a query.

Here is a relevant quote from MySQL Internals Manual:
“The optimizer will use an index (range search) for
column1 LIKE ’x%’
but not for
column1 LIKE ’%x’
That is, there is no range search if the first character in the pattern is a wildcard.”

Let’s test this out using some sample data.

/*Table structure for table `scores` */

CREATE TABLE `scores` (
  `id` int(11) NOT NULL DEFAULT '0',
  `name` varchar(255) DEFAULT NULL,
  `score` int(3) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `student_name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

/*Data for the table `scores` */

insert  into `scores`(`id`,`name`,`score`) values (1,'ovais',99),(2,'umair',100),(3,'hassan',78),(4,'saleem',66),(5,'zeeshan',32);

Query with the first character as a wild card character

mysql> explain select * from scores where `name` like '%o%';
+----+-------------+--------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table  | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+--------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | scores | ALL  | NULL          | NULL | NULL    | NULL |    5 | Using where |
+----+-------------+--------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)

Query with the first character as a non-wild card character

mysql> explain select * from scores where `name` like 'o%';
+----+-------------+--------+-------+---------------+--------------+---------+------+------+-------------+
| id | select_type | table  | type  | possible_keys | key          | key_len | ref  | rows | Extra       |
+----+-------------+--------+-------+---------------+--------------+---------+------+------+-------------+
|  1 | SIMPLE      | scores | range | student_name  | student_name | 258     | NULL |    1 | Using where |
+----+-------------+--------+-------+---------------+--------------+---------+------+------+-------------+
1 row in set (0.00 sec)

Conclusion

Take a look at the explain output of the first query, especially the type column that says 'ALL', what that means that the index on the name column student_name cannot be used, instead MySQL does a “sequential scan of table” commonly referred to as table scan which is a very costly operation.

Now take a look at the explain output of the second query, its much better than the first one, the type column now says 'range', which means MySQL would do a range scan on the key ‘student_name’ for all names starting with ‘o’.

  • Jozef Briss

    Without showing better way for doing SAME query your tips does not make sense.

    select * from scores where `name` like ‘%o%’;

    and

    select * from scores where `name` like ‘o%’;

    do not return same results generally.

  • Pingback: Tweets that mention MySQL Query Optimization – Tip # 1 – Avoid using wildcard character at the start of a LIKE pattern. | ovais.tariq -- Topsy.com

  • Pingback: MySQL Query Optimization – Tip # 1 – Avoid using wildcard … | mysql

  • http://twitter.com/develophp Ole

    First off, thanks for the heads up. It’s of course something to keep in mind. But I got to agree with Jozef on this. If you’re not able to provide a working code that will returnt he same result, and still be faster, then there’s not alot of help in this post.

    But thanks anyway for the effort

  • http://www.google.com/profiles/thinice thinice

    Yea I agree, ‘%foo’ vs ‘foo%’ are vastly different and unavoidable in many circumstances. Maybe a reference to SOUNDEX() would be appropriate since we’re talking search-ish?

  • http://twitter.com/shazamo Abu Ashraf

    Like the other guys said, if we don’t use ‘%x’, what is the alternative?

  • http://www.ovaistariq.net/ Ovais Tariq

    The reason I put up this tip is so that queries like ‘%x%’ should be avoided,.

    Talking about the alternatives, there are two,
    1. Use full text searching, or
    2. Use something like sphinx

  • http://www.ovaistariq.net/ Ovais Tariq

    Well ‘o%’ is pretty much an alternative for ‘%o%’ and that is how I intended it be in this post. May be I couldn’t explain this properly.

    Most of the time
    like ‘%o%’
    can be replaced with
    like ‘o%’.

    Suppose you have an application where people search for ‘Person Names’, then I don’t think ‘%o%’ would be appropriate, rather it would come down to something like following:

    select * from persons where first_name like ‘o%’ OR last_name like ‘o%’;

    and that would make more sense.

    An example that comes to mind that uses the same strategy is the search functionality in iPhone.

    And if the application is really pretty much a search application, then relying on ‘LIKE’ is not the right strategy, you should consider Sphinx or Full text searching then,.

  • http://twitter.com/develophp Ole

    Thanks for your reply and for clearing this up. I haven’t read to much about MySQL yet, or worked on projects big enough for this to have any impact. But using full text searching seems like a good replacement. I will sure look into this :)

  • http://www.ovaistariq.net/ Ovais Tariq

    Well ‘o%’ is pretty much an alternative for ‘%o%’ and that is how I intended it be in this post. May be I couldn’t explain this properly.

    Most of the time
    like ‘%o%’
    can be replaced with
    like ‘o%’.

    Suppose you have an application where people search for ‘Person Names’, then I don’t think ‘%o%’ would be appropriate, rather it would come down to something like following:

    select * from persons where first_name like ‘o%’ OR last_name like ‘o%’;

    and that would make more sense.

    An example that comes to mind that uses the same strategy is the search functionality in iPhone.

    And if the application is really pretty much a search application, then relying on ‘LIKE’ is not the right strategy, you should consider Sphinx or Full text searching then,.

  • Pingback: How to connect your C# program to a remote server Mysql database? | HOST ANSWERS

  • barthazar

    How about storing the same value in a second column, where the string would be reversed. The you can put an index(es) on both and have fun with LIKE ‘a%’ on both columns.

  • barthazar

    Sorry, that was dumb – you would search from the beginning or the end of said string, but not *inside* the string.

  • Pingback: Optimizing Site With Seo | Rhino Mobile WebSite Design

  • http://www.ovaistariq.net/ Ovais Tariq

    Well for most of the use cases you wouldn’t need searching in the middle of the string.

    Suppose you have an application where people search for ‘Person Names’, then I don’t think ‘%o%’ would be appropriate, rather it would come down to something like following:

    select * from persons where first_name like ‘o%’ OR last_name like ‘o%’;

    and that would make more sense.

    And further more if you really are building a search application, then you can’t rely on LIKE, you have to then use Sphinx or something else similar to it,.

  • Pingback: Optimizing Navigation, Menus for Search Engine Optimization … | Hosting Review Directory

  • Pingback: Outsourcing And Search Engine Optimization Automation | Hosting Review Directory

  • http://twitter.com/FaisalFeroz Faisal Feroz

    Full Text Search is a good idea – for a small searching functionality that is usually required in applications – something like quick search on some keywords
    You can also take a look at lucene – as databases are not meant to be full blown search engines so lucene and similar search indexes would do the job much better.

  • http://www.ovaistariq.net/ Ovais Tariq

    Yeah exactly, databases are not meant to be search engines., Lucene or you might as well try sphinx,. have you tried sphinx,,.

  • http://twitter.com/FaisalFeroz Faisal Feroz

    nope – haven’t tried sphinx. Just looked at the site – it seems like a pretty new project to me. I have worked with Compass and Lucence – Compass integrates Lucene with Hibernate so you get a search box without making any major changes in the coding model.

  • http://www.ovaistariq.net/ Ovais Tariq

    Well Sphinx integrates pretty well with MySQL and is pretty easy to use, even the indexing setup is very easy,.

    Creating a data source and an index in the Sphinx configuration file is pretty easy:
    source products
    {
    type = mysql
    sql_host = localhost
    sql_user = shopping
    sql_pass = mysecretpassword
    sql_db = shopping
    sql_query = SELECT id, title, description,
    cat_id, price, UNIX_TIMESTAMP(added_date) AS added_ts
    FROM products
    sql_attr_uint = cat_id
    sql_attr_float = price
    sql_attr_timestamp = added_ts
    }

    index products
    {
    source = products
    path = /usr/local/sphinx/var/data/products
    docinfo = extern
    }

    Besides that Sphinx is also available as a pluggable storage engine “SphinxSE”, which can be plugged into MySQL and used to create tables like any other table engines, and thus allowing developers to access search results from within MySQL.
    You might want to have a look here http://sphinxsearch.com/docs/current.html#sphinxse

  • Pingback: Php45, MySql5.0, AJAX, CSS - Wordpress Video Tutorials

  • JT

    “Most of the time 
    like ‘%o%’ 
    can be replaced with 
    like ‘o%’.”This is absolutely, completely wrong.

    The ‘fast’ one has very limited use in all but the most specialized situation.

    Common use for a search is to find a word anywhere in a block of content, not just at the beginning.

  • http://www.facebook.com/profile.php?id=703352719 Adin Doank

    thanks

  • http://www.phimtructuyen24h.com/ Xem Phim Online

    Accidentally find this message on the SE . I’m trapped in this time . Thanks for
    share

  • Najrellim

    Another problem with your example is that it does not take into consideration that the user might be looking for a value in the middle of a string, and possibly the beginning letter of the value is unknown. How does one deal with that situation without leading with a wildcard ?
    BTW: what are “Sphinx or Full text ” ? If you are going to reference them, couldn’t you include a link to them or an explanation of what they are? (Aren’t you glad you took the time to write an interesting an valuable blog?)
    najrellim@gmail.com

  • Trevor
  • http://www.toantai.com/ Dao tao ke toan

    i see

  • http://best-waterproof-cameras.com/ best waterproof camera

    I was also suffering from similar circumstances.

  • http://green-coffee-bean-extract.us/ Green Coffee Bean Extract

    try by blog owners solve your household problems.wish success

  • Sergei Kharchenko

    Thanks for usefull information! But I’m using dbForge Studio for MySQL and there is no such problem there! :) http://www.devart.com/dbforge/mysql/studio/

  • thisistupid

    It is a good tip despite everyone’s complaining. Basically, before you use a wildcard at the beginning ask yourself if the implementation would be better another way and if there isn’t another way then by golly that’s what it’s there for.

    The article says “common mistake” as in lesser experienced programmers may sometimes see using the wildcard as an easy way out of a corner they have coded themselves into. I have seen people using like queries on a column in place of an index.

    Just ask yourself, should I be doing this. For some of you that answer may be yes such as the ONLY APPLICABLE example that pretty much every one gave; when you actually are searching within a column in a fully normalized database.

    … And why is this text area like typing in slow motion?