MySQL's query cache is an impressive piece of engineering if sometimes misunderstood. Keeping it optimized and used efficiently can make a big difference in the overall throughput of your application, so it's worth taking a look under the hood, understanding it, and then keeping it tuned optimally.
How It Works
A lot of database engines can cache queries, usually this allows them to avoid a hard parse of the query each time. The subsequent calls to the queries, even using different values for bind variables, will use the same cached query. The data blocks are cached separately. MySQL does of course have various buffer caches for caching blocks, specifically when you're using InnoDB, however they are self-contained, and completely separate from the query cache.
MySQLs query cache works a bit differently. It caches the parsed query and the complete result set. If you have lots of small queries that return small result sets such as with web-based applications, it can be a real godsend to have the query cache working for you.
So the main things to keep in mind when tuning the query cache are (a) the total size of the query cache (b) the average size of query result sets, and (c) the size of queries above which you don't want to keep cached. Each of these three things can be measured, and once set - assuming your application footprint doesn't change dramatically - should remain fairly static.
Tuning the Query Cache
First, be sure the query cache is turned on. The query_cache_type variable should be set to ON, and the query_cache_size should be non-zero. Use this command:
mysql> show variables like 'query%'; +------------------------------+-----------+ | Variable_name | Value | +------------------------------+-----------+ | query_alloc_block_size | 8192 | | query_cache_limit | 1048576 | | query_cache_min_res_unit | 1024 | | query_cache_size | 256000000 | | query_cache_type | ON | | query_cache_wlock_invalidate | OFF | | query_prealloc_size | 8192 | +------------------------------+-----------+ 7 rows in set (0.00 sec)
Recall that we described above the maximum size query that the query cache will allow to be cached. The variable query_cache_limit controls this. If there are very large reports with large result sets, you probably don't want them to clog up your query cache. So set that accordingly.
The biggest problem most untuned query caches will have is either that they (a) aren't big enough, or (b) have fragmentation.
Setting the query cache size can be done by looking at the query cache hit ratio. Whenever MySQL performs a SELECT operation, it either increments com_select or the qcache_hits status variables. com_selects thus show us the cache misses.
mysql> show status like 'qcache_hits'; +---------------+------------+ | Variable_name | Value | +---------------+------------+ | Qcache_hits | 1446235295 | +---------------+------------+ 1 row in set (0.05 sec) mysql> show status like 'com_select'; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | Com_select | 1 | +---------------+-------+ 1 row in set (0.04 sec)
So get the hit ratio by this formula: qcache_hits / (qcache_hits + com_select) which gives us .9999 or 99.99%. So we know we're big enough.
Fragmentation is a little more subtle. Sometimes our query cache is quite large; however, we're not getting a high hit ratio. Check qcache_free_memory status variable, is it also high? If so, it could mean fragmentation. Imagine, each query that is cached must be a minimum of query_cache_min_res_unit. If this is larger than the query's result set, the rest is wasted memory that can't be used by other cached queries. So you want the query_cache_min_res_unit to be roughly the average query size, maybe a little larger than average to accommodate most of the queries, but not too large to be wasting memory for most query result sets.
Determine the average size of your applications cached queries by taking the query cache used memory and divide by the number of queries in the cache. The former is determined by query_cache_size system variable, minus the qcache_free_memory status variable. The latter can be found from the qcache_queries_in_cache variable.
Since all of these variables can be changed dynamically, I recommend you make changes on your running system incrementally, and then rerunning the hit ratios to see if you're making a difference. If you're unsure about what you're doing, you should definitely experiment on development first. You could do so by running a MySQL query benchmarking tool, which calls lots of different queries, and then make changes to the query cache as you monitor the query cache at the same time.
The query cache cannot be used in all cases. If there are queries larger than query_cache_limit, they will not be cached. In addition, if you're using MySQL 4.0 and earlier, the query cache would not work with transactions, that is with queries hitting InnoDB tables. In 4.1 and up, the InnoDB storage engine indicates to the MySQL kernel whether or not the query cache can be used. Essentially, if queries are updating those tables, they'll hold row-level locks and invalidate the query cache. Once a commit is performed, subsequently the query cache can be used again on those tables. This is all handled transparently though, so you don't really have to worry about it too much.
MySQL's query cache is a powerful technology, and one that when managed well can improve throughput of your database dramatically. Once your application is built, you can take a look at how it uses the database, and tune the query cache accordingly. Make it large enough, avoid fragmentation, and keep out the really really big queries, and you should be able to keep your cache hit ratios high, and enjoy great performance.
» See All Articles by Columnist Sean Hull