Optimizing MySQL: Hardware and the Mysqld Variables

Monday Dec 17th 2001 by Ian Gilfillan
Share:

You've fine-tuned your indexes, and have optimized those queries to the bone. But still your MySQL database is crawling. It's time to look at tweaking the mysqld variables, what hardware improvements you can make, and how you can compile MySQL to run just that little bit faster.

Part 1 - Compiling MySQL

Like a grand prix driver starting at the back of the track, if you don't do things well from the beginning, and prepare hard for the race, it doesn't matter how well you do once the race has started. You're at the back and are never going to win. So, with MySQL, it's extremely important to start off on the right foot, and that means compiling MySQL properly. The specific options you use depends on your setup of course, but here are some pointers to get you started.

  • Firstly, compile statically (--static). This requires more disk space, but runs faster (13% on Linux, according to MySQL themselves)
  • Optimize to the highest level possible (-O3 with gcc)
  • Compile without debug (--without-debug). This runs 20-35% faster.
  • Compile without frame pointers. (-fomit-frame-pointer). This is from 1-4% faster.
  • Compile with only the minimum character sets that you need (e.g. --with-extra- charsets=none)

Here's a sample configuration I've seen work well:

	CFLAGS="-mcpu=i686 -march=i686 -O3 -pipe -fomit-frame-pointer" \
./configure \
        --prefix=/usr/local/build/mysql \
        --with-mysqld-user=mysql \
        --without-debug \
        --with-client-ldflags=-all-static \
        --with-mysqld-ldflags=-all-static \
        --disable-shared \
        --localstatedir=/usr/local/build/mysql/var \
        --with-extra-charsets=none \
        --enable-assembler

Note that if you're using a Pentium processor, using pgcc seems to give better results than gcc. There have been problems with pgcc and AMD processors though. For a list of complete gcc options, run:

man gcc

Taking the time to compile as well as possible for your setup is worth it. After all, you'll probably only do this once, but it can hamstring your server for life if you get it wrong (it's not fun recompiling after your server is setup and running!)

So now you've compiled MySQL to be fast, and you know it's built to be fast. But, just as the Formula 1 car requires a big effort during the race to get to the finish — fuel, tires and driver all working smoothly — so should you be tweaking your database server for more speed. Many novice MySQL users are unaware that the server variables can be tweaked. What started out as a small database gets busier and busier, and they add more and more memory. But it's all in vain as the server doesn't even use this extra RAM. It's time to introduce the mysqld variables. You can view the variables by running:

mysqladmin variables

from the command line, or by running:

SHOW VARIABLES

from inside MySQL. You will see something like the following. In an untweaked system, most of the variables will probably be similar to those shown in this file here.

Together with looking at the variables, you'll need to look at the current system values. To do this, run:

mysqladmin extended-status

from the command line, or:

SHOW STATUS

from inside MySQL. The results will be something like the following:

+--------------------------+----------+
| Aborted_clients          | 142      |
| Aborted_connects         | 5        |
| Bytes_received           | 9005619  |
| Bytes_sent               | 15444786 |
| Connections              | 794      |
| Created_tmp_disk_tables  | 1        |
| Created_tmp_tables       | 716      |
| Created_tmp_files        | 0        |
| Delayed_insert_threads   | 0        |
| Delayed_writes           | 0        |
| Delayed_errors           | 0        |
| Flush_commands           | 1        |
| Handler_delete           | 27       |
| Handler_read_first       | 1534     |
| Handler_read_key         | 608840   |
| Handler_read_next        | 652228   |
| Handler_read_prev        | 164      |
| Handler_read_rnd         | 14143    |
| Handler_read_rnd_next    | 1133372  |
| Handler_update           | 90       |
| Handler_write            | 131624   |
| Key_blocks_used          | 6682     |
| Key_read_requests        | 2745899  |
| Key_reads                | 6026     |
| Key_write_requests       | 63925    |
| Key_writes               | 63790    |
| Max_used_connections     | 20       |
| Not_flushed_key_blocks   | 0        |
| Not_flushed_delayed_rows | 0        |
| Open_tables              | 64       |
| Open_files               | 128      |
| Open_streams             | 0        |
| Opened_tables            | 517      |
| Questions                | 118245   |
| Select_full_join         | 0        |
| Select_full_range_join   | 0        |
| Select_range             | 2300     |
| Select_range_check       | 0        |
| Select_scan              | 642      |
| Slave_running            | OFF      |
| Slave_open_temp_tables   | 0        |
| Slow_launch_threads      | 0        |
| Slow_queries             | 8        |
| Sort_merge_passes        | 0        |
| Sort_range               | 3582     |
| Sort_rows                | 16287    |
| Sort_scan                | 806      |
| Table_locks_immediate    | 82957    |
| Table_locks_waited       | 2        |
| Threads_cached           | 0        |
| Threads_created          | 793      |
| Threads_connected        | 1        |
| Threads_running          | 1        |
| Uptime                   | 1662790  |
+--------------------------+----------+

my.cnf

Fortunately, MySQL has made life a lot easier by providing default values that fit most systems. Simply using the correct default set of values can make a significant difference. You can set these variables in a file called:

my.cnf

which is usually placed in:

DATADIR/my.cnf

on Unix systems (DATADIR is the directory specified when the system was configured, not the datadir as listed in the variables above!), or C:\mysql\data\my.cnf on Windows systems. The file can be placed elsewhere if you have more than one MySQL server running, or want to specify different options down to a user level, but having one set of values per server is usually sufficient. The 4 sample files that MySQL supplies are:

my-huge.cnf, my-large.cnf, my-medium.cnf and my-small.cnf.

So, pump in my-huge.cnf, and let the good times roll, I hear you say. Well, not quite!

my-huge.cnf is recommended for systems that have at least 1GB memory, and run mainly MySQL (if your Web server and db server are on the same machine, don't even think about using this set!) my-large.cnf is for systems with slightly less memory (512MB), and also mainly running MySQL. my-medium.cnf is tweaked for a system where MySQL and a Web server are running together with around 128MB, or around 64MB with MySQL alone. Lastly, and probably leastly if you're reading this article, is my-small.cnf, running on a system with less than 64MB, where MySQL can't hog too many of the resources.

So, choose the right config file for your system, and you'll be well on your way. But you can get even more benefits from tweaking the variables some more, and for that, we'll need to take a closer look at some of the variables.

table_cache is a useful one. Each time MySQL accesses a table, it places it in the cache. If your system accesses many tables, it is faster to have these in the cache. A good way to see whether your system needs to increase this is to examine the value of open_tables at peak times (one of the extended status values, above). If you find it stays at the same value as your table_cache value, and then the number of opened_tables starts increasing, you should increase the table_cache if you have enough memory. Look at these three scenarios.

Scenario 1
table_cache - 512
open_tables - 98
opened_tables - 1513
uptime - 3046085

Here it looks like the table cache has been set too high. The server has been up for ages, the number of opened_tables is relatively low, and the open_tables (remember we're checking at a peak time) is nowhere near what it could be.

Scenario 2
table_cache - 64
open_tables - 64
opened_tables - 517
uptime - 1662790 (measure in seconds)

Here, although the open_tables is maxed out, the number of open_tables is very low even though the server has been up for ages. There is probably not much benefit in upping the table_cache (this example comes from a development server).

Scenario 3
table_cache - 64
open_tables - 64
opened_tables - 13918
uptime - 33754

This table_cache is set too low. The open_tables is running at maximum, and the number of open_tables is high, even though the uptime is less than a day. If you have the memory, up the table_cache. One thing to note is that even if you only have 64 tables in your database, you may still have more open tables. MySQL, being multi-threaded, may be running many queries on the table at one time, and each of these will open a table.

The key_buffer_size is another useful one to tweak. It affects the size of the index buffers, and making it as large as possible increases the speed of index handling, particularly reading. A good rule of thumb is to set it from between quarter to half of the available memory on your server (for systems dedicated to MySQL). A good way to check this is to look at the extended status variables, and compare key_read_requests to key_reads.

Scenario 1
key_buffer_size - 402649088 (384M)
Key_read_requests - 609601541
Key_reads - 67299
Scenario 2
key_buffer_size - 16777216 (16M)
Key_read_requests - 609601541
Key_reads - 46729832

The values in scenario 1 are looking healthy. The ratio of key_reads to key_read_requests should be as low as possible, no more than 1:100. In scenario 1 it is close to 1:10000. In scenario 2, it is shocking, about 1:15, and the key_buffer size should be increased to as much as the memory allows (you can see that RAM is the primary hardware upgrade you can do to improve your system).

One of the primary errors when the system gets busy is Too many connections. It's generally no problem to increase the number. You can see how close you're running to max by comparing max_connections with the extended-status variable threads_connected. If you regularly come close, the time for some improvements is drawing near.

One improvement you can make on a code level is to use persistent connections rather than ordinary connections (e.g. using pconnect() rather than connect() in PHP), In this case, you may have a scenario with something like:

max_connections - 200
threads_connected - 155
threads_running - 1

Don't be worried that MySQL is wasting resources here. The 154 connections that aren't doing anything are persistent, meaning that you may have Web server instances running that are connected, but only one thread is actually running. The time it takes to process is usually very small. Personally, I've found persistent connections to be better, although there is no consensus on whether connections should be persistent or not in MySQL. Even though the connection overhead in MySQL is much lower than for other databases (with something like Oracle you have to use persistent connections), I've always found it to be better to keep the connection open for reuse rather than close and open it all the time. But test this for yourself, as some reports have indicated differently. (Be sure to test properly under load — some 'proofs' out there claiming non- persistent connections are better are scarily inaccurate.)

Another delay applies to systems that use INSERT DELAYED statements. There is a parameter called delayed_queue_size, defaulted to 1000. This means that after 1000 INSERT DELAYED statements are queued, the client is no longer released immediately (remember that this is the difference between INSERT DELAYED and INSERT LOW PRIORITY, see the article on Optimizing MySQL Queries and Indexes if you're not sure). If you have a busy system, and find clients are often being delayed as they hit this limit, you may want to increase this figure.

back_log is another value that may be worth increasing. If your system gets a lot of connection requests in a very short space of time, MySQL will start refusing connections after the value of this variable has been stacked. Increasing this gives your server more scope to handle these, but it obviously doesn't help if you just get too many connections. This just helps cope with those short bursts!

In a system such as the above, it is often an indicator of something going wrong when the threads_connected value starts to climb, usually meaning queries are getting locked, and you're approaching gridlock. Use a

show processes

to spot the guilty queries. By watching the queries that appear in this list, and seeing which are locking or taking too long to resolve, you can often identify a badly optimized query, and make improvements to it before the situation gets out of hand.

Another good indicator of a problem is a slow_queries value that is continually increasing. In a good system, you should have as few of these as possible. perhaps you do the odd complex join for an archive search. But if your routine queries are slow, you have a problem.

And lastly, as far the mysqld variables covered in this article, if you do lots of sorting (for example, ORDER BY clauses in your SELECT's), you'll probably need to tweak the sort_buffer. Each thread that does a sort allocates a buffer of this size. You may want to go beyond 8MB if you're doing lots of this kind of query.

Disks

As always, there's more to say and more to find out — use the generous documentation supplied by MySQL to read more about all the variables, and search their archives for discussions. But, before I leave you in the pit lane, there's one more topic I want to cover.

Besides RAM and processor speed, disk seek time is probably the primary hardware bottleneck. Increasing the speed of your disks alone can have a major impact on your database speed, but there are a few more tricks you can play.

On an operating system level, you can use striping (placing blocks on multiple disks), mount with the noatime flag (which means that you don't bother to record the last access time for a file. On your busy database system, the last access time was probably the last second for the files that count!). But these and other operating system tweaks will have to wait for another day (I'm just mentioning these to give a direction for the bright sparks who want to go further than this article).

The best way to speed up your available disks is to balance the load across multiple devices. Besides using striping, symbolic links provide the easiest way to achieve this. Let's assume you have two databases both accessed equally. By default these will appear on one device, so your disk is constantly moving around seeking records in both databases. It would be far better to place each of these databases on a separate device, thus reducing the amount of data on each individual disk (a disk is now called into action half the time it was before, and can find the records more quickly, as it has to move across less data than previously). The way to do this is quite simple:

ln -s source_directory target_directory

The source directory would be on the new device, while the target_directory would be the name of the directory containing the database you want to move (in MySQL databases are stored as directories, with the tables as files therein). The link then still works smoothly from its current location, but the data is stored in a new place.

Good luck with your optimization efforts. These are exciting times to be using MySQL, with MySQL 4 offering huge improvements in functionality and ease of use.

resource list:

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved