In Part I of my SQL Server Performance Tuning and Optimization article, I identified ways of reducing the process of finding the location of SQL performance issues to a science. Unfortunately, fixing performance issues, the focus of part II, can be something of a black art. While there are some quick guidelines I have to offer, sometimes the answers and directions consist of trial and error and judgement calls based on experience or understanding of the engine which are beyond the scope of this article.
The whole odyssey reminds me of physics problems where you have to solve several equations to get the information you need in order to answer your original question. Where most people go wrong is forgetting to answer the original question, thinking they have an answer already. Our question was "How can we make this process faster". We have now found where the problem is, but we don't yet have a solution.
The good news is there are often quick ways of improving performance. And there are techniques you can use that will have a high probability of being successful, though they may be more cumbersome.
Though my answers usually involve indexing, I have not been impressed by the Index Tuning Wizard.
Depending on how I see the problem I usually go through the following strategies of things to look for and try. In each case the same basic idea applies. Understand the SQL that is running slowly. Try to think of what is making it slow. Change the SQL or indexes or tables. Then rerun the statement to see if your change(s) worked. Sometimes you can spend hours trying to get from 2 seconds to 500 ms; other times 1 second will be reduced to 100 ms almost effortlessly. Be prepared for more of the former than the latter.
Errors, Omissions, Pleonasms, and Cursors
This is the first thing to look for, not because it is the most common, but because it should become apparent as you examine and come to understand the problem statements.
Oftentimes, a query is just wrong. Even wrong queries can return the right results, or appear to do so. A select without a where clause is a common mistake. For example, setting a variable from a table without a where clause:
SELECT @val = col from aTable
This will select all the rows from the table and the variable will end up with the value in the last row. On a fast machine with a medium dataset size, this could be missed--the statement will be a lot slower than it should be, but may not be slow enough to stop a developer or tester.
Sometimes an unnecessary redundancy (pleonasm) is used (as opposed to those necessary redundancies;):
Select aCol from aTable where anID in (select anID from aTable where anotherCol in (someVals))
Select aCol from aTable where anotherCol in (somVals)
Cursors are sometimes necessary, but many times they are used when when not really needed. Performing a quick operation several thousand times in a cursor can be orders of magnitude slower than performing the same operation for all the rows once.
Searching for coding errors and irregularities often takes a good knowledge of SQL because they are often caused by inexperience. It's often hardest to see our own mistakes, so this is an area where another team member can and should be asked for help.
SARG's and Indexing
Indexing is really the heart and soul of optimization. Most optimization issues will bring up the need for an index or two, even if some of the other techniques are employed. Indexes are usually the fastest way to access data, and seeing how indexes are used by the Optimizer is vital to fixing performance problems.
SARG stands for search arguments. These are restrictions in where and join clauses that can be applied to indexes to select rows more quickly:
Select au_lname from authors where au_lname like 'w%'
In this statement, the column
au_lname in the
where clause is the SARG. If
there were a million rows in the table, an index on
au_lname would be a huge
performance improvement. With the index, the above select could jump right to
the J's and pull out the W%'s right away. Without the index there would be a
table scan. Not only would every row have to be checked, but the amount of data
pages checked would be much larger (resulting in more time for searching)
because the table has more columns and data than the index. The table could
consist of 10,000 pages where the index might be only 1,000 pages and the number
of pages accessed to find the data in the index might be only a dozen. So this
index would reduce a 10,000 page search to 12, for an immediate performance gain
of 100 times.
Remember, SARG's must match the first column in an index. The index (
au_lname) would not help the above query and would not be used. It would be like
looking for first names in the phone book; it's just not ordered for that.
Sometimes there are conditional criteria which are understood, but not used in the query; maybe they are used later in code logic. A common example is when adding a date restriction to a query can improve performance. Oftentimes adding search arguments can speed up a query because it will use different or additional indexes.
Whenever considering indexes, consider a covered index for even better performance when retrieving large rowsets. A covered index is nothing more than an index that includes all columns referenced in a query. So a covered index is only a covered index in relation to particular queries. Repeating the above example:
Create index lname on authors(lname) Select au_lname from authors where au_lname like 'w%"
The index on
au_lname is a covering index for this query. What this means is
that all the information the query needs is in the index data pages and the
table data pages will not need to be accessed to execute the query. On the other
hand if the query were:
Select au_lname, state from authors where au_lname like 'w%'
and the index still only covered
au_lname, SQL Server would search through
the index, and then for each entry it found it would have to load the data page
and pull the single row off the 8 K page to get the state name. If 255 rows in
the index qualified (which would all be on one index page), there would be 255 page accesses of the table data.
Depending on what percentage of rows qualified, the Optimizer might decide to do a table scan instead of using the index. The table might only be 100 pages long, but using the index, in the above example, several of the pages would be hit multiple times, so a straight table scan would be better. Alternatively, if the table data were 1,000 pages (8 MB) the Optimizer would probably see that the index use would be faster. And if the table were 10,000 pages, you could rest assured that the index would be chosen.
So even if the index is used, one index page will yield 255 additional page
accesses to retrieve the data. In this case, a covering index would reduce the
page accesses back down to one or two. If the index was on (
au_lname, state) the
data pages wouldn't need to be accessed and the simple index lookup would give
all results from the query.
If access were not a big consideration and insert/update and storage space
issues were not problems, I would add an additional index (
state, au_lname) for the following select:
Select au_lname from authors where state = 'NY'
It has the same keys as the above index, but with state at the beginning, the
search will be much faster. Alternatively, the previous index (
is still a covering index, and even though it is not in a useful column order,
it still provides a performance improvement over a table scan. A full index scan
would require fewer pages than a table scan.