Benchmarking Performance of a Query - Part 1 Elapsed Time

Wednesday Jan 21st 2004 by Gregory A. Larsen
Share:

How many times have you had more than one way to do something and wanted to determine which method was faster or which one used less resources? In the first of this two-part series, Greg Larsen discusses some methods for using Query Analyzer to help make this decision.

How many times have you had more than one way to do something and wanted to determine which method was faster and/or which one used more resources? I am going to guess that most developers have run across this dilemma more than once. So how do you decide with method is faster--or which method requires more resources? In this two part series, I will discuss some of the methods I use in Query Analyzer (QA) to help me decide which coding method might be faster and use less resources.

So when we talk about how fast a chunk of code runs, what do we really mean? Are we talking about how much CPU the code consumes to accomplish its task, or are we talking about the number of I/O's the code takes to get all the data needed to return the result set? Alternatively, are we talking about the elapsed time it takes the code to run? In my mind, elapsed time is really the yardstick to use for how fast a piece of code runs. Elapsed time is the time customers have to wait while a piece of code runs. However, you still need to consider the cost of the code, and that is where CPU and I/O come into play. I consider CPU and I/O to be the resources consumed when running code. This two part series will look at techniques that can be used to identify the elapsed time, and resources used when you run a piece of code. This first article will discuss the methods I used to determine how fast code runs, or what I call the elapsed time of a query or set of T-SQL statements.

Elapsed Time

Elapsed time is the clock time it takes to run a piece of code. Basically, the elapsed time is the number of hours, minutes, second, and/or milliseconds it takes to run a chuck of code. Normally when we talk about how fast some code runs we are really talking about the overall elapsed time. Let's review a couple of different methods I use to review the elapsed time of a query.

Prior to performing an elapsed time benchmark, I first get my hands on the T-SQL code I want to benchmark. I then paste the code into QA on a stand-alone SQL Server machine, which is where I gather all my benchmark information. Using a stand-alone machine isolates my testing from performance issues that might arise in a multi-user environment. With QA, there are a couple of different ways to measure the elapsed time of a chunk of code.

The first method I use to get my initial benchmark for elapsed time is to use the Execution Time that is displayed in QA. The screen print below shows where you can find QA's Execution Time display. I have circled the display in red.

Above I executed a single SELECT statement 1,000,000 times. As you can see by using the Execution Time display, which is displayed in the lower right hand corner of the QA window, this particular set of T-SQL statements took 3 minutes, and 54 seconds to process.

Now, sometimes you need more than just the total execution time for a T-SQL batch, or you would like to know the milliseconds associated with elapsed time. Suppose you have a T-SQL batch that contains a number of different T-SQL statements and you would like to know how long each T-SQL statement runs, or how long a subset of the statements run. In this case, you would not be able to use the QA Execution Time display since it is for the entire batch. So in order to display execution time for each command or subset of commands, you would need to use a different method.

One of the methods I use to determine how long, a single T-SQL statement, or a series of statements in a batch takes, is to capture the start and end time for each command or series of statements in a batch for which I am interested in measuring elapsed time. This allows me to calculate the elapsed time for a single command, or a group of T-SQL statements. To demonstrate what I am talking about here is a chuck of code that displays the elapsed time for two different while loops.

declare @output varchar(100)
declare @i int
declare @st datetime

set @i = 0
set @st = getdate()
while @i < 2000
begin
  select @output=OrderId from Northwind.dbo.Orders
  set @i = @i + 1
end
print 'First batch completed in: ' + 
      rtrim(cast(datediff(ss,@st,getdate()) as char(10))) +
      ' seconds!'
set @i = 0
set @st = getdate()
while @i < 4000
begin
  select @output=OrderId from Northwind.dbo.[Order Details]
  set @i = @i + 1
end
print 'Second batch completed in: ' + 
      rtrim(cast(datediff(ss,@st,getdate()) as char(10))) +
      ' seconds!'

Below are the elapsed times calculated for each while loop when I ran this code against my server. These elapsed times were produced by the calculation contained in the print statements in the above code.

First batch completed in: 3 seconds!
Second batch completed in: 17 seconds!

By looking at the code you can see I am using a local variable @st to capture the start time for each while loop. At the completion of the while loop I calculate the time it took to process the entire while loop, and display the calculated elapsed time by using a PRINT statement. Using this method, allows me to determine the elapsed time for each while loop. As you can see the first while loop completed faster than the second one.

Another method to get the elapsed time would be to use the SQL Server "SET STATISTICS TIME ON" command. This command will toggle ON a SQL Server time gathering process for the current SQL Server connection. Once you turn ON the statistics gathering process SQL Server will produce time statistics for each subsequent command run for the existing connection. To turn off the statistics gathering process for a given connection you must run the "SET STATISTICS TIME OFF" command.

Turning on the time statistics will display the, compile, CPU and elapsed time for each command processed. I sometimes find this command useful, although this statistics gathering process sometimes produces lots of output and increases the execution time under a number of different situations. Because of these drawbacks, I find this method does not work well for queries that process many commands, or contain scalar functions. Here is an example of some code that uses the "SET STATISTICS TIME ..." command to toggle ON and OFF the time statistics gathering process.

set statistics time on
select 'How many Order records are there? ', count(*) 
   from Northwind.dbo.Orders

declare @x int
set @x = 0
while @x < 2
begin
  set @x = @x + 1
  select top 1 OrderID ProductID from Northwind.dbo.[Order Details]
end
set statistics time off

Here is the output produced from this code. Note that this code produces statistics output for each and every T-SQL statement. Therefore, this code works well for a single T-SQL statement, but produces multiple lines of output as it processes the WHILE loop, which causes me to normally not use this method for determining elapsed time.

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
                                               
---------------------------------- ----------- 
How many Order records are there?  830

(1 row(s) affected)


SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
ProductID   
----------- 
10285

(1 row(s) affected)


SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 5 ms.

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
ProductID   
----------- 
10285

(1 row(s) affected)


SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

At the beginning of my article, I talked about how you might go about determining how writing code one way is better than writing code another way. The way I do this is to use an elapsed time benchmark test of each different coding method. Let's go through an example of how I do this.

For my example, I will take two different methods of padding a character field that contains an integer value with zeroes. If you look out at my web site you will see I have two different methods for padding with zeroes. The code for the first method looks like this:

DECLARE @I INT
DECLARE @C CHAR(8)
SET @C = ''
SET @I = 123
SET @C = RIGHT(@I+1000000000000000000,DATALENGTH(@C))
PRINT @C

The second method has code that looks like this:

DECLARE @I INT
DECLARE @C CHAR(8)
SET @C = ''
SET @I = 123
SET @C= REPLICATE('0', DATALENGTH(@C) - 
DATALENGTH(@I)+ 1) + CAST(@I AS CHAR)
PRINT @C

If you run these two chunks of code, they run very fast, less than 1 ms. Since each query runs so quickly, how might you determine which one is more efficient? Well if you use the method I showed above, and execute each section of code 1,000,000 times you should get an indication which one performs faster (shorter elapsed time) than the other. Here is the code I wrote that runs each one of the padding with zeroes example a million times.

DECLARE @I INT
DECLARE @C CHAR(8)
DECLARE @ST DATETIME
DECLARE @X INT
SET @C = ''
SET @I = 123
SET @X = 0 
SET @ST = GETDATE()
WHILE @X < 1000000
BEGIN
  SET @C = RIGHT(@I+1000000000000000000,DATALENGTH(@C))
  SET @X = @X + 1
END
PRINT 'FIRST METHOD COMPLETED IN: ' + 
      RTRIM(CAST(DATEDIFF(SS,@ST,GETDATE()) AS CHAR(10))) +
      ' SECONDS!'
SET @C = ''
SET @I = 123
SET @X = 0 
SET @ST = GETDATE()
WHILE @X < 1000000
BEGIN
  SET @C= REPLICATE('0', DATALENGTH(@C) - 
          DATALENGTH(@I)+ 1) + CAST(@I AS CHAR)
  SET @X = @X + 1
END
PRINT 'SECOND METHOD COMPLETED IN: ' + 
      RTRIM(CAST(DATEDIFF(SS,@ST,GETDATE()) AS CHAR(10))) +
      ' SECONDS!'

When I run this code on my machine, I get the follow results:

FIRST METHOD COMPLETED IN: 19 SECONDS!
SECOND METHOD COMPLETED IN: 13 SECONDS!

This shows that the second method of padding with zeroes runs faster than the first example. When you have more than one way to do something that runs extremely fast and you want to know which method is faster, then this method is useful to benchmark your code to determine which one is faster.

Conclusion

This article on benchmarking performance only discussed different methods of how to measure elapsed time. I know this article did not go into detail on how you might be able to use these techniques to optimize slow running queries. However, I hope these examples gave you a number of different ways you can use Query Analyzer to determine how long a particular query runs. Once you know how long your query consistently runs, then you can start modifying the query to determine what causes the query to run slow. This will allow you to try different techniques to see if you can reduce the time it takes to process a slow running query. The next article in this series will discuss methods of benchmarking I/O and CPU resources.

» See All Articles by Columnist Gregory A. Larsen

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved