T-SQL Best Practices - Part 2

Friday Jul 31st 2009 by Gregory A. Larsen

Continuing our discussion covering T-SQL best practices, this article focuses on how to optimize writing a jack-of- all-trades stored procedure (SP).

In my article last month, I started a new series to discuss T-SQL best practices. This is the second article in that series. This series of best practices will help you write T-SQL code that performs better, and promote coding practices to help minimize the number of application failures. This month I’m going to focus on how to optimize writing a jack-of- all-trades stored procedure (SP). I’m going to discuss what kind of SP this is and how you can optimize this kind of SP by following a specific coding style.

Jack-Of-All-Trades Stored Procedure

Before I get into how to optimize a jack-of-all-trades SP, let’s expand a little bit on what kind of stored procedure I’m talking about. The jack-of-all-trades SP is a procedure that accepts a number of different parameters. This procedure accepts none, one, many, or all of the parameters associated with the jack-of-all-trades SP. Based the parameters passed, the jack-of-all-trade SP determines what record set is to be brought back. Here is a typical example of a jack-of-all-trades SP:

CREATE PROCEDURE JackOfAllTrades (@SalesOrderID int = NULL
          ,@SalesOrderDetailID int = NULL
          ,@CarrierTrackingNumber nvarchar(25) = NULL)
SELECT * FROM AdventureWorks.Sales.SalesOrderDetail
	  (SalesOrderID = @SalesOrderID or @SalesOrderID IS NULL)
  AND (SalesOrderDetailID = @SalesOrderDetailID  or 
  	@SalesOrderDetailID IS NULL)
  AND (CarrierTrackingNumber = @CarrierTrackingNumber or 
  	@CarrierTrackingNumber IS NULL) 

Here my JackOfAllTrades SP accepts three different parameters. All these parameters default to the NULL value. If a value is passed in then it is used as a parameter in the WHERE clause to constrain records returned. Each parameter in the SP is used to build a complex WHERE clause that contains the following logic in the WHERE clause for each parameter passed:

 (<TableColumn> = @PARM or @PARM IS NULL)

This logic says if @PARM is passed a non-null value then constrain the records returned to make sure that the <TableColumn> is equal to the value of @PARM. The second part of that condition is “@PARM IS NULL”. This part says if @PARM was not passed (equal to NULL) then don’t constrain your data based on this parameter.

Let’s look at a typical execution of my JackOfAllTrades SP. Suppose I execute my SP with the following command:

EXEC JackOfAllTrades @SalesOrderID = 43659

When this command executes, its execution plan looks like this:

Here you can see that for the single parameter passed, the optimizer decided it should use an “Index Scan” operation. My SP’s SELECT statement is constraining on a highly unique column @SalesOrderID that is part of the clustered index key. So, you would think that SQL Server would be smart enough to know that an “Index Seek” operation would be quicker for resolving my jack-of-all-trades SP by going through the clustered index. But as we can see from the above executing plan that SQL Server isn’t that smart. So why is that?

When the optimizer sees the “@PARM IS NULL” condition, it appears like a constant to the SQL Server optimizer. Therefore the optimizer assumes no index will be useful to resolve the “(<TableColumn> = @PARM1 or @PARM1 IS NULL)” condition due to the constant being in the WHERE condition. Therefore, SQL Server decides to use an Index Scan operation to resolve this type of condition. The more parameters you have in our jack-of-all-trades SP the worse the performance is due to the number of SCAN operations required for each passed parameter.

Optimizing the Jack-Of-All-Trades SP

You don’t have to settle for a poor performing jack-of-all-trades SP by continuing to code these types of queries like the one above. Let’s explore what my JackOfAllTrades SP does and re-write it so SQL Server’s query optimizer will create a more optimal execution plan.

As state above the real problem with the jack-of-all-trades SP is the fact that for every parameter you need to have an “OR” condition that checks to see if the parameter passed IS NULL. If we can eliminate the requirement for this constant expression than SQL Server will be able to create a plan that will use an INDEX SEEK operation. So how can we eliminate the @PARM IS NULL requirement? The answer is by using parameterized dynamic SQL.

Ok before you start shooting me flaming arrows over suggesting dynamic SQL, hear me out. I know you are already thinking that with dynamic SQL I might introduce SQL injection into my solution. However, as you will see I will not be increasing the risk of SQL injection, because I will just be building dynamic SQL code that passes my SP parameters to another system SP “sp_executesql” that will be using these parameters within the dynamic SQL code that I build.

The system SP “sp_executesql” allows you to develop a T-SQL statement that contains parameters, and allows you to define and pass the values for those parameters to the dynamic SQL by passing parameters to the“sp_executesql” SP when executing this SP. A T-SQL statement that is executed this way is commonly called parameterized SQL. There are a number of different reasons to use parameterized SQL, but the scope of this discussion is how to use parameterized SQL to improve the performance of a jack-of-all-trades SP. Here I have re-written the query above using a dynamically built parameterized T-SQL SELECT statement:

CREATE PROCEDURE JackOfAllTrades_V2 (@SalesOrderID int = NULL
                             ,@SalesOrderDetailID int = NULL
                             ,@CarrierTrackingNumber nvarchar(25) = NULL)
SET @CMD = 'SELECT * FROM AdventureWorks.Sales.SalesOrderDetail '
   SET @WHERE = @WHERE + 'AND SalesOrderID = @SalesOrderID '
IF @SalesOrderDetailID IS NOT NULL
   SET @WHERE = @WHERE + 'AND SalesOrderDetailID  = @SalesOrderDetailID '
IF @CarrierTrackingNumber IS NOT NULL
   SET @WHERE = @WHERE + 'AND CarrierTrackingNumber  = @CarrierTrackingNumber '
EXEC sp_executesql @CMD 
	               , N'@SalesOrderID int
	                  ,@SalesOrderDetailID int
	                  ,@CarrierTrackingNumber nvarchar(25)'
                   ,@SalesOrderID = @SalesOrderID
                   ,@SalesOrderDetailID = @SalesOrderDetailID
                   ,@CarrierTrackingNumber = @CarrierTrackingNumber

Let me walk you through this code so you understand the dynamic and parameterized parts of this code. This SP starts out be setting the variable @CMD to the basic SELECT statement I need to run without the WHERE statement. I then set the @WHERE variable to an empty string. I then go through a series of four different IF statements. The first three IF statement check to see if each of the parameters being passed is NOT NULL. If a particular parameter is NOT NULL then I append a condition to the @WHERE variable for that parameter. Since I already checked and determined that the parameter was NOT NULL, I don’t need to add the IS NULL condition to my WHERE clause as was done in the original code in my SP. Instead all I need to add it the <TableColumn> = @PARM condition to the @WHERE variable. The last IF statement determines if the @WHERE variable has been populated with at least one condition, and if it has then it appends the @WHERE variable to the @CMD variable.

Note the @WHERE variable is the dynamic part of this code. But I don’t place the actual text of the parameter into the @WHERE variable, but instead I only place a reference to my parameters in the WHERE condition. Therfore my final dynamic T-SQL statement basically only contains my original SELECT statement and the WHERE clause no longer needs the IS NULL condition to constraint the data.

Lastly I use the “sp_executesql” SP to execute my dynamic parameterized T-SQL statement. To do this I pass 5 parameters to this system SP. The first one of these parameters is my dynamic T-SQL @CMD variable. The second parmeter declares all the possible variables I might have in my parametrized query, and their data types. For the last three parameters that are passed to my SP I just pass them on as parameters to the system SP just like they were originally passed on in my first jack-of-all-trades SP. So as you can see here I haven’t made my dynamic SQL any more vulnerable to SQL injection then my original SP, because I didn’t use the actual values of my parameters to populate the @WHERE variable. I am just passing those parameters on as variables to my parameterized dynamic T-SQL via the “sp_executesql” system SP.

Now let’s run the code for my re-written jack-of-all- trade SP by running the following statement:

EXEC JackOfAllTrades_V2 @SalesOrderID = 43659

When I run this against the AdventureWorks database on my server I get the following execution plan:

By comparing this execution plan with the prior one, you can see it is simpler and it uses a “Cluster Index Seek” operation to resolve this SP. The SQL Server query engine is able to using this operation because my dynamic T-SQL no longer has the “@PARM IS NULL” statement in it. Since I have simplified my T-SQL statement by using dynamic SQL and eliminating the IS NULL constraint, SQL Server is now able to pick a more optimal plan for my V2 version of my jack-of-all-trades SP.

So how much of an improvement does this actually achieve? Remember I’m only looking for records from the SalesOrderDetail table that have a SalesOrderID equal to 43659. My original jack-of-all-trades SP did an INDEX SCAN operation to resolve the query. This means it had to read all the way through the index before it could complete my request to return records that contain a single SalesOrderID value. Whereas the V2 version of my jack-of-all-trades SP is able to use an INDEX SEEK operation against the clustered Index key on the SalesOrderDetail table to get directly to the specific records that contain the SalesOrderId equal to 43659 very quickly. The INDEX SEEK operation is much more optimized then using the INDEX SCAN operation, but how much?

Measuring the I/O savings by using the V2 version of the jack-of-all-trades SP can be accomplished a number ways. Let’s measure the I/O improvements by running the following T-SQL statements:

EXEC JackOfAllTrades @SalesOrderID = 43659
EXEC JackOfAllTrades_V2 @SalesOrderID = 43659

Here I have used the “SET STATISTICS IO ON” statement so the output of running my two SPs will show me the number of I/Os each command required to resolve it query. Below is the output I get from the above T-SQL statements:

(12 row(s) affected)
Table 'SalesOrderDetail'. Scan count 1, logical reads 264, physical reads 0, read-ahead reads 0, 
	lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

(12 row(s) affected)
Table 'SalesOrderDetail'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, 
	lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

By looking at this output, we can see that my original jack-of-all-trades SP performance is 1 scan and 264 logical reads. Whereas my V2 version of the jack-of-all-trades SP did the same number of index scans, but was able to resolve the query by performing only 3 logical reads. This is a savings of 261 I/O, which in itself might not seem significant. However, what if you where calling the SP over and over again in a loop of some kind. It doesn’t take many calls before you will start to notice a significant performance improvement using the V2 version over the original version of the jack-of-all-trades SP.

Improving I/O Counts Using Dynamic Parameterized SQL

Knowing why SQL Server looks at your code and picks a sub-optimal plan is what you need to take away from this article. Here the SQL Server optimized was taking the “@PARM IS NULL” logic in my code as a constant. Therefore, it determined it needed to perform an INDEX SCAN operation to resolve the first version of my jack-of-all-trades SP. As we all know, SCAN operations are typically slower than SEEK operations. By writing dynamic T- SQL code in my V2 version of my jack-of-all-trades SP I was able to eliminate having a constant expression in the WHERE clause of my T-SQL statement. This allowed the optimizer to pick a better route at resolving my T-SQL by using a CLUSTERED INDEX SEEK operation. If you have one of these jack-of-all-trades SP at your site try re-writing it using dynamic parameterize SQL and see what kind of performance gains you are able to obtain.

» See All Articles by Columnist Gregory A. Larsen

Mobile Site | Full Site