Minimizing SQL Server Stored Procedure Recompiles

Monday Jun 9th 2003 by Andrew Novick

Andy Novick discusses the reasons that SQL Server 2000 decides to recompile a stored procedure and demonstrates techniques that can be used to minimize recompilation.

This article discusses the reasons that SQL Server 2000 decides to recompile a stored procedure and demonstrates techniques that can be used to minimize recompilation. If your system makes extensive use of stored procedures, minimizing recompilation can give you a nice boost in performance.

Last month I wrote about ways to monitor stored procedure recompilation in SQL Server 2000. That article concentrated on figuring out if recompiles are a problem and if so, where they were occurring. You can read the article at: http://www.databasejournal.com/article.php/2203601. As promised, this article shows how to change your stored procedure to cut the problem down to size. But you should recognize that there's a good reason for some of the recompiles and you shouldn't try to eliminate all of them.

We'll start with a short review of what constitutes a recompile, why you might want to minimize them, and how we measure them.

Recompilation Review

The first time that SQL Server is asked to execute a stored procedure, user-defined function, or trigger it has to convert the definition of the object into an execution plan. Execution plans are SQL Server's version of compiled code and I'll refer to the process of creating the plan as compilation.

SQL Server has what's called a cost-based optimizer; that is, it uses information about the values in the data columns to produce the best plan. Figuring out the best plan can consume a moderate amount of CPU resources and requires that various database objects be locked while the recompile takes place. Frequent recompiles can be an important reason why a database is considered slow.

Plan creation, aka compilation, takes into account the estimated cost of executing each of the queries. The optimization algorithm takes into account:

  • The columns and base tables to be accessed
  • The joins, grouping, WHERE clause, and ORDER BY clause.
  • The available indexes on the tables
  • The distribution of data in columns that are referenced in the query

SQL Server gets information about the data distribution from the statistics that it keeps for just this purpose. As the distribution of data changes, the optimal plan may also change. For example, the optimal plan for very small tables is to do a complete table scan, no matter what the query requests, because the scan requires the minimum amount of I/O. As the table grows, the optimal approach will change to one that uses an index, depending, of course, on which columns are indexed. The point is that the optimal plan changes as the data changes.
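You can look at the distribution statistics that the optimizer consults with DBCC SHOW_STATISTICS. Here is a minimal sketch; the table and index names are hypothetical:

```sql
-- Show the statistics SQL Server keeps for one index: the header
-- (rows sampled, date of last update), the density of each column
-- prefix, and the histogram of key values.
DBCC SHOW_STATISTICS ('dbo.Orders', 'IX_Orders_OrderDate')

-- Statistics can also be refreshed by hand when you suspect they are stale:
UPDATE STATISTICS dbo.Orders
```

The histogram in the output is what lets the optimizer estimate how many rows a given WHERE clause will return.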

Execution plans aren't stored permanently. Instead, only the textual definition is kept permanently in syscomments, which holds the CREATE or ALTER statement that defined the object. When the code is compiled, a copy of the execution plan is saved in the master.dbo.syscacheobjects table, where it becomes available for reuse.
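You can peek at the cached plans directly. This sketch works on SQL Server 2000, where the cache is exposed through the syscacheobjects system table:

```sql
-- List cached stored procedure plans. usecounts shows how often each
-- plan has been reused since it entered the cache; a plan with a high
-- usecounts is being reused successfully rather than recompiled.
SELECT cacheobjtype, objtype, usecounts, sql
FROM   master.dbo.syscacheobjects
WHERE  objtype = 'Proc'
ORDER  BY usecounts DESC
```

A procedure that should be cached but keeps showing a low usecounts is a candidate for the recompile investigation described below.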

Aggregate numbers of recompiles can be monitored using the Windows Performance Monitor. Discovering which statements are causing recompiles is best done with the SQL Profiler. Last month's article and the videos that accompany it show how to use both of these tools.

SQL Server usually uses the plan in the cache. Sometimes, however, it can't and must recompile; at other times it decides that creating a new plan might produce a better one. The next section summarizes the reasons and discusses what to do about them.

Reasons That a Stored Procedure is Recompiled

SQL Server recompiles stored procedures for a variety of reasons. Last month's article went into detail about the reasons. In short the reasons amount to:

  1. You told it to. There are several ways that you can request a recompile such as using the WITH RECOMPILE clause on the procedure definition.

  2. New distribution statistics are generated or a sufficient number of row modifications occur

  3. One of these SET statements is executed:

    • SET ANSI_DEFAULTS
    • SET ANSI_NULLS
    • SET ANSI_PADDING
    • SET ANSI_WARNINGS
    • SET CONCAT_NULL_YIELDS_NULL

  4. Interleaving SQL DDL and SQL DML operations. For example, the creation of permanent or temporary tables, or of indexes on tables, forces a recompile at the next SQL DML statement, such as a SELECT.

  5. The plan is aged out of cache.

Let's take a look at each of these reasons for recompile and suggest what to do about it.

Reason 1. You asked SQL Server to Perform a Recompile

The WITH RECOMPILE clause requests that a stored procedure be recompiled every time it is used. The reason that you'd use WITH RECOMPILE is that the best plan for the statements in the stored procedure tends to vary depending on the procedure's parameters, and caching is causing SQL Server to sometimes use a suboptimal plan.

The best time to use WITH RECOMPILE is when the cost of the recompile is very small compared to the time it takes to run the stored procedure, for example, a complex report on a large amount of data. If that's the case, then by all means keep the WITH RECOMPILE clause. Otherwise, you are better off removing it.
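The clause can appear in two places. A sketch of both, using a hypothetical reporting procedure and Orders table:

```sql
-- Option 1: on the definition. Every execution gets a fresh plan;
-- no plan for this procedure is reused from cache.
CREATE PROCEDURE usp_BigReport @StartDate datetime, @EndDate datetime
WITH RECOMPILE
AS
    SELECT COUNT(*)
    FROM   Orders
    WHERE  OrderDate BETWEEN @StartDate AND @EndDate
GO

-- Option 2: on a single call. A fresh plan is built for this one
-- execution only; other callers are unaffected.
EXEC usp_BigReport '2003-01-01', '2003-03-31' WITH RECOMPILE
```

Option 2 is the finer-grained tool, and it is the basis of the invoker technique shown next.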

If there are specific values of the stored procedure's parameters that you know warrant a recompile, you can put the WITH RECOMPILE option on the EXEC statement instead. If you do that, you will want to run sp_recompile after the procedure completes. For example, let's assume that a stored procedure, usp_MyReport, takes a single parameter, @DepartmentNum, and that this is how the procedure is run almost all of the time. Sometimes, though, you want a report for the entire organization, not just a department, and usp_MyReport accepts NULL for @DepartmentNum. Now let's say that when @DepartmentNum is supplied, the optimal plan is a range scan on the non-clustered index that starts with the DepartmentNum column, but when a report is requested for the entire organization, the optimal plan is a table scan. Instead of invoking usp_MyReport directly, replace references to it with references to usp_MyReportInvoker, shown here:

CREATE PROCEDURE usp_MyReportInvoker
    @DepartmentNum INT
AS
    IF @DepartmentNum IS NOT NULL
       EXEC usp_MyReport @DepartmentNum
    ELSE BEGIN
       EXEC usp_MyReport NULL WITH RECOMPILE
       EXEC sp_recompile usp_MyReport
    END

This procedure recompiles usp_MyReport only when you know that changing the plan is going to produce faster results. By doing this, an optimal plan is used every time the report is run. Most of the time, when department number is supplied, SQL Server can reuse the cached plan.

When I was a child, every once in a while as my mother and father would leave the house for an evening out, my mother would turn to my brother, two sisters and me and say, "When I come back, I don't want to find any of you with beans up your nose!" Not that any of us would have thought of putting beans up our nose had she not mentioned it in the first place. It was a reminder not to do things we knew we really shouldn't be doing. In the category of "Don't put beans up your nose," do not use DBCC FREEPROCCACHE, sp_recompile, or WITH RECOMPILE unless you have a real good reason. If you have a good enough reason, you will live with the stored procedure recompiles.

Reason 2: New Distribution Statistics or a Sufficient Number of Row Modifications

SQL Server does recompiles after statistics are generated or after a large enough number of row modifications. It does this because this approach tends to produce better execution plans. It's a good thing. However, there are circumstances where you know that recompiles are not changing the plan because the data is not really changing that much. In these cases you might want to turn off the automatic generation of statistics and update them yourself on a schedule that you control.

Exactly when statistics get generated and the instructions to turn automatic generation on or off are well documented in Microsoft Knowledge Base article 195565. I refer you there if you're interested. You can read it at: http://support.microsoft.com/default.aspx?scid=kb;EN-US;195565
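A minimal sketch of taking over statistics maintenance for a single table; the table name is hypothetical, and the UPDATE STATISTICS statement would typically be run from a scheduled SQL Agent job:

```sql
-- Stop SQL Server from automatically updating statistics on this
-- table (and thereby triggering recompiles of procedures that use it).
EXEC sp_autostats 'dbo.OrderDetail', 'OFF'

-- Then refresh the statistics yourself, e.g. from a nightly job,
-- at a time when the recompiles that follow are cheap.
UPDATE STATISTICS dbo.OrderDetail WITH FULLSCAN
```

Be careful with this technique: if the data really is changing, stale statistics will cost you more in bad plans than the recompiles ever did.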

Reason 3: SET Statements That Change Session Options

Changing the value of any of five session options with the SET statement causes a recompile. The options are ANSI_DEFAULTS, ANSI_NULLS, ANSI_PADDING, ANSI_WARNINGS, and CONCAT_NULL_YIELDS_NULL.

To illustrate the causes of stored procedure recompilation, including SET statements, let's bring back the stored procedure usp_Recompiles_SeveralTimes from last month's article. It was specifically constructed to cause recompilation at least four times each call. Here's the script to create it:

SET QUOTED_IDENTIFIER ON
SET ANSI_NULLS ON 
GO

CREATE PROCEDURE dbo.usp_Recompiles_SeveralTimes AS 

SET NOCOUNT ON -- Doesn't cause a recompile

-- Cause a recompile due to creation of the temporary table   
CREATE TABLE #Temp1 ([id] int identity (1,1)
                   , [name] varchar(64))
INSERT INTO #Temp1 ([name]) VALUES ('Gregor Samsa')

-- Cause a recompile because of a change in a session option
SET ANSI_NULLS OFF
INSERT INTO #Temp1 ([name]) VALUES ('Padme Amidala')

-- Cause a recompile because of a change in a session option
SET ANSI_WARNINGS OFF
INSERT INTO #Temp1 ([name]) VALUES ('Casper Guttman')

-- Cause a recompile because of a new index
CREATE CLUSTERED INDEX IDX_Name ON #Temp1 ([name])
SELECT * FROM #Temp1 ORDER BY [name]

DROP TABLE #Temp1 -- Doesn't Cause a recompile
GO

The best approach to minimize recompiles caused by changing these options is to not change them. Do this by establishing a consistent set of options and always using them. Unfortunately, many code changes may be required by this approach. For example, if you were to always have CONCAT_NULL_YIELDS_NULL set to ON, you'd have to do additional checking with the ISNULL or COALESCE function on the occasions when you didn't want a NULL result.

If you establish consistent session options at the start of database development, you may be able to avoid SET statements entirely. Trying to change the code to use a consistent set of options after it has been written is very difficult. In those cases, the best approach is to minimize the recompiles by grouping SET statements together. That works because the recompile does not happen until the next SQL DML statement. So if our procedure really required that ANSI_NULLS and ANSI_WARNINGS be OFF, it would be best to move those SET statements to the start of the procedure so that there would be only one recompile. The first few lines of our sample SP would be:

SET NOCOUNT ON -- Doesn't cause a recompile
-- Group the session option changes together
SET ANSI_NULLS OFF
SET ANSI_WARNINGS OFF
-- The option changes and the table creation now cause only one
-- recompile, at the first INSERT
CREATE TABLE #Temp1 ([id] int identity (1,1)
                   , [name] varchar(64))
INSERT INTO #Temp1 ([name]) VALUES ('Gregor Samsa')

One recompile is better than two.

Reason 4: Interleaving DDL and DML Operations.

Once a SQL DDL statement, such as CREATE TABLE, is executed the next SQL DML statement, such as SELECT, that is executed causes the stored procedure to be recompiled before execution is resumed. Stored procedures often contain DDL statements for managing temporary tables. You can reduce the frequency of this type of recompilation through several strategies:

  • Group SQL DDL so that only one recompile is produced
  • Replace temporary tables with TABLE variables
  • Replace temporary tables with permanent tables
  • Minimize references to temporary tables created outside the procedure
  • Any references to a temporary table should precede all DROP TABLE statements
  • Reuse temporary tables instead of dropping and recreating them inside the same procedure.
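The last strategy can be sketched briefly. Dropping and recreating a temporary table inside a procedure is two DDL statements, each forcing a recompile at the next DML; emptying the table and reusing it avoids the second round. The table and values here are hypothetical:

```sql
-- One CREATE TABLE, used for two passes of work.
CREATE TABLE #Work ([id] int identity(1,1), [name] varchar(64))

INSERT INTO #Work ([name]) VALUES ('first pass')
-- ... first pass processing against #Work ...

-- Empty the table with DML instead of DROP/CREATE, so no new
-- DDL-triggered recompile occurs before the second pass.
DELETE FROM #Work
INSERT INTO #Work ([name]) VALUES ('second pass')
-- ... second pass processing against #Work ...

DROP TABLE #Work   -- drop once, at the end, after all references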

Most of these strategies are pretty obvious, but one, using TABLE variables, is new to SQL Server 2000 and I'll concentrate on it. TABLE variables are created with a DECLARE statement instead of a CREATE TABLE statement. Here's a sample:

DECLARE @myTable TABLE (id int identity(1,1) PRIMARY KEY
                       , ColumnA varchar(20)
                       , ColumnB varchar(20)
                       )

Once declared, a TABLE variable can be used like other tables. You can execute INSERTs, UPDATEs, DELETEs, and SELECTs against it. The major limitation of TABLE variables is that their scope is limited to the procedure that creates them. They cannot be referenced by, or passed to, any stored procedure, user-defined function, or trigger that is invoked by the procedure that DECLAREs them.

Another limitation of TABLE variables is that you cannot create indexes on them. The only indexes they have are the ones that SQL Server creates implicitly when the variable has a PRIMARY KEY or a UNIQUE constraint.

The limitations on TABLE variables may make it impossible to use them in some procedures. When you can use them, there's a bonus: SQL Server actually creates a special kind of table for them in tempdb, but it does not put any information about them into tempdb's system tables, and they end up consuming fewer resources than a temp table would.

In the sample stored procedure replacing the temporary table with a TABLE variable allows us to eliminate the CREATE TABLE and the CREATE INDEX statements. These changes cut the number of recompiles down to one. Here is the procedure with the SET statements grouped at the beginning of the procedure and #Temp1 converted to a TABLE variable:

CREATE PROCEDURE dbo.usp_Recompiles_Just_Once AS 

DECLARE @Temp1 TABLE ([id] int identity (1,1)
                   , [name] varchar(64))

SET NOCOUNT ON -- Doesn't cause a recompile
-- Cause a recompile because of a change in a session option
SET ANSI_NULLS OFF
SET ANSI_WARNINGS OFF

INSERT INTO @Temp1 ([name]) VALUES ('Gregor Samsa')
INSERT INTO @Temp1 ([name]) VALUES ('Padme Amidala')
INSERT INTO @Temp1 ([name]) VALUES ('Casper Guttman')

SELECT * FROM @Temp1 ORDER BY [name]

GO

There is still one recompile every time the procedure is executed but one is better than four.

Reason 5: The Plan is Aged Out of Cache

Plans are aged out of cache when they are either not used or when SQL Server is low on memory. If you have supplied your server with adequate memory, this should not happen until long after the stored procedure was last used. If it does, you should examine the overall memory situation instead of focusing on the recompiles.

Conclusion

Although stored procedure recompiles can be a performance problem, this article has shown you several ways to minimize their frequency. As with all good things, there is a point of diminishing returns. This is particularly true of recompiles caused by data modifications: SQL Server 2000 deliberately replaces execution plans fairly frequently, because that is the only way the cost-based optimizer can keep plans matched to the data and have a positive impact on performance.

» See All Articles by Columnist Andrew Novick
