Indexed Views Basics in SQL Server 2000

Thursday Mar 20th 2003 by Andrew Novick
Share:

Indexes have been a method to improve the performance of SQL queries since the creation of relational databases in the 1970's. Up until recently, indexes were always created on a single table and they've undergone only limited changes over the years. That was true until a few years ago when Oracle introduced "materialized views." SQL Server 2000 matched that improvement with indexed views.

Indexes have been a method to improve the performance of SQL queries since the creation of relational databases in the 1970's. You are probably familiar with the ways that indexes can speed query operations at the expense of additional overhead during data inserts, updates, and deletes. Up until recently, indexes were always created on a single table and they've undergone only limited changes over the years. That was true until a few years ago when Oracle introduced "materialized views." These views enabled Oracle to claim a big boost in performance over other relational database products. That is, if you used just the right benchmark tests. SQL Server 2000 matched that improvement with indexed views.

The usability of indexed views is limited. While all editions of SQL Server can create and consume an indexed view, it is only the Enterprise Edition and Developer Edition that will make use of them without the addition of the query hint NOEXPAND. As you'll see later, that condition limits their flexibility a great deal. When a database might have to run on many different SQL Servers that may not all be Enterprise Edition, I question whether you really want to use indexed views. Near the end of the article, I'll cover what you have to do to make use of them on non-Enterprise Editions of SQL Server. Having made you aware of this limitation let me also let you know that indexed views can have a major positive impact on performance and can really be worth the trouble.

What is an Indexed View?

An indexed view is a view that has a unique clustered index created on it. Normally views do not exist on disk as rows. That changes for indexed views, which exist in the database as rows that realize the view. There can also be non-clustered indexes on the view so long as it has the unique clustered index.

Since the indexed view exists on disk there are two types of overhead added to the database:

  • The disk space taken up by the view
  • The cost of maintaining the view on disk as the base tables are modified.

Both of these costs can be substantial and limit the usefulness of indexed views to situations where there is a high ratio of data retrieval operations to data modifications. They are best for decision support and analytic applications where the frequency of data modification is low. Adding indexed views to an OLTP type application may slow it down more than it helps.

Why use them?

When the circumstances are right, there is one powerful reason for using them: performance of data retrieval operations. Because the view exists on disk, there can be substantial savings in response time to queries that involve the view.

In addition, the view can be used by the optimizer in response to some or all of a query that does not directly involve the view that is indexed. The mere existence if the indexed view can change the performance behavior of queries on other views and on queries that involve similar joins or grouping operations.

The performance gain during the query comes from one of two factors:

  • Storing a join.
  • Performing grouping and aggregations in advance of running the query.

It is even possible to combine aggregation and join operations in a single indexed view.

This article uses an example from the Northwind database: the [Order Details Extended] view. Here is its definition:

create view "Order Details Extended" AS
SELECT OD.OrderID, OD.ProductID, Products.ProductName, 
	OD.UnitPrice, OD.Quantity, OD.Discount, 
	(CONVERT(money,(OD.UnitPrice*Quantity*(1-Discount)/100))*100)
                        AS ExtendedPrice
FROM Products INNER JOIN "Order Details" OD
      ON Products.ProductID = "Order Details".ProductID

I have modified the syntax of the CREATE VIEW statement slightly from the definition that you will find in the Northwind database by adding an alias (OD) for the [Order Details] table and cleaning up the formatting. The definition of the view is fundamentally unchanged.

Look at a simple query on [Order Details Extended]. I have selected an OrderID for a moderate size Order to narrow the resultset to only six rows. The DBCC DROPCLEANBUFFERS command is used so that the physical reads aren't short-circuited by the data cache. Set STATISTICS IO ON is used so we can see the quantity of IO that occurs. Here is the query and its results:

DBCC DROPCLEANBUFFERS
GO
SET STATISTICS IO ON
SELECT ProductName, UnitPrice as UnitPri, Quantity as Qty
     , Discount as Disc, ExtendedPrice 
    FROM [Order Details Extended]
    WHERE OrderID = 10657
GO
(Results)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
ProductName                     UnitPri  Qty Disc ExtendedPrice  
------------------------------- -------- --- ---- ---------------
Genen Shouyu                     15.5000  50  0.0        775.0000
Jack's New England Clam Chowder   9.6500  24  0.0        231.6000
Spegesild                        12.0000  45  0.0        540.0000
Zaanse koeken                     9.5000  10  0.0         95.0000
Gnocchi di nonna Alice           38.0000  45  0.0       1710.0000
Camembert Pierrot                34.0000  30  0.0       1020.0000
(6 row(s) affected)

Table 'Products'. Scan count 6, logical reads 12, physical reads 2, read-ahead reads 0.
Table 'Order Details'. Scan count 1, logical reads 2, physical reads 2, read-ahead reads 0.

There are 12 logical reads on Products, 2 on [Order Details], and a total of four physical reads. It's the physical reads that count the most because they use the resource, the disks, that has the highest effect on response time. However, logical reads also have their costs.

Figure 1 has the query plan that is generated before any indexes are added to views. Near the end of the article, we will see how the plan for this query is changed by the existence of an indexed view, even though the index is on another view.

Figure 1 Query Plan for SELECT on [Order Details Extended]

As you can see, the query plan confirms the information from the statistics, all the work of the query is reading the [Order Details] and Products tables using their primary keys and joining the result. For such a simple query with a small result set the query is handled very efficiently by SQL Server but the physical IO of this query can still be cut in half.

Any attempt to add an index on [Order Details Extended] will fail. There are many restrictions on the views that can be indexed, on the columns that an indexed view can contain, and on the columns in an indexed view that may be indexed. The next section shows what the restrictions are and how a view can be constructed to live within them.

Restrictions on Indexing Views

There are restrictions on which base tables may be included in the view, which views are eligible for indexing, on the columns that can be in the view, and on which columns in those the view can be part of the index. It is also important to manage the database options that are in effect any time the view or its base tables are referenced by the database.

Indexed views require that a consistent set of session options exists during three time periods:

  • When the indexed view is created
  • When any INSERT, UPDATE, or DELETE operation is performed on a base table of the view.
  • When the indexed view is used by the optimizer to produce a query plan

Since you never totally know when the indexed view might be affected, the options should be set all the time. The seven options are listed in Table 1 with the values that must be set and the default value for the option.

Table 1

SESSION SET Option

Must Be

Default Value

ANSI_NULLS

ON

OFF

ANSI_PADDING

ON

ON

ANSI_WARNING

ON

OFF

ARITHABORT

ON

OFF

CONCAT_NULL_YEILDS_NULL

ON

OFF

NUMERIC_ROUNDABORT

OFF

OFF

QUOTED_IDENTIFIER

ON

OFF

Database access methods such as OLE DB, ODBC, and DB-Library set these options, but not to the exact values needed by indexed views. They should probably be set at the server level with sp_configure and careful attention should be paid to their values by setting them when any database connection is created.

The next sections review restrictions on the tables, views, and columns that you might try to include in a indexed view. Each section has a discussion of what the restrictions are, how to detect them, and live within them.

Restrictions on Base Tables

For starters, base tables of the view must be in the same database as the view. In addition, if there are computed fields in the base table that are reference by the view, a special condition applies: the value of ANSI_NULLS and QUOTED_IDENTIFIER must have the correct option when the base table is created. The reason they must be set when the base table is created, as opposed to at runtime, is that they are both parse time options. That is, the table retains the value that was set when the table was created. If the wrong options are in effect when the base tables referenced by an indexed view are created, the view cannot be created with SCHEMABINDING, which is a requirement for the view. Ordinary columns that are not computed are not affected by this requirement.

Restrictions on the View to be Indexed

When the views to be indexed is created it:

  • Must be created the WITH SCHEMABINDING view option
  • May only refer to base tables in the same database.
  • If there is a GROUP BY clause, the view may not have a HAVING, CUBE, or ROLLUP.
  • May not have an OUTER JOIN clause.
  • May not have a UNION.
  • May not have DISTINCT or TOP clauses
  • May not have full-text predicates such as CONATINSTABLE
  • May not have a ROWSET function such as OPENROWSET
  • May not use derived tables or subqueries.
  • Must be created with ANSI_NULLS ON and QUOTED_IDENTIFIER ON

The SCHEMABINDING requirement adds additional requirements, such as the use of a two-part name for all base tables.

The Northwind.dbo.[Order Details Extended] view was shown above. It is not schema bound. This script creates a revised view that is schema bound:

CREATE VIEW OrderDetailsXSB   WITH SCHEMABINDING 
AS
SELECT OD.OrderID, OD.ProductID, P.ProductName , OD.UnitPrice
     , OD.Quantity, OD.Discount
FROM dbo.Products P
    INNER JOIN dbo.[Order Details] OD
         ON P.ProductID = OD.ProductID

There are three important changes made to create the new view:

  • The WITH SCHEMABINDING option was added to the header.
  • Two-part names are used for both base tables.
  • ExtendedPrice was removed from the SELECT list. It is an expression and to be indexed, the view may not have any expressions.

The last change is pretty major but since the columns used to compute the ExtendedPrice column are all in the view, any statement selecting from the view could compute the extended price in an expression.

What is accomplished by this view is that the [Order Details] table and the Product table are pre-joined and the ProductName field is stored in the view with information from the [Order Details] table. Once the view is indexed and a statement uses the view, the join is not necessary; SQL Server can go directly to the rows stored in the view's clustered index.

The OBJECTPROPERTY built-in function has a property, IsIndexable, which can be used to find out if a view satisfies all the requirements for indexing. This query shows us the index eligibility of both the [Order Details] base table and the two views built on it:

-- Check index eligibility for 'Order Details%' tables and views
SELECT TABLE_TYPE, TABLE_NAME
     , OBJECTPROPERTY (OBJECT_ID(TABLE_NAME), 'IsIndexable')
                          AS IsIndexable
   FROM INFORMATION_SCHEMA.TABLES
     WHERE TABLE_NAME LIKE ('Order Details%')
         OR TABLE_NAME = 'OrderDetailsXSB'
GO
(Results)
TABLE_TYPE TABLE_NAME                  IsIndexable 
---------- --------------------------- ----------- 
BASE TABLE Order Details                         1 
VIEW       Order Details Extended                0 
VIEW       OrderDetailsXSB                       1

From this query, we know that OrderDetailsXSB satisfies the conditions for indexability. That does not mean that every column in the view can be indexed. There are additional restrictions on the columns that may effect [Order Details SB]

Unfortunately, the IsIndexable property is imperfect. A view will still have a true value for IsIndexable even if it has expressions in columns. The problem only shows up when you attempt to create the index.

The limitation on expressions is hardly the only limitation on the columns in a view. The next section looks at restrictions on columns and shows you how to locate the columns that can be part of an index.

Restrictions on Columns in the View

The columns that can be in the SELECT list or WHERE clause of an indexed view are restricted. There are two possibilities based on the query: GROUP BY or no GROUP BY. If there is no group by, the columns may not contain expressions. However, they can contain computed fields in base tables and user-defined functions. On the other hand, if there is a GROUP BY clause, the following restrictions apply:

  • These aggregate functions may not be used: COUNT(*), STDEV, VARIANCE, AVG, MIN, MAX
  • No SUM aggregate function on a NULLable expression.
  • The view must contain a COUNT_BIG(*) expression

If these conditions are satisfied, it is possible to create an index on the view. However, not all columns in the view are eligible for indexing.

Selecting Columns for the Index

There are limitations on which columns from the view can be in the index. The column expressions must be deterministic and may not have floating-point data. The view can have floating point columns (real or float), it is just that no floating point columns can be in the index.

There are three properties returned by the SQL Server built-in function COLUMNPROPERTIES that tell you if columns in a view qualifies to be included in an index. Each of the columns returns a BIT value that is 1 when the column has the property. They are:

  • IsIndexable - Gives the answer: can the column be indexed.
  • IsDeterminstic - Non-deterministic columns cannot be indexed.
  • IsPrecise - 1 when the column is not a floating point type.

In addition, the IS_Nullable column may come in handy when figuring out why a column can't be indexed and there is a SUM aggregation.

To give you a quick rundown about which columns of a view can be included in an index I have put together the user-defined function udf_View_ColumnIndexableTAB. Here is the CREATE FUNCTION script for it.

SET QUOTED_IDENTIFERS ON
SET ANSI_NULLS ON
GO

CREATE  FUNCTION udf_View_ColumnIndexableTAB (

    @view_name_pattern sysname = NULL -- View name or pattern to
                   -- search for. NULL for all
  , @col_name_pattern sysname = NULL -- Column name or pattern to
                   -- search for. NULL for all
) RETURNS TABLE
/*
* Returns a table of the columns in views whose name match the 
* patterns in the parameters and the status of the columns as
* indexable, deterministic, and precise.
*
* Example:
SELECT * FROM udf_View_ColumnIndexableTAB(NULL, NULL)
*
* ) Copyright 2003 Andrew Novick http://www.NovickSoftware.com
* You may use this function in any of your SQL Server databases
* including databases that you sell, so long as they contain 
* other unrelated database objects. You may not publish this 
* UDF either in print or electronically.
* Published in the UDF of the Week Newsletter Vol 1 Number 19
http://www.NovickSoftware.com/UDFofWeek/UDFofWeek.htm
****************************************************************/
AS RETURN 
SELECT TOP 100 PERCENT WITH TIES
       C.TABLE_SCHEMA AS [Owner]
     , C.TABLE_NAME AS [VIEW_NAME]
     , COLUMN_NAME 
     , ORDINAL_POSITION 
     , dbo.udf_SQL_DataTypeString (C.DATA_TYPE
                                 , C.CHARACTER_MAXIMUM_LENGTH
                                 , C.NUMERIC_PRECISION
                                 , C.NUMERIC_SCALE) AS DATA_TYPE
     , CASE WHEN 1=COLUMNPROPERTY(OBJECT_ID(C.TABLE_NAME)
                          , COLUMN_NAME, 'IsIndexable') 
            THEN 'YES' ELSE 'NO' END as IsIndexable
     , CASE WHEN 1=COLUMNPROPERTY(OBJECT_ID(C.TABLE_NAME)
                     , COLUMN_NAME, 'IsDeterministic') 
            THEN 'YES' ELSE 'NO' END as IsDeterministic
     , CASE WHEN 1=COLUMNPROPERTY(OBJECT_ID(C.TABLE_NAME)
                          , COLUMN_NAME, 'IsPrecise') 
            THEN 'YES' ELSE 'NO' END as IsPrecise
     , IS_NULLABLE
    FROM INFORMATION_SCHEMA.[COLUMNS] C
       INNER JOIN INFORMATION_SCHEMA.TABLES T
           ON C.TABLE_SCHEMA = T.TABLE_SCHEMA
            AND C.TABLE_NAME = T.TABLE_NAME
    WHERE T.TABLE_TYPE='VIEW'
      AND (@view_name_pattern is NULL -- all tables
           OR C.Table_Name LIKE @view_name_pattern)
      AND (@col_name_pattern is NULL -- all columns
           OR [Column_Name] Like @col_name_pattern)
    ORDER BY C.TABLE_NAME
           , C.ORDINAL_POSITION

If you want to compile it, you will first need the definition of udf_SQL_DataTypeString, which was recently published in my T-SQL UDF of the Week newsletter. You can get it from the UDF of the Week Archives page.

The function's two parameters are patterns that work with the LIKE operator. The first parameter is on the view name. The second parameter is on the column name. LIKE operator, patterns are used instead of the names to give more flexibility to the UDF. The UDF can be queried to check the index eligibility of columns in our indexable view, OrderDetailsXSB:

-- Are columns in "OrderDetailsXSB" eligible for indexing?
SELECT COLUMN_NAME, ORDINAL_POSITION, DATA_TYPE 
     , ISIndexable, IsDeterministic, IsPrecise
    FROM udf_View_ColumnIndexableTAB('OrderDetailsXSB', NULL)
GO
(Results)
COLUMN_NAME    Pos Type         ISIndexable IsDet IsPrecise Is_Nullable
-------------- --- ------------ ----------- ----- --------- -----------
OrderID          1 int          YES         YES   YES       No         
ProductID        2 int          YES         YES   YES       No         
ProductName      3 nvarchar(40) YES         YES   YES       No         
UnitPrice        4 money        YES         YES   YES       No         
Quantity         5 smallint     YES         YES   YES       No         
Discount         6 real         NO          YES   NO        No         

Most of the view's columns are indexable. The column that is not indexable is Discount. It has a floating-point data type, real. Had ExtendedPrice remained in the view it also would not have been indexable because it is an expression.

Now that you are aware of all the restrictions, it is time to create an indexed view. We will use the OrderDetailsXSB view because it satisfies all the restrictions.

Creating the Index

Indexes on views are created in the same way that indexes on tables are with the CREATE INDEX statement. Here is one to create a unique clustered index on OrderDetailsXSB that makes it an indexed view:

SET QUOTED_IDENTIFIER ON
SET ANSI_NULLS ON
GO

CREATE UNIQUE CLUSTERED INDEX [IDX_Order_Details_X] 
       ON OrderDetailsXSB (OrderID, ProductID
                         , ProductName, Quantity)
GO

That's all there is to creating it. Once it is created, SQL Server's query optimizer can select to use the index on any query where it is appropriate. As we will see, while it will be used for queries on OrderDetailsXSB it can be used on queries on other views.

Putting the New Index to Use



For starters, let's query OrderDetailsXSB and take a look at the plan. This query retrieves the same information as the original query on [Order Details Extended]:



DBCC DROPCLEANBUFFERS
GO
SET STATISTICS IO ON
SELECT ProductName, UnitPrice as UnitPri, Quantity as Qty
     , Discount as Disc 
     , CONVERT(numeric(18,2), UnitPrice*Quantity*(1-Discount))
                AS ExtendedPrice
    FROM OrderDetailsXSB
    WHERE OrderID = 10657
GO
(Results)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
ProductName                     UnitPri  Qty Disc ExtendedPrice  
------------------------------- -------- --- ---- ---------------
Genen Shouyu                     15.5000  50  0.0          775.00
Jack's New England Clam Chowder   9.6500  24  0.0          231.60
Spegesild                        12.0000  45  0.0          540.00
Zaanse koeken                     9.5000  10  0.0           95.00
Gnocchi di nonna Alice           38.0000  45  0.0         1710.00
Camembert Pierrot                34.0000  30  0.0         1020.00

6 row(s) affected)

Table 'OrderDetailsXSB'. Scan count 1, logical reads 2, physical 
reads 2, read-ahead reads 0.

The results are the same as the first query but the statistics are different. They show only 2 physical reads. That's half the physical reads used by the query without the index. That is the kind of performance gains that can be expected from the addition of indexed views.

Figure 2 shows the query plan for the query and reveals that the indexed view was queried instead of joining the two tables. Adding the index had the desired effect of avoiding the join and reducing IO.

Figure 2 Query Plan of Query on [Order Details SB]

The only fly in the ointment is that the ExtendedPrice column is no longer in the view and has to be added to the query that selects from OrderDetailsXSB. Or does it?

The optimizer can use the new index to improve the performance of any query where it may help out. Let's try our original query on the unaltered view [Order Details Extended] and see what happens:

DBCC DROPCLEANBUFFERS
GO
SET STATISTICS IO ON
SELECT ProductName, UnitPrice as UnitPri, Quantity as Qty
     , Discount as Disc, ExtendedPrice 
    FROM [Order Details Extended]
    WHERE OrderID = 10657
GO
(Results)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
ProductName                     UnitPri  Qty Disc ExtendedPrice  
------------------------------- -------- --- ---- ---------------
Genen Shouyu                     15.5000  50  0.0          775.00
Jack's New England Clam Chowder   9.6500  24  0.0          231.60
Spegesild                        12.0000  45  0.0          540.00
Zaanse koeken                     9.5000  10  0.0           95.00
Gnocchi di nonna Alice           38.0000  45  0.0         1710.00
Camembert Pierrot                34.0000  30  0.0         1020.00

(6 row(s) affected)

Table 'OrderDetailsXSB'. Scan count 1, logical reads 2, physical
reads 2, read-ahead reads 0.

Figure 3 shows that the plan is the same plan that was used for directly querying OrderDetailsXSB. The IO statistics show that the same benefit was achieved as if OrderDetailsXSB was queried directly.

Figure 3 Query Plan that Uses an Index on a Different View

This is great! It means that the original queries do not have to be changed in order to reap the benefits of indexed views. An application can be left intact, indexed views can be added to the database, and query performance will improve, usually substantially.

Using Indexed Views On Non Enterprise Editions of SQL Server

At the start if the article I mentioned that to use indexed views on SQL Server editions other than Enterprise and Developer, the NOEXPAND query hint must be used. NOEXPAND Specifies that the indexed view is not expanded when the query optimizer processes the query. In effect, the query optimizer treats the view like a table with clustered index.

In order to use NOEXPAND, the view must be named in the query! That means that all the wonderful optimizer magic that I just showed to you in the previous section doesn't happen. Under Enterprise and Developer Editions, the optimizer is able to use the OrderDetailsXSB indexed view to substitute for [Order Details Extended] even though OrderDetailsXSB wasn't in the query. In other editions, it can't. If our example query is going to run under the Standard Edition it must become:

SELECT ProductName, UnitPrice as UnitPri, Quantity as Qty
     , Discount as Disc 
     , CONVERT(numeric(18,2), UnitPrice*Quantity*(1-Discount))
                AS ExtendedPrice
    FROM OrderDetailsXSB  WITH (NOEXPAND)
    WHERE OrderID = 10657

The query must use the WITH (NOEXPAND) hint in the FROM clause. The drawback to using NOEXPAND is that the view must be indexed. If it is not indexed SQL Server issues by SQL Server the message:

Server: Msg 8171, Level 16, State 2, Line 1
Hint 'noexpand' on object 'OrderDetailsXSB' is invalid.

The bottom line is that to use indexed views on queries with the NOEXPAND hint, the view must be named in the query, which means the view is going to have to always exist in the database. The best way to make this happen is to make views that satisfy the restrictions on indexes and then use them as the basis for the views needed by the application. For example, the alternative to changing the application to query OrderDetailsXSB directly is to define [Order Details Extended] as a view based on OrderDetailsXSB. Then the applications SQL statements don't have to be changed to use the indexed view. In essence, you the DBA/Programmer are doing the work the optimizer would have done. The disadvantage to that is that the optimizer loses its flexibility and won't be able to make the best execution plan based on costs.

Conclusion

Although there are many restrictions on indexed views and they may not be as useful in certain editions of SQL Server, indexed views can provide a very significant improvement in performance. Getting around the restrictions may not be as difficult as it originally appears due to the optimizer's ability to make use of the index even if the query is not directly on the view being queried.

While the improvement in query performance can be substantial, the costs can be large enough to offset any benefits. Maintaining the index on a view has performance costs: it adds overhead to every insert, update or delete on a base table that participates in the view. This limits the applicability of indexed views to situations where data modifications are limited; preferably, where the data modification occurs in batches. Also, don't forget about the extra space consumed by the index. Applications that involve large proportions of decision support and analysis are the best candidates for using indexed views.

References:

Improving Performance with SQL Server 2000 Indexed Views

http://www.microsoft.com/technet/treeview/default.asp?url=/technet/prodtechnol/sql/maintain/optimize/indexvw.asp

» See All Articles by Columnist Andrew Novick

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved