MDX in Analysis Services: Optimizing MDX: Caching and Other Considerations

Monday Feb 23rd 2004 by William Pearson
Share:

Bill Pearson concludes the Optimizing MDX sub-series with an exploration of caching to optimize MDX queries. He then discusses additional optimization options, including the substitution of external functions and leveraging the database structure, for obtaining optimal performance.

About the Series ...

This is the twelfth tutorial article of the series, MDX in Analysis Services. The series is designed to provide hands-on application of the fundamentals of MDX from the perspective of MS SQL Server 2000 Analysis Services ("Analysis Services,"); our primary focus is the manipulation of multidimensional data sources, using MDX expressions, in a variety of scenarios designed to meet real-world business intelligence needs.

For more information on the series, as well as the hardware / software requirements to prepare for the tutorials we will undertake, please see the first lesson of this series: MDX Concepts and Navigation.

Note: At the time of writing, Service Pack 3 updates are assumed for MSSQL Server 2000, MSSQL Server 2000 Analysis Services, and the related Books Online and Samples. The screen shots that appear in this article were taken from a Windows 2003 Server, and may appear somewhat different from coinciding views in other operating systems.

Introduction

In our last tutorial, More on Location, and the Importance of Arrangement, we returned to our three-part mini-series, Optimizing MDX. We continued our focus from the first article of the series, Control Location of Processing, exploring the use of control of location as a primary intervention type for MDX query optimization. We performed a practice exercise to reinforce the concepts exposed, and then extended our considerations of additional types of intervention to include the optimization of set operations and syntax arrangement considerations. Within our exploration of the optimization of set operations, we undertook practice examples that illustrated some ways we can rearrange queries to enhance performance, often significantly.

In this lesson, the final article of the current Optimizing MDX mini-series, we will expose methods of caching to load a commonly used slice of a cube into memory, making for faster retrieval in prospective operations. Our discussion will include various aspects of cache creation, and uses of caching within MDX. In addition, we will touch upon other performance enhancement options, including external functions and cube design modifications and augmentation.

Caching and Optimization

When queries are not well underpinned by aggregates, we can often enhance performance by creating and caching the appropriate aggregates in memory. Caching is a feature that MDX provides to improve performance; caching affords us the capability of loading a commonly used slice of a cube into memory, "caching" it for faster retrieval by our queries.

Analysis Services and the PivotTable Service automatically cache query definitions, data and meta data on the server and client sides, respectively. Caching increases performance in those cases where queries are repeatedly requesting the same data or meta data, reducing network traffic or execution time. The ability to create caches for data that we specify in MDX gives us another means of fine-tuning query performance; through this capability, we realize a great degree of control over the caching of data for which we expect there to be a recurring need.

In terms of creation scope, caches are analogous to named sets: we can create a cache for the lifetime of a single query, or for a session. To create a cache to be used at the session level, the CREATE CACHE statement can be used. The CREATE CACHE statement can be used to create caches at the query level, but the WITH statement, with which we are now somewhat familiar, can perform this task just as easily, and is more frequently used for this purpose.

Let's take a look at the use of the WITH statement to create a cache in an MDX query. First, we will call our old friend, the MDX Sample Application, as a platform from which to perform our practice exercises.

1.  Start the MDX Sample Application.

We are initially greeted by the Connect dialog, shown in Illustration 1.


Illustration 1: The Connect Dialog for the MDX Sample Application

The illustration above depicts the name of my server, MOTHER1, and properly indicates that we will be connecting via the MSOLAP provider (the default).

2.  Click OK.

The MDX Sample Application window appears.

3.  Clear the top area (the Query pane) of any remnants of queries that might appear.

4.  Ensure that FoodMart 2000 is selected as the database name in the DB box of the toolbar.

5.  Select the Warehouse cube in the Cube drop-down list box.

The MDX Sample Application window should resemble that depicted in Illustration 2, complete with the information from the Warehouse cube displaying in the Metadata tree (left section of the Metadata pane).


Illustration 2: The MDX Sample Application Window (Compressed View)

Let's look first at creating a cache with a query scope. To do so, we will take the following steps:

1.  Create the following new query:


-- MXAS12-1:  WITH CACHE Query
WITH CACHE AS 
 '([Product].[Product Department].Members)'
SELECT
   {[Measures].[Warehouse Sales]} ON COLUMNS,
   {[Product].[Product Department].Members} 
 ON ROWS
FROM Warehouse

2.  Execute the query using the Run Query button.

The results dataset appears as shown in Illustration 3.


Illustration 3: WITH CACHE Query

3.  Save the query as MXAS12-1.

Keep in mind that the "life" of the WITH CACHE statement is only as long as the query in which it resides. Use of the WITH CACHE statement can sometimes result in more rapid completion of the overall query, because the full set of cells that we have specified in the statement arrives at the client before the multidimensional data set is returned. In scenarios where the server is accessed via a high speed LAN by the application generating the query, the small performance enhancement may be negligible. By contrast, scenarios where access is over WAN links or modem connections, query results may be retarded in first making an appearance, but will likely require less time to retrieve overall.

Now, let's take a look at the use of the CREATE statement to create a cache in an MDX query. We will create a cache with session scope, as we have already created a cache with query scope above.

4.  Create the following new query:


-- MXAS12-2:  CREATE CACHE Query
CREATE SESSION CACHE FOR Warehouse AS 
'(
  Descendants (
      [Time].[1998],
     [Time].[Month]
),
[Warehouse].[Country].Members
)'

5.  Execute the query using the Run Query button.

6.  Save the query as MXAS12-2.

The query creates a cache with session scope; we notice that no measures are specified this time. This is because all the cube's base measures are loaded into the cache at runtime. The CREATE statement above does not take noticeably more time to execute than would the core query; immediate execution of the query will occur subsequent to cache creation, however, because the query would process in its completeness from RAM, where the cache is housed.

Before we decide to use caching, we need give thought to whether the query we are attempting to improve through caching will actually benefit from means other than redesign of the query itself, and whether the caching process can realistically provide improved performance in general. Obviously, a one-time query is not likely to benefit from caching. In addition, numerous situations will exist where, although a large population of cells are specified in a query, only a few cells are actually accessed. To use caching, for example, in a scenario where only specific tuples out of a large CrossJoin are actually used, may mean more processing time to cache the population of cells than will be saved in performance gains for the few cells actually used. The cache statements are best left to scenarios where their effects are likely to increase performance and where the query actually has a need for tuning from the outset.

7.  Close the Sample Application.

Other Performance Enhancement Options

External functions can often offer processing optimizations over their calculated equivalents, particularly when the calculated members are complex. External functions can be used in a query via the presence of the associated calls to those functions, if the associated function library is installed in the appropriate place(s) on the client or server. Queries making use of these functions can be resolved on either the client or the server if the function resides on either tier, and, as long as the external query is not used in making the determination of the axes within the query, then the query can be largely resolved on the server, even when the function does not reside on the server. Keep in mind, however, that the presence of the function solely on the client means forced client-based processing regardless of other factors.

Finally, cube design modification and / or augmentation can provide significant efficiency in cube processing, even though this might often rest outside the control of the query designer. Such improvements as placing member properties into measures (especially numerical data) allow for free and efficient use of these fields at multiple levels. And the additional processing that can be encountered by making this measure available as a calculated member can be mitigated significantly by placing the measure in a "custom" cube (one-dimensional cubes work fine, if adequate to meet the need), which can then be combined with other cubes - cubes that need the new measure - in virtual cubes. This concept can be extended into many other areas, and provides an excellent way to leverage the existing cube structure by "adding on" needed components.

Summary and Conclusion ...

In this, the concluding lesson of our Optimizing MDX mini-series, we extended our toolset by adding yet another intervention type for optimizing our MDX queries. We exposed two methods of caching to load a commonly used slice of a cube into memory, making for faster retrieval in prospective operations. We discussed the creation scope for caches, within the context of both the CREATE CACHE and the WITH CACHE statements, and discussed appropriate uses for each. Finally, we touched upon other performance enhancement options, including external functions and cube design modifications and augmentation.

While this brings our Optimizing MDX mini-series to a close for the present, we will likely add new segments to the set from time to time, on an ad hoc basis. Optimization is obviously an area of great interest to developers and consumers alike, and it is a rare week indeed when yet another nuance for more efficient MDX does not present itself at client sites and / or in my lab. Stay tuned for more tips in the months to come!

» See All Articles by Columnist William E. Pearson, III

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved