MDX in Analysis Services: Optimizing MDX: Control Location of Processing

Monday Dec 22nd 2003 by William Pearson
Share:

Discover ways to optimize performance in MDX queries. Join author Bill Pearson in a three-part mini-series that begins with a focus on control of the location of query processing.

About the Series ...

This is the tenth tutorial article of the series, MDX in Analysis Services. The series is designed to provide hands-on application of the fundamentals of MDX from the perspective of MS SQL Server 2000 Analysis Services ("Analysis Services,"); our primary focus is the manipulation of multidimensional data sources, using MDX expressions in a variety of scenarios designed to meet real-world business intelligence needs.

For more information on the series, as well as the hardware / software requirements to prepare for the tutorials we will undertake, please see the first lesson of this series: MDX Concepts and Navigation.

Note: At the time of writing, Service Pack 3 updates are assumed for MSSQL Server 2000, MSSQL Server 2000 Analysis Services, and the related Books Online and Samples. The screen shots that appear in this article were taken from a Windows 2003 Server, and may appear somewhat different from coinciding views in other operating systems.

Introduction

With this lesson, we begin a three-part mini-series, Optimizing MDX. In these articles, we will explore methods for optimizing the performance we obtain from our MDX queries. Throughout the three tutorials, we will discuss the main types of intervention through which we can enhance performance of MDX, and we will explore examples of the various types in turn, in a hands-on manner, to provide a practical level of awareness that we can apply in our daily work with MDX.

In this lesson, we will:

  • Expose general types of intervention we can use to optimize MDX queries;
  • Discuss how Analysis Services processes queries, as a preparation for our focus on optimization techniques;
  • Examine the Large Level Threshold property, and its pervasive importance;
  • Begin an exploration of optimization through the control of location of query execution (to be concluded in our next lesson).

The Types of Intervention

As most of us know, from having worked with MDX in the preceding articles of this series and elsewhere, MDX syntax allows us to define and manipulate multidimensional objects and data. As with an SQL query, we have found that an MDX query requires a data request (the SELECT clause), a point from which to begin (the FROM clause), and a filter (the WHERE clause). These and other keywords, which may or may not actually be visible in our queries, (many assume the values of defaults when left unspecified), provide the tools used to extract specific portions of data from a cube for analysis. MDX also supplies a robust set of functions for the manipulation of data, data definition language (DDL) syntax for managing data structures, and support for extending itself with user-defined functions. MDX commands abound for creating (and deleting) cubes, dimensions, measures, and their subordinate objects.

Many approaches exist in the rich MDX toolset to help us accomplish our business objectives; and, as we might reasonably expect, some approaches are more efficient in operation than others are, predominantly in terms of memory and CPU utilization. In this article, we will discuss some of the types of intervention that we might perform to achieve optimization of a query, discussing the points at which adjustments and / or enhancements might give us the performance boosts we desire. While we will touch upon enhancements that can be made outside our queries to contribute to performance, our focus in this "triptych" of articles will be largely upon items that can be considered within the queries themselves to contribute to overall efficiency.

The types of intervention that we might investigate as options for the optimization of the MDX queries we construct consist of the following general groups:

  • Control of the location of processing;
  • Optimization of set operations and syntax arrangement;
  • Optimal uses of caching;
  • Substitution of external functions;
  • Leveraging database structure.

We will examine each of these types of intervention in the following tutorials, and look at examples of actions we might take within the context of most, to illustrate possible performance enhancements we might bring about.

Understanding How Queries are Processed

Before we delve into our examination of MDX query optimization techniques, a general understanding of how Analysis Services processes our queries will be useful. It is from the perspective of each of the process points that we can seek to optimize a given query. From a high level, these steps occur in the following general order:

  • An MDX statement or query is issued by a client application.
  • The statement or query is parsed by the PivotTable Service, which rearranges it into an expression tree to facilitate ease of use in subsequent steps.
  • The metadata identification is resolved, and the determination made to process the query on the server or on the client.
  • Final preparations for processing occur, including possible further optimization procedures.
  • Rows and columns are identified (the axes are resolved) and, in some cases (isolation mode, predominantly), cell values are pre-fetched and cached.
  • The combined results are returned to the application from which the query originated.

Integral to our lesson will be the location at which the calculations are performed. The location at which the operations occur is highly relevant to performance optimization, as most of us can readily see. The location under consideration can be either the Analysis Server or the client PivotTable service.

The Location of Query Processing

Control of the location of processing, as we shall see, can be managed from the Large Level Threshold property setting. While the large level value is a critical influencer of which location is selected for query processing, we can also control the location using other means, the chief of which is the Execution Location parameter. We will examine these settings in more depth as we proceed, but before we undertake to control the location of processing, we should understand the benefits and costs associated with the client / server processing options from which we can select. Some of the more important of these are summarized in Table 1.

Client vs. Server Processing

Costs and Benefits

Processing

Location

Disadvantages

Advantages

Server

Consumes more CPU and memory resources

Results can be cached on the server, so that requests for the same result sets by other client applications are quickly available

Client

Takes longer to process

Placing processing burden on the client relieves the server of virtually all resource requirements

Potentially much greater network traffic, as large amounts of data are being dispatched from the server.

Client-based processing means client-based caching. This results in a loss of rapid fulfillment of recurring requests, by multiple clients, for the same information


Table 1: Advantages and Disadvantages of Processing Location Options

Let's take a look at the first of these two prominent settings, Large Level Threshold property, with which we will begin a hands-on examination of managing processing location.

The Large Level Threshold Property

The Large Level Threshold property fixes the point at which a dimension level is too large to be sent to the client application as a single body. The property, by default, is set at the server at 1,000. We can use the property to help us to manage both CPU and memory utilization from the client perspective. Because the property stipulates the point at which level is "too large for the client," it also determines the level at which the number of members from a given level force server-based execution of the PivotTable Service. The property, therefore, is highly useful to us in determining query execution location, which, as we are learning, is a significant point of intervention in the optimization of our MDX queries.

If the number of members in a level is below the threshold we set in this property, it is sent to the client application in a single piece. The level is broken into smaller groups, which are sent to the client application as needed, if the number of members in the level under consideration is equal to or more than the threshold setting. Should the level contain a number of members smaller than this threshold value, the entire level will be sent to the client application. If the level under consideration contains a number of members that is greater than or equal to the value of the Large Level Threshold property, the level is incrementally sent from the server to the client application.

While the default value for the Large Level Threshold property is 1,000, the minimum value for this property is 10. (If the value is set to less than 10, the value will reset itself at 10, automatically preventing any errors in processing). The Large Level Threshold property can be modified at the server, or can be specified by the client upon establishment of a connection to the server. Let's explore the process for setting the parameter in the former case, with the following steps.

1.  Start Analysis Manager (Start -> Programs -> Microsoft SQL Server -> Analysis Services -> Analysis Manager).

2.  Expand the Analysis Servers folder by clicking the "+" sign to its left.

3.  Right-click the Server (MOTHER1 in my case, as shown in Illustration 1), typically named after the machine on which it resides.

4.  Select Properties from the context menu that appears, as shown in Illustration 1.


Illustration 1: Select the Properties Option from the Context Menu

The Properties dialog appears.

5.  Select the Environment tab.

6.  Type 750 in the Large Level Defined As box, as shown in Illustration 2.


Illustration 2: Modification: Large Level Threshold Defined as 750 Members

We have now set the number of members a dimension level must contain to be processed as a large level. The range here can be from 1 - 10,000 (default is 1,000, as we mentioned earlier), and it is meaningful in the context (the number of level members) with which Analysis Services classifies dimension levels into large and small categories only. Keep in mind, as we have already discovered, that large levels are not sent to the client unless they are specifically requested, but small levels are sent to the client even if the entire level is not requested.

7.  Click OK.

We are greeted with a dialog that announces we must restart the Analysis server for changes to take effect, as shown in Illustration 3.


Illustration 3: A Restart of Analysis Services is Required for Changes to Take Effect

8.  Click OK.

9.  Click File --> Exit from the console main menu to close Analysis Services.

The new threshold level will be enacted upon our next start of Analysis Services.

As we have mentioned, we can set the Large Level Threshold property via an MDX query we generate from a client application. When we define the large-level threshold at the client application, we do so by specifying the setpoint as a connection property. Because the property value is set upon opening the connection, this approach for setting the large-level limit does not provide a means of controlling the location of query processing on a query-by-query basis, any more than the same setpoint within Analysis Services provides for ad hoc location management. The Large Level Threshold property therefore allows for setting the property from the client, much as it is set from within the Analysis Services properties that we explored above. A connection string example is as follows:

	Large Level = 750;

Thus, we see that the control of the location of processing can be managed from the Large Level Threshold property setpoint. While the Large Level value is a critical influencer of which location is selected for query processing, we can also control the location using other means, the chief of which is the Execution Location parameter, which we will take up in our next section.

The Execution Location Parameter

We have learned that large levels, as defined by the threshold, are processed solely at the server level, and not sent to the client unless a specific request is made. Small levels, in contrast, are sent to the client to be processed even if the associated level is not requested in its entirety.

The Execution Location connection parameter provides another, more direct means of controlling location of processing of our MDX queries. To build a query that will be executed on the server, we need only use the OLE DB property ExecutionLocation, which specifies where the query is to be resolved. The setpoint options for ExecutionLocation are displayed in Table 2.

Setpoint

Meaning

0

Default. For compatibility with earlier versions, this means the same as setpoint 1, and, thus, "Automatic." (Subject to change in future versions.)

1

The PivotTable Service selects the query processing location (server or client application) that it predicts will provide the best performance ("Automatic" setting).

2

Queries are processed on the client application.

3

Queries are executed on the server. (Queries that contain session-scoped calculated members, user-defined sets, or user-defined functions are exceptions.)


Table 2: Cube Editor View of the Store Dimension Member Properties

The location of query processing can be forced, using the Execution Location property, to our choice of server or client. The default setting (Automatic), allows the PivotTable Service to determine where the query should be processed, based upon its prediction of which option will mean better performance.

A critical factor in the determination of the execution location, when using the default option (where the processing location is determined internally), is the Large Level Threshold property we have already discussed. Say the level is set at 1,000. The PivotTable Service forces "large level" treatment if it determines that the query will require the aggregation of 1000 or more members in a given dimension level. If the query does not involve a large level, the processing of the query occurs at the client. If a large level is apparent, the query processing occurs at the server.

Other means of controlling the location of processing involve named sets, calculated members, and calculated cells, and the circumstances under which they are created. We will delve into the mechanics of these options further in our next session together, Optimizing MDX: More on Location, and the Importance of Arrangement, where we will also move into arrangement considerations in MDX query optimization.

Summary and Conclusion ...

In this lesson, we began the Optimizing MDX group of articles. We established the objective, within this mini-series, of exploring methods for optimizing the performance of our MDX queries. We introduced what I like to call types of intervention, or classifications of approaches that we can take, to enhance the performance of MDX; we will revisit the types of intervention theme throughout the series.

Within the context of each intervention type that we expose, we will perform practice examples to reinforce an awareness that we can apply in our daily work with MDX. In this lesson, we initially discussed the general steps Analysis Services takes in processing queries, as a preparation for our overall focus on optimization techniques. We then "drilled down" our focus to an examination of the important Large Level Threshold property, followed by an introduction to the first intervention type, to which the property contributes its strongest influence, control of location of query execution.

» See All Articles by Columnist William E. Pearson, III

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved