Introduction to MSSQL Server 2000 Analysis Services: Basic Storage Design

Monday Sep 20th 2004 by William Pearson
Share:

Design aggregations for your cube. MSAS Architect Bill Pearson provides a hands-on introduction to the design of aggregation storage size and query performance optimization.

About the Series ...

This article is a member of the series Introduction to MSSQL Server 2000 Analysis Services. The series is designed to provide hands-on application of the fundamentals of MS SQL Server 2000 Analysis Services, with each installment progressively adding features and techniques designed to meet specific real - world needs. For more information on the series, as well as the hardware / software requirements to prepare for the exercises we will undertake, please see my initial article, Creating Our First Cube.

Note: Service Pack 3 updates are assumed for MSSQL Server 2000, MSSQL Server 2000 Analysis Services, and the related Books Online and Samples. Images are from a Windows 2003 Server environment, upon which I have also implemented MS Office 2003, but the steps performed in the articles, together with the views that result, will be quite similar within any environment that supports MSSQL Server 2000 and MSSQL Server 2000 Analysis Services ("Analysis Services" or "MSAS"). The same is generally true, except where differences are specifically noted, when MS Office 2000 and above are used in the environment, in cases where MS Office components are presented in the article.

Overview

Optimization of MSAS requires an in-depth knowledge of many factors, including the manner in which MSAS executes queries within the client/server architecture that embodies it. From the perspective of the cubes that MSAS produces, our primary concerns, from a performance standpoint, lie within two main groups: cube processing performance (how fast the cube builds / updates from the source data) and cube query performance (the response time with which consumers' needs are met with information contained in the cube).

The structure of MSAS cubes themselves span many considerations; among the most significant concerns are storage modes and aggregations. MSAS allows for management of these and other factors in numerous ways, including several tools within the application. This article will focus on the Storage Design Wizard, and, as a natural part of exploring its use, we will consider and acknowledge the importance and potential complexity of storage configuration for MSAS. The detailed topics of storage, aggregation, and a host of other considerations in tuning MSAS are beyond the scope of this article. These and many other performance-related topics will be treated individually in other articles of my various series.

In this lesson, we will do the following:

  • Introduce the Storage Design Wizard, discussing its general purposes;
  • Explore scenarios where we might use the Storage Design Wizard;
  • Practice using the Storage Design Wizard in a basic scenario;
  • Review cube processing;
  • Discuss how the processing log can be used to focus on storage design effectiveness.

The Storage Design Wizard

Introduction and Overview

The MSAS Storage Design Wizard provides us a guided, user-friendly approach to configure, both initially and on an on-going basis, storage options for our cubes. In addition, the wizard affords us a means of adding, modifying and removing cube aggregations. Moreover, the Storage Design Wizard lets us manage aggregations on a partition-by-partition basis when working with a multi-partitioned cube, as we shall see when we address partitioning in a subsequent article.

The primary purpose in life for a cube is to provide a data source from which data can be retrieved rapidly by enterprise information consumers. Aggregations, or predefined summaries of data values, support this rapid retrieval of data. One of the strengths of an MSAS OLAP implementation is that it offers us flexibility in deciding the most appropriate physical storage of these aggregations, within the context of our individual business and technical environments.

We can select from three options (referred to as modes) for storage of aggregations in MSAS. The options differ mainly in the physical location where detail, or low level dimensional members, and aggregated values are stored. The three storage modes from which we can select are shown in Table 1.

Storage Mode

Physical Storage Locations

Detail Values

Aggregated Values

ROLAP

RDBMS data source

RDBMS data source

(MSAS-maintained)

HOLAP

RDBMS data source

Cube

MOLAP

Cube

Cube


Table 1: Storage Modes Available in MSAS

As an example, a cube developer might decide that monthly financial system balances for the last two operating years belong in MOLAP, where they can be accessed quickly by the enterprise on a daily basis. The historical data, more than two years old, might be relegated to ROLAP. Information consumers would never know the difference, as all would appear to be coming from the same source (the cube), at the level of retrieved data, unless there is a sudden need to do intensive reporting upon historical data (the speed of retrieval of which might be impacted). Design in this area, if based upon realistic data access and usage requirements, can typically afford consumers a single view of the data, regardless of the underlying storage mode(s) chosen.

Different storage modes can be set up for different portions of the cube, as in the example above, based upon partitions, a concept we explore in other articles of the series. Put simply, MSAS storage modes allow us flexibility to meet the needs of our business environment - and settings that can be easily modified if those circumstances change.

Considerations and Comments

I began working heavily with MSSQL Server in the days of version 6.5, upon which I initially certified. (I had worked with the predecessor version a bit, upgrading it several times and so forth, but version 6.5 was my first in-depth exposure as a DBA / developer). It was my observation, upon the advent of MSSQL Server 7.0 and MSSQL Server 2000, and their complementary OLAP Services and Analysis Services components, respectively, that many "old hands" from database environments including MSSQL Server 6.5, Oracle, and other RDBMS', tended to sneer at the use of wizards in performing maintenance and optimization procedures for the RDBMS and / or the OLAP components that accompanied them. This was often, as with other "assistance" features in MSSQL Server, because many more practitioners were immediately admitted to the "DBA club" (for better or for worse), and because the road to performing many redundant processes was made more open (and less mystical). It is clear that many of these assistance tools, such as the Storage Design Wizard, help us to perform more efficiently, and with less tendency to make errors. Using a sophisticated algorithm to do its work, this wizard is, like most such tools, highly effective when used with proper training and a sufficient understanding of MSAS structural fundamentals.

Usage patterns and other variables enter the tuning equation, for which we have additional tools and procedures, as we have seen, and shall continue to see, in other articles. However, for initial storage configuration, and flexible modification as the irresistible march of time affects our data environments, the Storage Design Wizard offers much in the way of effective, efficient OLAP storage management.

Because the procedure we take in this article will alter the structure of one of the sample cubes that ships with MSAS, we will make a copy of the cube to avoid making changes to the original. (Many MSAS practitioners have created other example objects within, and made other illustrative changes to, the original sample cubes, and wish to keep these "customized samples" intact.) This brief preparatory step will leave you able to revisit the sample cube in its original / current state in the future, without having to undo any steps that we take in the following sections, or otherwise "put things back as they were." (You can always restore the database to bring back the sample cubes exactly as they appeared at installation, as well. See the Books Online if you need to take this route.)

Hands-On Procedure

Preparation

We will prepare for our practice example by opening MSAS Analysis Manager, and creating a copy of the HR cube, upon which to run the Storage Design Wizard. Keep in mind, as we progress, that we are working with a small cube with no pre-existing aggregations.

1.  Start Analysis Manager.

2.  Expand the Analysis Servers folder by clicking the "+" sign to its immediate left.

Our server(s) appear.

3.  Expand the desired server (mine appears as MOTHER1 in the illustrations).

Our database(s) appear, in much the same manner as shown in Illustration 1.

NOTE: The objects appearing in your environment will differ, in general, from mine, and depend upon the actions you have performed since initially installing MSAS.


Illustration 1: Sample Databases Displayed within Analysis Manager

4.  Expand the FoodMart 2000 database.

5.  Expand the Cubes folder.

The sample cubes appear, as shown in Illustration 2.


Illustration 2: FoodMart 2000 Database Sample Cubes

6.  Right-click on the HR sample cube.

7.  Select Copy from the context menu that appears, as shown in Illustration 3.


Illustration 3: Selecting Copy from the Context Menu

8.  Right-click on the Cubes folder.

9.  Select Paste from the context menu that appears, as shown in Illustration 4.


Illustration 4: Selecting Paste from the Context Menu

The Duplicate Name dialog appears.

We cannot have two cubes of the same name in a given MSAS database.

10.  Type the following into the Name box of the Duplicate Name dialog:

DBJ_STORDESIGN

The Duplicate Name dialog appears, with our modification, as depicted in Illustration 5.


Illustration 5: The Duplicate Name Dialog, New Name Inserted

TIP:

This is also a good way to rename MSAS objects for which a "Rename" option does not exist. In "rename" scenarios, we simply create the new object in the manner shown above, give it the desired name, and discard the original object, as appropriate.

Keep in mind that, in the case of cubes and other structural objects, this will likely mean reprocessing before the clone will be fully usable.

11.  Click OK to save the name change.

The new cube appears in the cube tree, among those already in place. We now have a copy of the HR cube, DBJ_STORDESIGN, upon which we can work with the Storage Design Wizard.

Storage Design in a Simple Cube Structure

Let's take a look at the Storage Design Wizard in a practice exercise.

We will work with a simple cube structure, to minimize distraction from the steps involved. We will return to the Storage Design Wizard within the context of managing partitions, among other considerations, in a subsequent article where more complex scenarios will arise.

1.  Right-click on the DBJ_STORDESIGN sample cube.

2.  Click Design Storage on the context menu that appears, as shown in Illustration 6.


Illustration 6: Selecting Design Storage from the Context Menu

The Storage Design Wizard Welcome dialog appears, as depicted in Illustration 7.


Illustration 7: Welcome Dialog - Storage Design Wizard

3.  Click Next.

The Select the type of data storage dialog appears. (Had aggregates already existed in the cube, an Aggregates already exist dialog would have appeared.) It is here that we select from the storage modes that we discussed earlier.

4.  Click the MOLAP radio button to select the MOLAP storage mode.

The Select the type of data storage dialog appears as shown in Illustration 8.


Illustration 8: Select the Type of Data Storage Dialog

Recall from our discussion earlier that the MOLAP option places both the detail data and the aggregations in the cube. This will be a good selection for our example, so that we can focus on the further actions of the Storage Design Wizard. We will devote future articles to the characteristics and appropriate uses of the ROLAP and HOLAP options.

5.  Click Next.

The Set Aggregation Options dialog appears. Within this dialog, we can exploit more of the powerful features of the MSAS Storage Design Wizard. Here, MSAS determines the combination of aggregations that give us the most "return," within the cube design we have submitted. The process is easy and needs to be accomplished only infrequently, providing that it is performed correctly at appropriate points in the life cycle of a given cube.

The Storage Design Wizard applies an 80 / 20 rule within the sophisticated algorithm that we mentioned earlier in helping us to attain, at least from a preliminary perspective, the optimal mix of aggregations within our cube structure. Because all aggregated measures in a cube are derivatives, higher-level aggregations of measures can be derived, upon demand, from lower level aggregations. The example that is cited most often is that of an aggregation that exists at intersects of the Time dimension and another dimension in the cube.

Let's take, for example, a Store by Month aggregation. This aggregation can "roll up" to Store by Quarter, Store by Year and other higher-level aggregations within MSAS. The benefit of this is that all the higher-level aggregations do not need to be stored in their "materialized" state, adding to the overall space requirement in what can be an exponential manner. Precalculated aggregations need not exist for every rolled up intersect. MSAS allows for the dynamic generation of these aggregations upon demand - in effect, they can be maintained as "virtual" aggregations.

On the Set aggregation options dialog, we can set parameters that affect both aggregation storage and query performance. In the Aggregation options section of the dialog, we can mandate restrictions on the total cube size by setting an upper limit for the amount of space that we feel we can afford to devote to the cube. The algorithm is then put to work to determine the "best mix" of aggregations that it can manage in the space we dictate.

We will leave the Estimated storage reaches selection at the default of 100 MB for this exercise. We will, however, make an adjustment to the next parameter, Performance gain reaches.

6.  Click the Performance gain reaches radio button to activate the percentage box to its right.

7.  Type in 20 for the percentage.



The percentage performance gain we type into this box option represents the targeted percentage improvement between the maximum and minimum query times. Twenty is a good starting target, and can be expected, generally, to result in adequate aggregation to ensure a significant increase in performance. Diminishing returns can result from setting the percentage unnecessarily high; the idea here is to attain a good level of balance between the increased disk space required by new aggregations and the level of overall performance.

A third radio button represents another option for getting to the best aggregation mix. With the Until I click stop setting, we can attempt to manually determine the best balance in conjunction with keeping an eye on the Performance vs. Size graph that appears to the right of the dialog. We would, ideally, determine the point at which the increase in performance begins to level off while storage continues to increase, and then stop the process.

As we progress through this and other series, we will focus, from time to time, on the use of the parameters found within the Set aggregate options dialog, along with numerous others, to meet specific tuning objectives, or to offer options for overall improvement in a certain aspect of query or processing performance. For now, let's get a grasp of the operation of the tool.

The Set Aggregation Options dialog, with our settings, appears as depicted in Illustration 9.


Illustration 9: Set Aggregation Options Dialog

The remaining buttons include the following:

  • Start: Kicks off the aggregation design process, based upon our settings;
  • Continue: Activates once we click Stop, or when the Performance vs. Size graph indicates we have met our storage or performance gain targets, as a means of resuming the design process;
  • Stop: Allows us to manually stop the design process;
  • Reset: Enables us to delete aggregations added and restart the design process.

To conclude this section, let's proceed with the following steps.

8.  Click Start.

The Next button will activate as soon as the design process finishes. The Set aggregation options dialog, after the process is complete, appears as shown in Illustration 10.


Illustration 10: Set Aggregation Options Dialog, Results Displayed at Lower Right

We see that the Wizard has produced 12 aggregations, and reached the 20% estimated performance gain level, as indicated underneath the Performance vs. Size graph. Pressing Continue (which has become enabled) at this point will result in intermittent incremental increases above 20%, with a leveling off of the curve, so a degree of manual tweaking can be had for minimal additional effort. We will leave the results as they are, however and move ahead.

9.  Click Next, to arrive at the Finish the Storage Design Wizard dialog, as shown in Illustration 11.


Illustration 11: Finish the Storage Design Wizard Dialog

We can either save our new definition at this point, or process the cube to create the new aggregations. The definition remains stored until the design is enacted, via the processing cycle.

Let's process the cube, and pay attention to the effects in the resulting Process Log.

10.  Leaving the radio button (underneath What do you want to do now?) at its default of Process now, click Next.

Cube processing begins, and runs its course quickly, as witnessed by the Process Log window that next appears. Once processing is finished, we notice the green Processing completed successfully message appears at the bottom of the log window, as depicted in Illustration 12.


Illustration 12: Processing Completed Successfully Message
(Window Partially Collapsed)

We note that processing duration time, together with the various steps of the process, is detailed in the log window. This presents an opportunity, in tuning evolutions, to compare process times between the current log and previous logs (all logs are captured in a database, which we have explored in other articles, and will again examine in subsequent articles). We note, too, that the log states, "cube needs to be processed" on the fourth line from the top of the entries, an indication that the condition was noted as soon as we began processing. Keep in mind that changes planned via the Storage Design Wizard (among numerous other structural changes) require a processing run to be consummated.

11.  Click Close to close the log window, once you have examined it.

We leave the Process Log window and the Cube Editor behind, and arrive in Analysis Manager, once more.

12.  Delete the DBJ_STORDESIGN cube, if desired, by right-clicking and selecting Delete from the context menu that appears.

13.  Select File --> Exit from the Main Menu in the Management Console to close Analysis Manager.

We have seen operation of the Storage Design Wizard from start to finish, for a simple cube. We will revisit the Storage Design Wizard from time to time, specifically within the context of our work with partitions, which allow us to design aggregations differently for separate "sections" of a cube, and within other articles where we can employ it to help us meet various maintenance and optimization objectives.

Summary ...

In this lesson, we introduced the Storage Design Wizard, and emphasized its value as a tool within the important, and often complex, context of storage configuration for MSAS. Our objective was to expose the tool's role in storage and aggregation design, as well as the basic concepts of MSAS storage design in general.

We explored some of the scenarios where we might use the Storage Design Wizard, and discussed the storage types that are available to us in our cube designs. We then practiced using the Storage Design Wizard within the context of a simple cube with no pre-existing aggregations, and exposed the steps involved in increasing performance through basic storage design. We processed the cube to put our changes into effect, discussing processing considerations within the scope of storage design. Finally, we discussed how the Processing Log can be used as a means of ascertaining the effects of our storage and aggregation design upon cube processing performance.

» See All Articles by Columnist William E. Pearson, III

Discuss this article in the MSSQL Server 2000 Analysis Services and MDX Topics Forum.

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved