Design aggregations for your cube. MSAS
Architect Bill Pearson provides a hands-on introduction to the design of aggregation
storage size and query performance optimization.
About the Series ...
article is a member of the series Introduction to MSSQL Server 2000 Analysis Services. The series is designed to
provide hands-on application of the fundamentals of MS SQL Server 2000 Analysis
Services, with each installment progressively adding features and
techniques designed to meet specific real - world needs. For more information
on the series, as well as the hardware / software requirements to prepare
for the exercises we will undertake, please see my initial article, Creating Our First
Note: Service Pack 3 updates are assumed for MSSQL Server 2000, MSSQL
Server 2000 Analysis Services, and the related Books Online
and Samples. Images are from a Windows 2003 Server
environment, upon which I have also implemented MS Office 2003, but the
steps performed in the articles, together with the views that result, will be
quite similar within any environment that supports MSSQL Server 2000 and MSSQL Server 2000 Analysis Services ("Analysis
Services" or "MSAS"). The same is generally true,
except where differences are specifically noted, when MS Office 2000 and
above are used in the environment, in cases where MS Office components are
presented in the article.
of MSAS requires an in-depth knowledge of many factors, including the manner in
which MSAS executes queries within the client/server architecture that embodies
it. From the perspective of the cubes that MSAS produces, our primary
concerns, from a performance standpoint, lie within two main groups: cube
processing performance (how fast the cube builds / updates from the source
data) and cube query performance (the response time with which consumers' needs
are met with information contained in the cube).
The structure of MSAS
cubes themselves span many considerations; among the most significant concerns
are storage modes and aggregations. MSAS allows for management of these and
other factors in numerous ways, including several tools within the application.
This article will focus on the Storage Design Wizard, and, as a natural
part of exploring its use, we will consider and acknowledge the importance and
potential complexity of storage configuration for MSAS. The detailed topics of
storage, aggregation, and a host of other considerations in tuning MSAS are
beyond the scope of this article. These and many other performance-related
topics will be treated individually in other articles of my various series.
In this lesson, we will do the following:
Introduce the Storage Design Wizard, discussing its general purposes;
scenarios where we might use the Storage Design Wizard;
the Storage Design
Wizard in a basic
the processing log can be used to focus on storage design effectiveness.
The Storage Design Wizard
Introduction and Overview
MSAS Storage Design Wizard provides us a guided, user-friendly approach
to configure, both initially and on an on-going basis, storage options
for our cubes. In addition, the wizard affords us a means of adding, modifying
and removing cube aggregations. Moreover, the Storage Design Wizard
lets us manage aggregations on a partition-by-partition basis when working with
a multi-partitioned cube, as we shall see when we address partitioning in a
primary purpose in life for a cube is to provide a data source from which data
can be retrieved rapidly by enterprise information consumers. Aggregations,
or predefined summaries of data values, support this rapid retrieval of data.
One of the strengths of an MSAS OLAP implementation is that it offers us flexibility
in deciding the most appropriate physical storage of these aggregations, within
the context of our individual business and technical environments.
select from three options (referred to as modes) for storage of
aggregations in MSAS. The options differ mainly in the physical location where
detail, or low level dimensional members, and aggregated values are stored. The
three storage modes from which we can select are shown in Table 1.
Table 1: Storage Modes Available in MSAS
Physical Storage Locations
example, a cube developer might decide that monthly financial system balances
for the last two operating years belong in MOLAP, where they can be
accessed quickly by the enterprise on a daily basis. The historical data, more
than two years old, might be relegated to ROLAP. Information consumers
would never know the difference, as all would appear to be coming from the same
source (the cube), at the level of retrieved data, unless there is a sudden
need to do intensive reporting upon historical data (the speed of retrieval of
which might be impacted). Design in this area, if based upon realistic data
access and usage requirements, can typically afford consumers a single view of
the data, regardless of the underlying storage mode(s) chosen.
storage modes can be set up for different portions of the cube, as in the
example above, based upon partitions, a concept we explore in other
articles of the series. Put simply, MSAS storage modes allow us flexibility to
meet the needs of our business environment - and settings that can be easily
modified if those circumstances change.
Considerations and Comments
I began working heavily
with MSSQL Server in the days of version 6.5, upon which I initially
certified. (I had worked with the predecessor version a bit, upgrading it
several times and so forth, but version 6.5 was my first in-depth exposure as a
DBA / developer). It was my observation, upon the advent of MSSQL Server 7.0
and MSSQL Server 2000, and their complementary OLAP Services and Analysis
Services components, respectively, that many "old hands" from
database environments including MSSQL Server 6.5, Oracle, and other RDBMS', tended
to sneer at the use of wizards in performing maintenance and optimization
procedures for the RDBMS and / or the OLAP components that accompanied them. This
was often, as with other "assistance" features in MSSQL Server,
because many more practitioners were immediately admitted to the "DBA club"
(for better or for worse), and because the road to performing many redundant processes
was made more open (and less mystical). It is clear that many of these assistance
tools, such as the Storage Design Wizard, help us to perform more
efficiently, and with less tendency to make errors. Using a sophisticated
algorithm to do its work, this wizard is, like most such tools, highly
effective when used with proper training and a sufficient understanding of MSAS
Usage patterns and
other variables enter the tuning equation, for which we have additional tools
and procedures, as we have seen, and shall continue to see, in other articles. However,
for initial storage configuration, and flexible modification as the
irresistible march of time affects our data environments, the Storage Design
Wizard offers much in the way of effective, efficient OLAP storage
the procedure we take in this article will alter the structure of one of the
sample cubes that ships with MSAS, we will make a copy of the cube to avoid
making changes to the original. (Many MSAS practitioners have created other
example objects within, and made other illustrative changes to, the original
sample cubes, and wish to keep these "customized samples" intact.)
This brief preparatory step will leave you able to revisit the sample cube in
its original / current state in the future, without having to undo any steps
that we take in the following sections, or otherwise "put things back as
they were." (You can always restore the database to bring back the sample
cubes exactly as they appeared at installation, as well. See the Books
Online if you need to take this route.)
will prepare for our practice example by opening MSAS Analysis Manager,
and creating a copy of the HR cube, upon which to run the Storage
Design Wizard. Keep
in mind, as we progress, that we are working with a small cube with no
Expand the Analysis
Servers folder by clicking the "+" sign to its immediate
desired server (mine appears as MOTHER1 in the illustrations).
Our database(s) appear, in much the same manner
as shown in Illustration 1.
NOTE: The objects appearing in your
environment will differ, in general, from mine, and depend upon the actions you
have performed since initially installing MSAS.
Illustration 1: Sample Databases
Displayed within Analysis Manager
Expand the FoodMart
Expand the Cubes
sample cubes appear,
as shown in Illustration 2.
Illustration 2: FoodMart
2000 Database Sample Cubes
the HR sample cube.
from the context menu that appears, as shown in Illustration 3.
Illustration 3: Selecting
Copy from the Context Menu
the Cubes folder.
from the context menu that appears, as shown in Illustration 4.
Illustration 4: Selecting
Paste from the Context Menu
Name dialog appears.
have two cubes of the same name in a given MSAS database.
following into the Name box of the Duplicate Name dialog:
Name dialog appears, with our modification, as depicted in Illustration
Illustration 5: The
Duplicate Name Dialog, New Name Inserted
is also a good way to rename MSAS objects for which a "Rename"
option does not exist. In "rename" scenarios, we simply create the
new object in the manner shown above, give it the desired name, and discard the
original object, as appropriate.
Keep in mind that, in the case of cubes and other structural
objects, this will likely mean reprocessing before the clone will be fully
to save the name change.
cube appears in the cube tree, among those already in place. We now have a
copy of the HR cube, DBJ_STORDESIGN, upon which we can work with
the Storage Design Wizard.
Design in a Simple Cube Structure
Let's take a look at
the Storage Design Wizard in a practice exercise.
We will work with a
simple cube structure, to minimize distraction from the steps involved. We will
return to the Storage Design Wizard within the context of managing
partitions, among other considerations, in a subsequent article where more
complex scenarios will arise.
the DBJ_STORDESIGN sample cube.
Storage on the context menu that appears, as shown in Illustration 6.
Illustration 6: Selecting
Design Storage from the Context Menu
Design Wizard Welcome dialog appears, as depicted in Illustration 7.
Illustration 7: Welcome
Dialog - Storage Design Wizard
the type of data storage dialog appears. (Had aggregates already existed
in the cube, an Aggregates already exist dialog would have appeared.)
It is here that we select from the storage modes that we discussed
Click the MOLAP
radio button to select the MOLAP storage mode.
the type of data storage dialog appears as shown in Illustration 8.
Illustration 8: Select
the Type of Data Storage Dialog
from our discussion earlier that the MOLAP option places both the detail
data and the aggregations in the cube. This will be a good selection
for our example, so that we can focus on the further actions of the Storage
Design Wizard. We will devote future articles to the characteristics and
appropriate uses of the ROLAP and HOLAP options.
Aggregation Options dialog appears. Within this dialog, we can exploit
more of the powerful features of the MSAS Storage Design Wizard. Here,
MSAS determines the combination of aggregations that give us the most "return,"
within the cube design we have submitted. The process is easy and needs to be
accomplished only infrequently, providing that it is performed correctly at
appropriate points in the life cycle of a given cube.
Design Wizard applies an 80 / 20 rule within the sophisticated algorithm
that we mentioned earlier in helping us to attain, at least from a preliminary
perspective, the optimal mix of aggregations within our cube structure.
Because all aggregated measures in a cube are derivatives, higher-level
aggregations of measures can be derived, upon demand, from lower level
aggregations. The example that is cited most often is that of an aggregation
that exists at intersects of the Time dimension and another dimension in
take, for example, a Store by Month aggregation. This aggregation can "roll
up" to Store by Quarter, Store by Year and other higher-level
aggregations within MSAS. The benefit of this is that all the higher-level
aggregations do not need to be stored in their "materialized" state,
adding to the overall space requirement in what can be an exponential manner.
Precalculated aggregations need not exist for every rolled up intersect. MSAS
allows for the dynamic generation of these aggregations upon demand - in effect,
they can be maintained as "virtual" aggregations.
On the Set
aggregation options dialog, we can set parameters that affect both aggregation
storage and query performance. In the Aggregation options section of
the dialog, we can mandate restrictions on the total cube size by setting an
upper limit for the amount of space that we feel we can afford to devote to the
cube. The algorithm is then put to work to determine the "best mix"
of aggregations that it can manage in the space we dictate.
leave the Estimated storage reaches selection at the default of 100
MB for this exercise. We will, however, make an adjustment to the next
parameter, Performance gain reaches.
Click the Performance
gain reaches radio button to activate the percentage box to its right.
Type in 20
for the percentage.
The percentage performance gain we type into this box option represents the targeted percentage improvement between the maximum and minimum query times. Twenty is a good starting target, and can be expected, generally, to result in adequate aggregation to ensure a significant increase in performance. Diminishing returns can result from setting the percentage unnecessarily high; the idea here is to attain a good level of balance between the increased disk space required by new aggregations and the level of overall performance.
A third radio button represents another option for getting to the best aggregation mix. With the Until I click stop setting, we can attempt to manually determine the best balance in conjunction with keeping an eye on the Performance vs. Size graph that appears to the right of the dialog. We would, ideally, determine the point at which the increase in performance begins to level off while storage continues to increase, and then stop the process.
As we progress through this and other series, we will focus, from time to time, on the use of the parameters found within the Set aggregate options dialog, along with numerous others, to meet specific tuning objectives, or to offer options for overall improvement in a certain aspect of query or processing performance. For now, let's get a grasp of the operation of the tool.
The Set Aggregation Options dialog, with our settings, appears as depicted in Illustration 9.
Illustration 9: Set Aggregation Options Dialog
The remaining buttons include the following:
- Start: Kicks off the aggregation design process, based upon our settings;
- Continue: Activates once we click Stop, or when the Performance vs. Size graph indicates we have met our storage or performance gain targets, as a means of resuming the design process;
- Stop: Allows us to manually stop the design process;
- Reset: Enables us to delete aggregations added and restart the design process.
To conclude this section, let's proceed with the following steps.
The Next button will activate as soon as the design process finishes. The Set aggregation options dialog, after the process is complete, appears as shown in Illustration 10.
Illustration 10: Set Aggregation Options Dialog, Results Displayed at Lower Right
We see that the Wizard has produced 12 aggregations, and reached the 20% estimated performance gain level, as indicated underneath the Performance vs. Size graph. Pressing Continue (which has become enabled) at this point will result in intermittent incremental increases above 20%, with a leveling off of the curve, so a degree of manual tweaking can be had for minimal additional effort. We will leave the results as they are, however and move ahead.
9. Click Next, to arrive at the Finish the Storage Design Wizard dialog, as shown in Illustration 11.
Illustration 11: Finish the Storage Design Wizard Dialog
We can either save our new definition at this point, or process the cube to create the new aggregations. The definition remains stored until the design is enacted, via the processing cycle.
Let's process the cube, and pay attention to the effects in the resulting Process Log.
10. Leaving the radio button (underneath What do you want to do now?) at its default of Process now, click Next.
Cube processing begins, and runs its course quickly, as witnessed by the Process Log window that next appears. Once processing is finished, we notice the green Processing completed successfully message appears at the bottom of the log window, as depicted in Illustration 12.
Illustration 12: Processing Completed Successfully Message
(Window Partially Collapsed)
We note that processing duration time, together with the various steps of the process, is detailed in the log window. This presents an opportunity, in tuning evolutions, to compare process times between the current log and previous logs (all logs are captured in a database, which we have explored in other articles, and will again examine in subsequent articles). We note, too, that the log states, "cube needs to be processed" on the fourth line from the top of the entries, an indication that the condition was noted as soon as we began processing. Keep in mind that changes planned via the Storage Design Wizard (among numerous other structural changes) require a processing run to be consummated.
11. Click Close to close the log window, once you have examined it.
We leave the Process Log window and the Cube Editor behind, and arrive in Analysis Manager, once more.
12. Delete the DBJ_STORDESIGN cube, if desired, by right-clicking and selecting Delete from the context menu that appears.
13. Select File --> Exit from the Main Menu in the Management Console to close Analysis Manager.
We have seen operation of the Storage Design Wizard from start to finish, for a simple cube. We will revisit the Storage Design Wizard from time to time, specifically within the context of our work with partitions, which allow us to design aggregations differently for separate "sections" of a cube, and within other articles where we can employ it to help us meet various maintenance and optimization objectives.
In this lesson, we introduced the Storage Design Wizard, and emphasized its value as a tool within the important, and often complex, context of storage configuration for MSAS. Our objective was to expose the tool's role in storage and aggregation design, as well as the basic concepts of MSAS storage design in general.
We explored some of the scenarios where we might use the Storage Design Wizard, and discussed the storage types that are available to us in our cube designs. We then practiced using the Storage Design Wizard within the context of a simple cube with no pre-existing aggregations, and exposed the steps involved in increasing performance through basic storage design. We processed the cube to put our changes into effect, discussing processing considerations within the scope of storage design. Finally, we discussed how the Processing Log can be used as a means of ascertaining the effects of our storage and aggregation design upon cube processing performance.
» See All Articles by Columnist William E. Pearson, III
Discuss this article in the MSSQL Server 2000 Analysis Services and MDX Topics Forum.