Introduction to MSSQL Server 2000 Analysis Services: Partitioning a Cube in Analysis Services - An Introduction

Monday Oct 11th 2004 by William Pearson
Share:

Divide a logical cube into separate "storage "containers" for data storage flexibility and optimal query performance. Bill Pearson introduces the concepts with a hands-on practice session.

About the Series ...

This article is a member of the series Introduction to MSSQL Server 2000 Analysis Services. The series is designed to provide hands-on application of the fundamentals of MS SQL Server 2000 Analysis Services, with each installment progressively adding features and techniques designed to meet specific real-world needs. For more information on the series, as well as the hardware / software requirements to prepare for the exercises we will undertake, please see my initial article, Creating Our First Cube.

Note: Service Pack 3 updates are assumed for MSSQL Server 2000, MSSQL Server 2000 Analysis Services, and the related Books Online and Samples. Images are from a Windows 2003 Server environment, upon which I have also implemented MS Office 2003, but the steps performed in the articles, together with the views that result, will be quite similar within any environment that supports MSSQL Server 2000 and MSSQL Server 2000 Analysis Services ("Analysis Services" or "MSAS"). The same is generally true, except where differences are specifically noted, when MS Office 2000 and above are used in the environment, in cases where MS Office components are presented in the article.

Overview

We have touched upon partitions over the life of the Introduction to MSSQL Server 2000 Analysis Services series, as well as within other series at Database Journal. We recently discussed partitioning more specifically within our article Basic Storage Design, within which we introduced the Storage Design Wizard. The Storage Design Wizard, as we discovered, enables us to manage aggregations on a partition-by-partition basis when working with a multi-partitioned cube. We noted that, if a cube we are optimizing through the use of the Storage Design Wizard contains multiple partitions, we are forced to select a partition from the outset, as we can only design storage for a single partition at a time.

In this article, we will introduce the MSAS Partition Wizard, whose role is to enable us to create and modify partitions to optimize the query performance of our cubes. We will first discuss partitioning as a concept, within the context of MSAS cubes, and then we will perform a hands-on exercise where we partition a copy of the Budget cube, one of the sample cubes that accompany the installation of MSAS. Our objective in this article is an overview of the wizard itself; later articles will focus on detailed partitioning strategy and techniques.

Within our exploration of the Partition Wizard, we will accomplish the following:

  • Create copy of the Budget sample cube for use in our practice exercise;
  • Use the Partition Wizard, to create partitions for our practice cube;
  • Examine the options that are available to us, as we proceed through the guided steps of the Partition Wizard.

Partition a Cube in Analysis Services

Objective and Business Scenario

Partitions allow us to divide a logical cube into separate "storage containers" for our data and its aggregations. All MSAS cubes have at least one partition, which is created along with the cube automatically. Additional partitions are not a requirement, and, indeed, many cubes perform well enough to meet our business needs with only the partition with which they were originally created. Nevertheless, very large cubes can benefit dramatically from a well-conceived partitioning strategy, whose benefits include enlarged flexibility in data storage and data source location, as well as enhanced query performance.

The creation of separate physical partitions means that we can design cube storage around our expected uses, particularly our expected access frequency, of the data in the partitions we design. An example of a common partitioning strategy that I see on client engagements includes the creation of one partition for each quarter of the current operating year, together with a separate partition for each prior year.

Because cube querying, and reporting in general, is typically at its highest frequency in the current year and its constituent quarters, we would likely want to design the "current" partitions with MOLAP storage, along with the aggregations required to deliver rapid query results. Prior years' partitions would receive perhaps less aggregation, because the further in the past a given year from the current year, the less routine demand we would anticipate for more than summary information. The need for access to the partitions in general would decrease with age, so the storage design strategy for those years' data might become more focused upon size conservation, and less upon query performance.

A cube containing financial information can, therefore, contain a partition for the data of each past year, and also partitions for each quarter of the current year. At the end of the year, the four quarterly partitions can be merged into a single partition for the year.

The "separate container" concept means each partition can have different combinations of options, enabling us to design cube storage strategies appropriate to our unique business requirements. These combinations include:

  • source data location;
  • aggregation data location;
  • storage mode;
  • aggregation design.

Many variations with regard to partition data sources are possible; we will investigate some of the alternatives in subsequent articles. We can store partition aggregate data on the Analysis server computer where the partition is defined (the default) or on another Analysis server computer, as a remote partition. Storage mode determines whether the partition's aggregate data is stored in a database specified in the partition's data source (ROLAP), stored completely within a cube on an Analysis server computer (MOLAP) or stored within a custom combination of the two (HOLAP).

Aggregation design, which specifies the number and contents of the aggregations created for the partition, can be different for each partition. As we learned in Basic Storage Design, we can tailor a partition's aggregation design, defining the balance between query performance and storage utilization, with the Storage Design Wizard. Moreover, with the Usage-Based Optimization Wizard, we can perform these same actions, as well as optimizing the aggregation design based on queries (which we can select) that the organization's information consumers have previously executed against the partition's cube. As we noted in our articles surrounding Storage Design and Usage-Based Optimization, the aggregations that we design in either case are actually created as a part of the subsequent cube processing cycle.

Considerations and Comments

For purposes of this exercise, we will prepare a copy of Budget cube, one of the sample cubes within the FoodMart 2000 sample database that accompanies the installation of MSAS. The "clone" will allow us to leave the original sample cube in its existing condition, as we might have saved various settings, structures, and so forth, for referential or other reasons. There will therefore be no need to remember to return and remove settings that we modify for purposes of the lesson, or otherwise restore the original sample to its previous state.

While the cloning process is simple, ensure that you have the authority, access and privileges needed to accomplish the process, and that the copy of an existing MSAS cube (which we will, of course, rename) presents no other issues in your environment. After the session, the clone we have created can be deleted or used for another purpose, whatever is convenient.

If the sample database / Budget cube was not installed, or was removed prior to your beginning this article, please see the MSAS documentation, including the Books Online, for the straightforward procedure to restore the database from the archive (.cab) file containing the samples. As of this writing, a copy of the archive can be obtained from the installation CD or via download from the appropriate Microsoft site(s).

Hands-On Procedure

Preparation

Let's get started by creating a clone of the Budget FoodMart sample cube, as we mentioned in the section above, in case we want to keep our originals intact for other uses.

NOTE: If you are performing the steps of our article on a production (or other) cube, instead of the sample cube we will create for the purposes of illustrating the partitioning process, the preparation steps in the first couple of sections below can be skipped.

1.  Open Analysis Manager, beginning at the Start menu.

2.  Expand the Analysis Servers folder by clicking the "+" sign to its immediate left.

Our server(s) appear.

3.  Expand the desired server.

Our database(s) appear, in much the same manner as shown in Illustration 1.


Illustration 1: A Sample Set of Databases Displayed within Analysis Manager

4.  Expand the FoodMart2000 database.

5.  Expand the Cubes folder.

The sample cubes appear, as shown in Illustration 2.


Illustration 2: The Sample Cubes in the FoodMart2000 Database

NOTE: Your local databases / cube tree will differ, and obviously depend upon the activities you have performed since the installation of MSAS (and the simultaneous creation of the original set of sample cubes). Should you want or need to restore the cubes to their original state, simply restore the database under consideration. For instructions, see the MSSQL Server 2000 Books Online.

6.  Right-click on the Budget sample cube.

Again, we are making a copy of the Budget cube, because our lesson will involve making changes to the cube we use within the practice example. As we have noted, working with the copy will allow us to maintain our existing sample cube in its current condition.

7.  Select Copy from the context menu that appears.

8.  Right-click on the Cubes folder.

9.  Select Paste from the context menu that appears.

The Duplicate Name dialog appears.

As noted in previous articles, we cannot have two cubes of the same name in a given MSAS database.

10.  Type the following into the Name box of the Duplicate Name dialog:

DBJ_Budget

The Duplicate Name dialog appears, with our modification, as depicted in Illustration 3.


Illustration 3: The Duplicate Name Dialog, with New Name

TIP: As I have mentioned elsewhere in this and other series, the foregoing is an excellent way of renaming a cube, (a "rename" capability is not available here, as it is in many Windows applications). Simply create a duplicate, give it the name to which you wish to rename the old cube, and then delete the old cube, as appropriate (although not in this case, of course). This also works for MSAS databases.

11.  Click OK to save the name change.

The new cube appears in the cube tree, among those already in place. We now have a copy of the Budget cube, DBJ_Budget , upon which we can work with the Partition Wizard.

Procedure

We can begin partitioning of our new cube with a few basic steps.

1.  Expand the new DBJ_Budget cube.

2.  Right-click the Partitions folder that appears within the Cube object.

The context menu appears.

3.  Click New Partition, as shown in Illustration 4.

Click for larger image

Illustration 4: Select New Partition

The Welcome dialog appears, as depicted in Illustration 5.

Click for larger image

Illustration 5: Welcome Dialog - Partition Wizard

4.  Click Next.

The Specify the data source and fact table dialog appears, as shown in Illustration 6.


Illustration 6: Specify the Data Source and Fact Table Dialog - Partition Wizard

We will leave this dialog at default, although we might have used it to specify a different data source to be used to define our partition. We will take advantage of this option in partitioning, and in an identical option for performing incremental updates, in subsequent articles. For now, let's continue with our exploration of the basic partitioning process.

5.  Click Next.

The Select the data slice dialog appears. Here we can designate the portion of our fact table upon which our partition is based. This dialog makes specification of the slice easy, by offering a selection of the dimensional members (the slice can be based upon a member, from any level of any available dimension) that are available within our cube's structure. A filter is created, based upon our selection in this dialog, which restricts the rows in the data source that Analysis Services retrieves during cube builds. We can also remove slices here, to modify an existing arrangement we have in place, as the need arises.

6.  Click Time in the Dimensions box (left side of the dialog).

The Members of the Time dimension appear in the Members box, to the right of the dialog.

7.  Click 1997 to select it.

NOTE: We see the Quarter levels if we expand 1997 with the "+" sign to the left, although we wish only to select 1997 at present.

The Select the data slice dialog appears, with our selections, as shown in Illustration 7.


Illustration 7: Select the Data Slice Dialog - Partition Wizard (Year 1997 Expanded)

8.  Click Next.

The Specify the partition type dialog appears. Here we can choose between a local (exists on the same server as the parent cube) or a remote partition type. This opportunity to specify the physical placement of the partition, to which we alluded in our introductory comments, can only be set at its creation. For purposes of our exercise, we will leave the dialog at the default selection of Local.

The Specify the partition type dialog appears as depicted in Illustration 8.


Illustration 8: Specify the Partition Type Dialog - Partition Wizard

9.  Click Next.

The Finish the Partition Wizard dialog appears. Here we can name the partition, and can make provision for its storage design. In our example, where we are creating partitions to segregate years, it is likely that, in the future (say, in 2000), our partitioning strategy might include the merging of years beyond the current and its immediate predecessor, which are by far the most frequently accessed for financial reporting purposes.

10.  Type the following into the Partition name box:

Budget_1997

Merging 1997 and 1998 (the reciprocal partition) together requires that the storage design for both be identical. Another convenience that is provided in the Finish the Partition Wizard dialog is the opportunity to design storage for the new partition in the image of the existing partition (remember that all cubes have at least one), through the use of the Copy the aggregation design from an existing partition radio button.

11.  Select the radio button to the immediate left of Copy the aggregation design from an existing partition, in the What do you want to do? section of the dialog.

The Copy from selector that accompanies the Copy the aggregation design from an existing partition option becomes enabled.

12.  Ensure that DBJ_Budget appears in the Copy from selector.

13.  Click the Process the partition when finished check box.

Another option we have at this stage is accessed via the Advanced button, appearing in the lower right of the dialog. Here, we can use the WHERE clause of an SQL statement to further filter our partition. We will revisit this option in subsequent articles.

The Finish the Partition Wizard dialog appears as depicted in Illustration 9.


Illustration 9: The Finish the Partition Wizard Dialog (with Our Settings Circled)

14.  Click Finish.

Processing begins, and we are soon greeted with indication of successful completion via the Process log viewer. The log viewer appears as shown in Illustration 10.


Illustration 10: The Process Log Viewer Indicates Successful Completion

15.  Click Close to close the Process log viewer.

The new partition, together with the pre-existing one, appears in the cube tree, as depicted in Illustration 11.


Illustration 11: The New Partition Appears ...

16.  Discard the DBJ_Budget cube, as desired.

(Simply right-click, and select the Delete option from the context menu that appears, to delete the cube clone.)

17.  Select File -` Exit, when ready to close Analysis Manager.

We have thus practiced the basic steps of using another useful performance tuning tool, the Partition Wizard. As more data becomes available for analysis, partitions (for quarters or months, in our simple scenario) could potentially be created to meet the needs of the organization in an optimal way. Those requirements, and the frequency with which they would need to be met, would dictate the best strategy for storage and processing optimization through the use of partitions. Once the partitions were created, aggregations and other considerations could be designed for each, with the only limitation being mandated by the likelihood of future merges, as we suggested in our practice exercise.

Summary...

In this lesson we introduced basic partitioning, another tool in our MSAS cube design and optimization toolset. As its name implies, the Partition Wizard provides us an easy means of creating and modifying partitions within our cubes. We first discussed partitioning as a concept, within the context of MSAS cubes, and then we began a hands-on exercise where we created a partition upon a copy of the Budget sample cube. As we proceeded through the guided steps of the Partition Wizard, we examined the options that are available to us, commenting upon general properties and optimization considerations along the way.

» See All Articles by Columnist William E. Pearson, III

Discuss this article in the MSSQL Server 2000 Analysis Services and MDX Topics Forum.

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved