Using Sets in MDX Queries

Monday Aug 18th 2003 by William Pearson
Share:

Learn the fundamentals of an MDX query as we explore operations with Sets. Author Bill Pearson presents a practical overview of basic MDX queries and explores MDX sets in this high impact article.

About the Series ...

This is the sixth tutorial article of the series, MDX in Analysis Services. The series is designed to provide hands-on application of the fundamentals of MDX from the perspective of MS SQL Server 2000 Analysis Services ("Analysis Services,"); our primary focus is the manipulation of multidimensional data sources, using MDX expressions in a variety of scenarios designed to meet real-world business intelligence needs.

For more information on the series, as well as the hardware / software requirements to prepare for the tutorials we will undertake, please see Tutorial 1: MDX Concepts and Navigation.

Note: At the time of writing, Service Pack 3 updates are assumed for MSSQL Server 2000, MSSQL Server 2000 Analysis Services, and the related Books Online and Samples.

Introduction

In our last tutorial, we expanded upon the topics we introduced in our previous lesson, Retrieve Data from Multiple Cubes. Introducing the concept of time within the context of expression design, we worked through examples of providing support for time-based analysis, such as the quantification of change over time in values, with MDX functions designed for that purpose.

In this lesson, we will begin an exploration of MDX Sets. We will need to have a sound basis in MDX query fundamentals to do so, so we will examine the makeup and operation of MDX queries as a preparatory step. We will then move into our examination of MDX Sets. We will:

  • Discuss the differences between MDX expressions and MDX query statements;
  • Create rudimentary MDX queries to gain an understanding of their components and uses;
  • Explore MDX sets, their functional uses, and the steps involved in their creation;
  • Practice the use of set functions as a part of working with MDX sets.

Introducing MDX Queries

The first order of business in this tutorial will be to gain a good understanding of MDX query statements, and to understand how they differ from the multidimensional expressions with which we have worked through our last lesson. As we have noted in past tutorials, an expression calculates a single value. An expression can be used within various applications, among them Excel PivotTable Reports or an Office PivotTable List, to retrieve values from an OLAP data source. In contrast, a query is used to extract multiple values from a cube. Queries underlie the reporting that we might do in the PivotTable components, and are often largely invisible to us when we develop our reports visually with these and other tools.

We often juxtapose numerous expressions, and the values they retrieve, to obtain a multidimensional value for analysis. We will learn to create increasingly sophisticated MDX queries in this and prospective tutorials, and will be able to understand the underlying query components that are generated in the background for us when we create MDX expressions with visual tools. Knowing how to create queries will equip us to retrieve values to support the immediate reporting requirements of our organizations' information consumers, as well as to maintain those queries to meet changing demands, through the precise specification of dimensional criteria.

Key Concepts and Terminology

We will begin our exploration of MDX queries by emphasizing the concepts involved in controlling the values that we extract from our cubes within precisely defined levels of specific dimensions. Recall from previous lessons that a cube is a conceptually multidimensional structure; the intersect points / intersections of the dimensions are where data reside, in single or multiple elements called measures.

As we also discussed in previous lessons, MDX uses a reference system involving the tuples concept to identify and extract data, whether it be data in a single cell or a block of cells. Tuples list dimensions and their members (which include levels) to "address" individual cells, as well as sections of cells, within the cube, and, because any given cell is an intersection of all the dimensions in the cube, tuples can be used to uniquely identify every cell in the cube. As a means of reference, measures themselves are treated as a special sort of dimension, named Measures within Analysis Services.

Because tuples uniquely identify sections of the cube, based upon the dimensional intersections that define the section's "address," they have no need to refer to any specific cell or cells in doing so. They represent subsets of the multiple dimensions of the cube, and provide slices that encompass more than one cell.

We will work with ordered groups of tuples, referred to as sets, later in the lesson. Common examples of set usage include axis dimensions (the dimensions and members to be returned, specified in the SELECT statement in the query, as we will see) and slicer dimensions (the specific dimension and member criteria to which the returned data is restricted, used by the WHERE statement, as we will discover). The axis dimension exists to return data for multiple members, while the slicer dimension is used to return data for a single member. The axis and slicer dimensions work in conjunction to define, in terms of the source cube (identified in the FROM clause of the query), the subset of dimensions that make up the result cube, which itself can be composed of multiple dimensions. More than one dimension can reside on either of the two axes, as we shall soon see, and any given dimension can exist on either axis (but never both). This fact, coupled with the capabilities to move the dimensional members about at will within our reports, as well as with the capability to drill down and zoom on dimensional member levels, are what make multidimensional analysis possible in our reporting efforts.

Understanding the Metadata

As many of us are aware, MSSQL Server 2000 comes equipped with a tool called the Query Analyzer, which allows us to input a SQL statement, execute it, and see the results from a couple of different views. Similarly (in a simple sense), Analysis Services provides this capability via a sample application that is installed with the typical installation of the package. Also similar to its MSSQL Server counterpart, the MDX Sample Application provides a Metadata pane that outfits us with a means of interacting with cube objects visually. We can thus easily work with MDX expressions and build MDX queries. As we progress in our exploration of MDX queries, we will rely upon the MDX Sample Application to make writing queries easier, as well as to help us understand the structure of the information that our cube contains. This can be especially useful as a means of grasping the structure of a cube with which we are not familiar, and / or with which we had no design involvement.

In the Books Online, which are bundled with any typical installation of MSSQL Server 2000 / Analysis Services, or which can be accessed on the CD from which the installation is performed, Microsoft defines metadata as information about data, (or data about data, hence the term "metadata." ). Metadata concerns itself with data and its properties, such as data type (text, numeric, etc.), or column sizes. Metadata also comprises information surrounding data structures, or information concerned with the design or makeup of objects, such as cubes, dimensions, levels, and so forth. Analysis Services provides the MDX Sample Application to provide a vehicle from which new users can issue MDX queries against an Analysis server, as well as to illustrate (along with the underlying source code that is freely available) a working example from which developers can understand how to create custom applications for querying with MDX.

Sets in MDX Queries

We will begin our hands-on exposure to MDX by taking advantage of the Sample Application to initially generate a simple query, then to move into queries that are more sophisticated. In this way, we will gain an understanding of MDX queries "from the ground up" in an efficient manner, expanding our grasp of both MDX and our cube's metadata simultaneously.

Basic Query Components

Let's begin by creating a rudimentary query using the Sample Application. We will begin by taking the following steps:

1. Go to the Start button on the PC, then navigate to Microsoft SQL Server --> Analysis Services, then to the MDX Sample Application.

We are initially greeted by the Connect dialog, shown in Illustration 1.


Illustration 1: The Connect Dialog for the MDX Sample Application

The illustration above depicts the name of my server, MOTHER, and properly indicates that we will be connecting via the MSOLAP provider (the default).

2. Click OK.

(We might also choose to cancel the dialog box and connect later by clicking Connect on the File menu.)

The MDX Sample Application window appears.

3. Clear the top area (the Query pane) of any remnants of queries that might appear.

4. Ensure that FoodMart 2000 is selected as the database name in the DB box of the toolbar.

5. Select the Warehouse cube in the Cube drop-down list box.

The MDX Sample Application window should resemble that shown in Illustration 2, complete with the information from the Warehouse cube displaying in the Metadata tree (left section of the Metadata pane).


Illustration 2: The MDX Sample Application Window (Compressed)

Many similarities exist between the arrangements of objects in the Metadata pane and the Calculated Member Builder, which we have explored in earlier lessons of this series. We will discuss various attributes of the MDX Sample Application when they are relevant to the exercises we undertake; it might prove highly useful to explore the Books Online for a wealth of detail about the application.

6. Type the following query into the Query pane:

SELECT 
FROM Warehouse

7. Click the Run Query button on the toolbar (the button sporting the green arrowhead-shaped icon - a tool tip will alight when the cursor is placed upon the button to positively identify it for us).

We see the number 102,278.41 appear in the in the Results pane. This amount represents the "grand total" of all measures in the Warehouse cube, because the query we have input is simple, and specifies no members from any dimensions within the cube. We are thus awarded the grand total, because the query supplies the default member for each dimension; this is standard behavior when no specifics are stipulated in the query.

8. Save the query by selecting File -> Save As (the Save Query File button on the toolbar will work equally well), and call the file MXAS06-1.

9. Click Save to save the file, after navigating to a meaningful location of your choice.

Note: I typically prefer to save files to a context-oriented directory / folder (for example, a folder I have created for a client for whom I am writing MDX queries as a part of an engagement, or for a class I am teaching). This is obviously a point of personal taste; our objective is to keep track of where the queries are so that we can find them in time of need. Much rewriting and confusion between altered versions can be avoided by storing the queries in a logical system of some sort to keep organized. My favorite way to do this is to create a database within which to store the query strings, together with descriptions, author and keyword information, along with creation / modification datetime data and "version" information, if applicable.

Working with Sets in Our Queries

We can generate queries that are far more sophisticated by adding sets of members to one or both of the axes. Let's explore this, with the following steps:

10. Select Query -> New from the top menu to set up a new Query pane.

11. Type in the following query once again, for starters:

SELECT 

FROM Warehouse

NOTE: I often intentionally add generous space between rows, and between components of the coding, in my lessons. This is to make the coding easier to follow.

12. Placing the cursor behind word SELECT, type in one space.

13. Expand (by clicking on the "+" sign to its left) the Set folder in the Syntax Examples list on the right side of the Metadata pane.

14. Select the Members-Level function by double-clicking.

The function appears, amid our initial query, in the Query pane, as shown in Illustration 3.


Illustration 3: Building the Query: Inserting a Set Function

15. Click the <<Level>> token once to highlight it.

16. Expand the Store dimension in the Metadata tree to expose its member levels.

17. Double-click the Store State level, within the expanded Stores dimension.

18. Type another space just behind the .Members function of the top row.

19. Type the words ON ROWS.

Now we will arrange the query a bit, as we build it further.

20. Click to the left of [Store], just after SELECT.

21. Press the ENTER key.

22. Place the cursor behind the SELECT statement.

23. Press the ENTER key again, to create a blank row under SELECT.

24. Expand the Set folder in the Syntax Examples list, once again, if necessary.

25. Double-click the Children function.

The function populates the newly created row.

At this stage, the statement should duplicate the following:

SELECT

<<Member>>.Children

[Store].[Store State].Members ON ROWS

FROM Warehouse

26. Select the <<Member>> token with a single-click.

27. Double-click the Store Type dimension in the Metadata tree to select it into the <<Member>> token.

Our statement should now appear as below:


SELECT

[Store Type].Children

[Store].[Store State].Members ON ROWS

FROM Warehouse

28. Type in a space, then the words ON COLUMNS after Children in the above statement.

29. Add a comma (",") immediately after the word COLUMNS.

Our statement now appears as follows:


SELECT

[Store Type].Children ON COLUMNS,

[Store].[Store State].Members ON ROWS

FROM Warehouse

The ON COLUMNS addition specifies that the second set be added to the column axis. Anytime that we create multiple axes, as we are doing within this step of our example, we are required to designate the respective axis with which each such string is associated. Order in the query is not as important as the fact that the designation itself is made. (A single axis need not be defined, as it defaults to Column axis, similarly to the case we observed in the first query in this lesson).

30. Select Query --> Run from the main menu.

We obtain the results shown in Illustration 4 below.


Illustration 4: The Results of the Two-Axis Query

31. Save the query as MXAS06-2.MDX.

In addition to using other members' functions (explore these in the Set folder of the Syntax Examples list as you find time), we can make a simple alteration to our query statement to cause it to return the entire membership of the Store dimension.

32. Remove "[Store State]." from the query (including the "." delimiter that follows the right bracket) from the query we have designed.

Our modified statement appears in the Query pane as shown in Illustration 5. (I have created a composite picture, removing many of the empty rows, to conserve space.)


Illustration 5: The Modified Query

33. Click the Run Query button to execute the query.

We obtain the results partially shown in Illustration 6 (scrolling to the appropriate rows):


Illustration 6: The Store Dimension's Entire Population is Retrieved (Composite View)

34. Save the query as MXAS06-3.MDX.

Specifying Members in a Set

As we see from the fruits of our labor so far, we can assign sets to axes with relatively little programming effort. This is particularly true when we can rely upon many of the default member settings to "fill in the blanks," with regard to the requirement to supply "addresses" for the values we seek to retrieve, in terms of all dimensions in the cube. However, to approach the content of a report that leverages multi-dimensional concepts to produce denser information in a useful form, we must go further than this. We must be able to specify individual members, for purposes of extracting precise information from the wealth of cube data, as well as to be capable of combining multiple dimensions on a single axis. MDX rises to the occasion, and supports our need for precise control of the presentation quite well.

Let's explore meeting these further requirements, with the following steps:

35. Start a new query using the New Query button.

36. Type the following into the Query pane:

SELECT 

{[Measures].[Warehouse Cost]} ON COLUMNS

FROM Warehouse

We enclose the set we want to retrieve in braces ({}). The braces inform the application that the enclosed member(s) represent a set.

The results we obtain should be a single "grand total" for Warehouse Cost, and should duplicate those shown in Illustration 7.


Illustration 7: the Grand Total of the Store Cost Measure for the Cube

The focus here is the selection of an individual measure for placement on the column axis. Our enclosing the measure in braces defines its status as a set, as we have noted, which is a requirement for anything that is placed on an axis within that status.

Now let's say that our business requirement changes: we are asked to display not only the Warehouse Cost value, but two additional values, Warehouse Sales and Units Shipped. We will deliver these tuples with the following enhancement to our query.

37. Add the two additional measures, separated by commas and spaces, to the query as follows:


SELECT 

{[Measures].[Warehouse Cost],[Measures].[Warehouse Sales],
[Measures].[Units Shipped]} ON COLUMNS

FROM Warehouse

38. Click the Run Query button.

The results appear as shown in Illustration 8.


Illustration 8: Three Tuples Yield Three Measure Columns

Our set now consists of three tuples. We have thus presented a multiple member set, identifying those members specifically in our query.



Using the CrossJoin Function to Nest Multidimensional Sets



Now, let's nest multiple dimensions into the row axis, to demonstrate a useful approach to meeting business needs in this way, if required. We can do this with the CrossJoin() function.



39. Modify the query to match the following:


SELECT 

{[Measures].[Warehouse Cost] , [Measures].[Warehouse Sales], 
[Measures].[Units Shipped]} ON COLUMNS,

[Store].[Store State].Members ON ROWS

FROM Warehouse

Here we are simply juxtaposing our last statement, where we are specifying the multiple member set for the column axis, with a statement using a set function (.Members) to specify the row axis. Because we are using a set function, note that we can leave off the braces.

40. Click the Run Query button.

The results appear as shown in Illustration 9.


Illustration 9: The Result Set, Specifying Both Axes

Now, let's say we need to nest two sets in the row axis set. We will do this with the CrossJoin() function we discussed above.

41. Modify the query to add the CrossJoin() function, just before the ON ROWS instruction, as shown below (the CrossJoin() function appears in blue):


SELECT 

{[Measures].[Warehouse Cost] , [Measures].[Warehouse Sales], 
[Measures].[Units Shipped]} ON COLUMNS,

CrossJoin([Store].[Store State].Members, [Product].[Product Family].Members) ON ROWS

FROM Warehouse

42. Click the Run Query button to view the effect of the changes (scroll to the USA Store State rows).

The results appear as partially depicted in Illustration 10.


Illustration 10: The Result Set, With a Little Help from the CrossJoin Function

The CrossJoin() function has enabled us to juxtapose two sets, from two dimensions, as if it were one combined set, on the row axis. This produces the nesting of the Product Family level members into the Store State levels that we see above, and gives us a practical preview of the power of the CrossJoin() function.

Now, let's go one step further for the sake of finesse, and remove the many empty spaces in our new report. This is a simple matter of replacing the CrossJoin function with a derivative function, the NonEmptyCrossJoin() function, as shown below (I have highlighted the only change over the last query in blue):



SELECT 

{[Measures].[Warehouse Cost] , [Measures].[Warehouse Sales], 
[Measures].[Units Shipped]} ON COLUMNS,

NonEmptyCrossJoin([Store].[Store State].Members, [Product].[Product Family].Members) ON ROWS

FROM Warehouse

43. Modify the query to reflect the change in the function name noted above.

44. Click the Run Query button.

The results appear as depicted in Illustration 11.


45. Save the query as MXAS06-4.MDX, and exit the Sample Application as desired.

We now have a report, albeit simple in structure that hints at the potential available to us within MDX for multidimensional analysis. We have used the CrossJoin() function to create a report that allows us to simultaneously analyze multiple dimensions, as well as to nest two of these dimensions on one axis (in our case the Row axis) as part of the objective.

Next in Our Series ...

In this lesson, we began an exploration of MDX queries. We discussed the differences between MDX expressions and MDX query statements, and as part of this overview, discussed key MDX concepts and terminology. We followed our overview with an exploration of MDX query building from the ground up, using the MDX Sample Application as a vehicle for crafting our statements and practicing their use.

We delved into set functions, and the creation and use of sets, discussing their importance in our MDX queries. We progressively built the specification of members, and the combination of multiple dimensions, into our row and column axes, to add truly multidimensional capabilities to the reports we produce for ourselves and for the information consumers we support.

Our next topic will span a three-article set. In these three lessons, we will focus on building and using calculated members within our queries to meet various business needs. We will begin by previewing the creation of dynamic calculated members to set the stage for the functionality and processes we will explore together.

Throughout the Building Calculated Members articles, we will explore and practice the creation of increasingly sophisticated calculated members, discussing various types and their uses. We will further transit into the realm of using functions within our calculated members, practicing different uses for the components we have explored. Finally, we will preview member functions, in preparation for lessons to follow Building Calculated Members.

» See All Articles by Columnist William E. Pearson, III

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved