MDX Essentials: Basic Set Functions: The EXCEPT() Function

Monday Jan 12th 2004 by William Pearson
Share:

Return the difference between two sets using MDX. In this lesson, Bill Pearson explores the EXCEPT() function, and leads practice in putting it to work to meet illustrative business needs.

About the Series ...

This is the fifteenth article of the series, MDX Essentials. The series is designed to provide hands-on application of the fundamentals of the Multidimensional Expressions (MDX) language, with each tutorial progressively adding features designed to meet specific real-world needs.

For more information about the series in general, as well as the software and systems requirements needed for getting the most out of the lessons included, please see the first article, MDX at First Glance: Introduction to MDX Essentials.

Note: Service Pack 3 updates are assumed for MSSQL Server 2000, MSSQL Server 2000 Analysis Services ("MSAS"), and the related Books Online and Samples. Images are from a Windows 2003 Server environment.

What We Accomplished in Our Last Article

In the last article of the series, Basic Set Functions: The Intersect() Function, we explored the commonly used Intersect() function, whose purpose is to return the intersection, or the common members, of two sets. We discussed the Intersect() function in general, and emphasized its capabilities within MDX, and its usefulness within our analysis toolsets.

In addition to discussing the purpose and operation of the Intersect() function, we focused on the treatment of duplicates by the function. We practiced the use of the function in general, then with an example of the addition of the ALL flag, to override the Intersect() function's default duplicate handling. Throughout the multiple-step practice exercise, we discussed the results we obtained with each step's execution, remarking on the distinguishing characteristics of each.

Introduction

In this lesson, we will focus our attention on a function that behaves in a manner that is dramatically different from the INTERSECT() function. Instead of returning the common members of two sets, EXCEPT() returns the difference between two sets. Duplicate handling is managed through the use of a flag, as we shall see, similar to the manner we have exposed for other functions in previous articles.

The EXCEPT() function provides important capabilities within MDX, and is yet another tool that MDX provides us to make access to multidimensional data more efficient, and our coding more concise. Along with an introduction to the EXCEPT() function, this lesson will include:

  • an examination of the syntax surrounding the function;
  • illustrative examples of the uses of the function in practice exercises;
  • a brief discussion of the results datasets we obtain in the practice examples.

The EXCEPT() Function

The EXCEPT() function, according to the Analysis Services Books Online, "finds the difference between two sets, optionally retaining duplicates." We will examine the use of EXCEPT() to manage this, both with and without duplicates, in the sections that follow.

In keeping with the approach we have used in other lessons within the MDX Essentials series, we will examine the syntax for the EXCEPT() function in general, building to a determination of the difference between two sets after creating and running an individual query for each; this will show that the EXCEPT() function does, in fact, generate the results we might expect. Next, we will explore an example that introduces duplication within an EXCEPT() function, and the means that MDX affords us for managing duplication. Consistent with the design of our series, our objective is to gain a richer understanding of the options we have within the EXCEPT() function, as well as an overall perspective of how the function can be used to meet our business needs.

Discussion

The EXCEPT() function allows us to return a set consisting of the difference between a pair of sets upon which it is enacted. The syntax with which we employ EXCEPT() allows us an optional flag, as was the case with the INTERSECT() and UNION() functions, among others; the flag allows us the capability of presenting duplicates that might occur within the sets we are subjecting to the EXCEPT() function. The default for the function (without the flag), again like the INTERSECT() and UNION() functions, is the elimination of duplicates, which would likely be the typical requirement.

Let's look at some syntax illustrations to further clarify the operation of EXCEPT().

Syntax

Syntactically, the sets between which a difference is to be determined by the EXCEPT() function are placed within the parentheses to the right of EXCEPT, and separated by a comma. The optional ALL flag overrides the default behavior of the function, and allows duplicates to remain prior to the determination of the difference. The syntax is shown in the following string:

EXCEPT(Set1, Set2[, ALL])

The EXCEPT() function returns, in effect, whatever appears in the first set, and that does not appear in the second set, upon which the function is being enacted. This might also be expressed as the "data that is not in common between the two sets." Importantly, the two sets must be composed of the same dimension, and must exist at the same level within the dimension, for the EXCEPT() function to work.

While the default behavior of the function dictates that duplicates are eliminated prior to the determination of the difference, use of the optional ALL flag allows duplicate members to remain within the newly produced set, as we shall see in a step in the practice example. Duplicates matching in the first set are discarded, while non-matching duplicates are retained.

The following query contains an example EXCEPT() function. We are selecting the total annual General and Administrative ("G & A") expense for 1998 for the set of all store countries, minus the subset of the Canadian store countries.

SELECT

{Measures.[Amount] } ON COLUMNS,

   { EXCEPT ([Store].[Store Country].Members, {[Store Country].[Canada]} )} 
   
      ON ROWS

FROM 

Budget

WHERE 

   ([Account].[All Account].[Net Income].[Total Expense].[General & Administration],

      [Time].[1998])

This simple query would result in the return of a dataset similar to that depicted in Table 1.

Amount

Mexico

$78,645.84

USA

$ 32,513.40


Table 1: Results of a Simple Query Containing the EXCEPT() Function, with Total G & A Expense for 1998 as the Measure

We will practice the use of the EXCEPT() function in the section that follows. Moreover, we will explore the use of the ALL flag we have discussed, to activate and reinforce the concept of its use in retaining duplicates.

Practice

The Basics

As we have done in previous articles of this series, and to reinforce our understanding of the basics we have covered so far, we will use the EXCEPT() function in a manner that illustrates its operation through a multi-step example. We will begin with the construction of a pair of simple SELECT queries, then build to a combination of the two within an EXCEPT() function, which will itself be housed within a query that we construct to address an illustrative business need. We will then explore the use of the ALL flag to force retainage of duplicates for presentation purposes.

We will call upon our old friend, the MDX Sample Application, once again, to construct and execute the MDX we examine, and to view the results datasets we obtain.

1.  Start the MDX Sample Application.

2.  Clear the top area (the Query pane) of any queries or remnants that might appear.

3.  Ensure that FoodMart 2000 is selected as the database name in the DB box of the toolbar.

4.  Select the Budget cube in the Cube drop-down list box.

Let's assume for our practice example that we have been asked, as MSAS architects for our organization, to present the change, or delta, in the total G & A expense, a measure that is stored in the FoodMart 2000 Budget cube, between years 1997 and 1998, for all stores within two states, California and Washington. We are to present the change as "Annual Delta," the term used by management when reporting from the Budget cube for such year-to-year differences in values, in our ultimate results dataset.

We are aware that our organization operates in three states, California, Washington and Oregon, and, especially since we are excluding only one of the state stores groups (comprising the Oregonian stores) from the final dataset, we see an opportunity to use the EXCEPT() function to achieve our ends. We will first construct a SELECT query that returns the Annual Delta for all three states' stores, and then construct a second SELECT query to present the same data for the Oregonian stores alone.

At this stage, we should easily be able to understand the difference between total G & A expense for all three states, and the corresponding total for just the Oregonian stores. The next step in our practice example will be the creation of a query that performs the retrieval of this difference with an EXCEPT() function. Finally, we will modify the query to demonstrate the use of the optional duplicates flag to see the effect that can be induced via that approach.

As our first objective is to present the total Annual Delta of General & Administration expense, between years 1997 and 1998, for the stores of all three states, we will compose a simple query to select the total value for this set, by taking the following steps.

5.  Type the following query into the Query pane:

-- MDX15-1:  Tutorial Query Step 1

WITH MEMBER

  [Time].[Annual Delta] 

AS 

  '[Time].[1998] - [Time].[1997]'

SELECT

  {Time.[1998] , Time.[1997] , [Time].[Annual Delta] } ON COLUMNS,

  {[Store].[All Stores].[USA].Children} ON ROWS

FROM 

   Budget

WHERE 

  ([Account].[All Account].[Net Income].[Total Expense].

    [General & Administration], Measures.[Amount])

Again, the purpose of this query is to simply generate the first set of the pair of sets upon which we intend to enact the EXCEPT() function, within a final, "consolidated" query.

6.  Execute the query by clicking the Run Query button in the toolbar.

The Results pane is populated by Analysis Services, and the dataset shown in Illustration 1 appears.


Illustration 1: Result Dataset - Initial Half of Set Pair Whose "Difference" is Required

We see the total G & A expense returned for each of years 1997 and 1998, as well as the Annual Delta between the years (created through the use of a calculated member within the WITH MEMBER clause - see Lessons 7, 8, and 9 of my MDX in Analysis Services series, here at DatabaseJournal, for more information), for each state. We obtain confirmation, since we use the .Children function (for more information, see Article Five in this series, MDX Member Functions: The Family Functions) in constructing our rows, that the three states we see are the complete set for the information presented.

7.  Select File -> Save As, name the file MDX15-1, and place it in a meaningful location.

Now let's construct a second SELECT query that retrieves the identical dataset as that above, but this time for the Oregonian stores alone.

8.  Select File --> New from the main menu.

A new, blank query pane appears.

9.  Type the following query into the Query pane:

-- MDX15-2:  Tutorial Query Step 2

WITH MEMBER

  [Time].[Annual Delta] 

AS 

  '[Time].[1998] - [Time].[1997]'

SELECT

  {Time.[1998] , Time.[1997] , [Time].[Annual Delta] } ON COLUMNS,

  {[Store].[OR]} ON ROWS

FROM 

  Budget

WHERE 

  ([Account].[All Account].[Net Income].[Total Expense].

    [General & Administration], Measures.[Amount])

The purpose of this query is to simply generate the "second half" of the pair of sets between which we intend to obtain a "difference" using the EXCEPT() function in yet another query, which we will construct next.



10.  Execute the query by clicking the Run Query button in the toolbar.



The Results pane is populated by Analysis Services, and the dataset shown in Illustration 2 appears.




Illustration 2: Result Dataset - The "Second Half" of the Pair of Candidate Sets



11.  Save the file as MDX15-2.



We have now become familiar with the data populating the two sets upon which we wish to perform our EXCEPT() operation. This should make the operation of the EXCEPT() function clearer in our next step, as the result dataset that we obtain through its use should equal our first dataset after we "subtract" our second dataset.



12.  Select File --> New from the main menu.

A new, blank query pane appears.

13.  Type the following query into the Query pane:

-- MDX15-3:  Tutorial Query Step 3

WITH MEMBER

  [Time].[Annual Delta] 

AS 

  '[Time].[1998] - [Time].[1997]'

SELECT

  {Time.[1998] , Time.[1997] , [Time].[Annual Delta] } ON COLUMNS,

  { EXCEPT

    ([Store].[All Stores].[USA].Children, {[Store].[OR]} )} ON ROWS

FROM 

  Budget

WHERE 

  ([Account].[All Account].[Net Income].[Total Expense].

    [General & Administration], Measures.[Amount])

The purpose of this query is to return the selected data for the difference between the pair of sets we have constructed and examined individually, using the EXCEPT() function.

14.  Execute the query by clicking the Run Query button in the toolbar.

The Results pane is populated, and the dataset shown in Illustration 3 appears.


Illustration 3: The Results of Our Query with the EXCEPT() Function in Place

Finally, the effects of the EXCEPT() function are made clear. We note that, in this simple scenario, the resulting dataset is identical to the results dataset of our first SELECT query, "minus" the results of the second SELECT query. While the actions we have taken might seem not to save us a great deal of effort (after all, we might have simply selected the California and Washington store states as our rows directly, and skipped the EXCEPT() function entirely), we can easily see that in a much larger group of members (ours was a small set of three states), much typing might be saved by simply using a .Children, .Members, or other function, and then "backing out" the members that we choose not to include in our presentation.

15.  Save the file as MDX15-3.

Next, we will examine the operation of the EXCEPT() function from the perspective of how it handles duplicate members in the sets between which it is being asked to return a difference.

Dealing with Duplication

Let's next establish a scenario whereby we can explore the handling of duplicates by the EXCEPT() function. We will accomplish this by creating a query whose objective is to introduce sets into the function that we know to contain a duplicate member, as we have in earlier lessons with other functions offering a similar duplicates flag option.

While our next step may be a bit of a departure from a completely "real world" scenario, the idea is to make the operation of the function clear through the use of an example that can be easily understood. As most of us are aware, we can certainly rely upon the fact that there are many situations in the business environment where dealing with duplicates is a fact of life.

1.  Select File --> New from the main menu.

The blank query pane appears.

2.  Type the following query into the Query pane:

-- MDX15-4:  Tutorial Query Step 4

WITH MEMBER

  [Time].[Annual Delta] 

AS 

  '[Time].[1998] - [Time].[1997]'

SELECT

  {Time.[1998] , Time.[1997] , [Time].[Annual Delta] } ON COLUMNS,

  { EXCEPT

    ({[Store].[All Stores].[USA].Children, [Store].[OR], [Store].[CA], 

      [Store].[WA]} , {[Store].[OR]}, ALL )} ON ROWS

FROM 

  Budget

WHERE 

  ([Account].[All Account].[Net Income].[Total Expense].

    [General & Administration], Measures.[Amount])

The EXCEPT() function is called into action this time to return a difference between two sets where we know duplicate members exist. We are asking that the complete set of the USA store-state children, which includes the states of Oregon, Washington and California, be combined with the sets of the single states of Oregon, Washington and California in the initial set specification. Then we are asking that the resulting set be used in the EXCEPT() function with the set of the single state of Oregon, and that a difference be returned.

Were the ALL flag not in place to override default behavior, elimination would be imposed within both sets prior to the evaluation and return of the difference. We will see an instance of the default behavior later.

Were the ALL flag not in place to override default behavior, elimination would be imposed within both sets prior to the evaluation and return of the difference. We will see an instance of the default behavior later.

3.  Execute the query by clicking the Run Query button in the toolbar.

The Results pane is populated, and the dataset shown in Illustration 4 appears.


Illustration 4: Query with EXCEPT() Function Results - Duplicates Retained Flag in Place

4.  Save the file as MDX15-4.

The behavior of the EXCEPT() function, with the ALL flag in place, can be explained in this way. Because it eliminates matching duplicates in first set first (that is, duplicates in the first set that match a member in the second set), Oregon (the only member in the second set that matches a pair of duplicates in the first set, is eliminated from the outset.

The non-matching duplicates (that is, duplicates in the first set that have no match in the second set) are retained. In the present case, that means that Washington and California, which exist as duplicates in the first set (they are included in [Store].[All Stores].[USA].Children, as well as appearing as [Store].[WA] and [Store].[CA], respectively), and are unmatched in the second set (where only [Store].[OR] appears), are retained as duplicates. Hence we see two of each of [Store].[WA] and [Store].[CA] appearing in the result dataset.

More succinctly, the rule for retaining duplicates is that matching duplicates in the first set are eliminated, and nonmatching duplicates are retained.

We can prove this to ourselves easily by making changes to the duplicates and re-executing the query, until we grasp the results that we obtain in each case. In addition, the remaining steps, which show the default operation of the EXCEPT() function, will demonstrate that duplicates are, indeed, eliminated when the default behavior is not overridden by the optional override flag.

5.  Remove the word "ALL" (along with the comma that precedes it) from the second set specification within the EXCEPT() function, as shown below:

{ EXCEPT

  ({[Store].[All Stores].[USA].Children, [Store].[OR], [Store].[CA], 

    [Store].[WA]} , {[Store].[OR]}, ALL )} ON ROWS

So that the EXCEPT() function appears as below:

{ EXCEPT

  ({[Store].[All Stores].[USA].Children, [Store].[OR], [Store].[CA], 

    [Store].[WA]} , {[Store].[OR]} )} ON ROWS

6.  Modify the comment line to read:

-- MDX15-5:  Tutorial Query Step 5

The Query pane appears as shown in Illustration 5.


Illustration 5: The Query with Modifications Circled

7.  Execute the query by clicking the Run Query button in the toolbar.

The Results pane is populated, and the dataset shown in Illustration 6 appears. We see clearly that the removal of the overriding flag results in the elimination of duplicates that were retained in our last step.


Illustration 6: EXCEPT() Function Results - Duplicates are Eliminated by Default

8.  Save the file as MDX15-5.

We have thus explored the use of the ALL flag within the EXCEPT() function to allow us to retain duplicates in a simple example. We have also exposed clearly the default behavior of the function, which is to eliminate the duplicates.

In conclusion, we can see that the EXCEPT() function provides us the significant capability of presenting the difference between two sets. Our use of the optional ALL flag within the syntax, can also afford us flexibility in the handling of duplicates in queries we construct that use this function - flexibility that can be paramount in meeting the needs of organizational information consumers.

Summary ...

In this lesson, we explored the useful EXCEPT() function, whose purpose is to return the difference between two sets. We have noted that, like the INTERSECT() function and other functions that we have explored in previous lessons, EXCEPT() allows us flexibility in duplicate handling, which is managed through the use of a flag which can be used to override the default elimination of duplicates.

The EXCEPT() function provides important capabilities within MDX, and is yet another means that MDX provides us to more efficiently access multidimensional data, as well as to write more concise code. In addition to discussing the purpose and operation of the EXCEPT() function, we focused on the treatment of duplicates by the function. We practiced the use of the function in general, through a multi-step practice example, then explored an additional example where we practiced the addition of the ALL flag, to override the EXCEPT() function's default duplicate handling. Throughout the multiple-step practice exercise, we discussed the results we obtained with each step's execution, remarking on the distinguishing characteristics of each.

» See All Articles by Columnist William E. Pearson, III

Discuss this article in the MSSQL Server 2000 Analysis Services and MDX Topics Forum.

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved