MDX Essentials: Basic Set Functions: The Intersect() Function

Monday Dec 8th 2003 by William Pearson
Share:

Return the intersection of two sets using MDX. In this lesson, author Bill Pearson explores the Intersect() function and offers practice in its operation.

About the Series ...

This is the fourteenth article of the series, MDX Essentials. The series is designed to provide hands-on application of the fundamentals of the Multidimensional Expressions (MDX) language, with each tutorial progressively adding features designed to meet specific real-world needs.

For more information about the series in general, as well as the software and systems requirements needed for getting the most out of the lessons included, please see the first article, MDX at First Glance: Introduction to MDX Essentials.

Note: Service Pack 3 updates are assumed for MSSQL Server 2000, MSSQL Server 2000 Analysis Services, and the related Books Online and Samples.

What We Accomplished in our Last Article

In the last article of the series, Basic Set Functions: The Union() Function, we explored the commonly used Union() function, whose purpose is to combine two sets into one. We discussed the function in general, along with the capabilities it affords us within MDX, emphasizing its value within our analysis toolsets.

In addition to discussing the purpose and operation of the Union() function, we examined both primary and alternate syntaxes involved in its use, illustrating the application of each in practice exercises. Focusing on the treatment of duplicates in each of the syntactical approaches, we discussed the results we obtained in each exercise, remarking on distinguishing characteristics of each.

Introduction

In this lesson, we will focus our attention on a function that has some characteristics in common with the Union() function, yet produces a very different result. The rather simple purpose of the Intersect() function is to compare two sets, then to return a set that consists of members that exist in both original sets; that is, to return a mathematical set intersection of the sets specified in the function. Indirect uses of Intersect() are quite common, as well. Like the Union() function, the Intersect() function provides important capabilities within MDX, and is a valuable part of our analysis toolsets

Along with an introduction to the Intersect() function, this lesson will include:

  • an examination of the syntax surrounding the function;
  • illustrative examples of the uses of the function in practice exercises;
  • a brief discussion of the MDX results we obtain in the practice examples.

The Intersect() Function

The Intersect() function, according to the Analysis Services Books Online, "Returns the intersection of two input sets, optionally retaining duplicates." We will examine the manner of accomplishing intersects, both with and without duplicates, in the sections that follow.

We will examine the syntax for the Intersect() function in general, building to the intersection of two sets after creating an individual query for each, to show that the Intersect() function does, in fact, generate the results we might expect. Next, our practice example will be modified to introduce the possibility of duplication within an Intersect() function, and the means that MDX affords us for managing duplication. Our objective is to gain a richer understanding of the options we have within the Intersect() function, together with a feel for the similarities that it has with the Union() function that we discussed in Lesson 13.

Discussion

Intersect() allows us to intersect sets; that is, it allows us to return a set consisting of the members that the two original sets have in common. The syntax with which we make use of Intersect() affords us an optional flag, as was the case with the Union() function, to force the presentation of duplicates that might occur within the sets we are subjecting to the Intersect() function. The default for the function (without the flag), again like the Union() functions, is the elimination of duplicates.

Let's look at some syntax illustrations to further clarify the operation of Intersect().

Syntax

Syntactically, the sets to be "crossed" in the Intersect() function are placed within the parentheses to the right of Intersect, and separated by a comma. The optional ALL flag overrides the default behavior of the function with regard to duplicates, and allows duplicates to remain in the set that is produced. The syntax is shown in the following string:

Intersect(Set1, Set2[, ALL])

The Intersect() function returns data that two sets have in common. While the default behavior of the function dictates that duplicates are eliminated from both sets prior to their intersection, use of the optional ALL flag allows duplicate members to remain within the newly produced set, as we shall see in a step in the practice example.

The following example expression illustrates a use of the Intersect() function. We will intersect a set of three cities' employee population information with a subset of the first three cities' population information, to see that only the members "in common" between sets 1 and 2 are returned; in short, the employee population data for the two-city subset is returned.

INTERSECT(

{([Store].[All Stores].[USA].[OR]),([Store].[All Stores].[USA].[CA]), ([Store].[All Stores].[USA].[WA])} , 

{([Store].[All Stores].[USA].[OR]), ([Store].[All Stores].[USA].[WA])} )

This simple expression in a proper query, for the measure Number of Employees, would result in the return of a set similar to that depicted in Table 1.

Number of Employees

OR

136

WA

287


Table 1: Results of a Simple Intersect, Selecting Number of Employees as the Measure

We will practice the use of the Intersect() function in the section that follows. Moreover, we will explore the use of the ALL flag we have discussed, to activate the concept of its use in retaining duplicates.

Practice

The Basics

To reinforce our understanding of the basics we have covered so far, we will use the Intersect() function in a manner that illustrates its operation through a multi-step example: we will build from a pair of simple select queries, to a combination of the two in an Intersect() illustration. We will then explore the use of the ALL flag to force retainage of duplicates.

We will call upon the MDX Sample Application again, as our tool to construct and execute the MDX we examine, and to view the results datasets we obtain.

1.  Start the MDX Sample Application.

2.  Clear the top area (the Query pane) of any queries or remnants that might appear.

3.  Ensure that FoodMart 2000 is selected as the database name in the DB box of the toolbar.

4.  Select the HR cube in the Cube drop-down list box.

Let's assume for our practice example that we want to replicate the data we discussed in the example used in the syntax section. Using this simple illustration will allow us to be certain of our "answer," without the distraction of any complicating factors. To reiterate, our objective will be to present total Number of Employees, a measure that is stored in the HR cube, for three Northeastern states, intersected with two states that form a subset of the three.

We will begin by composing a simple query to select the employee population figure for the three-state set, followed by a query to select the same information for the two-state subset of the initial three-state set. We will then intersect the two sets with the Intersect() function, noting the results immediately.

5.  Type the following query into the Query pane:


-- MDX14-1:  Tutorial Query Step 1
SELECT
{[Measures].[Number of Employees]} ON COLUMNS,
{([Store].[All Stores].[USA].[OR]),([Store].[All Stores].[USA].[CA]), 
   ([Store].[All Stores].[USA].[WA])} ON ROWS
FROM HR

The purpose of this query is to simply generate the "first half" of the pair of sets that we intend to "cross" in the Intersect() query that will follow.

6.  Execute the query by clicking the Run Query button in the toolbar.

The Results pane is populated by Analysis Services, and the dataset shown in Illustration 1 appears.


Illustration 1: Result Dataset - Initial Half of Set Pair to be "Intersected"

We see the total Number of Employees for three Northeastern states returned (actually for 1997, the default member of the Time slicer).

7.  Select File -> Save As, name the file MDX14-1, and place it in a meaningful location.

8.  Select File --> New from the main menu.

A new, blank query pane appears.


-- MDX14-2:  Tutorial Query Step 2
SELECT
{[Measures].[Number of Employees]} ON COLUMNS,
{([Store].[All Stores].[USA].[OR]), ([Store].[All Stores].[USA].[WA])} ON ROWS
FROM HR

The purpose of this query is to simply generate the "second half" of the pair of sets that we intend to "cross" in the Intersect() query, which we will construct next.

10.  Execute the query by clicking the Run Query button in the toolbar.

The Results pane is populated by Analysis Services, and the dataset shown in Illustration 2 appears.


Illustration 2: Result Dataset - The "Second Half" of the Pair of Candidate Sets

11.  Save the file as MDX14-2.

We have now become familiar with the data populating the two sets that we wish to intersect. This should make the operation of the Intersect() function clearer in our next step.

12.  Select File --> New from the main menu.

A new, blank query pane appears.

13.  Type the following query into the Query pane:


-- MDX14-3:  Tutorial Query Step 3
SELECT
{[Measures].[Number of Employees]} ON COLUMNS,
INTERSECT
({([Store].[All Stores].[USA].[OR]), 
    ([Store].[All Stores].[USA].[CA]), 
    ([Store].[All Stores].[USA].[WA])} , 
   {([Store].[All Stores].[USA].[OR]), 
   ([Store].[All Stores].[USA].[WA])} )
  ON ROWS
FROM HR

The purpose of this query is to return the intersection of the pair of sets we have examined individually, using the Intersect() function.

14.  Execute the query by clicking the Run Query button in the toolbar.

The Results pane is populated, and the dataset shown in Illustration 3 appears.


Illustration 3: Our First Intersect() Function Results

Finally, we see the effects of the Intersect() function. Note that the resulting dataset is identical to the last query, shown in Illustration 2 above. This "second half" of the pair of sets upon which we intended to perform the Intersect() function represents a complete subset of the first (the other "partner" in the Intersect() function. By its very nature, the subset is a duplicate of the first set, and thus forms the "intersecting" members. We are thus presented with the identical result in Step 3 as in Step 2, of our exercise above - this is the set that the two initial sets held "in common."

15.  Save the file as MDX14-3.

Next, we will examine the operation of the function from the standpoint of how it handles duplicate members in the sets it is being asked to intersect.

Dealing with Duplication

Let's establish a scenario whereby we can explore the handling of duplicates by the Intersect() function. We will accomplish this by creating a query whose objective is to intersect sets that we know to contain a duplicate member, as we did in Lesson 13 with the Union() function, and will thus venture out of a completely "real world" scenario. This is to make the operation of the function clear, however, and we can certainly rely upon the fact that there are many situations in the business environment where dealing with duplicates is a fact of life.

1.  Select File --> New from the main menu.

The blank query pane appears.

2.  Type the following query into the Query pane:


-- MDX14-4:  Tutorial Query Step 4
SELECT
{[Measures].[Number of Employees]} ON COLUMNS,
INTERSECT
({([Store].[All Stores].[USA].[OR].Children),([Store].[All Stores].[USA].[OR].[Portland])} , 
   {([Store].[All Stores].[USA].[OR].Children),([Store].[All Stores].[USA].[OR].[Portland])} )ON ROWS
FROM HR

The Intersect() function is called into action this time to perform an intersect between two sets that we know to contain an identical member. We are asking that the complete set of the Oregon city-children, which includes the city of Portland, be combined with the set of the single city of Portland in the initial set specification, then that the resulting set be intersected with a set identical to itself. (Were the flag not in place to override default behavior, elimination would be imposed within the first set prior to its intersection to the second set.) The purpose, again, is to illustrate clearly the default behavior of the Intersect() function, with regard to the handling of such duplicates.

3.  Execute the query by clicking the Run Query button in the toolbar.

The Results pane is populated, and the dataset shown in Illustration 4 appears.


Illustration 4: Intersect() Function Results - Duplicates Eliminated

4.  Save the file as MDX14-4.

The behavior of the Intersect() function is demonstrated in its elimination of the duplicate members - a characteristic that is the default of the syntax with which we have been working.

We can prove this to ourselves by overriding the default behavior, and using the ALL flag to allow the duplicate to be retained. To do so, we will make an adjustment to the query we created in Step 4 above.

5.  Insert the word "ALL" into the second set specification within the Intersect() function, that is:

   {([Store].[All Stores].[USA].[OR].Children),([Store].[All Stores].[USA].[OR].[Portland])} )ON ROWS

between the right-most curly brace and the closing parenthesis, as shown:

   {([Store].[All Stores].[USA].[OR].Children),([Store].[All Stores].[USA].[OR].[Portland])}, ALL )ON ROWS

6.  Modify the comment line to read:

-- MDX14-5:  Tutorial Query Step 5

 

The Query pane appears as shown in Illustration 5.


Illustration 5: The Query with Modifications Circled

7.  Execute the query by clicking the Run Query button in the toolbar.

The Results pane is populated, and the dataset shown in Illustration 6 appears. We see the instance of the duplicate set is retained, thanks to the insertion of the ALL flag into our Intersect() function.


Illustration 6: Intersect() Function Results - Duplicates Retained via the ALL Flag

8.  Save the file as MDX14-5.

And so we see how to retain duplicates in a simple example that illustrates the influence of the ALL flag upon the Intersect() function. In conclusion, we can see that the Intersect() function provides us the significant capability of performing intersections of sets. Our use of the optional ALL flag within the syntax, can also afford us flexibility in the handling of duplicates in queries we construct that use this function.

Summary ...

In this lesson, we explored the commonly used Intersect() function, whose purpose is to return the intersection, or the common members, of two sets. We discussed the Intersect() function in general, and emphasized its capabilities within MDX, and its usefulness within our analysis toolsets.

In addition to discussing the purpose and operation of the Intersect() function, we focused on the treatment of duplicates by the function. We practiced the use of the function in general, then with an example of the addition of the ALL flag, to override the Intersect() function's default duplicate handling. Throughout the multiple-step practice exercise, we discussed the results we obtained with each step's execution, remarking on the distinguishing characteristics of each.

» See All Articles by Columnist William E. Pearson, III

Discuss this article in the MSSQL Server 2000 Analysis Services and MDX Topics Forum.

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved