MDX Essentials: Basic Set Functions: The Filter() Function

Monday Feb 9th 2004 by William Pearson
Share:

Filter out records you do not want without compromising the efficiencies of MDX sets. In this lesson, we will explore using the powerful Filter() function to return subsets of larger sets, based upon conditions we supply.

About the Series ...

This is the sixteenth article of the series, MDX Essentials. The series is designed to provide hands-on application of the fundamentals of the Multidimensional Expressions (MDX) language, with each tutorial progressively adding features designed to meet specific real-world needs.

For more information about the series in general, as well as the software and systems requirements needed for getting the most out of the lessons included, please see the first article, MDX at First Glance: Introduction to MDX Essentials.

Note: Service Pack 3 updates are assumed for MSSQL Server 2000, MSSQL Server 2000 Analysis Services, and the related Books Online and Samples.

What We Accomplished in our Last Article

In the last article of the series, Basic Set Functions: The Except() Function, we explored the useful Except() function, whose purpose is to return the difference between two sets. In addition to discussing the purpose and operation of the Except() function, we focused on the treatment of duplicates by the function. We examined the syntax surrounding its uses, and illustrated its application in practice exercises, providing hands-on exposure to the use of Except().

We first undertook a multi-step example in which we exposed default handling of duplicates by the function, then explored an additional example where we practiced the use of the ALL flag to override the Except() function's default duplicate handling. Throughout the practice exercises, we discussed the results we obtained with each step's execution, remarking on the distinguishing characteristics of optional flag settings.

Introduction

In this lesson, we will expose what many consider one of the most useful functions in the MDX arsenal, the Filter() function. The general purpose of the Filter() function is to allow us to filter out parts of a set that we do not need in a given situation, and to return a subset of a larger set as a result. Uses of the Filter() function, as with many MDX functions, can range from the sublimely simple to the impressively advanced, and it can be used in many innovative ways. The objective, of course, is the support of precise analysis to meet our business needs. We will see in this article how the Filter() function is a prime example of the efficiency and precision we can attain by using judiciously chosen functions from our MDX toolsets.

Along with an introduction to the Filter() function, this lesson will include:

  • an examination of the syntax surrounding the function;
  • illustrative examples of the uses of the function in practice exercises;
  • a brief discussion of the MDX results we obtain in the practice examples.

The Filter() Function

The Filter() function, according to the Analysis Services Books Online, "returns the set resulting from filtering a specified set based on a search condition." That is, it returns a subset of the specified set based upon the action of a filter, whose operation is selectively enforced by a condition in the function. As we shall see, the uses of the function are varied, and its innovative employment with the correct logic can result in the efficient delivery of very precise results.

We will examine the syntax for the Filter() function in general, and then will undertake practices exercises to reinforce the concepts. We will first create an unfiltered query, and then build to the placement of the function into the query, to meet an illustrative business requirement. In this way, we will be able to see clearly how Filter() operates to create a subset from a set that is already known to us.

Next, we'll explore a more complex scenario where we make innovative use of the Filter() function to meet a more sophisticated requirement. This will give us a richer understanding of the potential that the Filter() function offers, and allow us to examine it from a couple of different perspectives.

Discussion

Filter() allows us to filter a set based upon a particular search condition. In other words, the function allows us to return a subset of a set that the query extracts from a cube, based upon whether it meets certain criteria. Members of the retained subset are returned in their natural order. While the basic use of the Filter() function is quite straightforward, we can use it to return complex results, as potential combinations with other functions and operators are myriad.

Examples that occur in a business scenario might include the need to see only the members of a group of employees whose organizational salary costs have increased over the prior year, or to perhaps return a list of customers (out of the larger set of all customers) whose average monthly expenditures at our retail outlets have decreased.

Let's look at some syntax illustrations to further clarify the operation of Filter().

Syntax

Syntactically, the set to be returned by the Filter() function is placed within the parentheses to the right of Filter, and separated by a comma from the specified criteria that the returned subset is to meet, or the search condition applied. The syntax is shown in the following string:

Filter(<<set>>, <search condition>)

Remarks

The Filter() function returns the tuples of the specified set that meet the criteria of the search condition. The returned subset represents the portion of the larger, specified set that evaluates as "true" with regard to the logical expression contained in the Boolean search condition.

The order of the returned subset is the same as that found in the larger specified set. If there are no tuples in the specified set for which the search condition is true, an empty set is returned by the function.

The following example expression illustrates use of the Filter() function in its most basic form. Suppose we are asked by a group of FoodMart information consumers to present total Warehouse Sales for the cities in which we operate warehouses, where the total 1998 sales of each city is $ 10,000 and above. We might approach this need with an expression similar to this:

FILTER (

[Warehouse].[City].MEMBERS,

    ([Measures].[Warehouse Sales], [Time].[1998]) > 9999)

The Filter() function might be housed in a query like the one below:


SELECT
     {[Measures].[Warehouse Sales]} ON COLUMNS,
     {Filter (
         [Warehouse].[City].MEMBERS,
             ([Measures].[Warehouse Sales], [Time].[1998]) > 9999)
                  } ON ROWS
FROM
    [Warehouse]
WHERE
    ([Time].[1998])

The query, with the measure Warehouse Sales specified for the column, the results of the Filter() function specified for the rows, and the year 1998 from the Time dimension used as slicer, would result in the return of the set depicted in Table 1.

Warehouse

Sales

Vancouver

21,730.73

Mexico City

10,662.54

San Andres

22,291.58

Acapulco

23,817.12

Orizaba

20,294.02

Camacho

23,140.65

Hidalgo

14,279.90

Los Angeles

23,998.14

San Diego

19,462.39

Portland

25,343.95

Salem

29,796.57

Seattle

26,692.80

Tacoma

30,336.79


Table 1: Results of a Query with Filter(), Selecting Warehouse Sales as the Measure and 1998 as the Slicer

In the example above, we use the Filter() function, in combination with the .Members function (see MDX Members: Introducing Members and Member Functions for a tutorial on this and related family functions), to enumerate the cities for whom the data reflects warehouse sales of greater than $ 9,999.

We will get a chance to see these components in operation again, as we practice the use of the Filter() function in the section that follows. We will look at a relatively simple instance, then a slightly more complex one, as we explore the use of the syntax we have discussed, within the steps of our examples.

Practice

Let's reinforce our understanding of the basics we have covered so far, by using the Filter() function in a manner that illustrates its operation in reducing a specified set to return only a subset whose tuples meet its search condition. As we have done throughout the series, we will employ the MDX Sample Application for constructing and executing the MDX we examine, and for viewing the result datasets we obtain.

1.  Start the MDX Sample Application.

2.  Clear the top area (the Query pane) of any queries or remnants that might appear.

3.  Ensure that FoodMart 2000 is selected as the database name in the DB box of the toolbar.

4.  Select the Warehouse cube in the Cube drop-down list box.

We will next compose a straightforward query, to gain an understanding of the use of the Filter() function to return a subset of a specified set that we will illustrate through a business need scenario. Let's say that a group of information consumers within the organization come to us with a requirement: they wish to see a simple report that lists the Product Categories that have experienced a decrease in shipping volume between the first and second quarters of 1998. Our query will focus on Shipping Units, a value that is captured monthly within the FoodMart organization and which is stored in the Warehouse cube.

First, we will create a core query to retrieve the data for all Product Categories. This will generate a result dataset which we can next compare to the same query with the Filter() function in place, to reinforce our understanding of its operation.

5.  Type the following query into the Query pane:


-- MDX16-1:  Tutorial Query No. 1
SELECT
    {[Time].[1998].[Q1],
     [Time].[1998].[Q2]} ON COLUMNS,
          {[Product Category].MEMBERS} ON ROWS
FROM  
     [Warehouse]
WHERE
     (Measures.[Units Shipped])

NOTE: A word of explanation for those that might be joining the MDX Essentials series with this article: The layout I use in my code illustrations might seem unlike that of other references (indeed, it's rare to find any two references that illustrate MDX expressions and queries with identical presentations!); my intent here is to attempt to make the components of the coding easily understandable by organizing them in a clear manner.

The code will execute in virtually any case, whether it is arranged exactly as shown or not, assuming that the elements contained in the example are typed in correctly. I have learned from experience with large audiences that there are, beyond doubt, a few "purists" out there that maintain there is a "correct" way to arrange the code - and everything else; I don't belong to any such "clubs," however, and am focused more on clarity in my examples. Please do not take my eccentricities as being reflective of any "manual of style."

6.  Execute the query by clicking the Run Query button in the toolbar.

Analysis Services populates the Results pane, presenting the dataset partially shown in Illustration 1.


Illustration 1: Result Dataset - Without Filtering (Partial View)

We see the total Units Shipped volume for all Product Categories, returned for the first two quarters of year 1998. A cursory review reveals that some categories experienced increases in volumes shipped between quarters, while others experienced a decline in volume.

7.  Select File -> Save As, and name the query MDX16-1, leaving the query open.

Let's add the Filter() function to our ON ROWS specification next, and confirm its effects on the returned dataset.

8.  Modify the ON ROWS specification of the query from the following:

{[Product Category].MEMBERS} ON ROWS

to the following:


{FILTER (
    [Product].[Product Category].MEMBERS,
        (([Time].[1998].[Q1] , [Measures].[Units Shipped]) >
            ([Time].[1998].[Q2],[Measures].[Units Shipped])))} ON ROWS

We are thus placing the Filter() function within our rows axis. The modified query appears in the Query pane as depicted in Illustration 2.


Illustration 2: The Query - with the Addition of the Filter() Function

9.  Replace the existing comment line for the query with the following:

-- MDX16-2:  Tutorial Query No. 2

10.  Execute the query by clicking the Run Query button on the toolbar.

The filtered dataset is returned, as shown in Illustration 3.


Illustration 3: Result Dataset - With Filter() Function in Place

Our modified query contains a Filter() function similar to the one we presented in the Remarks section above, but with a slight complication. The example not only demonstrates the operation of the function, but also accomplishes a complementary objective: It shows how we can easily compare the same measure under different dimensional scenarios - that is, in two different sets. In the current example, Units Shipped in two different quarters is compared by adding a Boolean expression as the search condition.

11.  Select File -> Save As, and name the query MDX16-2. Leave the query open, once again.

As we have already noted, the Filter() function can be used in conjunction with other functions to produce more complex results. Let's compound the business requirement we outlined earlier, and create an illustrative example of just such a case.

Let's say that the information consumers, while satisfied with the response we have supplied using the previous query, have been emboldened to ask for more. (This is probably a scenario well known to readers who act in a support capacity of this nature in the business world.) The consumers now state that they want to see the list of products that have experienced a decrease in shipping volume, as before, but this time they would like to see the products concerned within the context of the warehouses involved. In other words, they wish to see the products whose shipping volumes declined, grouped by the warehouses involved.

We can handle this easily with the introduction of a CrossJoin() function in our specification of the rows axis. To do so, we will take the following steps:

12.  Modify the ON ROWS specification of the query from the following:


{FILTER (
    [Product].[Product Category].MEMBERS,
        (([Time].[1998].[Q1] , [Measures].[Units Shipped]) >
   ([Time].[1998].[Q2],[Measures].[Units Shipped])))} ON ROWS

to the following:


{FILTER ( 
    CROSSJOIN (
        [Warehouse].[City].MEMBERS,
            [Product].[Product Category].MEMBERS),
                (([Time].[1998].[Q1] , [Measures].[Units Shipped]) >
                    ([Time].[1998].[Q2],[Measures].[Units Shipped])))} 
                        ON ROWS

The modified query appears in the Query pane as depicted in Illustration 4.


Illustration 4: The Query - CrossJoin() Added to Filter() Function

13.  Replace the existing comment line for the query with the following:

-- MDX16-3:  Tutorial Query No. 3

14.  Execute the query by clicking the Run Query button in the toolbar.

The filtered dataset is returned, as partially depicted shown in Illustration 5.


Illustration 3: Result Dataset - With CrossJoin() inside Filter() Function (Partial View)

Our modified query meets the objective established by the information consumers, and displays the products with declining shipping volumes over the Q1 and Q2 operating periods, by the warehouse location. We can thus see that, regardless of the "compound" nature of the set in our Filter() function, the results are returned with accuracy and completeness.

15.  Select File -> Save As, and name the query MDX16-3.

Many far more sophisticated uses for the Filter() function are possible, and, as we build our functional toolset within the MDX Essentials series in the months ahead, we will employ the function in numerous ways. It is often more efficient to ask for subset of a larger set than to specify each of the targeted set's member tuples.

Summary...

In this lesson, we explored the powerful Filter() function, whose general purpose is to filter out parts of a set to return a precise subset to meet our needs. The Filter() function offers many flexible and innovative uses, and, as we have seen, can incorporate the operation of additional functions within its search condition to allow us to build substantial logic into our filtering efforts. The Filter() function is an excellent tool for the support of precise analysis.

In addition to discussing the purpose and operation of the Filter() function, we illustrated its application in a couple of practice exercises. We began with an unfiltered query, to which we then applied Filter(), to show its effectiveness in meeting a business need to return a subset of the initially unfiltered dataset. We then substituted a more complex Filter() function into our query to provide a solution within a second example, illustrating the way that we can combine other functions with a Filter() function to return more sophisticated datasets. Finally, throughout the creation and execution of the practice examples, we discussed the results we obtained to confirm our understanding of the function and its operation.

» See All Articles by Columnist William E. Pearson, III

Discuss this article in the MSSQL Server 2000 Analysis Services and MDX Topics Forum.

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved