MDX Essentials: Basic Set Functions: The TopCount() Function, Part I

Monday May 2nd 2005 by William Pearson
Share:

Isolate best performers from the member population at large, and perform sophisticated analysis with TopCount(). Join MSAS Architect Bill Pearson in the first of a pair of articles focusing upon the powerful TopCount() function, where we experience "hands-on" some of the possible ways it can offer multi-perspective decision support.

About the Series ...

This article is a member of the series, MDX Essentials. The series is designed to provide hands-on application of the fundamentals of the Multidimensional Expressions (MDX) language, with each tutorial progressively adding features designed to meet specific real-world needs.

For more information about the series in general, as well as the software and systems requirements for getting the most out of the lessons included, please see my first article, MDX at First Glance: Introduction to MDX Essentials.

Note: Current updates are assumed for MSSQL Server, MSSQL Server Analysis Services, and the related Books Online and Samples.

Overview

In this article, we will examine the TopCount() function, whose highest value lies in its capability to enable us to isolate best performers from among hundreds or thousands of fellow members. This ranking capability is critical in data analysis and decision support scenarios: In many business situations, we seek to report upon "best performers" for various reasons. TopCount() facilitates our doing so, allowing us to sort on a numeric value expression that we can provide. We can tell the function how many "top" members we wish to retrieve (say, the "top ten," or the "top twenty"), for a "custom-fit" approach, that matches our needs precisely.

In this article, we will introduce and overview the TopCount() function. We will first comment upon the general operation of TopCount(), and then we will:

  • Examine the syntax surrounding the function;
  • Undertake illustrative examples of the rudimentary uses of the function in practice exercises;
  • Briefly discuss the results datasets we obtain in the practice examples.

The TopCount() Function

According to the Analysis Services Books Online, the TopCount() function returns a specified number of items from the topmost members of a specified set, optionally ordering the set first." The TopCount() function stands out as an excellent general example of the potential power of MDX. We specify three parameters, a set, a count, and a measure, in most cases, and TopCount() returns the number of top performers (or "top anything," in effect), based upon our input.

As we shall see, TopCount() sorts the set we specify by another expression we provide within the function, thus breaking the natural hierarchy of the set. The basis of sorting by TopCount() closely resembles that used by the Order() function. We should therefore keep in mind that, in the absence of a specified sort, the function will institute a sort based upon the default member.

NOTE: For information surrounding the Order() function, see my article Basic Set Functions: The Order() Function, a member of the MDX Essentials series at Database Journal.

We will examine the syntax for the TopCount() function, and then look at its behavior based upon parameter input we might provide. Next, we will undertake practice examples constructed to support hypothetical business needs that illustrate uses for the function. This will allow us to activate what we explore in the Discussion and Syntax sections, by getting some hands-on exposure in creating expressions that leverage the function. Our next article will build upon the basics we expose here, and leverage TopCount() within more sophisticated examples of business usage.

Discussion

To restate our initial explanation of its operation, the TopCount() function sorts a set based upon a numerical expression we provide, and then picks the top (whatever number we provide in the Count parameter of the function) items in that set, based upon the rank order established by the function. The frequent requirement to examine a subset of members at the top of the general set is easily answered by the TopCount() "sort, and then pick this number of members from the top down" approach.

Let's look at some syntax illustrations to further clarify the operation of TopCount().

 

Syntax

Syntactically, the set upon which we seek to perform the TopCount() operation is specified within the parentheses to the right of TopCount, a common arrangement within MDX functions, as we have seen in our previous articles. The syntax is shown in the following string.

TopCount(&Lt&Lt Set >>, &Lt&Lt Count >> [,&Lt&Lt Numeric Expression >>])

We follow <<Set>>, the set specification, with a comma, which is followed by <<Count>>, the numeric expression we provide to dictate the number of "top" members we wish for the function to return. <<Count>> is, in turn, followed by a numeric expression whereby we specify the "sort by" criteria. As we have mentioned, TopCount() always breaks the natural hierarchy of the set - due to the fact that the set is sorted upon the third argument we provide as <<Numeric Expression>>, before returning the top (specified number of) members from that sort.

The items returned by TopCount() are limited to the << Count >> input that we provide, even in the case of "ties" in the <<Numeric Expression>> value. This may be important to consider when evaluating the data retrieved by the function, in addition to default behavior of the function when no sort criteria (in the form, again, of the <<Numeric Expression>>) is provided.

The following example expression illustrates the use of the TopCount() function, within a simple business context. Let's say that a group of information consumers within the FoodMart Planning organization wishes to see the top ten Warehouse Cities for operating year 1997, based upon total Warehouse Profit for each city.

The basic TopCount() function specifies the "top ten Warehouse Cities" (with the number "10" as the Count specification, and [Warehouse].[City].Members as the Set specification, of the function). TopCount() assembles the top ten members from the perspective of Warehouse Profit (the Numeric Expression upon which the complete set of Warehouse Cities will first be sorted by the function). The function with arguments is represented in the ON ROWS line of the following query:

SELECT
   {[Measures].[Warehouse Profit]} ON COLUMNS,
{TOPCOUNT([Warehouse].[City].Members, 10, [Measures].[Warehouse Profit])} ON ROWS
FROM [WAREHOUSE] WHERE ([Time].[Year].[1997])

Assuming that we placed the TopCount() function within the query as constructed above, our returned dataset would resemble that shown in Table 1.

Warehouse Profit

Salem

17,041.57

Tacoma

16,725.63

Los Angeles

12,789.25

San Diego

12,536.28

Bremerton

12,063.17

Seattle

12,021.58

Spokane

5,835.46

Beverly Hills

5,674.85

Yakima

5,635.98

Portland

4,186.58


Table 1: Results Dataset, with TopCount() Defining Columns

As we expected, the top ten Warehouse Cities are returned, having been sorted by Warehouse Profit, and having had the top ten values in that sort isolated as we requested in the function.

As is the case with many MDX functions, TopCount() can be used to deliver sophisticated analysis datasets, particularly when we employ it along with other MDX functions. We will practice the use of TopCount() in the section that follows, building from a simple example, similar to that which we saw above, to a scenario where we apply the function, in conjunction with a calculated member, to reveal an additional analytical perspective. In our next article, we will examine even more sophisticated applications of the function, combining its use with that of other functions we have explored in articles of the series.

Practice

To reinforce our understanding of the basics we have covered so far, we will first use the TopCount() function in a simple scenario to illustrate its operation as the primary focus. We will do so in a backdrop that places TopCount() within the context of meeting a business need for a group of hypothetical information consumers.

Let's return to the MDX Sample Application as a platform from which to construct and execute the MDX we examine, and to view the results datasets we obtain.

1.  Start the MDX Sample Application.

2.  Clear the top area (the Query pane) of any queries or remnants that might appear.

3.  Ensure that FoodMart 2000 is selected as the database name in the DB box of the toolbar.

4.  Select the Warehouse cube in the Cube drop-down list box.

Let's assume, for our practice example, that we have received a call from the Finance department of the FoodMart organization, requesting some information surrounding general profitability among the organization's Warehouse Cities. The Finance information consumers specifically wish to know the Warehouse Profit figures attributed to each of the top ten performing cities for operating year 1998.

To rephrase, the objective will be to present a single measure, Warehouse Profit, for 1998 within the context of the FoodMart Warehouse cube. (For our exercise, the cube can be assumed to represent the prior year's activity of the organization.) We wish to sort the Warehouse Cities by the total Warehouse Profit of each, and then to return data for the highest ten Warehouse Cities based upon the profit measure.

Let's construct a simple query, therefore, to return the top ten performers, as requested.

5.  Type the following query into the Query pane:


-- MDX031-1,Simple use of TopCount():  "Top Ten 1998 Profit Producers"
SELECT
  {[Measures].[Warehouse Profit]}  ON COLUMNS,
{ TOPCOUNT([Warehouse].[City].Members, 10, 
      ([Measures].[Warehouse Profit]))}  ON ROWS
FROM
   [WAREHOUSE]
WHERE
([Time].[Year].[1998])  

6.  Execute the query by clicking the Run Query button in the toolbar.

The Results pane is populated by Analysis Services, and the dataset shown in Illustration 1 appears.


Illustration 1: Result Dataset - Simple Use of the TopCount() Function

In a manner similar to that in the Syntax section illustration above, the top ten Warehouse Cities are returned (this time specifically for operating year 1998). The operation of TopCount() has again been to sort by Warehouse Profit, and then to return the top ten values from that sort, as we requested in the function.

7.  Select File -> Save As, name the file MDX031-1, and place it in a meaningful location.

8.  Leave the query open for the next section.

Next, let's say that the Finance information consumers are provided with the simple "top ten" data we have generated. They state that they need to add the Warehouse Sales and a Margin % to the presentation, keeping the TopCount() based upon the Warehouse Profit measure, as before. This, they feel, will "round out" the presentation to provide more utility from an analysis perspective, presenting "more information at a single glance."

9.  Within the query we have saved as MDX031-1, replace the top comment line of the query with the following:

-- MDX031-2, "Top Ten 1998 Producers":  Add Sales and Margin Calc Member

10.  Save the query as MDX031-2, to keep MDX031-1 intact as a working sample.

11.  Add the following lines to the query, between the top comment line we just modified, and the SELECT keyword that begins the query:


WITH
MEMBER
    [Measures].[Margin %]
AS
   ' [Measures].[Warehouse Profit]/[Measures].[Warehouse Sales]'

This will create the calculated member Margin %, which we will present alongside the Warehouse Sales and Warehouse Profit measures in the following steps.

12.  Modify the ON COLUMNS line of the query to contain the Warehouse Sales measure, together with the new Margin % calculated measure defined above, as follows:


{[Measures].[Warehouse Sales], [Measures].[Warehouse Profit],   
     [Measures].[Margin %]}  ON COLUMNS,

13.  Leave the remainder of the query in its original state.

The Query pane appears as depicted in Illustration 2, with our modifications marked.


Illustration 2: The Query with Added Measure and Calculated Measure

14.  Execute the query by clicking the Run Query button in the toolbar.

The Results pane is populated, and the dataset shown in Illustration 3 appears.


Illustration 3: Result Dataset - With Our Modifications

We note that, although the order of the "top ten" has not changed (the TopCount() function continues, after all, to be based upon the Warehouse Profit measure), the Margin % calculated measure is not sorted. (We will focus on this calculated measure as an "additional perspective" next.)

15.  Re-save the file as MDX031-2.

16.  Leave the query open for the next step.

We present the results to the information consumers, who are quite happy with the outcome. At this point, we propose to extend the requirement once again, and to modify the query to return the top ten Warehouse Cities based upon the new Margin % calculated measure. This will provide another analytical perspective, we say; we also mention our confidence, based upon our experience within the realm of business intelligence, that this analytical "view" may act to broaden the perspective of Warehouse Profit alone.

We have only to make one simple change to the function to deliver this additional perspective. We will do so, and verify the operation of the modified TopCount() function, by taking the following steps:

17.  Within the query we have saved as MDX031-2, replace the top comment line with the following:

-- MDX031-3, Top Ten 1998 Producers":  
   Calc Member is the TopCount() Basis

18.  Save the query as MDX031-3, to keep MDX031-2 intact as a working sample.

19.  Replace [Measures].[Warehouse Profit] within the Topcount() function (in the ON ROWS line) with [Measures].[Margn %], the new calculated member we added above.

The ON ROWS line appears as follows after the change:


{ TOPCOUNT([Warehouse].[City].Members, 10, 
      ([Measures].[Margin %]))}  ON ROWS

The Query pane appears as depicted in Illustration 4, with our modifications marked, once again.


Illustration 4: The Query with Our Modifications Marked

20.  Execute the query by clicking the Run Query button in the toolbar.

The Results pane is populated, and the dataset shown in Illustration 5 appears.


Illustration 5: Result Dataset - Presenting a New View, Indeed

It quickly becomes obvious that we have derived a new view of the business, indeed, with this small modification. The operation of the TopCount() function, based in this instance upon the Margin % attained by each of the Warehouse Cities, results in a sort of the Warehouse Cities accordingly. The subsequent selection of the top ten from this sort presents Warehouse Cities that did not appear earlier, because their Profit levels were not among the highest in the organization. The substitution of the Margin %, however, as the basis for the TopCount() function has revealed a higher margin of return in Warehouse Cities that simply do not "register "in the former query, where their relative size obscures their performance.

This revelation is welcomed by the information consumers, who can now enhance their analysis capabilities by taking into consideration both perspectives (Warehouse Profit and Margin Percent) to isolate top performers. The study of the top groups from the perspective of Margin Percentage will likely add valuable lessons that can be extrapolated to other Warehouse operations, regardless of size. This is an excellent illustration of the power of multidimensional analysis, courtesy of the pairing of the TopCount() function with a meaningful calculated member to delve deeper in the analysis of top organizational performers.

21.  Re-save the file as MDX031-3.

22.  Exit the MDX Sample Application when ready.

Summary ...

This article served as the first of a pair of articles surrounding the potentially powerful TopCount() function. In this introductory session, we examined the TopCount() function, noting its obvious value in equipping us with a means of isolating high performers from among hundreds or thousands of fellow members. We noted that this ranking capability is often critical in data analysis and decision support scenarios, and then discussed how TopCount() facilitates our performing such ranking, first discussing the general operation of TopCount(), and then examining the syntax surrounding the function.

We then undertook exercises where we practiced using the function in meeting the business requirements of a hypothetical group of information consumers. We focused on a simple use of the function, to allow for minimal distraction while grounding ourselves in the basics, before providing a straightforward, yet meaningful, example of how we might leverage our core query with the addition of a calculated member to achieve a revealing additional perspective in our analysis of the performance of operational units. Throughout the practice examples, we briefly discussed the results datasets we obtained with regard to the TopCount() function, together with other surrounding considerations.

» See All Articles by Columnist William E. Pearson, III

Discuss this article in the MSSQL Server 2000 Analysis Services and MDX Topics Forum.

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved