MDX Essentials: Basic Numeric Functions: The Count() Function

Monday Mar 8th 2004 by William Pearson
Share:

Return the number of cells within a set. MSAS Architect Bill Pearson introduces the Count() function and leads practice in its use.

About the Series ...

This is the seventeenth article of the series, MDX Essentials. The series is designed to provide hands-on application of the fundamentals of the Multidimensional Expressions (MDX) language, with each tutorial progressively adding features designed to meet specific real-world needs.

For more information about the series in general, as well as the software and systems requirements needed for getting the most out of the lessons included, please see the first article, MDX at First Glance: Introduction to MDX Essentials.

Note: Service Pack 3 updates are assumed for MSSQL Server 2000, MSSQL Server 2000 Analysis Services, and the related Books Online and Samples.

What We Accomplished in our Last Article

In the last article of the series, Basic Set Functions: The Filter() Function, we explored the useful Filter() function. We learned that the purpose of the function is to allow us to filter out parts of a set that we do not need, and to return a subset of a larger specified set as a result. We discussed Filter() in general, along with the capabilities it affords us within MDX, emphasizing its value in helping us to attain efficiency and precision in our MDX queries.

In addition to discussing the purpose and operation of the Filter() function, we examined the syntax involved in its use. We then illustrated its application in practice exercises. Finally, we provided, as a complement to our hands-on exposure to the Filter() function, a brief discussion of the MDX results we obtained.

Introduction

Microsoft Analysis Services ("MSAS"), as most of us know, leads the enterprise business intelligence arena with its rich set of mathematical and analytical tools. Most of these tools rely upon functions based in the MDX query language. MDX is integrated not only within MSAS, but also throughout the entire Microsoft Business Intelligence Platform in applications that include Data Mining and Reporting Services. This integration provides a distinct advantage for the platform over the offerings of other vendors, and, particularly in the case of numerical functions, allows for easy, consistent application of built-in logic.

In this lesson, we will extend our attention from the purely set-related functions upon which we have concentrated in the last several articles; we will focus on a numerical function, Count(), specifically as it is applied to sets. We will discuss the straightforward purpose of the function, to return the number of cells in a specified set, and then expose approaches within the syntax for handling the inclusion or exclusion of empty cells within the returned result.

Along with an introduction to the Count() function, this lesson will include:

  • an examination of the syntaxes surrounding the function;
  • illustrative examples of the uses of the function in practice exercises;
  • a brief discussion of the MDX results we obtain in the practice examples.

The Count Function

According to the Analysis Services Books Online, the Count function "returns the number of items in a collection, depending on the collection." The term "collection" here is generic, as the "collections" to which the Books Online refers can consist of the dimensions in a cube or a tuple, the cells in a set, or the levels in a dimension or a hierarchy. Count handles these "collections" in a couple of different syntactical ways. The scope of this article will be to focus more on the set-specific use of Count(), and we will refer to the function with the appended parenthesis ( "()" ) in that context.

We will examine the syntax for the Count() function, and then take a look at the options it affords with regard to the exclusion of empty cells from the counts it performs. Next, we will undertake practice examples built around hypothetical business needs that illustrate logical uses for the function. We will extend the examples to include exclusion of empty cells in a set we construct, to afford us a scenario for examining the options to include or exclude these cells in the overall count.

Discussion

The Count() function counts the cells defined by the set that is specified in the function. The syntax with which we employ the function affords us the option of specifying whether or not we wish to count empty cells within the range of the specified set. If we do not wish to include empties, we simply add the appropriate keyword to restrict the returned count to populated cells only.

Let's look at some syntax illustrations to further clarify the operation of Count().

Syntax

Syntactically, the set we use to specify the range of cells to count is placed within the parentheses to the right of Count, and separated by a comma from the optional flag we have described. The optional EXCLUDEEMPTY flag overrides the default behavior of the function (which is INCLUDEEMPTY) with regard to empty cells, and forces the function to ignore any unpopulated cells within the set specified.

The syntax is shown in the following string:

Count(Set, [, INCLUDEEMPTY | EXCLUDEEMPTY])

The following example expression illustrates a use of the Count() function. Let's say that FoodMart information consumers, whose data is housed within the Warehouse cube, wish to see the number of Warehouses supporting each of our Product Departments in a given year. The basic Count() function involved, disregarding for the time being the empty or non-empty consideration, would be as follows:

Count ({[Warehouse].[City].MEMBERS })  

This is simply expressing that we wish to count the "number of Warehouse City members" for a given purpose. Looking at our cube structure within the metadata pane of the MDX Sample Application, as shown in Illustration 1, we can verify that there are twenty-three Warehouse City locations by performing a manual count.


Illustration 1: We Verify Twenty - Three Warehouse Cities in the Warehouse Cube

If we were to create an MDX query, within which we construct a calculated member to perform the count (see Calculated Members: Introduction and Calculated Members: Further Considerations and Perspectives, among kindred articles, in the Database Journal MDX in Analysis Services series for a hands-on introduction to calculated members), and to obtain the information requested, we might begin with the following:

WITH
    Member [Warehouse].[No. Locations]
AS
    'COUNT ( {[Warehouse].[City].MEMBERS })'
SELECT
    { [Warehouse]. [No. Locations] } ON COLUMNS,

    { [Product].[Product Department].Members} ON ROWS
FROM
    [Warehouse]
WHERE 
     [Time].[1998]

This simple query, for a count of the Warehouse Cities by Product Department, would return a dataset similar to that depicted in Table 1.

Product Department

No. Locations

Alcoholic Beverages

23

Beverages

23

Dairy

23

Baked Goods

23

Baking Goods

23

Breakfast Foods

23

Canned Foods

23

Canned Products

23

Dairy

23

Deli

23

Eggs

23

Frozen Foods

23

Meat

23

Produce

23

Seafood

23

Snack Foods

23

Snacks

23

Starchy Foods

23

Carousel

23

Checkout

23

Health and Hygiene

23

Household

23

Periodicals

23


Table 1: Results of a Simple Count of Warehouse Cities by Product Department

If we were to scrutinize the data, or perhaps to query the cube for the same information in a different manner, we would soon realize that the above results reflect every possible combination of Product Department and Warehouse City. This is because we have left the Count() function at default; not stipulating that we wished to leave empty cells (representing Product Department / Warehouse City combinations that witnessed no action), through the use of the EXCLUDEEMPTY keyword in our Count() function means that INCLUDEEMPTY flag is assumed by default.

To prove this, we merely add the EXCLUDEEMPTY keyword, flagging the function to ignore empty cells in the count, as follows:

   COUNT ( {[Warehouse].[City].MEMBERS}, EXCLUDEEMPTY)

Upon execution with the change above, the query would return a dataset similar to that depicted in Table 2.

Product Department

No. Locations

Alcoholic Beverages

20

Beverages

23

Dairy

21

Baked Goods

23

Baking Goods

23

Breakfast Foods

22

Canned Foods

23

Canned Products

18

Dairy

23

Deli

23

Eggs

22

Frozen Foods

23

Meat

17

Produce

23

Seafood

16

Snack Foods

23

Snacks

22

Starchy Foods

20

Carousel

16

Checkout

19

Health and Hygiene

23

Household

23

Periodicals

18


Table 2: Results of a Count of Warehouse Cities by Product Department, Empty Cells Not Counted

We will practice the use of the Count() function in the section that follows. Moreover, we will explore the use of the EXCLUDEEMPTY flag we have discussed, to activate the concept of its use in eliminating empty cells from consideration by the function. Finally, we will go a step further, and confirm our results by looking at the data from another perspective.

Practice

The Basics

To reinforce our understanding of the basics we have covered so far, we will use the Count() function in a manner that illustrates its operation through a multi-step practice example: We will first construct a simple select query that performs a basic count.

We will use the MDX Sample Application again, as our tool to construct and execute the MDX we examine, and to view the results datasets we obtain.

1.  Start the MDX Sample Application.

2.  Clear the top area (the Query pane) of any queries or remnants that might appear.

3.  Ensure that FoodMart 2000 is selected as the database name in the DB box of the toolbar.

4.  Select the Warehouse cube in the Cube drop-down list box.

Let's assume for our practice example that we have received a call from corporate Finance needing some statistics to insert into a footnote of the 1998 Financial Statements. They wish to know how many of each Store Type the organization operates in each Country. "Operation" in this context is taken to mean that placement of orders from organizational warehouses.

To rephrase, our objective will be to present a count of Store Types by Country, for Year 1998, based upon data in the FoodMart Warehouse cube.

We will begin by composing a simple query to return the count without regard to empty cells within the specified set.

5.  Type the following query into the Query pane:


-- MDX17-1:  Counting that includes Empty Cells
WITH
    Member [Store Type].[Types Count]
AS
    'Count( {[Store Type].[Store Type].MEMBERS })'
SELECT
    {  [Store Type].[Types Count] } ON COLUMNS,
    { [Store].[Store Country].MEMBERS } ON ROWS
FROM
   [Warehouse]
WHERE 
([Time].[1998], [Measures].[Units Ordered])

The purpose of this query is simply to generate a basic count of all cells that lie within the range of the specified set. Keep in mind that, though no flag appears in the Count() function, the default condition, INCLUDEEMPTY, is in force.

6.  Execute the query by clicking the Run Query button in the toolbar.

The Results pane is populated by Analysis Services, and the dataset shown in Illustration 2 appears.


Illustration 2: Result Dataset - Simple Count with Default INCLUDEEMPTY

While we see the count of Store Types in each of our three Countries of operation to be six, a quick glance at the Store Type dimension in the metadata pane of the Sample Application indicates that the FoodMart organization coincidentally has exactly six Store Types, as depicted in Illustration 3.


Illustration 3: The Store Type Dimension in the Metadata Pane

One of those store types is Headquarters, a type that exists for classification purposes only, as an entity that experiences no ordering activity. For this and other reasons, we can see that the counts we deliver to Finance need to be adjusted for empty cells / invalid combinations.

7.  Select File --> Save As, name the file MDX17-1, and place it in a meaningful location.

8.  Leave the query open for the next section.

We will next adjust the Count() function in our query to enforce removal of empty cells from the count.


Excluding Empty Cells


The elimination of empty cells is a mere keyword away. We will accomplish our objective in the steps that follow:


1.  Within the query we have saved as MDX17-1, add the EXCLUDEEMPTY keyword to the Count() function, separated by a comma from the existing specified set, as shown:


'Count( {[Store Type].[Store Type].MEMBERS }, 
  EXCLUDEEMPTY)'

2.  Replace the top comment line of the query with the following:

-- MDX17-2:  Counting that excludes Empty Cells

The Query pane appears as shown in Illustration 4.

Click for larger image

Illustration 4: The Query with Modifications Circled

3.  Execute the query by clicking the Run Query button in the toolbar.

The Results pane is populated, and the dataset shown in Illustration 5 appears. We see that the counts drop to two, five and five, for the Canada, Mexico and USA Store Types, respectively.


Illustration 5: Result Dataset - Count() with EXCLUDEEMPTY Flag in Place

4.  Save the file as MDX17-2.

Let's tidy things up a bit by ordering our results dataset based upon the counts we have obtained.

5.  Within the query we have saved as MDX17-2, make the following change to the ON ROWS portion of the SELECT section of the query, as shown:

Change:

    { [Store].[Store Country].MEMBERS } ON ROWS

To:


{ORDER ( [Store].[Store Country].MEMBERS,
         [Store Type].[Types Count] , BDESC ) } ON ROWS

6.  Replace the top comment line of the query with the following:

    -- MDX17-3:  Ordering the Row Axis

The Query pane appears as shown in Illustration 6.


Illustration 6: The Query with Modifications Circled

7.  Execute the query by clicking the Run Query button in the toolbar.

The Results pane is populated, and the dataset shown in Illustration 7 appears. The results are ordered by Types Count, largest to smallest.


Illustration 7: Result Dataset - Ordered by Types Count (Descending)

NOTE: For a tutorial surrounding the Order() function, see MDX Essentials: Basic Set Functions: The Order() Function.

8.  Save the file as MDX17-3.

We appear to have attained our objective. One more step, near and dear to the hearts of those who wish to ensure the quality of their work, and particularly to recovering CPA's like myself, is a test of accuracy and completeness. Let's look at the answer we are about to give Finance from another perspective - and at the same time look ahead to the subject of our next lesson, the CrossJoin() function.

Ascertaining Accuracy and Completeness

To get a feel for the correctness of the information we are about to report to the information consumers in Finance, let's draft a query that asks for the data from a different viewpoint. We will request the information indirectly, without using the Count() function at all, but simply asking for Units Ordered information for each Store Type, by Country.

We will use a Crossjoin() function to accomplish this alternative perspective. CrossJoin() allows us to combine two or more dimensions on a single row, column, etc., and can be a powerful tool to use in MDX queries. We will devote an article to CrossJoin() in our next lesson, but, for now, we will use it as a means of checking our results from the prior sections. Consider this a preview of CrossJoin() as well, and as an opportunity to give it some thought before we focus on its use in the next and later articles.

1.  Select File --> New to create a new query in the Sample Application.

2.  Type the following into the Query pane:


-- MDX17-4:  Proof of  Count() Results
SELECT
    {[Measures].[Units Ordered] } ON COLUMNS,
NON EMPTY
CROSSJOIN (
    {[Store].[Store Country].Members},
    {[Store Type].[Store Type].Members}
        ) ON ROWS
FROM
    [Warehouse]
WHERE 
[Time].[1998]

3.  Execute the query by clicking the Run Query button in the toolbar.

The Results pane is populated by Analysis Services, and the dataset shown in Illustration 8 appears.


Illustration 8: Result Dataset - Using CrossJoin() to Generate Another Perspective

While we have asked for the Units Ordered quantities for each Store Type, by Country (excluding empty combinations with another keyword for that purpose, NON EMPTY, that we will encounter again in the future), we have also achieved another objective: we can see that our counts were accurate in reviewing the physical breakdown of the Store Types that had activity for each country. This provides proof positive that our counts in earlier sections were correct, and, in our business case, that the numbers we intend to provide to Finance are accurate. (With only a little additional work, we will even be ready to respond to the virtually certain request for "more information" once we provide the statistics requested!)

Summary ...

In this lesson, we stepped out of the purely set-related functions that we have examined in recent articles to focus on the numerical Count() function, as it is applied to sets. We introduced the Count() function with a discussion of its straightforward purpose, to return the number of cells in a specified set, and then exposed options within the syntax for overriding the default behavior of the function, to exclude empty cells within the range of the specified set, and therefore within the returned result of a query using the function.

Along with an introduction to the Count() function, our lesson included an examination of the syntaxes surrounding the function and illustrative examples where we used the function, both with and without the EXCLUDEEMPTY keyword, to meet a hypothetical business need. We performed a query using the CrossJoin() function to act as a "proofing" procedure, to ensure the accuracy and completeness of the results we obtained within our practice with the Count() function, as well as to preview CrossJoin() for our next article. Finally, throughout the steps of our examples, we discussed the results we obtained using MDX.

» See All Articles by Columnist William E. Pearson, III

Discuss this article in the MSSQL Server 2000 Analysis Services and MDX Topics Forum.

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved