MS Access for the Business Environment: Access Query Techniques: Crosstab Queries

Monday Nov 3rd 2003 by William Pearson
Share:

View data from another perspective with a crosstab query. Join author Bill Pearson in a hands-on tutorial that explains how to summarize and present information in a concise table of rows and columns.

About the Series ...

This article continues the series, MS Access for the Business Environment. The primary focus of this series is an examination of business uses for the MS Access relational database management system. The series is designed to provide guidance in the practical application of data and database concepts to meet specific needs in the business world. While the majority of the procedures I demonstrate will be undertaken with MS Access 2002, many of the concepts that we expose in the series will apply to other versions of MS Access.

For more information on the series, as well as the hardware / software requirements to prepare for the tutorials we will undertake, please see Tutorial 1: Create a Calculated Field with the Expression Builder.

Introduction to this Tutorial

As many of us know from general business experience, the crosstab format is popular for data presentation among a wide range of information consumers. The typical objective of a crosstab query is to display summarized values from one field in a table, grouping them by one set of dimensions listed down the left side of the dataset and another set of dimensions listed across the top of the dataset.

While we can certainly rely upon the Crosstab Query Wizard to guide us through the creation of a basic crosstab query, the objective of this tutorial is to explore the creation and use of an SQL query to render a result dataset in a crosstab format. To accomplish this, we will introduce the TRANSFORM statement, and then use it in a multiple-step practice example to become familiar with its operation.

Along with an introduction to the TRANSFORM statement, this lesson will include:

  • an examination of the syntax surrounding the use of the statement;
  • illustrative examples of the use of the statement in a multi-step practice exercise that constructs a crosstab query;
  • a brief discussion of the results datasets that we obtain in each step of the practice example.

    We will use the Northwind sample database that installs with MS Access for purposes of this lesson, in an attempt to make the lesson useful to virtually anyone who has access to MS Access.

    The TRANSFORM Statement

    When we summarize data using a crosstab query, we select and position values, from specified fields or expressions, as column headings to obtain a results dataset that is presented in a more compact format than what we would see with a standard SELECT query. The TRANSFORM statement puts into play an aggregate function within a SELECT statement to bring about the generation of the crosstab query.

    We will examine the syntax of the TRANSFORM statement to gain an understanding of its components, and then we will build a crosstab query. Starting with a basic SELECT statement, we will add the TRANSFORM statement in multiple steps to illustrate its use in reaching our end objective.

     

    Discussion

    As we noted in the introduction, a crosstab query usually displays summarized values, or aggregates, from one field in a table. The crosstab query generates a table that has both a row header and a column header; it groups the aggregates by a set of dimensions on the left-axis, and another set of dimensions listed across the top of the dataset. The resulting aggregate data is therefore listed in the table under a cross-reference between these row and column headers - an intersect point, in other words. This is a convenient way to display data in a compact, summarized format.

    The available aggregate functions, which are used to calculate statistical information when applied to a specified numeric field in the SELECT query, include the functions shown in Table 1.



     

    FUNCTION

     

    OUTPUT

         

    AVG

     

    Calculates the arithmetic mean of a specified set of values

    COUNT

     

    Calculates the number of records returned by a query

    MAX

     

    Returns the maximum of a set of specified values or a table

    MIN

     

    Return the minimum of a set of specified values or a table

    STDEV

     

    Estimates standard deviation, a measure of how widely values are dispersed from a mean, based upon a sample of a population

    STDEVP

     

    Estimates standard deviation, a measure of how widely values are dispersed from a mean, based upon a population

    SUM

     

    Returns the sum of a set of values contained in a specified field

    VAR

     

    Returns estimates of the variance for a specified population sample

    VARP

     

    Returns estimates of the variance for a specified population

    Table 1: Aggregate Functions

    We most often see the COUNT() and SUM() functions used in cases where the SELECT query is being subjected to TRANSFORM.

    When we use the TRANSFORM statement to convert a standard SELECT query to a crosstab query, we specify the fields we want to display as row headings, together with using a GROUP BY clause that specifies row groupings. We specify the column headings with a PIVOT operator, and have the option of using an IN clause to limit those headings to fixed values. Finally, we can include other clauses, such as WHERE, as in any SELECT query, to specify additional selection or sorting criteria.

    We will see how a simple SELECT statement constructed for the Northwind database can be converted to a crosstab query, and how we can leverage the components of the TRANSFORM statement, in the steps of the practice example that follow.

    Syntax

    Syntactically, the TRANSFORM statement is used in the following basic manner:

    PARAMETERS [prompt] type;

    TRANSFORM aggregate function
    SELECT select statement

    PIVOT pivotfield [IN (value1 [, value2 [, ...] ] ) ]


    The TRANSFORM statement contains the components described in Table 2:

    Component

    Description

    PARAMETERS declaration (optional)

    The optional PARAMETERS declaration allows us to specify a prompt that we wish to appear at run time, as well as to specify the data type of the prompt input.

    aggregate function

    An SQL aggregate function that operates on the selected numerical field values. The subtotals generated appear under column headings in the result dataset.

    select statement

    A SELECT statement.

    The SELECT statement may contain the following additional clauses (of the following, only GROUP BY is required):

    • WHERE (optional) - Restricts the results dataset to a specified subset
    • GROUP BY (mandatory) - Enforces aggregation within row headings as selected in the specified fields list of the SELECT statement
    • ORDER BY (optional) - specifies the ordering of the row headings

    pivotfield

    Defines the field or expression used in creating column headings in the query's result dataset.

    IN Predicate (value1, value2, etc.)

    (optional)

    Value1, value2, etc. comprise a comma-separated list of desired column headings. The optional IN predicate, followed by our specified list of column headings, provides us the flexibility to show, or otherwise control order and appearance of, specific fields.

    Table 2: Transform Statement Components

    AS we have already seen, the TRANSFORM statement needs to come before the SELECT query in the SQL string. The SELECT statement specifies the fields used as row headings, and contains a GROUP BY clause that specifies row grouping. (The SELECT query can be quite sophisticated; we can add subqueries, for example, as predicates in the WHERE clause within our crosstab queries.

    The pivotfield values appear as column headings in the query's result dataset. A common example we see in the business world might be pivoting expense values on the month of the posting to generate twelve columns in the crosstab's result dataset - one for each month of the year (assuming that transactions had occurred on the specified expense accounts during each of the months). To carry this simple illustration a step further, we could easily restrict the results to only show the first quarter of the year by specifying months 1, 2, and 3, in an optional IN predicate. This would constrain the pivotfield to generate headings solely from the fixed values of the first three months of the year.

    I like to base whole reports off crosstab queries, in many instances. In such a scenario, little more than formatting need be accomplished within the reporting application. This can mean that we leave the "heavy lifting" involved in the report to the db engine, which is often the more optimal place for this to be handled.

    Let's move into a hands-on illustration to reinforce our understanding of crosstab queries, using the Northwind sample database that accompanies a typical installation of MS Access.

  • Practice

    We will confirm our understanding of the basics we have covered so far, by using the TRANSFORM statement in a multi-step practice example that illustrates various aspects of its operation. As we have in past lessons, we will create a simple query, then evolve it step by step to expose several of the concepts that we have discussed. As we shall also see in prospective lessons within this series, the query will be our tool for constructing and executing the SQL we examine together, and for viewing the result datasets we obtain.

    We will start MS Access and proceed, taking the following steps:

    1.  Go to the Start button on the PC, and then navigate to the Microsoft Access icon, as we did in Lesson 1: Create a Calculated Field with the Expression Builder.

    2.  Click the icon to start MS Access.

    MS Access opens, and may display the initial dialog. If so, close it.

    3.  Select File -> Open from the top menu, and navigate to the Northwind sample database (the file might also be accessed from the Open a File menu atop the task pane, if it has not been disabled previously, at the right side of the main window in Access 2002.)

    4.  Select Northwind.mdb.

    The splash screen may appear.

    NOTE: As we have noted earlier in the series, we can preclude the appearance of the splash screen each time we enter the sample Northwind database by checking the "Don't show this screen again." checkbox. For now, we will leave it unchecked.

    5.  Click OK.

    The splash screen disappears, and is replaced by the Main Switchboard, as shown in Illustration 1.


    Illustration 1: Inside MS Access, Northwind Main Switchboard

    6.  Click the Display Database Window, or get there by an alternative approach.

    We arrive at the Database Window, which appears as depicted in Illustration 2.


    Illustration 2: Inside Access, Northwind Main Switchboard

    7.  Click Queries, under Objects in the Database window.

    The existing queries appear, as shown in Illustration 3.


    Illustration 3: Queries in the Northwind Database (Partial View - with New Button Circled)

    NOTE: The queries that appear in your individual view may differ, depending upon past activities within the sample database, etc.

    8.  Click the New button, shown circled in Illustration 3 above.

    The New Query dialog appears, as shown in Illustration 4.


    Illustration 4: The New Query Dialog

    9.  Ensuring that the Design View option is selected, click OK.

    The Select Query dialog appears by default, with the Show Table dialog appearing in front, as shown in Illustration 5.


    Illustration 5: The Select Query and Show Table Dialogs

    10.  Click Close on the Show Table dialog.

    The Show Table dialog closes, leaving only the Select Query dialog present.

    We will now move to SQL view, as we wish to use direct SQL and to work with queries at a level that goes beyond working with wizards, or even the Design view we have used in the past.

    11.  Select SQL View using the View Selector button in the main toolbar (it appears under the File item on the main menu), as shown in Illustration 6.


    Illustration 6: Select SQL View

    The SQL view editor appears, complete with a SELECT keyword in place, followed by the ubiquitous ending character for MS Access queries, the semicolon (";"). Illustration 7 depicts the initial view.


    Illustration 7: Initial SQL View

    Here we can enter, display, and / or modify a query using SQL directly, as we will throughout this lesson. (We can do many things here that might prove difficult or impossible in Design view, or under the auspices of wizardry. We want to see the actual SQL for a crosstab in the present example, and the powerful flexibilities it affords us.)

    We will compose a simple SELECT query to gain an understanding of our fundamental dataset; our query will focus on the customer and product information, with the appropriate joins to create relationships between various tables in the Northwind database.

    12.  Type the following basic SELECT query into the editor:

    SELECT
    
       Customers.CompanyName, Categories.CategoryName, 
    
         Count(Products.ProductID) AS Qty
    
    FROM
     
       (Categories 
     
         INNER JOIN Products ON Categories.CategoryID = Products.CategoryID)  border=0 alt="">
    
            INNER JOIN ((Customers 
     
               INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID) 
     
                  INNER JOIN [Order Details] 
     
    ON Orders.OrderID = [Order Details].OrderID) 
     
       ON Products.ProductID = [Order Details].ProductID
    
    GROUP BY Customers.CompanyName, Categories.CategoryName, 
     
       Products.ProductID;
    

    We can run the query at this stage by selecting Query --> Run from the main menu.

    13.  Save the query as ACC06-Step1.

    14.  Select Query --> Run from the main menu.

    The resulting dataset appears as partially shown in Illustration 8.


    Illustration 8: The SELECT Query Dataset

    We have assembled the basic SELECT query, and run it to obtain the resulting dataset. Our objective in doing this is to be able to see the differences in the SELECT query and the crosstab query, which is essentially a "transformed" SELECT query that calculates and restructures the SELECT query dataset into a more useful presentation layout.

    Notice that our SELECT query dataset only groups the totals vertically by customer company name and product category. This results in multiple records in many cases, making comparisons between different customers' total products more difficult. It is even more difficult to readily tell at a glance the composition of each customer's activity, totaled by product category, or even to easily discern a total quantity of products. The database with which we are practicing is quite small - if we can extrapolate this same scenario to a much larger database, it is easy to see why the simple SELECT query dataset might not meet our needs in a user-friendly way.

    A crosstab query displays the same information, but groups it both horizontally and vertically so the datasheet is more compact and easier to analyze. We shall see the effect in our next steps, as we use the TRANSFORM statement to present our data in a crosstab, grouping by product category horizontally and by customer vertically.

    15.  Ensure that the query is saved as ACC06-Step1.

    16.  Shift back to SQL view, if necessary.

    17.  Skip a couple of lines below the existing query, and type a dotted line approximately the width of the query.

    18.  Skip a couple lines below the dotted line, and input the following crosstab query exactly as shown:

    TRANSFORM Count(Products.ProductID) AS Qty
    
    SELECT Customers.CompanyName
    
    FROM 
    
    (Categories 
    
       INNER JOIN Products ON Categories.CategoryID = Products.CategoryID)  border=0 alt="">  
    
          INNER JOIN ((Customers 
    
             INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID) 
    
                INNER JOIN [Order Details] ON Orders.OrderID = [Order Details].OrderID)  
    
                   ON Products.ProductID = [Order Details].ProductID
    
    GROUP BY Customers.CompanyName
    
    PIVOT Categories.CategoryName;
    

    The query should now resemble that shown in Illustration 9.


    Illustration 9: Juxtaposed SELECT and Crosstab Queries, for Comparative Purposes

    My intent in adding the TRANSFORM statement below the original SELECT query, after the "dividing line," is simply to offer a platform whereby we might see the two juxtaposed briefly, to allow for a comparison. Obviously we would not attempt to run the query in its present state.

    We can see the rearrangement of the upper SELECT query into the lower TRANSFORM statement's layout: First, the TRANSFORM statement applies the COUNT aggregate function. We see that the TRANSFORM statement appears before the SELECT statement that it is "transforming."
     
    After the mandatory SELECT statement, which, as we can easily note in our arrangement from above, is identical to the initial SELECT query (including the GROUP BY clause) , we can see the PIVOT clause, where we have placed the Categories.CategoryName field that once existed in the first line of the SELECT query. Recall that the PIVOT clause determines the column names for the crosstab - in our case, it sets up a column for every existing Category Name in the dataset under consideration.

    19.  Delete the top half of the existing text in the query - down to and including, the dotted line. Align the new crosstab query (from TRANSFORM down) to the top of the SQL view editor window, as shown in Illustration 10.

    20.  Save the query as ACC06-Step2.

    The query should now resemble that shown in Illustration 10.


    Illustration 10: The New Crosstab Query

    21.  Select Query --> Run from the main menu.

    The resulting dataset appears as partially shown in Illustration 11.


    Illustration 11: The Dataset Retrieved by the Crosstab Query (Partial View)

    We can see that the crosstab query returns a list of different Customers with product quantities (counts), summarized by product category, for each. As we have noted, the PIVOT is pivoting over Category Name, presenting an individual column for which sales occurred. We also note that there are a total of 89 records.

    Let's take a look at restricting the data retrieved with an IN clause addition to the PIVOT clause.

    22.  Shift back to SQL view, once again, if necessary.

    23.  Add the following between the end of the existing PIVOT line of the query and the ";" character at its end:

    IN ("Produce", "Seafood", Magazines);

    NOTE: Be sure to avoid surrounding Magazines with quotation marks in the added clause. We shall see why momentarily.

    The query should now resemble that shown in Illustration 12.


    Illustration 12: The Crosstab Query with Modified PIVOT Portion

    24.  Select Query --> Run from the main menu.

    The resulting dataset appears as partially shown in Illustration 13.


    Illustration 13: The Dataset Retrieved by the Restricted Crosstab Query (Partial View)

    We note that, while the number of records returned remains the same number as we saw retrieved in our last query (89), we now have only three columns; two of the columns ("Produce" and "Seafood") are driven by our request for fixed, existing columns in the dataset. I added the third field (Magazines) to the IN clause to illustrate the manner by which we might add a fixed value that does not exist naturally in the database table, as a means of creating a column in the returned dataset. (Had I enclosed Magazines in quotation marks, then the quotation marks would have been literally presented in the newly created column heading.)

    25.  Save the query as ACC06-Step3.

    In wrapping up our practice example, we will now examine another way to restricting the returned data. This time, we will add an optional PARAMETERS declaration, whereby we create a filter that is selected by the user of the query at run time. The user is prompted, in our example, for the year for which he / she wishes to restrict the query. Many options exist in the typical database for parameters by which to filter via this means.

    26.  From SQL view add the following, precisely as shown, above the TRANSFORM clause, to the SQL in ACC06-Step3:

    PARAMETERS [What Year?] Long;

    27.  Add the following between the end of the FROM clause (ON Products.ProductID = [Order Details].ProductID) and before the GROUP BY clause:

    WHERE DatePart("yyyy", Orders.OrderDate) = [What Year?] 

    The query should now resemble that shown in Illustration 14.

     


    Illustration 14: The Crosstab Query with Modifications

    28.  Select Query --> Run from the main menu.

    We are prompted for "What Year?"

    29.  Type 1996 into the Enter Parameter Value dialog, as shown in Illustration 15.


    Illustration 15: Enter 1996 into the Dialog

    30.  Click OK to run the query.

    The Enter Parameter Value dialog disappears, and the query is executed, retrieving the filtered dataset.

    31.  Save the query as ACC06-Step4.

    The final dataset appears as partially shown in Illustration 16.


    Illustration 16: The Retrieved Dataset, Filtered for the Year Input

    We now see the power of the Year filter: the records are filtered for those containing order dates of 1996 (and, as a quick review of the SQL makes clear, are joined to the fields that we entrain via the SELECT query). We note that the number of records represented by Illustration 16 (66) is a subset of the pre-filter query result dataset that we viewed in Illustration 13 (89).

    With even minimal examination, we can discern the utility of the TRANSFORM statement, and of the crosstab query in general, for generating compact, user-friendly presentations of data. There are many other considerations in deciding whether to deploy crosstab functionality, such as the number of records in the underlying tables, the complexity of the underlying SELECT statement and so forth. Executing a crosstab query can cause serious performance degradation on what might be a critical server, among other negatives, depending on these factors. Careful planning would be prudent in a production environment, as always. But, in general, crosstab queries can be of dramatic benefit.

    Conclusion...

    With this lesson we explored the creation and use of an SQL query to render a result dataset in a crosstab format. To this end, we introduced the TRANSFORM statement, then examined the syntax surrounding the use of the statement. We then practiced the construction of crosstab queries within the context of an illustrative, multiple-step example, discussing the relevant characteristics of some of its components to become familiar with crosstab query operation. Finally, we briefly discussed various aspects of the results datasets that we obtained in each step of the practice examples.

    » See All Articles by Columnist William E. Pearson, III

    Share:
    Home
    Mobile Site | Full Site
    Copyright 2017 © QuinStreet Inc. All Rights Reserved