MS Access for the Business Environment: Access Query Techniques: Using the TOP Keyword

Monday Jan 5th 2004 by William Pearson
Share:

Find the first (n) records in an Access query. In this lesson, Bill Pearson explores the use of the TOP keyword to return a specific number or percentage of records at the top or bottom of an ordered query result dataset.

About the Series ...

This article continues the series, MS Access for the Business Environment. The primary focus of this series is an examination of business uses for the MS Access relational database management system. The series is designed to provide guidance in the practical application of data and database concepts to meet specific needs in the business world. The majority of the procedures I demonstrate in this article and going forward will be undertaken with MS Access 2003, although most of the concepts that we explore in the series will apply to earlier versions of MS Access, as well.

For more information on the series, as well as the hardware / software requirements to prepare for the tutorials we will undertake, please see Tutorial 1: Create a Calculated Field with the Expression Builder.

Introduction to this Tutorial

In our last lesson, Access Query Techniques: Subqueries, Part I, we introduced subqueries within the context of MS Access, and practiced their creation and use in rendering a result dataset to meet illustrative business needs. As a part of our exploration of subqueries, we examined the syntax surrounding the use of a straightforward sample subquery; we then began an illustrative, hands-on example of the use of a subquery in a multi-step practice exercise. After evolving an initial query, we created a second query, within which we nested the first, to demonstrate the operation of a subquery in our practice exercise. Finally, we briefly discussed various aspects of the results datasets that we obtained in each step of the practice examples.

Before moving to our next article, Access Query Techniques: Subqueries, Part II, we will first consider a keyword whose use we will see again in the second half of our exploration of subqueries, as well as elsewhere. We will overview the TOP keyword, and get a feel for its operation in an undistracted scenario, before proceeding into the creation of a subquery that leverages the keyword to illustrate more sophistication in the use of subqueries than we saw in Part I.

The objective of this lesson is to introduce the TOP keyword as it is used in Microsoft Jet SQL, and to explore ways we can use it to meet illustrative business needs. To accomplish this, we will follow our approach in previous lessons, and perform the following activities surrounding our examination of the TOP keyword:

  • an examination of the syntax surrounding the use of TOP;
  • an illustrative, hands-on example of the use of TOP in a practice exercise;
  • a similar example where we use the optional PERCENT keyword, in conjunction with TOP, to return a percentage, versus a number, of rows in an illustrative dataset;
  • a brief discussion of the results datasets that we obtain in each of the practice examples.

We will use the Northwind sample database that installs with MS Access for purposes of this lesson, in an attempt to make our discussion useful to virtually anyone who has access to the application.

Introduction to the TOP Keyword



The need to limit a list to a "top" number or percentage is quite common in business, as it is in many other facets of life. The "top ten" customers with regard to revenue, for example, for a given year would be a useful list for most organizations, in that it would be likely to present a clear idea of which customers have contributed the most to the all-important profit margins. By the same token, a "bottom" number or percentage can prove useful, as well; it can serve, for example, as an indicator of a good place to initiate steps to improve performance, or to take other remedial or preventative action.



We can use the TOP capability in the QBE pane of MS Access by simply typing (or selecting one of the defaults for) the number by which we wish to narrow a query's results dataset into the combo box provided for that purpose, as shown in Illustration 1, should we be using the Design view approach to writing a simple query to avoid writing "direct SQL." As we mentioned in our last lesson, however, the more real life business requirements we encounter, the more likely we will find ourselves in scenarios where we need to understand the SQL that is constructed behind the queries we create within the graphical environments. This lesson will focus on the construction of the SQL required to use the TOP keyword, along with the optional PERCENT keyword, to help us to leverage the TOP capabilities to their fullest, and to prepare us for their use in Access Query Techniques: Subqueries, Part II.




Illustration 1: Using TOP from the Top Values Combo Box in the Design View

Discussion

As we noted in the introduction, the TOP keyword allows us to restrict the number of rows returned in a results dataset to a number that we specify. The TOP keyword works in conjunction with the ORDER BY clause, and, in fact, is rendered useless without it. For example, if we order our results by a given value, in descending order, and we use TOP to specify twenty values, then it is obvious that the twenty values returned, being the top twenty in the results dataset, are the largest of the entire set.

Should we have specified ascending in the ORDER BY clause, the "top twenty" returned by the same TOP construction would actually be the bottom twenty values of the entire results dataset, as the first twenty entries in the ordered set would begin with the lowest value and continue in ascending order through the twentieth value.

It thus becomes clear why TOP is ineffective without the ORDER BY clause: the absence of order in the returned dataset would mean that the "top twenty" records selected from our dataset would be in no particular sequence, resulting in a haphazard selection of twenty records with no discernable pattern.

Using the PERCENT keyword in conjunction with the TOP keyword allows us to return a percentage of dataset rows, in contrast to the number of rows that we get when we use TOP alone. But like the solitary TOP keyword, the combination of TOP and PERCENT returns its results from the top or the bottom of the dataset, depending upon the selection within the ORDER BY clause.

Syntax

Let's take a look an example query, where we employ the TOP keyword as a means of examining the syntax involved:

SELECT TOP 10 ProductName, UnitPrice
FROM Products
ORDER BY UnitPrice DESC

The example query delivers the dataset shown in Illustration 2.


Illustration 2: Example Query Results - TOP Keyword

In this simple example, we are asking that the top ten records, in terms of Unit Price, be delivered from the population of the Products table; this results because we have asked for the top ten in a list of Products, which is sorted from highest to lowest with respect to Unit Price. The ten highest priced Products, therefore, compose the result dataset.

Now let's inject the PERCENT keyword into our example, to explore the syntax surrounding its use.

SELECT TOP 10 PERCENT ProductName, UnitPrice
FROM Products
ORDER BY UnitPrice DESC

The modified query delivers the dataset shown in Illustration 3.


Illustration 3: Example Query Results - TOP and PERCENT Keywords

We have modified our query to request the records that compose ten percent of the rows in the Products table. This turns out to be ten percent of 77 total rows (a simple Select * FROM Products shows 77 to be the total number of records in the results dataset), or, as our illustration indicates, eight rows (the closest we can get with whole rows) from our ordered range.

Let's move into a couple of hands-on illustrations to reinforce our understanding of these concepts, using the Northwind sample database.

Practice

We will again reinforce our understanding of the rudiments by working through a multi-step practice example that illustrates the concepts in operation.

As we have in past lessons, we will create a simple query, then evolve it to expose each of the concepts that we have discussed. The query that we build will be our tool for constructing and executing the SQL we examine together, and for viewing the results datasets we obtain.

Let's say, for purposes of our exercise, that we have been approached by organizational information consumers with a special year-end business requirement: Accounting, at the direction of the external auditors, has asked that we help them to provide information needed for a year-end, operational cutoff test that the auditors would like to conduct. While the audit team has the backup for the sales figures that the organization has generated for the year 1997, they wish to verify the reports that they have for accuracy and completeness.

As a part of their test work, the audit team wishes to review a sample list of the last independently recorded sales shipments for the year, to ensure that they are properly reflected in the organization's financial ledgers. To accomplish this, they want a report from the database showing the last twenty orders of 1997 - by shipment date, (revenue is properly recorded when earned, meaning that the shipping date is more relevant than the order date.) They will use this sample list to verify that the sales transactions to which the shipping records relate appear to be properly booked in the financial system.

To summarize, we need to generate a list of the last twenty shipments of 1997. This turns out to provide another excellent use of the TOP keyword, as we will soon see. We will build the query through multiple steps, so as to comment on its various components in a manner that reinforces the concepts involved. First, we will create a core SELECT query, and then we will add the refinements required to meet the information consumer's needs.

We will start MS Access and proceed, taking the following steps:

1.  Go to the Start button on the PC, and then navigate to the Microsoft Access icon, as we have in previous lessons in this series.

2.  Click the icon to start MS Access.

MS Access opens, and may display the initial dialog. If so, close it.

3.  Select File -> Open from the top menu, and navigate to the Northwind sample database (the file might also be accessed from the Open a File menu atop the task pane, if it has not been disabled previously, at the right side of the main window in MS Access.)

4.  Select Northwind.mdb.

The splash screen may appear.

NOTE: As we have noted earlier in the series, we can preclude the appearance of the splash screen each time we enter the sample Northwind database by checking the "Don't show this screen again." checkbox. For now, we will leave it unchecked.

5.  Click OK.

The splash screen disappears, and is replaced by the Main Switchboard.

6.  Click the Display Database Window, or get there by an alternative approach.

We arrive at the Database Window, which appears as depicted in Illustration 4.


Illustration 4: Inside Access, Northwind Main Switchboard

7.  Click Queries, under Objects in the Database window.

The existing queries appear.

8.  Click the New button on the Database window toolbar.

The New Query dialog appears, as shown in Illustration 5.


Illustration 5: The New Query Dialog

9.  Ensuring that the Design View option is selected, click OK.

The Select Query dialog appears by default, with the Show Table dialog appearing in front, as shown in Illustration 6.


Illustration 6: The Select Query and Show Table Dialogs

10.  Click Close on the Show Table dialog.

The Show Table dialog closes, leaving only the Select Query dialog present.

We will now move to SQL view, as we did in our previous lesson, because we wish to use direct SQL, and to work with queries at a level that goes beyond working with Design view.

11.  Select SQL View using the View Selector button in the main toolbar (it appears under the File item on the main menu).

The SQL view editor appears, complete with a SELECT keyword in place, followed by the ubiquitous ending character for MS Access queries, the semicolon (";").

We have again arrived at a point where we can enter, display and / or modify a query using SQL directly, as we will throughout this lesson. As we found in earlier lessons, we can do many things here that might prove difficult (if not impossible) in Design view, or within the realm of wizardry. While we can construct many queries within the query design grid, SQL view is clearly the environment from which to gain a basic understanding of SQL.

We will begin by composing a simple core SELECT query, to generate a result dataset from which we will meet the core consumer requirement; we will begin by listing all transactions as a means of establishing the layout of the final results dataset, whereby we want to provide the shipping date, order and customer identification, and the total amount of the order shipped. It appears that the data we need can be found within the Orders and Order Details tables, and so our query will center on these tables throughout the session.

12.  Type the following basic SELECT query into the editor:


SELECT
 
Orders.OrderID, ShippedDate, ShipName, ShipAddress, 

ShipCity, ShipCountry, 

UnitPrice*Quantity-(UnitPrice*Quantity*Discount) AS Total

FROM 

Orders INNER JOIN [Order Details]
ON 

Orders.OrderID = [Order Details].OrderID;

We can run the query at this stage, after saving it using File in the main menu.

13.  Save the query as ACC08-01.

14.  Select Query --> Run from the main menu.

The resulting dataset appears as partially shown in Illustration 7. We note that 2,155 records are returned, per the counter at the bottom of the Data view.


Illustration 7: The Core Query Results Dataset (Partial View)

We have assembled our basic SELECT query, and have run it to obtain the full population of the information fields we require from the Orders and Order Details tables. Our objective in doing is to get a layout established for the results we ultimately seek, and to ascertain that the calculation we have inserted for the Total amount of each record is working correctly (a quick test of the math will indicate that the Total calculation is, indeed, accurate). Beginning our query development in this way will also allow us to obtain a familiarity for the underlying data as we proceed.

We note that the above query delivers a record set containing multiple instances of the same Order ID in many cases. This is because the Order Details table breaks order information into a separate line item for each product, so that, for example, if a customer purchases three products in the same order, three records will exist with the same Order ID in the Order Details table.

Our information consumers want a single line item for each order, so we will take care of that feature next.

15.  Ensure that the query is saved as ACC08-01.

Let's modify the query to generate summed totals by Order ID. We will do this via two additions to our existing SQL, as shown in the next step.

16.  Shift back to SQL view, once again, if necessary.

17.  Add the following, immediately before UnitPrice, and just after ShipCountry, in the SELECT clause:

SUM(

18.  Insert a right parenthesis ")" between "Discount)" and "AS Total".

The objective is simply to enclose the calculation within the parentheses of a SUM function. The affected line within the SELECT clause should appear as below:

SUM(UnitPrice*Quantity-(UnitPrice*Quantity*Discount)) AS Total

19.  Add the following line to the query, just below the FROM clause:

GROUP BY Orders.OrderID, ShippedDate, ShipName, ShipAddress, ShipCity,
   ShipCountry;

20.  Remove the semicolon (";") that appears at the end of the line above that we just added.

The query should now resemble that shown in Illustration 8.


Illustration 8: The SELECT Query with Modifications

21.  Select Query --> Run from the main menu.

The resulting dataset appears as shown in Illustration 9.


Illustration 9: The Dataset Retrieved by the Modified SELECT Query (Partial View)

We note that the Order IDs are combined into single summed records. We can see also that the total number of records returned in the dataset is 830, which serves to confirm the fact that the product transactions with the same Order IDs have been combined.

22.  Save the query as ACC08-02.

Our next step will be to limit the transactions to the year 1997. Our results dataset currently displays the combined Order information for all years in the table. We can restrict the data to the year under examination with the addition of a simple WHERE clause, as we shall see.

23.  Shift back to SQL view, once again.

24.  Add the following WHERE clause between the existing FROM and GROUP BY clauses:

WHERE ShippedDate Between #1/1/1997#
  And #12/31/1997#

This will restrict the records returned to those representing transactions with Shipped Dates that lie between January 1st and December 31st, inclusive.

25.  Save the query as ACC08-03.

The query should now resemble that shown in Illustration 10.

Click for larger image

Illustration 10: The SELECT Query with Modifications

26.  Select Query --> Run from the main menu.

The resulting dataset appears as partially shown in Illustration 11.


Illustration 11: The Dataset Retrieved by the Modified SELECT Query (Partial View)

We can see that the transactions now appear to be limited to those that occurred in 1997, a circumstance that is also confirmed by the fact that the total number of records has again decreased markedly, this time to just under 400.

We are now ready to insert the TOP keyword to drive the generation of the "top" twenty records. Recall that the ORDER BY clause actually makes this work. We are simply going to order our existing result dataset by Shipped Date, descending, so that we obtain the records by date, most recent to oldest. Then, when we insert the TOP keyword, we are essentially asking for the "top twenty" of that list (which is ordered, again, most recent to oldest). We are thus saying "give me the twenty most recent" - or the twenty last - transactions that occurred in 1997.

27.  Shift back to SQL view, once again.

28.  Add the following to the SELECT clause, to the immediate right of the word SELECT:

TOP 20

29.  Add the following ORDER BY clause to the end of the query (be sure to remove the semicolon (";") that ends the line above, if it is present).

ORDER BY
   Orders.ShippedDate DESC;

30.  Save the query as ACC08-04.

The query should now resemble that shown in Illustration 12.


Illustration 12: The SELECT Query with the TOP Keyword Modifications

31.  Select Query --> Run from the main menu.

The resulting dataset appears as partially shown in Illustration 13.


Illustration 13: The Dataset Retrieved by the Modified SELECT Query (Partial View)

We immediately see the effects of the TOP keyword, together with the ORDER BY clause through which it is driven. We see, as well, how we can use TOP with dates, as well as the more intuitive number data types, to deliver a results dataset from a range of dated transactions. An examination of the number of records in the dataset, however, presents a perplexing phenomenon: the counter indicates "22," instead of the twenty records we might have expected.

The reason for the apparent deviation from the expected results is simply a manifestation of the behavior of the TOP keyword in the event of a "tie." When equal values are present in the range from which we are selecting the "top" records, all records that have the equal value are returned.

To prove this concept, take the following step.

32.  Select SQL View using the View Selector button once again.

The Query appears.

33.  Change the 20 after TOP to 19.

34.  Save the query as ACC08-05.

35.  Select Query --> Run from the main menu.

The resulting dataset appears as partially shown in Illustration 14, and is limited to nineteen rows, because the nineteenth Shipped Date from the end of 1997 has only one occurrence, 16-Dec-1997.


Illustration 14: Results of Modification of the TOP Keyword

Now let's take a quick look at the TOP keyword used in conjunction with the optional PERCENT keyword.


36.  Select SQL View using the View Selector button for a final time.


The SQL view editor appears.


37.  Change the 19 after TOP to 10.


38.  Add the word PERCENT immediately to the right of the 10 inserted above.


The PERCENT keyword in the above case instructs the query to return 10 percent of the rows (that is, approximately 40 rows, as we shall see).


39.  Change the last word in the query (DESC in the ORDER BY clause) to ASC.


In this example, ASC directs that the ORDER of the range be from January, 1997 dates to December 1997 dates. TOP 10 PERCENT therefore will return the "top" ten percent of the rows from the January / beginning end of the range.


40.  Save the query as ACC08-06


The query should now resemble that shown in Illustration 15.



Illustration 15: The SELECT Query with TOP Modifications and PERCENT Add

41.  Select Query --> Run from the main menu.

The resulting dataset appears as partially shown in Illustration 16.


Illustration 16: Results of Modification of the TOP Keyword

We see that the first forty transactions for 1997 appear. A quick test of the accuracy of the result dataset can be performed by simply running the query again with the TOP 10 PERCENT keyword combination removed. The result set that returns indicates a total row count of approximately 398 rows. As we stated in the Syntax section above, the records returned using the TOP and PERCENT combination is composed of the stated numeric percent (ten, in this case) of the total rows (398 per the "proof" test), in whole rows (for a total of forty rows).

NOTE: Another way to see the total population that we have subjected to our keyword combination would be simply to substitute 100 (or "100 percent) in the place of the 10 in TOP 10 PERCENT. The result dataset again returns 398 records.

42.  Close the query, without saving if you performed the test step, to return it to the TOP 10 PERCENT example we saved as ACC08-06.

Within the context of the practice example we have explored, we can easily see the usefulness of the TOP keyword, together with its optional "modifier" PERCENT, to help us to meet the business needs of information consumers. While we took a look at only a couple of possible uses, there are many scenarios where these keywords can be invaluable in helping us to meet our objectives. We will provide a further illustration of such an instance, where we can use TOP within a subquery scenario, in our next article, and perhaps in prospective articles over the months to come.

Conclusion and Summary

In this article, we introduced a keyword that will again take the stage in our next article, Access Query Techniques: Subqueries, Part II, which continues the exploration of subqueries we began in our previous lesson. We overviewed the TOP keyword, both with and without the optional PERCENT keyword, to examine its construction and operation independently from other distractions, to prepare us for its use in our next lesson. In Part II, we will create a subquery that leverages the keyword to illustrate more sophistication in the use of subqueries than we saw in our introduction to them in Part I.

We introduced the TOP keyword as it is used in Microsoft Jet SQL, and explored ways we can use it to meet illustrative business needs. To accomplish this, we examined the syntax surrounding the use of the TOP, and the optional PERCENT, keywords, then undertook illustrative practice exercises where we received hands-on exposure to the concepts. We then discussed the results datasets that we obtained in each of the practice examples.

» See All Articles by Columnist William E. Pearson, III

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved