DB2 9 and Microsoft Excel 2007: Part 4: Creating a Pivot Table...

Monday Aug 13th 2007 by Paul Zikopoulos
Share:

Part 4 of this series examines how to leverage the framework created in part 3 and how to create and populate (with DB2 9 data) one of the most important and capable data analysis tools around: pivot tables.

In my previous article called DB2 9 and Microsoft Excel 2007: Getting the Data DB2 9, I showed you how easy it is to get data from an IBM DB2 9 data server into your Excel 2007 worksheets. I also showed you some of the neat Excel 2007 capabilities to ‘dress-up’ your data. In this article, I want to leverage the framework we created in the last article and show you how to create and populate (with DB2 9 data) one of the most important and capable data analysis tools around: pivot tables.

A what table did you say?

A pivot table is an interactive table that presents itself to users in an organized and summarized fashion. It can be instantly rotated to view data in multiple dimensions to detect patterns and relationships, and discover trends.

The best way to understand this concept is through an example. Assume you have a bunch of data, for example, the salary and commission payouts for all the employees in your company. You’re a data analyst and now you want to get some meaningful insight into this data so you can answer HR-related questions such as:

  • Are there any sales reps in a region with an underestimated quota? (This would be identified by an abnormally large commission payout compared with others in their department or across the organization.)
  • Who is the highest performing sales person as a function of their department, or as a function of a job type?
  • Which department is the highest performing with respect to sales and commissions? Which are the highest performing job categories within a specific department?

You can answers questions like these and more using a pivot table.

Of course, its name suggests this object’s most compelling feature – the ability to easily pivot. Using a pivot table, you can graphically change the display of the table such that a certain type of information is displayed in rows and then instantly displayed as columns (the pivot). The interactive nature of a pivot table lets you literally spin the table to see different summaries of the data.

This might not seem complex to you if you are used to working with Online Analytical Processing (OLAP) cubes; however, consider the following data definition language (DDL) statements for a relational engine:

  CREATE TABLE DJAUTOSALES (year int, quarter int, results int)

  INSERT INTO DJSALES VALUES  (2004,1,20),(2004,2,30),(2004,3,15),
    (2004,4,10),(2005,1,18), (2005,2,40), (2005,3,12),(2005,4,27)

Let’s assume your first report started out just listing the data by Year, Quarter, and Results as follows:

	SELECT * FROM DJAUTOSALES

The previous DML would return the following output:

Now let’s pivot this table to get some more interesting data. For example, you could use the following query to show 1 row per year with each column being the result of the sales by quarter (1 column per quarter):

SELECT YEAR, max(case when quarter = 1 THEN results end) as
Q1,max(case when quarter = 2 THEN results end) as Q2, 
max(case when quarter = 3 THEN results end) as Q3,
max(case when quarter = 4 THEN results end) as Q4
FROM DJAUTOSALES GROUP BY YEAR

The previous DDL and DML statements would look like this:

Creating a pivot table with DB2 data

To create a pivot table using DB2 9 data in Microsoft Excel 2007, perform the following steps:

1.  Open up the Data ribbon by clicking the Data tab:

2.  Click Existing Connections and select the connection you created in the first part of this series, select the DB29STAFFTABLE connection, and click Open.

3.  The Import Data window opens. Use this window to specify what you want to create and where you want the data to be placed. Select PivotChart and PivotTable Report, and select OK:

You can see in the previous figure that you can optionally select the data retrieved from DB2 9 to be placed as a regular table in your worksheet (as in my previous article), or just as a pivot table report without an accompanying chart.

You can also use the Properties button to further configure connection properties that relate to this data’s usage and the definition of the connection string built by Excel 2007 to DB2 9:

Excel 2007’s framework for Pivot Table opens:

A tour of the pivot table controls before you get started

Before I show you how to use this pivot table, it’s useful to go over a couple of its features. The PivotChart Filter Pane allows you to toggle on and off the actual columns and filters that you can apply to the pivot table report. Click the filter button () to turn this filter on or off:

In order to use the PivotChart Filter Pane, you need to have focus on the charting area. (The box in the previous diagram is where Excel will build the pivot table.)

You use the Pivot Table Field List to drag fields to your pivot table. The Fields Selection and Area Select Stacked button () allows you to define how you’d like to see the fields, report filters, legends, and values information displayed. This option gives you a granular level of visual control that’s especially useful if you are dealing with a large amount of data. For example, you might have such a large number of fields that it makes sense to concentrate at the field level of the pivot table when creating it and then focus on filtering data after you’ve defined columns for inclusion. You have five options for the display, as shown below:

Building your pivot table

To build a pivot table and a report that investigates the salary of each employee as detailed in the STAFF table, perform the following steps:

Note: The STAFF table is located in the SAMPLE database. You can create this database, populated with data, by entering the db2sampl command from your operating system’s command prompt. I inserted new data into the STAFF table to create salary ‘distortions’ for this example. You can insert any data that you want. (Your worksheet will look different, but you should still be able to follow along.)

1.  Select Salary from the Pivot Table Field List. Instantly, the pivot table chart is updated with this data and looks similar to the following chart:

Also, note that on the left side of the worksheet, Excel 2007 keeps a numeric table of all the analysis you perform. I will refer to this table when it dynamically changes in response to the operations we perform on the pivot table chart since they are linked:

For example, we’ve only included the sum of all salaries in the STAFF table, so that is the only data shown in column A row 4.

This option gives you the flexibility to work with your data numerically, or visually using a chart. Also, note that the Values field in the Pivot Table Field List is automatically updated with the Salary column after you click the corresponding check box:

As you can see in the previous figure, you can perform a number of options on any column in any field by clicking a column’s corresponding properties drop-down arrow ().

2.  Since we want to look at the distribution of salaries in this company by department, drag the DEPT column from the Choose fields to add to report list and put it in the Axis Fields (Categories) list so that your Pivot Table Field List looks like this:

You will also see that instantly the salary distribution changed from a simple summary to a summary by department:

Note that the details that underpin the previous chart changed as well:

Now the data is a little more interesting because we are able to investigate the breakdown of salary by department.

3.  Further break down the granularity of the salary information by dragging NAME below the DEPT column in the Axis Fields (Categories) list. Your report should now look similar to the following example:

With this report across two dimensions, I can learn a little more about my data. While the initial report in Step 2 suggested that departments 20 and 66 have relatively high salary costs, department 20’s Melnyk is certainly above all other employees and warrants further investigation – is he that good? -->(The answer is yes.)

4.  From the previous step, we know that employee Melnyk’s salary seems high, but we should likely investigate the kind of job he performs as well. Drag JOB to Legend Field (Series). Your pivot table should now look like this:

Again, as you drag and drop data into the pivot filter fields, the values and the corresponding chart are dynamically updated.

Note: If you don’t want your charts and reports to change every time you move a data column, you can select Defer Layout Update; nothing will change in the pivot table (or its report) until you click Update:

5.  Now that we have a pivot table, we can explore some of the browsing features that Excel 2007 offers you for your DB2 9 data. For example, in the pivot table itself, you can collapse columns of data by clicking the corresponding toggle (). In addition, when you collapse columns in the pivot table, they are automatically collapsed in the corresponding chart:

6.  Pivot the table such that salaries are shown by name then department as opposed to department then name (as is the case in the previous figure). Drag DEPT column in the Axis Fields (Categories) list below the NAME column. Your worksheet should now look similar to this:

Click for larger image

7.  Pivot the table again such you can look at the salary distribution of the employees by clerk by ensuring that your Pivot Table Field List looks like this:

Your pivot table chart should now look similar to the following chart:

Click for larger image

And the data should look similar to this:

8.  Clear all the data from your pivot table by selecting the chart and clicking Analyze>Clear>Clear All:

9.  Drag DEPT to the Axis Fields (Categories), SALARY to Values, and JOB to Legends Fields (Series) to create a new pivot table and report that displays the total salary broken down by department and then subdivided by job type. Your pivot table should now look like this:

And the corresponding data should look like this:

10.   Perform a final pivot of this table such that the report now breaks down the total salary costs by department as a function of job category by changing the location of the JOB and DEPT columns in the Drag fields between areas below list. Your worksheet should now look like this:

And the data should look like this:

Wrapping it up...

In this article, I showed you how to leverage the DB2 9 data server connection that you created in a previous article to create a pivot table and a corresponding chart. I also took you through the steps to populate this table, look at slices of data, drill down into this data, and pivot the data around salary. The point is that once you get your pivot table to Excel 2007 – it’s just Excel functions – but it’s all DB2 data. In my next article, I’ll show you some more things you can do with a pivot table.

» See All Articles by Columnist Paul C. Zikopoulos

About the Author

Paul C. Zikopoulos, BA, MBA, is an award-winning writer and speaker with the IBM Database Competitive Technology team. He has more than thirteen years of experience with DB2 and has written more than one hundred-fifty magazine articles and is currently working on book number twelve. Paul has authored the books Information on Demand: Introduction to DB2 9.5 New Features, DB2 9 Database Administration Certification Guide and Reference (6th Edition), DB2 9: New Features, Information on Demand: Introduction to DB2 9 New Features, Off to the Races with Apache Derby, DB2 Version 8: The Official Guide, DB2: The Complete Reference, DB2 Fundamentals Certification for Dummies, DB2 for Dummies, and A DBA's Guide to Databases on Linux. Paul is a DB2 Certified Advanced Technical Expert (DRDA and Cluster/EEE) and a DB2 Certified Solutions Expert (Business Intelligence and Database Administration). In his spare time, he enjoys all sorts of sporting activities, including running with his dog Chachi, avoiding punches in his MMA class, and trying to figure out the world according to Chloë – his daughter. You can reach him at: paulz_ibm@msn.com.

Trademarks

IBM and DB2 are registered trademarks of International Business Machines Corporation in the United States, other countries, or both.

Microsoft is a trademark of Microsoft Corporation in the United States, other countries, or both.

Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.

Other company, product, or service names may be trademarks or service marks of others.

Copyright International Business Machines Corporation, 2007. All rights reserved.

Disclaimer

The opinions, solutions, and advice in this article are from the author’s experiences and are not intended to represent official communication from IBM or an endorsement of any products listed within. Neither the author nor IBM is liable for any of the contents in this article. The accuracy of the information in this article is based on the author’s knowledge at the time of writing.

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved