DB2 9 and Microsoft Excel 2007 Part 3: Building Your Own Queries...

Monday Aug 6th 2007 by Paul Zikopoulos
Share:

Learn how to use the Microsoft Query function that’s part of Excel 2007 to customize the data that’s retrieved from the STAFF table as opposed to the entire data that’s retrieved when using the DB2STAFFTABLE connection object that we built in Part 1 of this series.

In my last two articles on Microsoft Excel 2007 (Excel 2007) and DB2 I showed you how to get DB2 data into your worksheets and work with that data with some of the new advanced and very powerful features that are new in Excel 2007; at the same time, I’ve been introducing you to the new framework that accompanies Excel 2007.

Since these articles have published, I received some questions that relate to the data set I pulled into my example. Specifically, all my examples showed you all the data within a table: complete data sets. I was asked what happens if you don’t need all the data, just some of it. The reason why all the data in the table is always returned is because I showed you how to use the From Data Connection Wizard option from the Data ribbon in the”DB2 9 and Microsoft Excel 2007 Part 1: Getting the Data...” article.

In this article, I’m going to show you how to use the Microsoft Query function that’s part of Excel 2007 to customize the data that’s retrieved from the STAFF table as opposed to the entire data that’s retrieved when using the DB2STAFFTABLE connection object that I showed you how to build in Part1.

Specifically, I’m going to show you how to list all the IDs, names, job classifications, years of service, and salaries of all the employees who work in the Sales department, aren’t in department 20, and make over $35,000 a year. Additionally, this data will be ordered according to the salary in descending order. This result set could be created by writing the following SQL query:

SELECT ID, NAME, DEPT, JOB, YEARS, SALARY FROM STAFF
 WHERE STAFF.JOB = 'Sales' AND 
 NOT STAFF.DEPT = 20 AND 
 STAFF.SALARY > 35000 
 ORDER BY STAFF.SALARY DESC    

If you run this query against the SAMPLE database in DB2 9, you should receive the following results:

Click for larger image

Note: To create the SAMPLE database if you haven’t already done so, enter the db2sampl command from your operating system’s command line processor (CLP).

Using Microsoft Query to get your data

To get your DB2 data into the Microsoft Query Builder in Excel 2007, you use the Data ribbon in the same manner that I showed you with the Data Connection wizard.

Click for larger image

From the Get External Data frame, perform the following steps:

1.  In the Data ribbon, select From Other Sources>From Microsoft Query. The Choose Data source window opens.

2.  Select the database alias name for the SAMPLE database (assuming it’s already defined) from the list in the Choose Data Source window. Ensure the Use the Query Wizard to create/edit queries (the default option) is selected, and click OK.

If you created the SAMPLE database on the same machine where you intend to import the data into your Excel 2007 worksheet, it should already appear in this window.

If the database you want to connect to doesn’t appear in this list, you can add it by performing the following steps:

a.  Double-click <New Data Source> in the Databases tab. The Create New Data Source window opens.

b.  Enter a database alias name for your database in the What name do you want to give your data source? field.

c.  Select the appropriate driver for this database connection from the Select a driver for the type of database you want to access field. The Create New Data Source window should look similar to this:

Note: If IBM DB2 ODBC Driver – DB2COPY1 doesn’t exist, select IBM DB2 ODBC Driver. Depending on what’s installed on your system, and the number of copies or different versions of DB2 software you have, you may see one, both, or more of these drivers.

d.  Click Connect. The Connect To DB2 Database window opens. Select the actual name of the database alias name you defined in Step b, as well as the appropriate credentials in the User ID and Password fields, and click OK.

If the connection was successful, you will see the name of the database beside the Connect button in the Create New Data Source window:

You can use the Select a default table for your data source (optional) field to pick a table that this connection will return. If want to connect to other tables in the database, you should leave this field blank. This way, Excel 2007 will give you the option to select the table you want to connect to each time you invoke this connection.

e.  Click OK.

3.  Expand the STAFF table in the Available tables and columns list and use to move the columns you want in the query to the Columns in your query list, and click Next. If you’re trying to build the query outlined at the start of this article, the Query Wizard – Choose Columns page should look like this:

So far, we’ve written the SELECT ID, NAME, DEPT, JOB, YEARS, SALARY FROM STAFF part of our query.

Note: As an alternative to selecting a column and clicking each time, you can simply double-click the column and it will automatically move over to the Columns in your query list.

You can use Preview Now to look at the data if you want to ensure it contains the values you are looking for. In the previous figure, you can see that I previewed the COMM column’s data.

If you’re dealing with a database that has a large schema (for example, some ERP systems have over 30,000 tables), the Options button is going to become your best friend. You can click this button to filter (by schema name or object type) the list of objects returned to this wizard. For example, the following configuration only returns tables in alphabetical order that belong to the PAULZ schema which makes locating the STAFF table much easier:

4.  Use the filter data part of this wizard to apply the WHERE clause to the query by performing the following steps, clicking Next on each page:

a.  Select JOB and set the Only include rows where box to equals and Sales:

b.  Select DEPT and set the Only include rows where box to does not equal and 20:

c.  Select SALARY and set the Only include rows where box to is greater than and 35000.00:

These steps define the WHERE STAFF.JOB = 'Sales' AND NOT STAFF.DEPT = 20 AND STAFF.SALARY > 35000 part of this query.

Each time you select a column from the Column to filter list, you're implicitly adding AND conditions to your query. This can be confusing because there are And and Or radio buttons for each column. You should be aware that these are for the highlighted column, not the query. You can see this by selecting each of the columns you just set in the previous steps; you will see that they each have their own setting. The And and Or radio buttons in the previous figure allow you to further filter data for a specific column.

5.  Finally, define the sort order by selecting the Salary column from the Sort Order drop-down list and select Descending, and then click Next.

This step defines the final portion of the query: ORDER BY STAFF.SALARY DESC.

6.  Select Return Data to Microsoft Office Excel and click Finish:

7.  Select Table and click OK (the rest of the options on this window are fine for now):

Excel 2007 now connects to the target database and retrieves that data using the SQL statement you just built (although you didn’t really hand-code any SQL). If the data server is slow, or the SQL statement takes a long time to process, Excel 2007 will tell you it’s working on your query:

The results of your query are returned to your Excel 2007 worksheet:

Refining and saving your query results

The results returned to your Excel 2007 worksheet are linked to the target database. If you want to refine this query, right-click anywhere in the result set and select Table > Edit Query:

If you click through this wizard, you’ll see that all of the settings you selected are preserved. For this example, click Next until you reach the Query Wizard – Finish page, and save your query so that you can easily use it another time.

Click for larger image

As you can see, it’s easy to define the data you want to return to your Excel 2007 worksheet without having to write a single line of SQL. What I showed you was Excel’s Query wizard, which sits on top of the Microsoft Query editor. While you can do a lot without knowing SQL, you can’t do everything. For example, in our query you may have noticed that I included the DEPT column and restricted the results of the query output to only those employees that aren’t in department 20. Suppose you wanted to refine your query such that the DEPT column wasn’t part of the result set, but you still wanted to restrict the results of your query to those employees that aren’t in department 20. Using the wizard, you wouldn’t be able to do this since the Query Wizard – Filter Data page only lets you apply to filters to data that is part of the result set.

To get around this problem, you can use the Query Editor that sits behind this Query wizard. This approach gives you a little more control over the query, but it’s also a little more complex. To invoke the Query Editor, you can click Cancel at any time as you step through the Query wizard, and you will be prompted to transfer whatever you’ve done in the wizard directly to the editor:

Note: You can also build your query and refine it using the Query Editor by selecting View data or edit query in Microsoft Query in the Query Wizard – Finish page.

You can use the Query Editor to remove columns and filter columns that aren’t part of the result set (but are part of the table), and more. For example, you can see in the following figure that I used the Query Editor to return the same query, only this time department data wasn’t part of the result set; however, only those employees that don’t belong to department 20 were part of the result set:

When you are finished altering the query using the Query Editor, simply close it and the results of the query are returned to your Excel 2007 worksheet:

Wrapping it up...

In this article, I showed you how to retrieve a subset of a table’s data without writing a single line of SQL. What’s more, I showed you how to work beyond the Query wizard to define a more complex query. Using these tools, you should be able to work beyond all the data in a table, and focus your analysis on the data you want. In my next article, I’ll introduce you to the world’s most popular analysis artifact: the pivot table, and show you how to create and leverage pivot tables with live DB2 9 data in Excel 2007.

» See All Articles by Columnist Paul C. Zikopoulos

About the Author

Paul C. Zikopoulos, BA, MBA, is an award-winning writer and speaker with the IBM Database Competitive Technology team. He has more than thirteen years of experience with DB2 and has written more than one hundred-fifty magazine articles and is currently working on book number twelve. Paul has authored the books Information on Demand: Introduction to DB2 9.5 New Features, DB2 9 Database Administration Certification Guide and Reference (6th Edition), DB2 9: New Features, Information on Demand: Introduction to DB2 9 New Features, Off to the Races with Apache Derby, DB2 Version 8: The Official Guide, DB2: The Complete Reference, DB2 Fundamentals Certification for Dummies, DB2 for Dummies, and A DBA's Guide to Databases on Linux. Paul is a DB2 Certified Advanced Technical Expert (DRDA and Cluster/EEE) and a DB2 Certified Solutions Expert (Business Intelligence and Database Administration). In his spare time, he enjoys all sorts of sporting activities, including running with his dog Chachi, avoiding punches in his MMA class, and trying to figure out the world according to Chloë – his daughter. You can reach him at: paulz_ibm@msn.com.

Trademarks

IBM and DB2 are registered trademarks of International Business Machines Corporation in the United States, other countries, or both.

Microsoft is a trademark of Microsoft Corporation in the United States, other countries, or both.

Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.

Other company, product, or service names may be trademarks or service marks of others.

Copyright International Business Machines Corporation, 2007. All rights reserved.

Disclaimer

The opinions, solutions, and advice in this article are from the author’s experiences and are not intended to represent official communication from IBM or an endorsement of any products listed within. Neither the author nor IBM is liable for any of the contents in this article. The accuracy of the information in this article is based on the author’s knowledge at the time of writing.

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved