In my last two articles on Microsoft Excel 2007 (Excel 2007) and DB2 I showed you how to get DB2 data into your worksheets and work with that data with some of the new advanced and very powerful features that are new in Excel 2007; at the same time, Ive been introducing you to the new framework that accompanies Excel 2007.
Since these articles have published, I received some questions that relate to the data set I pulled into my example. Specifically, all my examples showed you all the data within a table: complete data sets. I was asked what happens if you dont need all the data, just some of it. The reason why all the data in the table is always returned is because I showed you how to use the From Data Connection Wizard option from the Data ribbon in theDB2 9 and Microsoft Excel 2007 Part 1: Getting the Data... article.
In this article, Im going to show you how to use the Microsoft Query function thats part of Excel 2007 to customize the data thats retrieved from the STAFF table as opposed to the entire data thats retrieved when using the DB2STAFFTABLE connection object that I showed you how to build in Part1.
Specifically, Im going to show you how to list all the IDs, names, job classifications, years of service, and salaries of all the employees who work in the Sales department, arent in department 20, and make over $35,000 a year. Additionally, this data will be ordered according to the salary in descending order. This result set could be created by writing the following SQL query:
SELECT ID, NAME, DEPT, JOB, YEARS, SALARY FROM STAFF WHERE STAFF.JOB = 'Sales' AND NOT STAFF.DEPT = 20 AND STAFF.SALARY > 35000 ORDER BY STAFF.SALARY DESC
If you run this query against the SAMPLE database in DB2 9, you should receive the following results:
Note: To create the SAMPLE database if you havent already done so, enter the db2sampl command from your operating systems command line processor (CLP).
Using Microsoft Query to get your data
To get your DB2 data into the Microsoft Query Builder in Excel 2007, you use the Data ribbon in the same manner that I showed you with the Data Connection wizard.
From the Get External Data frame, perform the following steps:
1. In the Data ribbon, select From Other Sources>From Microsoft Query. The Choose Data source window opens.
2. Select the database alias name for the SAMPLE database (assuming its already defined) from the list in the Choose Data Source window. Ensure the Use the Query Wizard to create/edit queries (the default option) is selected, and click OK.
If you created the SAMPLE database on the same machine where you intend to import the data into your Excel 2007 worksheet, it should already appear in this window.
If the database you want to connect to doesnt appear in this list, you can add it by performing the following steps:
a. Double-click <New Data Source> in the Databases tab. The Create New Data Source window opens.
b. Enter a database alias name for your database in the What name do you want to give your data source? field.
c. Select the appropriate driver for this database connection from the Select a driver for the type of database you want to access field. The Create New Data Source window should look similar to this:
Note: If IBM DB2 ODBC Driver DB2COPY1 doesnt exist, select IBM DB2 ODBC Driver. Depending on whats installed on your system, and the number of copies or different versions of DB2 software you have, you may see one, both, or more of these drivers.
d. Click Connect. The Connect To DB2 Database window opens. Select the actual name of the database alias name you defined in Step b, as well as the appropriate credentials in the User ID and Password fields, and click OK.
If the connection was successful, you will see the name of the database beside the Connect button in the Create New Data Source window:
You can use the Select a default table for your data source (optional) field to pick a table that this connection will return. If want to connect to other tables in the database, you should leave this field blank. This way, Excel 2007 will give you the option to select the table you want to connect to each time you invoke this connection.
e. Click OK.
3. Expand the STAFF table in the Available tables and columns list and use to move the columns you want in the query to the Columns in your query list, and click Next. If youre trying to build the query outlined at the start of this article, the Query Wizard Choose Columns page should look like this:
So far, weve written the SELECT ID, NAME, DEPT, JOB, YEARS, SALARY FROM STAFF part of our query.
Note: As an alternative to selecting a column and clicking each time, you can simply double-click the column and it will automatically move over to the Columns in your query list.
You can use Preview Now to look at the data if you want to ensure it contains the values you are looking for. In the previous figure, you can see that I previewed the COMM columns data.
If youre dealing with a database that has a large schema (for example, some ERP systems have over 30,000 tables), the Options button is going to become your best friend. You can click this button to filter (by schema name or object type) the list of objects returned to this wizard. For example, the following configuration only returns tables in alphabetical order that belong to the PAULZ schema which makes locating the STAFF table much easier:
4. Use the filter data part of this wizard to apply the WHERE clause to the query by performing the following steps, clicking Next on each page:
a. Select JOB and set the Only include rows where box to equals and Sales:
b. Select DEPT and set the Only include rows where box to does not equal and 20:
c. Select SALARY and set the Only include rows where box to is greater than and 35000.00:
These steps define the WHERE STAFF.JOB = 'Sales' AND NOT STAFF.DEPT = 20 AND STAFF.SALARY > 35000 part of this query.
Each time you select a column from the Column to filter list, you're implicitly adding AND conditions to your query. This can be confusing because there are And and Or radio buttons for each column. You should be aware that these are for the highlighted column, not the query. You can see this by selecting each of the columns you just set in the previous steps; you will see that they each have their own setting. The And and Or radio buttons in the previous figure allow you to further filter data for a specific column.
5. Finally, define the sort order by selecting the Salary column from the Sort Order drop-down list and select Descending, and then click Next.
This step defines the final portion of the query: ORDER BY STAFF.SALARY DESC.
6. Select Return Data to Microsoft Office Excel and click Finish:
7. Select Table and click OK (the rest of the options on this window are fine for now):
Excel 2007 now connects to the target database and retrieves that data using the SQL statement you just built (although you didnt really hand-code any SQL). If the data server is slow, or the SQL statement takes a long time to process, Excel 2007 will tell you its working on your query:
The results of your query are returned to your Excel 2007 worksheet: