DB2 9 and Microsoft Excel 2007 - Part 1: Getting the Data...

Tuesday Jul 3rd 2007 by Paul Zikopoulos
Share:

Paul Zikopoulos demonstrates just how easy it is to access data from DB2 9 in your Excel worksheets.

Without a doubt, Microsoft Excel, is the world’s most ubiquitous reporting and data analysis tool. Quite often I see clients export massive amounts of data from their enterprise systems into another data server (guess which one) for the sole purpose of supporting Excel reporting.

The main problems associated with this approach are as follows:

1.  There are too many copies of the data. Will they all be the same? Either way, it leads to redundant data costs.

2.  There is too much latency with the data. It takes too long to get the data to the people who need it. In addition, everyone sees different data with inconsistent points in time with respect to when the data was captured.

3.  The solution is too complex. This much data movement requires its own life cycle maintenance, batch jobs, and more.

4.  The data is not secure. Data is left on minimally secured laptops. I don’t need to go much further on this point.

5.  The solution is too expensive. Numerous components lead to increased costs. These costs are often hidden in distributed organizations, but they can be massive -- core operating system licensing costs, additional data server license costs, network costs, patching costs, security costs for a whole new layer of data access, and more.

What if you could just leave all that data in your DB2 (including DB2 for z/OS data server) and just access it? After all, if your enterprise data warehouse (EDW) is built on DB2, it can pretty much scale limitlessly and is designed to handle all kinds of workloads. If you could do this, you could achieve the following benefits:

1.  A dramatic reduction in the number of copies of the data with a “Single View of the Business”.

2.  Reduced latency time with fewer input and access paths. Near real-time load and access tools providing up-to-date business intelligence.

3.  A simplified architecture making the environment easier to manage for faster solution development and quicker return on investment (ROI)!

4.  A more secure solution since data is locked in one place with enterprise-wide policies.

5.  A less expensive solution. Fewer components lead to reduced costs in both infrastructure and human resources – all this results in lower total cost of ownership (TCO).

The following figure summarizes these approaches:

In fact, the only good reason that I can see for the way things are typically done is easy access. Therefore, in this article, I’m going to show you just how easy it is to access data from DB2 9 in your Excel worksheets.

Getting your DB2 9 data into Excel 2007

To get your data into Excel from a DB2 9 data server, you need to start with a database connection. You can work with database connections in Excel 2007 using the Data ribbon, as shown below:

Ribbons are a new ‘usability’ feature in Office 2007, though I’m not sure how usable they are; they take some getting used to. With that said, once you fumble your way around the interface, and learn where things are now located, the product becomes easier and nicer to use. If you read this article, you can leverage the learning curve I’ve already traversed and save yourself some valuable time (and frustration).

From the Data ribbon, you can perform all the data-specific actions you need (outside of regular Excel operations such as formatting, and so on). The Data ribbon has a number of subcategories. For example, in the far left, you can see the Get External Data frame, which is obviously used to import external data sources into your Excel 2007 worksheet, and it’s exactly what you use to get your DB2 9 data into your Excel worksheet:

To import the contents of the STAFF table into your Excel 2007 worksheet, perform the following steps:

1.  In the Data ribbon, select From Other Sources>From Data Connection Wizard from within the Get External Data frame:

In the previous figure, you can see that you can import data from a number of sources but for this article, since we are going after data that resides in a DB2 9 data server that isn’t known to Excel 2007 yet, the From Data Connection Wizard option is the one we want to select.

2.  The Data Connection window opens. Select Other/Advanced, and click Next.

3.  The Data Link Properties window opens. Select the IBM OLE DB Provider for DB2 OLE DB provider, and click Next.

When you install a DB2 9 client on your workstation, the corresponding OLE DB provider is automatically installed as well, and this is what Excel 2007 uses to get data from a DB2 database.

The figure in Step 2 shows you the option to select an ODBC DSN. When you catalog a DB2 database, you can choose to configure the database as a user, system, or file DSN using the native Windows facility - the Data Source (ODBC) tool. Since my SAMPLE database is set up as a system DSN, I could have used that connection to retrieve the data as well:

4.  Select the database that you want to connect to from the Specify the DB2 database box, ensure that you have the correct credentials for this database connection in the Log on information box, and click Test Connection:

If the database that you want to connect to isn’t cataloged on the local machine where you are running Excel 2007, you can search for existing DB2 data servers and their databases (as well as DB2 Connect gateways if you are attempting to add a connection to a DB2 for System i or DB2 for System z data server) using the DB2 Discovery feature that’s integrated with Excel. You can leverage this facility by selecting the Direct server connection radio button and select a database from the Select the database on the server drop-down list (you might need to click Refresh to update the server list or the database list is returns):

It’s outside the scope of this article, but you should be aware that you can use the Advanced tab to set advanced properties of this connection such as the connection timeout allotment, or perhaps a hard-coded password:

5.  Click OK if the connection is successful.

The DB2 OLE DB provider is well integrated into Excel 2007. For example, if Excel 2007 wasn’t able to get a connection to the SAMPLE database because the database instance was down, it would display the following message:

If the database was currently in use, (perhaps it was in a quiesce state or a backup was being performed), Excel 2007 would display a message like this:

6.  The Data Connection wizard opens. Select the table whose data you want to import into Excel 2007, and click Next. For example, you can select the contents of the STAFF table located in the SAMPLE database:

Note: To create the DB2 9 SAMPLE database, enter db2sampl from your operating system’s command prompt.

7.  Optionally, save that data connection (I'll refer to this as a database connection since that’s what it is), a great idea if you plan to reuse it, and click Finish:

If you don’t want to save this database connection in the future, you can simply click Finish in the previous step. In the previous figure, you can see that I’ve saved the current database connection (which is solely a connection to the STAFF table) for future reference. You can optionally add a description of this connection object (in the Description field) and search metadata (in the Search Keywords field) for easier subsequent retrieval. Finally, note that you have fine grained control over the authentication settings for this connection by clicking Authentication Settings.

8.  The Import Data window opens. Use this window to specify where the data should be placed in your Excel 2007 worksheet. Select the defaults for now, and click OK:

You can see in the previous figure that you can choose to place the data that is retrieved from DB2 9 in a Pivot Table report or chart, or both. I will discuss these options in my next article. (If you want to further configure this connection string’s properties, click Properties.)

Your Excel 2007 worksheet should now look similar to this:

Of course, at this point you can use the data as you see fit. For example, you can leverage Excel 2007’s strong formatting capabilities and apply a Table Style from the Design ribbon, for example:

You can also use the Table Style Options frame in the Design ribbon, like this:

When the data is loaded in Excel 2007, you can use the External Table Data frame in the Data ribbon to manage the database connection. For example, selecting Unlink will detach the data from the table and make it independent:

And you can quickly filter the returned data without writing a single line of SQL:

One of the things I love about Excel is its strong charting capability. Using this method, it’s very easy to whip up some snazzy charts with your DB2 9 data:

In the previous chart, you can see the ratio of salary to commission expenses for the 5 employees in department 20. Of course, so long as you didn’t disconnect the record set, you can refresh the data and the chart will be updated immediately. For example, run the following data manipulation language statement to drastically change the amount of commission Pernal makes (he’s employee 2 in the chart):

            UPDATE TABLE STAFF SET COMM=55565.56 WHERE ID=20

Now refresh the data:

And, of course, the base data is updated as well:

Now think back to the way most clients leverage Excel: by staging the data in a de facto data server. If your real-time data changed, when would be the next time your analysts would be analyzing real-time data? That’s right, after the next batch job!

Wrapping it Up...

In this article, I showed you how easy it is to import your DB2 9 data into an Excel 2007 worksheet. After getting the data into Excel 2007, I showed you some cool features that you can use to change the way the data is displayed (using different Table Tools ribbon frames) as well as how to create a chart and refresh the data in real time.

» See All Articles by Columnist Paul C. Zikopoulos

About the Author

Paul C. Zikopoulos, BA, MBA, is an award-winning writer and speaker with the IBM Database Competitive Technology team. He has more than ten years of experience with DB2 and has written more than one hundred articles and several books about it. Paul has co-authored the books: DB2 9 New Features, Information on Demand: Introduction DB2 9 New Features, DB2 Version 8: The Official Guide, DB2: The Complete Reference, DB2 Fundamentals Certification for Dummies, DB2 for Dummies, and A DBA's Guide to Databases on Linux. Paul is a DB2 Certified Advanced Technical Expert (DRDA and Cluster/EEE) and a DB2 Certified Solutions Expert (Business Intelligence and Database Administration). In his spare time, he enjoys all sorts of sporting activities, running with his dog Chachi, and trying to figure out the world according to Chloë – his new daughter. You can reach him at: mailto:paulz_ibm@msn.com.

Trademarks

IBM, DB2, System i, and System z are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both.

Microsoft is a trademark of Microsoft Corporation in the United States, other countries, or both.

Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.

Other company, product, and service names may be trademarks or service marks of others.

Copyright International Business Machines Corporation, 2007. All rights reserved.

Disclaimer

The opinions, solutions, and advice in this article are from the author’s experiences and are not intended to represent official communication from IBM or an endorsement of any products listed within. Neither the author nor IBM is liable for any of the contents in this article. The accuracy of the information in this article is based on the author’s knowledge at the time of writing.

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved