DB2 9.5 and IBM Data Studio: Using OLE DB to Integrate Data

Tuesday Jan 8th 2008 by Paul Zikopoulos

Part 5 of this series introduces the OLE DB function capability available in IBM Data Studio.

So far, in this series about the IBM Data Studio integrated development environment (IDE) that’s available with DB2 Version 9.5 (DB2 9.5), I’ve shown you how to set up database connection objects and the actions that you can perform on them. In addition, I introduced you (in Part 2) to some of the features available in the Database Explorer view, and Part 3 and Part 4 gave you a full tour of overview diagrams. In this article, I want to introduce you to the OLE DB function capability available in IBM Data Studio.

Object Linking and Embedding Database (OLE DB) Functions

Microsoft OLE DB is a set of application programming interfaces through which applications can uniformly access data that is stored in diverse information sources. OLE DB is conceptually divided into OLE DB providers and OLE DB consumers. An OLE DB provider is a software component that exposes OLE DB interfaces, and an OLE DB consumer is the application that accesses the OLE DB interfaces and consumes the data.

The IBM Data Studio gives you the ability, without writing a single line of code, to create OLE DB functions within a DB2 data server that talks to OLE DB providers. You can use this function to create a table and populate it with the target OLE DB data source’s data and persist that data locally in a DB2 table. You can also create a view that calls the OLE DB function and provides access to the target data in the same manner as a regular DB2 view.

Today, most data sources have some sort of OLE DB provider – with DB2 9.5, it’s automatically installed with an IBM Data Server client. For example, using IBM Data Studio, you can define an OLE DB table function to populate a local table automatically with data from a Microsoft Access table. (I will show you how to do that in this article.) In fact, lots of applications (not just data servers) have OLE DB providers today. For example, the Microsoft Exchange address book has a provider, which means that you can easily create a report in DB2 9.5 that seamlessly combines data from an OLE DB table function built over your Exchange address book and all the logic that lies within your DB2 database.

Using OLE DB table functions in DB2 9.5 reduces your application development effort by providing built-in access to most OLE DB providers. With OLE DB table functions, you use a generic OLE DB consumer interface that’s built into DB2 9.5 to retrieve data from most OLE DB providers. You only need to register a table function as LANGUAGE OLEDB, and refer to the OLE DB provider and the relevant row set as a data source; of course, IBM Data Studio takes care of all of this for you. Quite simply, you don’t have to do any function programming to take advantage of OLE DB table functions in DB2 9.5.

Once you get to know these functions in DB2 9.5, you may think they are some sort of panacea for integration. Well, they can be. However, you have to understand that the functionality provided by this method can only be as good as the OLE DB provider that’s installed with the target data source. Considering the fact that Microsoft is focusing on the .NET API these days, vendors aren’t investing in OLE DB of late. Of course, if you want a scalable and rich integration platform, look at the capabilities of the IBM Information Server, which provides native access to target data sources with function compensation and optimizations to make it even easier than this method. With that said, if you want a quick-and-dirty way to grab that data from Microsoft Access (or from any other OLE DB data source, for that matter), this may just do the trick.

Things you have to do to follow the examples in this article...

In case you haven’t read Parts 1 to 4 in this series, I assume, for this article, that you’ve at least created the SAMPLE database (using the db2sampl –xml –sql command) and that the SAMPLE database appears in the Database Explorer view. I’m also assuming that you have access to the NORTHWIND database that comes with Microsoft Access (which further assumes that you have Microsoft Access installed).

Finally, the db2olefn.dll file must be registered on your computer before you can create OLE DB functions using IBM Data Studio. To register this DLL file, enter the following command from your operating system’s command line processor (CLP):

 regsvr32 <DB2_installation_path>\sqllib\function\db2olefn.dll 

You will get a message confirming the registration of this DDL file when it is complete:

Note: Even if you’ve already registered this DLL, the message in the previous figure should still appear.

Building your first OLE DB Function

To create an OLE DB function that creates a table in your DB2 SAMPLE database and populates it with all the data from the CUSTOMERS table in your Microsoft Access NORTHWIND database, perform the following steps:

1.  Select New>Project. The New Project window opens.

2.  Select Data Development Project and click Next.

Before you create routines or other database development objects, you typically create a data development project to store your objects. I like to think of database development projects as a collection of thoughts that you can focus on. It’s important to come up with some kind of analogy like this because you have to understand that objects created in a database development project aren’t deployed there. For example, if you create a stored procedure in a database development project, deploy it to your database, and subsequently erase that stored procedure from the project, it still exists in the database.

A data development project is linked to one database connection in the Database Explorer view that gives the project a target database to deploy objects to. (At deployment time, though, you can choose to deploy an object to a different database.)

Specifically, a data development project is used to store and organize data application development artifacts. You can store and develop the following types of objects in a database development project:

  • Stored procedures
  • User-defined functions
  • SQL scripts
  • XML artifacts
  • Web services

You can also test, debug, export, and deploy these objects from a data development project. The wizards that are available in a data development project use the connection information that is specified for the project to help you develop objects that are targeted for that specific database.

Using the New Data Development Project wizard (shown in the following steps), you specify basic information about the project including the name and current schema. You also specify which database connection to use for the project. You can use an existing connection or create a new connection. Once created, a data development project can be found in the Data Project Explorer view.

3.  Enter DatabaseJournalProject in the Project Name field. The rest of the defaults are fine. Click Next.

Note: I’m just going to show the basics of database development projects to facilitate the example in this article. In a future installment of this series, I will take you through the “ins and outs” of database development projects.

4.  Select Use an existing connection, select a connection object that’s bound to the SAMPLE database, and click Finish.

You can see in the previous figure that I have a number of different database connection objects that connect to the SAMPLE database: some have filters and some don’t.

5.  Expand your new project to see all of the objects that a data development project can contain, as shown below:

6.  Click New>User-Defined Function, and then click Next.

If you don’t see the User-Defined Function option from the File menu, you may need to select Other>Data from the Select a Wizard window. After you select it once, it should automatically be added to the File menu for subsequent usage.

The New User-Defined Function wizard opens. Ensure that the Project field is set to the DatabaseJournalProject. If it isn’t, you can select it from the Project drop-down list. You can also use New to create a new data development project if one doesn’t exist.

Note: You can also access the New User-Defined Function wizard directly from your data development project by right-clicking the User-Defined Functions folder.

7.  In the Name field, enter OLEDB_2_MSACCESS__NORTHWIND_ CUSTOMERS.

8.  Set the Language field to OLE DB, and click Next.

9.  IBM Data Studio may try to connect to the target database where you want to create your OLE DB function. If the Database Authorization window appears, enter a set of credentials that has the authority to create functions on the SAMPLE database, and click OK. (If you are using the user account used to create the SAMPLE database, you should already have this authority.)

10.  The OLE DB Provider window opens. You can use the Connection string field to manually type a connection string (the way most programmers do); however, there’s a better way. Click Build String.

11.  Select the appropriate provider from the Providers page for the target data source. Since you want to access a Microsoft Access database, select the Microsoft Jet 4.0 OLE DB Provider, and click Next.

You can see in the previous figure that IBM Data Studio automatically lists all of the OLE DB providers that are registered on your system. (Just below the provider you selected for this example is the IBM OLE DB Provider for DB2, which is the driver you would use if you wanted to access another DB2 database in this manner.)

12.  The Connect window opens. Select the ellipsis ( ) in the Database file field and select the file that represents the Northwind database for your Microsoft Access installation. (A database file’s extension in Microsoft Access is .mdb; don’t select the .ldb extension since that’s the log file associated with a Microsoft Access database file.)

13.  Ensure that you can connect to the target data source using the selected OLE DB provider by clicking Test Connection. If the test is successful, you will see the following message:

If the connection isn’t successful, ensure that you selected the correct OLE DB provider for the target data source and that the data source is available; for example, if you are connecting to a DB2 or an Oracle database, you have to ensure that the database instance is started.

14.  The Options page opens. You can use this page to specify some additional settings for the selected provider. The settings available will depend on the provider you selected. For our example, just click Finish.

Click for larger image

15.  The connection string for the target data source is automatically added to the Connection string field, avoiding the pitfalls associated with manual entry of these strings within your application or function definition.

Click for larger image

Of course, you don’t find out about manual entry errors during the design phase or the build phase. You figure it out when you go to invoke the function or test your application and the database connection doesn’t work. Here’s another example of how IBM Data Studio will save you a lot of time.

16.  Click Test Connection, and click Next if the connection is successful.

If the connection is successful, you will see the same message shown in Step 13. (I always test connections as I work through this wizard in case I’ve introduced an error, or an option I selected makes the connection invalid.)

17.  The Source Data page opens. Select Access source data using a table and select Customers in the Name drop-down list, as shown below, and click Next:

The tables in the Name drop-down list are from the target Microsoft Access database. You can now test the connection such that the data in the target table is returned to IBM Data Studio when you click Show Sample Content:

This is a great way to ensure that your function will retrieve the data you actually want.

Also, note the Access source data using an SQL query radio button. You select this option if you want to manually write an SQL query to run against the target table. For example, suppose you were only interested in customers that reside in Mexico - you couldn’t apply such a restriction to the Customers data using the Access source data using a table option; however, you can with this option, as shown below:

If you write your own SQL, it’s important that you follow the semantics of the SQL used at the target data server. For example, if you were to restrict a result set using double quotation marks (“”s), as shown in the previous figure, you would see the following error message:

That’s because in DB2 9.5 you define strings using single quotation marks (‘’s). By contrast, if you used single quotation marks to restrict a result set in Microsoft Access, you would see an error message similar to this:

This example presents one of the drawbacks of using OLE DB functions to integrate data and should illustrate why I called it a ‘quick-and-dirty method’. Quite simply, you have to know the nuances of each target data server. Now imagine trying to integrate data from Oracle, SQL Server, Teradata, Excel, Sybase, and more! OLE DB functions just aren’t suitable for large integration projects. A product such as IBM Information Server (or a subset of it such as IBM WebSphere Federation Server) on the other hand, would automatically shield you from such nuances, which is why I recommended its use earlier in this article for large-scale integration projects.

18.  The Data Columns page opens. This page allows you to apply a projection to your result set by adding or removing the columns in the result set. For this example, leave the defaults as is, and click Next.

Note in the previous figure that you can click Show Sample Content to give you once again a sample of the data that will be returned by your function.

19.  The Column Mapping page allows you to specify the mapping of the OLE DB data types to DB2 data types. For our example, simply click Next.

This feature is very useful because it allows you to leverage logic in your DB2 data server for this integration effort. For example, you may have created a user-defined type (UDT) to handle phone numbers.

Note that you can also use the Move Up and Move Down buttons to perform another type of projection on the result set by reordering the columns that appear in it.

If you make any changes at this stage of the OLE DB definition process, you will have the opportunity to view the sample output of the function via the Show Sample Contents button that appears when a change is made.

20.  Click Next to move to the Options page. Here you can specify whether to create a view that uses the OLE DB function to return results to your application, have IBM Data Studio automatically create a DB2 table and use your function to import the data from the target table in a DB2 table, or both.

For our example, select both the Create a corresponding table view and Create a new table and import the UDF data into it check boxes, and use the View name and Comments fields to define and describe these objects as shown below.

You can optionally select Replace Existing if you want IBM Data Studio to automatically drop any corresponding objects that exist with the names specified on this page before creating the new objects. Click Next.

21.  The Summary page opens. This page summarizes all the options that you selected using the wizard. You can also see what IBM Data Studio will create as a result of the options you selected. If you are satisfied with the information here, click Finish and IBM Data Studio will create the object (or objects, as is the case in this example) for you.

You can click Show SQL to display the data definition language (DDL) that IBM Data Studio will use to create the objects defined in this wizard:

22.  In the Data Output window, you can see that IBM Data Studio successfully deployed the function you created along with its associated objects (the defined table and view).

23.  Refresh your database connection object in the Database Explorer view by right-clicking it and selecting Refresh. (If you recall from Part 1, this is a best practice to follow – although not always needed – because it assures that the object cache is up-to-date.)

24.  Expand the database connection object and locate the Tables, User-Defined Functions, and Views folders. You should be able to see all the new objects that were created by IBM Data Studio.

Seeing it all in action...

You can see the OLE_DB_2_MSACCESS_NORTHWIND_CUSTOMERS function within the User-Defined Functions folder. This function was used to populate the MSACCESSCUSTOMERS table and is called each time an SQL statement is run against the VIEW_OVER_MSACCESS_CUSTOMERS view.

For example, you can use IBM Data Studio to look at the data from the VIEW_OVER_MSACCESS_CUSTOMERS table, as shown below:

Click for larger image

Naturally, if you compared this result set to the same query run natively in Microsoft Access, you would end up with the same results:

Click for larger image

Of course, any method you use to access the table or view that you created will return data to your application. For example, in the following figure you can see I called the same view using the DB2 command line processor (CLP):

Click for larger image

Remember that if you deleted the NORTHWIND database, or changed its location, the table would continue to operate just fine (since the data was imported into it by the function you created), but the view would fail:

Notice that the columns are still available? This is because there is a meta-data layer for this function within DB2 9.5. So here is another nuance you should be aware of when using these kinds of functions: if someone moves the data, your objects can break. With IBM Information Server, there are health indicators that are surfaced when a nickname over a remote data server become invalid.

Wrapping it up...

In this article, I took you through the steps involved in creating an OLE DB function to integrate data from a broad spectrum of data sources, in this case, specifically Microsoft Access. You saw that this feature can be handy and that with IBM Data Studio you didn’t have to write a single line of code to make this all happen. In my next article, I’m going to show you how you can use IBM Data Studio to create an SQL statement.

» See All Articles by Columnist Paul C. Zikopoulos


IBM, DB2, and WebSphere are registered trademarks of International Business Machines Corporation in the United States, other countries, or both.

Microsoft and Windows are trademarks of Microsoft Corporation in the United States, other countries, or both.

Other company, product, or service names may be trademarks or service marks of others.

Copyright International Business Machines Corporation, 2007. All rights reserved.


The opinions, solutions, and advice in this article are from the author’s experiences and are not intended to represent official communication from IBM or an endorsement of any products listed within. Neither the author nor IBM is liable for any of the contents in this article. The accuracy of the information in this article is based on the author’s knowledge at the time of writing.

Mobile Site | Full Site