ETL Prototype using Oracle Warehouse Builder - Part 3

Thursday Aug 28th 2003 by DatabaseJournal.com Staff
Share:

Our author shares his views on choosing the Oracle Warehouse Builder as a tool in an Oracle data warehouse life cycle management project, along with quick steps to get started with a simple 'prototype.'

In the previous articles, we gained a basic understanding of Oracle Warehouse builder architecture and became familiar with its various components. My previous articles on Oracle Streams implementation etc., simply served as pointers towards standard implementation scenarios for the developer community. However, here, before we go ahead with building an ETL prototype, I would like to share my views on choosing the Oracle Warehouse Builder as a tool in an Oracle data warehouse life cycle management project and my experience (the "pain" or ease of using it) in building this prototype.

Common concerns of choosing an ETL tool

Very often, we notice, the common concerns that have been raised by project managers and other decision makers in a Data warehouse life cycle project is whether a "tool" is good enough to achieve the desired results in relatively less time than the traditional way of coding and maintaining the data warehouse objects. Also raised are issues such as advantages of warehouse builder over the other innumerable vendor tools that are available in the market today. Most importantly, in choosing a good team, the interviewer's tend to focus on whether the developer/ candidate has a good "knowledge" of the tool itself with very less or no importance to the basic concepts of the designing, developing and maintaining a data warehouse life cycle.

Building a prototype using Oracle Warehouse Builder will address most of the above concerns with the emphasis that one of the requirements would be that the candidate have a clear understanding of building and maintaining a data warehouse. The focus should also be on the developer's faculty in at least one programming language such as UNIX shell script and PL/SQL, Java, (for an Oracle based data warehouse) C etc. more than the tool itself.

As we already know, Oracle Warehouse Builder (OWB - as the name suggests) is a tool that can be effectively used in a complete Oracle based data warehouse life cycle development and maintenance environment and can very well integrate with other Oracle tools such as Oracle Portal, Oracle Discoverer and Oracle Workflow, for implementing or achieving pertinent business functionality.

Among the many phases, ETL is one of the most important phases and most often, an "ETL" tool is evaluated and chosen to implement this phase of the data warehouse life cycle. Again, we can have disparate source for loading data into a target warehouse schema and thus the ETL processes can range from being very basic in nature to very complex designs. In addition, loading data into the target warehouse schema can involve varied classes of transformations, the most commonly used being pre-written and provided as "transformation" libraries by vendors and others that require custom programming are developed by the warehouse developers.

Choosing Oracle Warehouse Builder

Some of the pros and cons of using the Oracle Warehouse Builder over traditional data warehouse development and management are (extensible to other tools too):

Pros

1.       All the stages and processes can be designed, planned and implemented systematically and actual "coding" (for most part it may not involve coding at all) can be come later during implementation. The graphic editors will give a complete picture of the how the components (source modules, mappings, transformations, targets, etc.) are involved/interact in the system and the associated process flows.

2.       Component management. - It is very important to keep track of all the components that go into the data warehouse--the source modules, objects, the mapping objects, the various transformations, the targets, processes and the real time status.

3.       Change management - A single change in the requirement can be implemented with relative ease as compared to implementing in a "home-grown" approach. Such changes can also be efficiently tracked (audit) and recorded for later analysis.

4.       Most transformations are readily available and do not have to be written from scratch, thereby saving a lot of valuable time and resources. This time and resources can be devoted to other tasks that demand attention such as performance management etc.

5.       Visibility - Visibility is very important in the transformation process as to what the processes are achieving.

6.       Standards - Unlike the "home grown" approach, design, development and implementation standards can be effectively enforced.

7.       Deliverables - A data warehouse project, is never a "do-doing-done" scenario, but among other things involves a continuous changing scenario (during the development phase and in most cases in the "maintenance" phases too), data and performance management (load tuning etc.) and interfacing with other systems. However, most often the development team has to produce quick deliverables and maintain the least possible time window in change management or implementations. In addition, Oracle Warehouse Builder does effectively facilitate such tasks and considerably reduces the deliverables time window.

Cons

1.       The initial stage in installing and configuring Oracle Warehouse Builder can be a little cumbersome due to the various versions, compatibility, patches and "bug-fixes" but this is offset by the many other advantages listed above.

2.       One of the biggest advantages of using homegrown over vendor provided ETL tools is that as a root level developer, it has been very easy to get-in and "tweak-and fix" any code as per the business requirement. However working with any ETL tool requires one to turn to the product support team for "fixes" in most cases (well, developers do get work arounds but the fact remains). Thus, there is a trade-off over such flexibility. However, one thing to note about Oracle Warehouse Builder is, since all of its transformations are written in PL/SQL, it is easy to "customize" the code to suit our requirements and even get-into the transformation and debugging if required. In addition, Oracle now provides the Java API /SDK to programmatically manipulate the metadata such as performing batch operations on the meta-data without having to use their Java client tools.

In a homegrown environment, it is only the development code/objects and plain old editors we tend to use (usually the preferred choice) and no Java clients etc., which require additional resources. Again, this is relatively trivial, and offset by the many advantages presented by Oracle Warehouse Builder and due to the availability of resources. However, in some cases it pays to use custom code rather than use any tool fabricated code due to performance and managibility reasons. Feel free to share post suggestions or comments or share your experiences in the Oracle Warehouse Builder group.

Building the prototype -- Initial steps

In this article, we will go through the some of the steps involved in building a prototype for the Extraction, Transformation and Loading (ETL) stage of the Data warehouse lifecycle.

The initial intent was to develop a "complete" ETL process design that including multiple type transformations, disparate source and target modules and a well-defined process flow scheduled for execution by the Oracle workflow server /OMS.

However, due to the scope of the article and because of the many emails I receive with questions on "how a specific task can be achieved", the focus eventually turned to "using a step-by-step" approach scenario on a per article basis, rather than taking a "complete" scenario. In addition, to cover all possible combinations and cases is beyond the scope of a single article.

Considering that Oracle rolls out umpteen versions, each requiring a "patch" for some reason (like Microsoft having their bulky product updates every now and then), you are bound to run into a set of problems depending upon the version you have (database, warehouse builder, etc.) and the feature you are trying to use (Like I did too!).

As you probably will experience, installation is one of the relatively "painful" tasks. However, following the steps below will help you avoid some common pitfalls in installation.

This prototype was done on a machine with following configuration. For the specific system requirements please use Oracle Warehouse Builder Installation and configuration guide.

Hardware:

1.       RAM 512 MB

2.       HDD 120GB

3.       CPU 667 MHZ (pretty old one in the days of GHz!)

The database was created with the Oracle provided "data warehouse" template. In this case the data warehouse is created with a DB_BLOCK_SIZE =8k and enqueque_resource parameter= 892.

Oracle recommends a DB_BLOCK_SIZE=16k and enqueque_resource parameter= 3000 for Oracle Warehouse Builder installations.

At the time of this prototype, the defaults were used.

Software versions:

1.       Oracle 9.2.0.1.0.

2.       OWB 9.2

Following are the guidelines only, not "detailed " steps for installation as these can be obtained using the Oracle Server installation manual and Oracle Warehouse Builder installation manual.

Installation

1.       Install the database server.

2.       Choose the Data warehouse template and use the default settings.

3.       Install Oracle Warehouse Builder version 9.2 (I chose to install as a separate component and not a bundled component that is available with 9iDS suite).

4.       Use a separate ORACLE_HOME for OWB.

5.       Technically, you should install the Oracle warehouse repository (design time and run time, both) in a separate database, but for this prototype, I have chosen to use the same data warehouse that was created in step 1 above.

6.       Installation and implementation of a complete business solution using the Oracle Management Server, the workflow server and the Oracle9i Application Server are not within the scope of this article. Oracle Warehouse Builder can be configured to work with these components, thereby delivering a total effective, efficient, highly scalable, and manageable business intelligence solution.

7.       Create the Oracle Warehouse Builder repository using the Repository Assistant. As shown below

8.       Follow the Repository Assistant wizard prompts (see fig. Below)

9.       Using this repository owner, you will later access the Oracle Warehouse Builder client through which will you will be able to design, develop and implement the complete data warehouse. (Make sure you note down all of the user names and passwords; it can be confusing between design time and runtime repository users for first time users)

10.   The next step is to create the Runtime repository where your runtime objects will be stored/deployed. (See figure below)

11.   Follow the runtime repository assistant wizard prompts .After providing the SYS user name and password and database connection details, you will see the window as shown:

12.   Now, the first step is to create the Runtime repository and the runtime repository owner (the wizard creates it for you, simply follow the prompts). The wizard also prompts to create a Runtime Access user. Oracle recommends accessing the runtime repository by logging in to the runtime repository as this user.

13.   After creating the runtime repository, the repository owner and the runtime access user, the wizard prompts to create the Target schema. This is the schema where the warehouse objects you have designed will be deployed and the processes will be executed. It is very important to follow the sequence because the wizard will grant security privileges to each of the users. The required privileges have to be rechecked again, if you get any "system privilege" messages during Deployment. The privileges for each user are documented under the Oracle Warehouse Builder release manual.

14.   Now you are all set to start building a prototype.

15.   Go to the start menu / Oracle Warehouse Builder / OWB client tool. (See figure below)

16.   Login to the OWB client using the Design repository owner user name that you specified in Step 8. (See below):

17.   Make sure you provide the server connection details by clicking on the "Connection Info." button (see below).

18.   When you login you will see a default project created for you in the repository as shown below:

19.   Click on the "+" sign to expand the node. (See below). Under the PROJECT node, you will see all the components discussed in the previous article.

20.   Under DATABASE, you can define the source/target modules. The source can be either Oracle or any other system including APPLICATIONS like SAP or Flat FILES. However, the target can only be an Oracle schema. You will also see the PUBLIC TRANSFORMATIONS that include CUSTOM or PRE-DEFINED transformations. Designed objects can then be deployed by using Runtime Repository that can be configured. You can define the process flows by choosing the PROCESS FLOW node.

21.   Some of the PRE-DEFINED transformations that can be used are for e.g.: the WB_LOOKUP_NUM (See below for description)

22.   You can also access Oracle tools such as workflow server, OMS etc through OWB client as follows:

23.   You can important /export Repository meta data to other tools such as Oracle Discoverer, Oracle Designer using Files/ Bridges as shown below:

24.   From the above figure, you will observe we can create a NEW PROJECT as well.

Building the prototype - Continued

The above were quick steps to get started with a simple "prototype." In my next article, that will follow shortly , we will see how to define three types of sources, namely, flat file, external table and a Oracle schema source table, target module, register locations, Map flat file definitions to target objects, use mapping editor, object editors and deploy these objects and initiate a data load.

OWB can be used to design dimensions, facts, and staging tables, ETL processes and generate mappings using complex transformations such as ADDRESS cleansing, table functions, look up transformations etc. in a relatively easy way and also handle SCDs. As already mentioned in previous articles, OWB can also handle multiple disparate sources and deploy objects to multiple target environments.

In addition, OWB can be configured with OMS or Workflow Server for automatic execution of ETL processes and import/export meta-data repository. Finally, OWB can be used in a multi-user environment seamlessly.

In subsequent articles, we will cover each of the above-mentioned points in relatively more details, followed by a complete "real-life" case scenario.

Useful Links

The Oracle Warehouse Builder group to discuss and share your views, opinions and experiences and ask any questions. Oracle Warehouse Builder Group

The Oracle Streams group to discuss and share your views, opinions and experiences and ask any questions. Oracle Streams Group

» See All Articles by Columnist Nandeep Nagarkar

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved