The development cycle of a typical data warehouse or data mart includes various stages such as data requirement analysis, modeling, extraction, transformation, loading, report development and testing.
This cycle typically involves a significant number of challenges such as extraction from different data sources, missing data, etc., which results in significant loss of time, developer productivity and other resources. These challenges are well documented in popular books, article and on websites such as rkimball.com and billinmon.com.
In addition to these challenges, the development team is left with having to make decisions regarding the purchasing of a variety of tools such as query tools, distribution tools, process automation and scheduling tools, ETL tools, etc. from the ever-increasing pool of softaware vendors.
Prototyping is one of the best-suggested approaches in mitigating the adverse effects of such unforeseen problems and in facilitating decision-making.
In this series of articles we shall restrict our focus to prototyping the ETL stage of the DW life cycle since data warehouse legend states that on average, 80% of the efforts of building a data warehouse go into the ETL related tasks.
In an Oracle DW environment, the ETL stage can be implemented using the "home grown" approach (PL/SQL, UNIX, etc.), by third party ETL tools (Informatica, Ab Initio, etc.) or by using Oracle Warehouse Builder (OWB).
In the series of articles that follow, we will discuss the advantages of using OWB over the home grown approach and third party ETL tools. We will develop a prototype ETL process using OWB and discuss the many features of OWB that have been added over different versions of OWB.
Oracle launched the first production version of OWB 188.8.131.52 sometime during the year 2000, followed by OWB 2.1.1 and OWB 3i versions in year 2001 and OWB 9i releases in Year 2002/2003. In addition, Oracle has added many new features to OWB over the versions listed above and is continuously adding more and more enhancements in line with industry requirements. As such, these enhancements make OWB an appropriate ETL tool in an Oracle data warehouse environment. As we go forward with developing the prototype, we will explore how OWB provides leverage in a typical ETL process and how it uses the database Oracle9i ETL toolkit functionality such as External tables, MERGE statement, etc.
In a typical Oracle DW environment, it is not uncommon to have all the processes developed using PL/SQL and UNIX and managed by the DBA's and developers. To keep this discussion brief, it can be stated in a few words that possibly the most important advantage in using an ETL tool is the automation and management of the complex ETL process(es), especially in a large environment which leads to faster development cycles and the ability to produce prototypes in the least possible time.
Some of the factors that possibly favor using OWB in an Oracle DW environment are:
Most of the development team is familiar with PL/SQL (or Java) and UNIX (this gives the ability for developers to create user defined transformations in PL/SQL or Java, in addition to the more than 150 preset transformations provided by OWB)
OWB's tight integration with Oracle 9i server and its ability to optimally utilize the ETL toolkit functionality that Oracle 9i provides like "Table functions", "external tables", "Multi-table inserts", Partition Exchange loading etc.
Seamless integration with other Oracle BI tools such as Discoverer, Oracle Designer, Oracle Work Flow, and OEM.
And ofcourse, its ability to extract data from disparate source systems such as SAP R/3, DB2 etc. using integrators, transparent gateways etc.
Ability to generate code (PL/SQL, ABAP).
For delving more into OWB features go to Oracle OWB data sheet.
In the case of choosing OWB over other ETL tools such as Informatica, the primary advantage, in addition to the above features, is the option to use the powerful features of Oracle Application Server (OracleiAS) for publishing web based meta data reports, integrating with Oracle Work flow Server, etc.
On the other hand, other factors might possibly influnce the choice of not using OWB. However, this usually happens in the case of a non-Oracle target environment, or when multiple non-Oracle source systems and complex integration and transformations are involved.
In most cases though, if the development team is comprised of Oracle skills, OWB would be the most appropriate ETL tool.
Some of the key OWB components are:
OWB repository that stores Meta definitions used in the DW and the transformation library, which contains more than 150, preset transformations.
OWB client that is used to design the DW and the ETL processes. This component also includes bridges, integrators, and editors and acts as a point of access to other OWB components.
OWB design browser client to view metadata, run web reports, perform lineage, and impact analysis on your metadata.
OWB runtime audit browser client where you run reports on the audit and error information captured when you run jobs to load and refresh your ETL processes.
OWB runtime assistant that helps you to set up a runtime repository and target schema and maintain the deployment and runtime audit/error information.
Some of the other components that will be discussed in later articles along with the above are the repository and browser assistants, file upgrade utility, etc.
In my forth-coming articles, we will briefly discuss the OWB architecture, followed by a prototype developed using OWB, some of the most commonly applicable transformations and the various features of OWB that can be utilized in a typical ETL scenario.