Synopsis. Oracle 10g extends many of the already robust features of Structured Query Language (SQL). This article the first of a series illustrates some of the more intriguing new features of SQL within Oracle, including upgrades to the MERGE command, enhancements to hierarchical query capabilities, and improvements to query and access methods for nested tables.
As a DBA and as an applications developer, I have worked with DB2, Teradata, Informix, Sybase, and SQLServer databases. I have to admit that Oracle's extensions to standard ANSI Structured Query Language (SQL) have always impressed me as some of the most powerful query and data maintenance tools in the industry. Since I spend a lot of time writing, reviewing, and testing SQL statements for efficiency and elegance, I always get excited when Oracle makes that part of my job easier.
Oracle has made some impressive improvements to this already-robust suite of extensions in Oracle 10g, and I will cover some of the more significant ones in this next series of articles, starting with one of my personal favorites, the MERGE command.
Data Traffic, MERGE Right: MERGE Command Syntax Upgrades
I was thrilled when Oracle 9i introduced the MERGE statement because it acknowledged that there is a constant need to either include new data or update existing information from a common data source, but on a conditional basis. As a recycled mainframe programmer whose first programming experience was with a now almost-defunct 4GL language called FOCUS, I still remember its most powerful feature for updating databases: the MODIFY command, which offered the ability to either create a new entry or update an existing entry in a database table based on a simple set of conditional tests against matching transaction and database values.
The Oracle 9i version of MERGE offered the ability to either insert new rows or update existing values in a table based on matching conditions. Oracle 10g has enhanced the MERGE statement even further: it now allows execution of a combination of INSERT, UPDATE, and DELETE DML commands within one PL/SQL block. This makes for extremely versatile code structures. I will illustrate this new versatility using the following scenario:
The investment management firm that manages your client's retirement benefits program has provided an Excel spreadsheet that lists all employees that have recently either:
- joined the 401(k) retirement benefits program
- made a contribution to their 401(k) accounts since the last file was received
- terminated their involvement in their 401(k) plans
The Human Resources department has asked this information be applied immediately to the appropriate database tables so that they can determine if there is a downturn in employee participation in the 401(k) retirement program.
To handle this request, I will convert the incoming Excel spreadsheet to a comma-separated values (CSV) file and then create an external table to read the CSV file. Using the enhanced MERGE command, I will then apply those transaction data against the table that contains the status of every employee's participation in the company's 401(k) program.
Listing 1.1 shows the transaction table, the external table definition, and the MERGE statement that illustrate this solution.
CONNECTing With Your Inner Hierarchy: Hierarchical Query Enhancements
Oracle already provides elegant features for traversing and reporting on the contents of hierarchical data structures. Some of the more common hierarchical structures that I have encountered and needed to parse include:
- Corporate organization structures, especially the relationships of employees to their supervisors
- Part-subpart relationships, e.g., components made up of subcomponents that are themselves made up of subcomponents
- Geopolitical boundaries that describe relationships between countries, states, counties, townships, and cities
- Taxing structures that are usually intimately connected to geopolitical boundaries
Thankfully, Oracle offers the CONNECT BY operator for connecting together elements of a hierarchy in a query. I will utilize a common example with which we are all familiar -- a company's organization chart -- to illustrate:
-- A traditional hierarchical query (before Oracle 10g)
COL mgr_id FORMAT 99999 HEADING 'MgrID'
COL mgr_name FORMAT A24 HEADING 'Manager Name'
COL level FORMAT 999 HEADING 'Lvl'
COL emp_id FORMAT 99999 HEADING 'EmpID'
COL emp_name FORMAT A24 HEADING 'Employee Name'
,(M.last_name || ', ' || M.first_name) mgr_name
,E.last_name || ', ' || E.first_name emp_name
WHERE E.manager_id = M.employee_id
START WITH E.manager_id = 101
CONNECT BY PRIOR E.employee_id = E.manager_id
ORDER BY SIBLINGS e.manager_id, e.employee_id;
MgrID Manager Name Lvl EmpID Employee Name
------ ------------------------ ---- ------ ------------------------
101 Kochhar, Neena 1 108 Greenberg, Nancy
101 Kochhar, Neena 1 200 Whalen, Jennifer
101 Kochhar, Neena 1 203 Mavris, Susan
101 Kochhar, Neena 1 204 Baer, Hermann
101 Kochhar, Neena 1 205 Higgins, Shelley
108 Greenberg, Nancy 2 109 Faviet, Daniel
108 Greenberg, Nancy 2 110 Chen, John
108 Greenberg, Nancy 2 111 Sciarra, Ismael
108 Greenberg, Nancy 2 112 Urman, Jose Manuel
108 Greenberg, Nancy 2 113 Popp, Luis
205 Higgins, Shelley 2 206 Gietz, William
11 rows selected.
Note that I used the PRIOR operator of the CONNECT BY operator to tell Oracle to link up each employee to his or her manager based on the value contained in each row's MANAGER_ID column. In this case, it interrogates the expression that follows immediately after the PRIOR operator for the parent row of the current row. I also used the optional START WITH operator to direct Oracle to start querying the hierarchy at employee #101 (Neena Kochar) and to show only those employees that report up to that employee. Finally, I employed the LEVEL pseudo-column to show which level in the hierarchy each employee belongs to.
As powerful as these methods are, Oracle 10g has expanded upon them considerably:
Traversing To the Top Node in a Hierarchy. Oracle 10g's new CONNECT_BY_ROOT unary operator makes it easy to gather information directly from the "root" or top node in a hierarchy for any child row. In addition, another new function, SYS_CONNECT_BY_ROOT, traverses the hierarchy from the child up to and including the root node in the hierarchy. This function will then return a list of values separated by the character string chosen as a delimiter.
Listing 1.2 illustrates two queries: one whose root node starts at the top of the hierarchy, and one whose root node starts at a lower point in the hierarchy based on the value supplied to the START WITH operator.
Establishing Child vs. Parent Status. Oracle 10g also provides a new pseudo-column, CONNECT_BY_ISLEAF, that establishes whether a returned row itself has additional children in the hierarchy, or is at the top of the hierarchy. If a row has additional children, it will return a value of zero (0) for this pseudo-column; otherwise, if the row has no further descendant nodes, it will return a value of one (1). Listing 1.3 shows how to utilize this new pseudo-column to find all rows that have no children.
CONNECT_BY_ISCYCLE: Handling Recursion Within Hierarchies. Most hierarchies have a simple tree-like node structure that can be traversed from the "root" entry or entries down through the parent to child relationships. However, in some cases, it is possible that a hierarchical relationship may "loop back" upon itself. This may occur as the result of a data entry error, or it may even be intentional.
To illustrate, I will update the MANAGER_ID column value for a few employees so that a recursion is introduced. Note that I am making Employee #902 the parent of Employee #901, and further disrupting the hierarchy by making Employee #903 the parent of Employee #902:
SET manager_id = 902
WHERE employee_id = 901;
SET manager_id = 903
WHERE employee_id = 902;
Now when I attempt to run a query against the EMPLOYEES table, Oracle will detect the "loop-back" condition and return an error:
,(last_name || ', ' || first_name)
,CONNECT_BY_ROOT last_name "Manager"
CONNECT BY PRIOR employee_id = manager_id
ORDER BY SIBLINGS employee_id;
ORA-01436: CONNECT BY loop in user data
To detect this recursion, I can use the new CONNECT_BY_ISCYCLE pseudo-column. This introduces a powerful, elegant method for detecting unwanted recursion, and would be especially useful as part of an AFTER statement trigger for data validation of hierarchies. Note that CONNECT_BY_ISCYCLE can only be queried when the new NOCYCLE directive of the CONNECT BY operator is employed. When a loop-back condition is detected, CONNECT_BY_ISCYCLE will return a value of one (1); otherwise, it will return a zero (0). The query in Listing 1.4 returns the entries that are participating in the recursion.
New Nested Table Functions: COLLECT, CARDINALITY, and SET
Over the past year, I have spent a lot of my spare time investigating, experimenting with, and transforming existing PL/SQL code to use Oracle's implementation of PL/SQL collections. I have found that they are an excellent alternative to global temporary tables for temporarily storing and manipulating moderate amounts of data during complex processing (please see my prior article on creating table functions that return PL/SQL collections.)
Encouraged by my progress with collections, I have now progressed to some initial implementations of nested tables. I have found them extremely useful for storing multiple entries of complex data in a single column in a table. The good news is that Oracle 10g has expanded support for powerful capabilities by adding some new functions for querying, manipulating, converting, and translating nested tables:
COLLECT. This new function is designed to transform values from any column selected from a table in a SELECT statement to a collection in the form of a nested table. COLLECT must be used within a call to the CAST function; it accepts a column from any type of table as input, and then returns a nested table of the input type from the rows selected. If the COLLECT function is applied against a column that is itself a collection, then it will return a nested table of collections.
CARDINALITY. On the other hand, the CARDINALITY function simply returns the number of elements present in a table's nested table column. This is extremely useful when determining the amount of data stored in a nested table column, or for controlling a FOR loop. If the nested table column is empty, the CARDINALITY function simply returns a NULL value.
SET. Unfortunately, there is no DISTINCT operator that can be applied against nested table values to compress them down to only unique entries. However, the new SET function accepts the value(s) in an existing nested table as input and returns a set of non-duplicate values with the same TYPE as the nested table itself.
I will illustrate these three handy and powerful new features with a simple but realistic business scenario: the need to capture and query telephone numbers for every employee in a company. I will first create a traditional table named HR.TELEPHONE_NBRS to store this information, and then I will create a simple TYPE and a second table named HR.LOCATION_COMM_IDS to store each location's telephone numbers in a nested table column. Listing 1.5 shows the statements to create these new tables, create the new TYPE, and then load the tables with sample data.
I will then query the traditional table's column with COLLECT to return a collection, apply the CARDINALITY function to count the number of entries for each of the nested table columns in the second table, and engage the SET function to return only the unique phone number values stored in the nested table column. See Listing 1.6 for the SQL queries and sample output.
Oracle 10g has significantly improved hierarchical queries, expanded the utility of the MERGE statement for data maintenance, and upgraded the capabilities for querying nested tables. These new features and enhancements to Structured Query Language make an already-robust database query and data manipulation engine even more powerful and make a positive impact on any Oracle DBA's most mundane tasks.
The next article in this series will concentrate on the enhanced analytical reporting capabilities that Oracle 10g provides via its new partitioned outer join and data densification features.
References and Additional Reading
While there is no substitute for direct experience, reading the manual is not a bad idea, either. I have drawn upon the following Oracle 10g documentation for the deeper technical details of this article:
B10750-01 Oracle Database New Features Guide
B10759-01 SQL Reference
» See All Articles by Columnist Jim Czuprynski