*
This article covers multipath operations that require a different hierarchical logic than SQL naturally produced hierarchical processing. These include nonlinear data ordering, the renormalization of data to remove replicated data, and the processing of network structures.
*

This series of articles on ANSI SQL Hierarchical Processing Capabilities has so far demonstrated the many inherent full hierarchical processing capabilities that exist naturally in SQL. These have included: multipath hierarchical processing; hierarchical data mashups; hierarchical data filtering; and hierarchical data transformations. This article will cover those multipath operations that require a different hierarchical logic than SQL naturally produced hierarchical processing. These include nonlinear data ordering, the renormalization of data to remove replicated data, and the processing of network structures.

## Hierarchical Data Relationship Review

A review of hierarchical data modeling relating relational and hierarchical data structures is presented first to establish some background for what is to follow. Figure 1 below demonstrates a common natural 1-to-M (one-to-many) data relationship. These are the most common relationships and even have more significance and usefulness for hierarchical structures. These relationships are shown in both hierarchical format and relational rowset format. Notice that the hierarchical format has not replicated data while the flat relational rowset has data replicated because of the linear Cartesian product applied. In this case, the relational replicated data is produced from the vertical parent-child effect.

Figure 1: 1-to-M example

Figure 2 below demonstrates the less common M-to-1 data relationship using the same data as in Figure 1. Notice in this example compared to the previous 1-to-M example that the now inverted hierarchical structure has produced Dept replicated data in the hierarchical data structure. The ordering of the root data has also been performed for convenience; this may or may not be desired. It carries no fixed semantic significance. Also notice that the relational rowset is basically the same as the one in the previous example except that data order has been reversed just to reflect the inversion.

Figure 2: M-to-1 inversion example

Figure 3 below demonstrates a nonlinear multipath 1-to-M hierarchical structure relationship. Notice in this multipath hierarchical structure that the additional pathway of Proj which added a third column to the relational structure produced more total data replications than expected from its parent-child linear data replication addition. What has also been naturally included in this result is the Cartesian product producing all the multipath hierarchical unique data combinations occurring horizontally across the related pathways when represented in the relational structure. These are the Emp/Proj combinations related by Dept01 data occurrence and the Emp/Proj combinations related by the Dept02 data occurrence. Also, notice in Figure 3 that the multipath hierarchical data structure still did not produce any data replications for its standard 1-to-M multipath hierarchical structure.

Figure 3: 1-to-M multipath example

## Ordering Out of Hierarchical Order Causes Problems

Ordering flat data structures like relational data do not present many rules and is straightforward. This is not the same for more complex data such as hierarchical data. The problem of ordering out of hierarchical order can be seen below in Figure 4. First, it can be seen that the data replications for Dept have been introduced inadvertently, which can produce incorrect results. Second, by ordering Emp before Dept naturally gives it a higher significance level similar to the Dept/Emp inversion example in Figure 2, which further supports the inadvertent inverted results in Figure 4. Ordering should not affect the structure of hierarchical structures.

Figure 4: Ordering out of hierarchical order problem

## Hierarchical Structure-Aware Processing Importance

It was shown in previous articles in this series that the ANSI SQL LEFT Outer Join syntax can model multipath hierarchical structures, and their associated semantics can support multipath hierarchical processing when processed by SQL. Figure 6 has an example of this ANSI SQL LEFT Outer Join data modeling syntax. The newer multiple use SQL ON clause used at each join point replaced the single use WHERE clause for specifying join criteria. This makes it more flexible and accurate eliminating ambiguity problems. The SQL-92 LEFT Outer Join introduced the one sided join which preserves the left side of the join when there is no data match. This LEFT Outer Join syntax supports the basic hierarchical parent over child relationship, which models hierarchical structures.

Reversing the process of LEFT Join data modeling process produces the ability to automatically extract the active hierarchical data structure from the LEFT Outer Join string that models the hierarchical structure being processed. This structure-aware processing is one of the most important transparent capabilities necessary to automatically control many of the automatic features covered previously. These include hierarchical optimization, hierarchical output structure format, and SQL-to-hierarchical operation mapping. Also included are capabilities that modify the structure: structure joining; structure data modeling; and structure transformations. These structure changes are also detected. The multipath features described in the following sections relies heavily on structure-aware processing to support new nonlinear multipath features.

## Nonlinear Hierarchical Ordering Follows Multipath Structure

The ORDER-BY is one of those SQL operations that take on special meaning when applied to multipath hierarchical structures and operations. These operations are outside of SQL's natural hierarchical processing. The value-added hierarchical ordering solves this problem nonprocedurally retaining the ANSI SQL syntax. As was shown above in Figure 4, ordering out of the hierarchical structure presents problems when processing hierarchical structures. Flat structures do not have these concerns since there is no structure and can be ordered in any way. On the other hand, nonlinear data ordering in multipath hierarchical structures introduces a new powerful capability shown in Figure 5 below.

Figure 5: Nonlinear hierarchical ordering flow