Is UNION The Best Choice When Joining Result Sets?

There are occasions when two or more data sets need to be combined to produce the desired results, and a join is not possible (no common columns, no common data values to join on). Conventional wisdom would point to using UNION to generate such sets, and in many cases, this is a suitable solution. If, however, these unions are the ‘table’ for another query, one intending to produce distinct results, simply diving in and using UNION may not be the best direction to take.

The following example has been specifically written to highlight an issue when using UNION. It is similar to code I was asked to tune (although the actual code was not intentionally written to generate SORT UNIQUE operations as this was) and illustrates the issue that was slowing down execution. Let’s begin by creating three simple two-column tables:


BLEEPLE @ smang > create table tab1(
  2  snord   number,
  3  queebo  varchar2(40));

Table created.

BLEEPLE @ smang > 
BLEEPLE @ smang > create table tab2(
  2  plang   number,
  3  horbst  varchar2(40));

Table created.

BLEEPLE @ smang > 
BLEEPLE @ smang > create table tab3(
  2  umo     number,
  3  erst    varchar2(40));

Table created.

BLEEPLE @ smang > 

To speed up the resulting queries function-based indexes will be created, and the tables will be populated. Finally, statistics will be gathered on each table and index:


BLEEPLE @ smang > create index tab1_fbi on tab1(mod(snord,970));

Index created.

BLEEPLE @ smang > create index tab2_fbi on tab2(mod(plang,970));

Index created.

BLEEPLE @ smang > create index tab3_fbi on tab3(mod(umo,970));

Index created.

BLEEPLE @ smang > 
BLEEPLE @ smang > begin
  2  	     for i in 1..1000000 loop
  3  		     insert into tab1
  4  		     values(i, 'Eebeeneebee'||i);
  5  		     if mod(i,970) = 0 then
  6  			     insert into tab2
  7  			     values(i, 'Ooboonooboo'||i);
  8  			     insert into tab3
  9  			     values(i, 'Uubuunuubuu'||i);
 10  		     else
 11  			     insert into tab2
 12  			     values(i, 'Eebeeneebee'||i);
 13  			     insert into tab3
 14  			     values(i, 'Eebeeneebee'||i);
 15  		     end if;
 16  	     end loop;
 17  
 18  	     commit;
 19  end;
 20  /

PL/SQL procedure successfully completed.

BLEEPLE @ smang > 
BLEEPLE @ smang > exec dbms_stats.gather_table_stats(user, 'TAB1', cascade=>true);

PL/SQL procedure successfully completed.

BLEEPLE @ smang > exec dbms_stats.gather_table_stats(user, 'TAB2', cascade=>true);

PL/SQL procedure successfully completed.

BLEEPLE @ smang > exec dbms_stats.gather_table_stats(user, 'TAB3', cascade=>true);

PL/SQL procedure successfully completed.

BLEEPLE @ smang > 

So far, so good. Now, a result set will be generated using UNION. You can see how Oracle processes the request:


BLEEPLE @ smang > set autotrace on
BLEEPLE @ smang > 
BLEEPLE @ smang > with u1 as(
  2  select snord, queebo From tab1
  3   where mod(snord,970) = 0
  4  union
  5  select plang, horbst From tab2
  6   where mod(plang,970) = 0
  7  ),
  8  u2 as (
  9  select plang, horbst From tab2
 10   where mod(plang,970) = 0
 11  union
 12  select umo, erst  from tab3
 13  where mod(umo,970) = 0
 14  )
 15  select distinct t.* from
 16  (select * from u1 union select * from u2) t;

     SNORD QUEEBO                                                                                                                           
---------- ----------------------------------------                                                                                         
       970 Eebeeneebee970                                                                                                                   
       970 Ooboonooboo970                                                                                                                   
       970 Uubuunuubuu970                                                                                                                   
      1940 Eebeeneebee1940                                                                                                                  
      1940 Ooboonooboo1940                                                                                                                  
      1940 Uubuunuubuu1940                                                                                                                  
      2910 Eebeeneebee2910                                                                                                                  
      2910 Ooboonooboo2910                                                                                                                  
      2910 Uubuunuubuu2910                                                                                                                  
      3880 Eebeeneebee3880                                                                                                                  
      3880 Ooboonooboo3880                                                                                                                  
...
    996190 Uubuunuubuu996190                                                                                                                
    997160 Eebeeneebee997160                                                                                                                
    997160 Ooboonooboo997160                                                                                                                
    997160 Uubuunuubuu997160                                                                                                                
    998130 Eebeeneebee998130                                                                                                                
    998130 Ooboonooboo998130                                                                                                                
    998130 Uubuunuubuu998130                                                                                                                
    999100 Eebeeneebee999100                                                                                                                
    999100 Ooboonooboo999100                                                                                                                
    999100 Uubuunuubuu999100                                                                                                                

3090 rows selected.


Execution Plan
----------------------------------------------------------                                                                                  
Plan hash value: 2741225135                                                                                                                 
                                                                                                                                            
------------------------------------------------------------------------------------------------------                                      
| Id  | Operation                                 | Name     | Rows  | Bytes | Cost (%CPU)| Time     |                                      
------------------------------------------------------------------------------------------------------                                      
|   0 | SELECT STATEMENT                          |          |  4109 |   140K|  4155   (1)| 00:00:01 |                                      
|   1 |  VIEW                                     |          |  4109 |   140K|  4155   (1)| 00:00:01 |                                      
|   2 |   SORT UNIQUE                             |          |  4109 |   140K|  4155   (1)| 00:00:01 |                                      
|   3 |    UNION-ALL                              |          |       |       |            |          |                                      
|   4 |     VIEW                                  |          |  2048 | 71680 |  2076   (1)| 00:00:01 |                                      
|   5 |      SORT UNIQUE                          |          |  2048 | 55296 |  2076   (1)| 00:00:01 |                                      
|   6 |       UNION-ALL                           |          |       |       |            |          |                                      
|   7 |        TABLE ACCESS BY INDEX ROWID BATCHED| TAB1     |  1018 | 27486 |  1037   (0)| 00:00:01 |                                      
|*  8 |         INDEX RANGE SCAN                  | TAB1_FBI |  1031 |       |     6   (0)| 00:00:01 |                                      
|   9 |        TABLE ACCESS BY INDEX ROWID BATCHED| TAB2     |  1030 | 27810 |  1037   (0)| 00:00:01 |                                      
|* 10 |         INDEX RANGE SCAN                  | TAB2_FBI |  1031 |       |     6   (0)| 00:00:01 |                                      
|  11 |     VIEW                                  |          |  2061 | 72135 |  2076   (1)| 00:00:01 |                                      
|  12 |      SORT UNIQUE                          |          |  2061 | 55647 |  2076   (1)| 00:00:01 |                                      
|  13 |       UNION-ALL                           |          |       |       |            |          |                                      
|  14 |        TABLE ACCESS BY INDEX ROWID BATCHED| TAB2     |  1030 | 27810 |  1037   (0)| 00:00:01 |                                      
|* 15 |         INDEX RANGE SCAN                  | TAB2_FBI |  1031 |       |     6   (0)| 00:00:01 |                                      
|  16 |        TABLE ACCESS BY INDEX ROWID BATCHED| TAB3     |  1031 | 27837 |  1037   (0)| 00:00:01 |                                      
|* 17 |         INDEX RANGE SCAN                  | TAB3_FBI |  1031 |       |     6   (0)| 00:00:01 |                                      
------------------------------------------------------------------------------------------------------                                      
                                                                                                                                            
Predicate Information (identified by operation id):                                                                                         
---------------------------------------------------                                                                                         
                                                                                                                                            
   8 - access(MOD("SNORD",970)=0)                                                                                                           
  10 - access(MOD("PLANG",970)=0)                                                                                                           
  15 - access(MOD("PLANG",970)=0)                                                                                                           
  17 - access(MOD("UMO",970)=0)                                                                                                             


Statistics
----------------------------------------------------------                                                                                  
          1  recursive calls                                                                                                                
          0  db block gets                                                                                                                  
       7230  consistent gets                                                                                                                
       3055  physical reads                                                                                                                 
     247332  redo size                                                                                                                      
     115568  bytes sent via SQL*Net to client                                                                                               
       2807  bytes received via SQL*Net from client                                                                                         
        207  SQL*Net roundtrips to/from client                                                                                              
          3  sorts (memory)                                                                                                                 
          0  sorts (disk)                                                                                                                   
       3090  rows processed                                                                                                                 

BLEEPLE @ smang > 

Notice lines 2, 5 and 12 of the execution plan; All three are SORT UNIQUE, one for each of the interim result sets and the last for the combined set of data. Only one SORT UNIQUE was expected when this code was written. Implicit sorts from the UNION operators added the other two to the plan. Let’s modify this code a bit and use UNION ALL in place of the UNION statements originally used:


BLEEPLE @ smang > 
BLEEPLE @ smang > with u1 as(
  2  select snord, queebo From tab1
  3   where mod(snord,970) = 0
  4  union all
  5  select plang, horbst From tab2
  6   where mod(plang,970) = 0
  7  ),
  8  u2 as (
  9  select plang, horbst From tab2
 10   where mod(plang,970) = 0
 11  union all
 12  select umo, erst  from tab3
 13  where mod(umo,970) = 0
 14  )
 15  select distinct t.* from
 16  (select * from u1 union all select * from u2) t;

     SNORD QUEEBO                                                                                                                           
---------- ----------------------------------------                                                                                         
      1940 Eebeeneebee1940                                                                                                                  
     11640 Eebeeneebee11640                                                                                                                 
     20370 Eebeeneebee20370                                                                                                                 
     23280 Eebeeneebee23280                                                                                                                 
     25220 Eebeeneebee25220                                                                                                                 
...
    978730 Uubuunuubuu978730                                                                                                                
    956420 Uubuunuubuu956420                                                                                                                
    982610 Uubuunuubuu982610                                                                                                                
    389940 Uubuunuubuu389940                                                                                                                
    303610 Uubuunuubuu303610                                                                                                                
    312340 Uubuunuubuu312340                                                                                                                
    301670 Uubuunuubuu301670                                                                                                                
    352110 Uubuunuubuu352110                                                                                                                
    349200 Uubuunuubuu349200                                                                                                                
    392850 Uubuunuubuu392850                                                                                                                

3090 rows selected.


Execution Plan
----------------------------------------------------------                                                                                  
Plan hash value: 294584558                                                                                                                  
                                                                                                                                            
-----------------------------------------------------------------------------------------------------                                       
| Id  | Operation                                | Name     | Rows  | Bytes | Cost (%CPU)| Time     |                                       
-----------------------------------------------------------------------------------------------------                                       
|   0 | SELECT STATEMENT                         |          |  4109 |   140K|  4150   (1)| 00:00:01 |                                       
|   1 |  HASH UNIQUE                             |          |  4109 |   140K|  4150   (1)| 00:00:01 |                                       
|   2 |   VIEW                                   |          |  4109 |   140K|  4149   (1)| 00:00:01 |                                       
|   3 |    UNION-ALL                             |          |       |       |            |          |                                       
|   4 |     VIEW                                 |          |  2048 | 71680 |  2074   (0)| 00:00:01 |                                       
|   5 |      UNION-ALL                           |          |       |       |            |          |                                       
|   6 |       TABLE ACCESS BY INDEX ROWID BATCHED| TAB1     |  1018 | 27486 |  1037   (0)| 00:00:01 |                                       
|*  7 |        INDEX RANGE SCAN                  | TAB1_FBI |  1031 |       |     6   (0)| 00:00:01 |                                       
|   8 |       TABLE ACCESS BY INDEX ROWID BATCHED| TAB2     |  1030 | 27810 |  1037   (0)| 00:00:01 |                                       
|*  9 |        INDEX RANGE SCAN                  | TAB2_FBI |  1031 |       |     6   (0)| 00:00:01 |                                       
|  10 |     VIEW                                 |          |  2061 | 72135 |  2074   (0)| 00:00:01 |                                       
|  11 |      UNION-ALL                           |          |       |       |            |          |                                       
|  12 |       TABLE ACCESS BY INDEX ROWID BATCHED| TAB2     |  1030 | 27810 |  1037   (0)| 00:00:01 |                                       
|* 13 |        INDEX RANGE SCAN                  | TAB2_FBI |  1031 |       |     6   (0)| 00:00:01 |                                       
|  14 |       TABLE ACCESS BY INDEX ROWID BATCHED| TAB3     |  1031 | 27837 |  1037   (0)| 00:00:01 |                                       
|* 15 |        INDEX RANGE SCAN                  | TAB3_FBI |  1031 |       |     6   (0)| 00:00:01 |                                       
-----------------------------------------------------------------------------------------------------                                       
                                                                                                                                            
Predicate Information (identified by operation id):                                                                                         
---------------------------------------------------                                                                                         
                                                                                                                                            
   7 - access(MOD("SNORD",970)=0)                                                                                                           
   9 - access(MOD("PLANG",970)=0)                                                                                                           
  13 - access(MOD("PLANG",970)=0)                                                                                                           
  15 - access(MOD("UMO",970)=0)                                                                                                             


Statistics
----------------------------------------------------------                                                                                  
          1  recursive calls                                                                                                                
          0  db block gets                                                                                                                  
       4140  consistent gets                                                                                                                
        213  physical reads                                                                                                                 
          0  redo size                                                                                                                      
     113955  bytes sent via SQL*Net to client                                                                                               
       2807  bytes received via SQL*Net from client                                                                                         
        207  SQL*Net roundtrips to/from client                                                                                              
          0  sorts (memory)                                                                                                                 
          0  sorts (disk)                                                                                                                   
       3090  rows processed                                                                                                                 

BLEEPLE @ smang > 

The three SORT UNIQUE operations have now been replaced with a single HASH UNIQUE operation. If the result sets are examined, you noticed that the SORT UNIQUE results are ordered by the leading column of the table and the HASH UNIQUE results appear to be haphazard. Believe it or not, the hash results are ordered; they are ordered by the hash key used to ‘weed out’ the distinct elements. In previous releases of Oracle (9 and earlier) the three SORT UNIQUE statements would be replaced by a single SORT UNIQUE from the ‘select distinct’ query. Nonetheless using UNION ALL does produce a less resource-intensive plan and returns the same data, even if not in the same physical order.

It may not be possible to give such problems much thought in this day and age because someone wants the results … now. But, the more you know about how Oracle behaves can make such work easier in the long run.

David Fitzjarrell
David Fitzjarrell
David Fitzjarrell has more than 20 years of administration experience with various releases of the Oracle DBMS. He has installed the Oracle software on many platforms, including UNIX, Windows and Linux, and monitored and tuned performance in those environments. He is knowledgeable in the traditional tools for performance tuning – the Oracle Wait Interface, Statspack, event 10046 and 10053 traces, tkprof, explain plan and autotrace – and has used these to great advantage at the U.S. Postal Service, American Airlines/SABRE, ConocoPhilips and SiriusXM Radio, among others, to increase throughput and improve the quality of the production system. He has also set up scripts to regularly monitor available space and set thresholds to notify DBAs of impending space shortages before they affect the production environment. These scripts generate data which can also used to trend database growth over time, aiding in capacity planning. He has used RMAN, Streams, RAC and Data Guard in Oracle installations to ensure full recoverability and failover capabilities as well as high availability, and has configured a 'cascading' set of DR databases using the primary DR databases as the source, managing the archivelog transfers manually and montoring, through scripts, the health of these secondary DR databases. He has also used ASM, ASMM and ASSM to improve performance and manage storage and shared memory.

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles