Getting ANSI About Joins

Tuesday May 20th 2003 by Jim Czuprynski
Share:

The new ANSI format for joins in Oracle 9i may appear alien at first, but they have the potential to become part of a DBA's arsenal for clarifying SQL and PL/SQL code. This article discusses the new join formats including NATIVE, CROSS, and OUTER and includes examples for review and comparison to the "traditional" join mechanisms.

I recently had an opportunity to review a Microsoft Access query created against one of our production databases. The query seemed to run forever and took up huge amounts (4GB) of TEMPFILE space. Our development team eventually solved the problem - an inexperienced developer had missed a critical join between one of the six tables in the query. Moreover, since Access translates all its joins into ANSI syntax, it took us some time to decode the SQL that caused the problem.



If you've been working with SQL for some time, I'm willing to bet that the ANSI joins may look a bit confusing and convoluted, especially if you've always used the "traditional" join predicates (i.e. WHERE table1.column = table2.column). I have been writing SQL queries since 1985 starting with TERADATA databases, back when the concept of a terabyte of information was still intimidating. However, I must admit that the ANSI syntax, once understood, has some advantages over the traditional syntax.



Here is a quick summary of how the ANSI syntax can be applied to SQL queries in Oracle 9iR2. I have provided side-by-side examples showing how the traditional vs. ANSI syntax can be used to obtain the same result. Please note that I have utilized the HR schema in the EXAMPLE tablespace that is provided with Oracle 9iR2 for these examples. See the section at the end of this document for a listing of these tables' metadata.



NATURAL Joins. A natural join, as its name implies, can be invoked when two or more tables share exactly the same columns needed for a successful equijoin. For example, these queries will return all Region and Country information for all countries whose name that contains the string "united":



Example: NATURAL Join

Traditional Syntax

ANSI SYNTAX

SELECT
   r.region_id,
   r.region_name,
   c.country_id,
   c.country_name
  FROM 
   countries c,
   regions r
 WHERE c.region_id = r.region_id
   AND LOWER(c.country_name) LIKE '%united%';
SELECT
   region_id,
   r.region_name,
   c.country_id,
   c.country_name
  FROM countries c
  NATURAL JOIN regions r
 WHERE LOWER(c.country_name) LIKE '%united%';

Note that even though the table names are prefixed in both examples, REGION_ID cannot use the prefix; Oracle chooses the appropriate column in the naturally-joined table from which to gather data. If you get an ORA-25155: column used in NATURAL join cannot have qualifier error message, check over the columns named in the query - you may have inadvertently broken this rule.

by Jim Czuprynski

JOIN ... USING. When you need to join tables that share more than one column naturally, the JOIN ... USING syntax needs to be used. A NATURAL join between the Employees and Departments tables, for example, could yield unexpected results because the tables share both the DEPARTMENT_ID and MANAGER_ID columns, so the JOIN...USING syntax can be used to alleviate this issue. These queries will display Department and Employee information for all employees hired after December 31, 1999:

Example: JOIN...USING

Traditional Syntax

SELECT
   d.department_id || ' - ' || d.department_name "Department",
   e.employee_id, 
   e.last_name || ', ' || e.first_name "Name",
   e.hire_date
  FROM 
   employees e,
   departments d
 WHERE e.department_id = d.department_id
   AND e.hire_date > TO_DATE('12/31/1999', 'mm/dd/yyyy');;

ANSI SYNTAX

SELECT
   department_id || ' - ' || d.department_name "Department",
   employee_id, 
   e.last_name || ', ' || e.first_name "Name",
   e.hire_date
  FROM  employees e 
  JOIN departments d USING (department_id)
 WHERE e.hire_date > TO_DATE('12/31/1999', 'mm/dd/yyyy');

Again, note that as with the NATURAL JOIN syntax, it's improper to use the table prefix for the columns specified in the JOIN...USING statement (in this case, DEPARTMENT_ID).

JOIN ... ON. When you need to describe exactly how two or more tables should be joined together, the ANSI JOIN...ON syntax is ideal. Notice that these queries are joining three tables together to return employee, job, and job history information for all employees hired after December 31, 1999:

Example: JOIN...ON

Traditional Syntax

ANSI SYNTAX

SELECT
   employee_id, 
   e.last_name || ', ' || e.first_name "Name",
   job_id,
   j.job_title,
   jh.start_date,
   jh.end_date
  FROM
   employees e,
   jobs j,
   job_history jh
WHERE e.job_id = j.job_id
   AND e.job_id = jh.job_id
   AND e.hire_date > TO_DATE('12/31/1999', 'mm/dd/yyyy');
SELECT
   employee_id, 
   e.last_name || ', ' || e.first_name "Name",
   job_id,
   j.job_title,
   jh.start_date,
   jh.end_date
  FROM  employees e 
  JOIN jobs j ON (j.job_id = e.job_id)
  JOIN job_history jh ON (jh.job_id = j.job_id)
 WHERE e.hire_date > TO_DATE('12/31/1999', 'mm/dd/yyyy'); 

Outer Joins. Outer joins are usually needed when all rows of one side of the join equation must be retrieved regardless of how many matches are found between the other side of the equation.

A left outer join is constructed whenever all rows on the left side of the join equation need to be returned regardless of whether or not any rows exist on the right side of the join. The traditional syntax uses a plus sign in parentheses - (+) - to indicate which side of the join may not contain any corresponding rows.

In the following example, the query needs to return a total count of all Employees for all Departments regardless of whether some Departments have no employees:

Example: LEFT OUTER Join

Traditional Syntax

ANSI SYNTAX

SELECT
   d.department_id,
   d.department_name,
   COUNT(e.employee_id) "Employees"
  FROM
   departments d,
   employees e
 WHERE d.department_id = e.department_id (+) 
 GROUP BY d.department_id, d.department_name
 ORDER BY d.department_id, d.department_name;
SELECT
   d.department_id,
   d.department_name,
   COUNT(e.employee_id) "Employees"
  FROM
   departments d LEFT OUTER JOIN employees e 
   ON d.department_id = e.department_id
 GROUP BY d.department_id, d.department_name
 ORDER BY d.department_id, d.department_name;

Likewise, a right outer join is invoked whenever all rows on the right side of the join equation need to be returned regardless of whether or not any rows exist on the left side of the join. Here are examples satisfying the opposite of the prior requirement: Return a count of total Employees in each Department regardless of whether some Employees have been assigned a Department that does not yet exist:

Example: RIGHT OUTER Join

Traditional Syntax

ANSI SYNTAX

SELECT
   d.department_id,
   d.department_name,
   COUNT(e.employee_id) "Employees"
  FROM
   departments d,
   employees e
 WHERE d.department_id (+) = e.department_id
 GROUP BY d.department_id, d.department_name
 ORDER BY d.department_id, d.department_name;
SELECT
   d.department_id,
   d.department_name,
   COUNT(e.employee_id) "Employees"
  FROM
   departments d RIGHT OUTER JOIN employees e 
   ON d.department_id = e.department_id
 GROUP BY d.department_id, d.department_name
 ORDER BY d.department_id, d.department_name;

Finally, there is the full outer join. As its name implies, a full outer join returns results from both sides of the join equation. The only way to perform a full outer join in traditional syntax is via the UNION or UNION ALL operators. However, the ANSI syntax provides the FULL OUTER JOIN to accomplish this instead. As you can see, the FULL OUTER JOIN syntax results in a much more compact query:

Example: FULL OUTER Join

Traditional Syntax

ANSI SYNTAX

   SELECT
      d.department_id,
      d.department_name,
      COUNT(e.employee_id) "Employees"
     FROM
      departments d,
      employees e
    WHERE d.department_id = e.department_id (+) 
    GROUP BY d.department_id, d.department_name
UNION
   SELECT
      d.department_id,
      d.department_name,
      COUNT(e.employee_id) "Employees"
     FROM
      departments d,
      employees e
    WHERE d.department_id (+) = e.department_id
    GROUP BY d.department_id, d.department_name;
SELECT
   d.department_id,
   d.department_name,
   COUNT(e.employee_id) "Employees"
  FROM
   employees e FULL OUTER JOIN departments d 
   ON d.department_id = e.department_id
 GROUP BY d.department_id, d.department_name
 ORDER BY d.department_id, d.department_name;

Self-Joins. The recursive join that results when a set of data is joined back to itself is also known as a self-join. This is typically encountered when a parent-child relationship in a hierarchy needs to be traversed. When using the ANSI syntax to accomplish this, only the JOIN...ON syntax will work, and the keyword INNER can be used for clarity. For example, these queries will return a list of Employees hired after December 31, 1999 and their current Managers:

Example: INNER JOIN

Traditional Syntax

ANSI SYNTAX

SELECT
   m.last_name || ', ' || m.first_name "Manager",
   e.last_name || ', ' || e.first_name "Name",
   e.hire_date
  FROM
   employees e,
   employees m
 WHERE e.manager_id = m.employee_id        
   AND e.hire_date > TO_DATE('12/31/1999', 'mm/dd/yyyy');
SELECT
   m.last_name || ', ' || m.first_name "Manager",
   e.last_name || ', ' || e.first_name "Name",
   e.hire_date
  FROM employees e 
   INNER JOIN employees m
   ON (e.manager_id = m.employee_id)        
 WHERE e.hire_date > TO_DATE('12/31/1999', 'mm/dd/yyyy');
by Jim Czuprynski

Cartesian Products. Though I must admit I have spent a lot of time trying to avoid them, Cartesian products do happen from time to time, usually as the result of an equijoin condition that has been missed in a query using traditional syntax. However, one of the advantages of the ANSI join syntax is that a specific keyword, CROSS JOIN, is required to create a Cartesian product, as shown in these next examples:

Example: CROSS JOIN

Traditional Syntax

ANSI SYNTAX

SELECT 
   d.department_name,
   l.city,
   l.country_id
  FROM
    departments d,
    locations l
 WHERE l.country_id = 'JP';
SELECT 
   d.department_name,
   l.city,
   l.country_id
  FROM departments d
    CROSS JOIN locations l
 WHERE l.country_id = 'JP';

Summary. Though my experiments are still underway, and I have yet to completely convince my fellow developers of the benefits of the ANSI syntax, I have found that there is indeed considerable value in ANSI join syntax:

Intuitive. The ANSI syntax uses specific, intuitive keywords to describe the exact operations to be performed while the sets of data are being joined.

Distinctive. Since the new ANSI syntax is quite different from the traditional join method, it is immediately obvious which sets of data have been joined and what the relationships are. It is no longer necessary to untangle a complex set of selection criteria to determine which are join-related and which are merely filtering the final result set.

Clarity. I have had to work with several different dialects of SQL over the past several years, including Informix, Sybase, DB2, and Access, and the syntax for outer joins is slightly different for each dialect. The ANSI syntax, however, leaves no doubt as to which "side" of the join is the outermost.

More powerful. The new FULL OUTER JOIN syntax can be used in place of a UNION without fear of forgetting the difference between UNION and UNION ALL.

If you still harbor any doubts about experimenting with and implementing the new ANSI syntax, consider this: Oracle mentions in the preparation notes for the first OCA test that extensive knowledge of the ANSI join syntax is important for a passing grade. If you are on the road to becoming an Oracle Certified Professional, it certainly appears they must be mastered.

Jim Czuprynski is an Oracle DBA for a telecommunications company in Schaumburg, IL. He can be contacted at jczuprynski@zerodefectcomputing.com.


Example Tables Metadata. Here's a listing of all tables and their corresponding columns in Oracle's HR example schema.

SQL> select table_name from user_tables;

TABLE_NAME                                                                      
------------------------------                                                  
COUNTRIES                                                                       
DEPARTMENTS                                                                     
EMPLOYEES                                                                       
JOBS                                                                            
JOB_HISTORY                                                                     
LOCATIONS                                                                       
REGIONS                                                                         

7 rows selected.

SQL> describe countries
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 COUNTRY_ID                                NOT NULL CHAR(2)
 COUNTRY_NAME                                       VARCHAR2(40)
 REGION_ID                                          NUMBER

SQL> describe departments
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 DEPARTMENT_ID                             NOT NULL NUMBER(4)
 DEPARTMENT_NAME                           NOT NULL VARCHAR2(30)
 MANAGER_ID                                         NUMBER(6)
 LOCATION_ID                                        NUMBER(4)

SQL> describe employees
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 EMPLOYEE_ID                               NOT NULL NUMBER(6)
 FIRST_NAME                                         VARCHAR2(20)
 LAST_NAME                                 NOT NULL VARCHAR2(25)
 EMAIL                                     NOT NULL VARCHAR2(25)
 PHONE_NUMBER                                       VARCHAR2(20)
 HIRE_DATE                                 NOT NULL DATE
 JOB_ID                                    NOT NULL VARCHAR2(10)
 SALARY                                             NUMBER(8,2)
 COMMISSION_PCT                                     NUMBER(2,2)
 MANAGER_ID                                         NUMBER(6)
 DEPARTMENT_ID                                      NUMBER(4)

SQL> describe jobs
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 JOB_ID                                    NOT NULL VARCHAR2(10)
 JOB_TITLE                                 NOT NULL VARCHAR2(35)
 MIN_SALARY                                         NUMBER(6)
 MAX_SALARY                                         NUMBER(6)

SQL> describe job_history
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 EMPLOYEE_ID                               NOT NULL NUMBER(6)
 START_DATE                                NOT NULL DATE
 END_DATE                                  NOT NULL DATE
 JOB_ID                                    NOT NULL VARCHAR2(10)
 DEPARTMENT_ID                                      NUMBER(4)

SQL> describe locations
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 LOCATION_ID                               NOT NULL NUMBER(4)
 STREET_ADDRESS                                     VARCHAR2(40)
 POSTAL_CODE                                        VARCHAR2(12)
 CITY                                      NOT NULL VARCHAR2(30)
 STATE_PROVINCE                                     VARCHAR2(25)
 COUNTRY_ID                                         CHAR(2)

SQL> 
SQL> describe regions
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 REGION_ID                                 NOT NULL NUMBER
 REGION_NAME                                        VARCHAR2(25)


Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved