Did you know that MySQL has supported subqueries since version 4.1? In fact, it even boasts a few MySQL-specific features! Whether youre well acquainted with subqueries or just a novice in this area, youre bound to benefit from todays article, which will cover how to use subqueries in MySQL to get the most out of your SELECT queries while making your SQL more readable.
The Subquery Explained
A subquery is one that is nested within another query. In fact it is possible to nest subqueries within other subqueries. The main advantages of subqueries are:
- They allow queries that are structured so that it is possible to isolate each part of a statement.
- They provide alternative ways to perform operations that would otherwise require complex joins and unions.
- They are, in many people's opinion, more readable than complex joins or unions. According to the MySQL docs, it was the innovation of subqueries that gave people the original idea of calling the early SQL Structured Query Language.
MySQL subqueries can return a single value, single row, single column, or a table, which is a resultset containing one or more rows of one or more columns. Well be taking a look at each of these possibilities a little later on.
Here is an example to illustrate the basic syntax of a subquery:
SELECT * FROM t1 WHERE column1 = (SELECT column1 FROM t2);
The above example is made up of two parts: SELECT * FROM t1 WHERE column1 = is refered to as the outer query (or outer statement). The part between the parentheses, (SELECT column1 FROM t2), is the subquery. A subquery must always appear within parentheses. Also note that subqueries can be found on both in the column list and to the right side of the WHERE clause after a comparison or logical operator such as = (equal), <> (not equal), <= (less than or equal), >= (greater than or equal), < (less than), > (greater than), BETWEEN (value lies between two values), NOT (logical NOT), AND (logical AND), OR (logical OR).
There are few restrictions on the type of statements in which subqueries can be used. The subquery's outer statement must be of the type SELECT, INSERT, UPDATE, DELETE, SET or DO statement, and you cannot modify a table and select from it in the subquery. Other than that, a subquery can contain any of the keywords or clauses that an ordinary SELECT can contain, including DISTINCT, GROUP BY, ORDER BY, LIMIT, joins, index hints, UNION constructs, comments, functions, and so on.
Some common uses for subqueries are to retrieve the results of an aggregate function such as SUM or COUNT or to look up a description based on a code value. In large databases, its common practice to use some sort of general component ID that links to the unique ID field of data tables. Since there could potentially be dozens or even hundreds of tables that link to the code table, it is advantageous to use individual subqueries in the SELECT clause, rather than create many relationships that all link to the same table. The following query should help to clarify this point. It selects some client information, including the ID, citizenship, date of birth, gender, a count of purchases. The citizenship country and gender descriptions are retrieved from the code_values table. Notice how the count selects from the same table. This is accomplished by creating a second alias which is linked by the client_id:
SELECT client_id, (SELECT english_desc FROM code_values WHERE code_value_id = cl.citizenship_country_cd) AS citizenship, client_dob, (SELECT english_desc FROM code_values WHERE code_value_id = cl.client_gender_cd) AS gender (SELECT count(purchase_id) FROM clients AS cl2 WHERE cl2.client_id = cl.client_id) AS purchase_count FROM clients AS cl;
Subqueries that return a single value as the ones above do are called
scalar subqueries. Another place to put an aggregate function
is in the WHERE clause. In this example, a subquery is used instead of a
JOIN to compare total_sales to the greatest
SELECT associate_name FROM sales_associates WHERE total_sales = (SELECT MAX(monthly_sales) FROM sales
The key to using a subquery after a comparison operator is that you must be certain that the subquery will always return a single value. The following SELECT statement looks harmless enough, but it could cause an error:
SELECT associate_name FROM sales_associates WHERE associate_name = (SELECT name FROM sales WHERE seller_id = 99 AND date_of_sale = '2003-05-05');
The above statement will work as long as the seller did not make more than one sale on the given date. If more than one row is returned, youll get an error such as the following:
SQL Error: Subquery returns more than 1 row.
A safer way to compare one value against the results of a subquery is to use the IN operator, as it will work with any number of rows. All you have to do is replace the equals (=) with the IN keyword:
SELECT associate_name FROM sales_associates WHERE associate_name IN (SELECT name FROM sales WHERE seller_id = 99 AND date_of_sale = '2003-05-05');
One special instance of a scalar subquery is one that uses a SELECT CASE statement, which is a control construct that executes an SQL statement based on a given condition. It can only return a scalar value, so its perfect for choosing between fields:
(SELECT english_desc FROM vcode_value WHERE code_value_id = (SELECT CASE WHEN sfd.recomm_or_final_decision_ind = 1 THEN sfd.recommendation_cd ELSE sfd.final_decision_cd END )) AS Recommendation FROM sts_final_decision sfd WHERE sfd.business_no <> "";
Heres a more complex query that contains a total of four subqueries, two of which utilize the SELECT CASE statement and one which features the IN operator. The final_decision_desc is presumably an input parameter of a stored procedure. Such a proc would allow the user to search for a record by text description, as users dont typically know the underlying code values. The SELECT statement that follows the IN keyword fetches the code (or codes) that matches the supplied final_decision_desc parameter, which is then compared to those of the final_decision table:
SELECT final_decision_desc, (SELECT english_desc FROM vcode_value WHERE code_value_id = (SELECT CASE WHEN fd.recomm_or_final_decision_ind = 1 THEN fd.recommendation_cd ELSE fd.final_decision_cd END )) AS Recommendation FROM final_decision cd WHERE fd.final_decision_cd IN (SELECT ctd.code_value_id FROM vcode_table_data ctd WHERE (SELECT CASE WHEN language_cd = 'E' THEN english_desc ELSE french_desc END) = final_decision_desc;
The examples provided here are really just a taste of what can be done with subqueries. For more information on subqueries, be sure to check out the MySQL documentation on Subquery Syntax. They also have a series of articles on the subject. Here is Part 1.