Subqueries/nested queries/sub-selects A Select query nested within a Select, Update, Insert or Delete. This is the structuredness of Structured Query Lanaguage. Provides data for the From or Where (or Select) clauses. Used when one question must be answered before another can be done. Subquery can return one value (i.e. one column of one row), one column of many rows (i.e. a list of values), multicolumn values (i.e. several columns of one or more rows, a virtual table). Subquery can return scalar, one column, one row, or table. SELECT .. FROM .. WHERE f1 relOp (SELECT f2 FROM .. [WHERE..][GROUP BY..][HAVING..]) parens required around inner query, which runs before outer/main query. Cannot have Order By in subquery (would be use less anyway, since result is not seen). subquery could be another table: SELECT .. FROM t1 WHERE fi relOp (SELECT .. FROM t2 ..) Usually, subquery produces one column, thus it has only one Select item (f2), which is often an aggregate function: SELECT .. FROM .. WHERE f1 relOp (SELECT aggFunc(f2) FROM...) -- Can not use aggregate functions in Where clause: -- SELECT p_code,p_descript,p_price FROM product WHERE p_price=Max(p_price); -- so need a subquery (nested query): returns one value --row with max|min column value SELECT * FROM product WHERE p_price=(SELECT Max(p_price) FROM product); --rows > average column value SELECT * FROM product WHERE p_price>(SELECT Avg(p_price) FROM product) ORDER BY p_price DESC; --products supplied us by Gomez Bros. SELECT * FROM product WHERE v_code=(SELECT v_code FROM vendor WHERE v_name LIKE 'Gomez%'); --Note: can be done with join. Some joins convertible to [more efficient?][more understandable?] subqueries --subquery can be in Having clause: SELECT v_code,Avg(p_price) FROM product GROUP BY v_code HAVING Avg(p_price)<(SELECT Max(Avg(p_price)) FROM product GROUP BY v_code); SELECT v_code,Avg(p_price) FROM product GROUP BY v_code HAVING Avg(p_price) < (SELECT Max(Avg(p_price)) FROM product WHERE v_code IN (SELECT v_code FROM vendor WHERE v_order='N') GROUP BY v_code); --subquery in From clause (inline view [if alias used?]): Subquery that returns a column of values (list of values from one column of many rows). multi-row subquery. use IN operator: --vendors currently supplying products SELECT DISTINCT v_name,v_code FROM vendor WHERE v_code IN (SELECT DISTINCT v_code FROM product); --(doable by join). Distincts are not necessary here. --check referential integrity: any FKs of child product not in PKs of parent vendor SELECT v_name,v_code FROM vendor WHERE v_code NOT IN (SELECT DISTINCT v_code FROM product); --multiple rows returned by subquery. --Note: can be done with outer join --products whose vendor has Y order SELECT p_code,p_descript,p_price,v_code FROM product WHERE v_code IN (SELECT v_code FROM vendor WHERE v_order='Y'); --can be done with join --vendors who supply us saws SELECT v_name FROM vendor WHERE v_code IN (SELECT v_code FROM product WHERE p_descript LIKE '%saw%'); --same as join: SELECT DISTINCT v_name FROM vendor,product WHERE product.v_code=vendor.v_code AND p_descript LIKE '%saw%'; --inventory per vendor for those whose inventory is greater than average inventory per vendor ?? --first, inventory per vendor: SELECT v_code,Sum(p_onhand*p_price) FROM product GROUP BY v_code; --also, average inventory per vendor (average of those sums) ?? --bogus: inventory per vendor for those whose inventory is greater than average product inventory SELECT v_code,Sum(p_onhand*p_price) FROM product GROUP BY v_code HAVING Sum(p_onhand*p_price)>(SELECT Avg(p_onhand*p_price) FROM product); --products whose inventory is greater than average product inventory SELECT p_code,p_onhand*p_price FROM product WHERE p_onhand*p_price>(SELECT Avg(p_onhand*p_price) FROM product); Subquery that returns multiple columns (of multiple rows). --first, show this: SELECT p_descript,p_indate,p_discount FROM product ORDER BY p_discount,p_indate; --earliest indate per discount group: SELECT p_descript,p_indate,p_discount FROM product WHERE (p_discount,p_indate) IN (SELECT p_discount,Min(p_indate) FROM product GROUP BY p_discount); Correlated subquery contains reference to outer query row. --products whose price is the max of products supplied by its vendor: SELECT p_descript,p_price,v_code FROM product outer WHERE p_price = (SELECT Max(p_price) FROM product inner WHERE outer.v_code=inner.v_code); --could almost be done with: SELECT Max(p_price),v_code FROM product GROUP BY v_code; EXISTS checks if any rows returned. no need to return column values, so just return a literal value from subquery: --who are the managers: SELECT employee_id, last_name from employees outer WHERE EXISTS (SELECT 1 FROM employees inner WHERE inner.manager_id=outer.employee_id); --vendors not supplying any products: SELECT v_code,v_name FROM vendor WHERE NOT EXISTS (SELECT 1 FROM product WHERE vendor.v_code=product.v_code); --warning: NOT IN returns false if there are any Nulls! SELECT v_code,v_name FROM vendor WHERE v_code NOT IN (SELECT v_code FROM product); --so use NVL to return some non_Null value: SELECT v_code,v_name FROM vendor WHERE v_code NOT IN (SELECT NVL(v_code,0) FROM product); --v_codes that occur exactly twice in product SELECT * FROM product WHERE v_code IN (SELECT v_code FROM product GROUP BY v_code HAVING Count(*)=2); SELECT * FROM product t WHERE (SELECT Count(*) FROM product WHERE product.v_code=t.v_code)=2; --2 or more same v_codes SELECT v_code,Count(*) FROM product t WHERE (SELECT Count(*) FROM product WHERE product.v_code=t.v_code)>=2 GROUP BY v_code ORDER BY Count(*); the result of a subquery that returns 0 values is Null. comparing anything against null is Null: SELECT p_code,p_descript,p_price,v_code FROM product WHERE v_code=(SELECT v_code FROM vendor WHERE v_name='UMUC'); scalar subquery can be used wherever a scalar can be used. --ratio of #items onhand to min number should have. (bogus example) SELECT (SELECT Sum(p_onhand) FROM product) / (SELECT Sum(p_min) FROM product) FROM dual; --ratio of inventory on hand to min inventory SELECT (SELECT Sum(p_onhand*p_price) FROM product) / (SELECT Sum(p_min*p_price) FROM product) FROM dual; --not a sbuquery: inventory per vendor SELECT v_code,Sum(p_onhand*p_price) AS onhandinventory,Sum(p_min*p_price) AS mininventory,Sum(p_onhand*p_price)/Sum(p_min*p_price) AS ratio FROM product GROUP BY v_code ORDER BY onhandinventory/mininventory; SELECT (SELECT Sum(population) FROM city) / (SELECT Sum(population) FROM country) FROM dual; subquery as simpler, more readable, easier to maintain than equivalentcomplex join or union ? join often more efficient? Convert subquery of other table to join: inner join: Select fi From t1 Where fj IN (Select ci From t2); Select DISTINCT fi From t1,t2 Where fj=ci; Select DISTINCT fi From t1 Inner Join t2 On fj=ci; #South american countries whose surface areas are greater than Paraguay's: select Name, SurfaceArea from Country where SurfaceArea>=(select SurfaceArea from Country where Name ='paraguay') and Continent='South America' order by SurfaceArea; select c1.Name, c1.SurfaceArea from Country as c1, Country as c2 where c2.Name='paraguay' and c1.SurfaceArea>=c2.SurfaceArea and c1.Continent='South America' order by SurfaceArea; outer join: Select fi From t1 Where fj NOT IN (Select ci From t2); Select fi From t1 Left Join t2 On fj=ci Where ci Is Null;