Lab-3
So far, we’ve seen joins and set operators for combining tables together. SQL provides another way to combine tables. You can nest queries within queries. Such an embedded query is called a subquery. A subquery computes results that are then used by an outer query. Basically, a subquery acts like any other expression we’ve seen so far. A subquery can be nested inside the SELECT, FROM, WHERE, and HAVING clauses. You can even nest subqueries inside another subquery
Find the names of the ingredients supplied by Veggies_R_Us
SELECT name FROM ingredients WHERE vendorid = (SELECT vendorid FROM vendors
WHERE companyname = 'Veggies_R_Us');
SQL marks the boundaries of a subquery with parentheses.
Some points to remember when using subqueries:
- Only the columns of the outermost query can appear in the result table. When creating a new query, the outermost query must contain all of the attributes needed in the answer.
- There must be some way of connecting the outer query to the inner query. All SQL comparison operators work with subqueries.
- Subqueries are restricted in what they can return. First, the row and column
count must match the comparison operator. Second, the data types must be compatible.
Subqueries using IN:
In the above query, the = operator makes the connection between the queries. Because = expects a single value, the inner query may only return a result with a single attribute and row. If subquery returns multiple rows, we must use a different operator. We use the IN operator, which expects a subquery result with zero or more rows.
Find the name of all ingredients supplied by Veggies_R_Us or Spring Water Supply
SELECT name FROM ingredients
WHERE vendorid IN (SELECT vendorid FROM vendors
WHERE companyname = 'Veggies_R_Us' OR
companyname = 'Spring Water Supply');
Find the average unit price for all items provided by Veggies_R_Us
SELECT AVG(unitprice) AS avgprice FROM ingredients
WHERE vendorid IN (SELECT vendorid FROM vendors
WHERE companyname = 'Veggies_R_Us');
Subqueries using BETWEEN:
Find all of the ingredients with an inventory within 25% of the average inventory of ingredients
SELECT name FROM ingredients
WHERE inventory BETWEEN (SELECT AVG(inventory) * 0.75 FROM ingredients)
AND
(SELECT AVG(inventory) * 1.25 FROM ingredients);
Subqueries can even be combined with other predicates in the WHERE clause, including other Subqueries. Look at the following query.
Find the companies who were referred by Veggies_R_Us and provide an ingredient in the milk food group
SELECT companyname FROM vendors
WHERE (referredby IN (SELECT vendorid FROM vendors
WHERE companyname = 'Veggies_R_Us')) AND (vendorid IN
(SELECT vendorid FROM ingredients
WHERE foodgroup = 'Milk'));
Multilevel Subquery Nesting:
Find the name and price for all items using an ingredient supplied by Veggies_R_Us
SELECT name, price
FROM items
WHERE itemid IN
(SELECT itemid -- Subquery 3 FROM madewith
WHERE ingredientid IN
(SELECT ingredientid -- Subquery 2 FROM ingredients
WHERE vendorid =
(SELECT vendorid -- Subquery 1 FROM vendors
WHERE companyname = 'Veggies_R_Us')));
First the innermost subquery is executed which return one single value. Then subquery2 is executed, the result of which is passed to subquery3. We can use GROUP BY, HAVING, and ORDER BY with subqueries.
For each store, find the total sales of items made with ingredients supplied by Veggies_R_Us. Ignore meals and only consider stores with at least two such items sold
SELECT storeid, SUM(price) AS sales FROM orders
WHERE menuitemid IN
(SELECT itemid -- Subquery 3 FROM madewith
WHERE ingredientid IN
(SELECT ingredientid -- Subquery 2 FROM ingredients
WHERE vendorid =
(SELECT vendorid -- Subquery 1 FROM vendors
WHERE companyname = 'Veggies_R_Us'))) GROUP BY storeid
HAVING COUNT(*) > 2 ORDER BY sales DESC;
GROUP BY, HAVING can be used in subqueries as well. Using ORDER BY in a subquery makes little sense because the order of the results is determined by the execution of the outer query.
Subqueries Using NOT IN:
Find all of the ingredients supplied by someone other than Veggies_R_Us
SELECT name FROM ingredients
WHERE vendorid NOT IN (SELECT vendorid FROM vendors
WHERE companyname = 'Veggies_R_Us');
Find the company name of the small vendors who don’t provide any ingredients with large (>100) inventories
SELECT companyname FROM vendors
WHERE vendorid NOT IN (SELECT vendorid FROM ingredients
WHERE inventory > 100);
What is wrong with this query? The ingredient Secret Dressing does not have a vendor, so the list of values contains a NULL. If any element in the NOT IN list is NULL, then no rows evaluate to true, and the final result table is empty.
Solution: Eliminate NULL values from the subquery result. Rewrite the above query!!!
Note: IN over an empty table always returns false, therefore NOT IN always returns true. Tryout the following query!
SELECT companyname FROM vendors
WHERE referredby NOT IN (SELECT vendorid
FROM vendors
WHERE companyname = 'No Such Company');
Combining JOIN and Subqueries:
Nested queries are not restricted to a single table.
Find the name and price of all items using an ingredient supplied by Veggies_R_Us
SELECT DISTINCT itemid, price FROM items NATURAL JOIN madewith WHERE ingredientid IN
(SELECT ingredientid FROM ingredients NATURAL JOIN vendors WHERE companyname = 'Veggies_R_Us');
Standard Comparison Operators with Lists Using ANY, SOME, or ALL
We can modify the meaning of the SQL standard comparison operators with ANY, SOME, and ALL so that the operator applies to a list of values instead of a single value.
Using ANY or SOME:
The ANY or SOME modifiers determine if the expression evaluates to true for at least one row in the subquery result.
Find all items that have a price that is greater than any salad item
SELECT name FROM items
WHERE price > ANY (SELECT price FROM itemsWHERE name LIKE '%Salad');
Find all ingredients not supplied by Veggies_R_Us or Spring Water Supply
SELECT name FROM ingredients
WHERE ingredientid NOT IN (SELECT ingredientid FROM ingredients
WHERE vendorid = ANY (SELECT vendorid FROM vendors
WHERE companyname = 'Veggies_R_Us' OR companyname = 'Spring Water Supply'));
Using ALL:
The ALL modifier determines if the expression evaluates to true for all rows in the subquery result.
Find all ingredients that cost at least as much as every ingredient in a salad
SELECT name FROM ingredients
WHERE unitprice >= ALL (SELECT unitprice FROM ingredients ing NATURAL JOIN madewith mw JOIN items i USING(itemid) WHERE i.name LIKE '%Salad');
Find the name of all ingredients supplied by someone other than Veggies_R_Us or Spring Water Supply
SELECT name FROM ingredients
WHERE vendorid <> ALL (SELECT vendorid FROM vendors
WHERE companyname = 'Veggies_R_Us' OR companyname = 'Spring Water Supply');
Find the most expensive items
SELECT * FROM items
WHERE price >= ALL (SELECT price FROM items);
What is wrong with above query ? Correct query:
SELECT * FROM items
WHERE price >= ALL (SELECT max(price) FROM items);
Correlated Subqueries:
Correlated subqueries are not independent of the outer query. Correlated subqueries work by first executing the outer query and then executing the inner query for each row from the outer query.
Find the items that contain 3 or more ingredients
SELECT itemid, name
FROM items
WHERE (SELECT COUNT(*)
FROM madewith
WHERE madewith.itemid = items.itemid) >= 3;
Look closely at the inner query. It cannot be executed independently from the outer query because the WHERE clause references the items table from the outer query. Note that in the inner query we must use the table name from the outer query to qualify itemid. How does this execute? Because it’s a correlated subquery, the outer query fetches all the rows from the items table. For each row from the outer query, the inner query is executed to determine the number of ingredients for the particular itemid.
Find all of the vendors who referred two or more vendors
SELECT vendorid, companyname
FROM vendors v1
WHERE (SELECT COUNT(*)
FROM vendors v2
WHERE v2.referredby = v1.vendorid) >= 2;
EXISTS:
EXISTS is a conditional that determines if any rows exist in the result of a subquery. EXISTS returns true if <subquery> returns at least one row and false otherwise.
Find the meals containing an ingredient from the Milk food group
SELECT * FROM meals m
WHERE EXISTS (SELECT *
FROM partof p JOIN items USING (itemid) JOIN madewith USING (itemid)
JOIN ingredients USING (ingredientid)
WHERE foodgroup = 'Milk' AND m.mealid = p.mealid);
Find all of the vendors that did not recommend any other vendor
SELECT vendorid, companyname FROM vendors v1
WHERE NOT EXISTS (SELECT * FROM vendors v2
WHERE v2.referredby = v1.vendorid);
Derived Relations — Subqueries in the FROM Clause:
List the name and inventory value of each ingredient in the Fruit or Vegetable food group and its supplier
SELECT name, companyname, val FROM vendors v,
(SELECT name, vendorid, unitprice * inventory as val FROM ingredients i WHERE foodgroup IN ('Fruit', 'Vegetable') ) d WHERE v.vendorid = d.vendorid
Subqueries in the HAVING Clause:
Find all vendors who provide more ingredients than Spring Water Supply
SELECT companyname
FROM vendors v, ingredients i WHERE i.vendorid = v.vendorid GROUP BY v.vendorid, companyname HAVING COUNT(*) > (SELECT COUNT(*) FROM ingredients i, vendors v WHERE i.vendorid = v.vendorid AND
companyname = 'Spring Water Supply');