Lab-3

So far, we’ve seen joins and set operators for combining tables together. SQL provides another way to combine tables. You can nest queries within queries. Such an embedded query is called a subquery. A subquery computes results that are then used by an outer query. Basically, a subquery acts like any other expression we’ve seen so far. A subquery can be nested inside the SELECT, FROM, WHERE, and HAVING clauses. You can even nest subqueries inside another subquery

Find the names of the ingredients supplied by Veggies_R_Us

SELECT name FROM ingredients WHERE vendorid = (SELECT vendorid FROM vendors

WHERE companyname = 'Veggies_R_Us');

SQL marks the boundaries of a subquery with parentheses.

Some points to remember when using subqueries:

  1. Only the columns of the outermost query can appear in the result table. When creating a new query, the outermost query must contain all of the attributes needed in the answer.

  1. There must be some way of connecting the outer query to the inner query. All SQL comparison operators work with subqueries.
  2. Subqueries are restricted in what they can return. First, the row and column

count must match the comparison operator. Second, the data types must be compatible.

Subqueries using IN:

In the above query, the = operator makes the connection between the queries. Because = expects a single value, the inner query may only return a result with a single attribute and row. If subquery returns multiple rows, we must use a different operator. We use the IN operator, which expects a subquery result with zero or more rows.

Find the name of all ingredients supplied by Veggies_R_Us or Spring Water Supply

SELECT name FROM ingredients

WHERE vendorid IN (SELECT vendorid FROM vendors

WHERE companyname = 'Veggies_R_Us' OR

companyname = 'Spring Water Supply');

Find the average unit price for all items provided by Veggies_R_Us

SELECT AVG(unitprice) AS avgprice FROM ingredients

WHERE vendorid IN (SELECT vendorid FROM vendors

WHERE companyname = 'Veggies_R_Us');

Subqueries using BETWEEN:

Find all of the ingredients with an inventory within 25% of the average inventory of ingredients

SELECT name FROM ingredients

WHERE inventory BETWEEN (SELECT AVG(inventory) * 0.75 FROM ingredients)

AND

(SELECT AVG(inventory) * 1.25 FROM ingredients);

Subqueries can even be combined with other predicates in the WHERE clause, including other Subqueries. Look at the following query.

Find the companies who were referred by Veggies_R_Us and provide an ingredient in the milk food group

SELECT companyname FROM vendors

WHERE (referredby IN (SELECT vendorid FROM vendors

WHERE companyname = 'Veggies_R_Us')) AND (vendorid IN

(SELECT vendorid FROM ingredients

WHERE foodgroup = 'Milk'));

Multilevel Subquery Nesting:

Find the name and price for all items using an ingredient supplied by Veggies_R_Us

SELECT name, price

FROM items

WHERE itemid IN

(SELECT itemid -- Subquery 3 FROM madewith

WHERE ingredientid IN

(SELECT ingredientid -- Subquery 2 FROM ingredients

WHERE vendorid =

(SELECT vendorid -- Subquery 1 FROM vendors

WHERE companyname = 'Veggies_R_Us')));

First the innermost subquery is executed which return one single value. Then subquery2 is executed, the result of which is passed to subquery3. We can use GROUP BY, HAVING, and ORDER BY with subqueries.

For each store, find the total sales of items made with ingredients supplied by Veggies_R_Us. Ignore meals and only consider stores with at least two such items sold

SELECT storeid, SUM(price) AS sales FROM orders

WHERE menuitemid IN

(SELECT itemid -- Subquery 3 FROM madewith

WHERE ingredientid IN

(SELECT ingredientid -- Subquery 2 FROM ingredients

WHERE vendorid =

(SELECT vendorid -- Subquery 1 FROM vendors

WHERE companyname = 'Veggies_R_Us'))) GROUP BY storeid

HAVING COUNT(*) > 2 ORDER BY sales DESC;

GROUP BY, HAVING can be used in subqueries as well. Using ORDER BY in a subquery makes little sense because the order of the results is determined by the execution of the outer query.

Subqueries Using NOT IN:

Find all of the ingredients supplied by someone other than Veggies_R_Us

SELECT name FROM ingredients

WHERE vendorid NOT IN (SELECT vendorid FROM vendors

WHERE companyname = 'Veggies_R_Us');

Find the company name of the small vendors who don’t provide any ingredients with large (>100) inventories

SELECT companyname FROM vendors

WHERE vendorid NOT IN (SELECT vendorid FROM ingredients

WHERE inventory > 100);

What is wrong with this query? The ingredient Secret Dressing does not have a vendor, so the list of values contains a NULL. If any element in the NOT IN list is NULL, then no rows evaluate to true, and the final result table is empty.

Solution: Eliminate NULL values from the subquery result. Rewrite the above query!!!

Note: IN over an empty table always returns false, therefore NOT IN always returns true. Tryout the following query!

SELECT companyname FROM vendors

WHERE referredby NOT IN (SELECT vendorid

FROM vendors

WHERE companyname = 'No Such Company');

Combining JOIN and Subqueries:

Nested queries are not restricted to a single table.

Find the name and price of all items using an ingredient supplied by Veggies_R_Us

SELECT DISTINCT itemid, price FROM items NATURAL JOIN madewith WHERE ingredientid IN

(SELECT ingredientid FROM ingredients NATURAL JOIN vendors WHERE companyname = 'Veggies_R_Us');

Standard Comparison Operators with Lists Using ANY, SOME, or ALL

We can modify the meaning of the SQL standard comparison operators with ANY, SOME, and ALL so that the operator applies to a list of values instead of a single value.

Using ANY or SOME:

The ANY or SOME modifiers determine if the expression evaluates to true for at least one row in the subquery result.

Find all items that have a price that is greater than any salad item

SELECT name FROM items

WHERE price > ANY (SELECT price FROM itemsWHERE name LIKE '%Salad');

Find all ingredients not supplied by Veggies_R_Us or Spring Water Supply

SELECT name FROM ingredients

WHERE ingredientid NOT IN (SELECT ingredientid FROM ingredients

WHERE vendorid = ANY (SELECT vendorid FROM vendors

WHERE companyname = 'Veggies_R_Us' OR companyname = 'Spring Water Supply'));

Using ALL:

The ALL modifier determines if the expression evaluates to true for all rows in the subquery result.

Find all ingredients that cost at least as much as every ingredient in a salad

SELECT name FROM ingredients

WHERE unitprice >= ALL (SELECT unitprice FROM ingredients ing NATURAL JOIN madewith mw JOIN items i USING(itemid) WHERE i.name LIKE '%Salad');

Find the name of all ingredients supplied by someone other than Veggies_R_Us or Spring Water Supply

SELECT name FROM ingredients

WHERE vendorid <> ALL (SELECT vendorid FROM vendors

WHERE companyname = 'Veggies_R_Us' OR companyname = 'Spring Water Supply');

Find the most expensive items

SELECT * FROM items

WHERE price >= ALL (SELECT price FROM items);

What is wrong with above query ? Correct query:

SELECT * FROM items

WHERE price >= ALL (SELECT max(price) FROM items);

Correlated Subqueries:

Correlated subqueries are not independent of the outer query. Correlated subqueries work by first executing the outer query and then executing the inner query for each row from the outer query.

Find the items that contain 3 or more ingredients

SELECT itemid, name

FROM items

WHERE (SELECT COUNT(*)

FROM madewith

WHERE madewith.itemid = items.itemid) >= 3;

Look closely at the inner query. It cannot be executed independently from the outer query because the WHERE clause references the items table from the outer query. Note that in the inner query we must use the table name from the outer query to qualify itemid. How does this execute? Because it’s a correlated subquery, the outer query fetches all the rows from the items table. For each row from the outer query, the inner query is executed to determine the number of ingredients for the particular itemid.

Find all of the vendors who referred two or more vendors

SELECT vendorid, companyname

FROM vendors v1

WHERE (SELECT COUNT(*)

FROM vendors v2

WHERE v2.referredby = v1.vendorid) >= 2;

EXISTS:

EXISTS is a conditional that determines if any rows exist in the result of a subquery. EXISTS returns true if <subquery> returns at least one row and false otherwise.

Find the meals containing an ingredient from the Milk food group

SELECT * FROM meals m

WHERE EXISTS (SELECT *

FROM partof p JOIN items USING (itemid) JOIN madewith USING (itemid)

JOIN ingredients USING (ingredientid)

WHERE foodgroup = 'Milk' AND m.mealid = p.mealid);

Find all of the vendors that did not recommend any other vendor

SELECT vendorid, companyname FROM vendors v1

WHERE NOT EXISTS (SELECT * FROM vendors v2

WHERE v2.referredby = v1.vendorid);

Derived Relations — Subqueries in the FROM Clause:

List the name and inventory value of each ingredient in the Fruit or Vegetable food group and its supplier

SELECT name, companyname, val FROM vendors v,

(SELECT name, vendorid, unitprice * inventory as val FROM ingredients i WHERE foodgroup IN ('Fruit', 'Vegetable') ) d WHERE v.vendorid = d.vendorid

Subqueries in the HAVING Clause:

Find all vendors who provide more ingredients than Spring Water Supply

SELECT companyname

FROM vendors v, ingredients i WHERE i.vendorid = v.vendorid GROUP BY v.vendorid, companyname HAVING COUNT(*) > (SELECT COUNT(*) FROM ingredients i, vendors v WHERE i.vendorid = v.vendorid AND

companyname = 'Spring Water Supply');