Topics

It’s a common practice and often a performance optimization to rewrite correlated subqueries using joins. Here are some common patterns for converting correlated subqueries to equivalent queries using joins:

1. Correlated Subquery in the SELECT list (Calculating an aggregate for each row):

SELECT SalesOrderID, OrderDate, TotalDue, (
    SELECT COUNT(SalesOrderDetailID)
    FROM Sales.SalesOrderDetail
    WHERE SalesOrderID = SO.SalesOrderID
) as LineCount
FROM Sales.SalesOrderHeader SO

Equivalent Query using INNER JOIN and GROUP BY:

SELECT SO.SalesOrderID, OrderDate, TotalDue,
    COUNT(SOD.SalesOrderDetailID) as LineCount
FROM Sales.SalesOrderHeader SO
    INNER JOIN Sales.SalesOrderDetail SOD
    ON SOD.SalesOrderID = SO.SalesOrderID
GROUP BY SO.SalesOrderID, OrderDate, TotalDue

2. Correlated Subquery in the WHERE clause (Filtering based on a row-by-row comparison):

SELECT lastname, firstname, salary
FROM employee e1
WHERE e1.salary > (
    SELECT avg(salary)
    FROM employee e2
    WHERE e2.dept_id = e1.dept_id
)

Equivalent query using JOIN:

SELECT e1.lastname, e1.firstname, e1.salary
FROM employee e1
JOIN (
    SELECT dept_id, AVG(salary) AS avg_salary
    FROM employee
    GROUP BY dept_id
) AS dept_avg ON e1.dept_id = dept_avg.dept_id
WHERE e1.salary > dept_avg.avg_salary

3. Correlated Subquery with NOT EXISTS (Finding rows without a match):

SELECT lastname, firstname
FROM employees e1
WHERE NOT EXISTS (
    SELECT ph.lastname
    FROM payment_history ph
    WHERE ph.emp_id = e1.employee_id
        AND ph.payment_type = 'award'
)

Equivalent query using LEFT JOIN:

SELECT e1.lastname, e1.firstname
FROM employees e1
    LEFT JOIN payment_history ph ON e1.employee_id = ph.emp_id
    AND ph.payment_type = 'award'
WHERE ph.emp_id IS NULL

Tip

Check the execution plan to determine whether rewriting actually makes the query faster for your specific database and data. Sometimes, the optimizer can optimize a correlated subquery very well!