SQL Correlated Subqueries

What is a Correlated Subquery?

A correlated subquery is a type of subquery that uses values from the outer query. Unlike regular subqueries, which are executed independently of the outer query, a correlated subquery is evaluated for each row processed by the outer query. This makes correlated subqueries more dynamic but often less efficient than regular subqueries.

Why Use Correlated Subqueries?

  1. Dynamic Filtering: Correlated subqueries allow for row-by-row comparisons between tables.
  2. Complex Conditions: They enable you to define conditions in the subquery that depend on values from the outer query.
  3. Data Analysis: Ideal for scenarios where calculations or conditions are based on the context of individual rows.

Syntax

mysql
1
2
3
4
5
6
7
SELECT column_name(s)
FROM table_name outer
WHERE condition
  AND column_name operator
      (SELECT column_name
       FROM table_name inner
       WHERE condition AND outer.column_name = inner.column_name);
  • Outer Query: The main query that processes data row by row.
  • Inner Query: The subquery that references columns from the outer query for its condition.

Examples of Correlated Subqueries

Example 1: Find Employees Earning More Than the Average Salary in Their Department

mysql
1
2
3
4
5
6
7
SELECT EmployeeName, Salary
FROM Employees e1
WHERE Salary > (
    SELECT AVG(Salary)
    FROM Employees e2
    WHERE e1.DepartmentID = e2.DepartmentID
);

Explanation:

  • The outer query retrieves employees' names and salaries.
  • The inner query calculates the average salary for the department of the current employee (e1.DepartmentID = e2.DepartmentID).
  • The WHERE clause ensures that only employees earning more than the average salary in their department are included.

Example 2: Find Employees Who Have the Highest Salary in Their Department

mysql
1
2
3
4
5
6
7
SELECT EmployeeName, Salary, DepartmentID
FROM Employees e1
WHERE Salary = (
    SELECT MAX(Salary)
    FROM Employees e2
    WHERE e1.DepartmentID = e2.DepartmentID
);

Explanation:

  • The outer query retrieves employees and their departments.
  • The inner query calculates the maximum salary for the department of the current employee.
  • Only employees with the highest salary in their department are included.

Key Features of Correlated Subqueries

  • Row-by-Row Evaluation: Each row in the outer query is processed individually with the subquery.
  • Dynamic Results: The subquery adapts based on the current row from the outer query.
  • Dependent on Outer Query: The subquery cannot run independently because it references columns from the outer query.

Key Takeaways

  • Purpose: Correlated subqueries dynamically evaluate conditions based on rows in the outer query.
  • Applications: Commonly used for comparisons, filtering, and analyzing relationships within tables.
  • Performance: While flexible, they can be less efficient than joins for large datasets.
  • Best Use Case: Ideal for scenarios requiring row-by-row evaluation where joins may not be suitable.