18.3 subqueries in select clause - mergen-sergio/DBest GitHub Wiki

Previous
Next

Subqueries in the SELECT Clause

In SQL, subqueries can appear inside the SELECT clause of an outer query. This is particularly useful when computing independent results from the same pivot table.

For example, suppose we want to retrieve, for each movie, its title, the number of cast members, and the number of crew members.
This can be achieved using subqueries inside the SELECT clause:

SELECT m.title, 
       (SELECT COUNT() FROM movie_cast mc WHERE m.movie_id = mc.movie_id), 
        (SELECT COUNT() FROM movie_crew mc WHERE m.movie_id = mc.movie_id) 
FROM movie m;

Alternatively, the LATERAL JOIN approach can be used to achieve the same result:

SELECT m.title, agg1.cont, agg2.cont
FROM movie m1
JOIN LATERAL (
    SELECT COUNT(*) AS cont FROM movie_cast mc
    WHERE mc.movie_id = m1.movie_id
) AS agg1 
JOIN LATERAL (
    SELECT COUNT(*) AS cont FROM movie_crew mc
    WHERE mc.movie_id = m1.movie_id
) AS agg2 
ON agg1.cont = agg2.cont;

Other alternatives exist, but some are highly inefficient. For example, a naive approach would require the movie table to join against movie_cast and then with movie_crew , leading to redundant aggregations caused by the unnecessary join between movie_cast and movie_crew. Both subqueries inside the SELECT clause and LATERAL JOIN help avoid this inefficiency.

In DBest, this query is represented by a tree structure, where processing starts at the movie table.

Step-by-step execution:

Query Execution Steps

  1. A nested loop left outer join is performed between the movie and the movie_cast tables.

    • The outer join ensures that movies without cast members are included.
  2. A GROUP BY operation(named agg1) is applied.

  3. This GROUP BY operation retains three columns:

    • movie_id → The grouping column.
    • title → Computed using the FIRST aggregation function.
      • This function takes the first occurrence of the title within the group, but any occurrence would work.
    • cast_count → Computed using the COUNT aggregation function.
  4. The same process is repeated for the movie_crew table:

    • Another outer join is followed by a GROUP BY operation to compute the crew member count.
  5. The second GROUP BY operation(named aggregate) retains four columns:

    • movie_id → The grouping column.
    • title → Retrieved from the previous aggregation using the FIRST function.
    • cast_count → Retrieved from the previous aggregation using the FIRST function.
    • crew_count → Computed using the COUNT aggregation function.
Subqueries in the SELECT clause

A very similar query plan was used to solve a query that included a subquery in the FROM clause, as shown in the previous section. The only distinction is that no filter is applied as a final step.

Subqueries in the SELECT clause can often be rewritten using LATERAL JOIN, which provides a more general mechanism for handling correlated subqueries. Unlike subqueries in SELECT, LATERAL JOIN allows additional computations, such as filtering and joining other tables, making it a more flexible approach. In the example provided, since both methods rely on similar logical steps, their query plans are almost identical.

Previous
Next
⚠️ **GitHub.com Fallback** ⚠️