BigQuery - iamshaqir/cheat-sheet GitHub Wiki

Window function calls

A window function, also known as an analytic function, computes values over a group of rows and returns a single result for each row. This is different from an aggregate function, which returns a single result for a group of rows.

A window function includes an OVER clause, which defines a window of rows around the row being evaluated. For each row, the window function result is computed using the selected window of rows as input, possibly doing aggregation.

With window functions you can compute moving averages, rank items, calculate cumulative sums, and perform other analyses.

Syntax of the OVER Clause

window_function_name ( [ argument_list ] ) OVER ( [ PARTITION BY partition_expression [, ...] ] [ ORDER BY order_expression [ { ASC | DESC } ] [, ...] ] [ { ROWS | RANGE } frame_clause ] )

Types of Frame Clauses

  • ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: Includes all rows from the start of the partition up to the current row. Common for cumulative sums.
  • ROWS BETWEEN 2 PRECEDING AND CURRENT ROW: Includes the current row and the two preceding rows. Common for rolling averages.
  • ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING: Includes the current row, one preceding row, and one following row.
  • ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING: Includes all rows in the partition (default if ORDER BY is present but no frame is specified for certain functions).

Various Use Cases and Examples

WITH product_sales as (
    SELECT DATE("2024-01-01") as sale_date, 'A' as product_id, 'Electronics' as department, 100 as sales  
    UNION ALL
    SELECT DATE("2024-01-01") as sale_date, 'B' as product_id, 'Clothing' as department, 50 as sales
    UNION ALL
    SELECT DATE("2024-01-02") as sale_date, 'A' as product_id, 'Electronics' as department, 120 as sales
    UNION ALL
    SELECT DATE("2024-01-02") as sale_date, 'C' as product_id, 'Home Goods' as department, 80 as sales
    UNION ALL
    SELECT DATE("2024-01-03") as sale_date, 'B' as product_id, 'Clothing' as department, 60 as sales
    UNION ALL
    SELECT DATE("2024-01-03") as sale_date, 'A' as product_id, 'Electronics' as department, 90 as sales
    UNION ALL
    SELECT DATE("2024-01-04") as sale_date, 'C' as product_id, 'Home Goods' as department, 110 as sales
)

Calculating Grand Total (No PARTITION BY, No ORDER BY)

select
  sale_date,
  product_id,
  sales,
  SUM(sales) OVER() AS total_sales_overall
FROM
  product_sales

Calculating Departmental Totals (PARTITION BY)

select
  sale_date,
  product_id,
  department,
  sales,
  SUM(sales) OVER(PARTITION BY department) AS total_sales_by_department
FROM
  product_sales

Ranking Rows (PARTITION BY and ORDER BY)

select
  sale_date,
  product_id,
  department,
  sales,
  RANK() OVER(PARTITION BY department ORDER BY sales DESC) AS rank_in_department
FROM
  product_sales

Calculating Running Totals (PARTITION BY, ORDER BY, ROWS)

select
  sale_date,
  product_id,
  department,
  sales,
  SUM(sales) OVER (
    PARTITION BY product_sales.department
    ORDER BY sale_date ASC
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS cumulative_sales_by_department
FROM
  product_sales

Calculating Moving Averages (PARTITION BY, ORDER BY, ROWS)

select
  sale_date,
  product_id,
  department,
  sales,
  AVG(sales) OVER (
    PARTITION BY product_sales.department
    ORDER BY sale_date ASC
    ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
  ) AS two_day_moving_avg
FROM
  product_sales