tags : SQL

Basics

  • Like an aggregation but tuples are not grouped into a single output tuples.
  • You can probably do what window function does by clever use of subqueries and GROUP BY but window functions make it easier
  • Will add a new column in SELECT (Which will be the window aggregated value)
  • Let’s you
    • Reference values in other rows!
    • Do a sort of “extended group by”
    • Compute aggregation in incremental fashion
    • Performs a “sliding” calculation across a set of tuples that are related.

Use

  • Running totals
  • Moving averages

Components

  • Window: A set of rows
    • Can be a whole table
    • Can be one single row
  • Aggr Func: Normal aggregation functions + ROW_NUMBER(), RANK(), LAG, LEAD, NTH_VALUE etc.
  • OVER : How to group tuples while computing the window func

Aggregation functions

From some reddit comment

  • If you need to join a row to it’s logical “next” row (e.g. a query which joins today’s sales with yesterdays sales), you can simplify subqueries using LEAD/LAG
  • If you need to know the relative position of rows in the dataset, then use ROW_NUMBER/RANK/NTILE.
  • If you want to aggregate rows based on unique values of a certain column (e.g. return a table of sales with columns for each month), it’s much easier to use PIVOT/UNPIVOT than CASE statements.
  • If you need to filter rows based on the output of an aggregate function, for example finding all rows that where Count(*) > 2, use a HAVING clause than a nested query.

On OVER

  • ORDER BY and PARTITION BY can be used together
    • PARTITION_BY segregates is like GROUP_BY but per partition
    • ORDER_BY segregates using range/row/group with custom frame start and end using BETWEEN
  • OVER is not specific to Window functions but mostly used
  • Alternative to GROUP BY

OVER()

  • OVER() : makes all rows visible at every row

PARTITION BY

  • When you want to aggregate but not merge
  • This partition has nothing to do with database partitioning (See Scaling Databases)

ORDER BY

  • ORDER BY inside a window function only affects the window and not the query output

FRAMING

RANKING

Learn more on Window Functions