Functions

Functions#

This page describes the different types of window functions, discusses the specifics of each group, and illustrates the key features in practice.

Ranking#

Ranking functions allow you to assign numbers to the elements in a partition based on specific rules. The following table provides a brief description of the ranking funcitons.

Function

Description

ROW_NUMBER()

Assigns a unique number to each row.

RANK()

Assigns ranks with gaps for ties.

DENSE_RANK()

Assigns ranks without gaps for ties.

NTILE(n)

Divides rows into n buckets with equal sizes.


The following postgreSQL query illustrates the ranking functions applied to the frame that describing the differences between them.

--postgreSQL
SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY "group" ORDER BY "order"),
    RANK() OVER (PARTITION BY "group" ORDER BY "order"),
    DENSE_RANK() OVER (PARTITION BY "group" ORDER BY "order"),
    NTILE(2) OVER (PARTITION BY "group" ORDER BY "order") AS ntile_2,
    NTILE(3) OVER (PARTITION BY "group" ORDER BY "order") AS ntile_3
FROM (
    VALUES
    ('C', 4),
    ('B', 2),
    ('B', 3),
    ('C', 4),
    ('A', 5),
    ('C', 6),
    ('B', 3),
    ('A', 8)
) AS temp("group", "order");
SELECT 8
group order row_number rank dense_rank ntile_2 ntile_3
A 5 1 1 1 1 1
A 8 2 2 2 2 2
B 2 1 1 1 1 1
B 3 2 2 2 1 2
B 3 3 2 2 2 3
C 4 1 1 1 1 1
C 4 2 1 1 1 2
C 6 3 3 2 2 3

Aggregate#

Aggregate window functions work just like regular aggregation functions, but in the context of window functions, they don’t collapse the table by the aggregation variable. The following table lists the aggregate window functions and their descriptions.

Function

Description

SUM(column)

Returns the cumulative or grouped sum.

AVG(column)

Returns the average.

MIN(column)

Returns the minimum value.

MAX(column)

Returns the maximum value.

COUNT(column)

Returns the number of rows.


The next cell shows the aggregation functions applied to the corresponding dataset.

--postgreSQL
SELECT
    *,
    SUM("value") OVER (PARTITION BY "group"),
    AVG("value") OVER (PARTITION BY "group"),
    MIN("value") OVER (PARTITION BY "group"),
    MAX("value") OVER (PARTITION BY "group"),
    COUNT("value") OVER (PARTITION BY "group")
FROM (
    VALUES
    ('C', 4),
    ('B', 2),
    ('B', 3),
    ('C', 4),
    ('A', 5),
    ('C', 6),
    ('B', 3),
    ('A', 8)
) AS temp("group", "value");
SELECT 8
group value sum avg min max count
A 8 136.5 5 8 2
A 5 136.5 5 8 2
B 3 82.66667 2 3 3
B 3 82.66667 2 3 3
B 2 82.66667 2 3 3
C 4 144.66667 4 6 3
C 4 144.66667 4 6 3
C 6 144.66667 4 6 3

Statistical#

Statistical (analytical) functions vary depending on the SQL dialect and sometimes are considered part of a different group of functions.

Function

Description

CUME_DIST()

Cumulative distribution of a value in its partition.

PERCENT_RANK()

Percent rank of a value in its partition.

PERCENTILE_CONT()

Continuous percentile (interpolated).

PERCENTILE_DISC()

Discrete percentile (actual row value).


The following cell demonstrates some of the functions in action in PostgreSQl.

Note. The PERCENTILE_COUNT and PERCENTILE_DISC functions are omitted because they are not supported by PostgreSQL in the context of window functions.

--postgreSQL
SELECT
    *,
    CUME_DIST() OVER (PARTITION BY "group" ORDER BY "order"),
    PERCENT_RANK() OVER (PARTITION BY "group" ORDER BY "order")
FROM (
    VALUES
    (3, 'C', 1),
    (3, 'B', 2),
    (1, 'B', 3),
    (4, 'C', 4),
    (3, 'A', 5),
    (5, 'C', 6),
    (6, 'B', 7),
    (7, 'A', 8)
) AS temp("order", "group", "value")
ORDER BY "group", "order";
SELECT 8
ordergroup value cume_dist percent_rank
3A 5 0.5 0
7A 8 1 1
1B 3 0.333333 0
3B 2 0.666667 0.5
6B 7 1 1
3C 1 0.333333 0
4C 4 0.666667 0.5
5C 6 1 1

Value#

Value window functions (рус. функции смещения) allow each row in a group to access a value from another row within the same group. The following table shows the basic window functions.

Function

Description

LAG(column, n)

Returns the value from n rows before the current row.

LEAD(column, n)

Returns the value from n rows after the current row.

FIRST_VALUE()

Returns the first value in the window frame.

LAST_VALUE()

Returns the last value in the window frame.

NTH_VALUE(col, n)

Returns the n-th value in the window frame.

For more details on the value functions in SQL, check out the value page.


The following cell shows how value window functions works in the postgreSQL.

--postgreSQL
SELECT
    *,
    LAG("value", 1) OVER (PARTITION BY "group" ORDER BY "order"),
    LEAD("value", 1) OVER (PARTITION BY "group" ORDER BY "order"),
    FIRST_VALUE("value") OVER (PARTITION BY "group" ORDER BY "order"),
    LAST_VALUE("value") OVER (PARTITION BY "group" ORDER BY "order"),
    NTH_VALUE("value", 2) OVER (PARTITION BY "group" ORDER BY "order")
FROM (
    VALUES
    (3, 'C', 1),
    (3, 'B', 2),
    (1, 'B', 3),
    (4, 'C', 4),
    (3, 'A', 5),
    (5, 'C', 6),
    (6, 'B', 7),
    (7, 'A', 8)
) AS temp("order", "group", "value");
SELECT 8
ordergroup value lag lead first_value last_value nth_value
3A 5 8 5 5
7A 8 5 5 8 8
1B 3 2 3 3
3B 2 3 7 3 2 2
6B 7 2 3 7 2
3C 1 4 1 1
4C 4 1 6 1 4 4
5C 6 4 1 6 4