Functions#

This page specifically focuses on examining functions in SQL that are exclusively designed for use with window functions.


The following cell sets up a Docker container with SQL that we’ll use as an example.

docker run -d --rm\
    --name window_functions_pg\
    -e POSTGRES_PASSWORD=postgres\
    postgres:15.4 &> /dev/null
sleep 5

Note don’t forget to stop container after all.

docker stop window_functions_pg
window_functions_pg

Values shifting (LAG/LEAD)#

You can shift values over the window by using LAG functions for shifting down and LEAD funtions for shifting up. Find out more in the specific page.


The following cell demonstrates the application of the LAG and LEAD functions to our example.

docker exec -i window_functions_pg psql -U postgres -d postgres << EOF
SELECT 
    "group",
    LAG("group") OVER () lag_step_1,
    LEAD("group") OVER () lead_step_1,
    LAG("group", 3) OVER () lag_step3,
    LEAD("group", 3) OVER () lag_step3
FROM (
    VALUES
    (3, 'C', 'X'),
    (3, 'C', 'Y'),
    (1, 'C', 'X'),
    (4, 'B', 'Y'),
    (3, 'A', 'X'),
    (1, 'D', 'Y')
) AS temp(some_value, "group", group2);
EOF
 group | lag_step_1 | lead_step_1 | lag_step3 | lag_step3 
-------+------------+-------------+-----------+-----------
 C     |            | C           |           | B
 C     | C          | C           |           | A
 C     | C          | B           |           | D
 B     | C          | A           | C         | 
 A     | B          | D           | C         | 
 D     | A          |             | C         | 
(6 rows)

The result of applying LAG and LEAD produces new variables that can be represented as:

  • For LAG: \(x'_i = x_{i-s}\)

  • For LEAD: \(x'_i=x_{i+s}\)

Where:

  • \(i\) position of the row during generaiton.

  • \(x_i\) is the original group value at the \(i\)-th position.

  • \(s\) is the step value specified as the second argument of the function.

First/last value#

The FIRST_VALUE and LAST_VALUE window functions retrieve the first and last values, respectively, within a partition of a result set.


The followin cell shows using FIRST_VALUE for group and LAST_VALUE for first_group2.

docker exec -i window_functions_pg psql -U postgres -d postgres << EOF
SELECT 
    FIRST_VALUE("group") OVER () first_group,
    LAST_VALUE(group2) OVER () last_group2
FROM (
    VALUES
    (3, 'C', 'X'),
    (3, 'C', 'Y'),
    (1, 'C', 'X'),
    (4, 'B', 'Y'),
    (3, 'A', 'X'),
    (1, 'D', 'Y')
) AS temp(some_value, "group", group2);
EOF
 first_group | last_group2 
-------------+-------------
 C           | Y
 C           | Y
 C           | Y
 C           | Y
 C           | Y
 C           | Y
(6 rows)

The FIRST_VALUE for the group is C because it was the first value encountered during data generation. Similarly, the LAST_VALUE for the group2 is Y because it was the last value encountered.

Order by#

By using ORDER BY clause you can find value that corresponds to the biggest/lowest value of some other columns.


This example demonstrates how to retrieve the first value of the group and the last value of the group2, both based on the some_value axis.

docker exec -i window_functions_pg psql -U postgres -d postgres << EOF
SELECT 
    FIRST_VALUE("group") OVER (ORDER BY some_value) group_of_min_some_value,
    LAST_VALUE(group2) OVER (ORDER BY some_value) group_of_max_some_value
FROM (
    VALUES 
    (0, 'M', 'X'),
    (3, 'G', 'Z'),
    (10, 'C', 'Z'),
    (4, 'B', 'Y'),
    (3, 'A', 'X')
)
AS t (some_value, "group", group2);
EOF
 group_of_min_some_value | group_of_max_some_value 
-------------------------+-------------------------
 M                       | X
 M                       | X
 M                       | X
 M                       | Y
 M                       | Z
(5 rows)

In this scenario, M within the group corresponds to the lowest value on the some_value axis. Conversely, Z represents the highest value on the some_value axis within group2.