Functions#
This page specifically focuses on examining functions in SQL that are exclusively designed for use with window functions.
The following cell sets up a Docker container with SQL that we’ll use as an example.
docker run -d --rm\
--name window_functions_pg\
-e POSTGRES_PASSWORD=postgres\
postgres:15.4 &> /dev/null
sleep 5
Note don’t forget to stop container after all.
docker stop window_functions_pg
window_functions_pg
Values shifting (LAG/LEAD)#
You can shift values over the window by using LAG
functions for shifting down and LEAD
funtions for shifting up. Find out more in the specific page.
The following cell demonstrates the application of the LAG
and LEAD
functions to our example.
docker exec -i window_functions_pg psql -U postgres -d postgres << EOF
SELECT
"group",
LAG("group") OVER () lag_step_1,
LEAD("group") OVER () lead_step_1,
LAG("group", 3) OVER () lag_step3,
LEAD("group", 3) OVER () lag_step3
FROM (
VALUES
(3, 'C', 'X'),
(3, 'C', 'Y'),
(1, 'C', 'X'),
(4, 'B', 'Y'),
(3, 'A', 'X'),
(1, 'D', 'Y')
) AS temp(some_value, "group", group2);
EOF
group | lag_step_1 | lead_step_1 | lag_step3 | lag_step3
-------+------------+-------------+-----------+-----------
C | | C | | B
C | C | C | | A
C | C | B | | D
B | C | A | C |
A | B | D | C |
D | A | | C |
(6 rows)
The result of applying LAG
and LEAD
produces new variables that can be represented as:
For
LAG
: \(x'_i = x_{i-s}\)For
LEAD
: \(x'_i=x_{i+s}\)
Where:
\(i\) position of the row during generaiton.
\(x_i\) is the original group value at the \(i\)-th position.
\(s\) is the step value specified as the second argument of the function.
First/last value#
The FIRST_VALUE
and LAST_VALUE
window functions retrieve the first and last values, respectively, within a partition of a result set.
The followin cell shows using FIRST_VALUE
for group
and LAST_VALUE
for first_group2
.
docker exec -i window_functions_pg psql -U postgres -d postgres << EOF
SELECT
FIRST_VALUE("group") OVER () first_group,
LAST_VALUE(group2) OVER () last_group2
FROM (
VALUES
(3, 'C', 'X'),
(3, 'C', 'Y'),
(1, 'C', 'X'),
(4, 'B', 'Y'),
(3, 'A', 'X'),
(1, 'D', 'Y')
) AS temp(some_value, "group", group2);
EOF
first_group | last_group2
-------------+-------------
C | Y
C | Y
C | Y
C | Y
C | Y
C | Y
(6 rows)
The FIRST_VALUE
for the group
is C
because it was the first value encountered during data generation. Similarly, the LAST_VALUE
for the group2
is Y
because it was the last value encountered.
Order by#
By using ORDER BY
clause you can find value that corresponds to the biggest/lowest value of some other columns.
This example demonstrates how to retrieve the first value of the group
and the last value of the group2
, both based on the some_value
axis.
docker exec -i window_functions_pg psql -U postgres -d postgres << EOF
SELECT
FIRST_VALUE("group") OVER (ORDER BY some_value) group_of_min_some_value,
LAST_VALUE(group2) OVER (ORDER BY some_value) group_of_max_some_value
FROM (
VALUES
(0, 'M', 'X'),
(3, 'G', 'Z'),
(10, 'C', 'Z'),
(4, 'B', 'Y'),
(3, 'A', 'X')
)
AS t (some_value, "group", group2);
EOF
group_of_min_some_value | group_of_max_some_value
-------------------------+-------------------------
M | X
M | X
M | X
M | Y
M | Z
(5 rows)
In this scenario, M
within the group
corresponds to the lowest value on the some_value
axis. Conversely, Z
represents the highest value on the some_value
axis within group2
.