Value#
This page focuses on the value window functions.
Values shifting (LAG/LEAD)#
You can shift values over the window by using LAG
functions for shifting down and LEAD
funtions for shifting up. Find out more in the specific page.
The following cell demonstrates the application of the LAG
and LEAD
functions to our example.
--postgreSQL
SELECT
"group",
LAG("group") OVER () lag_step_1,
LEAD("group") OVER () lead_step_1,
LAG("group", 3) OVER () lag_step3,
LEAD("group", 3) OVER () lag_step3
FROM (
VALUES
(3, 'C', 'X'),
(3, 'C', 'Y'),
(1, 'C', 'X'),
(4, 'B', 'Y'),
(3, 'A', 'X'),
(1, 'D', 'Y')
) AS temp(some_value, "group", group2);
SELECT 6
group | lag_step_1 | lead_step_1 | lag_step3 | lag_step3 |
---|---|---|---|---|
C | C | B | ||
C | C | C | A | |
C | C | B | D | |
B | C | A | C | |
A | B | D | C | |
D | A | C |
The result of applying LAG
and LEAD
produces new variables that can be represented as:
For
LAG
: \(x'_i = x_{i-s}\)For
LEAD
: \(x'_i=x_{i+s}\)
Where:
\(i\) position of the row during generaiton.
\(x_i\) is the original group value at the \(i\)-th position.
\(s\) is the step value specified as the second argument of the function.
Clickhouse#
Clickhouse don’t have LEAD
/LAG
exactly functions, but you can reproduce the effect by using syntax:
any(<column name>) OVER (ROWS BETWEEN <offset> PRECEDING AND <offset> PRECEDING)
forLAG
function.any(<column name>) OVER (ROWS BETWEEN <offset> FOLLOWING AND <offset> FOLLOWING)
forLEAD
function.
Here is an example of a LAG
shift with an offset of 1 and a LEAD
shift with an offset of 2.
--ClickHouse
DROP TABLE IF EXISTS temp;
CREATE TEMPORARY TABLE temp (original_column Int32) ENGINE = Memory;
INSERT INTO temp VALUES
(4), (3), (3), (1), (3), (3), (3), (5), (5), (1);
SELECT
original_column,
any(original_column) OVER (
ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING
) AS lag_value,
any(original_column) OVER (
ROWS BETWEEN 2 FOLLOWING AND 2 FOLLOWING
) AS lead_value
FROM temp;
elapsed_ns | query_id |
---|---|
431318 | 6a866f8c-5916-4ee2-aebc-a47af6aad413 |
original_column | lag_value | lead_value |
---|---|---|
4 | 0 | 3 |
3 | 4 | 1 |
3 | 3 | 3 |
1 | 3 | 3 |
3 | 1 | 3 |
3 | 3 | 5 |
3 | 3 | 5 |
5 | 3 | 1 |
5 | 5 | 0 |
1 | 5 | 0 |
Note: Instead of empty values, records without previous or next values are replaced with zeros—this is quite typical for ClickHouse.
First/last value#
The FIRST_VALUE
and LAST_VALUE
window functions retrieve the first and last values, respectively, within a partition of a result set.
The followin cell shows using FIRST_VALUE
for group
and LAST_VALUE
for first_group2
.
--postgreSQL
SELECT
FIRST_VALUE("group") OVER () first_group,
LAST_VALUE(group2) OVER () last_group2
FROM (
VALUES
(3, 'C', 'X'),
(3, 'C', 'Y'),
(1, 'C', 'X'),
(4, 'B', 'Y'),
(3, 'A', 'X'),
(1, 'D', 'Y')
) AS temp(some_value, "group", group2);
SELECT 6
first_group | last_group2 |
---|---|
C | Y |
C | Y |
C | Y |
C | Y |
C | Y |
C | Y |
The FIRST_VALUE
for the group
is C
because it was the first value encountered during data generation. Similarly, the LAST_VALUE
for the group2
is Y
because it was the last value encountered.
Order by#
By using ORDER BY
clause you can find value that corresponds to the biggest/lowest value of some other columns.
This example demonstrates how to retrieve the first value of the group
and the last value of the group2
, both based on the some_value
axis.
--postgreSQL
SELECT
FIRST_VALUE("group") OVER (ORDER BY some_value) group_of_min_some_value,
LAST_VALUE(group2) OVER (ORDER BY some_value) group_of_max_some_value
FROM (
VALUES
(0, 'M', 'X'),
(3, 'G', 'Z'),
(10, 'C', 'Z'),
(4, 'B', 'Y'),
(3, 'A', 'X')
)
AS t (some_value, "group", group2);
SELECT 5
group_of_min_some_value | group_of_max_some_value |
---|---|
M | X |
M | X |
M | X |
M | Y |
M | Z |
In this scenario, M
within the group
corresponds to the lowest value on the some_value
axis. Conversely, Z
represents the highest value on the some_value
axis within group2
.