Values shifting (LAG/LEAD)

Values shifting (LAG/LEAD)#

This page focuses on functions that allow shifting arrays: LAG and LEAD.

The result of applying LAG and LEAD can be represented as:

  • For LAG: \(x'_i = x_{i-s}\)

  • For LEAD: \(x'_i = x_{i+s}\)

Where \(s\) is the number of cells we want to shift the values.

On this page, we dive into the options provided by different databases. In the following cell, both ClickHouse and Postgres are started.

docker run -d --rm\
    --name lag_lead_pg\
    -e POSTGRES_PASSWORD=postgres\
    postgres:15.4 &> /dev/null
docker run --rm -d \
    --name lag_lead_click \
    clickhouse/clickhouse-server &> /dev/null
sleep 5

Note: Don’t forget to stop the container afterward.

docker stop lag_lead_pg lag_lead_click
lag_lead_pg
lag_lead_click

Partition by#

If you include a PARTITION BY clause with the LAG window function, it will define groups of values within which the shifting operation will take place.


This example demonstrates shifting the group values within groups defined by group2.

docker exec -i window_functions_pg psql -U postgres -d postgres << EOF
SELECT 
    "group", group2,
    LAG("group") OVER (PARTITION BY group2)
FROM (
    VALUES
    (3, 'C', 'X'),
    (3, 'C', 'Y'),
    (1, 'C', 'X'),
    (4, 'B', 'Y'),
    (3, 'A', 'X'),
    (1, 'D', 'Y')
) AS temp(some_value, "group", group2);
EOF
 group | group2 | lag 
-------+--------+-----
 C     | X      | 
 C     | X      | C
 A     | X      | C
 C     | Y      | 
 B     | Y      | C
 D     | Y      | B
(6 rows)

Clickhouse#

Clickhouse by default don’t have LEAD/LAG functions, but you can reproduce the effect by using syntax:

  • any(<column name>) OVER (ROWS BETWEEN <offset> PRECEDING AND <offset> PRECEDING) for LAG function.

  • any(<column name>) OVER (ROWS BETWEEN <offset> FOLLOWING AND <offset> FOLLOWING) for LEAD function.


Here is an example of a LAG shift with an offset of 1 and a LEAD shift with an offset of 2.

docker exec lag_lead_click clickhouse-client -q "
CREATE TEMPORARY TABLE temp (original_column Int32) ENGINE = Memory;

INSERT INTO temp VALUES
    (4), (3), (3), (1), (3), (3), (3), (5), (5), (1);

SELECT 
    original_column,
    any(original_column) OVER (
        ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING
    ) AS lag_value, 
    any(original_column) OVER (
        ROWS BETWEEN 2 FOLLOWING AND 2 FOLLOWING
    ) AS lead_value
FROM temp
FORMAT Pretty;
"
    ┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━┓
    ┃ original_column ┃ lag_value ┃ lead_value ┃
    ┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━┩
 1. │               4 │         0 │          3 │
    ├─────────────────┼───────────┼────────────┤
 2. │               3 │         4 │          1 │
    ├─────────────────┼───────────┼────────────┤
 3. │               3 │         3 │          3 │
    ├─────────────────┼───────────┼────────────┤
 4. │               1 │         3 │          3 │
    ├─────────────────┼───────────┼────────────┤
 5. │               3 │         1 │          3 │
    ├─────────────────┼───────────┼────────────┤
 6. │               3 │         3 │          5 │
    ├─────────────────┼───────────┼────────────┤
 7. │               3 │         3 │          5 │
    ├─────────────────┼───────────┼────────────┤
 8. │               5 │         3 │          1 │
    ├─────────────────┼───────────┼────────────┤
 9. │               5 │         5 │          0 │
    ├─────────────────┼───────────┼────────────┤
10. │               1 │         5 │          0 │
    └─────────────────┴───────────┴────────────┘

Note: Instead of empty values, records without previous or next values are replaced with zeros—this is quite typical for ClickHouse.