Group by#
Using the GROUP BY
SQL statement allows you to select columns whose unique values will define the groups for aggregation functions.
The following cell sets up a container with PostgreSQL that will be used for the example demonstrated on this page.
docker run --rm -d\
-e POSTGRES_PASSWORD=postgres \
--name groupby_examples \
postgres:15.4 &> /dev/null
sleep 5
docker exec -i groupby_examples psql -U postgres << EOF
CREATE TABLE aggregation_table(
col1 TEXT,
col2 INT
);
INSERT INTO aggregation_table(col1, col2) VALUES
('A', 5),
('A', 1),
('B', 2),
('B', 1),
('C', 3),
('C', 4);
SELECT * FROM aggregation_table;
EOF
CREATE TABLE
INSERT 0 6
col1 | col2
------+------
A | 5
A | 1
B | 2
B | 1
C | 3
C | 4
(6 rows)
By using GROUP BY
, we can explicitly state that we want to partition the table based on unique values of col1
and then apply the sum function to the values of col2
within each separate partition.
docker exec -i groupby_examples psql -U postgres << EOF
SELECT col1, SUM(col2)
FROM aggregation_table
GROUP BY col1;
EOF
col1 | sum
------+-----
B | 3
C | 7
A | 6
(3 rows)
Note don’t forget to stop the container.
docker stop groupby_examples
groupby_examples
Conditions on aggregats (HAVING)#
For scenarios where you need to apply a filter on the aggregated results, SQL provides the special keyword HAVING
. For more details on using HAVING
, please refer to the specific page.
docker exec -i groupby_examples psql -U postgres << EOF
SELECT col1, SUM(col2)
FROM aggregation_table
GROUP BY col1
HAVING SUM(col2) > 5;
EOF
col1 | sum
------+-----
C | 7
A | 6
(2 rows)
Conditions in aggregats (FILTER)#
By using FILTER
keyword you can specify conditions to the row to be used in aggregation.
Let’s say you need to count the occurrences of specific values in col2
within each group defined by col1
. This can be accomplished using the query shown in the following cell.
docker exec -i groupby_examples psql -U postgres << EOF
SELECT
col1,
COUNT(col2) FILTER (WHERE col2=1) "col2=1",
COUNT(col2) FILTER (WHERE col2=2) "col2=2"
FROM aggregation_table
GROUP BY col1;
EOF
col1 | col2=1 | col2=2
------+--------+--------
B | 1 | 1
C | 0 | 0
A | 1 | 0
(3 rows)