Unique values (DISTINCT)

Unique values (DISTINCT)#

Using the syntax SELECT DISTINCT ... you can get unique values in any column. Check official postgres documentation.

In the next cell, I create everything I need for the examples in this page.

%%bash
docker run --rm -d\
    -e POSTGRES_PASSWORD=postgres \
    --name sql_DISTINCT_examples \
    postgres:15.4 &> /dev/null
sleep 5
docker exec -i sql_DISTINCT_examples psql -U postgres -d postgres

\set QUIET on
CREATE TABLE non_unique_values(
    col1 TEXT,
    col2 TEXT
);
INSERT INTO non_unique_values(col1, col2) VALUES
('A', 'Z'),
('A', 'X'),
('A', 'Y'),
('A', 'X'),
('B', 'X'),
('B', 'Z'),
('C', 'Z'),
('C', 'Y'),
('C', 'Y');

The next cell shows the table I will use for examples in this section.

%%bash
docker exec -i sql_DISTINCT_examples psql -U postgres -d postgres

SELECT * FROM non_unique_values;

 col1 | col2 
------+------
 A    | Z
 A    | X
 A    | Y
 A    | X
 B    | X
 B    | Z
 C    | Z
 C    | Y
 C    | Y
(9 rows)

Note don’t forget to stop the container when you finish playing with examples.

!docker stop sql_DISTINCT_examples &> /dev/null

Specific column#

If you use only one column as the result of a SELECT, the unique values of that column will be retrieved.

%%bash
docker exec -i sql_DISTINCT_examples psql -U postgres -d postgres

SELECT DISTINCT col1 FROM non_unique_values;

 col1 
------
 B
 C
 A
(3 rows)

Columns combination#

If you use multiple columns as the result of a SELECT, you will get every possible combination of values in the columns once.

%%bash
docker exec -i sql_DISTINCT_examples psql -U postgres -d postgres

SELECT DISTINCT col1, col2 FROM non_unique_values;

 col1 | col2 
------+------
 A    | Z
 C    | Z
 B    | X
 A    | Y
 A    | X
 C    | Y
 B    | Z
(7 rows)

Unique on specific column#

By using DISTINCT ON (expression) you can got the table that contains records uniue by (expression). You can specify the column by which unique records should be selected.

Note The DISTINCT ON clause is not part of the SQL standard and is sometimes considered bad style because of the potentially indeterminate nature of its results. With judicious use of GROUP BY and subqueries in FROM, this construct can be avoided, but it is often the most convenient alternative.

Note It is assumed that in the controversial case of which entry to take, the one that occurs first in (expression) is always taken, but no official confirmation of this has yet been found.

So in the following example I just use col1 as (expression). As a result we have only unique values in col1 but uniqueness for col2 but uniqueness for col2 was not a concern for the query. Note that col2 only takes values that match the first entry in col1.

%%bash
docker exec -i sql_DISTINCT_examples psql -U postgres -d postgres

SELECT DISTINCT ON (col1) 
    col1, col2 
FROM 
    non_unique_values;

 col1 | col2 
------+------
 A    | Z
 B    | X
 C    | Z
(3 rows)

Unique values (DISTINCT)

Contents

Unique values (DISTINCT)#

Specific column#

Columns combination#

Unique on specific column#