Clickhouse

Clickhouse#

This section provides an overview of features related to intefacing with the Clickhouse database. The clickhouse_connect is a database driver for python. Check the the description at the clickhouse site.

from time import sleep
import clickhouse_connect
from src.rerun_docker import reload_docker_container

container = reload_docker_container(
    image="clickhouse/clickhouse-server:24",
    name="clickhouse_examples",
    detach=True,
    remove=True,
    ports={
        "8123": 8123,
    },
)
sleep(5)

client = clickhouse_connect.get_client(
    host="localhost",
    port=8123,
    username="default",
    password="",
    database="default",
)

Execute methods#

The clickhouse_connect.client has several methods to execute code in it: query, command and insert. The difference isn’t really clear - because in general they all allow you to do the same things. But according to the documentation, they are optimized for the purposes that correspond to their names.


The following cell shows the execution of the SELECT statement in the query method and shows the typical format of the output.

query_result = client.query("SELECT 10 v10, 20 v20, 30 v30, 40 v40;")
print("Column names:", query_result.column_names)
print("Data:", query_result.result_rows)
Column names: ('v10', 'v20', 'v30', 'v40')
Data: [(10, 20, 30, 40)]

The following cell performs a CREATE TABLE using the query method.

query_result = client.query("""
    CREATE TABLE
    IF NOT EXISTS 
    test_table (v10 UInt8, v20 UInt8) 
    ENGINE = MergeTree ORDER BY v10;
""")

print("Column names:", query_result.column_names)
print("Data:", query_result.result_rows)
Column names: ('read_rows', 'read_bytes', 'written_rows', 'written_bytes', 'total_rows_to_read', 'result_rows', 'result_bytes', 'elapsed_ns', 'query_id')
Data: [[0, 0, 0, 0, 0, 0, 0, 8088019, '76cd5f5f-928d-4a06-a8dc-89bf52477d9b']]

Everything works fine - it looks like the output just returns some technical information.

And the next cell shows that the INSERT operation can also be performed by query.

query_result = client.query("""
    INSERT INTO test_table (v10, v20) 
    VALUES (3, 5), (5, 5)
""")

print("Column names:", query_result.column_names)
print("Data:", query_result.result_rows)
Column names: ()
Data: []

Finally, the next cell shows that the CREATE TABLE and INSERT operations were sucessful - the data was inserted into the created table.

client.query("SELECT * FROM test_table;").result_rows
[(3, 5), (5, 5)]

Multistatement#

clickhouse_connect doesn’t allow to run multistatement code.


The following cell shows the type of the error you would get in attempt to execute multistatement sql code using clickhouse_connect.

try:
    client.query("SELECT 1 as val1; SELECT 2 as val2;")
except Exception as e:
    print(e)
HTTPDriver for http://localhost:8123 received ClickHouse error code 62
 Code: 62. DB::Exception: Syntax error (Multi-statements are not allowed): failed at position 17 (end of query) (line 1, col 17): ; SELECT 2 as val2
 FORMAT Native. . (SYNTAX_ERROR) (version 24.10.2.80 (official build))