Clickhouse#
This section provides an overview of features related to intefacing with the Clickhouse database. The clickhouse_connect
is a database driver for python. Check the the description at the clickhouse site.
from time import sleep
import clickhouse_connect
from src.rerun_docker import reload_docker_container
container = reload_docker_container(
image="clickhouse/clickhouse-server:24",
name="clickhouse_examples",
detach=True,
remove=True,
ports={
"8123": 8123,
},
)
sleep(5)
client = clickhouse_connect.get_client(
host="localhost",
port=8123,
username="default",
password="",
database="default",
)
Execute methods#
The clickhouse_connect.client
has several methods to execute code in it: query
, command
and insert
. The difference isn’t really clear - because in general they all allow you to do the same things. But according to the documentation, they are optimized for the purposes that correspond to their names.
The following cell shows the execution of the SELECT
statement in the query
method and shows the typical format of the output.
query_result = client.query("SELECT 10 v10, 20 v20, 30 v30, 40 v40;")
print("Column names:", query_result.column_names)
print("Data:", query_result.result_rows)
Column names: ('v10', 'v20', 'v30', 'v40')
Data: [(10, 20, 30, 40)]
The following cell performs a CREATE TABLE
using the query
method.
query_result = client.query("""
CREATE TABLE
IF NOT EXISTS
test_table (v10 UInt8, v20 UInt8)
ENGINE = MergeTree ORDER BY v10;
""")
print("Column names:", query_result.column_names)
print("Data:", query_result.result_rows)
Column names: ('read_rows', 'read_bytes', 'written_rows', 'written_bytes', 'total_rows_to_read', 'result_rows', 'result_bytes', 'elapsed_ns', 'query_id')
Data: [[0, 0, 0, 0, 0, 0, 0, 8088019, '76cd5f5f-928d-4a06-a8dc-89bf52477d9b']]
Everything works fine - it looks like the output just returns some technical information.
And the next cell shows that the INSERT
operation can also be performed by query
.
query_result = client.query("""
INSERT INTO test_table (v10, v20)
VALUES (3, 5), (5, 5)
""")
print("Column names:", query_result.column_names)
print("Data:", query_result.result_rows)
Column names: ()
Data: []
Finally, the next cell shows that the CREATE TABLE
and INSERT
operations were sucessful - the data was inserted into the created table.
client.query("SELECT * FROM test_table;").result_rows
[(3, 5), (5, 5)]
Multistatement#
clickhouse_connect
doesn’t allow to run multistatement code.
The following cell shows the type of the error you would get in attempt to execute multistatement
sql code using clickhouse_connect
.
try:
client.query("SELECT 1 as val1; SELECT 2 as val2;")
except Exception as e:
print(e)
HTTPDriver for http://localhost:8123 received ClickHouse error code 62
Code: 62. DB::Exception: Syntax error (Multi-statements are not allowed): failed at position 17 (end of query) (line 1, col 17): ; SELECT 2 as val2
FORMAT Native. . (SYNTAX_ERROR) (version 24.10.2.80 (official build))