ClickHouse 🤝 DuckDB
If you follow this blog you're familiar with the OLAP ecosystem, our passion for ClickHouse and our involvement in developing the chDB in-memory database.
The ClickHouse community is strong and proud but there's a small taboo 🤫
🦆 The bipedal in the room is DuckDB and its exciting features and small size!
😂 Jokes aside, there are already lots of comparisons around making a case for either or both platforms being the best or the fastest at something. We leave that to the benchmark experts, while we focus on what we love:
integrations without borders!
Enemies with Benefits
Can ClickHouse and DuckDB become enemies with benefits (for users)?
To make this little bromance happen we'll use a magical combo:
💡 Quackpipe runs a DuckDB core and supports the Clickhouse query API and several native data formats - essentially impersonating a basic ClickHouse and including a variety of ClickHouse Macro Aliases to make familiar commands and functions available out of the box for ClickHouse clients and users.
It can run standalone, embedded or accept input and commands via
-stdin from scripts. It even includes a fully working clickhouse play user interface! 😎
That's the definition of a ClickHouse UDF-ready toolset! Let's go OLAP²
Install Quackpipe UDF
curl -fsSL github.com/metrico/quackpipe/releases/latest/download/quackpipe-amd64 --output quackpipe \ && chmod +x quackpipe mv quackpipe /var/lib/clickhouse/user_scripts/quackpipe
⚡ Once ready, we can leverage our Quackpipe integration with two methods:
This method can be used to return multiple columns and rows from DuckDB into ClickHouse using any of the supported formats. Table schema MUST be known.
:) SELECT * FROM executable('quackpipe -stdin -format TSV', TSV, 'id UInt32, num UInt32', (SELECT 'SELECT 1, 2'))
:) SELECT * FROM executable('quackpipe -stdin -format TSV', TSV, 'id UInt32, num UInt32', ( SELECT 'SELECT 1, 2' )) Query id: dd878948-bec8-4abe-9e06-2f5813653c3a ┌─id─┬─num─┐ │ 1 │ 2 │ └────┴─────┘ 1 rows in set. Elapsed: 0.268 sec.
What is this? Think of it as a SELECT within a SELECT with a different syntax.
The data comes from DuckDB right into ClickHouse columns. You can read and pass data to UDF functions ad-hoc per query, or by using a dedicated engine table.
:) CREATE TABLE onlyone ( `id` UInt64 `num` UInt64 ) ENGINE = Executable('quackpipe -stdin -format TSV', TSV, ( SELECT 'SELECT 1, 2' )); :) SELECT * FROM onlyone Query id: 45aa7267-e0f4-4fdc-a52e-5a7a4bf2c0a8 ┌─id─┬─num─┐ │ 1 │ 2 │ └────┴─────┘ 1 rows in set. Elapsed: 0.272 sec.
NOTE: ONLY WORKS AS FUNCTION, cannot return multiple rows or columns!
- Configure the UDF function in
Stringwith the full results
the actual content can be JSON, CSV, TSV with or without Column Names
<functions> <function> <type>executable</type> <name>duckdb</name> <return_type>String</return_type> <argument> <type>String</type> </argument> <format>TSV</format> <command>quackpipe -stdin -format TSV</command> </function> </functions>
- Restart Clickhouse and make sure the function is loaded
clickhouse-udf | Processing configuration file '/etc/clickhouse-server/duckdb_function.xml'. clickhouse-udf | Saved preprocessed configuration to '/var/lib/clickhouse/preprocessed_configs/duckdb_function.xml'.
- Test with Clickhouse client.
b5c4f004e862 :) SELECT duckdb('SELECT 1') as t1 SELECT duckdb('SELECT 1') AS t1 Query id: 68c4fbd0-75a5-466d-9720-3937bcd96898 ┌─t1─┐ │ 1 │ └────┘ 1 row in set. Elapsed: 0.047 sec.
Upon request of Alexey Milovidov, here's DuckDB querying data out of ClickHouse and chdb using nothing more than the HTTP API and URL query parameters 🔥🔥🔥
SELECT * FROM read_json_auto("https://chdb.fly.dev/?default_format=JSONEachRow&query=SELECT number as once, number *2 as twice FROM numbers(10)")
The string format is not the prettiest (we'll work on that!) but the query returns steaming hot columns and column formats just fine! It even works on MotherDuck:
Where do we go from here? Can this combination be used to build something amazing and revolutionary, or is it just a useless hack/flex for both platforms?
That's up to the users and community - and hopefully, You too!
Want more? Discover all of our projects on the Metrico github repository
Did you find this article valuable?
Support qryn by becoming a sponsor. Any amount is appreciated!