Tigris S3 Storage + qryn

Using Tigris S3 with ClickHouse and qryn

Alex Maitland · 3 min read

Meet Tigris

Tigris is a new distributed, S3-compatible object storage service operated by Fly.io. It offers global bucket replication, low pricing, and a generous free tier:

  • 5GB of data storage per month

  • 10,000 PUT, COPY, POST, LIST requests per month

  • 100,000 GET, SELECT and all other requests per month

Example

Let's say you have a bucket with 100GB of data, and over one month you make 100,000 PUT requests and 1,000,000 GET requests to the objects in the bucket. You would be charged as follows:

  • Data Storage: 5GB x $0 + 95GB x $0.02/GB/month = $1.90

  • PUT Requests: 10,000 x $0 + 90,000 x $0.005/1000 requests = $0.45

  • GET Requests: 100,000 x $0 + 900,000 x $0.0005/1000 requests = $0.45

  • Data Transfer: $0
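The arithmetic above can be reproduced with a small sketch. The free-tier allowances and unit prices are copied from this example; the function name `monthly_bill` is hypothetical, and the count of 100,000 PUT requests is inferred from the PUT line of the breakdown. Treat it as an illustration, not an official Tigris calculator.

```python
# Free-tier allowances and unit prices, copied from the example above.
FREE_STORAGE_GB = 5
FREE_PUT_REQUESTS = 10_000
FREE_GET_REQUESTS = 100_000
STORAGE_USD_PER_GB = 0.02
PUT_USD_PER_1000 = 0.005
GET_USD_PER_1000 = 0.0005


def monthly_bill(storage_gb, put_requests, get_requests):
    """Return per-item charges in USD after subtracting the free tier."""
    billable_gb = max(storage_gb - FREE_STORAGE_GB, 0)
    billable_puts = max(put_requests - FREE_PUT_REQUESTS, 0)
    billable_gets = max(get_requests - FREE_GET_REQUESTS, 0)
    bill = {
        "storage": billable_gb * STORAGE_USD_PER_GB,
        "put": billable_puts / 1000 * PUT_USD_PER_1000,
        "get": billable_gets / 1000 * GET_USD_PER_1000,
        "transfer": 0.0,  # data transfer is listed as $0 in the example
    }
    bill["total"] = sum(bill.values())
    # Round to cents for display.
    return {item: round(usd, 2) for item, usd in bill.items()}


bill = monthly_bill(storage_gb=100, put_requests=100_000, get_requests=1_000_000)
print(bill)
```

With these inputs the three charges come to $1.90, $0.45 and $0.45, for a total of $2.80.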

There’s more! Storage costs are calculated using GB/month, determined by averaging the daily peak storage over a monthly period. For example:

  • Storing 1 GB constantly for a whole month = 1 GB/month

  • Storing 10 GB for 12 days + 20 GB for 18 days = 16 GB/month
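The GB/month figure is a day-weighted average of the daily peak. Here is a minimal sketch of that calculation, assuming a 30-day month as in the examples above; `gb_month` is a hypothetical helper name.

```python
def gb_month(periods):
    """Average daily peak storage over the billing month.

    `periods` is a list of (peak_gb, days) pairs that together
    cover every day of the month.
    """
    total_days = sum(days for _, days in periods)
    gb_days = sum(peak_gb * days for peak_gb, days in periods)
    return gb_days / total_days


constant = gb_month([(1, 30)])         # 1 GB stored for the whole month
mixed = gb_month([(10, 12), (20, 18)])  # 10 GB for 12 days, then 20 GB for 18 days
print(constant, mixed)
```

The second case works out as (10 x 12 + 20 x 18) / 30 = 480 / 30 = 16 GB/month.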

🚀 Sounds interesting? Get ready! This example shows how to use Tigris buckets as a cold storage disk with the ClickHouse S3 table engine and qryn. Let’s do this.

Setup Instructions

Get Tigris

  • Sign in to your Fly.io/Tigris account and create a new bucket, e.g.:
https://yourbucket.fly.storage.tigris.dev
  • Generate a token pair with write permissions to the bucket, e.g.:
Access Key ID = XXXXXXXXXXXXXXXXXXXXXXXX
Secret Access Key = YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY

ClickHouse

Before we proceed, let’s validate our bucket and practice some simple queries.

  • Configure an S3 table in ClickHouse using Parquet format

  • Configure the S3 Engine with your Tigris bucket and tokens

  • Configure max_threads, max_insert_threads based on your CPU cores

CREATE TABLE s3_tigris (name String, value UInt32) 
   ENGINE=S3('https://yourbucket.fly.storage.tigris.dev/somefolder/sometable.parquet', 'XXXXXXXXXXXXXXXXXXXXXXXX', 'YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY', 'Parquet') 
   SETTINGS max_threads=8, max_insert_threads=8, input_format_parallel_parsing=0, input_format_with_names_use_header=0;
  • INSERT & SELECT data using the Tigris storage table
INSERT INTO s3_tigris VALUES ('one', 1), ('two', 2), ('three', 3);
SELECT * FROM s3_tigris LIMIT 2;

Alright! If everything works as expected, we’re ready to steam ahead.

Tigris Storage for qryn

Manual queries are fun. Next, let's configure Tigris as a ClickHouse storage disk for our qryn instance to store our Logs, Metrics, Traces and Profiling data.

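Here’s an overly simple configuration with two storage policies. The external policy uses S3 as the only storage for our data. The tiered policy keeps recent data on a local disk named ssd, which must be defined separately under disks, and moves it to Tigris as the local disk fills up.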

<yandex>
  <storage_configuration>
    <disks>
      <tigris>
        <type>s3</type>
        <endpoint>https://yourbucket.fly.storage.tigris.dev/fakekey</endpoint>
        <access_key_id>XXXXXXXXXXXXXXXXXXXXXXXX</access_key_id>
        <secret_access_key>YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY</secret_access_key>
        <data_cache_enabled>1</data_cache_enabled>
        <data_cache_max_size>8589934592</data_cache_max_size>
      </tigris>
    </disks>
    <policies>
      <external>
        <volumes>
          <s3>
            <disk>tigris</disk>
          </s3>
        </volumes>
      </external>
      <tiered>
        <move_factor>0.05</move_factor>
        <volumes>
          <hot>
            <disk>ssd</disk>
          </hot>
          <s3>
            <disk>tigris</disk>
            <prefer_not_to_merge>true</prefer_not_to_merge>
          </s3>
        </volumes>
      </tiered>
    </policies>
  </storage_configuration>
</yandex>
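Before loading a configuration like this, it can help to check that every disk referenced by a policy is actually defined. The sketch below does that with Python's standard library against a trimmed copy of the config above. The helper name `undefined_disks` is hypothetical, and the only ClickHouse-specific assumption is that the built-in `default` disk always exists.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the storage configuration above (credentials omitted).
CONFIG = """
<yandex>
  <storage_configuration>
    <disks>
      <tigris>
        <type>s3</type>
      </tigris>
    </disks>
    <policies>
      <external>
        <volumes>
          <s3><disk>tigris</disk></s3>
        </volumes>
      </external>
      <tiered>
        <volumes>
          <hot><disk>ssd</disk></hot>
          <s3><disk>tigris</disk></s3>
        </volumes>
      </tiered>
    </policies>
  </storage_configuration>
</yandex>
"""


def undefined_disks(xml_text):
    """Return sorted disk names used by policies but missing from <disks>."""
    storage = ET.fromstring(xml_text).find("storage_configuration")
    # Each child of <disks> is a disk; its tag is the disk name.
    # ClickHouse always provides a built-in disk called "default".
    defined = {disk.tag for disk in storage.find("disks")} | {"default"}
    referenced = {d.text.strip() for d in storage.find("policies").iter("disk")}
    return sorted(referenced - defined)


missing = undefined_disks(CONFIG)
print(missing)  # ['ssd']
```

Here the check reports `ssd`, so the tiered policy only works once a local disk with that name is added under disks. The external policy depends on `tigris` alone.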

Note: Performance may vary based on network conditions and available resources.

🗨️ If you have feedback or use Tigris Buckets with ClickHouse and qryn, please consider sharing your test results with our community!

