Production tips

This page shares some tips and things to take into consideration when setting up a production Cortex cluster based on the blocks storage.

Querier

Ensure caching is enabled

The querier relies on caching to reduce the number of API calls to the storage bucket. Ensure caching is properly configured and properly scaled.

Avoid querying non compacted blocks

When running a Cortex blocks storage cluster at scale, querying non compacted blocks may be inefficient for two reasons:

  1. Non compacted blocks contain duplicated samples (as an effect of the replication of ingested samples)
  2. Querying many small indexes introduces overhead

Because of this, we suggest that you avoid querying non compacted blocks. To do so, you should:

  1. Run the compactor
  2. Configure queriers with -querier.query-store-after large enough to give the compactor enough time to compact newly uploaded blocks (see below)
  3. Configure queriers with -querier.query-ingesters-within equal to -querier.query-store-after plus 5m (5 minutes is just a delta to query the boundary both from ingesters and queriers)
  4. Configure ingesters with -experimental.blocks-storage.tsdb.retention-period at least as large as -querier.query-ingesters-within
  5. Lower -experimental.blocks-storage.bucket-store.ignore-deletion-marks-delay to 1h, otherwise non compacted blocks could be queried anyway, even if their compacted replacement is available
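
The relationship between the settings in steps 2 to 5 can be sketched as follows. This is only an illustrative Python helper and not part of Cortex: the function names derive_flags and format_duration, and the 7h example value, are assumptions made for the sketch. Only the flag names, the 5m delta and the 1h delay come from the list above.

```python
from datetime import timedelta


def format_duration(d):
    """Format a timedelta as a Go-style duration string, e.g. '7h5m'."""
    total_minutes = int(d.total_seconds()) // 60
    hours, minutes = divmod(total_minutes, 60)
    if hours and minutes:
        return f"{hours}h{minutes}m"
    return f"{hours}h" if hours else f"{minutes}m"


def derive_flags(query_store_after):
    """Hypothetical helper: derive the related flags from -querier.query-store-after."""
    # Step 3: query ingesters for the same window plus a 5m boundary delta.
    query_ingesters_within = query_store_after + timedelta(minutes=5)
    # Step 4: retention must be at least -querier.query-ingesters-within;
    # this sketch simply uses the minimum allowed value.
    retention_period = query_ingesters_within
    return {
        "-querier.query-store-after": format_duration(query_store_after),
        "-querier.query-ingesters-within": format_duration(query_ingesters_within),
        "-experimental.blocks-storage.tsdb.retention-period": format_duration(retention_period),
        # Step 5: the value suggested in the list above.
        "-experimental.blocks-storage.bucket-store.ignore-deletion-marks-delay": "1h",
    }


if __name__ == "__main__":
    # 7h is an example value only; estimate your own as described below.
    for flag, value in derive_flags(timedelta(hours=7)).items():
        print(f"{flag}={value}")
```

Note that the flags apply to different components (queriers, ingesters, and the components reading from the bucket store), so the output is a checklist rather than a single command line.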

How to estimate -querier.query-store-after

The -querier.query-store-after should be set to a duration large enough to give the compactor enough time to compact newly uploaded blocks, and to give queriers and store-gateways enough time to discover and sync the newly compacted blocks.

The following diagram shows all the timings involved in the estimation. This diagram should be used only as a template and you’re expected to tweak the assumptions based on real measurements in your Cortex cluster. In this example, the following assumptions have been made:

  • An ingester takes up to 30 minutes to upload a block to the storage
  • The compactor takes up to 3 hours to compact 2h blocks shipped from all ingesters
  • Queriers and store-gateways take up to 15 minutes to discover and load a new compacted block

Given these assumptions, in the worst-case scenario it would take up to 6h 45m from when a sample is ingested until that sample has been appended to a block flushed to the storage and that block has been vertically compacted with all other overlapping 2h blocks shipped from ingesters.
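
As a rough worked example of this estimate, the sketch below adds up the individual stages. The upload, compaction and discovery durations are the assumptions listed above. The 3h a sample may spend in the ingester before its block is cut is not stated in the text: it is an assumption inferred from the 6h 45m total. Replace every value with measurements from your own cluster.

```python
from datetime import timedelta

# Assumption inferred from the stated total (not given in the text):
# time a sample may wait in the ingester before its block is cut.
time_in_ingester = timedelta(hours=3)

# Assumptions taken from the example above.
upload = timedelta(minutes=30)      # ingester uploads the block
compaction = timedelta(hours=3)     # compactor compacts the 2h blocks
discovery = timedelta(minutes=15)   # queriers and store-gateways sync

worst_case = time_in_ingester + upload + compaction + discovery
print(worst_case)  # 6:45:00
```

A -querier.query-store-after value chosen from such an estimate should be at least as large as the computed worst case.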

[Diagram: Avoid querying non compacted blocks]

Store-gateway

Ensure caching is enabled

The store-gateway heavily relies on caching both to speed up the queries and to reduce the number of API calls to the storage bucket. Ensure caching is properly configured and properly scaled.

Ensure a high number of max open file descriptors

The store-gateway stores each block’s index-header on the local disk and loads it via mmap. This means that the store-gateway keeps a file descriptor open for each loaded block. If your Cortex cluster has many blocks in the bucket, the store-gateway may hit the file-max ulimit (maximum number of open file descriptors by a process); in that case, we recommend increasing the limit on your system or running more store-gateway instances with blocks sharding enabled.
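
A quick way to reason about this limit is to compare the number of loaded blocks with the open file limit of the process. The sketch below is only an illustration and not part of Cortex: the helper name fd_headroom, the reserved allowance of 1024 descriptors and the example block count are assumptions. It uses the Unix-only resource module from the Python standard library to read the limit of the current process.

```python
import resource  # Unix-only standard library module


def fd_headroom(loaded_blocks, soft_limit, reserved=1024):
    """Return how many more blocks could be loaded before hitting the limit.

    One descriptor is counted per loaded block. `reserved` is an assumed
    allowance for sockets, logs and other open files; tune it as needed.
    A negative result means the limit would be exceeded.
    """
    return soft_limit - reserved - loaded_blocks


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        print("No soft limit on open files")
    else:
        # 5000 is an example block count, not a measured value.
        print("soft limit:", soft, "headroom:", fd_headroom(5000, soft))
```

The limit that matters is the one applied to the store-gateway process itself, so run such a check in the same environment (container, service unit) as the store-gateway.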

Compactor

Ensure the compactor has enough disk space

The compactor generally needs a lot of disk space in order to download source blocks from the bucket and store the compacted block before uploading it to the storage. Please refer to Compactor disk utilization for more information about how to do capacity planning.

Caching

Ensure memcached is properly scaled

The rule of thumb to ensure memcached is properly scaled is to make sure evictions happen infrequently. When that’s not the case and they affect query performance, the suggestion is to scale out the memcached cluster by adding more nodes or increasing the memory limit of existing ones.
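
One way to check how often evictions happen is to compare two snapshots of the memcached stats command output, which reports a cumulative evictions counter and the server uptime in seconds. The sketch below is a minimal illustration: the function names and the sample values are made up, and what counts as "infrequent" is left to you to decide based on the impact on query performance.

```python
def parse_stats(raw):
    """Parse the text response of the memcached `stats` command into a dict."""
    stats = {}
    for line in raw.splitlines():
        parts = line.strip().split(" ", 2)
        # Data lines look like "STAT <name> <value>"; the final "END" is skipped.
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def evictions_per_second(before, after):
    """Average eviction rate between two parsed snapshots of the same server."""
    elapsed = int(after["uptime"]) - int(before["uptime"])
    if elapsed <= 0:
        raise ValueError("snapshots must be taken in order from the same server run")
    return (int(after["evictions"]) - int(before["evictions"])) / elapsed


if __name__ == "__main__":
    # Made-up sample snapshots taken 60 seconds apart.
    first = parse_stats("STAT uptime 1000\r\nSTAT evictions 50\r\nEND\r\n")
    second = parse_stats("STAT uptime 1060\r\nSTAT evictions 170\r\nEND\r\n")
    print(evictions_per_second(first, second))  # 2.0
```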

We also recommend running a different memcached cluster for each cache type (metadata, index, chunks). It’s not required, but it means that memory pressure on one cache type does not affect the others.