This document describes how to allow rules to cover data from more than a single Cortex tenant; such rules are hereafter referred to as federated rules. Currently, a rule is owned by, queries data from, and saves its resulting series in the same tenant. This document therefore aims to clearly delineate who owns a federated rule, which tenants the federated rule queries data from, and where the federated rule saves its resulting series.
A federated rule is any rule which contains the
The primary use case for federated rules which query data from multiple tenants is the administration of Cortex.
When running Cortex within a large organization, there may be metrics spanning tenants that are worth monitoring, e.g. administrative metrics of the Cortex system like `prometheus_rule_evaluation_failures_total` aggregated by `__tenant_id__`. In this case a team, e.g. `infra`, may wish to create a rule, owned by `infra`, which queries multiple tenants `t0|t1|...|ti` and stores the resulting series in the tenant that owns the rule.
Federated tenant rules and alerts will not be a good fit for every organization and should be behind a feature flag.
The creation of federated rules (those sourcing data from multiple tenants) should be blocked unless the feature flag is enabled.
If tenant federation is enabled, then the ruler should use a `mergeQueryable` to aggregate the results of querying multiple tenants.
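The aggregation step can be illustrated with a minimal sketch. This is not the `mergeQueryable` implementation itself: the `Sample` type, the `TenantQuerier` function type, and `MergeQuery` are hypothetical names introduced only for illustration, and the sketch assumes each merged series is tagged with the `__tenant_id__` label mentioned above so that results from different tenants stay distinguishable.

```go
package main

import (
	"fmt"
	"sort"
)

// Sample is a simplified, hypothetical stand-in for a series returned by a
// query: a label set plus a single value.
type Sample struct {
	Labels map[string]string
	Value  float64
}

// TenantQuerier is a hypothetical function that runs a query against exactly
// one tenant.
type TenantQuerier func(tenantID, query string) ([]Sample, error)

// tenantLabel is the label assumed to distinguish tenants in merged results.
const tenantLabel = "__tenant_id__"

// MergeQuery runs the query once per tenant, tags every returned sample with
// the tenant it came from, and concatenates the results.
func MergeQuery(q TenantQuerier, tenantIDs []string, query string) ([]Sample, error) {
	// Copy and sort so the output order does not depend on the input order.
	ids := append([]string(nil), tenantIDs...)
	sort.Strings(ids)

	var merged []Sample
	for _, id := range ids {
		samples, err := q(id, query)
		if err != nil {
			return nil, fmt.Errorf("querying tenant %q: %w", id, err)
		}
		for _, s := range samples {
			// Copy the labels so the querier's own maps are never mutated.
			labels := make(map[string]string, len(s.Labels)+1)
			for k, v := range s.Labels {
				labels[k] = v
			}
			labels[tenantLabel] = id
			merged = append(merged, Sample{Labels: labels, Value: s.Value})
		}
	}
	return merged, nil
}

func main() {
	// A fake querier that returns one sample per tenant.
	fake := func(tenantID, query string) ([]Sample, error) {
		return []Sample{{Labels: map[string]string{"__name__": query}, Value: 1}}, nil
	}
	out, err := MergeQuery(fake, []string{"t1", "t0"}, "prometheus_rule_evaluation_failures_total")
	if err != nil {
		panic(err)
	}
	for _, s := range out {
		fmt.Println(s.Labels[tenantLabel], s.Value)
	}
}
```

In this sketch a failure in any one tenant fails the whole query; whether partial results should be tolerated is a design choice this document does not settle.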
For many organizations, the ability for any tenant to write a rule querying any other tenant is not acceptable, and more fine-grained control is required.
Since the current default is that a tenant should only be able to write rules against itself, we suggest a config option `ruler.allowed-federated-tenants`, a string slice of OrgIDs like `0|1|2|3|4` which are allowed to write rules against all tenants. If a tenant `bar` that is not in this list attempts to create a federated rule, the ruler API should return an error. Similarly, an option `ruler.disallowed-federated-tenants` explicitly lists the tenants for which federated rules are not allowed. Combining these in a `util.AllowedTenants` should make it possible to quickly determine whether federation is enabled or disabled for a given tenant at rule creation.
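The check at rule creation might look roughly like the following sketch. The `AllowedTenants` type here is a simplified, hypothetical stand-in written for this example rather than the `util.AllowedTenants` type itself, and `ValidateRuleGroup` and `ErrFederationDisabled` are likewise illustrative names. The sketch assumes that the disallow list takes precedence over the allow list and that an empty allow list permits every tenant that is not explicitly disallowed; both are assumptions, not decisions made by this proposal.

```go
package main

import (
	"errors"
	"fmt"
)

// AllowedTenants is a hypothetical combination of an allow list and a
// disallow list of tenant IDs.
type AllowedTenants struct {
	enabled  map[string]struct{} // nil means no allow list is configured
	disabled map[string]struct{}
}

// toSet converts a slice of tenant IDs into a set; an empty slice yields nil.
func toSet(ids []string) map[string]struct{} {
	if len(ids) == 0 {
		return nil
	}
	set := make(map[string]struct{}, len(ids))
	for _, id := range ids {
		set[id] = struct{}{}
	}
	return set
}

// NewAllowedTenants builds the combined allow/disallow view.
func NewAllowedTenants(enabled, disabled []string) *AllowedTenants {
	return &AllowedTenants{enabled: toSet(enabled), disabled: toSet(disabled)}
}

// IsAllowed reports whether the tenant may own federated rules.
// Assumption: the disallow list wins over the allow list.
func (a *AllowedTenants) IsAllowed(tenantID string) bool {
	if _, denied := a.disabled[tenantID]; denied {
		return false
	}
	if a.enabled == nil {
		return true
	}
	_, ok := a.enabled[tenantID]
	return ok
}

// ErrFederationDisabled is returned when the feature flag is off.
var ErrFederationDisabled = errors.New("federated rules are disabled")

// ValidateRuleGroup rejects a federated rule group when the feature flag is
// off or when the owning tenant may not create federated rules.
// A group without source tenants is not federated and always passes.
func ValidateRuleGroup(flagEnabled bool, allowed *AllowedTenants, owner string, srcTenants []string) error {
	if len(srcTenants) == 0 {
		return nil
	}
	if !flagEnabled {
		return ErrFederationDisabled
	}
	if !allowed.IsAllowed(owner) {
		return fmt.Errorf("tenant %q is not allowed to create federated rules", owner)
	}
	return nil
}

func main() {
	allowed := NewAllowedTenants([]string{"infra"}, nil)
	fmt.Println(ValidateRuleGroup(true, allowed, "infra", []string{"t0", "t1"}))
	fmt.Println(ValidateRuleGroup(true, allowed, "bar", []string{"t0", "t1"}))
}
```

Keeping the check in a single function means the ruler API can return one clear error at creation time instead of failing later during evaluation.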
A single tenant rule always stores produced series in the tenant where the rule exists. This 1 -> 1 mapping becomes a many -> 1 mapping for federated rules.
For a federated rule, the resulting series is saved in the tenant which owns the rule.
A single tenant rule always queries the tenant which owns the rule. This 1 -> 1 mapping becomes a 1 -> many mapping for federated rules.
Some use cases will demand that a specific federated rule, querying tenants B and C, is stored in the owning team's tenant A, so an option to explicitly assign the source tenants of a federated rule is needed.
To support this, we suggest an additional field `src_tenants` on the rule group containing an array of OrgIDs, e.g. `[t0,t1,...,ti]`, which when present determines which tenants to query for the given rule. The rule group is chosen because it reduces repetition between rules.
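The two mappings described above (which tenants a rule group reads from, and which tenant it writes to) can be sketched as a single lookup. The `RuleGroup` struct and `ResolveTenants` function are hypothetical and exist only for this example; the only element taken from the proposal is the `src_tenants` field, represented here as `SrcTenants`.

```go
package main

import (
	"fmt"
	"strings"
)

// RuleGroup is a hypothetical, pared-down rule group. SrcTenants mirrors the
// proposed src_tenants field; it is empty for a single-tenant rule group.
type RuleGroup struct {
	Name       string
	SrcTenants []string
}

// ResolveTenants returns the tenants to query and the tenant that receives
// the resulting series. The write tenant is always the owner; the query
// tenants default to the owner when src_tenants is absent.
func ResolveTenants(owner string, g RuleGroup) (query []string, write string) {
	if len(g.SrcTenants) == 0 {
		return []string{owner}, owner
	}
	// Copy so callers cannot modify the rule group through the result.
	return append([]string(nil), g.SrcTenants...), owner
}

func main() {
	g := RuleGroup{Name: "admin", SrcTenants: []string{"t0", "t1"}}
	query, write := ResolveTenants("infra", g)
	fmt.Printf("query=%s write=%s\n", strings.Join(query, "|"), write)
	// prints "query=t0|t1 write=infra"
}
```

Note that in this sketch the owning tenant is not queried unless it is listed in `src_tenants`; whether the owner should be included implicitly is left open here.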
| Challenge | Status |
|---|---|
| Allow federated rules behind feature flag | Planned but not yet implemented |
| Allow federated rules only for select tenants | Planned but not yet implemented |
| Where to store resulting series of federated rules | Planned but not yet implemented |
| Which tenants to query from for federated rules | Planned but not yet implemented |