Prime feature only
This feature is only available with a Prime subscription. See plans or contact sales.

Configuring kubriX for High Availability

This document explains how to configure kubriX for high availability (HA).
It outlines which Helm chart values must be adjusted and identifies components that are not designed for high availability.

High Availability vs. Restartability

High Availability (HA) ensures that your service continues to operate during common failure scenarios such as:

  • Node drains or crashes
  • Rolling updates
  • Availability Zone (AZ) outages

With only a single replica, there will always be downtime when that pod is unavailable - for example, during rescheduling, image pulling, initialization, or when the underlying node fails.

However, depending on your service level agreements (SLAs), a single replica might still be sufficient for some components, especially if:

  • The service is not required to be continuously available, but only when users actively access it.
  • The component performs background or asynchronous processing, where temporary downtime does not impact the user experience.

Taking these considerations into account, the following sections describe the recommended configuration for a highly available kubriX control plane and kubriX data plane.

Three or two replicas

There is an excellent blog article explaining why three replicas are better than two: https://sookocheff.com/post/kubernetes/why-three-replicas-are-better-than-two/
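As a generic illustration of the idea (this is not a kubriX default; "example-app" and its labels are placeholders), a highly available workload typically combines three replicas, zone spreading, and a PodDisruptionBudget so that one voluntary disruption plus one unexpected failure still leaves a running pod:

# Hypothetical example, not part of kubriX: three replicas spread across zones plus a PDB.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3                      # three replicas tolerate a drain and a node failure at the same time
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: example-app
      containers:
        - name: app
          image: nginx:1.27        # placeholder image
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-app
spec:
  minAvailable: 2                  # with only two replicas, a drain plus a node failure can drop you to zero
  selector:
    matchLabels:
      app: example-app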

Observability

Grafana

Grafana supports scaling, as long as you use an external database and configure alerting to use the unified_alerting feature.

We use CNPG (CloudNativePG) to create the external database. This assumes that the required secrets exist in Vault so that External Secrets can fetch them.


To summarize, this is a valid configuration for high availability:

grafana:
  replicas: 2

  # headless service for https://github.com/grafana/helm-charts/tree/main/charts/grafana#high-availability-for-unified-alerting
  headlessService: true

  grafana.ini:
    unified_alerting:
      enabled: true
      ha_peers: "{{ .Release.Name }}-headless:9094"
      ha_listen_address: ${POD_IP}:9094
      ha_advertise_address: ${POD_IP}:9094
      rule_version_record_limit: "5"
    alerting:
      enabled: false

  # use shared database for persistence instead of a volume
  persistence:
    enabled: false

  env:
    GF_DATABASE_TYPE: postgres

  # Tell the chart to load env vars from our Secret for grafana db
  # see https://grafana.com/docs/grafana/latest/setup-grafana/configure-grafana/#database
  # this will be created by an external-secret and contains:
  # GF_DATABASE_PASSWORD
  # GF_DATABASE_HOST
  # GF_DATABASE_NAME
  # GF_DATABASE_USER
  envFromSecrets:
    - name: grafana-env-secret
      optional: true
    - name: grafana-db
      optional: true

# create shared postgresql db
cluster:
  type: postgresql
  mode: standalone
  version:
    postgresql: "16"
  cluster:
    instances: 3
    monitoring:
      enabled: true
    superuserSecret: cnpg-superuser-secret
    initdb:
      database: grafana
      secret:
        name: cnpg-grafana-secret
    roles:
      - name: grafana
        ensure: present
        comment: grafana-admin-user
        login: true
        inherit: true
        superuser: true
        createdb: true
        passwordSecret:
          name: cnpg-grafana-secret
    annotations:
      argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true
      argocd.argoproj.io/sync-wave: "-1"
  backups:
    enabled: false
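For reference, the grafana-db Secret consumed via envFromSecrets above could be provided by an ExternalSecret along these lines. This is only a sketch: the secret store name "vault-backend" and the Vault key "kubrix/grafana-db" are assumptions and have to match your own Vault layout.

# Hypothetical sketch of an ExternalSecret producing the grafana-db Secret.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: grafana-db
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend          # assumption: your (Cluster)SecretStore pointing at Vault
    kind: ClusterSecretStore
  target:
    name: grafana-db             # consumed by the grafana chart via envFromSecrets
  dataFrom:
    - extract:
        key: kubrix/grafana-db   # assumption: Vault entry holding the GF_DATABASE_* keys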

K8s-monitoring

To summarize, this is a valid high availability configuration:

k8s-monitoring:
  alloy-operator:
    replicaCount: 2

  alloy-metrics:
    controller:
      replicas: 2

  clusterMetrics:
    kube-state-metrics:
      discoveryType: service
      replicas: 2

Components not designed for multiple replicas

Loki

With Loki's simple-scalable deployment mode you can easily scale out all Loki components to be highly available.

loki:
  commonConfig:
    replication_factor: 3 # needs to be the same number as ingesters to write to and read from

  backend:
    replicas: 3
  read:
    replicas: 3
  write:
    replicas: 3

  gateway:
    replicas: 2

  resultsCache:
    replicas: 2

Mimir

  • nginx supports scaling, see example

  • distributor supports scaling; it is completely stateless. See example and documentation

  • query-frontend supports scaling. When the query-scheduler is not used, scalability is limited by the configured number of workers per querier; in our configuration the query-scheduler is active. See also the official documentation and example.

  • ruler supports scaling, see example

  • compactor should be able to scale up according to the documentation, but it is not scaled up in the example either, so we leave it at a single replica for now

  • querier supports scaling and has two replicas by default, see values

  • query_scheduler supports scaling and has two replicas by default, see values

  • ingester and store-gateway use zone-aware replication by default, see documentation

Components not designed for multiple replicas

  • Overrides-exporter: don't scale the overrides-exporter! The metrics emitted by the overrides-exporter have high cardinality. It is recommended to run only a single replica of the overrides-exporter to limit that cardinality. See documentation

  • rollout-operator is fixed to 1 replica because scaling it does not make any sense. See deployment spec

  • Alertmanager: scaling makes no sense with just one tenant, because replication is implemented via tenant sharding. See documentation

To summarize, this is a valid high availability configuration in addition to the default values:

nginx:
  replicas: 2

distributor:
  replicas: 2

query_frontend:
  replicas: 2

ruler:
  replicas: 2

Tempo

Currently there is no HA setup defined for Tempo. We are continuously working to extend our HA setup documentation.

Delivery

Kargo

Kargo supports scaling most of its components out of the box. You just need to include the Kargo values-ha-enabled-prime.yaml.

However, the controller and management-controller are hard-coded to replicas: 1. You need to switch to a Distributed Architecture to achieve overall scalability.

Crossplane

It is possible to run multiple replicas of the Crossplane core pods and RBAC manager pods, as long as leader election is turned on (it is turned on by default). Details in the Crossplane documentation.
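A minimal values sketch for this, assuming the key names of the upstream Crossplane Helm chart (verify them against the chart version you deploy):

# Sketch based on the upstream Crossplane chart defaults; key names are assumptions.
crossplane:
  replicas: 2
  leaderElection: true        # default; must stay enabled when running more than one replica
  rbacManager:
    replicas: 2
    leaderElection: true      # default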

Unfortunately PodDisruptionBudgets are not implemented yet, and currently not planned.

It is also possible to scale out Crossplane providers (out of the box we integrate Keycloak, Grafana and Vault), as long as leader election is implemented in the provider and enabled! If leader election is not implemented in the provider or not enabled, the provider pod will consume 100% CPU!

While Crossplane architects emphasize that additional replicas are not really needed and that the leader election often takes more time than restarting the pods, there are definitely also arguments for implementing HA with leader election: https://github.com/gofogo/k8s-sigs-external-dns-fork/blob/4a039d1edc2cb2b29ffd48d137ec2d53bda4e0ae/docs/proposal/001-leader-election#use-cases

So decide carefully whether multiple replicas of Crossplane and the Crossplane providers really make sense in your environment.

KubeVirt

The virt-operator runs with 2 replicas by default, with leader election. (see https://kubevirt.io/monitoring/runbooks/NoLeadingVirtOperator.html)

The KubeVirt CR also defaults to two replicas, which means it creates 2 instances of the virt-controller. The virt-api deployment is scaled based on the number of available nodes.
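If you want to pin the infra replica count explicitly, the KubeVirt CR exposes a replica setting for the infra components. The field shown here is an assumption to verify against the KubeVirt API reference for your version:

apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt
  namespace: kubevirt
spec:
  # assumption: spec.infra.replicas controls the replica count of the infra components
  # (virt-controller, virt-api); check the KubeVirt API reference for your version
  infra:
    replicas: 2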

CDI (Containerized-Data-Importer)

Technically it is possible to scale the CDI resources via the CDI CustomResource properties uploadProxyReplicas, apiServerReplicas and deploymentReplicas. However, we currently do not see the benefit of having multiple replicas. The cdi-operator itself is shipped with one replica out of the box by the upstream project.

See https://github.com/kubevirt/containerized-data-importer/issues/2560

KubeVirt-Manager

KubeVirt-Manager ships out of the box with a hard-coded replica count of one. Unfortunately, there is currently no evidence that it can be scaled out.

Security

External-Secrets

The External-Secrets HA configuration is implemented via leader election (the same as Crossplane). That means only one replica does the work; the others are hot standbys.

The webhook can be scaled out without leader election.

The cert-controller also has a leader-election feature flag, but there is an open issue because this flag cannot be enabled through an explicit attribute in the official Helm chart. However, it can be set via extraArgs.

Attention: if you need an active-active setup, it is still possible, but things are going to get complicated really fast. You will need to set up controller classes and make sure each secret store gets a controller class assigned to it in a round-robin manner with a webhook. And if you do it, bear in mind that any misconfiguration will cause external-secrets as a whole to stop operating.
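For the simple hot-standby setup described above, a minimal values sketch could look like the following. The cert-controller flag name is an assumption derived from the leader-election discussion; verify it against your External Secrets version before using it:

external-secrets:
  replicaCount: 2            # leader election is built in; the extra replica is a hot standby
  webhook:
    replicaCount: 2          # scales without leader election
  certController:
    replicaCount: 2
    extraArgs:
      # assumption: name of the cert-controller leader-election feature flag,
      # since the chart does not expose an explicit attribute for it yet
      enable-leader-election: true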

Kyverno

All components of Kyverno can be scaled out. However, some of them, such as the reports-controller and background-controller, are stateful and implement leader election (as in External Secrets or Crossplane). In the admission-controller and cleanup-controller, only certain functionality, such as certificate and webhook management, uses leader election, while the rest runs on all replicas.

Additional Docs: https://kyverno.io/docs/high-availability/
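As a rough sketch, the replica counts in the Kyverno Helm chart could be raised like this. The key names are assumed from the upstream kyverno chart, and the admission-controller is typically run with three replicas for HA:

kyverno:
  admissionController:
    replicas: 3        # Kyverno recommends 3 replicas for an HA admission controller
  backgroundController:
    replicas: 2        # leader election: one active replica, the other on standby
  cleanupController:
    replicas: 2
  reportsController:
    replicas: 2        # leader election: one active replica, the other on standby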

Velero

Currently it is not supported to run Velero with multiple replicas. For File System Backup, a node agent (DaemonSet) is deployed. For the etcd backup there is a Deployment with a hard-coded single replica.

There is an open issue where an HA requirement is being discussed, but it is not implemented yet.

General

Ingress-Nginx

Ingress-Nginx Controller supports scaling out without any restrictions. See values-ha-enabled-prime.yaml.
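A minimal sketch of such a configuration, using the upstream ingress-nginx chart keys (the zone spreading part is optional and an assumption about your topology):

ingress-nginx:
  controller:
    replicaCount: 2
    # optional: spread the controller pods across availability zones
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app.kubernetes.io/name: ingress-nginx
            app.kubernetes.io/component: controller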

External-DNS

In the community there are currently some concerns about and warnings against running multiple replicas of external-dns, so it is hard-coded to 1 replica. Therefore, we do not suggest running multiple replicas. Since registering DNS entries is an asynchronous process anyway, it shouldn't hurt that the external-dns deployment is not HA.

Since there are also good arguments for enabling multiple replicas, we will keep an eye on the open discussion and support an external-dns HA configuration as soon as it is available in the upstream project.

CNPG

The CloudNativePG controller can be scaled out, but only one instance does the work (leader election).
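A minimal sketch, assuming the replica count value of the upstream cloudnative-pg operator chart (verify the key name for your chart version):

cloudnative-pg:
  replicaCount: 2    # assumption: upstream chart key; only one replica is the active leader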
