Kubernetes Operators extend the platform to manage complex applications automatically. While Kubernetes excels at running stateless workloads, stateful applications like databases, message queues, and monitoring systems require domain-specific knowledge for proper operation. Operators encode this knowledge, transforming manual runbooks into automated controllers.
The Operator pattern builds on Kubernetes' fundamental architecture: declare desired state, let controllers reconcile actual state to match. Standard controllers handle built-in resources like Deployments and Services. Operators handle custom resources that represent your specific applications, bringing the same declarative model to complex software.
The Operator Pattern
An Operator consists of Custom Resource Definitions (CRDs) and a controller. CRDs extend the Kubernetes API with new resource types specific to your application. The controller watches these custom resources and takes action to make reality match the declared state.
Consider a PostgreSQL Operator. Instead of manually creating StatefulSets, Services, ConfigMaps, and running backup scripts, you declare a PostgresCluster resource. The Operator handles everything: provisioning instances, configuring replication, managing failover, scheduling backups, and handling upgrades.
# Declare what you want
apiVersion: postgres.example.com/v1
kind: PostgresCluster
metadata:
  name: production-db
spec:
  replicas: 3
  version: "15"
  storage:
    size: 100Gi
    class: fast-ssd
  backup:
    schedule: "0 */6 * * *"
    retention: 7d
  resources:
    requests:
      cpu: 2
      memory: 8Gi
The Operator reads this declaration and creates the necessary Kubernetes resources: a StatefulSet for the PostgreSQL pods, Services for client connections and replication, ConfigMaps for PostgreSQL configuration, Secrets for credentials, CronJobs for backups, and monitoring configuration.
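To make the mapping from one declaration to many owned objects concrete, here is a minimal sketch in plain Go. The `ClusterSpec` type, the `PlanChildren` function, and the naming scheme (`-primary`, `-config`, and so on) are assumptions made for illustration; a real Operator would build full Kubernetes objects and set owner references on them.

```go
package main

import "fmt"

// ClusterSpec is a hypothetical, trimmed-down mirror of the PostgresCluster spec.
type ClusterSpec struct {
	Name           string
	Replicas       int
	BackupSchedule string // cron expression; empty disables backups
}

// ChildResource identifies one object the Operator would create and own.
type ChildResource struct {
	Kind string
	Name string
}

// PlanChildren lists the objects needed for a cluster.
// The naming scheme is an assumption made for this example.
func PlanChildren(spec ClusterSpec) []ChildResource {
	children := []ChildResource{
		{Kind: "StatefulSet", Name: spec.Name},
		{Kind: "Service", Name: spec.Name + "-primary"},
		{Kind: "ConfigMap", Name: spec.Name + "-config"},
		{Kind: "Secret", Name: spec.Name + "-credentials"},
	}
	// A read-only Service only makes sense when replicas exist.
	if spec.Replicas > 1 {
		children = append(children, ChildResource{Kind: "Service", Name: spec.Name + "-replicas"})
	}
	// Backups are optional: no schedule, no CronJob.
	if spec.BackupSchedule != "" {
		children = append(children, ChildResource{Kind: "CronJob", Name: spec.Name + "-backup"})
	}
	return children
}

func main() {
	spec := ClusterSpec{Name: "production-db", Replicas: 3, BackupSchedule: "0 */6 * * *"}
	for _, c := range PlanChildren(spec) {
		fmt.Printf("%s/%s\n", c.Kind, c.Name)
	}
}
```

Keeping this planning step a pure function of the spec makes it easy to unit test without a cluster.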
More importantly, the Operator handles operational tasks that require PostgreSQL expertise. When a primary fails, it promotes a replica. When you change the version, it performs a rolling upgrade that respects replication lag. When storage fills, it can trigger alerts or automatic expansion.
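The failover decision is a good example of encoded expertise. The sketch below picks the ready replica with the least replication lag and refuses to promote one that is too far behind. The `Replica` type, the `PickPromotionCandidate` function, and the byte-based lag threshold are assumptions for illustration, not the behavior of any particular PostgreSQL Operator.

```go
package main

import (
	"errors"
	"fmt"
)

// Replica is a hypothetical view of one standby instance.
type Replica struct {
	Name     string
	Ready    bool
	LagBytes int64 // replication lag behind the failed primary
}

// ErrNoCandidate is returned when no replica is safe to promote.
var ErrNoCandidate = errors.New("no healthy replica available for promotion")

// PickPromotionCandidate returns the ready replica with the smallest lag,
// ignoring replicas whose lag exceeds maxLagBytes.
func PickPromotionCandidate(replicas []Replica, maxLagBytes int64) (Replica, error) {
	var best Replica
	found := false
	for _, r := range replicas {
		if !r.Ready || r.LagBytes > maxLagBytes {
			continue
		}
		if !found || r.LagBytes < best.LagBytes {
			best, found = r, true
		}
	}
	if !found {
		return Replica{}, ErrNoCandidate
	}
	return best, nil
}

func main() {
	replicas := []Replica{
		{Name: "production-db-1", Ready: true, LagBytes: 500},
		{Name: "production-db-2", Ready: true, LagBytes: 100},
		{Name: "production-db-3", Ready: false, LagBytes: 0},
	}
	candidate, err := PickPromotionCandidate(replicas, 1000)
	if err != nil {
		fmt.Println("failover blocked:", err)
		return
	}
	fmt.Println("promote:", candidate.Name)
}
```

Returning an error instead of promoting a badly lagging replica trades availability for data safety; which side to favor is exactly the kind of policy an Operator has to make explicit.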
Building Operators
Operator development frameworks simplify building controllers. Kubebuilder and Operator SDK provide scaffolding, code generation, and libraries that handle Kubernetes API interactions. You focus on your application's domain logic rather than Kubernetes plumbing.
// Simplified reconciliation loop structure
import (
	"context"
	"time"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	ctrl "sigs.k8s.io/controller-runtime"

	postgresv1 "example.com/postgres-operator/api/v1" // hypothetical module path
)

func (r *PostgresClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	log := r.Log.WithValues("postgrescluster", req.NamespacedName)

	// Fetch the PostgresCluster resource
	cluster := &postgresv1.PostgresCluster{}
	if err := r.Get(ctx, req.NamespacedName, cluster); err != nil {
		if apierrors.IsNotFound(err) {
			// Resource deleted, nothing to do
			return ctrl.Result{}, nil
		}
		return ctrl.Result{}, err
	}

	// Reconcile StatefulSet. controller-runtime ignores the Result when the
	// error is non-nil and requeues with rate-limited backoff instead, so
	// error paths return an empty Result.
	if err := r.reconcileStatefulSet(ctx, cluster); err != nil {
		log.Error(err, "Failed to reconcile StatefulSet")
		return ctrl.Result{}, err
	}

	// Reconcile Services
	if err := r.reconcileServices(ctx, cluster); err != nil {
		log.Error(err, "Failed to reconcile Services")
		return ctrl.Result{}, err
	}

	// Reconcile backups
	if err := r.reconcileBackups(ctx, cluster); err != nil {
		log.Error(err, "Failed to reconcile Backups")
		return ctrl.Result{}, err
	}

	// Check cluster health and update status
	health, err := r.checkClusterHealth(ctx, cluster)
	if err != nil {
		return ctrl.Result{}, err
	}
	cluster.Status.Health = health
	cluster.Status.ReadyReplicas = r.countReadyReplicas(ctx, cluster)
	if err := r.Status().Update(ctx, cluster); err != nil {
		return ctrl.Result{}, err
	}

	// Re-check periodically even if nothing changes
	return ctrl.Result{RequeueAfter: 5 * time.Minute}, nil
}
The reconciliation loop is the core of any Operator. It's called whenever the watched resource changes or when a requeue is triggered. The loop compares desired state (the custom resource's spec) with actual state (what exists in the cluster) and takes action to converge them.
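Stripped of Kubernetes specifics, the comparison step can be sketched as a diff between two maps. In this illustration the `Action` type and `Diff` function are hypothetical, and each object is reduced to a name and a version string standing in for its full spec.

```go
package main

import (
	"fmt"
	"sort"
)

// Action is one step needed to move actual state toward desired state.
type Action struct {
	Verb string // "create", "update", or "delete"
	Name string
}

// Diff compares desired and actual objects (name -> version) and returns
// the actions that would converge them, sorted by name for stable output.
func Diff(desired, actual map[string]string) []Action {
	var actions []Action
	for name, want := range desired {
		have, exists := actual[name]
		switch {
		case !exists:
			actions = append(actions, Action{Verb: "create", Name: name})
		case have != want:
			actions = append(actions, Action{Verb: "update", Name: name})
		}
	}
	for name := range actual {
		if _, wanted := desired[name]; !wanted {
			actions = append(actions, Action{Verb: "delete", Name: name})
		}
	}
	sort.Slice(actions, func(i, j int) bool { return actions[i].Name < actions[j].Name })
	return actions
}

func main() {
	desired := map[string]string{"statefulset": "v2", "service": "v1", "configmap": "v1"}
	actual := map[string]string{"statefulset": "v1", "service": "v1", "old-cronjob": "v1"}
	for _, a := range Diff(desired, actual) {
		fmt.Printf("%s %s\n", a.Verb, a.Name)
	}
}
```

When desired and actual already match, `Diff` returns no actions. That property is what makes it safe to run the loop repeatedly.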
Custom Resource Design
Well-designed CRDs provide clear, intuitive APIs for your application. The spec defines desired configuration. The status reports current state. Validation ensures users can't create invalid configurations.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: postgresclusters.postgres.example.com
spec:
  group: postgres.example.com
  names:
    kind: PostgresCluster
    listKind: PostgresClusterList
    plural: postgresclusters
    singular: postgrescluster
    shortNames:
      - pg
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      # Enable the status subresource so the controller can call Status().Update
      subresources:
        status: {}
      schema:
        openAPIV3Schema:
          type: object
          required:
            - spec
          properties:
            spec:
              type: object
              required:
                - replicas
                - version
              properties:
                replicas:
                  type: integer
                  minimum: 1
                  maximum: 10
                version:
                  type: string
                  enum: ["13", "14", "15", "16"]
                storage:
                  type: object
                  properties:
                    size:
                      type: string
                      pattern: '^[0-9]+Gi$'
                # Schemas for storage.class, backup, and resources are omitted
                # for brevity; the API server prunes fields not declared here.
            status:
              type: object
              properties:
                health:
                  type: string
                  enum: ["Healthy", "Degraded", "Unhealthy"]
                readyReplicas:
                  type: integer
                conditions:
                  type: array
                  items:
                    type: object
                    properties:
                      type:
                        type: string
                      status:
                        type: string
                      lastTransitionTime:
                        type: string
                        format: date-time
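The API server enforces these schema rules itself, so an Operator does not need to re-implement them. Expressing the same constraints in plain Go can still be useful for unit tests or as a starting point for a validating webhook. The sketch below mirrors the replicas range, the version enum, and the storage size pattern from the schema; the `Spec` type and `Validate` function are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// Spec is a hypothetical, trimmed-down mirror of the PostgresCluster spec.
type Spec struct {
	Replicas    int
	Version     string
	StorageSize string // optional, e.g. "100Gi"
}

var (
	storageSizePattern = regexp.MustCompile(`^[0-9]+Gi$`)
	supportedVersions  = map[string]bool{"13": true, "14": true, "15": true, "16": true}
)

// Validate returns one message per violated rule, or nil if the spec is valid.
func Validate(s Spec) []string {
	var problems []string
	if s.Replicas < 1 || s.Replicas > 10 {
		problems = append(problems, fmt.Sprintf("replicas must be between 1 and 10, got %d", s.Replicas))
	}
	if !supportedVersions[s.Version] {
		problems = append(problems, fmt.Sprintf("version %q is not one of 13, 14, 15, 16", s.Version))
	}
	// storage.size is optional in the schema, so only check it when set.
	if s.StorageSize != "" && !storageSizePattern.MatchString(s.StorageSize) {
		problems = append(problems, fmt.Sprintf("storage size %q must look like 100Gi", s.StorageSize))
	}
	return problems
}

func main() {
	for _, p := range Validate(Spec{Replicas: 0, Version: "12", StorageSize: "100GB"}) {
		fmt.Println("invalid:", p)
	}
}
```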
Status conditions follow Kubernetes conventions, providing machine-readable state that tools and other controllers can consume. Conditions like Ready, Available, and Progressing communicate state in a standardized way.
Operational Tasks
Operators shine when handling day-two operations that require domain expertise. These tasks often involve careful sequencing, health checks, and rollback capabilities that would be error-prone to perform manually.
Database upgrades are a prime example. A manual upgrade might involve checking replication lag, stopping writes, verifying that the replica has caught up, promoting a replica running the new version, reconfiguring the old primary as a replica on the new version, and restoring writes. An Operator encodes this sequence and handles failures at each step.
func (r *PostgresClusterReconciler) reconcileUpgrade(ctx context.Context, cluster *postgresv1.PostgresCluster) error {
	currentVersion := cluster.Status.CurrentVersion
	desiredVersion := cluster.Spec.Version
	if currentVersion == desiredVersion {
		return nil
	}

	// Check if upgrade is in progress
	if cluster.Status.UpgradeInProgress {
		return r.continueUpgrade(ctx, cluster)
	}

	// Validate upgrade path
	if !r.isUpgradePathValid(currentVersion, desiredVersion) {
		r.recorder.Eventf(cluster, corev1.EventTypeWarning, "InvalidUpgrade",
			"Cannot upgrade from %s to %s", currentVersion, desiredVersion)
		return fmt.Errorf("invalid upgrade path from %s to %s", currentVersion, desiredVersion)
	}

	// Start upgrade and persist the marker, so that a later reconcile can
	// resume through continueUpgrade if this one fails partway
	cluster.Status.UpgradeInProgress = true
	cluster.Status.UpgradeStartedAt = metav1.Now()
	if err := r.Status().Update(ctx, cluster); err != nil {
		return err
	}

	// Upgrade replicas first, then primary
	replicas := r.getReplicas(ctx, cluster)
	for _, replica := range replicas {
		if err := r.upgradeInstance(ctx, replica, desiredVersion); err != nil {
			return err
		}
		// Wait for replica to be healthy before continuing
		if err := r.waitForHealthy(ctx, replica); err != nil {
			return err
		}
	}

	// Failover to upgraded replica, then upgrade old primary
	if err := r.performFailover(ctx, cluster); err != nil {
		return err
	}

	// Upgrade remaining instance (old primary)
	oldPrimary := r.getOldPrimary(ctx, cluster)
	if err := r.upgradeInstance(ctx, oldPrimary, desiredVersion); err != nil {
		return err
	}

	// Record completion in the status subresource
	cluster.Status.UpgradeInProgress = false
	cluster.Status.CurrentVersion = desiredVersion
	return r.Status().Update(ctx, cluster)
}
Observability Integration
Operators should expose metrics about both themselves and the applications they manage. ServiceMonitor resources (a CRD provided by the Prometheus Operator), created by your Operator, enable automatic scraping. Custom metrics provide visibility into Operator-specific operations.
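import (
	"github.com/prometheus/client_golang/prometheus"
	"sigs.k8s.io/controller-runtime/pkg/metrics"
)

var (
	// Counts reconciliations per cluster, labeled by outcome
	reconcileTotal = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "operator_reconcile_total",
			Help: "Total number of reconciliations",
		},
		[]string{"cluster", "result"},
	)
	// Buckets run from 0.1s to about 51s, doubling each time
	reconcileDuration = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "operator_reconcile_duration_seconds",
			Help:    "Duration of reconciliation",
			Buckets: prometheus.ExponentialBuckets(0.1, 2, 10),
		},
		[]string{"cluster"},
	)
	clusterHealth = prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Name: "postgres_cluster_health",
			Help: "Health status of PostgreSQL cluster (1=healthy, 0=unhealthy)",
		},
		[]string{"cluster", "namespace"},
	)
)

func init() {
	// Register with controller-runtime's registry so the metrics are served
	// on the manager's metrics endpoint
	metrics.Registry.MustRegister(reconcileTotal, reconcileDuration, clusterHealth)
}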
Alerting rules can reference these metrics, creating alerts for Operator failures, cluster health issues, or backup problems.
When to Build vs Use Existing Operators
The Operator ecosystem is rich. Before building a custom Operator, check whether one already exists for your application. PostgreSQL has multiple mature Operators (Zalando, Crunchy Data, CloudNativePG). Redis, Kafka, Elasticsearch, and most popular stateful applications have community or vendor Operators.
Build a custom Operator when your application is unique to your organization, when existing Operators don't meet your requirements, or when you need tight integration with your specific infrastructure and processes.
Even when using existing Operators, understand their design. Review their CRDs, understand their operational model, and verify they handle your requirements. An Operator that doesn't handle your backup strategy or upgrade path may cause more problems than it solves.
Conclusion
Kubernetes Operators bring the declarative, self-healing model of Kubernetes to complex applications. By encoding operational knowledge in code, they transform manual runbooks into automated controllers. Database clusters, message queues, and monitoring systems that once required careful manual operation can be managed with simple YAML declarations.
Building Operators requires understanding both Kubernetes controller patterns and your application's operational requirements. The reconciliation loop, CRD design, and status reporting follow established patterns. The domain logic (how to safely upgrade, how to handle failures, when to alert) comes from operational experience with your application.