One of the reasons Kubernetes has become the cornerstone of modern IT infrastructures is that it’s truly an all-in-one orchestration platform. It offers API objects, controllers, and resources that cater to most – if not all – deployment use cases.
For instance, its robust support for stateful applications is a rare find in the open-source container orchestration space. Unlike stateless applications, which are more straightforward to manage, stateful applications – like financial systems and collaborative platforms – require specific considerations to maintain data integrity and consistency.
In this article, we explore Kubernetes StatefulSets, the API objects/controllers that allow us to run and manage stateful applications in a Kubernetes cluster. We define them, share the best ways to create and manage them, and mention some of their limitations.
When it comes to state management, applications are categorized into two types: stateful and stateless. Stateless applications, like web servers, are characterized by their ephemeral nature, which means that their state is not preserved across restarts or failures.
In contrast, stateful applications, like databases and message brokers, maintain persistent data that must be preserved even if the application restarts or fails. Kubernetes StatefulSets are purpose-built to orchestrate stateful applications in a Kubernetes cluster.
They ensure that each pod in a StatefulSet has a unique identity and maintains persistent storage. This is crucial for data consistency and application reliability in the event of pod disruptions.
The decision to use StatefulSets should be made after carefully considering the nature of the deployed applications. If an application requires persistent storage, a stable network identity, ordered deployment, and stateful scaling, then StatefulSets are the preferred (if not the only) choice. Typical examples include databases, message brokers, and key-value stores.
Here are the building blocks of a StatefulSet deployment in Kubernetes:
A headless service is used to enable communication between pods in a StatefulSet. This service assigns a unique network identity to each pod, which is crucial for maintaining state and data across restarts.
Note: it’s the user’s responsibility to configure a headless service in a StatefulSet configuration; Kubernetes doesn’t do it by default.
Pod identity is a unique identifier for each pod that remains the same across restarts. It combines the pod’s ordinal, its stable storage, and the network identity provided through the headless service. The ordinal represents a pod’s position within the StatefulSet. It starts from 0 (representing the first pod) and is incremented sequentially.
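As a rough illustration of how these pieces combine, the listing below shows the identities that a three-replica StatefulSet would produce. All names are hypothetical: we assume a StatefulSet called “web”, governed by a headless service called “nginx”, in the “default” namespace, and a cluster that uses the default “cluster.local” domain.

```yaml
# Illustrative listing only, not a manifest to apply.
# Assumed names: StatefulSet "web", headless service "nginx",
# namespace "default", cluster domain "cluster.local".
pods:
  - ordinal: 0
    name: web-0        # <StatefulSet name>-<ordinal>
    hostname: web-0    # the pod's hostname matches its name
    dns: web-0.nginx.default.svc.cluster.local
  - ordinal: 1
    name: web-1
    hostname: web-1
    dns: web-1.nginx.default.svc.cluster.local
  - ordinal: 2
    name: web-2
    hostname: web-2
    dns: web-2.nginx.default.svc.cluster.local
```

If “web-1” is rescheduled to another node, its replacement comes back with the same name and DNS entry, which is what lets its peers keep addressing it.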
Much like other controllers, StatefulSets use a pod selector to determine which pods they manage. The pod selector allows the controller to target pods based on the assigned labels. This ensures that only the relevant pods are affected by scaling or update operations.
Volume claim templates define the persistent storage requirements for StatefulSet pods, including the storage class, access modes, and size of the persistent volumes for each pod.
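A minimal sketch of such a template is shown below. The claim name “data” and the storage class name “standard” are placeholders rather than values from a real cluster, so substitute a storage class that exists in yours.

```yaml
# Fragment of a StatefulSet manifest; it belongs under the top-level "spec:" key.
volumeClaimTemplates:
- metadata:
    name: data                        # placeholder; must match a volumeMounts entry in the pod template
  spec:
    storageClassName: standard        # assumption: a storage class with this name exists
    accessModes: [ "ReadWriteOnce" ]  # volume is mounted read-write by a single node
    resources:
      requests:
        storage: 1Gi                  # size of the volume provisioned for each pod
```

Kubernetes creates one PersistentVolumeClaim per pod from the template and names it after the template, the StatefulSet, and the ordinal. For a StatefulSet called “db”, the claims above would be “data-db-0”, “data-db-1”, and so on.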
The “minReadySeconds” parameter specifies the minimum duration, in seconds, for which a newly created pod must be ready, without any errors, before it can be considered available. This setting guarantees that pods have enough time to initialize and stabilize before they are put to use.
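In the manifest, this setting sits directly under the StatefulSet’s “spec”. The fragment below is only a sketch, and the value of 10 seconds is an arbitrary example rather than a recommendation.

```yaml
# Fragment of a StatefulSet manifest.
spec:
  minReadySeconds: 10   # example value; when the field is omitted it defaults to 0
  # serviceName, replicas, selector, and template stay as they were
```

With the default of 0, a pod counts as available as soon as it becomes ready.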
StatefulSets and Deployments are both controllers used to orchestrate workloads in Kubernetes, but they serve different use cases. StatefulSets are used to deploy stateful applications, whereas Deployments are primarily used to run stateless applications. Here’s a table summarizing the comparison between the two:
Aspect | StatefulSets | Deployments |
---|---|---|
Purpose | For stateful apps that require persistent storage and stable identities | For stateless applications that don’t need persistent identities or storage |
Pod identity | Unique and stable across rescheduling | Dynamic, changes upon restart |
Complexity | More complex and requires manual effort in setting up a headless service | Relatively easier and self-sufficient |
Use cases | Databases, key-value stores, collaborative platforms | Web servers, microservices |
Pod interchangeability | Not possible | Possible |
Ordered deployment | Supported | Not supported |
A ReplicaSet is primarily used with stateless applications to make sure that a specified number of identical pod replicas are always running. Here’s a table comparing ReplicaSet with StatefulSet.
Aspect | StatefulSets | ReplicaSets |
---|---|---|
Purpose | For stateful apps that need persistent storage and identity | For stateless applications, particularly when a specific number of identical replicas are required |
Pod identity | Unique and stable across restarts | Replaceable identity |
Complexity | More complicated to set up. Also includes extra manual work for configuring a headless service | Relatively easier to configure |
Use cases | Databases, message brokers, machine learning workloads | Web servers, microservices |
Pod interchangeability | Not possible | Possible |
Ordered deployment | Supported | Not supported |
To learn how to set up a StatefulSet deployment, let’s look at a sample configuration file:
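# Headless service that governs the network identity of the StatefulSet pods
apiVersion: v1
kind: Service
metadata:
  name: headless-svc
  labels:
    app: headless-app
spec:
  ports:
  - port: 80
    name: http
  clusterIP: None            # "None" is what makes the service headless
  selector:
    app: headless-app
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  # Kubernetes only accepts lowercase object names, so use a name such as
  # sample-statefulset when applying this to a real cluster.
  name: sample-statefulSet
spec:
  serviceName: "headless-svc"   # must match the headless service above
  replicas: 4
  selector:
    matchLabels:
      app: headless-app         # must match the pod template labels
  template:
    metadata:
      labels:
        app: headless-app
    spec:
      containers:
      - name: headless-container
        image: registry.k8s.io/sample-image:0.11
        ports:
        - containerPort: 80
          name: http
        volumeMounts:
        - name: www-volume      # refers to the volume claim template below
          mountPath: /usr/share/headless
  volumeClaimTemplates:
  - metadata:
      name: www-volume
    spec:
      accessModes: [ "ReadWriteMany" ]
      resources:
        requests:
          storage: 500Mi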
The above YAML file starts by specifying the headless service at the top. We set the “clusterIP” field to “None” to indicate that it’s a headless service. We also define its name, selector, and label, which are referenced later in the configuration.
Next, we define our StatefulSet named “sample-statefulSet”. We associate it with the headless service using the “serviceName” parameter, set the desired replicas to 4, and specify the container configuration and pod labels within the “template” section.
Finally, we use “volumeClaimTemplates” to define the Persistent Volume Claims (PVCs) that will be used by our StatefulSet. This includes the spec of the PVC, the access mode, and the resources (500Mi of storage).
Applying the above configuration will start the headless service and create the StatefulSet. Here’s the command to do so (replace sample.yaml with the actual name of your YAML file):
kubectl apply -f sample.yaml
Successful application should display an output like this:
service/headless-svc created
statefulset.apps/sample-statefulSet created
To view more information about the service, you can run this command:
kubectl get service headless-svc
Expect an output like the following:
NAME           TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
headless-svc   ClusterIP   None         <none>        80/TCP    8s
To check the status of the StatefulSet, execute this:
kubectl get statefulset sample-statefulSet
The output should show you the number of replicas and age. For example:
NAME                 DESIRED   CURRENT   AGE
sample-statefulSet   4         2         12s
As the pods are created, you should also see a sequential ordinal appended to their names. Run this command:
kubectl get pods -l app=headless-app
You should be able to see pods with names like sample-statefulSet-0, sample-statefulSet-1, sample-statefulSet-2, and sample-statefulSet-3 in the output. These names serve as sticky identities for these pods, and will persist for as long as the StatefulSet stays up.
Stateful applications often require manual intervention for scaling, mainly due to their reliance on persistent storage and identity. With that said, here are two ways you can scale a StatefulSet in a Kubernetes cluster:
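The kubectl utility makes it easy to scale a StatefulSet on the fly. To scale up, first watch the pods in one terminal:
kubectl get pods -w -l app=headless-app
Then, in another terminal, scale the StatefulSet to the desired number of replicas:
kubectl scale sts sample-statefulSet --replicas=10
The command should report:
statefulset.apps/sample-statefulSet scaled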
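To scale down, again watch the pods in one terminal:
kubectl get pods -w -l app=headless-app
Then, in another terminal, patch the StatefulSet with the lower replica count:
kubectl patch sts sample-statefulSet -p '{"spec":{"replicas":2}}'
The command should report:
statefulset.apps/sample-statefulSet patched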
(Note: Scaling down will not work if any pod is in an unhealthy state. To scale down successfully, ensure that all the stateful pods are in a healthy and active state.)
You can also scale a StatefulSet by modifying the manifest (YAML) file. Change the value of the “replicas” field under “spec” to the desired number, save the file, and reapply it:
kubectl apply -f sample-statefulSet.yaml
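As a sketch, the only line that needs to change in the StatefulSet part of the manifest is the replica count. The value 6 below is an arbitrary example.

```yaml
# Excerpt of the StatefulSet manifest; only "replicas" is edited.
spec:
  serviceName: "headless-svc"
  replicas: 6   # changed from 4; 6 is an arbitrary example value
  # selector, template, and volumeClaimTemplates stay as they were
```

Keeping the replica count in the manifest means the file remains the record of the desired state, whereas a change made with kubectl scale is overwritten the next time the unchanged file is applied.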
StatefulSet deployments can be updated automatically. The update strategy is specified by the “spec.updateStrategy” field of the StatefulSet configuration. Kubernetes supports the following strategies:
RollingUpdate is the default strategy for updating pods in a StatefulSet. It updates the pods one at a time in reverse ordinal order, from the highest ordinal down to 0, while ensuring data integrity and persistence throughout the process. To set this strategy explicitly, you can patch the StatefulSet as follows:
kubectl patch statefulset sample-statefulSet -p '{"spec":{"updateStrategy":{"type":"RollingUpdate"}}}'
You can also add the “partition” parameter to the “RollingUpdate” strategy to perform a rolling upgrade in distinct phases. When the pod template changes, only pods with an ordinal greater than or equal to the partition are updated, while pods with a lower ordinal keep their current version. This allows for a gradual and controlled transition. To configure this rollout plan, add the partition parameter as follows:
kubectl patch statefulset sample-statefulSet -p '{"spec":{"updateStrategy":{"type":"RollingUpdate","rollingUpdate":{"partition":2}}}}'
With a partition of 2 and four replicas, only sample-statefulSet-2 and sample-statefulSet-3 are updated. Lower the partition to roll the change out to more pods, or set it to 0 to update all of them.
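The same setting can also be kept in the manifest instead of being patched in. The fragment below is a sketch of the relevant part of the StatefulSet spec, reusing the partition value of 2 from the patch command.

```yaml
# Fragment of the StatefulSet manifest.
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 2   # pods with an ordinal of 2 or higher are updated; lower ordinals are left alone
```

Declaring the partition in the file keeps the rollout plan under version control alongside the rest of the configuration.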
You can also set the update strategy of a StatefulSet to OnDelete. With this strategy, the StatefulSet does not update pods automatically. A pod picks up the changed configuration only when the user deletes it manually and a replacement is created. For most practical use cases, this strategy is not recommended.
To delete a StatefulSet, run this command:
kubectl delete statefulset sample-statefulSet
It performs a cascading delete, which ensures that the StatefulSet and all its pods are deleted. If you run the following command while performing the delete, you will see the pods terminating one by one.
kubectl get pods -w -l app=headless-app
Follow these best practices to get the most out of your StatefulSet deployments:
StatefulSets are a powerful tool to orchestrate stateful applications within a Kubernetes cluster. However, there are a few limitations to consider when using them:
Despite their limitations, StatefulSets are a great way to deploy, manage, and scale stateful applications inside a Kubernetes cluster. They support automated updates, guarantee persistent storage and sticky identities, and enable predictable scaling. In this article, we discussed how StatefulSets work, learned how to configure, deploy, upgrade, and scale them, and explored some best practices and limitations. We hope you found it useful.