Kubernetes workload for stateful application but no need of persistent disk - kubernetes

I have a stateful application: I keep data in users' sessions (basically in the HttpSession object), but I have no requirement to write anything to persistent disk.
From what I have read so far, StatefulSet workloads are meant for stateful applications. My understanding, though, is that even though my application is stateful, a Deployment can also meet my requirement because I do not need to write anything to persistent disks.
However, one point I am not sure about: suppose I use a Deployment and a lot of user data is held in my HttpSession objects. If Kubernetes restarts my Pod for some reason, all that user session data will of course be lost. So my questions are the following:
Does a StatefulSet handle this situation any better than a Deployment?
Is the only difference between a Deployment and a StatefulSet the absence/presence of a persistent disk, or does a StatefulSet also have something to do with application session management?

Does a StatefulSet handle this situation any better than a Deployment?
No. Neither Deployment nor StatefulSet will preserve memory contents. To preserve session information, you'll need to store it somewhere. One common approach is to use Redis.
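To make the point concrete, here is a minimal sketch of keeping session data outside the pod's memory. The classes ExternalSessionStore and AppPod are hypothetical names, and the store is only a dict standing in for something like Redis; a real setup would use an actual client library and network calls.

```python
class ExternalSessionStore:
    """Stand-in for an external store such as Redis (here just a dict)."""

    def __init__(self):
        self._data = {}

    def save(self, session_id, attributes):
        self._data[session_id] = dict(attributes)

    def load(self, session_id):
        return dict(self._data.get(session_id, {}))


class AppPod:
    """Hypothetical pod: keeps a local cache but writes every change to the store."""

    def __init__(self, store):
        self.store = store
        self.local_sessions = {}  # lost whenever the pod restarts

    def _session(self, session_id):
        # Load from the external store on first access after a (re)start.
        if session_id not in self.local_sessions:
            self.local_sessions[session_id] = self.store.load(session_id)
        return self.local_sessions[session_id]

    def set_attribute(self, session_id, key, value):
        session = self._session(session_id)
        session[key] = value
        self.store.save(session_id, session)

    def get_attribute(self, session_id, key):
        return self._session(session_id).get(key)


store = ExternalSessionStore()
pod = AppPod(store)
pod.set_attribute("sess-1", "cart", ["book"])

# Simulate a restart: a new pod object starts with empty memory,
# but the session data is still available from the external store.
restarted = AppPod(store)
```

After the simulated restart, restarted.get_attribute("sess-1", "cart") still returns the cart, regardless of whether the pod was managed by a Deployment or a StatefulSet.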
Is the only difference between a Deployment and a StatefulSet the absence/presence of a persistent disk, or does a StatefulSet also have something to do with application session management?
No, there are other differences:
StatefulSets create (and re-create) pods with deterministic, consistent names (identifiers).
StatefulSets are deployed, scaled, and updated one pod at a time in a deterministic, consistent order. By default, the next pod is created only after the previous one is Running and Ready.
Additionally, it's worth mentioning that persistent disks can be attached to pods that aren't part of a StatefulSet. It's just convenient to have a disk always attached to a pod with a consistent ID. For instance, if you have pods running a replicated database, you can use a StatefulSet to ensure that the master replica's disk is always attached to the first pod (ordinal 0).
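As a small illustration of the naming scheme, the sketch below generates the pod names a StatefulSet uses and the name of the PersistentVolumeClaim that goes with each ordinal when a volumeClaimTemplate is defined. The function names and the example values ("db", "data") are hypothetical; only the name patterns come from Kubernetes.

```python
def statefulset_pod_names(name, replicas):
    """Pod names are <statefulset name>-<ordinal>, with ordinals starting at 0."""
    return [f"{name}-{ordinal}" for ordinal in range(replicas)]


def pvc_name(claim_template, statefulset, ordinal):
    """PVC names are <volumeClaimTemplate name>-<statefulset name>-<ordinal>."""
    return f"{claim_template}-{statefulset}-{ordinal}"


names = statefulset_pod_names("db", 3)
claims = [pvc_name("data", "db", i) for i in range(3)]
```

Because both names depend only on the ordinal, a re-created pod db-0 is matched with the same claim data-db-0 it had before.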
Edit:
Link to official documentation about StatefulSets
From the documentation:
Like a Deployment, a StatefulSet manages Pods that are based on an
identical container spec. Unlike a Deployment, a StatefulSet maintains
a sticky identity for each of their Pods. These pods are created from
the same spec, but are not interchangeable: each has a persistent
identifier that it maintains across any rescheduling.
...
StatefulSets are valuable for applications that require one or more of
the following.
Stable, unique network identifiers.
Stable, persistent storage.
Ordered, graceful deployment and scaling.
Ordered, automated rolling updates.
In the above, stable is synonymous with persistence across Pod
(re)scheduling. If an application doesn't require any stable
identifiers or ordered deployment, deletion, or scaling, you should
deploy your application using a workload object that provides a set of
stateless replicas. Deployment or ReplicaSet may be better suited to
your stateless needs.

Related

Why not to use Kubernetes StatefulSet for stateless applications?

I know why to use a StatefulSet for stateful applications (e.g. a DB).
In most cases, I see advice like "You want to deploy a stateful app to k8s? Use StatefulSet!"
However, I have never seen advice like "You want to deploy a stateless app to k8s? Then DO NOT USE StatefulSet".
Even though nobody says "I don't recommend using StatefulSet for stateless apps", many stateless apps are deployed through a Deployment, as if it were the standard.
StatefulSet has clear pros for stateful apps, but I don't see that Deployment has clear pros for stateless apps.
Are there any pros to using a Deployment for stateless apps? Or are there any clear cons to using a StatefulSet for stateless apps?
I supposed that a StatefulSet could not use a LoadBalancer Service, or that a StatefulSet carried a penalty when used with HPA, but both assumptions were wrong.
I'm really curious about this question.
P.S. The precondition is that the stateless app also uses a PV, but not to persist stateful data; for example, it stores logs.
I googled "When not to use StatefulSet", "when Deployment is better than StatefulSet", "Why Deployment is used for stateless apps", and similar questions.
I also read the k8s docs about StatefulSet.
Different Priorities
What happens when a Node becomes unreachable in a cluster?
Deployment - Stateless apps
You want to maximize availability. As soon as Kubernetes detects that there are fewer than the desired number of replicas running in your cluster, the controllers spawn new replicas of it. Since these apps are stateless, it is very easy to do for the Kubernetes controllers.
StatefulSet - Stateful apps
You want to maximize availability, but you must also ensure data consistency (the state). To ensure data consistency, each replica has its own unique ID, and there are never multiple replicas with the same ID, i.e. it is unique. This means that you cannot spawn a new replica unless you are sure that the replica on the unreachable Node has terminated (i.e. has stopped using the Persistent Volume).
Conclusion
Both Deployment and StatefulSet try to maximize availability, but StatefulSet cannot sacrifice data consistency (i.e. your state), so it cannot act as fast as a Deployment (stateless apps) can.
These priorities apply not only when a Node becomes unreachable but at all times, e.g. also during upgrades and deployments.
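The difference in priorities can be written down as a tiny decision rule. This is only a simplified model of the behavior described above, not the controllers' actual code; the function name and its parameters are hypothetical.

```python
def may_create_replacement(workload_kind, old_pod_confirmed_terminated):
    """Decide whether a replacement pod may be started for a pod on an unreachable node.

    Simplified model: a Deployment favors availability and replaces right away,
    while a StatefulSet waits until the old pod with that identity is known to be gone.
    """
    if workload_kind == "Deployment":
        return True
    if workload_kind == "StatefulSet":
        return old_pod_confirmed_terminated
    raise ValueError(f"unknown workload kind: {workload_kind}")
```

With this model, a Deployment replaces a pod even while the old one's fate is unknown, whereas a StatefulSet only does so once termination is confirmed, which is why it reacts more slowly.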
In contrast to a Kubernetes Deployment, where pods are easily replaceable, each pod in a StatefulSet is given a name and treated individually. Pods with distinct identities are necessary for stateful applications.
This implies that if any pod perishes, it will be apparent right away. StatefulSets act as controllers but do not generate ReplicaSets; rather, they generate pods with distinctive names that follow a predefined pattern. The ordinal index appears in the DNS name of a pod. When volumeClaimTemplates are defined, a distinct persistent volume claim (PVC) is created for each pod, so each replica in a StatefulSet has its own state.
For instance, a StatefulSet with four replicas generates four pods, each of which has its own volume, i.e. four PVCs. StatefulSets require a headless service to return the IPs of the associated pods and enable direct interaction with them. The headless service has a DNS name but no cluster IP, and has to be created separately. The major components of a StatefulSet are the set itself, the persistent volume and the headless service.
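To show how the ordinal ends up in each pod's DNS name, here is a short sketch that builds the per-pod DNS names served through the headless service. The function name and the example values ("web", "nginx", "default") are hypothetical, and cluster.local is assumed as the cluster domain since it is the usual default.

```python
def pod_dns_names(statefulset, service, namespace, replicas, cluster_domain="cluster.local"):
    """Build <pod name>.<headless service>.<namespace>.svc.<cluster domain> for each ordinal."""
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


dns_names = pod_dns_names("web", "nginx", "default", 2)
```

Clients that need a specific replica can address it by one of these names instead of going through a load-balanced service.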
That all being said, people do deploy stateful applications with Deployments; usually they mount an RWX PV into the pods so that all "frontends" share the same backend. This is quite common in CNCF projects.
A StatefulSet manages each Pod with a unique hostname based on an index number. The index makes it easy to identify individual Pods, and easy for applications that rely on unique network identities to tell which Pod is which. Also, as you might have read, the Pods of a StatefulSet are deleted in a specified order to maintain consistency.
When you use a StatefulSet for a stateless application, the unique network identities and ordering guarantees become a burden to manage and add complexity.
For example, when you scale a StatefulSet down to zero, the Pods are removed in a controlled way, which is not the case with a Deployment or ReplicaSet. However, there is no ordering guarantee when the StatefulSet resource itself is deleted.
Also, before a scaling operation is applied to a StatefulSet Pod, all of its predecessors must be Running and Ready. So if you deploy an application with three Pods, they are created in the order app-0, app-1, app-2: app-1 won't be deployed before app-0 is Running and Ready, and app-2 won't be deployed until app-1 is Running and Ready.
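The ordering rule can be sketched as a function that picks the next pod to create. This is a simplified model of the default ordered behavior, not the real controller; the function name and the way pod state is passed in (two sets of names) are assumptions made for illustration.

```python
def next_pod_to_create(name, replicas, existing, ready):
    """Return the next pod name to create, or None if the controller must wait or is done.

    existing: set of pod names that have been created
    ready:    set of pod names that are Running and Ready
    """
    for ordinal in range(replicas):
        pod = f"{name}-{ordinal}"
        if pod not in existing:
            return pod  # all predecessors are Ready, so this one may be created
        if pod not in ready:
            return None  # a predecessor exists but is not Ready yet: wait
    return None  # every replica already exists and is Ready
```

For example, with app-0 created but not yet Ready the function returns None, and only once app-0 is Ready does it return app-1.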
With a Deployment you can set percentages (maxSurge and maxUnavailable) to control the RollingUpdate, whereas a StatefulSet deletes and recreates its Pods one by one.
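As a rough sketch of the two update styles, the code below computes the pod-count bounds a Deployment rolling update works within when percentages are used (the surge value rounds up, the unavailable value rounds down, and 25% is the default for both), and lists the order in which a StatefulSet rolling update replaces its pods (highest ordinal first). The function names are hypothetical.

```python
import math


def rolling_update_bounds(replicas, max_surge_pct=25, max_unavailable_pct=25):
    """Translate Deployment percentages into absolute pod-count bounds."""
    surge = math.ceil(replicas * max_surge_pct / 100)  # rounds up
    unavailable = math.floor(replicas * max_unavailable_pct / 100)  # rounds down
    return {"max_pods": replicas + surge, "min_available": replicas - unavailable}


def statefulset_update_order(name, replicas):
    """A StatefulSet rolling update goes from the largest ordinal to the smallest."""
    return [f"{name}-{ordinal}" for ordinal in reversed(range(replicas))]


bounds = rolling_update_bounds(10)
order = statefulset_update_order("app", 3)
```

The Deployment can therefore replace several pods at once within its bounds, while the StatefulSet walks through its pods strictly one after another.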

In kubernetes, is there a way to make statefulset pods linger to finish requests on rolling update?

In Kubernetes, I have a statefulset with a number of replicas.
I've set the updateStrategy to RollingUpdate.
I've set podManagementPolicy to Parallel.
My statefulset instances do not have a persistent volume claim -- I use the statefulset as a way to allocate ordinals 0..(N-1) to pods in a deterministic manner.
The main reason for this is to keep availability for new requests while rolling out software updates (freshly built containers), while still allowing each container, and other services in the cluster, to "know" its ordinal.
The behavior I want, when doing a rolling update, is for the previous statefulset pods to linger while there are still long-running requests processing on them, but I want new traffic to go to the new pods in the statefulset (mapped by the ordinal) without a temporary outage.
Unfortunately, I don't see a way of doing this -- what am I missing?
Because I don't use volume claims, you might think I could use deployments instead, but I really do need each of the pods to have a deterministic ordinal, that:
is unique at the point of dispatching new service requests (incoming HTTP requests, including public ingresses)
is discoverable by the pod itself
is persistent for the duration of the pod lifetime
is contiguous from 0 .. (N-1)
The second-best option I can think of is using something like zookeeper or etcd to separately manage this property, using some of the traditional long-poll or leader-election mechanisms, but given that kubernetes already knows (or can know) about all the necessary bits, AND kubernetes service mapping knows how to steer incoming requests from old instances to new instances, that seems more redundant and complicated than necessary, so I'd like to avoid that.
I assume that you need this for a stateful workload, e.g. a workload that requires writes. Otherwise you can use Deployments with multiple pods online for your shards. A key feature of StatefulSets is that they provide unique, stable network identities for the instances.
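Since a StatefulSet pod's hostname is its pod name, a pod can discover its own ordinal by parsing the trailing number. The following is a minimal sketch; the function name is hypothetical, and the hostname parameter exists only so the logic can be exercised outside a cluster.

```python
import re
import socket


def ordinal_from_hostname(hostname=None):
    """Extract the trailing ordinal from a StatefulSet pod hostname such as 'web-2'."""
    if hostname is None:
        hostname = socket.gethostname()
    match = re.search(r"-(\d+)$", hostname)
    if match is None:
        raise ValueError(f"hostname {hostname!r} does not end in -<ordinal>")
    return int(match.group(1))
```

Inside a pod, calling ordinal_from_hostname() with no argument reads the hostname the pod was given.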
The behavior I want, when doing a rolling update, is for the previous statefulset pods to linger while there are still long-running requests processing on them, but I want new traffic to go to the new pods in the statefulset.
This behavior is supported by Kubernetes pods. But you also need to implement support for it in your application.
New traffic will not be sent to your "old" pods.
A SIGTERM signal will be sent to the pod - your application may want to listen to this and do some action.
After a configurable "termination grace period", your pod will get killed.
See Kubernetes best practices: terminating with grace for more info about pod termination.
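On the application side, the support mentioned above usually means reacting to SIGTERM by refusing new work and letting in-flight requests finish before the grace period ends. Here is a minimal sketch of that idea; the GracefulShutdown class and its method names are hypothetical, and a real server would wire these calls into its request handling loop.

```python
import signal
import threading


class GracefulShutdown:
    """Tracks in-flight requests and stops accepting new ones after SIGTERM."""

    def __init__(self):
        self.stop_requested = False
        self.in_flight = 0
        self._lock = threading.Lock()

    def install(self):
        # Must be called from the main thread of the process.
        signal.signal(signal.SIGTERM, self.handle_sigterm)

    def handle_sigterm(self, signum, frame):
        # Keep the handler trivial: only set a flag, take no locks.
        self.stop_requested = True

    def try_begin_request(self):
        """Return True if the request may start, False once shutdown has begun."""
        with self._lock:
            if self.stop_requested:
                return False
            self.in_flight += 1
            return True

    def end_request(self):
        with self._lock:
            self.in_flight -= 1

    def drained(self):
        """True when shutdown was requested and no requests are still running."""
        with self._lock:
            return self.stop_requested and self.in_flight == 0
```

A server would call install() at startup, wrap each request in try_begin_request() and end_request(), and exit once drained() returns True. The termination grace period has to be long enough for the longest request you want to let finish.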
Be aware that you should connect to services instead of directly to pods for this to work. E.g. you need to create headless services for the replicas in a StatefulSet.
If your clients are connecting to a specific headless service, e.g. N, this means that it will not be available for some time during upgrades. You need to decide whether your clients should retry their connections during this period or connect to another headless service if N is not available.
If you are in a case where you need:
a stateful workload (e.g. support for write operations)
high availability for your instances
then you need a form of distributed system that does some form of replication/synchronization, e.g. using Raft or a product that implements it. Such a system is most easily deployed as a StatefulSet.
You may be able to do this using Container Lifecycle Hooks, specifically the preStop hook.
We use this to drain connections from our Varnish service before it terminates.
However, you would need to implement (or find) a script to do the draining.

In Kubernetes, what is the real purpose of replicasets?

I am aware of the hierarchical order of k8s resources. In brief:
service: a service is what exposes the application to the outside world (or within the cluster). (The service types, like ClusterIP, NodePort, Ingress, are not very relevant to this question.)
deployment: a deployment is what is responsible for keeping a set of pods running.
replicaset: a replica set is what a deployment in turn relies on to keep the set of pods running.
pod: a pod consists of a container or a group of containers.
container: the actual required application runs inside the container.
The thing I want to emphasise in this question is why we have the replicaset at all. Why doesn't the deployment directly take responsibility for keeping the required number of pods running, instead of relying on a replicaset for this?
If k8s is designed this way, there must be some benefit to having the replicaset. This is what I want to explore and understand in depth.
Both essentially serve the same purpose. Deployments are a higher-level abstraction and, as the name suggests, deal with creating, maintaining and upgrading the deployment (collection of pods) as a whole.
The primary responsibility of ReplicationControllers or ReplicaSets, on the other hand, is to maintain a set of identical replicas (which you can also achieve declaratively using Deployments, but internally a Deployment creates a ReplicaSet to do this).
More specifically, when you perform a "rolling" update of your deployment, such as updating the image version, the deployment internally creates a new ReplicaSet and performs the rollout. During the rollout you can see two ReplicaSets for the same deployment.
In other words, a Deployment needs the lower-level "encapsulation" of ReplicaSets to achieve this.
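The interplay of the two ReplicaSets during a rollout can be simulated with a few lines. This is a deliberately simplified model: it assumes new pods become ready instantly and uses absolute maxSurge/maxUnavailable values. The function name is hypothetical.

```python
def simulate_rollout(replicas, max_surge=1, max_unavailable=0):
    """Return the (old ReplicaSet size, new ReplicaSet size) pairs seen during a rollout."""
    if max_surge == 0 and max_unavailable == 0:
        raise ValueError("max_surge and max_unavailable cannot both be 0")
    old, new = replicas, 0
    steps = [(old, new)]
    while old > 0 or new < replicas:
        # Scale the new ReplicaSet up, staying within replicas + max_surge pods in total.
        add = max(0, min(replicas - new, replicas + max_surge - (old + new)))
        if add:
            new += add
            steps.append((old, new))
        # Scale the old ReplicaSet down, keeping at least replicas - max_unavailable pods.
        remove = max(0, min(old, (old + new) - (replicas - max_unavailable)))
        if remove:
            old -= remove
            steps.append((old, new))
    return steps


steps = simulate_rollout(3)
```

Each step only changes the size of one ReplicaSet, which is all the Deployment controller has to do; keeping the pods of each version running remains the job of the respective ReplicaSet.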

When should I use StatefulSet?Can I deploy database in StatefulSet?

I heard that a StatefulSet is suitable for databases.
But a StatefulSet will create a different PVC for each pod.
If I set replicas=3, then I get 3 pods and 3 different PVCs with different data.
Database users want only one database, not 3 databases.
So it seems clear we should not use a StatefulSet in this situation.
But then when should we use a StatefulSet?
A StatefulSet does three big things differently from a Deployment:
It creates a new PersistentVolumeClaim for each replica;
It gives the pods sequential names, starting with statefulsetname-0; and
It starts the pods in a specific order (ascending numerically).
This is useful when the database itself knows how to replicate data between different copies of itself. In Elasticsearch, for example, indexes are broken up into shards. There are by default two copies of each shard. If you have five Pods running Elasticsearch, each one will have a different fraction of the data, but internally the database system knows how to route a request to the specific server that has the datum in question.
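To illustrate how a system can route a request to the replica holding a given datum, here is a generic hash-based sketch that maps a key to a StatefulSet pod by ordinal. This is not Elasticsearch's actual routing algorithm, only an illustration of the general idea; the function name and example values are hypothetical.

```python
import hashlib


def pod_for_key(key, statefulset, replicas):
    """Map a key to one pod of the StatefulSet using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    ordinal = int.from_bytes(digest[:8], "big") % replicas
    return f"{statefulset}-{ordinal}"


target = pod_for_key("user-42", "es", 5)
```

The mapping only works because pod names are stable: the same key always resolves to the same pod name, even after that pod has been rescheduled.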
I'd recommend using a StatefulSet in preference to manually creating a PersistentVolumeClaim. For database workloads that can't be replicated, you can't set replicas: greater than 1 in either case, but the PVC management is valuable. You usually can't have multiple databases pointing at the same physical storage, containers or otherwise, and most types of Volumes can't be shared across Pods.
We can deploy a database to Kubernetes as a stateful application. Usually, when we deploy pods they have their own storage, but that storage is ephemeral: if the container is killed, its storage is gone with it.
Kubernetes has an object to tackle that scenario: when we want our data to persist, we attach a persistent volume claim to the pod. That way, if the container is killed, the data remains in the cluster and the new pod can access it.
Some limitations of using StatefulSet are:
1. The storage for each pod must be provisioned by a persistent volume provisioner based on the requested storage class (or pre-provisioned by an administrator).
2. Deleting or scaling down the replicas will not delete the volumes associated with the StatefulSet. This ensures the safety of the data.
3. StatefulSets currently require a Headless Service to be responsible for the network identity of the Pods.
4. StatefulSets do not provide any guarantee about the termination of pods when the StatefulSet itself is deleted. To get ordered and graceful termination, scale the StatefulSet down to 0 replicas before deleting it.
A StatefulSet is useful for running applications that store state.
A database running in a StatefulSet has multiple replica Pods, each with its own PVC; the database itself synchronizes the data across the Pods and their PVCs internally (Kubernetes does not do this for you).
So ideally, the best option is to use a StatefulSet with multiple replicas to get an HA database.
It then depends on the use case and on which database you want to use: whether it supports replication, clustering, etc.
Here is a MySQL example with replication details: https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/

How pod replicas sync with each other - Kubernetes?

I have a MySQL database pod with 3 replicas. Now I'm making some changes in one pod (pod data, not pod configuration), say I'm adding a table. How will the change be reflected in the other replicas of the pod?
I'm using kubernetes v1.13 with 3 worker nodes.
Pods do not sync. Think of them as independent processes.
If you want a clustered MySQL installation, the Kubernetes docs describe how to do this by using a StatefulSet: https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/#deploy-mysql
In essence, you have to configure the master/slave instances of MySQL yourself.
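One common way to do that configuration with a StatefulSet is to derive each instance's role from its ordinal. The sketch below assumes a convention where ordinal 0 is the primary and every instance gets the server ID 100 + ordinal, similar in spirit to the linked tutorial; treat both the convention and the function name as assumptions rather than a fixed rule.

```python
def mysql_role(pod_name):
    """Derive the replication role and server ID from a pod name such as 'mysql-1'."""
    ordinal = int(pod_name.rsplit("-", 1)[1])
    return {
        "role": "primary" if ordinal == 0 else "replica",
        "server_id": 100 + ordinal,  # assumed offset so that IDs are unique and non-zero
    }
```

An init script could use such a mapping to write the right MySQL configuration before the server starts, so that writes go to mysql-0 and the other pods replicate from it.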
Pods are independent of each other; if you modify one pod, the others will not be affected.
With your configuration, changes applied in one pod won't be reflected in the others. These are isolated resources.
It is good practice to deploy such things using PersistentVolumeClaims and StatefulSets.
You can always find explanation with examples and best practices in Run a Replicated Stateful Application documentation.
If you have three MySQL server pods, then you have 3 independent databases, even though you created them from the same Deployment. So, depending on what you do, you might end up with a bunch of databases in the cluster.
I would create 1 MySQL pod with persistence, so that if the pod dies, the next one picks up from where the previous one left off and no data is lost.
If what you want is high availability, or failover replica, you would need to manage it on your own.
Generally speaking, K8s should not be used for storage purposes.
You could use common storage (a PVC) shared among those 3 pods, and you should also consider a StatefulSet (STS) when running databases on k8s.