When should I use StatefulSet? Can I deploy a database in a StatefulSet? - kubernetes

I heard that StatefulSet is suitable for databases.
But a StatefulSet will create a different PVC for each pod.
If I set replicas=3, then I get 3 pods and 3 different PVCs with different data.
Database users only want one database, not 3 databases.
So it seems clear we should not use a StatefulSet in this situation.
But when should we use a StatefulSet?

A StatefulSet does three big things differently from a Deployment:
It creates a new PersistentVolumeClaim for each replica;
It gives the pods sequential names, starting with statefulsetname-0; and
It starts the pods in a specific order (ascending numerically).
This is useful when the database itself knows how to replicate data between different copies of itself. In Elasticsearch, for example, indexes are broken up into shards. There are by default two copies of each shard. If you have five Pods running Elasticsearch, each one will have a different fraction of the data, but internally the database system knows how to route a request to the specific server that has the datum in question.
I'd recommend using a StatefulSet in preference to manually creating a PersistentVolumeClaim. For database workloads that can't be replicated, you can't set replicas: greater than 1 in either case, but the PVC management is valuable. You usually can't have multiple databases pointing at the same physical storage, containers or otherwise, and most types of Volumes can't be shared across Pods.
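As an illustrative sketch of those three behaviors (the name `web`, the nginx image, and the storage size are assumptions, not from the question), the PVC-per-replica behavior comes from `volumeClaimTemplates`:

```yaml
# Hypothetical 3-replica StatefulSet: pods will be named web-0, web-1, web-2,
# started in ascending order, each with its own PVC (www-web-0, www-web-1, ...).
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web          # headless Service that owns the pods' DNS identity
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: app
        image: nginx:1.25
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:     # one PVC is stamped out per replica
  - metadata:
      name: www
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
```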

We can deploy a database to Kubernetes as a stateful application. Usually, when we deploy pods they have their own storage, but that storage is ephemeral: if the container dies, its storage is gone with it.
Kubernetes has an object to tackle that scenario: when we want our data to persist, we attach the pod to a persistent volume claim. That way, if our container dies, the data stays in the cluster, and the new pod will access it accordingly.
Some limitations of using StatefulSet are:
1. It requires a persistent volume provisioner to provision storage for the pods, based on the requested storage class.
2. Deleting or scaling down the replicas will not delete the volumes attached to the StatefulSet. This ensures the safety of the data.
3. StatefulSets currently require a headless Service to be responsible for the network identity of the Pods.
4. A StatefulSet doesn't guarantee that all pods are deleted when the StatefulSet itself is deleted, unlike a Deployment, which deletes all pods associated with it when it is deleted. You have to scale the replicas down to 0 before deleting the StatefulSet.
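A sketch of the headless Service mentioned in point 3 (the name `web` and the port are assumptions): the defining feature is `clusterIP: None`, which makes DNS resolve directly to the pod IPs and gives each pod a stable name.

```yaml
# Hypothetical headless Service required by a StatefulSet; it gives each pod
# a stable DNS name like web-0.web.default.svc.cluster.local.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  clusterIP: None   # "headless": no virtual IP, DNS resolves to pod IPs
  selector:
    app: web
  ports:
  - port: 80
    name: http
```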

A StatefulSet is useful for running applications that store state.
A StatefulSet database runs multiple replicas of pods and PVCs, and the database itself keeps them in sync: data is replicated across the pods and their PVCs.
So ideally the best option is to use a StatefulSet with multiple replicas to get an HA database.
It then depends on your use case which database you want to use, and whether it supports replication, clustering, etc.
Here is a MySQL example with replication details: https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/

Why not to use Kubernetes StatefulSet for stateless applications?

I know why to use StatefulSet for stateful applications (e.g. a DB or something).
In most cases, I see advice like "You want to deploy a stateful app to k8s? Use StatefulSet!"
However, I have never seen "You want to deploy a stateless app to k8s? Then DO NOT use StatefulSet."
Even though nobody says "I don't recommend using StatefulSet for stateless apps", many stateless apps are deployed through a Deployment, as if it were the standard.
The StatefulSet has clear pros for stateful apps, but I don't think Deployment has equivalent pros for stateless apps.
Are there any pros of Deployment for stateless apps? Or are there any clear cons of StatefulSet for stateless apps?
I supposed that a StatefulSet cannot use a LoadBalancer Service, or that a StatefulSet has a penalty when using HPA, but both of these turned out to be wrong.
I'm really curious about this question.
P.S. The precondition is that the stateless app also uses a PV, but does not persist stateful data - for example, logs.
I googled "When not to use StatefulSet", "when Deployment is better than StatefulSet", "Why Deployment is used for stateless apps", and some more questions.
I have also read the k8s docs about StatefulSet.
Different Priorities
What happens when a Node becomes unreachable in a cluster?
Deployment - Stateless apps
You want to maximize availability. As soon as Kubernetes detects that there are fewer than the desired number of replicas running in your cluster, the controllers spawn new replicas of it. Since these apps are stateless, it is very easy to do for the Kubernetes controllers.
StatefulSet - Stateful apps
You want to maximize availability - but you must also ensure data consistency (the state). To ensure data consistency, each replica has its own unique ID, and there are never multiple replicas with the same ID. This means that you cannot spawn a new replica unless you are sure that the replica on the unreachable Node is terminated (e.g. has stopped using the Persistent Volume).
Conclusion
Both Deployment and StatefulSet try to maximize availability - but a StatefulSet cannot sacrifice data consistency (e.g. your state), so it cannot act as fast as a Deployment of (stateless) apps can.
These priorities do not only apply when a Node becomes unreachable, but at all times, e.g. also during upgrades and deployments.
In contrast to a Kubernetes Deployment, where pods are easily replaceable, each pod in a StatefulSet is given a name and treated individually. Pods with distinct identities are necessary for stateful applications.
This implies that if any pod perishes, it will be apparent right away. StatefulSets act as controllers but do not generate ReplicaSets; rather, they generate pods with distinctive names that follow a predefined pattern. The ordinal index appears in the DNS name of a pod. A distinct persistent volume claim (PVC) is created for each pod, and each replica in a StatefulSet has its own state.
For instance, a StatefulSet with four replicas generates four pods, each of which has its own volume, so four PVCs. StatefulSets require a headless service to return the IPs of the associated pods and enable direct interaction with them. The headless service has a DNS name but no cluster IP, and it has to be created separately. The major components of a StatefulSet are the set itself, the persistent volumes and the headless service.
That all being said, people do deploy stateful applications with Deployments; usually they mount a RWX PV into the pods so all "frontends" share the same backend. This is quite common in CNCF projects.
A StatefulSet manages each pod with a unique hostname based on an index number. With that index, it is easy to identify individual pods, and easy for the application to rely on unique network identities. Also, as you may have read, StatefulSet pods are deleted in a specified order to maintain consistency.
When you use a StatefulSet for a stateless application, the unique network identities and ordering guarantees become a burden to manage and add complexity.
For example, when you scale a StatefulSet down to zero, it happens in a controlled, ordered way, while with a Deployment or ReplicaSet that is not the case. However, there is no ordering guarantee when deleting the StatefulSet resource itself.
Also, before a scaling operation is applied to a StatefulSet pod, all of its predecessors must be Running and Ready. So if you are deploying an application with three pods, they will be deployed in order: app-0, app-1, app-2. app-1 won't be deployed before app-0 is Running and Ready, and app-2 won't be deployed until app-1 is Ready.
With a Deployment you can tune the rollout percentages for the RollingUpdate scenario, but a StatefulSet will delete and recreate pods one by one.
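A sketch of that difference in manifest form (names and images are illustrative; the field names are the real API fields). A Deployment can surge extra pods during a RollingUpdate, while a StatefulSet replaces its pods one at a time, in reverse ordinal order:

```yaml
# Hypothetical side-by-side of update strategies.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stateless-app
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # extra pods allowed above the desired count
      maxUnavailable: 25%  # pods that may be down during the rollout
  selector:
    matchLabels:
      app: stateless-app
  template:
    metadata:
      labels:
        app: stateless-app
    spec:
      containers:
      - name: app
        image: nginx:1.25
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: stateful-app
spec:
  replicas: 3
  serviceName: stateful-app
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 0         # pods with ordinal >= partition are updated
  selector:
    matchLabels:
      app: stateful-app
  template:
    metadata:
      labels:
        app: stateful-app
    spec:
      containers:
      - name: app
        image: nginx:1.25
```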

Running DB as Kubernetes Deployment or StatefulSet?

I would like to run a single pod of MongoDB in my Kubernetes cluster. I would be using a node selector to get the pod scheduled on a specific node.
Since Mongo is a database and I am using a node selector, is there any reason for me not to use a Kubernetes Deployment over a StatefulSet? Please elaborate on this if we should never use a Deployment.
Since Mongo is a database and I am using a node selector, is there any reason for me not to use a k8s Deployment over a StatefulSet?
You should not run a database (or other stateful workload) as Deployment, use StatefulSet for those.
They have different semantics while updating or when the pod becomes unreachable. StatefulSet use at-most-X semantics and Deployments use at-least-X semantics, where X is number of replicas.
E.g. if the node becomes unreachable (e.g. due to a network issue), for a Deployment a new Pod will be created on a different node (to satisfy your desired 1 replica), but a StatefulSet will make sure to terminate the existing Pod before creating a new one, so that there is never more than 1 (when 1 is your desired number of replicas).
If you run a database, I assume that you want the data consistent, so you don't want duplicate instances with different data (but you should probably run a distributed database instead).
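A sketch of what the single-replica setup from the question could look like (the node label `disktype: ssd`, the image tag, and the sizes are assumptions). With `replicas: 1` and the StatefulSet's at-most-X semantics, Kubernetes will not start a second mongo pod until the old one is confirmed terminated:

```yaml
# Hypothetical single-replica MongoDB StatefulSet pinned to a node.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo
  replicas: 1
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      nodeSelector:
        disktype: ssd        # assumed label; stands in for the question's node selector
      containers:
      - name: mongo
        image: mongo:6.0
        volumeMounts:
        - name: data
          mountPath: /data/db
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
```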

Kubernetes workload for stateful application but no need of persistent disk

I have a stateful application - I keep data in users' sessions (basically data in the HttpSession object) - but I do not have any requirement to write anything to persistent disk.
From what I have read so far, StatefulSet workloads are meant for stateful applications, but my understanding is that even though my application is stateful, a Deployment workload can also satisfy my requirement because I do not want to write anything to persistent disks.
However, one point I am not sure about: suppose I use a Deployment workload and a lot of user data is present in my HttpSession object; now, if for some reason Kubernetes restarts my Pod, then of course all that user session data will be lost. So, my questions are the following:
Does StatefulSet handle this situation any better than a Deployment workload?
Is the only difference between a Deployment workload and a StatefulSet workload the absence/presence of a persistent disk, or does a StatefulSet also do something for application session management?
Does StatefulSet handle this situation any better than a Deployment workload?
No. Neither Deployment nor StatefulSet will preserve memory contents. To preserve session information, you'll need to store it somewhere. One common approach is to use Redis.
Is the only difference between a Deployment workload and a StatefulSet workload the absence/presence of a persistent disk, or does a StatefulSet also do something for application session management?
No, there are other differences:
StatefulSets create (and re-create) deterministic, consistent pod names (identifiers).
StatefulSets are deployed, scaled, and updated one by one in a deterministic, consistent order. The next pod is created only after the previous one has reached the Running state.
Additionally, it's worth mentioning that persistent disks can be attached to pods that aren't part of a StatefulSet. It's just that it's convenient to have disks always attached to a pod with a consistent identity. For instance, if you have pods running a replicated database, you can use a StatefulSet to ensure that the master replica's disk is always attached to the same pod (e.g. pod 0).
Edit:
Link to official documentation about StatefulSets
From the documentation:
Like a Deployment, a StatefulSet manages Pods that are based on an identical container spec. Unlike a Deployment, a StatefulSet maintains a sticky identity for each of their Pods. These pods are created from the same spec, but are not interchangeable: each has a persistent identifier that it maintains across any rescheduling.
...
StatefulSets are valuable for applications that require one or more of the following.
Stable, unique network identifiers.
Stable, persistent storage.
Ordered, graceful deployment and scaling.
Ordered, automated rolling updates.
In the above, stable is synonymous with persistence across Pod (re)scheduling. If an application doesn't require any stable identifiers or ordered deployment, deletion, or scaling, you should deploy your application using a workload object that provides a set of stateless replicas. Deployment or ReplicaSet may be better suited to your stateless needs.

How do I mount data into persisted storage on Kubernetes and share the storage amongst multiple pods?

I am new to Kubernetes and am trying to understand the most efficient and secure way to handle sensitive persisted data that interacts with a k8s pod. I have the following requirements when I start a pod in a k8s cluster:
The pod should have persisted storage.
Data inside the pod should be persistent even if the pod crashes or restarts.
I should be able to easily add or remove data from hostPath into the pod. (Not sure if this is feasible since I do not know how the data will behave if the pod starts on a new node in a multi node environment. Do all nodes have access to the data on the same hostPath?)
Currently, I have been using StatefulSets with a persistent volume claim on GKE. The image that I am using has a couple of constraints as follows:
I have to mount a configuration file into the pod before it starts. (I am currently using configmaps to pass the configuration file)
The pod that comes up, creates its own TLS certificates which I need to pass to other pods. (Currently I do not have a process in place to do this and thus have been manually copy pasting these certificates into other pods)
So, how do I maintain a common persisted storage that handles sensitive data between multiple pods and how do I add pre-configured data to this storage? Any guidance or suggestions are appreciated.
I believe this documentation on creating a persistent disk with multiple readers [1] is what you are looking for. You will, however, only be able to have the pods read from the disk, since GCP persistent disks do not support the ReadWriteMany access mode [2].
Regarding hostPath: the mount point is in the pod, but the volume is a directory on the node. A hostPath volume is confined to an individual node, so pods scheduled on different nodes will not see the same data.
[1] https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/readonlymany-disks
[2] https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes
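A sketch of the claim side of the setup [1] describes (the claim and PV names are assumptions; the manifest presumes you have already created and pre-populated a PV backed by a GCE persistent disk):

```yaml
# Hypothetical PVC that binds to a pre-created, pre-populated PV and is
# mounted read-only by many pods at once (ReadOnlyMany). Writable-by-many
# (ReadWriteMany) is the mode GCE PD does not support.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
  - ReadOnlyMany
  storageClassName: ""        # disable dynamic provisioning; bind to an existing PV
  volumeName: shared-data-pv  # assumed name of the manually created PV
  resources:
    requests:
      storage: 10Gi
```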

How pod replicas sync with each other - Kubernetes?

I have a MySQL database pod with 3 replicas. Now I'm making some changes in one pod (pod data, not pod configuration), say I'm adding a table. How will the change reflect on the other replicas of the pod?
I'm using Kubernetes v1.13 with 3 worker nodes.
Pods do not sync. Think of them as independent processes.
If you want a clustered MySQL installation, the Kubernetes docs describe how to do this by using a StatefulSet: https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/#deploy-mysql
In essence you have to configure master/slave instances of MySQL yourself.
Pods are independent from each other; if you modify one pod, the others will not be affected.
As per your configuration, changes applied in one pod won't be reflected in the others. These are isolated resources.
It is good practice to deploy such things using PersistentVolumeClaims and StatefulSets.
You can always find explanation with examples and best practices in Run a Replicated Stateful Application documentation.
If you have three MySQL server pods, then you have 3 independent databases, even though you created them from the same Deployment. So, depending on what you do, you might end up with a bunch of databases in the cluster.
I would create 1 MySQL pod with persistence, so that if the pod dies, the next one picks up where the other left off and no data is lost.
If what you want is high availability, or a failover replica, you would need to manage it on your own.
Generally speaking, K8s should not be used for storage purposes.
You could have common storage among those 3 pods (a PVC), and you should also consider a StatefulSet when running databases on k8s.