Why do we need ReplicaSet when Deployment can do everything that ReplicaSet can do? I know that Deployment uses ReplicaSet underneath it [duplicate] - kubernetes

I know that a Deployment uses a ReplicaSet underneath it, has revision control, and creates another ReplicaSet during a rolling upgrade/downgrade.
I want to know the scenario in which only a ReplicaSet can be used and a Deployment can't be used.

A ReplicaSet's purpose is to maintain a stable set of replica Pods running at any given time. It checks how many Pods it needs to maintain and, based on that, creates or deletes Pods as needed to reach the desired number. ReplicaSets can be used independently. With a ReplicaSet you define the number of replicas you want to run for a particular service, and that many replicas are kept running.
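The create-or-delete behaviour described above can be sketched as a small reconcile loop. This is a toy model, not the Kubernetes API: the class `ReplicaSetSim`, its fields and the `pod-N` naming are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ReplicaSetSim:
    """Toy stand-in for a ReplicaSet controller (hypothetical, not the real API)."""
    desired: int
    pods: list = field(default_factory=list)
    _ids: count = field(default_factory=count, repr=False)

    def reconcile(self):
        """Create or delete pods until the actual count matches `desired`."""
        while len(self.pods) < self.desired:
            self.pods.append(f"pod-{next(self._ids)}")
        while len(self.pods) > self.desired:
            self.pods.pop()
        return self.pods


rs = ReplicaSetSim(desired=3)
rs.reconcile()            # three pods are created
rs.pods.remove("pod-1")   # simulate one pod disappearing
rs.reconcile()            # a replacement pod is created
rs.desired = 2
rs.reconcile()            # one surplus pod is deleted
```

Note that nothing in this loop knows about image versions or rollouts; it only compares a desired count with an actual count.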
A Deployment, by contrast, is an advancement over ReplicaSets. When you use Deployments you don't have to worry about managing the ReplicaSets that they create. Deployments own and manage their ReplicaSets. As such, it is recommended to use Deployments when you want ReplicaSets. A ReplicaSet only looks after creating and deleting Pods. Deployment is recommended for application services, and with a Deployment you can do rolling upgrades or rollbacks, for example updating images from v1 to v2.
Refer to SO1, SO2 and the official documentation of ReplicaSets and Deployments.

What is the scenario in which only a ReplicaSet can be used and a Deployment can't be used?
There is no such common scenario. ReplicaSets are a lower-level abstraction for maintaining stateless pods of the same image/config version. You typically create a new ReplicaSet when you want to change the image or pod configuration, and it is recommended to use a Deployment for such changes.
On its own, a ReplicaSet is not very useful to use directly; it is more of a lower-level abstraction for maintaining the number of replicas with the same configuration.

Related

What is the division of responsibilities between Deployments and ReplicaSets in a rolling update?

When updating my deployment to use a new version of my application, a new ReplicaSet is created and the previous ReplicaSet is scaled down as the new one scales up. What is the separation of concerns between the Deployment and the two ReplicaSets during this process?
Am I correct in assuming that it's the Deployment that's gradually changing the number of desired replicas for the two ReplicaSets as the update progresses?
A ReplicaSet ensures that a number of Pods is created in a cluster. The pods are called replicas and are the mechanism of availability in Kubernetes. But changing the ReplicaSet will not take effect on existing Pods, so it is not possible to easily change, for example, the image version.
A Deployment is a higher abstraction that manages one or more ReplicaSets to provide a controlled rollout of a new version. When the image version is changed in the Deployment, a new ReplicaSet for this version is created with initially zero replicas. It is then scaled to one replica; once that replica is running, the old ReplicaSet is scaled down. (The number of newly created pods, the step size so to speak, can be tuned.)
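The scale-up/scale-down interplay can be sketched as a simulation. This is a simplified model under stated assumptions: the function `rolling_update` is hypothetical, it assumes every new pod becomes ready immediately, and it treats the step size as absolute `max_surge` / `max_unavailable` counts. The real controller's behaviour depends on readiness and on how those settings are configured.

```python
def rolling_update(replicas, max_surge=1, max_unavailable=0):
    """Return the (old, new) replica counts a toy Deployment steps through
    while shifting pods from an old ReplicaSet to a new one."""
    if max_surge == 0 and max_unavailable == 0:
        raise ValueError("max_surge and max_unavailable cannot both be 0")
    old, new = replicas, 0
    steps = [(old, new)]

    def record():
        # Only record a step when the counts actually changed.
        if steps[-1] != (old, new):
            steps.append((old, new))

    while old > 0 or new < replicas:
        # Scale up the new ReplicaSet within the surge budget.
        room = replicas + max_surge - (old + new)
        new += min(room, replicas - new)
        record()
        # Scale down the old ReplicaSet while keeping enough pods available.
        removable = (old + new) - (replicas - max_unavailable)
        old -= min(removable, old)
        record()
    return steps


steps = rolling_update(3)
# [(3, 0), (3, 1), (2, 1), (2, 2), (1, 2), (1, 3), (0, 3)]
```

In this model the Deployment is the only party changing desired counts; each ReplicaSet just keeps its own pod count at whatever it was last told.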
The following explains the division of responsibilities between Deployments and ReplicaSets in a rolling update.
Deployment resources make it easier to update your pods to a newer version.
As per the question, let's assume ReplicaSet-A controls your pods and you wish to update your pods to a newer version. Without a Deployment, you would create ReplicaSet-B, then repeatedly scale down ReplicaSet-A and scale up ReplicaSet-B by one step (this process is known as a rolling update). Although this does the job, it's not good practice, and it's better to let K8s do the job.
A Deployment resource does this automatically without any human interaction and increases the abstraction by one level.
Note: A Deployment doesn't interact with pods directly; it just performs the rolling update using ReplicaSets.
Refer to the ReplicaSet doc and the Deployment doc.

In Kubernetes, what is the real purpose of replicasets?

I am aware of the hierarchical order of k8s resources. In brief:
service: a service is what exposes the application to the outer world (or within the cluster). (The service types like ClusterIP, NodePort, Ingress are not so relevant to this question.)
deployment: a deployment is what is responsible for keeping a set of pods running.
replicaset: a replica set is what a deployment in turn relies on to keep the set of pods running.
pod: a pod consists of a container or a group of containers.
container: the actual required application is run inside the container.
The thing I want to emphasise in this question is: why do we have ReplicaSets? Why doesn't the Deployment directly take responsibility for keeping the required number of pods running, instead of relying on a ReplicaSet for this?
If k8s is designed this way, there should definitely be some benefit to having ReplicaSets. This is what I want to understand in depth.
Both essentially serve the same purpose. Deployments are a higher abstraction and, as the name suggests, deal with creating, maintaining and upgrading the deployment (collection of pods) as a whole.
ReplicationControllers or ReplicaSets, on the other hand, have the primary responsibility of maintaining a set of identical replicas (which you can achieve declaratively using Deployments too, but internally a Deployment creates a ReplicaSet to enable this).
More specifically, when you perform a "rolling" update to your Deployment, such as updating the image version, the Deployment internally creates a new ReplicaSet and performs the rollout. During the rollout you can see two ReplicaSets for the same Deployment.
In other words, a Deployment needs the lower-level "encapsulation" of ReplicaSets to achieve this.
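One way to picture this layering is a Deployment that keeps one ReplicaSet per distinct pod template. The sketch below is a toy model: `DeploymentSim` and its hashing scheme are hypothetical (Kubernetes computes its own pod-template-hash), and the gradual scaling of a real rollout is collapsed into a single step.

```python
import hashlib
import json


class DeploymentSim:
    """Toy Deployment: one ReplicaSet per distinct pod template (hypothetical)."""

    def __init__(self, name, template, replicas):
        self.name = name
        self.replicas = replicas
        self.replica_sets = {}   # ReplicaSet name -> desired replica count
        self.history = []        # ReplicaSet names in rollout order
        self.apply(template)

    def _rs_name(self, template):
        # Illustrative hash only; not the algorithm Kubernetes uses.
        payload = json.dumps(template, sort_keys=True).encode()
        return f"{self.name}-{hashlib.sha256(payload).hexdigest()[:8]}"

    def apply(self, template):
        """Roll out a template: its ReplicaSet gets all replicas, every other
        ReplicaSet is scaled to zero but kept so a rollback can reuse it."""
        target = self._rs_name(template)
        for rs in self.replica_sets:
            self.replica_sets[rs] = 0
        self.replica_sets[target] = self.replicas
        if target in self.history:
            self.history.remove(target)
        self.history.append(target)
        return target


web = DeploymentSim("web", {"image": "app:v1"}, replicas=3)
v1 = web.history[-1]
v2 = web.apply({"image": "app:v2"})   # new ReplicaSet; the v1 one drops to 0
web.apply({"image": "app:v1"})        # rollback reuses the existing v1 ReplicaSet
```

Because each ReplicaSet only ever represents one template, revision history and rollback reduce to choosing which ReplicaSet gets the replicas.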

StatefulSet update: recreate THEN delete pods

The Kubernetes StatefulSet RollingUpdate strategy deletes and recreates each Pod in order. I am interested in updating a StatefulSet by recreating a pod and then deleting the old Pod (note the reversal), one-by-one.
This is interesting to me because:
There is no reduction in the number of Ready Pods. I understand this is how a normal Deployment update works too (i.e. a Pod is only deleted after the new Pod replacing it is Ready).
More importantly, it allows me to perform application-specific live migration during my StatefulSet upgrade. I would like to "migrate" data from (old) pod-i to (new) pod-i before (old) pod-i is terminated (I would implement this in (new) pod-i readiness logic).
Is such an update strategy possible?
This is inherently possible with Deployments, but not StatefulSets. StatefulSets are used when you care strongly about an exact number of replicas with well known names. Deployments are used for more elastic workloads.
You may be able to accomplish your goal by using multiple StatefulSets- e.g. instead of a StatefulSet of 3 replicas, use 3 StatefulSets of 1 replica each. Then deploy an additional StatefulSet for your data migration before removing one of the previous ones.
Alternatively, this may be a use case for an Operator to manage the application.
No, because pods have specific names based on their ordinal (-0, -1, etc) and there can only be one pod at a time with a given name. Deployments and DaemonSets can burst for updates because their names are randomized so it doesn't matter what order you do things in.
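The naming constraint can be illustrated with a short sketch. The helper names and the random suffix format are hypothetical; only the ideas of ordinal-based names versus randomized names come from the answer above.

```python
import secrets


def statefulset_pod_name(set_name, ordinal):
    """StatefulSet pods get a stable name derived from their ordinal."""
    return f"{set_name}-{ordinal}"


def deployment_pod_name(rs_name):
    """Deployment pods get a random suffix (format here is illustrative)."""
    return f"{rs_name}-{secrets.token_hex(3)}"


def can_create(existing_names, new_name):
    """Two pods with the same name cannot exist at the same time."""
    return new_name not in existing_names


running = {statefulset_pod_name("db", i) for i in range(3)}   # db-0, db-1, db-2

# A replacement for db-1 must also be called db-1, so it cannot be created
# while the old db-1 still exists.
replacement = statefulset_pod_name("db", 1)
surge_blocked = not can_create(running, replacement)

# A Deployment-style replacement gets a fresh name, so old and new can overlap.
surge_allowed = can_create(running, deployment_pod_name("web-7d9f"))
```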

how to stagger pod creation in k8s

I had a quick question about rolling deploys. I'm trying to make sure that the app pods creation is staggered. I looked at maxSurge and maxUnavailable which seems to be the only settings for controlling rolling deploys. Both these settings talk about pod creation in terms of old replicaset. I want to make sure that pod creation is staggered even when there is no deployment currently running.
example: If I set maxSurge to 1 and I have the replication set to 5 then in the presence of old deployment, the rolling update strategy will do the right thing and get one pod up at a time, but if there is no old deployment, all the 5 pods will come up together on a new deployment which is something I am trying to avoid.
What you have explained is the expected behaviour in case there are no existing deployments.
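The arithmetic behind this can be sketched as follows. It is a simplified model, not the controller's actual code: `pods_created_now` is a hypothetical helper, and `max_surge` is taken as an absolute count.

```python
def pods_created_now(desired, existing_total, max_surge):
    """How many new pods may be created in one step under this model.

    The total pod count may not exceed desired + max_surge, and the new
    ReplicaSet never needs more than `desired` pods.
    """
    ceiling = desired + max_surge
    return max(0, min(desired, ceiling - existing_total))


with_old_pods = pods_created_now(desired=5, existing_total=5, max_surge=1)   # 1
first_rollout = pods_created_now(desired=5, existing_total=0, max_surge=1)   # 5
```

With five old pods already running, the surge budget leaves room for one new pod at a time. With nothing running, the same budget leaves room for all five at once, so `maxSurge` alone does not stagger a first rollout.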
So you want to do an ordered deployment, one pod after the other.
Try deploying the application as a statefulset.
https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
Also note the differences between a Deployment and a StatefulSet, for example no rollback in the case of a StatefulSet.
https://blog.thecodeteam.com/2017/08/16/technical-dive-statefulsets-deployments-kubernetes/
You could try leveraging a HorizontalPodAutoscaler with maybe a custom metric that could be set to whatever value will result in your desired number of replicas. Then just configure your HPA so it only scales up so many at a time.

How Replicaset includes pods with specific labels

If I give a specific label to a pod and define a ReplicaSet whose selector matches that label, the ReplicaSet includes that pod. That is all fine and good.
(I know pods are not meant to be created separately, but are supposed to be created through Deployments or ReplicaSets. Still, how do Deployments/ReplicaSets include pods whose labels match the definition, if they are already there for some reason?)
But how does this work behind the scenes? How does the ReplicaSet know that a pod is to be included because it has the same label? Let's say I already have a pod with those labels: how does a newly created ReplicaSet know that the pod is to be included if it has fewer pods than the desired number?
Does it get that information from etcd? Or do pods expose labels somehow? How does this really work behind the scenes?
As stated in the Kubernetes documentation regarding ReplicaSet.
A ReplicaSet is defined with fields, including a selector that specifies how to identify Pods it can acquire, a number of replicas indicating how many Pods it should be maintaining, and a pod template specifying the data of new Pods it should create to meet the number of replicas criteria. A ReplicaSet then fulfills its purpose by creating and deleting Pods as needed to reach the desired number. When a ReplicaSet needs to create new Pods, it uses its Pod template.
It's recommended to use Deployments instead of ReplicaSets.
Deployment is an object which can own ReplicaSets and update them and their Pods via declarative, server-side rolling updates. While ReplicaSets can be used independently, today they’re mainly used by Deployments as a mechanism to orchestrate Pod creation, deletion and updates. When you use Deployments you don’t have to worry about managing the ReplicaSets that they create. Deployments own and manage their ReplicaSets. As such, it is recommended to use Deployments when you want ReplicaSets.
Like you mentioned, if you have a Pod with a label matching the ReplicaSet's selector, the ReplicaSet will take control of the Pod. If you deploy a ReplicaSet with 3 replicas and a Pod was deployed before that, then the RS will spawn only 2 Pods with the matching label. It's explained in detail, with examples, under Non-Template Pod acquisitions.
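The acquisition logic can be sketched roughly as below. This is an illustrative model only: the pod dictionaries, the `owner` key and both function names are hypothetical, and it covers equality-based selectors only.

```python
def matches(selector, labels):
    """True when every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def pods_to_create(selector, replicas, existing_pods):
    """Treat unowned matching pods as acquired and return
    (acquired pod names, number of pods still to create from the template)."""
    acquired = [
        pod["name"]
        for pod in existing_pods
        if pod.get("owner") is None and matches(selector, pod["labels"])
    ]
    return acquired, max(0, replicas - len(acquired))


existing = [
    {"name": "manual-pod", "labels": {"tier": "frontend"}, "owner": None},
    {"name": "other-pod", "labels": {"tier": "backend"}, "owner": None},
]
acquired, missing = pods_to_create({"tier": "frontend"}, 3, existing)
# acquired == ["manual-pod"], missing == 2
```

The pre-existing pod counts towards the desired total, which is why only 2 new Pods are spawned in the 3-replica example above.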
As to how it works behind the scenes, you can have a look at slides #47-56 of Kubernetes Architecture - beyond a black box - Part 1