What is the difference between ReplicaSet and ReplicationController? - deployment

From what I can tell in the documentation, a ReplicaSet is created when running a Deployment. It seems to support some of the same features as a ReplicationController (scale up/down and automatic restart), but it's not clear whether it supports rolling upgrades or autoscaling.
The v1.1.8 user guide shows how to create a deployment in Deploying Applications (which automatically creates a ReplicaSet), yet the kubectl get replicasets command is not available until v1.2.0. I cannot find any other information about ReplicaSet in the documentation.
Will ReplicaSet eventually replace ReplicationController? Why would I want to use Deployment and ReplicaSet instead of ReplicationController?

Replica Set is the next generation of Replication Controller. The Replication Controller is somewhat imperative, whereas Replica Sets try to be as declarative as possible.
1. The main difference between a Replica Set and a Replication Controller right now is the selector support (see the example after the table below).
+--------------------------------------------------+-----------------------------------------------------+
| Replica Set                                      | Replication Controller                              |
+--------------------------------------------------+-----------------------------------------------------+
| Replica Set supports the new set-based selector. | Replication Controller only supports equality-based |
| This gives more flexibility, e.g.:               | selectors, e.g.:                                    |
| environment in (production, qa)                  | environment = production                            |
| This selects all resources with key equal to     | This selects all resources with key equal to        |
| environment and value equal to production or qa. | environment and value equal to production.          |
+--------------------------------------------------+-----------------------------------------------------+
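For illustration, here is roughly how the two selector styles look in a manifest (a minimal sketch; the environment label key and its values are just examples):

# Replication Controller: equality-based selector
selector:
  environment: production

# Replica Set: set-based selector
selector:
  matchExpressions:
    - key: environment
      operator: In
      values:
        - production
        - qa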
2. The second difference is how the pods are updated.
+--------------------------------------------------+-------------------------------------------------+
| Replica Set                                      | Replication Controller                          |
+--------------------------------------------------+-------------------------------------------------+
| The rollout command is used for updating a       | The rolling-update command is used for updating |
| replica set. Even though a replica set can be    | the replication controller. It replaces the     |
| used independently, it is best used along with   | specified replication controller with a new one |
| deployments, which make updates declarative.     | by updating one pod at a time to use the new    |
|                                                  | PodTemplate.                                    |
+--------------------------------------------------+-------------------------------------------------+
These are the two things that differentiate an RS from an RC. Deployments with RS are widely used as they are more declarative.
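For completeness, a minimal Deployment that manages a Replica Set for you might look like the sketch below (the my-app name and image are illustrative); changing the image tag and re-applying the manifest performs a rolling update:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:1.0   # illustrative image; change the tag and re-apply to roll out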

For now, the difference should be insignificant in most cases. ReplicaSet has a generalized label selector: https://github.com/kubernetes/kubernetes/issues/341#issuecomment-140809259. It should support all the features the replication controller supports.
Will ReplicaSet eventually replace ReplicationController? Why would I want to use Deployment and ReplicaSet instead of ReplicationController?
This boils down to rolling update vs deployment. Please read the docs on deployments to understand the difference: http://kubernetes.io/docs/user-guide/deployments/. In short, if you start a rolling update and close your laptop, your replicas end up with some mix of intermediate image versions. If you create a deployment and close your laptop, the deployment either gets POSTed successfully to the apiserver, in which case it works server side, or it doesn't, in which case all your replicas are still on the old version.
The bad thing is that nearly all current documentation is about ReplicationControllers.
Agreed, most docs are being updated. Unfortunately, docs on the internet are harder to update than the ones on GitHub.

The functionality of both the Replication Controller and the Replica Set is quite the same - they are responsible for making sure that X pods with labels matching their label selector are scheduled across the nodes of the cluster.
(Where X is the value specified in the spec.replicas field of the Replication Controller / Replica Set yaml.)
ReplicaSet is a replacement for the Replication Controller and supports richer expressions for the label selector.
You can choose between four operators: In, NotIn, Exists, DoesNotExist - see Set-based requirement.
A rule of thumb: when you see the Replication Controller mentioned in docs or other tutorials, think ReplicaSet AND consider using a Deployment instead.
There is also a small difference in the syntax between the Replication Controller:
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    app: nginx
And the ReplicaSet, which contains a matchLabels field under the selector:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels: # <-- This was added
      tier: nginx


Helm --force option

I read in a book written by the Helm creators the following about the --force option:
Sometimes, though, Helm users want to make sure that the pods are restarted. That’s
where the --force flag comes in. Instead of modifying the Deployment (or similar
object), it will delete and re-create it. This forces Kubernetes to delete the old pods
and create new ones.
What I understand from that is: if I install a chart, then change the number of replicas (= number of pods) and upgrade the chart, it should recreate all the pods. That is not what happens in my case, and I want to understand what I am missing here.
Let's take a hypothetical minimal Deployment (many required details omitted):
spec:
  replicas: 3
  template:
    spec:
      containers:
        - image: abc:123
and you change this to only increase the replica count
spec:
  replicas: 5 # <-- this is the only change
  template:
    spec:
      containers:
        - image: abc:123
The Kubernetes Deployment controller looks at this change and says "I already have 3 Pods running abc:123; if I leave those alone, and start 2 more, then I will have 5, and the system will look like what the Deployment spec requests". So absent any change to the embedded Pod spec, the existing Pods will be left alone and the cluster will just scale up.
deployment-12345-aaaaa        deployment-12345-aaaaa
deployment-12345-bbbbb        deployment-12345-bbbbb
deployment-12345-ccccc  --->  deployment-12345-ccccc
                              deployment-12345-ddddd
                              deployment-12345-eeeee
     (replicas: 3)                 (replicas: 5)
Usually this is fine, since you're running the same image version and the same code. If you do need to forcibly restart things, I'd suggest using kubectl rollout restart deployment/its-name rather than trying to convince Helm to do it.
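For contrast, a change that does touch the embedded pod template replaces the pods, because the Deployment controller creates a new ReplicaSet for the new template; a sketch (the abc:456 tag is hypothetical):

spec:
  replicas: 5
  template:
    spec:
      containers:
        - image: abc:456   # <-- pod template changed; old pods are replaced in a rolling update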

If a Deployment in K8s creates replicas and a ReplicaSet for you, what is the point of creating a ReplicationController?

The Kubernetes website mentions:
The following is an example of a Deployment. It creates a ReplicaSet
to bring up three nginx Pods:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
If it creates a ReplicaSet for you, then why would we create a ReplicationController? Is that not duplicating something that already exists?
I could just be confused as I'm still learning.
ReplicaSet is the updated version of the ReplicationController and is considered its replacement. The key difference between the ReplicaSet and the ReplicationController is that the ReplicationController only supports equality-based selectors, whereas the ReplicaSet also supports set-based selectors.
As per the k8s docs: ReplicaSet is the next-generation ReplicationController that supports the new set-based label selector. It's mainly used by Deployment as a mechanism to orchestrate pod creation, deletion and updates. Note that we recommend using Deployments instead of directly using Replica Sets, unless you require custom update orchestration or don't require updates at all.

Kubernetes Controllers - ReplicaSet vs Replication Controllers etc

I'm trying to understand basic Kubernetes concepts, but its documentation is a bit confusing to me.
For example, the Replication Controller is mentioned in the kube-controller-manager.
At the same time, Kubernetes Concepts page says about ReplicaSet object.
And only after some googling I found this post on Medium:
Replication Controllers perform the same function as ReplicaSets, but Replication Controllers are old school. ReplicaSets are the smart way to manage replicated Pods in 2019.
And this is not mentioned anywhere in the official docs.
Can somebody please explain to me about Endpoints and Namespace Controllers?
Are they still "valid" Controllers - or they are also outdated/replaced by some other controller/s?
Replication Controller vs. Replica Set
The functionality of both the Replication Controller and the Replica Set is quite the same - they are responsible for making sure that X pods with labels matching their label selector are scheduled across the nodes of the cluster.
(Where X is the value specified in the spec.replicas field of the Replication Controller / Replica Set yaml.)
ReplicaSet is a replacement for the Replication Controller and supports richer expressions for the label selector.
You can choose between four operators: In, NotIn, Exists, DoesNotExist - see Set-based requirement.
A rule of thumb: when you see the Replication Controller mentioned in docs or other tutorials, think ReplicaSet AND consider using a Deployment instead.
Regarding Endpoints and Namespace Controllers
The K8S control plane contains multiple controllers - each controller watches the desired state of the resource it is responsible for (Pods, Endpoints, Namespaces, etc.) via an infinite control loop - also called a reconciliation loop.
When a change is made to the desired state (by external client like kubectl) the reconciliation loop detects this and attempts to mutate the existing state in order to match the desired state.
For example, if you increase the value of the replicas field from 3 to 4, the ReplicaSet controller would see that one new instance needs to be created and will make sure it is scheduled in one of the nodes on the cluster. This reconciliation process applies to any modified property of the pod template.
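In manifest terms, that change is nothing more than (a sketch):

spec:
  replicas: 4   # was 3; the ReplicaSet controller schedules one additional pod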
K8S supports the following controllers (at least those which I'm familiar with):
1 ) ReplicaSet controller.
2 ) DaemonSet controller.
3 ) Job controller.
4 ) Deployment controller.
5 ) StatefulSet controller.
6 ) Service controller.
7 ) Node controller.
8 ) Endpoints controller. # <---- Yes - it's a valid controller.
9 ) Namespace controller. # <---- Yes - it's a valid controller.
10 ) ServiceAccount controller.
11 ) PersistentVolume controller.
12 ) More?
All of them reside in the control plane under a parent unit called the 'Controller Manager'.
Additional point
There is also a small difference in the syntax between the Replication Controller:
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    app: nginx
And the ReplicaSet, which contains a matchLabels field under the selector:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels: # <-- This was added
      tier: nginx
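Beyond matchLabels, a ReplicaSet selector can also use matchExpressions with the set-based operators mentioned above (a sketch; the label keys and values are illustrative):

selector:
  matchExpressions:
    - key: tier
      operator: In
      values:
        - frontend
        - backend
    - key: environment
      operator: NotIn
      values:
        - dev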
Replication controllers are deprecated and no longer recommended. Use ReplicaSets instead.
With a ReplicaSet you define the number of replicas you want to run for a specific application or service, and that many replicas will be kept running in the Kubernetes cluster at any point in time. This is taken care of by the ReplicaSet controller.
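A complete minimal ReplicaSet might look like the following (the frontend name and image are illustrative); the ReplicaSet controller then keeps three matching pods running at all times:

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: frontend
          image: nginx:1.14.2   # illustrative image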

GitLab Auto DevOps: how to always keep one pod alive

I'm using GitLab Auto DevOps to deploy an app on my Kubernetes cluster. That app should always have only one instance running.
The problem is that during the update process, Helm kills the currently running pod before the new pod is ready. This causes a period of downtime when the old version is already killed and the new one isn't ready yet. To make it worse, the app needs significant time to start (2+ minutes).
I have tried setting minAvailable: 1 in a PodDisruptionBudget, but it did not help.
Any idea how I can tell Helm to wait for the readiness of the updated pod before killing the old one? (Having 2 instances running simultaneously for several seconds is not a problem for me.)
You can release a new application version in a few ways; it's necessary to choose the one that fits your needs.
I would recommend one of the following:
Ramped - slow rollout
A ramped deployment updates pods in a rolling-update fashion: a secondary ReplicaSet is created with the new version of the application, then the number of replicas of the old version is decreased and the number of the new version is increased until the correct number of replicas is reached.
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2        # how many pods we can add at a time
      maxUnavailable: 0  # how many pods can be unavailable during the rolling update
Full example and steps can be found here.
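Note that with maxUnavailable: 0 the old pod is only removed once the new one reports Ready, so for an app that needs 2+ minutes to start it also helps to give the pod template a readiness probe; a minimal sketch (the /healthz path, port and timings are assumptions):

spec:
  template:
    spec:
      containers:
        - name: app                   # illustrative container name
          readinessProbe:
            httpGet:
              path: /healthz          # hypothetical health endpoint
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            failureThreshold: 30      # tolerate a slow (2+ minute) startup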
Blue/Green - best to avoid API versioning issues
A blue/green deployment differs from a ramped deployment because the “green” version of the application is deployed alongside the “blue” version. After testing that the new version meets the requirements, we update the Kubernetes Service object that plays the role of load balancer to send traffic to the new version by replacing the version label in the selector field.
apiVersion: v1
kind: Service
metadata:
  name: my-app
  labels:
    app: my-app
spec:
  type: NodePort
  ports:
    - name: http
      port: 8080
      targetPort: 8080
  # Note here that we match both the app and the version.
  # When switching traffic, we update the label "version" with
  # the appropriate value, i.e. v2.0.0
  selector:
    app: my-app
    version: v1.0.0
Full example and steps can be found here.
Canary - for testing
A canary deployment consists of routing a subset of users to a new functionality. In Kubernetes, a canary deployment can be done using two Deployments with common pod labels. One replica of the new version is released alongside the old version. Then after some time and if no error is detected, scale up the number of replicas of the new version and delete the old deployment.
Using this ReplicaSet technique requires spinning-up as many pods as necessary to get the right percentage of traffic. That said, if you want to send 1% of traffic to version B, you need to have one pod running with version B and 99 pods running with version A. This can be pretty inconvenient to manage so if you are looking for a better managed traffic distribution, look at load balancers such as HAProxy or service meshes like Linkerd, which provide greater controls over traffic.
Manifest for version A:
spec:
  replicas: 3
Manifest for version B:
spec:
  replicas: 1
Full example and steps can be found here.
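To make the common-label idea concrete, the two Deployments could label their pods like this (a sketch; names and version values are illustrative), while the Service selects only app: my-app so traffic is split roughly by replica count:

# Version A (stable)
metadata:
  name: my-app-v1
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: my-app          # common label, selected by the Service
        version: v1.0.0

# Version B (canary)
metadata:
  name: my-app-v2
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: my-app          # same common label
        version: v2.0.0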
You can also play with Interactive Tutorial - Updating Your App on Kubernetes.
I recommend reading Deploy, Scale And Upgrade An Application On Kubernetes With Helm.

Kubernetes set deployment number of replicas based on namespace

I've split our Kubernetes cluster into two different namespaces, staging and production, aiming for production deployments to have two replicas (for rolling deployments; autoscaling comes later) and staging to have a single replica.
Other than having one deployment configuration per namespace, I was wondering whether or not we could set the default number of replicas per deployment, per namespace?
When creating the deployment config, if you don't specify the number of replicas, it will default to one. Is there a way of defaulting it to two on the production namespace?
If not, is there a recommended approach for this which will prevent the need to have a deployment config per namespace?
One way of doing this would be to scale the deployment up to two replicas, manually, in the production namespace, once it has been created for the first time, but I would prefer to skip any manual steps.
It is not possible to set a different number of replicas per namespace in one deployment.
But you can have two different deployment files, one per namespace, i.e. <your-app>-production.yaml and <your-app>-staging.yaml.
In these files you can set any custom values and settings that you need.
For example:
<your-app>-production.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: <your-app>
  namespace: production # Here is the namespace
  ...
spec:
  replicas: 2 # Here is the count of replicas of your application
  template:
    spec:
      containers:
        - name: <your-app-pod-name>
          image: <your-app-image>
          ...
<your-app>-staging.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: <your-app>
  namespace: staging # Here is the namespace
  ...
spec:
  replicas: 1 # Here is the count of replicas of your application
  template:
    spec:
      containers:
        - name: <your-app-pod-name>
          image: <your-app-image>
          ...
I don't think you can avoid having two deployments, but you can get rid of the duplicated code by using helm templates (https://docs.helm.sh/chart_template_guide). Then you can define a single deployment yaml and substitute different values when you deploy with an if statement.
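A minimal sketch of that approach (here substituting a value directly rather than using an if; the replicaCount key and the values file names are assumptions), passing the matching values file with -f when deploying to each namespace:

# templates/deployment.yaml (fragment)
spec:
  replicas: {{ .Values.replicaCount }}

# values-production.yaml
replicaCount: 2

# values-staging.yaml
replicaCount: 1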
When creating the deployment config, if you don't specify the number of replicas, it will default to one. Is there a way of defaulting it to two on the production namespace?
Actually, there are two ways to do it, but both of them involve coding.
Admission Controllers:
This is the recommended way of assigning default values to fields.
When an object is created in Kubernetes, it passes through several admission controllers, and one of them is the MutatingWebhook.
The MutatingWebhook has been in beta since v1.9. This admission controller modifies (mutates) the object before it is actually created (or modified/deleted), for example by assigning default values to some fields and similar tasks. You can set the minimum number of replicas here.
You have to implement an admission server that receives requests from Kubernetes and returns the modified object in its response.
Here is a sample admission server implemented by OpenShift: kubernetes-namespace-reservation.
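For orientation, such a webhook is registered with a MutatingWebhookConfiguration object; a rough sketch (the webhook and service names, namespace, and path are assumptions, and the CA bundle is omitted):

apiVersion: admissionregistration.k8s.io/v1beta1
kind: MutatingWebhookConfiguration
metadata:
  name: default-replicas-webhook          # hypothetical name
webhooks:
  - name: default-replicas.example.com    # hypothetical webhook name
    rules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["deployments"]
    clientConfig:
      service:
        name: replica-defaulter           # hypothetical Service in front of the admission server
        namespace: kube-system
        path: /mutate
    failurePolicy: Ignore                 # do not block deployments if the webhook is down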
Deployment Controller:
This is comparatively easier, but it is kind of a hack on the deployment procedure.
You can write a Deployment controller which watches for Deployments, and whenever one is created it performs some task. Here, you can update the Deployment with whatever minimum values you wish.
You can see the official Sample Pod Controller.
If both of these seem like a lot of work, it is better to simply set the fields carefully each time for each deployment.