In a Kubernetes deployment yaml, why do we have to match the template labels to the deployment labels in the selector? - kubernetes

I am new to Kubernetes, so this might be obvious, but in a deployment yaml, why do we have to define the labels in the deployment metadata, then define the same labels in the template metadata, but match those separately in the selector?
Shouldn't it be obvious that the template belongs to the deployment it's under?
Is there a use case for a deployment to have a template it doesn't match?
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-backend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api-backend
  template:
    metadata:
      labels:
        app: api-backend
    spec:
      #...etc
I might be missing some key understanding of k8s or yamls.
I've tried leaving the template with no labels, and it seems to work, but I don't understand why. Maybe Kubernetes is auto-magically inserting the labels.

Technically, the matchLabels parameter decides which Pods belong to the given Deployment (and the underlying ReplicaSet). In practice, I have never seen a Deployment whose Pod template labels differ from its matchLabels. So the reason might be uniformity with other Kubernetes resources (like Service, where the selector makes more sense).
I recommend reading the blog post matchLabels, labels, and selectors explained in detail, for beginners.

Let's simplify labels, selectors and template labels first.
The Labels in the metadata section are assigned to the deployment itself.
The Labels in the .spec.template section are assigned to the pods created by the deployment. These are actually called PodTemplate labels.
The selector is what ties them together: it is used to identify the Pods whose labels match the .spec.selector.matchLabels section, i.e. the Pods the Deployment should manage.
Now, it is not mandatory to repeat all the PodTemplate labels in the matchLabels section: a Pod can have many labels, and matching just the ones listed in matchLabels is enough to identify it. Here's a use case to understand why it has to be used:
"Let's say you have deployment X, which creates two pods with the label nginx-pods and the image nginx, and another deployment Y, which applies to pods with the same label nginx-pods but uses the image nginx:alpine. If deployment X is running and you then run deployment Y, it will not create new pods; instead, it will replace the existing pods with the nginx:alpine image. Both deployments will identify the pods as theirs, because the labels on the pods match the labels in both deployments' .spec.selector.matchLabels."

Because Deployment.metadata.labels belong to the Deployment resource itself, while Deployment.spec.template.metadata.labels belong to the Pods which are handled by the Deployment controller. The Deployment controller knows which Pods belong to which Deployment based on the labels on the Pod resources.
This is why you have to specify the labels this way.
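For example, with the api-backend Deployment above, you can list the Pods the controller considers its own by querying with the same selector:
kubectl get pods -l app=api-backend --show-labels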

Related

service selector vs deployment selector matchlabels

I understand that services use a selector to identify which pods to route traffic to by their labels.
apiVersion: v1
kind: Service
metadata:
  name: svc
spec:
  ports:
    - name: tcp
      protocol: TCP
      port: 443
      targetPort: 443
  selector:
    app: nginx
That's all well and good.
Now, what is the difference between this selector and the spec.selector of the Deployment? I understand that it is used so that the Deployment can match and manage its Pods.
What I don't understand, however, is why I need the extra matchLabels declaration and can't just do it like in the Service. What's the use of this semantically?
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
Thanks in advance
In the Service's spec.selector, you can identify the Pods to route traffic to only by their labels.
In the Deployment's spec.selector, on the other hand, you have two ways to express which Pods the Deployment (via its ReplicaSet) should manage: matchExpressions and matchLabels.
How Deployment uses spec.selector
When a Deployment's Pod template is changed, a new ReplicaSet is created. The ReplicaSet is responsible for managing the Pods, and it uses spec.selector to know which Pods it should manage.
Example:
If replicas: 1 is changed in the Deployment to e.g. replicas: 2, the ReplicaSet's desired state becomes replicas: 2. It observes the Pods using spec.selector to find those with matching labels; initially it only sees 1 such Pod, so it is responsible for creating one additional Pod from the template in the Deployment.
Selector syntax
There are two ways to declare the label requirements under spec.selector in a Deployment:
matchLabels - you declare the labels
matchExpressions - you write an expression for labels
See kubectl explain deployment.spec.selector for full explanation of spec.selector alternatives.
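As a quick illustration, the same selection can be written with either form (and both can be combined, in which case all requirements are ANDed):
# equality-based form
selector:
  matchLabels:
    app: nginx
# equivalent set-based form
selector:
  matchExpressions:
    - key: app
      operator: In
      values: ["nginx"]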
Labels and Selectors
Labels and selectors are a generic concept in Kubernetes and are used in multiple places. Another example is how you can filter which resources you want to see or use with kubectl. E.g. you can select the Pods for an app with:
kubectl get pods -l app=myappname
(if your Pods are labelled with app: myappname).
why I need the extra matchLabels declaration and can't just do it like in the Service. What's the use of this semantically?
Because the Service spec only supports equality-based selectors, and the Deployment is a newer resource that supports two syntaxes (equality-based and set-based).
The API currently supports two types of selectors: equality-based and set-based. A label selector can be made of multiple requirements which are comma-separated. In the case of multiple requirements, all must be satisfied so the comma separator acts as a logical AND (&&) operator.
Reference
The Service spec uses just the "equality-based" label selector syntax.
Newer resources, such as Job, Deployment, ReplicaSet, and DaemonSet, support set-based requirements...
Reference
My understanding is that earlier the only supported syntax was the equality-based one, like the one in the Service spec, and that now, when the resource you are using supports the newer syntax, you are required to be explicit and use matchLabels or matchExpressions.
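For example, the set-based syntax also lets you select on a set of values from the command line (the label values here are just illustrative):
kubectl get pods -l 'environment in (production, qa)'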

Kubectl get deployments, no resources

I've just started learning Kubernetes. In every tutorial the writer generally uses "kubectl ... deployments" to control the newly created deployments. Now, with those commands (e.g. kubectl get deployments) I always get the response No resources found in default namespace., and I have to use "pods" instead of "deployments" to make things work (which works fine).
Now my question is: what is causing this to happen, and what is the difference between using a deployment and a pod? I set the docker driver when I first started minikube; does that have something to do with it?
First, let's brush up on some terminology.
Pod - It's the basic building block for Kubernetes. It groups one or more containers (such as Docker containers), with shared storage/network, and a specification for how to run the containers.
Deployment - It is a controller which wraps Pod(s) and manages their life cycle, i.e. reconciles the actual state to the desired state. There is one more layer in between Deployment and Pod, which is the ReplicaSet: a ReplicaSet's purpose is to maintain a stable set of replica Pods running at any given time. As such, it is often used to guarantee the availability of a specified number of identical Pods.
Visualization: Deployment → ReplicaSet → Pod(s). (Source: I drew it!)
In your case, here is what might have happened:
Either you created a Pod, not a Deployment. Therefore, when you run kubectl get deployment you don't see any resources. Note that when you create a Deployment, it in turn creates a ReplicaSet for you, which also creates the defined Pods.
Or maybe you created your Deployment in a different namespace. If that's the case, use this command to find your Deployments in that namespace: kubectl get deploy NAME_OF_DEPLOYMENT -n NAME_OF_NAMESPACE
More information to clarify your concepts:
Source
In the manifest below, everything inside spec.template is essentially the Pod manifest you would write if you were creating the Pod manually instead of taking the Deployment route. Now, like I said earlier, in simple terms Deployments are a wrapper around your Pods, so everything outside spec.template is configuration defining how you want to manage (scaling, affinity, etc.) your Pods.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
Deployment is a controller providing a higher-level abstraction on top of Pods and ReplicaSets. A Deployment provides declarative updates for Pods and ReplicaSets. Deployments internally create ReplicaSets, within which the Pods are created.
Use cases of Deployments are documented here.
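For instance, after applying a Deployment like the nginx example above, you can see the whole chain it creates (the file name is hypothetical; the generated names will differ):
kubectl apply -f nginx-deployment.yaml
kubectl get deployments,replicasets,pods -l app=nginx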
One reason for No resources found in default namespace could be that you created the deployment in a specific namespace and not in default namespace.
You can see deployments in a specific namespace or in all namespaces via
kubectl get deploy -n namespacename
kubectl get deploy -A

kubernetes Service select multi labels

I have two StatefulSets named my-sts and my-sts-a, and I want to create a single Service that addresses same-indexed Pods from the two different StatefulSets, like my-sts-0 and my-sts-a-0. But I found the k8s docs say:
Labels selectors for both objects are defined in json or yaml files using maps, and only equality-based requirement selectors are supported
My idea is to give the Pods of the two StatefulSets labels like:
my-sts-0 has label abc:sts-0
my-sts-a-0 has label abc:sts-0
my-sts-1 has label abc:sts-1
my-sts-a-1 has label abc:sts-1
How do I get the index of those Pods so that I can create a label like abc: sts-<index> to achieve this?
Is there any other way?
Kubernetes already gives you a DNS name to select individual StatefulSet pods. Say you have a Service my-sts that matches every pod in the StatefulSet, and the StatefulSet is set up with serviceName: my-sts; then you can access host names my-sts-0.my-sts.namespace.svc.cluster.local and so on.
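For reference, the headless Service backing those DNS names might look roughly like this (a sketch: the app label and the port are assumptions, not something from your manifests):
apiVersion: v1
kind: Service
metadata:
  name: my-sts
spec:
  clusterIP: None            # headless: gives each StatefulSet Pod its own DNS record
  selector:
    app: my-sts              # assumed label shared by the StatefulSet's Pods
  ports:
    - port: 80               # assumed port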
If you specifically want a Service to target one specific Pod, there is also a statefulset.kubernetes.io/pod-name label that gets added automatically, so you can attach to that:
apiVersion: v1
kind: Service
metadata:
  name: my-sts-master
spec:
  selector:
    statefulset.kubernetes.io/pod-name: my-sts-0
  ports: [...]

What is the behavior of Kubernetes if multiple ReplicationControllers have overlapping label selectors?

In Kubernetes, a ReplicationController uses label selectors to specify which Pod(s) it will operate on.
Now, my question is: if multiple ReplicationControllers have overlapping label selectors, what will be the behavior of the Kubernetes cluster? Or is it considered some kind of error and should it be avoided?
For example, I have two ReplicationControllers described as below.
rc1.yaml:
apiVersion: v1
kind: ReplicationController
metadata:
  name: frontend
spec:
  replicas: 3
  selector:
    app: frontend
    team: payment
rc2.yaml:
apiVersion: v1
kind: ReplicationController
metadata:
  name: backend
spec:
  replicas: 2
  selector:
    app: backend
    team: payment
You can see that they both have team: payment in their selectors, but one specifies a replica count of 3 while the other specifies 2.
Any explanations or references will be appreciated. Thanks.
The Kubernetes docs state the following about overlapping selectors (whether they come from other bare Pods, ReplicationControllers, or Jobs):
Also you should not normally create any pods whose labels match this selector, either directly, with another ReplicationController, or with another controller such as Job. If you do so, the ReplicationController thinks that it created the other pods. Kubernetes does not stop you from doing this.
If you do end up with multiple controllers that have overlapping selectors, you will have to manage the deletion yourself
In the docs about Deployments there is another statement that confirms the above:
If you have multiple controllers that have overlapping selectors, the controllers will fight with each other and won’t behave correctly.
So, in summary, you should try to avoid overlapping selectors for ReplicationControllers, ReplicaSets (the next-generation ReplicationController, which I would advise you to use instead), and Deployments, because it will likely lead to severe problems, as confirmed by this issue. Note that in your specific example the selectors do not actually overlap: a Pod must match every key/value pair in a selector, so a Pod labelled app: frontend, team: payment is claimed only by the frontend controller.
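To see which Pods each controller would actually claim, you can query with each selector from the example (all key/value pairs in a selector must match, acting as a logical AND):
kubectl get pods -l app=frontend,team=payment
kubectl get pods -l app=backend,team=payment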

Kubernetes set deployment number of replicas based on namespace

I've split our Kubernetes cluster into two different namespaces, staging and production, aiming to have production deployments run with two replicas (for rolling deployments; autoscaling comes later) and staging run with a single replica.
Other than having one deployment configuration per namespace, I was wondering whether or not we could set the default number of replicas per deployment, per namespace?
When creating the deployment config, if you don't specify the number of replicas, it will default to one. Is there a way of defaulting it to two on the production namespace?
If not, is there a recommended approach for this which will prevent the need to have a deployment config per namespace?
One way of doing this would be to scale the deployment up to two replicas, manually, in the production namespace, once it has been created for the first time, but I would prefer to skip any manual steps.
It is not possible to set a different number of replicas per namespace in one Deployment.
But you can have two different deployment files, one per namespace, i.e. <your-app>-production.yaml and <your-app>-staging.yaml.
In these files you can set any custom values and settings that you need.
For an example:
<your-app>-production.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: <your-app>
  namespace: production #Here is namespace
  ...
spec:
  replicas: 2 #Here is the count of replicas of your application
  selector:
    matchLabels:
      app: <your-app>
  template:
    metadata:
      labels:
        app: <your-app>
    spec:
      containers:
        - name: <your-app-pod-name>
          image: <your-app-image>
          ...
<your-app>-staging.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: <your-app>
  namespace: staging #Here is namespace
  ...
spec:
  replicas: 1 #Here is the count of replicas of your application
  selector:
    matchLabels:
      app: <your-app>
  template:
    metadata:
      labels:
        app: <your-app>
    spec:
      containers:
        - name: <your-app-pod-name>
          image: <your-app-image>
          ...
I don't think you can avoid having two deployments, but you can get rid of the duplicated code by using Helm templates (https://docs.helm.sh/chart_template_guide). Then you can define a single deployment YAML and substitute different values when you deploy, for example with an if statement.
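A rough sketch of what such a template could look like (the environment value and chart layout are assumptions):
# templates/deployment.yaml (excerpt)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
spec:
  {{- if eq .Values.environment "production" }}
  replicas: 2
  {{- else }}
  replicas: 1
  {{- end }}
  ...
You would then install the same chart into each namespace, passing a different environment value (e.g. via --set environment=production or a per-environment values file).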
When creating the deployment config, if you don't specify the number of replicas, it will default to one. Is there a way of defaulting it to two on the production namespace?
Actually, there are two ways to do it, but both of them involve coding.
Admission Controllers:
This is the recommended way of assigning default values to fields.
While creating objects in Kubernetes, the request passes through a chain of admission controllers, and one of them is the MutatingAdmissionWebhook.
MutatingAdmissionWebhook has been in beta since v1.9. This admission controller modifies (mutates) the object before it is actually created (or modified/deleted), for example assigning default values to some fields and similar tasks. You can enforce a minimum replica count here.
You have to implement an admission webhook server that receives requests from Kubernetes and returns the modified object in its response.
Here is a sample admission server implemented by OpenShift: kubernetes-namespace-reservation.
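To give an idea of the registration side, a MutatingWebhookConfiguration that only intercepts Deployment creation in the production namespace might look roughly like this (a sketch only: the service name, path, and namespace label are assumptions, and you still need the webhook server itself plus its TLS certificates):
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: default-replicas-webhook
webhooks:
  - name: default-replicas.example.com
    admissionReviewVersions: ["v1"]
    sideEffects: None
    rules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["deployments"]
    namespaceSelector:
      matchLabels:
        environment: production        # assumed label on the production namespace
    clientConfig:
      service:
        name: default-replicas-webhook # assumed Service in front of your webhook server
        namespace: webhook-system      # assumed namespace
        path: /mutate                  # assumed path
      # caBundle: <base64-encoded CA certificate for the webhook's serving cert>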
Deployment Controller:
This is comparatively easier, but it is more of a hack on the deployment procedure.
You can write a custom controller which watches Deployments; whenever a Deployment is created, it can update it with whatever minimum values you wish.
You can see the official Sample Pod Controller.
If both of these seem like a lot of work, it is better to just set the fields carefully each time for each deployment.