What is the purpose of a Kubernetes Deployment pod selector?

I fail to see why Kubernetes needs a pod selector in a Deployment that can only contain one pod template. Can someone explain why the Kubernetes engineers introduced a selector inside the Deployment definition instead of automatically selecting the pods from the template?
---
apiVersion: v1
kind: Service
metadata:
  name: grpc-service
spec:
  type: LoadBalancer
  ports:
  - name: grpc
    port: 8080
    targetPort: 8080
    protocol: TCP
  selector:
    app: grpc-test
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: grpc-deployment
spec:
  replicas: 1
  revisionHistoryLimit: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
  selector:
    matchLabels:
      app: grpc-test
  template:
    metadata:
      labels:
        app: grpc-test
    spec:
      containers:
      ...
Why not simply define something like this?
---
apiVersion: v1
kind: Service
metadata:
  name: grpc-service
spec:
  type: LoadBalancer
  ports:
  - name: grpc
    port: 8080
    targetPort: 8080
    protocol: TCP
  selector:
    app: grpc-test
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: grpc-deployment
spec:
  replicas: 1
  revisionHistoryLimit: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: grpc-test
    spec:
      containers:
      ...

Ah! Funnily enough, I once tried wrapping my head around label selectors as well. So, here it goes...
First of all, what the hell are these labels used for? Labels are Kubernetes' core means of identifying objects. A controller controls pods based on their labels instead of their names. In this particular case, they identify the pods belonging to the Deployment's ReplicaSet.
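To make the idea concrete, here is a minimal Python sketch of equality-based label matching, which is roughly what a matchLabels selector does. The function name `matches` and the sample pods are made up for illustration; this is not Kubernetes code.

```python
def matches(selector, labels):
    """True if every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical pods: name -> labels
pods = {
    "grpc-deployment-abc": {"app": "grpc-test", "pod-template-hash": "abc"},
    "other-xyz": {"app": "other"},
}

selector = {"app": "grpc-test"}
selected = [name for name, labels in pods.items() if matches(selector, labels)]
print(selected)
```

Extra labels on a pod do not prevent a match; only the keys named in the selector are compared.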
You actually didn't have to explicitly define .spec.selector when using extensions/v1beta1. In that case it would default to .spec.template.metadata.labels. However, if you rely on the default, you can run into problems with kubectl apply once one or more of the labels used for selecting change. kubectl apply looks at the kubectl.kubernetes.io/last-applied-configuration annotation when comparing changes, and that annotation only contains what the user supplied when creating the resource, not the defaulted fields. Because the diff cannot account for the defaulted selector, you'll get an error like:
spec.template.metadata.labels: Invalid value: {"app":"nginx"}: `selector` does not match template `labels`
As you can see, this is a pretty big shortcoming: you cannot change any of the labels that are used as selector labels without breaking your deployment flow. It was “fixed” in apps/v1beta2 by requiring selectors to be defined explicitly and disallowing mutation of that field.
So in your example, you actually don't have to define them! The creation will work and will use your .spec.template.metadata.labels by default. But yeah, once you move to apps/v1beta2 (or later apps/v1), the field is mandatory. I hope this kind of answers your question and I didn't make it any more confusing ;)
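The failure mode described above can be modelled in a few lines of Python. This is only a sketch under simplifying assumptions: the spec is flattened into a dict with a hypothetical `template_labels` key, and `default_selector` and `validate` are invented names standing in for the server-side defaulting and validation steps.

```python
def default_selector(spec):
    """On create, fill in a missing selector from the template labels."""
    spec = dict(spec)
    if "selector" not in spec:
        spec["selector"] = dict(spec["template_labels"])
    return spec


def validate(spec):
    """Reject a spec whose selector is not satisfied by its own template labels."""
    labels = spec["template_labels"]
    if any(labels.get(k) != v for k, v in spec["selector"].items()):
        raise ValueError("`selector` does not match template `labels`")


# Create without a selector: the selector is defaulted from the template.
stored = default_selector({"template_labels": {"app": "nginx"}})

# Later apply: the manifest only changes the template labels, so the
# stored (defaulted) selector is left untouched by the patch.
updated = dict(stored, template_labels={"app": "nginx-v2"})

try:
    validate(updated)
    error = None
except ValueError as exc:
    error = str(exc)
print(error)
```

Writing the selector explicitly in the manifest makes it part of what the client sends, so it no longer depends on a defaulted value that the client never saw.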

“However, if you don't, you can run into problems with kubectl apply once one or more of the labels that are used for selecting change, because kubectl apply will look at kubectl.kubernetes.io/last-applied-configuration when comparing changes, and that annotation will only contain the user input from when the resource was created and none of the defaulted fields.”
(Quoting from Toon's answer.)
My interpretation is that it's not logically necessary at all. It is only needed because of a limitation in the current implementation of Kubernetes: the mechanism it uses to compare two versions of a Deployment (or any object) does not take defaulted values into account.

It is a way to decouple the ReplicaSet from the pod template. There are many similar answers here, but the crux is that a Deployment/ReplicaSet may be changed at some future point, and it won't know what the selector was for the previous revision. It would have to look at the previous revision's template.metadata.labels and apply those pod labels as the current revision's selector. But wait! What if template.metadata.labels changes in the current revision? How do you reconcile two sets of template.metadata.labels if the new spec no longer includes the label(s) from which matchLabels was inferred in the prior revision?
Consider inferred matchLabels:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: grpc-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: grpc-test
    spec:
      containers:
      ...
Now if I were to revise this deployment, my client side has no awareness of the inferred matchLabels, so my changes would need to account for the existing pods. The server side could do some magic to assume the context in a diff, but what if I changed my template.metadata.labels:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: grpc-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: grpc-test-new
    spec:
      containers:
      ...
Now my Deployment would need to infer a selector from the new template.metadata.labels and also merge it with the selector that exists server-side; otherwise you end up orphaning a bunch of pods.
I hope this helps illustrate a scenario where explicitly defining the selector allows you to be more flexible in your template updates while still retaining the revision history of previous selectors.
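The orphaning scenario can be illustrated with a small Python sketch. The pod names and the `select` helper are hypothetical; the two selectors are the ones that would be inferred from the two manifests above.

```python
def select(selector, pods):
    """Return the sorted names of pods whose labels satisfy the selector."""
    return sorted(
        name
        for name, labels in pods.items()
        if all(labels.get(k) == v for k, v in selector.items())
    )


# Pods created by the first revision (names are made up).
running = {
    "grpc-1": {"app": "grpc-test"},
    "grpc-2": {"app": "grpc-test"},
}

old_inferred = {"app": "grpc-test"}      # inferred from the first template
new_inferred = {"app": "grpc-test-new"}  # inferred from the revised template

owned_before = select(old_inferred, running)
owned_after = select(new_inferred, running)
orphaned = sorted(set(owned_before) - set(owned_after))
print(orphaned)
```

With a purely inferred selector, the revised Deployment no longer selects any of the running pods, so nothing would clean them up.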

As far as I know, the selector in the Deployment is an optional property (at least in extensions/v1beta1, which your example uses).
The template is the only required field of spec.
So you don't need to use the label selector in the Deployment, and in your example I don't see why you couldn't use the latter manifest.

Deployments are dynamic objects: for example, your system may need to scale up and add more Pods. The template section only defines the Pods that this Deployment creates when you do kubectl apply, while the selector section ensures that the Pods newly created by scaling up are still managed by the already existing Deployment.
Generally speaking, a Deployment continuously watches the Pods and uses the selector section to see whether there are any Pods it should control.
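As a rough illustration of that watch-and-compare behaviour, here is a minimal Python sketch of one reconciliation step. It is a simplification: the real controller works through ReplicaSets and also tracks ownership, and the `reconcile` function is an invented name.

```python
def reconcile(desired_replicas, selector, pods):
    """Return how many pods to create (positive) or delete (negative)."""
    owned = [
        labels
        for labels in pods
        if all(labels.get(k) == v for k, v in selector.items())
    ]
    return desired_replicas - len(owned)


# Labels of the pods currently in the namespace (hypothetical).
pods = [{"app": "grpc-test"}, {"app": "grpc-test"}, {"app": "other"}]
selector = {"app": "grpc-test"}

scale_up = reconcile(3, selector, pods)    # one more pod is needed
scale_down = reconcile(1, selector, pods)  # one pod too many
print(scale_up, scale_down)
```

The selector is what decides which pods count towards the desired replica number.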

Related

How to (re-)name a pod in a K8s deployment?

I want to deploy two containers in a pod through a deployment, but I want the pod to have exactly the name yoda. In my case, a random string is always appended after yoda, like yoda-f8bcb7bf4-khml6. Is it possible to force the pod name? I tried the following but did not get what I expected.
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: yoda
  name: yoda
spec:
  replicas: 1
  selector:
    matchLabels:
      app: yoda
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      name: yoda
      labels:
        app: yoda
    spec:
      containers:
      - image: busybox
        name: anakin
        resources: {}
      - image: nginx
        name: obiwan
        resources: {}
status: {}
Regards,
Benoît
This may not be the answer you expect, but in Kubernetes pods should not be seen as pets, i.e. they should not receive a lot of individual attention but should be considered highly replaceable. The name generation is part of this philosophy, and among other things it avoids naming conflicts.
Almost everything in Kubernetes involves some kind of decoupling, including container rollouts. If a pod always received the same name, it would cut itself off from things like rolling deployment strategies, in which one pod terminates while another spawns; with a fixed name, the two pods would conflict.
Without a deeper discussion of why the pod should be maintained by hand, I am not sure you will find a proper solution.
To give some perspective:
Labels (which you already use) are a good way to select a certain pod. If you update the deployment with a different image, there might be two pods selectable with your yoda label.
So, if you want to select either the older or the newer pod (but not both), adding another label with the respective version could solve the problem of telling them apart (if that is what you want). See the template metadata section below.
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: yoda
  name: yoda
spec:
  replicas: 1
  selector:
    matchLabels:
      app: yoda
  strategy: {}
  template:
    metadata:
      name: yoda
      labels:
        app: yoda
        app.version: 2.0.0
    spec:
      containers:
      - image: busybox
        name: anakin
        resources: {}
      - image: nginx
        name: obiwan
        resources: {}
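To show how the extra version label narrows the selection, here is a small Python sketch. The pod names are shortened placeholders, and the older version value 1.0.0 is an assumption made only for the example.

```python
def select(selector, pods):
    """Return the sorted names of pods whose labels satisfy the selector."""
    return sorted(
        name
        for name, labels in pods.items()
        if all(labels.get(k) == v for k, v in selector.items())
    )


# Two pods that coexist during a rollout (names and 1.0.0 are hypothetical).
pods = {
    "yoda-old": {"app": "yoda", "app.version": "1.0.0"},
    "yoda-new": {"app": "yoda", "app.version": "2.0.0"},
}

by_app = select({"app": "yoda"}, pods)
by_version = select({"app": "yoda", "app.version": "2.0.0"}, pods)
print(by_app, by_version)
```

On a cluster, the equivalent query would be something like kubectl get pods -l app=yoda,app.version=2.0.0.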
I hope this helps.
I am not sure whether a StatefulSet can solve your issue, but a StatefulSet always retains its Pod names. However, it appends an ordinal number to the name, starting from 0 and counting up to one less than the number of replicas you define in the YAML file.
For example, if you set the replica count to 3 in the StatefulSet definition, the Pods will be named as follows:
[podName]-0
[podName]-1
[podName]-2
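The naming pattern is simple enough to sketch in Python. The function name is invented, and yoda is used as the base name only because it is the name from the question.

```python
def statefulset_pod_names(name, replicas):
    """Pod names follow the pattern <name>-<ordinal>, with ordinals from 0."""
    return [f"{name}-{ordinal}" for ordinal in range(replicas)]


names = statefulset_pod_names("yoda", 3)
print(names)
```

So even with a single replica the Pod would be called yoda-0 rather than exactly yoda.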

Kubernetes deployment/service specification app vs run label

I am trying to connect a k8s deployment to an Oracle DB deployment/service. Here are my DB deployment and service:
apiVersion: v1
kind: Service
metadata:
  name: oracle-db
  labels:
    app: oracle-db
spec:
  ports:
  - name: oracle-db
    port: 1521
    protocol: TCP
    targetPort: 1521
  selector:
    app: oracle-db
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: oracle-db-depl
  labels:
    app: oracle-db
spec:
  selector:
    matchLabels:
      app: oracle-db
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: oracle-db
    spec:
      containers:
      - name: oracle-db
        image: oracledb:latest
        imagePullPolicy: Always
        ports:
        - containerPort: 1521
        env:
        ...
I'm wondering, in the labels sections, what exactly is the difference between specifying 'run' and 'app' (both of which I have seen used)? I have scoured the k8s documentation and cannot find an answer.
Labels are arbitrary key-value pairs. app and run have no special meaning; you can choose any key and value for your labels. One thing to remember, though, is that the Service's selector must match the labels on the Pods created by the Deployment (the labels in the Deployment's Pod template); otherwise it will not work.
So if your Deployment's Pod template has the label app: oracle-db, use app: oracle-db in the Service's selector, and if it has run: oracle-db, use run: oracle-db in the Service's selector.
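A quick way to see that only consistency matters, not the key name, is a Python sketch that reports which selector keys are not satisfied by the Pod labels. The helper name `unmatched_keys` is made up for this example.

```python
def unmatched_keys(service_selector, pod_labels):
    """Selector keys whose value is missing or different on the pod."""
    return sorted(
        key
        for key, value in service_selector.items()
        if pod_labels.get(key) != value
    )


pod_labels = {"app": "oracle-db"}  # from the Deployment's Pod template

same_key = unmatched_keys({"app": "oracle-db"}, pod_labels)
other_key = unmatched_keys({"run": "oracle-db"}, pod_labels)
print(same_key, other_key)
```

An empty result means the Service would select the Pod; a non-empty one names the keys to fix on either side.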
Actually, the only difference between run and app is the name. Labels are used to identify objects in Kubernetes, and you can use any key you like, not necessarily app or run.
You probably see run a lot online because when you create an object with the imperative kubectl run command, a run label is added automatically.
Of course you can change this to a key/value pair that makes more sense to you.
According to k8s documentation:
Labels are intended to be used to specify identifying attributes of objects that are meaningful and relevant to users, but do not directly imply semantics to the core system.
Labels can be used to organize and to select subsets of objects.
Labels can be attached to objects at creation time and subsequently added and modified at any time. Each object can have a set of key/value labels defined.
Each key must be unique for a given object.

Kubernetes - can a Deployment contain a Service?

I just finished reading Nigel Poulton's The Kubernetes Book, but I am somewhat puzzled by Services.
Could a Service be added to the Deployment manifest below somehow? Or does the Service have to be POSTed on its own? Isn't the whole purpose of a deployment to specify everything needed for the app to run?
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: hello-deploy
spec:
  replicas: 10
  selector:
    matchLabels:
      app: hello-world
  minReadySeconds: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: hello-pod
        image: nigelpoulton/k8sbook:latest
        ports:
        - containerPort: 8080
They're different objects and you have to submit them separately (HTTP POST, kubectl apply, ...).
There are a couple of tricks you can do to minimize the impact of this:
You can use a multi-document YAML file and submit that as a single thing, like
---
apiVersion: apps/v1
kind: Deployment
...
---
apiVersion: v1
kind: Service
...
There is an undocumented kind: List that could embed multiple objects:
apiVersion: v1
kind: List
items:
- apiVersion: apps/v1
  kind: Deployment
  ...
- apiVersion: v1
  kind: Service
  ...
You can use a higher-level deployment manager such as Helm that lets you keep each object in a separate file, but deploy them in a single command.
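As an illustration of the kind: List trick mentioned above, the wrapper can also be generated programmatically. The following Python sketch builds it as JSON, which kubectl accepts as well as YAML. The objects are stubs without a spec, and the Service name hello-svc is a made-up placeholder.

```python
import json


def as_list(*objects):
    """Wrap several API objects in a single v1 List object."""
    return {"apiVersion": "v1", "kind": "List", "items": list(objects)}


# Stub objects only; real manifests would also carry a spec.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "hello-deploy"},
}
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "hello-svc"},  # hypothetical name
}

bundle = as_list(deployment, service)
print(json.dumps(bundle, indent=2))
```

The Deployment and the Service remain two separate objects on the server; the List is only a container for submitting them together.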
It's perhaps unfortunate that a couple of Kubernetes objects have names that are different from their plain English meanings (a Deployment doesn't cover all of the steps or parts of deploying a whole application; a Service is just an IP/DNS pointer and not a service implementation) but that's the way it is. I tend to capitalize the Kubernetes object names when it will disambiguate things.
“Isn't the whole purpose of a deployment to specify everything needed for the app to run?”
The whole purpose of a Deployment is to manage the deployment of pods/ReplicaSets, including replication, scaling, rolling updates, and rollbacks. The DeploymentController is part of the master node's controller manager, and it makes sure that the current state always matches the desired state.
“Does the Service have to be POSTed on its own?”
If you are familiar with load balancer terminology, a Service is the frontend and the Pods are its backends. As the frontend, the Service forwards requests to its backends (the Pods).

What are the differences between the values of the strategy field in a k8s YAML file?

I tested with two YAML files that differ only in the strategy field.
The first one:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    test.k8s: test
  name: test
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        test.k8s: test
    spec:
      containers:
      - name: test
        image: alpine3.6
        imagePullPolicy: IfNotPresent
        ...
The second:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    test.k8s: test
  name: test
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  template:
    metadata:
      labels:
        test.k8s: test
    spec:
      containers:
      - name: test
        image: alpine3.6
        imagePullPolicy: IfNotPresent
        ...
Then I update the deployment with the kubectl patch and kubectl replace commands.
It seems that the only difference is the start time of the new pod.
In both cases, the old pod is terminated in the end, even when the new pod fails to start because of a missing image.
Does anyone know why?
Many thanks~
Basically, the .spec.strategy field specifies how the cluster replaces old Pods with new ones.
In your first example, .spec.strategy.type==Recreate tells the cluster to terminate (kill) all existing Pods before new ones are created.
In the second example, .spec.strategy.type==RollingUpdate describes an approach that updates a service without a temporary outage: Pods are replaced gradually rather than all at once, to avoid service unavailability.
From your example, there are two parameters which define the RollingUpdate strategy:
.spec.strategy.rollingUpdate.maxUnavailable - the maximum number of Pods that can be unavailable during the update process.
.spec.strategy.rollingUpdate.maxSurge - the maximum number of Pods that can be created over the desired number of Pods.
There are several additional parameters that you can consider using with RollingUpdate; for more information, refer to the documentation.
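The effect of the two parameters can be worked out with a little arithmetic. The Python sketch below computes the lowest number of available Pods and the highest total number of Pods allowed during a rollout. The function names are invented; the rounding of percentages (maxSurge up, maxUnavailable down) follows the documented behaviour as I understand it.

```python
def resolve(value, replicas, round_up):
    """Turn an int or a percentage string such as "25%" into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = replicas * int(value[:-1])
        # Integer ceiling or floor of scaled / 100.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas, max_surge, max_unavailable):
    """Return (minimum available pods, maximum total pods) during a rollout."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return replicas - unavailable, replicas + surge


question_case = rollout_bounds(1, 1, 1)          # the manifest from the question
percent_case = rollout_bounds(10, "25%", "25%")
print(question_case, percent_case)
```

For the manifest in the question (replicas: 1, maxUnavailable: 1) the minimum is 0 available Pods, so the old Pod may be terminated before the new one is ready, which would be consistent with the behaviour described there. Setting maxUnavailable to 0 would keep the old Pod until its replacement is available.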
Note that kubectl replace replaces the whole object with the manifest you supply, rather than updating individual fields of the existing object.

Is it possible to move the running pods from ReplicationController to a Deployment?

We are using an RC to run our workload and want to migrate to a Deployment. Is there a way to do that without causing any impact to the running workload? I mean, can we move these running pods under a Deployment?
As #matthew-l-daniel answered, the answer is yes. But I am more than 80% certain about it, because I have tested it.
Now, what is the process we need to follow?
Let's say I have a ReplicationController.
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    app: nginx
  template:
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
Question: can we move these running pods under a Deployment?
Let's follow these steps to see if we can.
Step 1:
Delete this RC with --cascade=false (spelled --cascade=orphan in newer kubectl versions). This will leave the Pods running.
Step 2:
Create a ReplicaSet first, with the same labels as the ReplicationController.
apiVersion: apps/v1beta2
kind: ReplicaSet
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      ...
So now these Pods are under the ReplicaSet.
Step 3:
Now create a Deployment with the same labels.
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      ...
The Deployment will find that a ReplicaSet already exists, and our job is done.
Now we can try increasing replicas to check that it works.
And it works.
What doesn't work
After deleting the ReplicationController, do not create the Deployment directly. This will not work, because the Deployment will find no ReplicaSet and will create a new one with an additional label (pod-template-hash) in its selector, which will not match your existing Pods.
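The difference between the two paths can be sketched in Python by comparing the selector of a hand-made ReplicaSet with the selector of a ReplicaSet generated by a Deployment. The hash value below is made up, and `count_matching` is an invented helper.

```python
def count_matching(selector, pods):
    """Count the pods whose labels satisfy every key/value in the selector."""
    return sum(
        all(labels.get(k) == v for k, v in selector.items())
        for labels in pods
    )


# Labels of the three Pods left behind by the deleted ReplicationController.
existing_pods = [{"app": "nginx"}, {"app": "nginx"}, {"app": "nginx"}]

# Selector of the ReplicaSet created by hand in Step 2.
manual_rs_selector = {"app": "nginx"}

# Selector of a ReplicaSet generated by a Deployment (hash is hypothetical).
generated_rs_selector = {"app": "nginx", "pod-template-hash": "5d59d67564"}

adopted = count_matching(manual_rs_selector, existing_pods)
missed = count_matching(generated_rs_selector, existing_pods)
print(adopted, missed)
```

In the second case no existing Pod matches, so the generated ReplicaSet would start three new Pods alongside the old ones.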
I'm about 80% certain the answer is yes, since they both use Pod selectors to determine whether new instances should be created. The key trick is to use the --cascade=false (the default is true) in kubectl delete, whose help even speaks to your very question:
--cascade=true: If true, cascade the deletion of the resources managed by this resource (e.g. Pods created by a ReplicationController). Default true.
By deleting the ReplicationController but not its subordinate Pods, they will continue to just hang out (although be careful, if a reboot or other hazard kills one or all of them, no one is there to rescue them). Creating the Deployment with the same selector criteria and a replicas count equal to the number of currently running Pods should cause a "no action" situation.
I regret that I don't have my cluster in front of me to test it, but I would think a small nginx RC with replicas=3 should be a simple enough test to prove that it behaves as you wish.