How to do controlled rollout using Kubernetes deployment

We have 1000 store nodes and need to deploy an application image to every Kubernetes node, rolling out in the order below, and we would like to specify the target node details during the deployment. Is there a way to specify node details on the command line when we execute kubectl create or kubectl apply for the Deployment?
This application image would be configured with store/node-specific details during container/Pod creation.
1 node on day 1,
10 nodes on day 2,
100 nodes on day 3, etc.

Answering the question from the title:
How to do controlled rollout using Kubernetes deployment
You can create a Deployment with specific fields in its manifest that configure how Kubernetes handles it.
With fields like podAntiAffinity and requiredDuringSchedulingIgnoredDuringExecution you can ensure that Kubernetes distributes the Pods evenly across the cluster Nodes. You can read more about it in the documentation below:
Kubernetes.io: Docs: Concepts: Scheduling eviction: Assign Pod to Node
With the following rollout schedule in mind:
DAY   REPLICAS_COUNT
1     1
2     10
3     100
4     1000
You could use CI/CD tools (for example Jenkins) to roll out (change) the number of replicas of your Deployment according to a specific schedule.
You could create a Jenkins pipeline with a deploy stage where you put your own command together with its scheduler (or delay).
An example of such a Deployment that could be used with Jenkins follows:
cat << EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: ${REPLICAS_COUNT}
  template:
    metadata:
      labels:
        app: nginx
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - nginx
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: nginx
        image: nginx
EOF
This Deployment will assign Pods only to Nodes that are not already running a replica of this Deployment (i.e. 1 Pod = 1 Node). If the number of Pods exceeds the number of Nodes, the extra Pods will remain in the Pending state.
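On the following days the pipeline could simply scale the existing Deployment instead of re-applying the manifest; a minimal sketch, assuming the Deployment above (named nginx) has already been created:
# Day 2 of the schedule above: scale the existing Deployment to 10 replicas
kubectl scale deployment/nginx --replicas=10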
Additional resources:
Jenkins.io: Doc: Pipeline: Tour: Environment
Kubernetes.io: Docs: Concepts: Workloads: Controllers: Deployment

Related

How to manually autoscale pods while load balancing?

I have tried defining LoadBalancer as my Service type and creating a Deployment for it with 3 replicas:
kind: Service
apiVersion: v1
metadata:
  name: springboot-postgres-k8s
  labels:
    name: springboot-postgres-k8s
spec:
  ports: # ...
  selector: # type: ...
  type: LoadBalancer # <=====
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: springboot-postgres-k8s
spec:
  selector:
    matchLabels:
      app: springboot-postgres-k8s
  replicas: 3 # <=====
  template: # ...
This starts up three pods and a load balancer which successfully balances requests among these three pods.
I want to know if k8s allows me to manually scale the pods. That is, if my cluster with 3 replicas + a load balancer is up and running, how can I manually increase the replicas and still have the existing load balancer spread traffic across all 4 replicas (3 old and one newly created)?
Do I have to run (ref1 ref2):
kubectl scale --current-replicas=3 --replicas=4 deployment/springboot-postgres-k8s
Q1. Will the above command notify the existing load balancer of the newly created pod?
or do I have to run the following (as specified in ref2):
kubectl scale --replicas=4 -f foo.yaml
Q2. Will the above command notify the existing load balancer of the newly created pod?
Q3. What if my foo.yaml contains both the service and deployment definitions?
Yes, it allows manual scaling.
When you create a Service in Kubernetes, k8s automatically creates an Endpoints resource for the Pods matching the Service's label selector. This resource is referenced by the Service to define which Pods traffic can be sent to, and it is updated by k8s whenever Pods are created or deleted.
So regardless of when the resources are created, k8s will handle the update and the new Pods will be able to receive traffic from the load balancer.
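A minimal sketch using the names from the question: scale the Deployment, then check that the new Pod's address shows up in the Endpoints object backing the Service:
# Scale the existing Deployment from 3 to 4 replicas
kubectl scale --replicas=4 deployment/springboot-postgres-k8s
# The Endpoints object maintained for the Service should now list 4 Pod IPs
kubectl get endpoints springboot-postgres-k8s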

Spread specific number of deployment pods per node

I have an EKS node group with 2 nodes for compute workloads. I use a taint on these nodes and tolerations in the deployment. I have a deployment with 2 replicas, and I want these two pods to be spread across the two nodes, one pod on each node.
I tried using:
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 1
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - appname
Each pod lands on a separate node, but if I update the deployment file, for example by changing the image name, it fails to schedule a new pod.
I also tried:
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: type
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels:
      type: compute
but the pods aren't spread evenly, e.g. 2 pods end up on one node.
Try adding:
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
By default K8s tries to scale the new ReplicaSet up before it starts scaling down the old replicas. Since it cannot schedule the new replicas (because of the anti-affinity), they get stuck in the Pending state.
Once you set the Deployment's maxSurge=0, you tell k8s that you don't want the Deployment to scale up first during an update; as a result it can only scale down, making room for the new replicas to be scheduled.
Setting maxUnavailable=1 tells k8s to replace only one pod at a time.
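As an aside on the topologySpreadConstraints attempt from the question: topologyKey must be the key of a node label, so spreading per node usually uses kubernetes.io/hostname rather than a pod label key like type. A minimal sketch under that assumption, reusing the pod labels from the question:
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: kubernetes.io/hostname   # a real node label key, one topology domain per node
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels:
      type: compute                     # must match the labels set on the pods themselves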
You can use a DaemonSet instead of a Deployment. A DaemonSet ensures that all (or some) Nodes run a copy of a Pod. As nodes are added to the cluster, Pods are added to them. As nodes are removed from the cluster, those Pods are garbage collected. Deleting a DaemonSet will clean up the Pods it created.
See the documentation for DaemonSet.
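A minimal DaemonSet sketch for this case; the name, labels, image, and the toleration key/value are assumptions and would need to match your actual node taint:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: appname
spec:
  selector:
    matchLabels:
      app: appname
  template:
    metadata:
      labels:
        app: appname
    spec:
      tolerations:              # hypothetical; must match the taint on the compute nodes
      - key: "dedicated"
        operator: "Equal"
        value: "compute"
        effect: "NoSchedule"
      containers:
      - name: appname
        image: appname:latest   # placeholder image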
I was having the same problem, with pods failing to schedule and getting stuck in the Pending state while rolling out new versions, while my goal was to run exactly 3 pods at all times, 1 on each of the 3 available nodes.
That means I could not use maxUnavailable: 1 because that would temporarily result in less than 3 pods during the rollout.
Instead of using the app name label for matching anti-affinity, I ended up using a label with a random value ("version") on each deployment. This means new deployments will happily schedule pods to nodes where a previous version is still running, but the new versions will always be spread evenly.
Something like this:
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 3
  template:
    metadata:
      labels:
        deploymentVersion: v1
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: deploymentVersion
                operator: In
                values:
                - v1
            topologyKey: "kubernetes.io/hostname"
v1 can be anything that's a valid label value and changes on every deployment attempt.
I'm using envsubst to have dynamic variables in yaml files:
DEPLOYMENT_VERSION=$(date +%s) envsubst < deploy.yaml | kubectl apply -f -
And then the config looks like this:
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 3
  template:
    metadata:
      labels:
        deploymentVersion: v${DEPLOYMENT_VERSION}
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: deploymentVersion
                operator: In
                values:
                - v${DEPLOYMENT_VERSION}
            topologyKey: "kubernetes.io/hostname"
I wish Kubernetes offered a more straightforward way to achieve this.

How do I make Kubernetes scale my deployment based on the "ready"/ "not ready" status of my Pods?

I have a deployment with a defined number of replicas. I use readiness probe to communicate if my Pod is ready/ not ready to handle new connections – my Pods toggle between ready/ not ready state during their lifetime.
I want Kubernetes to scale the deployment up/ down to ensure that there is always the desired number of pods in a ready state.
Example:
If replicas is 4 and there are 4 Pods in ready state, then Kubernetes should keep the current replica count.
If replicas is 4 and there are 2 ready pods and 2 not ready pods, then Kubernetes should add 2 more pods.
How do I make Kubernetes scale my deployment based on the "ready"/ "not ready" status of my Pods?
I don't think this is possible. If a Pod is not ready, k8s will not make it ready, since that is something related to your application. Even if it created a new Pod, how would its readiness be guaranteed? So you have to resolve the reasons behind the not-ready status yourself. The only thing k8s does is keep not-ready Pods out of the Service endpoints so they don't take real-world load, which avoids request failures.
Ensuring you always have 4 pods running can be done by specifying the replicas property in your deployment definition:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 4 # here we define a requirement for 4 replicas
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
Kubernetes will ensure that if any pods crash, replacement pods will be created so that a total of 4 are always available.
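To watch the ready vs. desired replica counts of that Deployment at any point, a quick check:
# The READY column shows ready/desired replicas (e.g. 2/4 while some pods are not ready)
kubectl get deployment nginx-deployment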
You cannot schedule Pods onto unhealthy nodes in the cluster. The API server will only create Pods on nodes that are healthy, schedulable, and have enough remaining capacity for additional Pods.
Moreover, what you describe is essentially the self-healing concept of k8s, which in basic terms is taken care of for you.

How to force Kubernetes to update deployment with a pod in every node

I would like to know if there is a way to force Kubernetes, during a deploy, to use every node in the cluster.
The question comes from some attempts I made, where I noticed a situation like this:
a cluster of 3 nodes
I update a deployment with a command like: kubectl set image deployment/deployment_name my_repo:v2.1.2
Kubernetes updates the cluster
At the end I execute kubectl get pod and I notice that 2 pods have been deployed on the same node.
So after the update, the cluster has this configuration:
one node with 2 pods
one node with 1 pod
one node without any pod (totally without any workload)
The scheduler will try to figure out the most reasonable way of scheduling at a given point in time, which can change later on and result in situations like the one you described. Two simple ways to manage this in one way or another are:
use DaemonSet instead of Deployment: it will make sure you have one and only one pod per node (matching nodeSelector / tolerations etc.)
use PodAntiAffinity: you can make sure that two pods of the same deployment in the same version are never deployed on the same node. This is what I personally prefer for many apps (unless I want more than one to be scheduled per node). Note that you will run into a bit of trouble if you decide to scale your deployment to more replicas than you have nodes.
Example for versioned PodAntiAffinity I use :
metadata:
  labels:
    app: {{ template "fullname" . }}
    version: {{ .Values.image.tag }}
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values: ["{{ template "fullname" . }}"]
          - key: version
            operator: In
            values: ["{{ .Values.image.tag }}"]
        topologyKey: kubernetes.io/hostname
consider fiddling with the Descheduler, which is like an evil twin of the Kube Scheduler component and will delete pods so that they can be rescheduled differently
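Since the anti-affinity example above is Helm-templated ({{ template "fullname" . }}, .Values.image.tag), the version label would normally be bumped by a Helm upgrade; a hedged example, with the release and chart names as placeholders:
# Rolling out a new tag changes .Values.image.tag, and with it the "version" label used in the anti-affinity rule
helm upgrade --install my-release ./my-chart --set image.tag=v2.1.2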
I tried some solutions, and what works at the moment is simply based on changing the version inside my yaml file and using a DaemonSet controller.
I mean:
1) I have to deploy my application for the first time, based on a pod with some containers. These pods should be deployed on every cluster node (I have 3 nodes). I have set up the deployment in the yaml file with the option replicas equal to 3:
apiVersion: apps/v1beta2 # for versions before 1.8.0 use apps/v1beta1
kind: Deployment
metadata:
  name: my-deployment
  labels:
    app: webpod
spec:
  replicas: 3
  ....
I have set up the daemonset (or ds) in the yaml file with the option updateStrategy equal to RollingUpdate:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: my-daemonset
spec:
  updateStrategy:
    type: RollingUpdate
  ...
The version used for one of my containers is, for example, 2.1.
2) I execute the deployment with the command: kubectl apply -f my-deployment.yaml
and the daemonset with the command: kubectl apply -f my-daemonset.yaml
3) I get one pod for every node without problem
4) Now I want to update the deployment, changing the version of the image that I use for one of my containers. So I simply edit the yaml file, replacing 2.1 with 2.2, and re-launch the command: kubectl apply -f my-deployment.yaml
Alternatively, I can simply change the version of the image (2.1 -> 2.2) with this command:
kubectl set image ds/my-daemonset my-container=my-repository:v2.2
5) Again, I obtain one pod for every node without problems.
The behavior is very different if instead I use the command:
kubectl set image deployment/my-deployment my-container=xxxx:v2.2
In this case I get a wrong result, where one node has 2 pods, one node has 1 pod, and the last node has no pods at all...
To see how the deployment evolves, I can launch the command:
kubectl rollout status ds/my-daemonset
getting something like this:
Waiting for rollout to finish: 0 out of 3 new pods have been updated...
Waiting for rollout to finish: 0 out of 3 new pods have been updated...
Waiting for rollout to finish: 1 out of 3 new pods have been updated...
Waiting for rollout to finish: 1 out of 3 new pods have been updated...
Waiting for rollout to finish: 1 out of 3 new pods have been updated...
Waiting for rollout to finish: 2 out of 3 new pods have been updated...
Waiting for rollout to finish: 2 out of 3 new pods have been updated...
Waiting for rollout to finish: 2 out of 3 new pods have been updated...
Waiting for rollout to finish: 2 of 3 updated pods are available...
daemon set "my-daemonset" successfully rolled out

Kubernetes keeps spawning Pods after deletion

The following is the file used to create the Deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kloud-php7
  namespace: kloud-hosting
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: kloud-php7
    spec:
      containers:
      - name: kloud-php7
        image: 192.168.1.1:5000/kloud-php7
      - name: kloud-nginx
        image: 192.168.1.1:5000/kloud-nginx
        ports:
        - containerPort: 80
The Deployment and the Pods worked fine, but after deleting the Deployment and the generated ReplicaSet, I cannot delete the spawned Pods permanently. New Pods are created whenever the old ones are deleted.
The kubernetes cluster is created with kargo, containing 4 nodes running CentOS 7.3, kubernetes version 1.5.6.
Any idea how to solve this problem?
This is working as intended. The Deployment creates (and recreates) a ReplicaSet and the ReplicaSet creates (and recreates!) Pods. You need to delete the Deployment, not the Pods or the ReplicaSet:
kubectl delete deploy -n kloud-hosting kloud-php7
This is because the ReplicaSet always recreates the pods to match the count specified in the deployment file (say 3; kube always makes sure that 3 pods are up and running),
so here we need to delete the ReplicaSet first to get rid of the pods.
kubectl get rs
and delete the ReplicaSet; this will in turn delete the pods.
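A short example, with the ReplicaSet name as a placeholder taken from the kubectl get rs output:
# Delete the ReplicaSet that keeps recreating the pods (substitute the real name)
kubectl delete rs <replicaset-name>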
It could also be that DaemonSets need to be deleted.
For example:
$ kubectl get DaemonSets
NAME                            DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
elasticsearch-operator-sysctl   5         5         5       5            5           <none>          6d
$ kubectl delete daemonsets elasticsearch-operator-sysctl
Now running get pods should not list elasticsearch* pods.