Deploy a feature with zero downtime - Kubernetes

How do you deploy a feature with zero downtime in Kubernetes?
kubectl run nginx --image=nginx # creates a deployment (on kubectl versions before 1.18; newer kubectl creates a bare pod, so use kubectl create deployment nginx --image=nginx instead)
kubectl get deploy
NAME    DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
nginx   1         1         1            0           7s
Now let’s assume we are going to update the nginx image
kubectl set image deployment nginx nginx=nginx:1.15 # updates the image
Now when we check the replica sets:
kubectl get replicasets # get replica sets
NAME               DESIRED   CURRENT   READY   AGE
nginx-65899c769f   0         0         0       7m
nginx-6c9655f5bb   1         1         1       13s
From the above, we can see that a new replica set was created and scaled up while the old one was scaled down.
kubectl rollout status deployment nginx # check the status of a deployment rollout
kubectl rollout history deployment nginx # check the revisions in a deployment
kubectl rollout history deployment nginx
deployment.extensions/nginx
REVISION   CHANGE-CAUSE
1
2
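The CHANGE-CAUSE column is empty here because no change cause was recorded. One way to populate it for the next revision (a hedged example; the message text is just an illustration) is to set the kubernetes.io/change-cause annotation on the deployment after each change:
kubectl annotate deployment nginx kubernetes.io/change-cause="update image to nginx:1.15"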

You should use the RollingUpdate strategy with maxSurge and maxUnavailable defined, as sketched below.
For more information, see https://kubernetes.io/docs/concepts/workloads/controllers/deployment/
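As a minimal sketch (the replica count and labels are placeholders, not from the original answer), the relevant part of such a deployment could look like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # at most 1 extra pod above the desired count during the update
      maxUnavailable: 0  # never drop below the desired count, i.e. zero downtime
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.15

With maxUnavailable: 0, an old pod is only terminated once its replacement is Ready, which is what gives the zero-downtime behaviour.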

Related

Failed pods of previous helm release are not removed automatically

I have an application Helm chart with two deployments:
app (2 pod replicas)
app-dep (1 pod replica)
app-dep has an init container that waits for the app pods (using its labels) to be ready:
initContainers:
  - name: wait-for-app-pods
    image: groundnuty/k8s-wait-for:v1.5.1
    imagePullPolicy: Always
    args:
      - "pod"
      - "-l app.kubernetes.io/component=app"
I am using helm to deploy an application:
helm upgrade --install --wait --create-namespace --timeout 10m0s app ./app
Revision 1 of the release app is deployed:
helm ls
NAME   NAMESPACE   REVISION   UPDATED                                    STATUS     CHART       APP VERSION
app    default     1          2023-02-03 01:10:18.796554241 +1100 AEDT   deployed   app-0.1.0   1.0.0
Everything initially goes fine.
After some time, one of the app pods is evicted because the node runs low on memory.
These are some lines from the pod's description details:
Status: Failed
Reason: Evicted
Message: The node was low on resource: memory. Container app was using 2513780Ki, which exceeds its request of 0.
Events:
Type     Reason               Age   From     Message
----     ------               ---   ----     -------
Warning  Evicted              12m   kubelet  The node was low on resource: memory. Container app was using 2513780Ki, which exceeds its request of 0.
Normal   Killing              12m   kubelet  Stopping container app
Warning  ExceededGracePeriod  12m   kubelet  Container runtime did not kill the pod within specified grace period.
Later, a new pod is created automatically to restore the deployment's replica count.
But the Failed pod still remains in the namespace.
Now comes the next helm upgrade. The app pods for release revision 2 become ready, but the init container of app-dep in the latest revision keeps waiting for all pods with the label app.kubernetes.io/component=app to become ready, including the Failed pod left over from revision 1. After the 10-minute timeout, revision 2 of the Helm release is declared failed.
$ kubectl get pods
NAME                       READY   STATUS     RESTARTS   AGE
app-7595488c8f-4v42n       1/1     Running    0          7m37s
app-7595488c8f-xt4qt       1/1     Running    0          6m17s
app-86448b6cd-7fq2w        0/1     Error      0          36m
app-dep-546d897d6c-q9sw6   1/1     Running    0          38m
app-dep-cd9cfd975-w2fzn    0/1     Init:0/1   0          7m37s
ANALYSIS FOR SOLUTION:
In order to address this issue, we can try two approaches:
Approach 1:
Find and remove all the Failed pods of the previous revision first, just before doing a helm upgrade:
kubectl get pods --field-selector status.phase=Failed -n default
kubectl delete pods --field-selector status.phase=Failed -n default
You can do this as part of the CD pipeline, or add the task as a pre-upgrade hook job in the Helm chart, as sketched below.
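A minimal sketch of such a hook job (the job name, service account, and image are assumptions, not part of the original chart; the service account needs RBAC permission to list and delete pods):

apiVersion: batch/v1
kind: Job
metadata:
  name: cleanup-failed-pods
  annotations:
    "helm.sh/hook": pre-upgrade
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      serviceAccountName: pod-cleaner   # assumed SA with pod-delete RBAC
      restartPolicy: Never
      containers:
        - name: cleanup
          image: bitnami/kubectl:latest
          command:
            - kubectl
            - delete
            - pods
            - --field-selector=status.phase=Failed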
Approach 2:
Add one more label to the pods, one that changes on every helm upgrade (something like helm/release-revision=2).
Add that label to the init container's selector as well, so that it waits only for pods that carry both labels.
It will then ignore the Failed pods of the previous release, which carry a different revision label.
initContainers:
  - name: wait-for-app-pods
    image: groundnuty/k8s-wait-for:v1.5.1
    imagePullPolicy: Always
    args:
      - "pod"
      - "-l app.kubernetes.io/component=app,helm/release-revision=2"
This approach updates the pod labels on every helm upgrade and therefore recreates the pods each time. Also, it is better to add the revision label only to the pod template, not to the deployment's label selector, because as per the official Kubernetes documentation for the Deployment resource:
It is generally discouraged to make label selector updates
For the same reason, there is no need to add the revision label to the selector field in the service manifest.
QUESTION:
Which approach would be better practice?
What would be the caveats and benefits of each method?
Is there any other approach to fix this issue?

kubectl drain node --dry-run not showing errors

I have a deployment A where the replica count is set to 1, and in its Pod Disruption Budget minAvailable is also set to 1. Describing the PDB shows ALLOWED DISRUPTIONS as 0, but when I do kubectl drain node-1 --dry-run, the output still shows the pod of that deployment as evicted. Does dry-run simply not show these errors? I am using Kubernetes 1.19.
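For reference, a minimal sketch of the setup described (names and labels are assumptions; on 1.19 the PDB API group is still policy/v1beta1):

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: deployment-a-pdb
spec:
  minAvailable: 1          # with a single replica, ALLOWED DISRUPTIONS becomes 0
  selector:
    matchLabels:
      app: deployment-a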

How can I delete all pods via delete job

I created a job and reran it several times.
When I delete this job, only the latest pod is deleted.
How can I delete all of these pods?
For CronJob
You can use successfulJobsHistoryLimit to manage the pod count; if you set it to 0, the pods will be removed as soon as the job completes its execution successfully.
successfulJobsHistoryLimit: 0
failedJobsHistoryLimit: 0
Read more at: https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/#jobs-history-limits
GCP ref: https://cloud.google.com/kubernetes-engine/docs/how-to/cronjobs#history-limit
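For placement, a minimal sketch of where these fields sit in a CronJob spec (the name, schedule, and container are placeholders):

apiVersion: batch/v1
kind: CronJob
metadata:
  name: my-cronjob
spec:
  schedule: "*/5 * * * *"
  successfulJobsHistoryLimit: 0   # remove successful jobs and their pods immediately
  failedJobsHistoryLimit: 0       # remove failed jobs and their pods immediately
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: task
              image: busybox
              command: ["sh", "-c", "echo done"]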
For Job
If you are using a Job rather than a CronJob, you can use ttlSecondsAfterFinished, which deletes the Job (and its pods) automatically the set number of seconds after it finishes; set it with some buffer as needed.
ttlSecondsAfterFinished: 100
will solve your issue.
Example: https://kubernetes.io/docs/concepts/workloads/controllers/job/#clean-up-finished-jobs-automatically
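Again as a sketch (the Job name and container are placeholders), the field goes at the top level of the Job spec:

apiVersion: batch/v1
kind: Job
metadata:
  name: my-job
spec:
  ttlSecondsAfterFinished: 100   # Job and its pods are deleted 100s after completion
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: task
          image: busybox
          command: ["sh", "-c", "echo done"]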
Extra:
You can also delete those pods with a single command, but that is a one-time solution; it relies on a label set on the pods or in the job:
kubectl delete pods -l <labels> -n <namespace>
You can create a label, or you may already have one that matches the targeted group of pods, so you can delete them all based on that label as follows:
kubectl delete pods -l app=my-app
I assume you have a number of pods from the same image and you want to clean them up, then have only one pod running? If so, you need to delete the deployment:
kubectl -n <namespace> get deploy
kubectl -n <namespace> delete deploy <deployname>
Or you can scale to 0 replicas:
kubectl scale deploy <deploy-name> --replicas=0
which will kill all these pods, and then apply the manifest anew so it creates 1 pod (assuming you are not scaling to more than 1 active pod):
kubectl -n <namespace> apply -f <manifest-for-that-deploy.yaml>

Correct way to scale/restart an application down/up in kubernetes (replicaset, deployments and pod deletion)?

I usually restart my applications by:
kubectl scale deployment my-app --replicas=0
Followed by:
kubectl scale deployment my-app --replicas=1
which works fine. I also have another running application but when I look at its replicaset I see:
$ kubectl get rs
NAME          DESIRED   CURRENT   READY   AGE
another-app   2         2         2       2d
So to restart that correctly I would of course need to:
kubectl scale deployment another-app --replicas=0
kubectl scale deployment another-app --replicas=2
But is there a better way to do this, so I don't have to manually look at the replicasets before scaling/restarting an application (which might have replicas > 1)?
You can restart pods by using a label:
kubectl delete pods -l name=myLabel
You can do a rolling restart of all pods for a deployment, so that you don't take the service down:
kubectl patch deployment your_deployment_name -p \
"{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"date\":\"`date +'%s'`\"}}}}}"
And since Kubernetes version 1.15 you can simply do:
kubectl rollout restart deployment your_deployment_name
To make changes to your current deployment you can use kubectl rollout pause deployment/YOUR_DEPLOYMENT. The deployment is then marked as paused and won't be reconciled by the controller. While it's paused you can make the necessary changes to your configuration and then resume it with kubectl rollout resume deployment/YOUR_DEPLOYMENT. This creates a new replicaset with the updated configuration.
Pods with the new configuration are started, and once they reach Running status, the pods with the old configuration are terminated.
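A minimal sketch of that flow (the deployment and image names are placeholders):

kubectl rollout pause deployment/my-app
kubectl set image deployment/my-app my-app=my-app:2.0   # one or more changes while paused
kubectl rollout resume deployment/my-app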
Using this method you will also be able to roll the deployment back to a previous version. Use:
kubectl rollout history deployment/YOUR_DEPLOYMENT
to check the history of the rollouts, and then execute the following command to roll back:
kubectl rollout undo deployment/YOUR_DEPLOYMENT --to-revision=REVISION_NO

Kubernetes autoscaling: ScalingActive False

Trying to add autoscaling to my deployment, but getting ScalingActive False. Most answers are about DNS, Heapster, or Limits; I've checked all of them but still can't find a solution.
kubectl get hpa
NAME    REFERENCE          TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
fetch   Deployment/fetch   <unknown>/50%   1         4         1          13m
kubectl cluster-info
Kubernetes master is running at --
addon-http-application-routing-default-http-backend is running at --
addon-http-application-routing-nginx-ingress is running at --
Heapster is running at --
KubeDNS is running at --
kubernetes-dashboard is running at --
The original post attached screenshots of kubectl describe hpa, the HPA yaml, kubectl describe pod, kubectl top pod fetch-54f697989d-wczvn --namespace=default, an autoscaling-by-memory yaml, and its description.
PS. I tried to deploy the example which Azure provides and get the same result, so the yaml settings aren't the problem. kubectl get hpa gives the same result, unknown/60%.
I've experienced similar issues; my solution was to set the resources.requests.cpu section in the deployment config, since the HPA calculates the current percentage from the requested resource values. Your event log messages also suggest that the resource requests are not set, although your deployment yaml looks fine to me.
Let's double-check with the following steps.
First, verify that metrics are available with the following command:
# kubectl top pod <your pod name> --namespace=<your pod running namespace>
You also need to check the pod's requested cpu resources using the command below, to ensure they match the config in your deployment yaml.
# kubectl describe pod <your pod name>
...
    Requests:
      cpu: 250m
...
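If the request is missing, a minimal sketch of adding it in the deployment's pod template (the container name, image, and value are placeholders):

spec:
  template:
    spec:
      containers:
        - name: fetch
          image: <your image>
          resources:
            requests:
              cpu: 250m   # gives the HPA a baseline to compute the utilization percentage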
I hope this helps you resolve your issue. ;)
What helped me was a GitHub issue: I just deployed metrics-server to my cluster and recreated the HPA.