Old ReplicaSet not getting replaced by new ReplicaSet after a kubectl edit - kubernetes

I am creating a deployment using this yaml file. It creates 4 replicas of a busybox pod. All fine till here.
But when I edit this deployment using the command kubectl edit deployment my-dep2, changing only the version of the busybox image to 1.31 (a downgrade, but still an update from the K8s point of view), the old ReplicaSet is not completely replaced.
The output of kubectl get all --selector app=my-dep2 after the edit is:
NAME READY STATUS RESTARTS AGE
pod/my-dep2-55f67b974-5k7t9 0/1 ErrImagePull 2 5m26s
pod/my-dep2-55f67b974-wjwfv 0/1 CrashLoopBackOff 2 5m26s
pod/my-dep2-dcf7978b7-22khz 0/1 CrashLoopBackOff 6 12m
pod/my-dep2-dcf7978b7-2q5lw 0/1 CrashLoopBackOff 6 12m
pod/my-dep2-dcf7978b7-8mmvb 0/1 CrashLoopBackOff 6 12m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/my-dep2 0/4 2 0 12m
NAME DESIRED CURRENT READY AGE
replicaset.apps/my-dep2-55f67b974 2 2 0 5m27s
replicaset.apps/my-dep2-dcf7978b7 3 3 0 12m
As you can see from the output above, there are 2 ReplicaSets existing in parallel. I expect the old ReplicaSet to be completely replaced by the new ReplicaSet (containing the 1.31 version of busybox), but this is not happening. What am I missing here?

You are ignoring the errors ErrImagePull and CrashLoopBackOff. Those are telling you that it is not possible to run the new containers (e.g. the image could not be pulled from the registry), so the old ones are kept to ensure the service keeps running (the default rolling update behaviour).
Edit:
Also, your busybox containers start, run nothing (as far as I can remember) and then finish, which causes Kubernetes to restart them, so they never reach a stable running state. You'd better add something like sleep 300 to their entrypoint, as in the sketch below.
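For illustration, here is a minimal sketch of such a Deployment with a long-running command. The original yaml file from the question is not shown, so the name, label and replica count below are reconstructed from its output; the important part is the command:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-dep2
  labels:
    app: my-dep2
spec:
  replicas: 4
  selector:
    matchLabels:
      app: my-dep2
  template:
    metadata:
      labels:
        app: my-dep2
    spec:
      containers:
      - name: busybox
        image: busybox:1.31
        # busybox exits immediately when given nothing to do, which leads to
        # CrashLoopBackOff; keep it busy with a long-running command
        command: ["sleep", "300"]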

This is a totally normal, expected result, related to the rolling update mechanism in Kubernetes.
Take a quick look at the following working example, in which I used a sample nginx Deployment. Once it's deployed, I run:
kubectl edit deployments.apps nginx-deployment
and removed the image tag, which is effectively an update to nginx:latest. Immediately after applying the change you can see the following:
$ kubectl get all --selector=app=nginx
NAME READY STATUS RESTARTS AGE
pod/nginx-deployment-574b87c764-bvmln 0/1 Terminating 0 2m6s
pod/nginx-deployment-574b87c764-zfzmh 1/1 Running 0 2m6s
pod/nginx-deployment-574b87c764-zskkk 1/1 Running 0 2m7s
pod/nginx-deployment-6fcf476c4-88fdm 0/1 ContainerCreating 0 1s
pod/nginx-deployment-6fcf476c4-btvgv 1/1 Running 0 3s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/nginx-deployment ClusterIP 10.3.247.159 <none> 80/TCP 6d4h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx-deployment 3/3 2 3 2m7s
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx-deployment-574b87c764 2 2 2 2m7s
replicaset.apps/nginx-deployment-6fcf476c4 2 2 1 3s
As you can see, at a certain point in time there are running pods in both ReplicaSets. That is because of the mentioned rolling update mechanism, which ensures the availability of your app while it is being updated (how many pods of each ReplicaSet may coexist during the update is controlled by the Deployment's update strategy, shown in the snippet after the next output).
When the update process has ended, the replica count in the old ReplicaSet is reduced to 0, so there are no running pods managed by this ReplicaSet, as the new one has reached its desired state:
$ kubectl get all --selector=app=nginx
NAME READY STATUS RESTARTS AGE
pod/nginx-deployment-6fcf476c4-88fdm 1/1 Running 0 10s
pod/nginx-deployment-6fcf476c4-btvgv 1/1 Running 0 12s
pod/nginx-deployment-6fcf476c4-db5z7 1/1 Running 0 8s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/nginx-deployment ClusterIP 10.3.247.159 <none> 80/TCP 6d4h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx-deployment 3/3 3 3 2m16s
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx-deployment-574b87c764 0 0 0 2m16s
replicaset.apps/nginx-deployment-6fcf476c4 3 3 3 12s
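How many pods of each ReplicaSet may run side by side during that transition is controlled by the Deployment's update strategy. A minimal snippet with the default values (illustrative, not taken from the example Deployment above) looks like this:
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # how many pods may be created above the desired replica count
      maxUnavailable: 25%  # how many pods may be unavailable during the update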
You may ask yourself: why is it still there? Why is it not deleted immediately after the new one becomes ready? Try the following:
$ kubectl rollout history deployment nginx-deployment
deployment.apps/nginx-deployment
REVISION CHANGE-CAUSE
1 <none>
2 <none>
As you can see, there are 2 revisions of our rollout for this deployment. So now we may want to simply undo this recent change:
$ kubectl rollout undo deployment nginx-deployment
deployment.apps/nginx-deployment rolled back
Now, when we look at our ReplicaSets, we can observe the reverse process:
$ kubectl get all --selector=app=nginx
NAME READY STATUS RESTARTS AGE
pod/nginx-deployment-574b87c764-6j7l5 0/1 ContainerCreating 0 1s
pod/nginx-deployment-574b87c764-m7956 1/1 Running 0 4s
pod/nginx-deployment-574b87c764-v2r75 1/1 Running 0 3s
pod/nginx-deployment-6fcf476c4-88fdm 0/1 Terminating 0 3m25s
pod/nginx-deployment-6fcf476c4-btvgv 1/1 Running 0 3m27s
pod/nginx-deployment-6fcf476c4-db5z7 0/1 Terminating 0 3m23s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/nginx-deployment ClusterIP 10.3.247.159 <none> 80/TCP 6d4h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx-deployment 3/3 3 3 5m31s
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx-deployment-574b87c764 3 3 2 5m31s
replicaset.apps/nginx-deployment-6fcf476c4 1 1 1 3m27s
Note that there is no need to create a 3rd ReplicaSet, as the old one is still there and can be used to undo our recent change. The final result looks as follows:
$ kubectl get all --selector=app=nginx
NAME READY STATUS RESTARTS AGE
pod/nginx-deployment-574b87c764-6j7l5 1/1 Running 0 40s
pod/nginx-deployment-574b87c764-m7956 1/1 Running 0 43s
pod/nginx-deployment-574b87c764-v2r75 1/1 Running 0 42s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/nginx-deployment ClusterIP 10.3.247.159 <none> 80/TCP 6d4h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx-deployment 3/3 3 3 6m10s
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx-deployment-574b87c764 3 3 3 6m10s
replicaset.apps/nginx-deployment-6fcf476c4 0 0 0 4m6s
I hope that the above example helped you to realize why the old ReplicaSet isn't immediately removed and what it can still be useful for.
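Two related knobs, in case you need them (illustrative, not part of the example above): you can roll back to a specific revision rather than just the previous one:
$ kubectl rollout undo deployment nginx-deployment --to-revision=1
and you can control how many old ReplicaSets the Deployment keeps around for such rollbacks (the default is 10):
spec:
  revisionHistoryLimit: 5   # keep only the 5 most recent old ReplicaSets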

As #emi said, images like busybox and alpine don't do anything unless you give them an explicit command. Kubernetes tries to keep the container running, but the default entrypoint performs no action and exits, so Kubernetes keeps restarting it with an increasing back-off (CrashLoopBackOff). For test purposes, it might look as below.
kind: Pod
apiVersion: v1
metadata:
  name: my-test-pod
spec:
  containers:
    - image: nginx
      name: enginx
    - image: alpine
      name: alpine
      command: ["sleep", "3600"]

Related

kube-apiserver: constantly 5 to 10% CPU, although there is not a single request

I installed kind to play around with Kubernetes.
If I use top and sort by CPU usage (key C), then I see that kube-apiserver is constantly consuming 5 to 10% CPU.
Why?
I haven't installed anything yet:
guettli@p15:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-558bd4d5db-ntg7c 1/1 Running 0 40h
kube-system coredns-558bd4d5db-sx8w9 1/1 Running 0 40h
kube-system etcd-kind-control-plane 1/1 Running 0 40h
kube-system kindnet-9zkkg 1/1 Running 0 40h
kube-system kube-apiserver-kind-control-plane 1/1 Running 0 40h
kube-system kube-controller-manager-kind-control-plane 1/1 Running 0 40h
kube-system kube-proxy-dthwl 1/1 Running 0 40h
kube-system kube-scheduler-kind-control-plane 1/1 Running 0 40h
local-path-storage local-path-provisioner-547f784dff-xntql 1/1 Running 0 40h
guettli@p15:~$ kubectl get services --all-namespaces
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 40h
kube-system kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 40h
guettli@p15:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
kind-control-plane Ready control-plane,master 40h v1.21.1
guettli@p15:~$ kubectl get nodes --all-namespaces
NAME STATUS ROLES AGE VERSION
kind-control-plane Ready control-plane,master 40h v1.21.1
I am curious. Where does the CPU usage come from? How can I investigate this?
Even in an empty cluster with just one master node, there are at least 5 components that reach out to the API server on a regular basis:
kubelet for the master node
Controller manager
Scheduler
CoreDNS
Kube proxy
This is because the API server acts as the single entry point for all components in Kubernetes to know what the cluster state should be and to take action if needed.
If you are interested in the details, you could enable audit logs in the API server and get a very verbose file with all the requests being made.
How to do so is not the goal of this answer, but you can start from the apiserver documentation.
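As a rough sketch (the file paths below are just examples), an audit policy that logs request metadata for everything looks like this:
# audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata   # log who requested what, but not request/response bodies
and is wired into kube-apiserver with two flags (for kind this means editing the API server static pod manifest or patching the cluster config; treat this as a sketch):
--audit-policy-file=/etc/kubernetes/audit-policy.yaml
--audit-log-path=/var/log/kubernetes/audit.log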

Control job deletions

We have cronjob monitoring in our cluster. If a pod does not appear within 24 hours, it means that the cronjob hasn't run and we need to alert. But sometimes, due to garbage collection, the pod is deleted (even though the job completed successfully). How can we keep all pods and avoid garbage collection? I know about finalizers, but it looks like they don't work in this case.
Posting this as an answer since it describes one reason why this can happen.
Answer
Cloud Kubernetes clusters have node autoscaling policies, and node pools can also be scaled down/up manually.
A CronJob creates a Job for each run, which in turn creates a corresponding pod. Pods are assigned to specific nodes, so if for any reason a node with pods assigned to it is removed due to node autoscaling or manual scaling, those pods are gone. The Jobs, however, are preserved, since they are stored in etcd.
There are two fields which control the number of jobs kept in the history:
.spec.successfulJobsHistoryLimit - set by default to 3
.spec.failedJobsHistoryLimit - set by default to 1
If set to 0, everything is removed right after the jobs finish.
Jobs History Limits
How it happens in practice
I have a GCP GKE cluster with two nodes:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-cluster-xxxx Ready <none> 15h v1.21.3-gke.2001
gke-cluster-yyyy Ready <none> 3d20h v1.21.3-gke.2001
cronjob.yaml for testing:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: test-cronjob
spec:
  schedule: "*/2 * * * *"
  successfulJobsHistoryLimit: 5
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: test
            image: busybox
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
Pods created:
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
test-cronjob-27253914-mxnzg 0/1 Completed 0 8m59s 10.24.0.22 gke-cluster-4-xxxx <none> <none>
test-cronjob-27253916-88cjn 0/1 Completed 0 6m59s 10.24.0.25 gke-cluster-4-xxxx <none> <none>
test-cronjob-27253918-hdcg9 0/1 Completed 0 4m59s 10.24.0.29 gke-cluster-4-xxxx <none> <none>
test-cronjob-27253920-shnnp 0/1 Completed 0 2m59s 10.24.1.15 gke-cluster-4-yyyy <none> <none>
test-cronjob-27253922-cw5gp 0/1 Completed 0 59s 10.24.1.18 gke-cluster-4-yyyy <none> <none>
Scaling down one node:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-cluster-4-xxxx NotReady,SchedulingDisabled <none> 16h v1.21.3-gke.2001
gke-cluster-4-yyyy Ready <none> 3d21h v1.21.3-gke.2001
And getting pods now:
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
test-cronjob-27253920-shnnp 0/1 Completed 0 7m47s 10.24.1.15 gke-cluster-4-yyyy <none> <none>
test-cronjob-27253922-cw5gp 0/1 Completed 0 5m47s 10.24.1.18 gke-cluster-4-yyyy <none> <none>
Previously completed pods on the first node are gone now.
Jobs are still in place:
$ kubectl get jobs
NAME COMPLETIONS DURATION AGE
test-cronjob-27253914 1/1 1s 13m
test-cronjob-27253916 1/1 2s 11m
test-cronjob-27253918 1/1 1s 9m55s
test-cronjob-27253920 1/1 34s 7m55s
test-cronjob-27253922 1/1 2s 5m55s
How it can be solved
Changing the monitoring alert to look for job completions is a much more precise method and is independent of any cluster node scaling actions.
E.g. I can still retrieve a result from job test-cronjob-27253916 even though its corresponding pod has been deleted:
$ kubectl get job test-cronjob-27253916 -o jsonpath='{.status.succeeded}'
1
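A rough sketch of such a check, assuming GNU date and that all jobs in the namespace belong to this CronJob (adapt the selector/namespace to your setup):
#!/bin/sh
# Alert when the most recently completed job finished more than 24 hours ago.
last=$(kubectl get jobs --sort-by=.status.completionTime \
         -o jsonpath='{.items[-1:].status.completionTime}')
now_epoch=$(date -u +%s)
last_epoch=$(date -u -d "$last" +%s)
if [ -z "$last" ] || [ $((now_epoch - last_epoch)) -gt 86400 ]; then
    echo "ALERT: no job has completed in the last 24 hours"
fi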
Useful links:
Jobs history limits
Garbage collection in kubernetes
TTL controller for finished resources

What is the 'AVAILABLE' column in Kubernetes DaemonSets

I may have a stupid question, but could someone explain what "Available" actually represents in DaemonSets? I checked the answer to What is the difference between current and available pod replicas in kubernetes deployment? but there are no readiness errors here.
In my cluster I see the status below:
$ kubectl get ds -n kube-system
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR
kube-proxy 6 6 5 6 5 beta.kubernetes.io/os=linux
Why is it showing 5 instead of 6?
All pods are running perfectly fine without any "readiness" errors or restarts:
$ kubectl get pods -n kube-system | grep kube-proxy
kube-proxy-cv7vv 1/1 Running 0 20d
kube-proxy-kcd67 1/1 Running 0 20d
kube-proxy-l4nfk 1/1 Running 0 20d
kube-proxy-mkvjd 1/1 Running 0 87d
kube-proxy-qb7nz 1/1 Running 0 36d
kube-proxy-x8l87 1/1 Running 0 87d
Could someone tell me what can be checked further?
The Available field shows the number of pods that are ready to accept traffic and have passed all criteria such as readiness or liveness probes, or any other condition that verifies that your application is ready to serve requests coming from users. For a DaemonSet specifically, a pod only counts as available once it has been ready for at least spec.minReadySeconds, so a pod can briefly be Ready but not yet Available.
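To see exactly which counters differ, you can read the DaemonSet status fields directly and then describe the DaemonSet to find the pod (or node) that is not counted as available:
$ kubectl get ds kube-proxy -n kube-system \
    -o jsonpath='{.status.desiredNumberScheduled} {.status.numberReady} {.status.numberAvailable}{"\n"}'
$ kubectl describe ds kube-proxy -n kube-system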

kubernetes deployments / replicasets are recreated after deletion

I'm trying to delete some old deployments / replicasets I have in my cluster, but when I run kubectl delete deployment,
it says the deployment is deleted and the pod from that deployment is Terminating, but then a few seconds later the deployment is magically recreated and the pod comes back.
This is the same result for another replicaset I have.
What could be re-creating these deployments / replicasets and how can I stop it so I can permanently delete these deployments/rs?
Edit: Here's some output. This is on a kubernetes cluster in GKE btw:
kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
quickstart-kb 1/1 1 1 41m
ubuntu 1/1 1 1 66d
kubectl get pods
NAME READY STATUS RESTARTS AGE
ubuntu-677fc9fd77-fgd7k 1/1 Running 0 19d
quickstart-kb-f9b65577f-4fxph 1/1 Running 0 40m
kubectl delete deployment quickstart-kb
deployment.extensions "quickstart-kb" deleted
kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
quickstart-kb 0/1 1 0 7s
ubuntu 1/1 1 1 66d
kubectl get pods
NAME READY STATUS RESTARTS AGE
quickstart-kb-6cb6cf897d-qcjff 0/1 Running 0 11s
ubuntu-677fc9fd77-fgd7k 1/1 Running 0 19d
kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
quickstart-kb 1/1 1 1 4m6s
ubuntu 1/1 1 1 66d
kubectl get pods
NAME READY STATUS RESTARTS AGE
quickstart-kb-6cb6cf897d-qcjff 1/1 Running 0 4m13s
ubuntu-677fc9fd77-fgd7k 1/1 Running 0 19d
I think your Deployment object was created by the controller of some custom resource (CRD).
When you created the custom resource, its controller created the Deployment object. So even if you delete the Deployment object, the controller re-creates it; you can confirm who owns it with the ownerReferences check below.
Delete the custom resource itself to delete the Deployment and any other objects that were created with it.
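You can check the owner of the Deployment before deleting it by looking at its ownerReferences (standard object metadata; the exact kind shown depends on the operator managing it):
$ kubectl get deployment quickstart-kb -o jsonpath='{.metadata.ownerReferences[*].kind}{"\n"}'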
From the name, it seems to be a Kibana object:
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
Use the following command to delete the Kibana object:
$ kubectl delete Kibana quickstart-kb

Kubernetes - master node does not become Ready

I'm starting a Kubernetes cluster of 3 nodes (1 master, 2 workers).
I am trying to follow the steps described in this Ansible playbook - https://gitlab.com/LinarNadyrov/gcp/tree/master
I apply playbook steps 1, 2 and 3 consecutively.
After that, I connect to the master to check the status:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
master NotReady master 17m v1.13.0
kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-86c58d9df4-7jc4b 0/1 Pending 0 3h45m
coredns-86c58d9df4-929xf 0/1 Pending 0 3h45m
etcd-officemasterkub 1/1 Running 2 7h26m
kube-apiserver-officemasterkub 1/1 Running 2 7h26m
kube-controller-manager-officemasterkub 1/1 Running 2 7h26m
kube-flannel-ds-5jhbx 0/1 Pending 0 7h20m
kube-flannel-ds-wqfvs 0/1 Pending 0 7h20m
kube-proxy-gmngj 1/1 Running 2 7h27m
kube-proxy-ppbqp 1/1 Running 1 7h20m
kube-proxy-r2rn6 1/1 Running 1 7h20m
kube-scheduler-officemasterkub 1/1 Running 2 7h26m
Status is NotReady
Could anyone help me with it?
What's the problem? What should be done to fix it? Maybe I missed something?
Thanks in advance!
Линар Надыров, the problem here is with your flannel yaml file. You did not specify any resources in the DaemonSet, so no flannel pods are spawning.
I did not check any further, as this was reason enough for the issue to occur. You can use this yaml if it is for testing purposes, or edit yours according to the provided example.
In your file, change line 43 to:
shell: kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml >> pod_network_setup.txt
You can find more about DaemonSets here.
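Once the playbook applies that manifest, you can check that the DaemonSet actually spawned flannel pods and that the master eventually turns Ready (the label and DaemonSet names can differ between flannel versions):
kubectl get daemonsets -n kube-system
kubectl get pods -n kube-system -l app=flannel -o wide
kubectl get nodes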