Kubernetes deployments / ReplicaSets are recreated after deletion

I'm trying to delete some old deployments / ReplicaSets I have in my cluster, but when I run kubectl delete deployment
it says the deployment is deleted and the pod from that deployment goes into Terminating, but then a few seconds later the deployment is magically recreated and the pod comes back.
I get the same result for another ReplicaSet I have.
What could be re-creating these deployments / ReplicaSets, and how can I stop it so I can delete them permanently?
Edit: Here's some output. This is on a Kubernetes cluster on GKE, by the way:
kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
quickstart-kb 1/1 1 1 41m
ubuntu 1/1 1 1 66d
kubectl get pods
NAME READY STATUS RESTARTS AGE
ubuntu-677fc9fd77-fgd7k 1/1 Running 0 19d
quickstart-kb-f9b65577f-4fxph 1/1 Running 0 40m
kubectl delete deployment quickstart-kb
deployment.extensions "quickstart-kb" deleted
kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
quickstart-kb 0/1 1 0 7s
ubuntu 1/1 1 1 66d
kubectl get pods
NAME READY STATUS RESTARTS AGE
quickstart-kb-6cb6cf897d-qcjff 0/1 Running 0 11s
ubuntu-677fc9fd77-fgd7k 1/1 Running 0 19d
kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
quickstart-kb 1/1 1 1 4m6s
ubuntu 1/1 1 1 66d
kubectl get pods
NAME READY STATUS RESTARTS AGE
quickstart-kb-6cb6cf897d-qcjff 1/1 Running 0 4m13s
ubuntu-677fc9fd77-fgd7k 1/1 Running 0 19d

I think your Deployment object was created by a custom resource.
When you created the custom resource, its controller created the Deployment object. So even if you delete the Deployment, the controller re-creates it.
Delete the custom resource itself to delete the Deployment and any other objects that were created along with it.
From the name, it looks like a Kibana custom resource:
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
Use the following command to delete the Kibana object:
$ kubectl delete Kibana quickstart-kb
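If you are not sure which controller owns a Deployment, checking its ownerReferences should tell you (a quick check; the exact output format may vary with your kubectl version):
$ kubectl get deployment quickstart-kb -o jsonpath='{.metadata.ownerReferences[*].kind}'
If a kind such as Kibana shows up there, that custom resource is what keeps re-creating the Deployment, and it is what you need to delete.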

Related

Kiali Dashboard not able to fetch the k8s namespaces/applications

I have successfully installed Istio and deployed a sample app, and the application is up and running.
root@master:~# kubectl get pod
NAME READY STATUS RESTARTS AGE
mydata-v1-847cd777c4-kc495 2/2 Running 0 39m
mydata-v2-65bbf55977-j67xp 2/2 Running 0 39m
myweb-66dc56ccd6-5g64b 2/2 Running 0 40m
NAME READY STATUS RESTARTS AGE
grafana-784c89f4cf-cxpcz 1/1 Running 0 15d
istio-egressgateway-bd477794-qv7n8 1/1 Running 0 15d
istio-ingressgateway-79df7c789f-qlqcf 1/1 Running 0 15d
istiod-6dc55bbdd-t5klg 1/1 Running 0 15d
jaeger-7f78b6fb65-xhz8j 1/1 Running 0 15d
kiali-dc84967d9-99lwv 1/1 Running 1 13d
prometheus-7bfddb8dbf-nd4gn 2/2 Running 35 15d
Next, I changed the Kiali dashboard service from ClusterIP to NodePort to access the dashboard from the browser:
kubectl patch svc kiali -n istio-system --type='json' -p '[{"op":"replace","path":"/spec/type","value":"NodePort"},{"op":"replace","path":"/spec/ports/0/nodePort","value":30010}]'
Finally, I am able to access the dashboard using the node port with my host IP (http://machineip_port/), but I can only see my k8s namespaces without any apps (please find the attached screenshot).
Could someone please help me? I have been running into this issue for the last week.
The problem is that "Namespaces that do not exist at the time of install but are created later in the future will not be accessible by Kiali" (see the Kiali documentation).
So, first, keep in mind that you should not edit Kiali's ConfigMap, but only the Kiali Custom Resource (CR), which is used by the Kiali Operator.
Run kubectl edit kiali kiali in the namespace where the Kiali CR is available.
Then add the following under spec:
spec:
  deployment:
    accessible_namespaces:
    - "**"
This will give Kiali access to all current namespaces and to any you'll create in the future.
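If you prefer not to open an editor, the same change can probably be applied with a one-line patch (assuming the Kiali CR is named kiali and lives in the istio-system namespace):
$ kubectl patch kiali kiali -n istio-system --type=merge -p '{"spec":{"deployment":{"accessible_namespaces":["**"]}}}'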

Old ReplicaSet not getting replaced by new ReplicaSet after a kubectl edit

I am creating a deployment using this YAML file. It creates 4 replicas of a busybox pod. All fine till here.
But when I edit this deployment using the command kubectl edit deployment my-dep2, changing only the version of the busybox image to 1.31 (a downgrade, but still an update from the K8s point of view), the old ReplicaSet is not completely replaced.
The output of kubectl get all --selector app=my-dep2 post the edit is:
NAME READY STATUS RESTARTS AGE
pod/my-dep2-55f67b974-5k7t9 0/1 ErrImagePull 2 5m26s
pod/my-dep2-55f67b974-wjwfv 0/1 CrashLoopBackOff 2 5m26s
pod/my-dep2-dcf7978b7-22khz 0/1 CrashLoopBackOff 6 12m
pod/my-dep2-dcf7978b7-2q5lw 0/1 CrashLoopBackOff 6 12m
pod/my-dep2-dcf7978b7-8mmvb 0/1 CrashLoopBackOff 6 12m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/my-dep2 0/4 2 0 12m
NAME DESIRED CURRENT READY AGE
replicaset.apps/my-dep2-55f67b974 2 2 0 5m27s
replicaset.apps/my-dep2-dcf7978b7 3 3 0 12m
As you can see from the output above, there are 2 ReplicaSets existing in parallel. I expected the old ReplicaSet to be completely replaced by the new ReplicaSet (containing the 1.31 version of busybox), but this is not happening. What am I missing here?
You are ignoring the ErrImagePull and CrashLoopBackOff errors. Those are telling you that the new containers cannot run (the image was not found in the Docker registry), so the old ones are kept to ensure the service keeps running (the default rolling-update behaviour).
Edit
Also, your busybox containers start, run nothing (as far as I can remember) and then exit, which causes Kubernetes to restart them over and over, so they never reach a stable running state. Maybe you'd better add something like sleep 300 as their entrypoint?
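For example, a minimal sketch of such a Deployment (reusing the app=my-dep2 labels from your selector; the sleep duration is arbitrary and only meant to keep the container alive for testing):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-dep2
spec:
  replicas: 4
  selector:
    matchLabels:
      app: my-dep2
  template:
    metadata:
      labels:
        app: my-dep2
    spec:
      containers:
      - name: busybox
        image: busybox:1.31
        command: ["sleep", "300"]  # give the container something to do so it doesn't exit immediately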
This is a totally normal, expected result, related to the rolling update mechanism in Kubernetes.
Take a quick look at the following working example, in which I used a sample nginx Deployment. Once it was deployed, I ran:
kubectl edit deployments.apps nginx-deployment
and removed the image tag, which is effectively the same as performing an update to nginx:latest. Immediately after applying the change you can see the following:
$ kubectl get all --selector=app=nginx
NAME READY STATUS RESTARTS AGE
pod/nginx-deployment-574b87c764-bvmln 0/1 Terminating 0 2m6s
pod/nginx-deployment-574b87c764-zfzmh 1/1 Running 0 2m6s
pod/nginx-deployment-574b87c764-zskkk 1/1 Running 0 2m7s
pod/nginx-deployment-6fcf476c4-88fdm 0/1 ContainerCreating 0 1s
pod/nginx-deployment-6fcf476c4-btvgv 1/1 Running 0 3s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/nginx-deployment ClusterIP 10.3.247.159 <none> 80/TCP 6d4h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx-deployment 3/3 2 3 2m7s
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx-deployment-574b87c764 2 2 2 2m7s
replicaset.apps/nginx-deployment-6fcf476c4 2 2 1 3s
As you can see, at a certain point in time there are running pods in both ReplicaSets. This is because of the mentioned rolling update mechanism, which ensures your app's availability while it is being updated.
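How fast old pods are replaced by new ones is controlled by the Deployment's update strategy. For reference, the defaults written out explicitly look like this (tune maxSurge / maxUnavailable if you need a faster or more conservative rollout):
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # up to 25% extra pods may be created above the desired count
      maxUnavailable: 25%  # up to 25% of the desired pods may be unavailable during the update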
When the update process has ended, the replica count in the old ReplicaSet is reduced to 0, so there are no running pods managed by it once the new one has reached its desired state:
$ kubectl get all --selector=app=nginx
NAME READY STATUS RESTARTS AGE
pod/nginx-deployment-6fcf476c4-88fdm 1/1 Running 0 10s
pod/nginx-deployment-6fcf476c4-btvgv 1/1 Running 0 12s
pod/nginx-deployment-6fcf476c4-db5z7 1/1 Running 0 8s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/nginx-deployment ClusterIP 10.3.247.159 <none> 80/TCP 6d4h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx-deployment 3/3 3 3 2m16s
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx-deployment-574b87c764 0 0 0 2m16s
replicaset.apps/nginx-deployment-6fcf476c4 3 3 3 12s
You may ask yourself: why is it still there? Why isn't it deleted immediately after the new one becomes ready? Try the following:
$ kubectl rollout history deployment nginx-deployment
deployment.apps/nginx-deployment
REVISION CHANGE-CAUSE
1 <none>
2 <none>
As you can see, there are 2 revisions of our rollout for this deployment. So now we may want to simply undo this recent change:
$ kubectl rollout undo deployment nginx-deployment
deployment.apps/nginx-deployment rolled back
Now, when we look at our ReplicaSets, we can observe the reverse process:
$ kubectl get all --selector=app=nginx
NAME READY STATUS RESTARTS AGE
pod/nginx-deployment-574b87c764-6j7l5 0/1 ContainerCreating 0 1s
pod/nginx-deployment-574b87c764-m7956 1/1 Running 0 4s
pod/nginx-deployment-574b87c764-v2r75 1/1 Running 0 3s
pod/nginx-deployment-6fcf476c4-88fdm 0/1 Terminating 0 3m25s
pod/nginx-deployment-6fcf476c4-btvgv 1/1 Running 0 3m27s
pod/nginx-deployment-6fcf476c4-db5z7 0/1 Terminating 0 3m23s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/nginx-deployment ClusterIP 10.3.247.159 <none> 80/TCP 6d4h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx-deployment 3/3 3 3 5m31s
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx-deployment-574b87c764 3 3 2 5m31s
replicaset.apps/nginx-deployment-6fcf476c4 1 1 1 3m27s
Note that there is no need to create a 3rd ReplicaSet, as the old one is still there and can be used to undo our recent change. The final result looks as follows:
$ kubectl get all --selector=app=nginx
NAME READY STATUS RESTARTS AGE
pod/nginx-deployment-574b87c764-6j7l5 1/1 Running 0 40s
pod/nginx-deployment-574b87c764-m7956 1/1 Running 0 43s
pod/nginx-deployment-574b87c764-v2r75 1/1 Running 0 42s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/nginx-deployment ClusterIP 10.3.247.159 <none> 80/TCP 6d4h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx-deployment 3/3 3 3 6m10s
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx-deployment-574b87c764 3 3 3 6m10s
replicaset.apps/nginx-deployment-6fcf476c4 0 0 0 4m6s
I hope that the above example helped you realize why the old ReplicaSet isn't immediately removed and what it can still be useful for.
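As a side note, how many of these old ReplicaSets are kept around for rollbacks is controlled by the Deployment's revisionHistoryLimit field (10 by default), e.g.:
spec:
  revisionHistoryLimit: 3  # keep only the 3 most recent old ReplicaSets for possible rollbacks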
As @emi said, busybox, alpine, etc. don't do anything unless you give them an explicit command. Kubernetes tries to keep them running, but the default container does not perform any action, and in the end Kubernetes says okay, something is wrong, no need to keep restarting the container again and again. For test purposes, it might look as below.
kind: Pod
apiVersion: v1
metadata:
  name: my-test-pod
spec:
  containers:
  - image: nginx
    name: enginx
  - image: alpine
    name: alpine
    command: ["sleep", "3600"]

How are pods in kube-system namespace managed?

I'm trying to understand how Kubernetes works, so I tried this operation on my minikube cluster:
~ kubectl delete pod --all -n kube-system
pod "coredns-f9fd979d6-5n4b6" deleted
pod "etcd-minikube" deleted
pod "kube-apiserver-minikube" deleted
pod "kube-controller-manager-minikube" deleted
pod "kube-proxy-879lg" deleted
pod "kube-scheduler-minikube" deleted
It's okay. Pods were deleted as I wished. But if I do kubectl get pods -n kube-system I see:
NAME READY STATUS RESTARTS AGE
coredns-f9fd979d6-5d25r 1/1 Running 0 50s
etcd-minikube 1/1 Running 0 50s
kube-apiserver-minikube 1/1 Running 0 50s
kube-controller-manager-minikube 1/1 Running 0 50s
kube-proxy-nlw69 1/1 Running 0 43s
kube-scheduler-minikube 1/1 Running 0 49s
Okay. I thought it was a ReplicaSet or DaemonSet:
➜ ~ kubectl get ds -n kube-system
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-proxy 1 1 1 1 1 kubernetes.io/os=linux 18m
➜ ~ kubectl get rs -n kube-system
NAME DESIRED CURRENT READY AGE
coredns-f9fd979d6 1 1 1 18m
That is true for coredns and kube-proxy. But what about the others (apiserver, etcd, controller-manager and scheduler)? Why are they still alive?
The control plane pods run as static Pods. Static Pods are not managed by control plane controllers such as DaemonSets and ReplicaSets; instead they are managed directly by the kubelet daemon on the local node.
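You can verify this on the node itself: the kubelet reads static Pod manifests from a directory on disk, which on kubeadm-based setups (minikube included) is typically /etc/kubernetes/manifests. A quick check, assuming that default path:
$ minikube ssh
$ ls /etc/kubernetes/manifests
You should see manifests like etcd.yaml, kube-apiserver.yaml, kube-controller-manager.yaml and kube-scheduler.yaml there. Deleting such a pod through the API only removes its mirror Pod object; the kubelet immediately re-creates it from the manifest file.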

Don't delete pods when rolling back a deployment

I would like to perform a rollback of a deployment in my environment.
Command:
kubectl rollout undo deployment/foo
Steps which are performed:
create pods with old configurations
delete old pods
Is there a way not to perform the last step? For example, a developer would like to check why the init command failed and debug it.
I didn't find information about that in the documentation.
Yes, it is possible. Before doing the rollout, you first need to remove the labels (corresponding to the ReplicaSet controlling that pod) from the unhealthy pod. This way the pod won't belong to the deployment anymore, and even if you do the rollout, it will still be there. Example:
$kubectl get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
sleeper 1/1 1 1 47h
$kubectl get pod --show-labels
NAME READY STATUS RESTARTS AGE LABELS
sleeper-d75b55fc9-87k5k 1/1 Running 0 5m46s pod-template-hash=d75b55fc9,run=sleeper
$kubectl label pod sleeper-d75b55fc9-87k5k pod-template-hash- run-
pod/sleeper-d75b55fc9-87k5k labeled
$kubectl get pod --show-labels
NAME READY STATUS RESTARTS AGE LABELS
sleeper-d75b55fc9-87k5k 1/1 Running 0 6m34s <none>
sleeper-d75b55fc9-swkj9 1/1 Running 0 3s pod-template-hash=d75b55fc9,run=sleeper
So what happens here: we have a pod sleeper-d75b55fc9-87k5k which belongs to the sleeper deployment; we remove all labels from it; the deployment detects that the pod "has gone", so it creates a new one, sleeper-d75b55fc9-swkj9, but the old one is still there and ready for debugging. Only the pod sleeper-d75b55fc9-swkj9 will be affected by the rollout.
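Note that after the labels are removed the old pod is no longer managed by anything, so once you are done debugging you have to clean it up yourself:
$ kubectl delete pod sleeper-d75b55fc9-87k5k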

Kubernetes coredns pods stuck in Pending status. Cannot start the dashboard

I am building a Kubernetes cluster following this tutorial, and I'm having trouble accessing the Kubernetes dashboard. I already created another question about it that you can see here, but while digging into my cluster, I think the problem might be somewhere else, and that's why I'm creating a new question.
I start my master, by running the following commands:
> kubeadm reset
> kubeadm init --apiserver-advertise-address=[MASTER_IP] > file.txt
> tail -2 file.txt > join.sh # I keep this file for later
> kubectl apply -f https://git.io/weave-kube/
> kubectl -n kube-system get pod
NAME READY STATUS RESTARTS AGE
coredns-fb8b8dccf-kb2zq 0/1 Pending 0 2m46s
coredns-fb8b8dccf-nnc5n 0/1 Pending 0 2m46s
etcd-kubemaster 1/1 Running 0 93s
kube-apiserver-kubemaster 1/1 Running 0 93s
kube-controller-manager-kubemaster 1/1 Running 0 113s
kube-proxy-lxhvs 1/1 Running 0 2m46s
kube-scheduler-kubemaster 1/1 Running 0 93s
Here we can see that I have two coredns pods stuck in the Pending state forever, and when I run the command:
> kubectl -n kube-system describe pod coredns-fb8b8dccf-kb2zq
I can see the following Warning in the Events part:
FailedScheduling: 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
Since it is a Warning and not an Error, and since, as a Kubernetes newbie, taints do not mean much to me, I tried to connect a node to the master (using the previously saved command):
> cat join.sh
kubeadm join [MASTER_IP]:6443 --token [TOKEN] \
--discovery-token-ca-cert-hash sha256:[ANOTHER_TOKEN]
> ssh [USER]@[WORKER_IP] 'bash' < join.sh
This node has joined the cluster.
On the master, I check that the node is connected:
> kubectl get nodes
NAME STATUS ROLES AGE VERSION
kubemaster NotReady master 13m v1.14.1
kubeslave1 NotReady <none> 31s v1.14.1
And I check my pods:
> kubectl -n kube-system get pod
NAME READY STATUS RESTARTS AGE
coredns-fb8b8dccf-kb2zq 0/1 Pending 0 14m
coredns-fb8b8dccf-nnc5n 0/1 Pending 0 14m
etcd-kubemaster 1/1 Running 0 13m
kube-apiserver-kubemaster 1/1 Running 0 13m
kube-controller-manager-kubemaster 1/1 Running 0 13m
kube-proxy-lxhvs 1/1 Running 0 14m
kube-proxy-xllx4 0/1 ContainerCreating 0 2m16s
kube-scheduler-kubemaster 1/1 Running 0 13m
We can see that another kube-proxy pod has been created and is stuck in the ContainerCreating status.
And when I do a describe again:
kubectl -n kube-system describe pod kube-proxy-xllx4
I can see multiple identical Warnings in the Events part:
Failed create pod sandbox: rpc error: code = Unknown desc = failed pulling image "k8s.gcr.io/pause:3.1": Get https://k8s.gcr.io/v1/_ping: dial tcp: lookup k8s.gcr.io on [::1]:53: read udp [::1]:43133->[::1]:53: read: connection refused
Here are my repositories:
docker image ls
REPOSITORY TAG
k8s.gcr.io/kube-proxy v1.14.1
k8s.gcr.io/kube-apiserver v1.14.1
k8s.gcr.io/kube-controller-manager v1.14.1
k8s.gcr.io/kube-scheduler v1.14.1
k8s.gcr.io/coredns 1.3.1
k8s.gcr.io/etcd 3.3.10
k8s.gcr.io/pause 3.1
And so, for the dashboard part, I tried to start it with the command
> kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/aio/deploy/recommended/kubernetes-dashboard.yaml
But the dashboard pod is stuck in Pending state.
kubectl -n kube-system get pod
NAME READY STATUS RESTARTS AGE
coredns-fb8b8dccf-kb2zq 0/1 Pending 0 40m
coredns-fb8b8dccf-nnc5n 0/1 Pending 0 40m
etcd-kubemaster 1/1 Running 0 38m
kube-apiserver-kubemaster 1/1 Running 0 38m
kube-controller-manager-kubemaster 1/1 Running 0 39m
kube-proxy-lxhvs 1/1 Running 0 40m
kube-proxy-xllx4 0/1 ContainerCreating 0 27m
kube-scheduler-kubemaster 1/1 Running 0 38m
kubernetes-dashboard-5f7b999d65-qn8qn 1/1 Pending 0 8s
So, even though my problem originally was that I cannot access my dashboard, I guess that the real problem is deeper than that.
I know that I just put a lot of information here, but I am a k8s beginner and I am completely lost on this.
This is an issue I experienced with coredns pods stuck in the Pending state when setting up my own cluster, which I resolved by adding a pod network.
It looks like, because there is no network add-on installed, the nodes are tainted as not-ready. Installing the add-on removes the taints and the Pods are able to be scheduled. In my case, adding flannel fixed the issue.
EDIT: There is a note about this in the official k8s documentation - Create cluster with kubeadm:
The network must be deployed before any applications. Also, CoreDNS
will not start up before a network is installed. kubeadm only
supports Container Network Interface (CNI) based networks (and does
not support kubenet).
Actually, it is the opposite of a deep or serious issue; this is a trivial one. Whenever you see a pod stuck in the Pending state, it means the scheduler is having a hard time scheduling the pod, mostly because there are not enough resources on the node.
In your case, the node has a taint and your pod doesn't have the corresponding toleration. What you have to do is describe the node and get the taint:
kubectl describe node | grep -i taints
Note: you might have more than one taint, so you might want to do kubectl describe no NODE, since with grep you might only see the first taint.
Once you get the taint, it will be something like hello=world:NoSchedule, which means key=value:effect. You will then have to add a tolerations section to your Deployment. This is an example Deployment so you can see how it should look:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 10
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx
        ports:
        - containerPort: 80
          name: http
      tolerations:
      - effect: NoExecute # NoSchedule, PreferNoSchedule
        key: node
        operator: Equal
        value: not-ready
        tolerationSeconds: 3600
As you can see, there is a tolerations section in the YAML. So, if I had a node with the node=not-ready:NoExecute taint, no pod would be able to be scheduled on that node unless it had this toleration.
You can also remove the taint, if you don't need it. To remove a taint, you would describe the node, get the key of the taint and do:
kubectl taint node NODE key-
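For example, assuming the standard kubeadm master taint shows up on your node (use whatever key your own describe output actually shows):
$ kubectl describe node kubemaster | grep -i taints
$ kubectl taint node kubemaster node-role.kubernetes.io/master-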
Hope it makes sense. Just add this section to your deployment, and it will work.
Set up the flannel network tool by running the following commands:
$ sysctl net.bridge.bridge-nf-call-iptables=1
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/62e44c867a2846fefb68bd5f178daf4da3095ccb/Documentation/kube-flannel.yml
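Once the network add-on is up, the not-ready taint should be cleared automatically and the coredns pods should move to Running. You can watch the progress with:
$ kubectl -n kube-system get pods --watch
$ kubectl describe node | grep -i taints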