CrashLoopBackOff while creating nginx controller - Kubernetes

I have installed Kubernetes on AWS EC2 machines; the cluster has a master and two nodes connected to it:
[root@k8-m deployments]# kubectl get nodes
NAME    STATUS   ROLES                  AGE    VERSION
k8-m    Ready    control-plane,master   107m   v1.20.1
k8-n1   Ready    <none>                 101m   v1.20.1
k8-n2   Ready    <none>                 91m    v1.20.1
I have a requirement to install an ingress controller for exposing traffic outside the cluster, and the chosen controller is nginx. I am creating the resources such as the namespace, service account, secret, RBAC, config map, ap-rbac, and daemon-set config taken from https://github.com/nginxinc/kubernetes-ingress.git.
After creating the resources for the ingress controller, I am seeing the pods going into CrashLoopBackOff state:
[root@k8-m deployments]# kubectl get all -n nginx-ingress
NAME                                 READY   STATUS             RESTARTS   AGE
pod/nginx-ingress-555f75f85f-5vxf6   0/1     CrashLoopBackOff   7          11m
pod/nginx-ingress-7wmhw              0/1     CrashLoopBackOff   7          11m
pod/nginx-ingress-mss7v              0/1     CrashLoopBackOff   7          11m

NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/nginx-ingress   2         2         0       2            0           <none>          11m

NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/nginx-ingress   0/1     1            0           11m

NAME                                       DESIRED   CURRENT   READY   AGE
replicaset.apps/nginx-ingress-555f75f85f   1         1         0       11m
Describing the pod, I get the below (pasting only the event details):
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 14m default-scheduler Successfully assigned nginx-ingress/nginx-ingress-555f75f85f-5vxf6 to k8-n2
Normal Pulled 14m kubelet Successfully pulled image "nginx/nginx-ingress:edge" in 2.456877779s
Normal Pulled 14m kubelet Successfully pulled image "nginx/nginx-ingress:edge" in 2.501405255s
Normal Pulled 13m kubelet Successfully pulled image "nginx/nginx-ingress:edge" in 2.63456627s
Normal Created 13m (x4 over 14m) kubelet Created container nginx-ingress
Normal Started 13m (x4 over 14m) kubelet Started container nginx-ingress
Normal Pulled 13m kubelet Successfully pulled image "nginx/nginx-ingress:edge" in 2.659821346s
Normal Pulling 12m (x5 over 14m) kubelet Pulling image "nginx/nginx-ingress:edge"
Warning BackOff 3m53s (x47 over 14m) kubelet Back-off restarting failed container
I am not able to see the logs, though.
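For a container stuck in CrashLoopBackOff, the logs of the previous run are usually still retrievable; a sketch, using one of the pod names from the output above:
kubectl -n nginx-ingress logs nginx-ingress-555f75f85f-5vxf6 --previous
kubectl -n nginx-ingress describe pod nginx-ingress-555f75f85f-5vxf6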
Below are the commands I executed while creating the nginx controller:
kubectl create -f common/ns-and-sa.yaml
kubectl create -f rbac/rbac.yaml
kubectl create -f rbac/ap-rbac.yaml
kubectl create -f common/default-server-secret.yaml
kubectl create -f common/nginx-config.yaml
kubectl create -f deployment/nginx-ingress.yaml
kubectl create -f daemon-set/nginx-ingress.yaml
Could anyone here advise me on this?

Related

How to make k8s imagePullPolicy = never work?

I have followed the instructions on this blog to create a simple container image and deploy it in a k8s cluster.
However, in my case the pods do not run:
student@master:~$ k get pod -o wide -l app=hello-python --field-selector spec.nodeName=master
NAME                            READY   STATUS              RESTARTS   AGE     IP                NODE     NOMINATED NODE   READINESS GATES
hello-python-58547cf485-7l8dg   0/1     ErrImageNeverPull   0          2m26s   192.168.219.126   master   <none>           <none>
hello-python-598c594dc5-4c9zd   0/1     ErrImageNeverPull   0          2m26s   192.168.219.67    master   <none>           <none>
student@master:~$ sudo podman images hello-python
REPOSITORY               TAG      IMAGE ID       CREATED          SIZE
localhost/hello-python   latest   11cf1e5a86b1   50 minutes ago   941 MB
student@master:~$ hostname
master
student@master:~$
I understand why it may not work on the worker node, but why does it not work on the same node where the image is cached, i.e. the master node?
student@master:~$ k describe pod hello-python-58547cf485-7l8dg | grep -A 10 'Events:'
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 10m default-scheduler Successfully assigned default/hello-python-58547cf485-7l8dg to master
Warning Failed 8m7s (x12 over 10m) kubelet Error: ErrImageNeverPull
Warning ErrImageNeverPull 4m59s (x27 over 10m) kubelet Container image "localhost/hello-python:latest" is not present with pull policy of Never
student@master:~$
My question is: how do I make the pod run on the master node with imagePullPolicy = Never, given that the image in question is available on the master node, as the podman images output attests?
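For context, imagePullPolicy is set per container in the Deployment's pod template; a hypothetical fragment (the image name is taken from the output above, the container name and the rest are assumed):
spec:
  containers:
  - name: hello-python
    image: localhost/hello-python:latest
    imagePullPolicy: Never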
EDIT 1
I am using a k8s cluster running on two VMs deployed in GCE. It was set up with a script provided in the context of the Linux Foundation Kubernetes Developer course LFD259.
EDIT 2
The master node is allowed to run workloads - this is how the LFD259 course sets it up. For example:
student@master:~$ k create deployment xyz --image=httpd
deployment.apps/xyz created
student@master:~$ k get pod -o wide
NAME                   READY   STATUS    RESTARTS   AGE     IP               NODE     NOMINATED NODE   READINESS GATES
xyz-6c6bd4cd89-qn4zr   1/1     Running   0          5m37s   192.168.171.66   worker   <none>           <none>
student@master:~$
student@master:~$ k scale deployment xyz --replicas=10
deployment.apps/xyz scaled
student@master:~$ k get pod -o wide
NAME                   READY   STATUS              RESTARTS   AGE     IP                NODE     NOMINATED NODE   READINESS GATES
xyz-6c6bd4cd89-c2xv4   1/1     Running             0          73s     192.168.219.71    master   <none>           <none>
xyz-6c6bd4cd89-g89k2   0/1     ContainerCreating   0          73s     <none>            master   <none>           <none>
xyz-6c6bd4cd89-jfftl   0/1     ContainerCreating   0          73s     <none>            worker   <none>           <none>
xyz-6c6bd4cd89-kbdnq   1/1     Running             0          73s     192.168.219.106   master   <none>           <none>
xyz-6c6bd4cd89-nm6rt   0/1     ContainerCreating   0          73s     <none>            worker   <none>           <none>
xyz-6c6bd4cd89-qn4zr   1/1     Running             0          7m22s   192.168.171.66    worker   <none>           <none>
xyz-6c6bd4cd89-vts6x   1/1     Running             0          73s     192.168.171.84    worker   <none>           <none>
xyz-6c6bd4cd89-wd2ls   1/1     Running             0          73s     192.168.171.127   worker   <none>           <none>
xyz-6c6bd4cd89-wv4jn   0/1     ContainerCreating   0          73s     <none>            worker   <none>           <none>
xyz-6c6bd4cd89-xvtlm   0/1     ContainerCreating   0          73s     <none>            master   <none>           <none>
student@master:~$
It depends on how you've set up your Kubernetes cluster. I assume you've installed it with kubeadm. By default, however, the master is not schedulable for workloads. And by my understanding the image you're talking about only exists on the master node, right? If that's the case, you can't start a pod with that image, as it only exists on the master node, which doesn't allow workloads by default.
If you were to copy the image to the worker node, your given command should work.
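One hedged way to do that copy, assuming the worker's kubelet uses containerd as its runtime (adjust if it is CRI-O or Docker) and that you have SSH access to the worker:
podman save -o hello-python.tar localhost/hello-python:latest
scp hello-python.tar worker:
ssh worker 'sudo ctr -n k8s.io images import hello-python.tar'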
However, if you want to make your master node schedulable, just remove its taint with the following (you may need to amend the taint key if it differs from yours):
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
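Afterwards you can check which taints (if any) remain on the node; note that on some Kubernetes versions the key is node-role.kubernetes.io/master rather than control-plane:
kubectl describe node master | grep -i taint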

GKE Deploy issue - Free Tier with credit - Workloads

I am trying to deploy on a minimal cluster and failing.
How can I tweak the configuration to make the availability green?
My input:
My application is a Spring + Angular app (please suggest an easy way where I can deploy both).
My docker-compose setup created 2 containers. I pushed them to the registry (tagged).
When deploying in Workloads, I added one container after another and clicked Deploy. The resulting error is above.
Is there a file I need to create - some kind of yml/yaml? (A minimal manifest sketch follows the events below.)
kubectl get pods
NAME                  READY   STATUS             RESTARTS   AGE
nginx-1-d...7-2s6hb   0/2     CrashLoopBackOff   18         25m
nginx-1-6..d7-7645w   0/2     CrashLoopBackOff   18         25m
nginx-1-6...7-9qgjx   0/2     CrashLoopBackOff   18         25m
Events from describe:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 17m default-scheduler Successfully assigned default/nginx-1-5d...56xp4 to gke-cluster-huge-default-pool-b6..60-4rj5
Normal Pulling 17m kubelet Pulling image "eu.gcr.io/p..my/py...my_appserver#sha256:479bf3e12ee2b410d730...579b940adc8845be74956f5"
Normal Pulled 17m kubelet Successfully pulled image "eu.gcr.io/py..my/py...emy_appserver#sha256:479bf3e12ee2b4..8b99a178ee05e8579b940adc8845be74956f5" in 11.742649177s
Normal Created 15m (x5 over 17m) kubelet Created container p..my-appserver-sha256-1
Normal Started 15m (x5 over 17m) kubelet Started container p..emy-appserver-sha256-1
Normal Pulled 15m (x4 over 17m) kubelet Container image "eu.gcr.io/py...my/pya...my_appserver#sha256:479bf3e12ee2b41..e05e8579b940adc8845be74956f5" already present on machine
Warning BackOff 2m42s (x64 over 17m) kubelet Back-off restarting failed container
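For the question above about whether a yml/yaml file is needed: a minimal sketch of a two-container Deployment manifest (all names and image paths below are placeholders, not taken from the question):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: appserver
        image: eu.gcr.io/PROJECT/appserver:TAG
      - name: frontend
        image: eu.gcr.io/PROJECT/frontend:TAG
It can then be applied with kubectl apply -f deployment.yaml.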

CoreDNS in CrashLoopBackOff state with Calico network

I have Ubuntu 16.04 running in VirtualBox. I installed Kubernetes on it as a single node using kubeadm,
but the CoreDNS pods are in CrashLoopBackOff state.
All other pods are running.
Single interface (enp0s3) - bridged network
Applied Calico using:
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
Output of kubectl describe pod:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 41m default-scheduler Successfully assigned kube-system/coredns-66bff467f8-dxzq7 to kube
Normal Pulled 39m (x5 over 41m) kubelet, kube Container image "k8s.gcr.io/coredns:1.6.7" already present on machine
Normal Created 39m (x5 over 41m) kubelet, kube Created container coredns
Normal Started 39m (x5 over 41m) kubelet, kube Started container coredns
Warning BackOff 87s (x194 over 41m) kubelet, kube Back-off restarting failed container
I did a kubectl logs <coredns-pod>, found the error logs below, and looked at the link mentioned there.
As per the suggestion, I added resolv.conf = /etc/resolv.conf at the end of /etc/kubernetes/kubelet/conf.yaml and recreated the pod.
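For reference, on a stock kubeadm install the kubelet configuration usually lives in /var/lib/kubelet/config.yaml and the relevant key is resolvConf (the path in this setup may differ); a minimal sketch, followed by a kubelet restart:
# /var/lib/kubelet/config.yaml (kubeadm default location)
resolvConf: /etc/resolv.conf
sudo systemctl restart kubelet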
kubectl logs coredns-66bff467f8-dxzq7 -n kube-system
.:53 [INFO] plugin/reload: Running configuration MD5 = 4e235fcc3696966e76816bcd9034ebc7 CoreDNS-1.6.7 linux/amd64, go1.13.6, da7f65b [FATAL] plugin/loop: Loop (127.0.0.1:34536 -> :53) detected for zone ".", see coredns.io/plugins/loop#troubleshooting. Query: "HINFO 8322382447049308542.5528484581440387393."
root@kube:/home/kube#
I commented out the below line in /etc/resolv.conf (on the host machine) and deleted the CoreDNS pods in the kube-system namespace.
The new pods came up in Running state :)
#nameserver 127.0.1.1
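For anyone following along, the CoreDNS pods can be deleted (and recreated by their Deployment) in one go, assuming the default k8s-app=kube-dns label that kubeadm applies:
kubectl -n kube-system delete pod -l k8s-app=kube-dns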

Troubles while installing GlusterFS on Kubernetes cluster using Heketi

I am trying to install GlusterFS on my Kubernetes cluster using Heketi. I run gk-deploy, but it shows that the pods aren't found:
Using Kubernetes CLI.
Using namespace "default".
Checking for pre-existing resources...
GlusterFS pods ... not found.
deploy-heketi pod ... not found.
heketi pod ... not found.
gluster-s3 pod ... not found.
Creating initial resources ... Error from server (AlreadyExists): error when creating "/heketi/gluster-kubernetes/deploy/kube-templates/heketi-service-account.yaml": serviceaccounts "heketi-service-account" already exists
Error from server (AlreadyExists): clusterrolebindings.rbac.authorization.k8s.io "heketi-sa-view" already exists
clusterrolebinding.rbac.authorization.k8s.io/heketi-sa-view not labeled
OK
node/sapdh2wrk1 not labeled
node/sapdh2wrk2 not labeled
node/sapdh2wrk3 not labeled
daemonset.extensions/glusterfs created
Waiting for GlusterFS pods to start ... pods not found.
I've run gk-deploy more than once.
I have 3 nodes in my Kubernetes cluster, and it seems like the pods can't start up on any of them, but I don't understand why.
The pods are created but aren't ready:
kubectl get pods
NAME                      READY   STATUS              RESTARTS   AGE
glusterfs-65mc7           0/1     Running             0          16m
glusterfs-gnxms           0/1     Running             0          16m
glusterfs-htkmh           0/1     Running             0          16m
heketi-754dfc7cdf-zwpwn   0/1     ContainerCreating   0          74m
Here are the events of one GlusterFS pod; they end with a warning:
Events:
Type Reason Age From Message
Normal Scheduled 19m default-scheduler Successfully assigned default/glusterfs-65mc7 to sapdh2wrk1
Normal Pulled 19m kubelet, sapdh2wrk1 Container image "gluster/gluster-centos:latest" already present on machine
Normal Created 19m kubelet, sapdh2wrk1 Created container
Normal Started 19m kubelet, sapdh2wrk1 Started container
Warning Unhealthy 13m (x12 over 18m) kubelet, sapdh2wrk1 Liveness probe failed: /usr/local/bin/status-probe.sh
failed check: systemctl -q is-active glusterd.service
Warning Unhealthy 3m58s (x35 over 18m) kubelet, sapdh2wrk1 Readiness probe failed: /usr/local/bin/status-probe.sh
failed check: systemctl -q is-active glusterd.service
GlusterFS 5.8-100.1 is installed and started up on every node, including the master.
What is the reason the pods don't start up?
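The probe failures above suggest glusterd is not active inside the pods; a hedged way to confirm, using one of the pod names from the output above:
kubectl exec -it glusterfs-65mc7 -- systemctl status glusterd.service
kubectl logs glusterfs-65mc7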

kube-scheduler and kube-controller-manager restarting

I have a Kubernetes 1.15.3 setup.
My kube-controller-manager and kube-scheduler are restarting very frequently. This started happening after Kubernetes was upgraded to 1.15.3.
kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-5c98db65d4-nmt5d 1/1 Running 37 24d
coredns-5c98db65d4-tg4kx 1/1 Running 37 24d
etcd-ana01 1/1 Running 1 24d
kube-apiserver-ana01 1/1 Running 10 24d
**kube-controller-manager-ana01 1/1 Running 477 9d**
kube-flannel-ds-amd64-2srzb 1/1 Running 0 12d
kube-proxy-2hvcl 1/1 Running 0 23d
**kube-scheduler-ana01 1/1 Running 518 9d**
tiller-deploy-8557598fbc-kxntc 1/1 Running 0 11d
Here are the events for the pod:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 39m (x500 over 23d) kubelet, ana01 Liveness probe failed: Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
Warning BackOff 39m (x1873 over 23d) kubelet, ana01 Back-off restarting failed container
Normal Pulled 28m (x519 over 24d) kubelet, ana01 Container image "k8s.gcr.io/kube-scheduler:v1.15.3" already present on machine
Normal Created 28m (x519 over 24d) kubelet, ana01 Created container kube-scheduler
Normal Started 27m (x519 over 24d) kubelet, ana01 Started container kube-scheduler
The logs are:
I0928 09:10:23.554335 1 serving.go:319] Generated self-signed cert in-memory
W0928 09:10:25.002268 1 authentication.go:387] failed to read in-cluster kubeconfig for delegated authentication: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory
W0928 09:10:25.002523 1 authentication.go:249] No authentication-kubeconfig provided in order to lookup client-ca-file in configmap/extension-apiserver-authentication in kube-system, so client certificate authentication won't work.
W0928 09:10:25.002607 1 authentication.go:252] No authentication-kubeconfig provided in order to lookup requestheader-client-ca-file in configmap/extension-apiserver-authentication in kube-system, so request-header client certificate authentication won't work.
W0928 09:10:25.002947 1 authorization.go:177] **failed to read in-cluster kubeconfig for delegated authorization: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory**
W0928 09:10:25.003116 1 authorization.go:146] No authorization-kubeconfig provided, so SubjectAccessReview of authorization tokens won't work.
I0928 09:10:25.021201 1 server.go:142] Version: v1.15.3
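A hedged next step for digging into the restarts, using the pod names from the output above (on a kubeadm control plane the static pod manifests live in /etc/kubernetes/manifests):
kubectl -n kube-system logs kube-scheduler-ana01 --previous
kubectl -n kube-system logs kube-controller-manager-ana01 --previous
sudo cat /etc/kubernetes/manifests/kube-scheduler.yaml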