New deployment is going into ImagePullBackOff - kubernetes

I have installed minikube on my local machine and have created a deployment from a YAML file with imagePullPolicy: Always.
On running minikube kubectl -- get pods, the status of the pods is ImagePullBackOff,
and on running
minikube kubectl -- describe pod podname
I am getting the following results:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulling 5m39s (x103 over 26h) kubelet Pulling image "deploy1:1.14.2"
Normal BackOff 44s (x2244 over 26h) kubelet Back-off pulling image "deploy1:1.14.2"
Please suggest how to get the deployment running. I have gone through the link, but I could not find the service.xml file of the pod. Where is it in Kubernetes on the local system?

It means either you are trying to pull an image from a private repo, or you don't have connectivity to the outside. You can test this by running kubectl run <pod_name> --image=nginx. If this works, then it means you are trying to pull an image from a repo which requires auth.
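A quick sanity check along those lines (test-nginx is just a placeholder pod name):
$ kubectl run test-nginx --image=nginx
$ kubectl get pod test-nginx
$ kubectl delete pod test-nginx
If test-nginx reaches Running, outbound connectivity and public pulls work, so the original image most likely lives in a registry that requires authentication, or the image name/tag does not exist in the registry the cluster is pulling from.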

kubernetes cannot pull a public image

Standard images like nginx are downloading successfully, but my pet project is not downloading. I'm using minikube to launch the Kubernetes cluster.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-gateway-deploumnet
  labels:
    app: api-gateway
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-gateway
  template:
    metadata:
      labels:
        app: api-gateway
    spec:
      containers:
        - name: api-gateway
          image: creatorsprodhouse/api-gateway:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 80
When I try to create the deployment I get an error that Kubernetes cannot download my public image.
$ kubectl get pods
result:
NAME READY STATUS RESTARTS AGE
api-gateway-deploumnet-599c784984-j9mf2 0/1 ImagePullBackOff 0 13m
api-gateway-deploumnet-599c784984-qzklt 0/1 ImagePullBackOff 0 13m
api-gateway-deploumnet-599c784984-csxln 0/1 ImagePullBackOff 0 13m
$ kubectl logs api-gateway-deploumnet-599c784984-csxln
result
Error from server (BadRequest): container "api-gateway" in pod "api-gateway-deploumnet-86f6cc5b65-xdx85" is waiting to start: trying and failing to pull image
What could be the problem? The standard images are downloading but my public one is not. Any help would be appreciated.
EDIT 1
$ kubectl describe pod api-gateway-deploumnet-599c784984-csxln
result:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 8m22s default-scheduler Successfully assigned default/api-gateway-deploumnet-849899786d-mq4td to minikube
Warning Failed 3m8s kubelet Failed to pull image "creatorsprodhouse/api-gateway:latest": rpc error: code = Unknown desc = context deadline exceeded
Warning Failed 3m8s kubelet Error: ErrImagePull
Normal BackOff 3m7s kubelet Back-off pulling image "creatorsprodhouse/api-gateway:latest"
Warning Failed 3m7s kubelet Error: ImagePullBackOff
Normal Pulling 2m53s (x2 over 8m21s) kubelet Pulling image "creatorsprodhouse/api-gateway:latest"
EDIT 2
If I try to pull the Docker image separately with docker, it works fine
$ docker pull creatorsprodhouse/api-gateway:latest
result:
Digest: sha256:e664a9dd9025f80a3dd60d157ce1464d4df7d0f8a00538e6a137d44f9f9f12aa
Status: Downloaded newer image for creatorsprodhouse/api-gateway:latest
docker.io/creatorsprodhouse/api-gateway:latest
EDIT 3
After advice to restart minikube
$ minikube stop
$ minikube delete --purge
$ minikube start --cni=calico
I started the pods and got the following events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m28s default-scheduler Successfully assigned default/api-gateway-deploumnet-849899786d-bkr28 to minikube
Warning FailedCreatePodSandBox 4m27s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "7e112c92e24199f268ec9c6f3a6db69c2572c0751db9fd57a852d1b9b412e0a1" network for pod "api-gateway-deploumnet-849899786d-bkr28": networkPlugin cni failed to set up pod "api-gateway-deploumnet-849899786d-bkr28_default" network: failed to set bridge addr: could not add IP address to "cni0": permission denied, failed to clean up sandbox container "7e112c92e24199f268ec9c6f3a6db69c2572c0751db9fd57a852d1b9b412e0a1" network for pod "api-gateway-deploumnet-849899786d-bkr28": networkPlugin cni failed to teardown pod "api-gateway-deploumnet-849899786d-bkr28_default" network: running [/usr/sbin/iptables -t nat -D POSTROUTING -s 10.85.0.34 -j CNI-57e7da7379b524635074e6d0 -m comment --comment name: "crio" id: "7e112c92e24199f268ec9c6f3a6db69c2572c0751db9fd57a852d1b9b412e0a1" --wait]: exit status 2: iptables v1.8.4 (legacy): Couldn't load target `CNI-57e7da7379b524635074e6d0':No such file or directory
Try `iptables -h' or 'iptables --help' for more information.
I could not solve the problem in the ways that were suggested. However, it worked when I ran minikube with a different driver:
$ minikube start --driver=none
--driver=none means that the cluster will run directly on your host instead of with the standard --driver=docker, which runs the cluster inside Docker.
It is better to run minikube with --driver=docker as it is safer and easier, but that didn't work for me because I could not download my images. For me personally it is OK to use --driver=none, although it is a bit dangerous.
In general, if anyone knows what the problem is, please answer my question. In the meantime you can try to run the minikube cluster on your host with the command I mentioned above.
In any case, thank you very much for your attention!

kubectl ImagePullBackOff due to secret

I'm creating kubevirt in minikube; initially kubevirt-operator.yaml fails with ImagePullBackOff. After I added the secret in the yaml
imagePullSecrets:
  - name: regcred
containers:
all my virt-operator* pods started to run. The virt-api* pods still show ImagePullBackOff.
The error comes out as
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 27m default-scheduler Successfully assigned kubevirt/virt-api-787487d9cd-t68qf to minikube
Normal Pulling 25m (x4 over 27m) kubelet Pulling image "us-ashburn-1.ocir.io/xxx/virt-api:v0.54.0"
Warning Failed 25m (x4 over 27m) kubelet Failed to pull image "us-ashburn-1.ocir.io/xxx/virt-api:v0.54.0": rpc error: code = Unknown desc = Error response from daemon: pull access denied for us-ashburn-1.ocir.io/xxx/virt-api, repository does not exist or may require 'docker login': denied: Anonymous users are only allowed read access on public repos
Warning Failed 25m (x4 over 27m) kubelet Error: ErrImagePull
Warning Failed 25m (x6 over 27m) kubelet Error: ImagePullBackOff
Normal BackOff 2m26s (x106 over 27m) kubelet Back-off pulling image "us-ashburn-1.ocir.io/xxx/virt-api:v0.54.0"
Manually, I can pull the same image with docker login. Any help would be much appreciated. Thanks
This Docker image looks like it is in a private registry (and from Oracle), and I assume the regcred secret is not correct. Can you log in there with docker login? If so, you can create the regcred secret like this:
$ kubectl create secret docker-registry regcred --docker-server=<region-key>.ocir.io --docker-username='<tenancy-namespace>/<oci-username>' --docker-password='<oci-auth-token>' --docker-email='<email-address>'
Also check this Oracle tutorial: https://www.oracle.com/webfolder/technetwork/tutorials/obe/oci/oke-and-registry/index.html
There you can find the steps for adding the secret values to the cluster.
If you are using a private registry, check that your secret exists and that it is correct. The secret should also be in the same namespace as the pod.
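A quick way to check both, assuming the secret is named regcred and the workload runs in the kubevirt namespace:
$ kubectl get secret regcred -n kubevirt
$ kubectl get secret regcred -n kubevirt -o jsonpath='{.data.\.dockerconfigjson}' | base64 --decode
The decoded output should contain the registry hostname (here us-ashburn-1.ocir.io) and working credentials.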
Your Minikube is a VM, not your localhost. Try this:
Open a terminal
eval $(minikube docker-env)
docker build .
kubectl create -f deployment.yaml
This is only valid for the current terminal; if you close it, open a new terminal and run eval $(minikube docker-env) again.
eval $(minikube docker-env) points your Docker client at the Docker daemon inside Minikube, so the image is built inside Minikube and is available to the cluster.
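A minimal sketch of that workflow, using a hypothetical image name my-app:dev:
eval $(minikube docker-env)            # point the Docker CLI at Minikube's Docker daemon
docker build -t my-app:dev .           # the image now exists inside the cluster
kubectl create deployment my-app --image=my-app:dev
Note that the pod then needs imagePullPolicy: IfNotPresent or Never (the default for a tag other than :latest), otherwise the kubelet will still try to pull the image from a remote registry.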
Also, try logging in to Docker on all nodes by using docker login.
There is also a lengthy blog post describing how to debug image pull back-off in depth here
Hey, if you look here, I think you can find some helpful documentation.
What they are doing is uploading the Docker config file, which holds the login credentials, as a secret and then referring to that secret in the deployment.
You could try to follow these steps and do something similar. Let me know if it works.
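A minimal sketch of what that reference looks like in a deployment's pod spec, assuming a docker-registry secret named regcred has already been created (for example with the kubectl create secret docker-registry command shown above):
spec:
  template:
    spec:
      imagePullSecrets:
        - name: regcred
      containers:
        - name: virt-api
          image: us-ashburn-1.ocir.io/xxx/virt-api:v0.54.0
The kubelet then uses the credentials from regcred when pulling that image.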

More detailed monitoring of Pod states

Our Pods usually spend at least a minute and up to several minutes in the Pending state; the events via kubectl describe pod x yield:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned testing/runner-2zyekyp-project-47-concurrent-0tqwl4 to host
Normal Pulled 55s kubelet, host Container image "registry.com/image:c1d98da0c17f9b1d4ca81713c138ee2e" already present on machine
Normal Created 55s kubelet, host Created container build
Normal Started 54s kubelet, host Started container build
Normal Pulled 54s kubelet, host Container image "gitlab/gitlab-runner-helper:x86_64-6214287e" already present on machine
Normal Created 54s kubelet, host Created container helper
Normal Started 54s kubelet, host Started container helper
The provided information is not detailed enough to figure out exactly what is happening.
Question:
How can we gather more detailed metrics of what exactly happens, and when, while a Pod is being brought up, in order to troubleshoot which step takes how much time?
Of special interest would be how long it takes to mount a volume.
Check the kubelet and kube-scheduler logs, because the kube-scheduler schedules the pod to a node and the kubelet starts the pod on that node and reports its status as ready.
journalctl -u kubelet # after logging into the kubernetes node
kubectl logs kube-scheduler-<control-plane-node-name> -n kube-system
Describe the pod, deployment, replicaset to get more details
kubectl describe pod podname -n namespacename
kubectl describe deploy deploymentname -n namespacename
kubectl describe rs replicasetname -n namespacename
Check events
kubectl get events -n namespacename
Describe the nodes and check available resources and status which should be ready.
kubectl describe node nodename
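For the timing question specifically, one low-tech option (a sketch, not a full metrics pipeline) is to read the timestamps on the pod's status conditions, which record when scheduling, init, and readiness completed, and compare them with the event timestamps:
kubectl get pod <podname> -n <namespace> -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.lastTransitionTime}{"\n"}{end}'
kubectl get events -n <namespace> --sort-by=.lastTimestamp
Comparing the PodScheduled, Initialized, ContainersReady, and Ready transition times against the sorted events (including any volume attach events) gives a rough per-step breakdown.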

My old windows pods are dead and don't respond to http requests / exec fails

I have an AKS cluster with a mix of Windows and Linux nodes and an nginx-ingress.
This all worked great, but a few days ago all my windows pods have become unresponsive.
Everything is still green on the K8s dashboard, but they don't respond to HTTP requests and kubectl exec fails.
All the linux pods still work.
I created a new deployment with the exact same image and other properties, and this new pod works, responds to HTTP and kubectl exec works.
Q: How can I find out why my old pods died? How can I prevent this from occurring again in the future?
Note that this is a test cluster, so I have the luxury of being able to investigate, if this was prod I would have burned and recreated the cluster already.
Details:
https://aks-test.progress-cloud.com/eboswebApi/ is one of the old pods, https://aks-test.progress-cloud.com/eboswebApi2/ is the new pod.
When I look at the nginx log, I see a lot of connect() failed (111: Connection refused) while connecting to upstream.
When I try kubectl exec -it <podname> --namespace <namespace> -- cmd I get one of two behaviors:
Either the command immediately returns without printing anything, or I get an error:
container 1dfffa08d834953c29acb8839ea2d4c6b78b7a530371d98c16b15132d49f5c52 encountered an error during CreateProcess: failure in a Windows system call: The remote procedure call failed and did not execute. (0x6bf) extra info: {"CommandLine":"cmd","WorkingDirectory":"C:\\inetpub\\wwwroot","Environment":{...},"EmulateConsole":true,"CreateStdInPipe":true,"CreateStdOutPipe":true,"ConsoleSize":[0,0]}
command terminated with exit code 126
kubectl describe pod works on both.
The only difference I could find was that on the old pod, I don't get any events:
Events: <none>
whereas on the new pod I get a bunch of them for pulling the image etc:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 39m default-scheduler Successfully assigned ingress-basic/ebos-webapi-test-2-78786968f4-xmvfw to aksnpwin000000
Warning Failed 38m kubelet, aksnpwin000000 Error: failed to start container "ebos-webapi-test-2": Error response from daemon: hcsshim::CreateComputeSystem ebos-webapi-test-2: The binding handle is invalid.
(extra info: {"SystemType":"Container","Name":"ebos-webapi-test-2","Owner":"docker","VolumePath":"\\\\?\\Volume{dac026db-26ab-11ea-bb33-e3730ff9432d}","IgnoreFlushesDuringBoot":true,"LayerFolderPath":"C:\\ProgramData\\docker\\windowsfilter\\ebos-webapi-test-2","Layers":[{"ID":"8c160b6e-685a-58fc-8c4b-beb407ad09b4","Path":"C:\\ProgramData\\docker\\windowsfilter\\12061f29088664dc41c0836c911ed7ced1f6d7ed38b1c932c25cd8ca85a3a88e"},{"ID":"6a230a46-a97c-5e30-ac4a-636e62cd9253","Path":"C:\\ProgramData\\docker\\windowsfilter\\8c0ce5a9990bc433c4d937aa148a4251ef55c1aa7caccf1b2025fd64b4feee97"},{"ID":"240d5705-d8fe-555b-a966-1fc304552b64","Path":"C:\\ProgramData\\docker\\windowsfilter\\2b334b769fe19d0edbe1ad8d1ae464c8d0103a7225b0c9e30fdad52e4b454b35"},{"ID":"5f5d8837-5f62-5a76-a706-9afb789e45e4","Path":"C:\\ProgramData\\docker\\windowsfilter\\3d1767755b0897aaae21e3fb7b71e2d880de22473f0071b0dca6301bb6110077"},{"ID":"978503cb-b816-5f66-ba41-ed154db333d5","Path":"C:\\ProgramData\\docker\\windowsfilter\\53d2e85a90d2b8743b0502013355df5c5e75448858f0c1f5b435281750653520"},{"ID":"d7d0d14e-b097-5104-a492-da3f9396bb06","Path":"C:\\ProgramData\\docker\\windowsfilter\\38830351b46e7a0598daf62d914eb2bf01e6eefde7ac560e8213f118d2bd648c"},{"ID":"90b1c608-be4c-55a1-a787-db3a97670149","Path":"C:\\ProgramData\\docker\\windowsfilter\\84b71fda82ea0eacae7b9382eae2a26f3c71bf118f5c80e7556496f21e754126"},{"ID":"700711b2-d578-5d7c-a17f-14165a5b3507","Path":"C:\\ProgramData\\docker\\windowsfilter\\08dd6f93c96c1ac6acd3d2e8b60697340c90efe651f805809dbe87b6bd26a853"},{"ID":"270de12a-461c-5b0c-8976-a48ae0de2063","Path":"C:\\ProgramData\\docker\\windowsfilter\\115de87074fadbc3c44fc33813257c566753843f8f4dd7656faa111620f71f11"},{"ID":"521250bb-4f30-5ac4-8fcd-b4cf45866627","Path":"C:\\ProgramData\\docker\\windowsfilter\\291e51f5f030d2a895740fae3f61e1333b7fae50a060788040c8d926d46dbe1c"},{"ID":"6dded7bf-8c1e-53bb-920e-631e78728316","Path":"C:\\ProgramData\\docker\\windowsfilter\\938e721c29d2f2d23a00bf83e5bc60d92f9534da409d0417f479bd5f06faa080"},{"ID":"90dec4e9-89fe-56ce-a3c2-2770e6ec362c","Path":"C:\\ProgramData\\docker\\windowsfilter\\d723ebeafd1791f80949f62cfc91a532cc5ed40acfec8e0f236afdbcd00bbff2"},{"ID":"94ac6066-b6f3-5038-9e1b-d5982fcefa00","Path":"C:\\ProgramData\\docker\\windowsfilter\\00d1bb6fc8abb630f921d3651b1222352510d5821779d8a53d994173a4ba1126"},{"ID":"037c6d16-5785-5bea-bab4-bc3f69362e0c","Path":"C:\\ProgramData\\docker\\windowsfilter\\c107cf79e8805e9ce6d81ec2a798bf4f1e3b9c60836a40025272374f719f2270"}],"ProcessorWeight":5000,"HostName":"ebos-webapi-test-2-78786968f4-xmvfw","MappedDirectories":[{"HostPath":"c:\\var\\lib\\kubelet\\pods\\c44f445c-272b-11ea-b9bc-ae0ece5532e1\\volumes\\kubernetes.io~secret\\default-token-n5tnc","ContainerPath":"c:\\var\\run\\secrets\\kubernetes.io\\serviceaccount","ReadOnly":true,"BandwidthMaximum":0,"IOPSMaximum":0,"CreateInUtilityVM":false}],"HvPartition":false,"NetworkSharedContainerName":"4c9bede623553673fde0da6e8dc92f9a55de1ff823a168a35623ad8128f83ecb"})
Normal Pulling 38m (x2 over 38m) kubelet, aksnpwin000000 Pulling image "progress.azurecr.io/eboswebapi:release-2019-11-11_16-41"
Normal Pulled 38m (x2 over 38m) kubelet, aksnpwin000000 Successfully pulled image "progress.azurecr.io/eboswebapi:release-2019-11-11_16-41"
Normal Created 38m (x2 over 38m) kubelet, aksnpwin000000 Created container ebos-webapi-test-2
Normal Started 38m kubelet, aksnpwin000000 Started container ebos-webapi-test-2

Installing kubernetes-dashboard with helm fails

I've just created a new kubernetes cluster. The only thing I have done beyond setting up the cluster is install Tiller using helm init and install the kubernetes dashboard through helm install stable/kubernetes-dashboard.
The helm install command seems to be successful and helm ls outputs:
NAME REVISION UPDATED STATUS CHART APP VERSION NAMESPACE
exhaling-ladybug 1 Thu Oct 24 16:56:49 2019 DEPLOYED kubernetes-dashboard-1.10.0 1.10.1 default
However after waiting a few minutes the deployment is still not ready.
Running kubectl get pods shows the pod's status as CrashLoopBackOff.
NAME READY STATUS RESTARTS AGE
exhaling-ladybug-kubernetes-dashboard 0/1 CrashLoopBackOff 10 31m
The description for the pod shows the following events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 31m default-scheduler Successfully assigned default/exhaling-ladybug-kubernetes-dashboard to nodes-1
Normal Pulling 31m kubelet, nodes-1 Pulling image "k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1"
Normal Pulled 31m kubelet, nodes-1 Successfully pulled image "k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1"
Normal Started 30m (x4 over 31m) kubelet, nodes-1 Started container kubernetes-dashboard
Normal Pulled 30m (x4 over 31m) kubelet, nodes-1 Container image "k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1" already present on machine
Normal Created 30m (x5 over 31m) kubelet, nodes-1 Created container kubernetes-dashboard
Warning BackOff 107s (x141 over 31m) kubelet, nodes-1 Back-off restarting failed container
And the logs show the following panic message
panic: secrets is forbidden: User "system:serviceaccount:default:exhaling-ladybug-kubernetes-dashboard" cannot create resource "secrets" in API group "" in the namespace "kube-system"
Am I doing something wrong? Why is it trying to create a secret somewhere it cannot?
Is it possible to setup without giving the dashboard account cluster-admin permissions?
Check this out, mate:
https://akomljen.com/installing-kubernetes-dashboard-per-namespace/
You can create your own roles if you want to.
By default I have used the default namespace; if yours is different, replace it with your own:
kubectl create serviceaccount exhaling-ladybug-kubernetes-dashboard
kubectl create clusterrolebinding kubernetes-dashboard --clusterrole=cluster-admin --serviceaccount=default:exhaling-ladybug-kubernetes-dashboard
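If you would rather not grant cluster-admin (as the question asks), a narrower sketch that only allows what the panic message complains about, creating secrets in kube-system, could look like this (the Role and RoleBinding names are placeholders):
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dashboard-secrets
  namespace: kube-system
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["create", "get", "update", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dashboard-secrets
  namespace: kube-system
subjects:
  - kind: ServiceAccount
    name: exhaling-ladybug-kubernetes-dashboard
    namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: dashboard-secrets
The dashboard may still need extra read permissions depending on what you want it to display, but this avoids handing out cluster-admin just to get past the startup panic.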
Based on the error you have posted, what is happening is:
1. Helm is trying to install the dashboard, but by default it picks up the namespace you have provided.
To solve that:
1. Either create roles for the namespace you are installing into (by default the namespace should be: default), or
2. install the helm chart into the namespace the chart expects; in your case you can do:
helm install stable/kubernetes-dashboard --name=kubernetes-dashboard --namespace=kube-system
Try creating a cluster role binding:
kubectl create clusterrolebinding kubernetes-dashboard --clusterrole=cluster-admin --serviceaccount=kube-system:kubernetes-dashboard