Kubernetes cannot pull a public image. Standard images like nginx download successfully, but my pet project's image does not. I'm using minikube to run the Kubernetes cluster.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-gateway-deploumnet
  labels:
    app: api-gateway
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-gateway
  template:
    metadata:
      labels:
        app: api-gateway
    spec:
      containers:
      - name: api-gateway
        image: creatorsprodhouse/api-gateway:latest
        imagePullPolicy: Always
        ports:
        - containerPort: 80
When I try to create the deployment, I get an error that Kubernetes cannot pull my public image.
$ kubectl get pods
result:
NAME READY STATUS RESTARTS AGE
api-gateway-deploumnet-599c784984-j9mf2 0/1 ImagePullBackOff 0 13m
api-gateway-deploumnet-599c784984-qzklt 0/1 ImagePullBackOff 0 13m
api-gateway-deploumnet-599c784984-csxln 0/1 ImagePullBackOff 0 13m
$ kubectl logs api-gateway-deploumnet-599c784984-csxln
result:
Error from server (BadRequest): container "api-gateway" in pod "api-gateway-deploumnet-86f6cc5b65-xdx85" is waiting to start: trying and failing to pull image
What could be the problem? The standard images are downloading but my public one is not. Any help would be appreciated.
EDIT 1
$ kubectl describe pod api-gateway-deploumnet-599c784984-csxln
result:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 8m22s default-scheduler Successfully assigned default/api-gateway-deploumnet-849899786d-mq4td to minikube
Warning Failed 3m8s kubelet Failed to pull image "creatorsprodhouse/api-gateway:latest": rpc error: code = Unknown desc = context deadline exceeded
Warning Failed 3m8s kubelet Error: ErrImagePull
Normal BackOff 3m7s kubelet Back-off pulling image "creatorsprodhouse/api-gateway:latest"
Warning Failed 3m7s kubelet Error: ImagePullBackOff
Normal Pulling 2m53s (x2 over 8m21s) kubelet Pulling image "creatorsprodhouse/api-gateway:latest"
EDIT 2
If I pull the Docker image directly on the host, it works fine:
$ docker pull creatorsprodhouse/api-gateway:latest
result:
Digest: sha256:e664a9dd9025f80a3dd60d157ce1464d4df7d0f8a00538e6a137d44f9f9f12aa
Status: Downloaded newer image for creatorsprodhouse/api-gateway:latest
docker.io/creatorsprodhouse/api-gateway:latest
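A further check I could try (assuming the default docker driver) is to run the same pull from inside the minikube node itself, since the node pulls with its own daemon and network:
$ minikube ssh
$ docker pull creatorsprodhouse/api-gateway:latest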
EDIT 3
After advice to restart minikube
$ minikube stop
$ minikube delete --purge
$ minikube start --cni=calico
I started the pods again, and now I get the following events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m28s default-scheduler Successfully assigned default/api-gateway-deploumnet-849899786d-bkr28 to minikube
Warning FailedCreatePodSandBox 4m27s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "7e112c92e24199f268ec9c6f3a6db69c2572c0751db9fd57a852d1b9b412e0a1" network for pod "api-gateway-deploumnet-849899786d-bkr28": networkPlugin cni failed to set up pod "api-gateway-deploumnet-849899786d-bkr28_default" network: failed to set bridge addr: could not add IP address to "cni0": permission denied, failed to clean up sandbox container "7e112c92e24199f268ec9c6f3a6db69c2572c0751db9fd57a852d1b9b412e0a1" network for pod "api-gateway-deploumnet-849899786d-bkr28": networkPlugin cni failed to teardown pod "api-gateway-deploumnet-849899786d-bkr28_default" network: running [/usr/sbin/iptables -t nat -D POSTROUTING -s 10.85.0.34 -j CNI-57e7da7379b524635074e6d0 -m comment --comment name: "crio" id: "7e112c92e24199f268ec9c6f3a6db69c2572c0751db9fd57a852d1b9b412e0a1" --wait]: exit status 2: iptables v1.8.4 (legacy): Couldn't load target `CNI-57e7da7379b524635074e6d0':No such file or directory
Try `iptables -h' or 'iptables --help' for more information.
I could not solve the problem with any of the suggested approaches. However, it worked when I ran minikube with a different driver:
$ minikube start --driver=none
--driver=none means that the cluster runs directly on your host, instead of the default --driver=docker, which runs the cluster inside Docker.
It is better to run minikube with --driver=docker, as it is safer and easier, but it didn't work for me because I could not pull my images. For me personally it is OK to use --driver=none, although it is a bit risky.
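If anyone else hits these pull timeouts with --driver=docker, another workaround that may help (assuming minikube 1.18 or newer, which has the image load subcommand) is to pull the image on the host and load it into the cluster manually, then use imagePullPolicy: IfNotPresent so the preloaded copy is used:
$ docker pull creatorsprodhouse/api-gateway:latest
$ minikube image load creatorsprodhouse/api-gateway:latest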
In general, if anyone knows what the problem is, please answer my question. In the meantime you can try running the minikube cluster on your host with the command mentioned above.
In any case, thank you very much for your attention!
Related
This is my Skaffold config.
apiVersion: skaffold/v2beta19
kind: Config
metadata:
  name: my-api
build:
  local:
    useDockerCLI: true
  artifacts:
  - image: registry.gitlab.com/lodiee/my-api
    docker:
      buildArgs:
        DATABASE_HOST: '{{.DATABASE_HOST}}'
deploy:
  helm:
    releases:
    - name: my-api
      artifactOverrides:
        image: registry.gitlab.com/lodiee/my-api
      imageStrategy:
        helm: {}
      remoteChart: bitnami/node
      setValueTemplates:
        image.tag: '{{.DIGEST_HEX}}'
      setValues:
        image.repository: registry.gitlab.com/lodiee/my-api
When I run skaffold dev, everything works fine until I see an ImagePullBackOff error on the pod. Here is a partial output of the pod's events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 102s default-scheduler Successfully assigned default/my-api-node-55cf68d989-7vltw to kind-control-plane
Normal BackOff 21s (x5 over 99s) kubelet Back-off pulling image "docker.io/registry.gitlab.com/lodiee/my-api:60ef7d4a463e269de9e176afbdd2362fd50870e494ff7a8abf25ece16c0d100c"
Warning Failed 21s (x5 over 99s) kubelet Error: ImagePullBackOff
Normal Pulling 6s (x4 over 101s) kubelet Pulling image "docker.io/registry.gitlab.com/lodiee/my-api:60ef7d4a463e269de9e176afbdd2362fd50870e494ff7a8abf25ece16c0d100c"
Warning Failed 4s (x4 over 100s) kubelet Failed to pull image "docker.io/registry.gitlab.com/lodiee/my-api:60ef7d4a463e269de9e176afbdd2362fd50870e494ff7a8abf25ece16c0d100c": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/registry.gitlab.com/lodiee/my-api:60ef7d4a463e269de9e176afbdd2362fd50870e494ff7a8abf25ece16c0d100c": failed to resolve reference "docker.io/registry.gitlab.com/lodiee/my-api:60ef7d4a463e269de9e176afbdd2362fd50870e494ff7a8abf25ece16c0d100c": pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
As you can see from the output, Kubernetes adds a docker.io prefix in front of my image repository address, so it can't find the image at that address.
Is this normal behavior? Also, this is a local development setup and I use skaffold dev to launch it, so why does Kubernetes try to go to that address?
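My own (unconfirmed) guess is that the bitnami/node chart builds the image reference as <registry>/<repository>:<tag> with the registry defaulting to docker.io, so putting the full registry path into image.repository produces that docker.io/ prefix. If that is the case, splitting the registry out in the release's setValues might avoid it, roughly like this:
setValues:
  image.registry: registry.gitlab.com
  image.repository: lodiee/my-api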
Thank you.
I am trying to set up the RabbitMQ Operator and a RabbitMQ cluster on a bare-metal K8S cluster using this link.
The K8S cluster has 1 master and 1 worker node.
RabbitMQ cluster pod log:
[root@re-ctrl01 containers]# kubectl logs definition-server-0 -n rabbitmq-system
BOOT FAILED (Tailored output)
===========
ERROR: epmd error for host definition-server-0.definition-nodes.rabbitmq-system: nxdomain (non-existing domain)
11:51:13.733 [error] Supervisor rabbit_prelaunch_sup had child prelaunch started with rabbit_prelaunch:run_prelaunch_first_phase() at undefined exit with reason {epmd_error,"definition-server-0.definition-nodes.rabbitmq-system",nxdomain} in context start_error. Crash dump is being written to: erl_crash.dump...
[root@re-ctrl01 containers]# kubectl describe pod definition-server-0 -n rabbitmq-system
Name: definition-server-0
Namespace: rabbitmq-system
Priority: 0
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 44s default-scheduler Successfully assigned rabbitmq-system/definition-server-0 to re-ctrl01.local
Normal Pulled 43s kubelet Container image "rabbitmq:3.8.16-management" already present on machine
Normal Created 43s kubelet Created container setup-container
Normal Started 43s kubelet Started container setup-container
Normal Pulled 42s kubelet Container image "rabbitmq:3.8.16-management" already present on machine
Normal Created 42s kubelet Created container rabbitmq
Normal Started 42s kubelet Started container rabbitmq
Warning Unhealthy 4s (x3 over 24s) kubelet Readiness probe failed: dial tcp 10.244.0.xxx:5672: connect: connection refused
I added the following entries to the /etc/hosts file of the worker node, because I am NOT sure whether the entry has to be added on the master or the worker:
[root@re-worker01 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
127.0.0.1 re-worker01.local re-worker01 definition-server-0.definition-nodes.rabbitmq-system
I have been stuck on this issue for almost 2 days. I googled and found similar issues, but none of them resolved mine.
I see multiple issues in the pod logs and the describe output, and I am unable to find the root cause.
Where can I find the erl_crash.dump file on K8S?
Is this really a hostname-related issue?
10.244.0.xxx:5672: connect: connection refused - is this because of epmd or something else?
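To check whether it really is a hostname/DNS issue, the headless service name can be resolved from a throwaway pod inside the cluster (busybox is used here only as an example image):
kubectl run -it --rm dnstest --image=busybox:1.28 --restart=Never -- nslookup definition-nodes.rabbitmq-system.svc.cluster.local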
I managed to resolve the issue after spending a lot of time on it.
I added the host definition-server-0.definition-nodes.rabbitmq-system to the /etc/hosts file of the RabbitMQ cluster pod using hostAliases.
The YAML to add hostAliases is given below:
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: definition
  namespace: rabbitmq-system
spec:
  replicas: 1
  override:
    statefulSet:
      spec:
        template:
          spec:
            containers: []
            hostAliases:
            - ip: "127.0.0.1"
              hostnames:
              - "definition-server-0"
              - "definition-server-0.definition-nodes.rabbitmq-system"
I am new to Kubernetes and trying to learn, but I am stuck with an error that I cannot find an explanation for. I am running Pods and Deployments in my cluster, and the CLI shows them running, but after a while they keep crashing and the Pods have to restart.
I did some research before posting here, and the way I understood it, I should create a Deployment so that a ReplicaSet manages my Pods' lifecycle instead of deploying Pods independently. But as you can see, the Pods in the Deployment are crashing as well.
kubectl get pods
operator-5bf8c8484c-fcmnp 0/1 CrashLoopBackOff 9 34m
operator-5bf8c8484c-phptp 0/1 CrashLoopBackOff 9 34m
operator-5bf8c8484c-wh7hm 0/1 CrashLoopBackOff 9 34m
operator-pod 0/1 CrashLoopBackOff 12 49m
kubectl describe pods operator
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned default/operator-pod to workernode
Normal Created 30m (x5 over 34m) kubelet, workernode Created container operator-pod
Normal Started 30m (x5 over 34m) kubelet, workernode Started container operator-pod
Normal Pulled 29m (x6 over 34m) kubelet, workernode Container image "operator-api_1:java" already present on machine
Warning BackOff 4m5s (x101 over 33m) kubelet, workernode Back-off restarting failed container
deployment yaml file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: operator
  labels:
    app: java
spec:
  replicas: 3
  selector:
    matchLabels:
      app: call
  template:
    metadata:
      labels:
        app: call
    spec:
      containers:
      - name: operatorapi
        image: operator-api_1:java
        ports:
        - containerPort: 80
Can someone help me out? How can I debug this?
The reason is most probably that the process running in the container finished its task and terminated after a while. The kubelet then restarts the pod.
To solve this, check the process running in the container and try to keep it alive. You can wrap the process in a loop, or override the container's command in the deployment.yaml, as shown below.
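For example, a minimal sketch (reusing the container name and image from your Deployment, purely for illustration) that keeps the container alive with a loop would be:
    spec:
      containers:
      - name: operatorapi
        image: operator-api_1:java
        command: ["sh", "-c"]
        args: ["while true; do sleep 3600; done"]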
Here is a reference to help you understand and debug the reason for pod failure:
https://kubernetes.io/docs/tasks/debug-application-cluster/determine-reason-pod-failure/
There are several ways to debug such a scenario, and I recommend viewing the Kubernetes documentation for best practices. I typically have success with the following two approaches:
Logs: You can view the logs for the application using the command below:
kubectl logs -l app=java
If you have multiple containers within that pod, you can filter it down:
kubectl logs -l app=java -c operatorapi
Events: You can get a lot of information from events, as shown below (sorted by timestamp). Keep in mind that there can be a lot of noise in events, depending on the number of apps and services you have, so you may need to filter further:
kubectl get events --sort-by='.metadata.creationTimestamp'
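To cut that noise down to a single pod, the events can also be filtered by the object they refer to, for example:
kubectl get events --field-selector involvedObject.name=operator-pod --sort-by='.metadata.creationTimestamp'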
Feel free to share the output from those two and I can help you debug further.
I'm trying to create a pod using my local Docker image, as follows.
1. First I run this command in the terminal:
eval $(minikube docker-env)
2. I created a Docker image:
sudo docker image build -t my-first-image:3.0.0 .
3. I created the pod.yml shown below and ran this command:
kubectl create -f pod.yml
4. Then I tried to run this command:
kubectl get pods
but it shows the following error:
NAME READY STATUS RESTARTS AGE
multiplication-6b6d99554-d62kk 0/1 CrashLoopBackOff 9 22m
multiplication2019-5b4555bcf4-nsgkm 0/1 CrashLoopBackOff 8 17m
my-first-pod 0/1 CrashLoopBackOff 4 2m51
5. I check the pod's events:
kubectl describe pod my-first-pod
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 6m22s default-scheduler Successfully assigned default/my-first-pod to minikube
Normal Pulled 5m20s (x4 over 6m17s) kubelet, minikube Successfully pulled image "docker77nira/myfirstimage:latest"
Normal Created 5m20s (x4 over 6m17s) kubelet, minikube Created container
Normal Started 5m20s (x4 over 6m17s) kubelet, minikube Started container
Normal Pulling 4m39s (x5 over 6m21s) kubelet, minikube pulling image "docker77nira/myfirstimage:latest"
Warning BackOff 71s (x26 over 6m12s) kubelet, minikube Back-off restarting failed container
Dockerfile
FROM node:carbon
WORKDIR /app
COPY . .
CMD [ "node", "index.js" ]
pods.yml
kind: Pod
apiVersion: v1
metadata:
  name: my-first-pod
spec:
  containers:
  - name: my-first-container
    image: my-first-image:3.0.0
index.js
var http = require('http');
var server = http.createServer(function(request, response) {
  response.statusCode = 200;
  response.setHeader('Content-Type', 'text/plain');
  response.end('Welcome to the Golden Guide to Kubernetes
    Application Development!');
});
server.listen(3000, function() {
  console.log('Server running on port 3000');
});
Try checking the logs with the command kubectl logs -f my-first-pod
I succeeded in running your image by performing these steps:
docker build -t foo .
then check whether the container works: docker run -it foo
/app/index.js:5
response.end('Welcome to the Golden Guide to Kubernetes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SyntaxError: Invalid or unexpected token
at createScript (vm.js:80:10)
at Object.runInThisContext (vm.js:139:10)
at Module._compile (module.js:617:28)
at Object.Module._extensions..js (module.js:664:10)
at Module.load (module.js:566:32)
at tryModuleLoad (module.js:506:12)
at Function.Module._load (module.js:498:3)
at Function.Module.runMain (module.js:694:10)
at startup (bootstrap_node.js:204:16)
at bootstrap_node.js:625:3
I'm not sure if this was the outcome you wanted to see; the container itself runs, but then fails with that error. In Kubernetes, however, it gets into ErrImagePull.
Then, after editing your Pod.yaml (inspired by @Harsh Manvar), it works fine with the manifest below. So the script exiting after the command completed was only part of the problem.
apiVersion: v1
kind: Pod
metadata:
  name: hello-pod
spec:
  restartPolicy: Never
  containers:
  - name: hello
    image: "foo"
    imagePullPolicy: Never
    command: [ "sleep" ]
    args: [ "infinity" ]
This is minikube, so you can reuse the images, but if you had more nodes this might not work at all. You can find a good explanation about using local Docker images with Kubernetes here.
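As a side note, newer minikube versions (1.18+, as far as I know) can also load a locally built image into the cluster without switching the Docker environment:
minikube image load my-first-image:3.0.0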
kind: Pod
apiVersion: v1
metadata:
  name: my-first-pod
spec:
  containers:
  - name: my-first-container
    image: my-first-image:3.0.0
    command: [ "sleep" ]
    args: [ "infinity" ]
I think your pod is being terminated after the script inside index.js finishes executing.
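To confirm why it exits, the logs of the previous (crashed) container instance are usually the most direct evidence:
kubectl logs my-first-pod --previous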
I have a Raspberry Pi cluster (one master, 3 nodes).
My base image is: Raspbian Stretch Lite.
I already have a basic Kubernetes setup where the master can see all of its nodes (kubectl get nodes) and they're all running.
I used the Weave network plugin for network communication.
When everything was set up, I tried to run an nginx pod (first with some replicas, but now just 1 pod) on my cluster as follows:
kubectl run my-nginx --image=nginx
But somehow the pod gets stuck in the status "ContainerCreating". When I run docker images I can't see the nginx image being pulled, and normally an nginx image is not that large, so it should have been pulled by now (15 minutes).
kubectl describe pods gives the error that the pod sandbox failed to create and Kubernetes will re-create it.
I searched everything about this issue and tried the solutions on Stack Overflow (rebooting to restart the cluster, reading the describe pods output, trying a new network plugin - flannel), but I can't see what the actual problem is.
I did the exact same thing in VirtualBox (just Ubuntu, not ARM) and everything worked.
At first I thought it was a permission issue because I run everything as a normal user, but in the VM I did the same thing and nothing changed.
Then I checked kubectl get pods --all-namespaces to verify that the pods for the Weave network and kube-dns are running, and nothing was wrong there either.
Is this a firewall issue on the Raspberry Pi?
Is the Weave network plugin not compatible with ARM devices (even though the Kubernetes website says it is)?
I'm guessing there is an API/network problem and that's why I can't get my pod running on a node.
[EDIT]
Log files
kubectl describe podName
Name:           my-nginx-9d5677d94-g44l6
Namespace:      default
Node:           kubenode1/10.1.88.22
Start Time:     Tue, 06 Mar 2018 08:24:13 +0000
Labels:         pod-template-hash=581233850
                run=my-nginx
Annotations:    <none>
Status:         Pending
IP:
Controlled By:  ReplicaSet/my-nginx-9d5677d94
Containers:
  my-nginx:
    Container ID:
    Image:          nginx
    Image ID:
    Port:           80/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-phdv5 (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  default-token-phdv5:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-phdv5
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age  From                Message
  ----     ------                  ---- ----                -------
  Normal   Scheduled               5m   default-scheduler   Successfully assigned my-nginx-9d5677d94-g44l6 to kubenode1
  Normal   SuccessfulMountVolume   5m   kubelet, kubenode1  MountVolume.SetUp succeeded for volume "default-token-phdv5"
  Warning  FailedCreatePodSandBox  1m   kubelet, kubenode1  Failed create pod sandbox.
  Normal   SandboxChanged          1m   kubelet, kubenode1  Pod sandbox changed, it will be killed and re-created.
kubectl logs podName
Error from server (BadRequest): container "my-nginx" in pod "my-nginx-9d5677d94-g44l6" is waiting to start: ContainerCreating
journalctl -u kubelet gives this error
Mar 12 13:42:45 kubeMaster kubelet[16379]: W0312 13:42:45.824314 16379 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Mar 12 13:42:45 kubeMaster kubelet[16379]: E0312 13:42:45.824816 16379 kubelet.go:2104] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
The problem seems to be with my network plugin. In my /etc/systemd/system/kubelet.service.d/10-kubeadm.conf the flags for the network plugin are present:
Environment="KUBELET_NETWORK_ARGS=--cni-bin-dir=/etc/cni/net.d --network-plugin=cni"
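Given that kubelet message, a quick check would be whether Weave actually wrote a CNI config on the node and whether its pods are healthy, for example:
ls /etc/cni/net.d/
kubectl get pods -n kube-system -o wide | grep weave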
Thank you all for responding to my question.
I have solved my problem now. For anyone who comes to this question in the future, the solution was as follows.
I cloned my Raspberry Pi images because I wanted a basicConfig.img for when I needed to add a new node to my cluster or when one goes down.
Weave (the network plugin I used) got confused because the OS on every node and the master had the same machine-id. When I deleted the machine-id and created a new one (and rebooted the nodes), the error was fixed.
The commands to do this were:
sudo rm /etc/machine-id
sudo rm /var/lib/dbus/machine-id
sudo dbus-uuidgen --ensure=/etc/machine-id
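After the reboot, each node should report a different ID, which you can verify with:
cat /etc/machine-id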
Once again my patience was tested, because my Kubernetes setup was fine and my Raspberry Pi OS was fine. I found this with the help of someone in the Kubernetes community, which again shows how important and great our IT community is. To the people of the future who come to this question: I hope this solution fixes your error and saves you the time you would spend searching for such a small thing.
You can see if it's network related by finding the node trying to pull the image:
kubectl describe pod <name> -n <namespace>
SSH to the node and run docker pull nginx on it. If it has trouble pulling the image manually, then the problem might be network related.