I deployed a Kubernetes cluster with two EC2 instances.
$ kubectl get nodes
NAME              STATUS   ROLES                  AGE   VERSION
ip-172-31-21-12   Ready    control-plane,master   36h   v1.20.2
ip-172-31-21-62   Ready    <none>                 12h   v1.20.2
There is a pod in an Error state in the cluster:
$ kubectl get pods
NAME           READY   STATUS   RESTARTS   AGE
es-cluster-0   0/1     Error    141        12h
When I try to fetch the logs of this pod, I get this error:
$ kubectl logs pods/es-cluster-0
Error from server: Get "https://172.31.21.62:10250/containerLogs/default/es-cluster-0/elasticsearch": dial tcp 172.31.21.62:10250: i/o timeout
It seems to be reporting a problem reaching the node at 172.31.21.62:10250. What does this error mean, and how can I fix it?
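For reference, this error does not come from the Elasticsearch container itself: kubectl logs goes through the API server on the master, which then opens a connection to the kubelet on the worker node (port 10250) to fetch the container logs, and that TCP connection is timing out. A quick reachability check you could run from the master node, assuming nc is available (IP and port are taken from the error message):
$ nc -vz 172.31.21.62 10250
If this hangs or times out as well, the problem is at the network level (firewall or security group) rather than in the pod.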
The following kubectl describe pods/es-cluster-0 output is from the master node:
$ kubectl describe pods/es-cluster-0
Name: es-cluster-0
Namespace: default
Priority: 0
Node: ip-172-31-21-62/172.31.21.62
Start Time: Wed, 17 Feb 2021 10:08:04 +0000
Labels: controller-revision-hash=es-cluster-55b6944c56
name=elasticsearch
statefulset.kubernetes.io/pod-name=es-cluster-0
Annotations: <none>
Status: Running
IP: 10.32.0.2
IPs:
IP: 10.32.0.2
Controlled By: StatefulSet/es-cluster
Containers:
elasticsearch:
Container ID: docker://838e02ff6fba31234656e68b804f49e86ec7fea0053e5d1062abdd9d24b728b9
Image: elasticsearch:7.10.1
Image ID: docker-pullable://elasticsearch@sha256:7cd88158f6ac75d43b447fdd98c4eb69483fa7bf1be5616a85fe556262dc864a
Ports: 9200/TCP, 9300/TCP
Host Ports: 0/TCP, 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 78
Started: Thu, 18 Feb 2021 01:40:23 +0000
Finished: Thu, 18 Feb 2021 01:40:39 +0000
Ready: False
Restart Count: 178
Environment: <none>
Mounts:
/usr/share/elasticsearch/config/elasticsearch.yml from elasticsearch-config (rw,path="elasticsearch.yml")
/var/run/secrets/kubernetes.io/serviceaccount from default-token-ntgwr (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
elasticsearch-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: es-config
Optional: false
default-token-ntgwr:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-ntgwr
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning BackOff 42s (x4082 over 15h) kubelet Back-off restarting failed container
The following netstat output is from the master node:
$ netstat -tunlp| grep 10250
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp6 0 0 :::10250 :::* LISTEN -
The following kubectl describe output is from the worker node:
$ kubectl describe pods/es-cluster-0
Name: es-cluster-0
Namespace: default
Priority: 0
Node: ip-172-31-21-62/172.31.21.62
Start Time: Wed, 17 Feb 2021 10:08:04 +0000
Labels: controller-revision-hash=es-cluster-55b6944c56
name=elasticsearch
statefulset.kubernetes.io/pod-name=es-cluster-0
Annotations: <none>
Status: Running
IP: 10.32.0.2
IPs:
IP: 10.32.0.2
Controlled By: StatefulSet/es-cluster
Containers:
elasticsearch:
Container ID: docker://0efa71da7b70725b248cfe8dfae5560c2fc95aaf40e3a3a899970e8579e7ac27
Image: elasticsearch:7.10.1
Image ID: docker-pullable://elasticsearch@sha256:7cd88158f6ac75d43b447fdd98c4eb69483fa7bf1be5616a85fe556262dc864a
Ports: 9200/TCP, 9300/TCP
Host Ports: 0/TCP, 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 78
Started: Thu, 18 Feb 2021 05:31:55 +0000
Finished: Thu, 18 Feb 2021 05:32:12 +0000
Ready: False
Restart Count: 221
Environment: <none>
Mounts:
/usr/share/elasticsearch/config/elasticsearch.yml from elasticsearch-config (rw,path="elasticsearch.yml")
/var/run/secrets/kubernetes.io/serviceaccount from default-token-ntgwr (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
elasticsearch-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: es-config
Optional: false
default-token-ntgwr:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-ntgwr
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulled 55m (x211 over 19h) kubelet Container image "elasticsearch:7.10.1" already present on machine
Warning BackOff 54s (x5087 over 19h) kubelet Back-off restarting failed container
The following netstat output is from the worker node:
$ netstat -tunlp| grep 10250
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp6 0 0 :::10250 :::* LISTEN -
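Both netstat outputs show the kubelet listening on 10250 locally, so the timeout points at the network path between the two instances; on EC2 the usual culprit is the worker's security group not allowing inbound TCP 10250 from the control-plane node. A hedged sketch of opening that port with the AWS CLI, where the group IDs are placeholders for your actual worker and master security groups:
$ aws ec2 authorize-security-group-ingress \
    --group-id sg-WORKER-PLACEHOLDER \
    --protocol tcp \
    --port 10250 \
    --source-group sg-MASTER-PLACEHOLDER
Once the port is reachable, kubectl logs es-cluster-0 should return the Elasticsearch output, and the crash itself (exit code 78 in the describe output) can be investigated from there.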
Related
I am trying to create a pod in Kubernetes inside minikube, but I am getting the message exec /usr/bin/mvn: exec format error.
Background: I am using YAKS from the Citrus framework, which is used for BDD testing and supports Kubernetes.
I am creating this pod with yaks run helloworld.feature, which creates a pod that runs the tests we define in the .feature file. YAKS is based on Cucumber.
kubectl describe pod test
Name: test-test-cf93r5qsoe02b4hj9o2g-jhqxf
Namespace: default
Priority: 0
Service Account: yaks-viewer
Node: minikube/192.168.49.2
Start Time: Thu, 26 Jan 2023 08:45:11 +0000
Labels: app=yaks
controller-uid=ff467e84-6ed5-47d7-9b6a-bd10128c589e
job-name=test-test-cf93r5qsoe02b4hj9o2g
yaks.citrusframework.org/test=test
yaks.citrusframework.org/test-id=cf93r5qsoe02b4hj9o2g
Annotations: <none>
Status: Failed
IP: 172.17.0.5
IPs:
IP: 172.17.0.5
Controlled By: Job/test-test-cf93r5qsoe02b4hj9o2g
Containers:
test:
Container ID: docker://e9017e0e5727d736ddbb6057e804f181d558492db18aa914d2cbc0eaeb3d9ee3
Image: docker.io/citrusframework/yaks:0.12.0
Image ID: docker-pullable://citrusframework/yaks@sha256:3504e26ae47bf5a613d38f641f4eb1b97e0bf72678e67ef9d13f41b74b31a70c
Port: <none>
Host Port: <none>
Command:
mvn
--no-snapshot-updates
-B
-q
--no-transfer-progress
-f
/deployments/data/yaks-runtime-maven
-s
/deployments/artifacts/settings.xml
verify
-Dit.test=org.citrusframework.yaks.feature.Yaks_IT
-Dmaven.repo.local=/deployments/artifacts/m2
State: Terminated
Reason: Error
Message: exec /usr/bin/mvn: exec format error
Exit Code: 1
Started: Thu, 26 Jan 2023 08:45:12 +0000
Finished: Thu, 26 Jan 2023 08:45:12 +0000
Ready: False
Restart Count: 0
Environment:
YAKS_TERMINATION_LOG: /dev/termination-log
YAKS_TESTS_PATH: /etc/yaks/tests
YAKS_SECRETS_PATH: /etc/yaks/secrets
YAKS_NAMESPACE: default
YAKS_CLUSTER_TYPE: KUBERNETES
YAKS_TEST_NAME: test
YAKS_TEST_ID: cf93r5qsoe02b4hj9o2g
Mounts:
/etc/yaks/secrets from secrets (rw)
/etc/yaks/tests from tests (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hn7kv (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
tests:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: test-test
Optional: false
secrets:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
kube-api-access-hn7kv:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events: <none>
Here are the kubectl get all results:
NAME                                              READY   STATUS      RESTARTS      AGE
pod/hello-minikube                                1/1     Running     3 (20h ago)   20h
pod/nginx                                         0/1     Completed   0             21h
pod/test-hello-world-cf94ttisoe02b4hj9o30-lcdmb   0/1     Error       0             56m
pod/test-test-cf93r5qsoe02b4hj9o2g-jhqxf          0/1     Error       0             131m
pod/yaks-operator-65b956c564-h5tsx                1/1     Running     0             131m

NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   23h

NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/yaks-operator   1/1     1            1           131m

NAME                                       DESIRED   CURRENT   READY   AGE
replicaset.apps/yaks-operator-65b956c564   1         1         1       131m

NAME                                              COMPLETIONS   DURATION   AGE
job.batch/test-hello-world-cf94ttisoe02b4hj9o30   0/1           56m        56m
job.batch/test-test-cf93r5qsoe02b4hj9o2g          0/1           131m       131m
I tried installing mvn since the error mentions it, but that should not be required to run YAKS; it should also work without mvn and handle this on its own.
Additionally, I found the following, but I don't know whether it helps:
kubectl logs deployment.apps/yaks-operator
{"level":"error","ts":1674732026.5891097,"logger":"controller.test-controller","msg":"Reconciler error","name":"test","namespace":"default","error":"invalid character 'e' looking for beginning of value","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/Users/cdeppisc/Projects/Go/pkg/mod/sigs.k8s.io/controller-runtime#v0.11.2/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/Users/cdeppisc/Projects/Go/pkg/mod/sigs.k8s.io/controller-runtime#v0.11.2/pkg/internal/controller/controller.go:227"}
I am getting a 0/1 READY state when I run kubectl get pod -n mongodb.
To install bitnami/mongodb I ran:
helm repo add bitnami https://charts.bitnami.com/bitnami
kubectl create namespace mongodb
helm install mongodb -n mongodb bitnami/mongodb --values ./factory/yaml/mongodb/values.yaml
When I run kubectl get pod -n mongodb:
NAME                       READY   STATUS             RESTARTS       AGE
mongodb-79bf77f485-8bdm6   0/1     CrashLoopBackOff   69 (55s ago)   4h17m
I want the READY state to be 1/1 and the status to be Running.
Then I ran kubectl describe pod -n mongodb to view the details, and I got:
Name: mongodb-79bf77f485-8bdm6
Namespace: mongodb
Priority: 0
Node: ip-192-168-58-58.ec2.internal/192.168.58.58
Start Time: Tue, 19 Jul 2022 07:31:32 +0000
Labels: app.kubernetes.io/component=mongodb
app.kubernetes.io/instance=mongodb
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=mongodb
helm.sh/chart=mongodb-12.1.26
pod-template-hash=79bf77f485
Annotations: kubernetes.io/psp: eks.privileged
Status: Running
IP: 192.168.47.246
IPs:
IP: 192.168.47.246
Controlled By: ReplicaSet/mongodb-79bf77f485
Containers:
mongodb:
Container ID: docker://11999c2e13e382ceb0a8ba2ea8255ed3d4dc07ca18659ee5a1fe1a8d071b10c0
Image: docker.io/bitnami/mongodb:4.4.2-debian-10-r0
Image ID: docker-pullable://bitnami/mongodb@sha256:add0ef947bc26d25b12ee1b01a914081e08b5e9242d2f9e34e2881b5583ce102
Port: 27017/TCP
Host Port: 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Tue, 19 Jul 2022 11:46:12 +0000
Finished: Tue, 19 Jul 2022 11:47:32 +0000
Ready: False
Restart Count: 69
Liveness: exec [/bitnami/scripts/ping-mongodb.sh] delay=30s timeout=5s period=10s #success=1 #failure=6
Readiness: exec [/bitnami/scripts/readiness-probe.sh] delay=5s timeout=5s period=10s #success=1 #failure=6
Environment:
BITNAMI_DEBUG: false
MONGODB_ROOT_USER: root
MONGODB_ROOT_PASSWORD: <set to the key 'mongodb-root-password' in secret 'mongodb'> Optional: false
ALLOW_EMPTY_PASSWORD: no
MONGODB_SYSTEM_LOG_VERBOSITY: 0
MONGODB_DISABLE_SYSTEM_LOG: no
MONGODB_DISABLE_JAVASCRIPT: no
MONGODB_ENABLE_JOURNAL: yes
MONGODB_PORT_NUMBER: 27017
MONGODB_ENABLE_IPV6: no
MONGODB_ENABLE_DIRECTORY_PER_DB: no
Mounts:
/bitnami/mongodb from datadir (rw)
/bitnami/scripts from common-scripts (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-nsr69 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
common-scripts:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: mongodb-common-scripts
Optional: false
datadir:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: mongodb
ReadOnly: false
kube-api-access-nsr69:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 30m (x530 over 4h19m) kubelet Readiness probe failed: /bitnami/scripts/readiness-probe.sh: line 9: mongosh: command not found
Warning BackOff 9m59s (x760 over 4h10m) kubelet Back-off restarting failed container
Warning Unhealthy 5m6s (x415 over 4h19m) kubelet Liveness probe failed: /bitnami/scripts/ping-mongodb.sh: line 2: mongosh: command not found.
I don't understand the log or where the problem is coming from.
How can I get the READY state to 1/1 and the status to Running?
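The probe messages (mongosh: command not found) hint at a mismatch between the chart version and the pinned image: chart mongodb-12.1.26 ships probe scripts that call mongosh, while the image bitnami/mongodb:4.4.2-debian-10-r0 predates mongosh being bundled. A hedged way to check which image tag this chart version expects by default:
helm show values bitnami/mongodb --version 12.1.26 | grep -A 3 'image:'
If ./factory/yaml/mongodb/values.yaml pins image.tag to 4.4.2, removing that override (or moving to an image that contains mongosh) should allow the readiness and liveness probes to pass.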
I get the log error below for a pod after updating the Kubernetes orchestrator, clusters, and nodes to v1.21.2 (before updating they were v1.20.7). I found a reference saying that selfLink is completely removed as of v1.21. Why am I getting this error, and how can I resolve it?
Error log from kubectl logs (podname):
...
2021-08-10T03:07:19.535Z INFO setup starting manager
2021-08-10T03:07:19.536Z INFO controller-runtime.manager starting metrics server {"path": "/metrics"}
E0810 03:07:19.550636 1 event.go:247] Could not construct reference to: '&v1.ConfigMap{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"controller-leader-election-helper", GenerateName:"", Namespace:"kubestone-system", SelfLink:"", UID:"b01651ed-7d54-4815-a047-57b16d26cfdf", ResourceVersion:"65956", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63764161639, loc:(*time.Location)(0x21639e0)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string{"control-plane.alpha.kubernetes.io/leader":"{\"holderIdentity\":\"kubestone-controller-manager-f467b7c47-cv7ws_1305bc36-f988-11eb-81fc-a20dfb9758a2\",\"leaseDurationSeconds\":15,\"acquireTime\":\"2021-08-10T03:07:19Z\",\"renewTime\":\"2021-08-10T03:07:19Z\",\"leaderTransitions\":0}"}, OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry{v1.ManagedFieldsEntry{Manager:"manager", Operation:"Update", APIVersion:"v1", Time:(*v1.Time)(0xc0000956a0), Fields:(*v1.Fields)(nil)}}}, Data:map[string]string(nil), BinaryData:map[string][]uint8(nil)}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Normal' 'LeaderElection' 'kubestone-controller-manager-f467b7c47-cv7ws_1305bc36-f988-11eb-81fc-a20dfb9758a2 became leader'
2021-08-10T03:07:21.636Z INFO controller-runtime.controller Starting Controller {"controller": "kafkabench"}
...
kubectl get nodes output showing the Kubernetes version (the node the pod is scheduled on is aks-default-41152893-vmss000000):
PS C:\Users\user> kubectl get nodes -A
NAME                              STATUS   ROLES   AGE     VERSION
aks-default-41152893-vmss000000   Ready    agent   5h32m   v1.21.2
aks-default-41152893-vmss000001   Ready    agent   5h29m   v1.21.2
aksnpwi000000                     Ready    agent   5h32m   v1.21.2
aksnpwi000001                     Ready    agent   5h26m   v1.21.2
aksnpwi000002                     Ready    agent   5h19m   v1.21.2
kubectl describe pods (pod name: kubestone-controller-manager-f467b7c47-cv7ws)
PS C:\Users\user> kubectl describe pods kubestone-controller-manager-f467b7c47-cv7ws -n kubestone-system
Name: kubestone-controller-manager-f467b7c47-cv7ws
Namespace: kubestone-system
Priority: 0
Node: aks-default-41152893-vmss000000/10.240.0.4
Start Time: Mon, 09 Aug 2021 23:07:16 -0400
Labels: control-plane=controller-manager
pod-template-hash=f467b7c47
Annotations: <none>
Status: Running
IP: 10.240.0.21
IPs:
IP: 10.240.0.21
Controlled By: ReplicaSet/kubestone-controller-manager-f467b7c47
Containers:
manager:
Container ID: containerd://01594df678a2c1d7163c913eff33881edf02e39633b1a4b51dcf5fb769d0bc1e
Image: user2/imagename
Image ID: docker.io/user2/imagename@sha256:aa049f135931192630ceda014d7a24306442582dbeeaa36ede48e6599b6135e1
Port: <none>
Host Port: <none>
Command:
/manager
Args:
--enable-leader-election
State: Running
Started: Mon, 09 Aug 2021 23:07:18 -0400
Ready: True
Restart Count: 0
Limits:
cpu: 100m
memory: 30Mi
Requests:
cpu: 100m
memory: 20Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-jvjjh (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-jvjjh:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 23m default-scheduler Successfully assigned kubestone-system/kubestone-controller-manager-f467b7c47-cv7ws to aks-default-41152893-vmss000000
Normal Pulling 23m kubelet Pulling image "user2/imagename"
Normal Pulled 23m kubelet Successfully pulled image "user2/imagename" in 354.899039ms
Normal Created 23m kubelet Created container manager
Normal Started 23m kubelet Started container manager
Kubestone has had no releases since 2019; it needs to upgrade its copy of the Kubernetes Go client. That said, this appears to only impact the event recorder system, so it is probably not a huge deal.
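Since only event recording fails, the controller itself should keep working; a minimal sanity check, assuming the deployment is named kubestone-controller-manager as the ReplicaSet name suggests:
PS C:\Users\user> kubectl get deployment -n kubestone-system
PS C:\Users\user> kubectl logs deployment/kubestone-controller-manager -n kubestone-system --tail=50
The "Starting Controller" line that follows the event error in the log above indicates the manager carried on despite the failed event.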
I am running a CronJob in Kubernetes. The CronJob starts but never exits; the status of the pod is always Running.
Below are the logs:
kubectl get pods
cronjob-1623253800-xnwwx 1/1 Running 0 13h
When I describe the Job, I notice the following:
kubectl describe job cronjob-1623300120
Name: cronjob-1623300120
Namespace: cronjob
Selector: xxxxx
Labels: xxxxx
Annotations: <none>
Controlled By: CronJob/cronjob
Parallelism: 1
Completions: 1
Start Time: Thu, 9 Jun 2021 10:12:03 +0530
Pods Statuses: 1 Running / 0 Succeeded / 0 Failed
Pod Template:
Labels: app=cronjob
controller-xxxx
job-name=cronjob-1623300120
Containers:
plannercronjob:
Image: xxxxxxxxxxxxx
Port: <none>
Host Port: <none>
Mounts: <none>
Volumes: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 13h job-controller Created pod: cronjob-1623300120
I noticed Pods Statuses: 1 Running / 0 Succeeded / 0 Failed. Does this mean that the job is marked Succeeded when the code returns zero and Failed otherwise? Is that correct?
When I enter the pod using the exec command:
kubectl exec --stdin --tty cronjob-1623253800-xnwwx -n cronjob -- /bin/bash
root@cronjob-1623253800-xnwwx:/# ps ax | grep python
1 ? Ssl 0:01 python -m sfit.src.app
18 pts/0 S+ 0:00 grep python
I found that the Python process is still running. Is this a deadlock in the code or something else?
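One way to narrow this down is to check whether the container ever logged the same Azure connection error inside the cluster that it logs locally; if the traceback never appears, the process is more likely stuck waiting on the network call than crashed:
kubectl logs cronjob-1623253800-xnwwx -n cronjob --tail=100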
Pod describe output:
Name: cronjob-1623302220-xnwwx
Namespace: default
Priority: 0
Node: aks-agentpool-xxxxvmss000000/10.240.0.4
Start Time: Thu, 9 Jun 2021 10:47:02 +0530
Labels: app=cronjob
controller-uid=xxxxxx
job-name=cronjob-1623302220
Annotations: <none>
Status: Running
IP: 10.244.1.30
IPs:
IP: 10.244.1.30
Controlled By: Job/cronjob-1623302220
Containers:
plannercronjob:
Container ID: docker://xxxxxxxxxxxxxxxx
Image: xxxxxxxxxxx
Image ID: docker-xxxx
Port: <none>
Host Port: <none>
State: Running
Started: Thu, 9 Jun 2021 10:47:06 +0530
Ready: True
Restart Count: 0
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-97xzv (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
default-token-97xzv:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-97xzv
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 13h default-scheduler Successfully assigned cronjob/cronjob-1623302220-xnwwx to aks-agentpool-xxx-vmss000000
Normal Pulling 13h kubelet, aks-agentpool-xxx-vmss000000 Pulling image "xxxx.azurecr.io/xxx:1.1.1"
Normal Pulled 13h kubelet, aks-agentpool-xxx-vmss000000 Successfully pulled image "xxx.azurecr.io/xx:1.1.1"
Normal Created 13h kubelet, aks-agentpool-xxx-vmss000000 Created container cronjob
Normal Started 13h kubelet, aks-agentpool-xxx-vmss000000 Started container cronjob
@KrishnaChaurasia: I ran the Docker image on my system. There is an error in my Python code and the container exits with that error, but in Kubernetes it does not exit and does not stop:
docker run xxxxx/cronjob:1
File "/usr/local/lib/python3.8/site-packages/azure/core/pipeline/transport/_requests_basic.py", line 261, in send
raise error
azure.core.exceptions.ServiceRequestError: <urllib3.connection.HTTPSConnection object at 0x7f113f6480a0>: Failed to establish a new connection: [Errno -2] Name or service not known
echo $?
1
If you are seeing that your pod is always running and never completes, try adding startingDeadlineSeconds.
https://medium.com/@hengfeng/what-does-kubernetes-cronjobs-startingdeadlineseconds-exactly-mean-cc2117f9795f
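For reference, startingDeadlineSeconds lives at the CronJob spec level and only controls how late a scheduled run may start; if the concern is a run that starts but never finishes, activeDeadlineSeconds on the job template is what terminates it. A minimal sketch of where both fields go (the schedule, image, and deadline values are placeholders):
apiVersion: batch/v1            # batch/v1beta1 on clusters older than 1.21
kind: CronJob
metadata:
  name: cronjob
spec:
  schedule: "*/5 * * * *"
  startingDeadlineSeconds: 120        # skip the run if it could not start within 2 minutes
  jobTemplate:
    spec:
      activeDeadlineSeconds: 3600     # kill a run that is still going after an hour
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: plannercronjob
            image: example.azurecr.io/plannercronjob:1.1.1   # placeholder image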
I am running an Argo workflow and getting the following error in the pod's log:
error: a container name must be specified for pod <name>, choose one of: [wait main]
This error only happens some of the time and only with some of my templates, but when it does, it is a template that is run later in the workflow (i.e. not the first template run). I have not yet been able to identify the parameters that will run successfully, so I will be happy with tips for debugging. I have pasted the output of describe below.
Based on searches, I think the solution is simply that I need to attach "-c main" somewhere, but I do not know where and cannot find information in the Argo docs.
Describe:
Name: message-passing-1-q8jgn-607612432
Namespace: argo
Priority: 0
Node: REDACTED
Start Time: Wed, 17 Mar 2021 17:16:37 +0000
Labels: workflows.argoproj.io/completed=false
workflows.argoproj.io/workflow=message-passing-1-q8jgn
Annotations: cni.projectcalico.org/podIP: 192.168.40.140/32
cni.projectcalico.org/podIPs: 192.168.40.140/32
workflows.argoproj.io/node-name: message-passing-1-q8jgn.e
workflows.argoproj.io/outputs: {"exitCode":"6"}
workflows.argoproj.io/template:
{"name":"egress","arguments":{},"inputs":{...
Status: Failed
IP: 192.168.40.140
IPs:
IP: 192.168.40.140
Controlled By: Workflow/message-passing-1-q8jgn
Containers:
wait:
Container ID: docker://26d6c30440777add2af7ef3a55474d9ff36b8c562d7aecfb911ce62911e5fda3
Image: argoproj/argoexec:v2.12.10
Image ID: docker-pullable://argoproj/argoexec@sha256:6edb85a84d3e54881404d1113256a70fcc456ad49c6d168ab9dfc35e4d316a60
Port: <none>
Host Port: <none>
Command:
argoexec
wait
State: Terminated
Reason: Completed
Exit Code: 0
Started: Wed, 17 Mar 2021 17:16:43 +0000
Finished: Wed, 17 Mar 2021 17:17:03 +0000
Ready: False
Restart Count: 0
Environment:
ARGO_POD_NAME: message-passing-1-q8jgn-607612432 (v1:metadata.name)
Mounts:
/argo/podmetadata from podmetadata (rw)
/mainctrfs/mnt/logs from log-p1-vol (rw)
/mainctrfs/mnt/processed from processed-p1-vol (rw)
/var/run/docker.sock from docker-sock (ro)
/var/run/secrets/kubernetes.io/serviceaccount from argo-token-v2w56 (ro)
main:
Container ID: docker://67e6d6d3717ab1080f14cac6655c90d990f95525edba639a2d2c7b3170a7576e
Image: REDACTED
Image ID: REDACTED
Port: <none>
Host Port: <none>
Command:
/bin/bash
-c
Args:
State: Terminated
Reason: Error
Exit Code: 6
Started: Wed, 17 Mar 2021 17:16:43 +0000
Finished: Wed, 17 Mar 2021 17:17:03 +0000
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/mnt/logs/ from log-p1-vol (rw)
/mnt/processed/ from processed-p1-vol (rw)
/var/run/secrets/kubernetes.io/serviceaccount from argo-token-v2w56 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
podmetadata:
Type: DownwardAPI (a volume populated by information about the pod)
Items:
metadata.annotations -> annotations
docker-sock:
Type: HostPath (bare host directory volume)
Path: /var/run/docker.sock
HostPathType: Socket
processed-p1-vol:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: message-passing-1-q8jgn-processed-p1-vol
ReadOnly: false
log-p1-vol:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: message-passing-1-q8jgn-log-p1-vol
ReadOnly: false
argo-token-v2w56:
Type: Secret (a volume populated by a Secret)
SecretName: argo-token-v2w56
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 7m35s default-scheduler Successfully assigned argo/message-passing-1-q8jgn-607612432 to ack1
Normal Pulled 7m31s kubelet Container image "argoproj/argoexec:v2.12.10" already present on machine
Normal Created 7m31s kubelet Created container wait
Normal Started 7m30s kubelet Started container wait
Normal Pulled 7m30s kubelet Container image already present on machine
Normal Created 7m30s kubelet Created container main
Normal Started 7m30s kubelet Started container main
This happens when you try to see the logs of a pod that has multiple containers without specifying which container's logs you want. The typical command to see logs is:
kubectl logs <podname>
But your pod has two containers, one named "wait" and one named "main". You can see the logs from the container named "main" with:
kubectl logs <podname> -c main
or you can see the logs from all containers with
kubectl logs <podname> --all-containers
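If you are not sure which container names a pod has, you can list them directly; for example, with the pod from the describe output above:
kubectl get pod message-passing-1-q8jgn-607612432 -n argo -o jsonpath='{.spec.containers[*].name}'
which for this pod prints wait main.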