Pods starting but not working in Kubernetes

I created a Kubernetes Deployment with 3 Pods, and all of them are running fine, but when I try to reach them it doesn't work. I tried curl-ing the internal IPs of the Pods; in the describe output I could see this error: MountVolume.SetUp failed for volume "default-token-twhht" : failed to sync secret cache.
The events are below:
5m51s Normal RegisteredNode node/ip-10-1-1-4 Node ip-10-1-1-4 event: Registered Node ip-10-1-1-4 in Controller
57m Normal Scheduled pod/nginx-deployment-585449566-9bqp7 Successfully assigned default/nginx-deployment-585449566-9bqp7 to ip-10-1-1-4
57m Warning FailedMount pod/nginx-deployment-585449566-9bqp7 MountVolume.SetUp failed for volume "default-token-twhht" : failed to sync secret cache: timed out waiting for the condition
57m Normal Pulling pod/nginx-deployment-585449566-9bqp7 Pulling image "nginx:latest"
56m Normal Pulled pod/nginx-deployment-585449566-9bqp7 Successfully pulled image "nginx:latest" in 12.092210534s
56m Normal Created pod/nginx-deployment-585449566-9bqp7 Created container nginx
56m Normal Started pod/nginx-deployment-585449566-9bqp7 Started container nginx
57m Normal Scheduled pod/nginx-deployment-585449566-9hlhz Successfully assigned default/nginx-deployment-585449566-9hlhz to ip-10-1-1-4
57m Warning FailedMount pod/nginx-deployment-585449566-9hlhz MountVolume.SetUp failed for volume "default-token-twhht" : failed to sync secret cache: timed out waiting for the condition
57m Normal Pulling pod/nginx-deployment-585449566-9hlhz Pulling image "nginx:latest"
56m Normal Pulled pod/nginx-deployment-585449566-9hlhz Successfully pulled image "nginx:latest" in 15.127984291s
56m Normal Created pod/nginx-deployment-585449566-9hlhz Created container nginx
56m Normal Started pod/nginx-deployment-585449566-9hlhz Started container nginx
57m Normal Scheduled pod/nginx-deployment-585449566-ffkwf Successfully assigned default/nginx-deployment-585449566-ffkwf to ip-10-1-1-4
57m Warning FailedMount pod/nginx-deployment-585449566-ffkwf MountVolume.SetUp failed for volume "default-token-twhht" : failed to sync secret cache: timed out waiting for the condition
57m Normal Pulling pod/nginx-deployment-585449566-ffkwf Pulling image "nginx:latest"
56m Normal Pulled pod/nginx-deployment-585449566-ffkwf Successfully pulled image "nginx:latest" in 9.459864756s
56m Normal Created pod/nginx-deployment-585449566-ffkwf Created container nginx

You can add an additional RBAC role permission to your Pod's service account.
Make sure as well that you have workload identity set up.
This can also happen when the apiserver is under high load; you could use more, smaller nodes to spread your Pods across, and increase your resource requests.
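As a rough sketch of that last point, resource requests go on the container spec of the Deployment from the question; the values here are illustrative, not tuned:
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        resources:
          requests:
            cpu: 100m      # illustrative value
            memory: 128Mi  # illustrative value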

This error message is a bit misleading, since it suggests a cluster-internal connectivity problem. In reality it is an RBAC permission problem.
The default service account in the namespace you are deploying to is not authorized to mount the secret that you are trying to mount into your Pod.
To solve this, add an additional RBAC role permission to your Pod's service account.
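A minimal sketch of such a Role and RoleBinding, assuming the default namespace and service account (the resource names here are illustrative):
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: secret-reader
  namespace: default
rules:
- apiGroups: [""]          # core API group, where Secrets live
  resources: ["secrets"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: default-sa-secret-reader
  namespace: default
subjects:
- kind: ServiceAccount
  name: default
  namespace: default
roleRef:
  kind: Role
  name: secret-reader
  apiGroup: rbac.authorization.k8s.io
Apply it with kubectl apply -f and re-create the Pods so the token mount is retried.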

Related

Istio bookinfo sample application deployment failed with "ImagePullBackOff"

I am trying to deploy Istio's sample bookinfo application using the command below, from the Istio docs:
kubectl apply -f samples/bookinfo/platform/kube/bookinfo.yaml
but each time I get an ImagePullBackOff error like this:
NAME                              READY   STATUS             RESTARTS   AGE
details-v1-c74755ddf-m878f        2/2     Running            0          6m32s
productpage-v1-778ddd95c6-pdqsk   2/2     Running            0          6m32s
ratings-v1-5564969465-956bq       2/2     Running            0          6m32s
reviews-v1-56f6655686-j7lb6       1/2     ImagePullBackOff   0          6m32s
reviews-v2-6b977f8ff5-55tgm       1/2     ImagePullBackOff   0          6m32s
reviews-v3-776b979464-9v7x5       1/2     ImagePullBackOff   0          6m32s
For error details, I ran:
kubectl describe pod reviews-v1-56f6655686-j7lb6
which returned these events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 7m41s default-scheduler Successfully assigned default/reviews-v1-56f6655686-j7lb6 to minikube
Normal Pulled 7m39s kubelet Container image "docker.io/istio/proxyv2:1.15.3" already present on machine
Normal Created 7m39s kubelet Created container istio-init
Normal Started 7m39s kubelet Started container istio-init
Warning Failed 5m39s kubelet Failed to pull image "docker.io/istio/examples-bookinfo-reviews-v1:1.17.0": rpc error: code = Unknown desc = context deadline exceeded
Warning Failed 5m39s kubelet Error: ErrImagePull
Normal Pulled 5m39s kubelet Container image "docker.io/istio/proxyv2:1.15.3" already present on machine
Normal Created 5m39s kubelet Created container istio-proxy
Normal Started 5m39s kubelet Started container istio-proxy
Normal BackOff 5m36s (x3 over 5m38s) kubelet Back-off pulling image "docker.io/istio/examples-bookinfo-reviews-v1:1.17.0"
Warning Failed 5m36s (x3 over 5m38s) kubelet Error: ImagePullBackOff
Normal Pulling 5m25s (x2 over 7m38s) kubelet Pulling image "docker.io/istio/examples-bookinfo-reviews-v1:1.17.0"
Do I need to build the Dockerfile first and push the image to a local repository? There are no clear instructions there, or I failed to find any.
Can anybody help?
If you check Docker Hub, the image is there:
https://hub.docker.com/r/istio/examples-bookinfo-reviews-v1/tags
So the error you need to deal with is the context deadline exceeded while pulling from Docker Hub. This is likely a networking problem (it is a generic Go error saying the operation took too long); depending on where your cluster is running, you can manually do a docker pull from the nodes, and that should work.
EDIT: for minikube, do a minikube ssh and then a docker pull.
Solved the problem with the command below:
minikube ssh docker pull istio/examples-bookinfo-reviews-v1:1.17.0
found via a GitHub issue.
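Since reviews-v2 and reviews-v3 hit the same ImagePullBackOff, the same fix presumably applies to their images (a sketch; the 1.17.0 tags are assumed from the v1 image name):
minikube ssh docker pull istio/examples-bookinfo-reviews-v2:1.17.0
minikube ssh docker pull istio/examples-bookinfo-reviews-v3:1.17.0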
See also: How to use local docker images with Minikube?
Hope this helps somebody.

Pod cannot mount Persistent Volume created by ozone CSI provisioner

I am using Kubernetes to deploy Ozone (a substitute for HDFS), and basically followed the instructions from the Ozone docs (just a few steps).
First I created a few PVs with hostPath pointing to a local dir, then I slightly edited the YAMLs from ozone/kubernetes/example/ozone, changing the NFS claim to a hostPath claim:
volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      storageClassName: manual
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 5Gi
      selector:
        matchLabels:
          type: local
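For reference, a hostPath PV that would match this claim template looks roughly like the following (a sketch; the name and path are illustrative):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ozone-local-pv-0   # illustrative name
  labels:
    type: local            # matches the claim selector above
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt/ozone/pv-0  # illustrative local dir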
I also commented out the nodeAffinity settings in datanode-stateful.yaml, since my Kubernetes cluster only had a master node.
The deployment was successful.
Then I applied the CSI and pv-test manifests as the CSI instructions said; the PV (a bucket in s3v) was created automatically and the PVC bound to it, but the test pod got stuck at ContainerCreating.
Attaching the pv-test pod description:
Name: ozone-csi-test-webserver-778c8c87b7-rngfk
Namespace: default
Priority: 0
Node: k8s-master/192.168.100.202
Start Time: Fri, 18 Jun 2021 14:23:54 +0800
Labels: app=ozone-csi-test-webserver
pod-template-hash=778c8c87b7
Annotations: <none>
Status: Pending
IP:
Controlled By: ReplicaSet/ozone-csi-test-webserver-778c8c87b7
Containers:
  web:
    Container ID:
    Image:          python:3.7.3-alpine3.8
    Image ID:
    Port:           <none>
    Host Port:      <none>
    Args:
      python
      -m
      http.server
      --directory
      /www
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-gqknv (ro)
      /www from webroot (rw)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  webroot:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  ozone-csi-test-webserver
    ReadOnly:   false
  default-token-gqknv:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-gqknv
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedMount 7m7s (x58 over 122m) kubelet, k8s-master MountVolume.SetUp failed for volume "pvc-1913bd70-09fd-4eba-a459-73fe3bd397b8" : rpc error: code = Unknown desc =
Warning FailedMount 31s (x54 over 120m) kubelet, k8s-master Unable to mount volumes for pod "ozone-csi-test-webserver-778c8c87b7-rngfk_default(b1a59143-00b9-47f6-94fe-1845c29aee93)": timeout expired waiting for volumes to attach or mount for pod "default"/"ozone-csi-test-webserver-778c8c87b7-rngfk". list of unmounted volumes=[webroot]. list of unattached volumes=[webroot default-token-gqknv]
Attaching the events for the whole process:
7m51s Normal SuccessfulCreate statefulset/s3g create Claim data-s3g-0 Pod s3g-0 in StatefulSet s3g success
7m51s Warning FailedScheduling pod/s3g-0 pod has unbound immediate PersistentVolumeClaims (repeated 2 times)
7m51s Normal SuccessfulCreate statefulset/scm create Pod scm-0 in StatefulSet scm successful
7m51s Normal SuccessfulCreate statefulset/om create Pod om-0 in StatefulSet om successful
7m51s Warning FailedScheduling pod/om-0 pod has unbound immediate PersistentVolumeClaims (repeated 2 times)
7m51s Warning FailedScheduling pod/datanode-0 pod has unbound immediate PersistentVolumeClaims (repeated 2 times)
7m51s Normal SuccessfulCreate statefulset/datanode create Pod datanode-0 in StatefulSet datanode successful
7m51s Normal SuccessfulCreate statefulset/datanode create Claim data-datanode-0 Pod datanode-0 in StatefulSet datanode success
7m51s Normal SuccessfulCreate statefulset/scm create Claim data-scm-0 Pod scm-0 in StatefulSet scm success
7m51s Normal SuccessfulCreate statefulset/s3g create Pod s3g-0 in StatefulSet s3g successful
7m51s Normal SuccessfulCreate statefulset/om create Claim data-om-0 Pod om-0 in StatefulSet om success
7m51s Warning FailedScheduling pod/scm-0 pod has unbound immediate PersistentVolumeClaims (repeated 2 times)
7m50s Normal Scheduled pod/s3g-0 Successfully assigned default/s3g-0 to hadoop104
7m50s Normal Scheduled pod/datanode-0 Successfully assigned default/datanode-0 to hadoop103
7m50s Normal Scheduled pod/scm-0 Successfully assigned default/scm-0 to hadoop104
7m50s Normal Scheduled pod/om-0 Successfully assigned default/om-0 to hadoop103
7m49s Normal Created pod/datanode-0 Created container datanode
7m49s Normal Started pod/datanode-0 Started container datanode
7m49s Normal Pulled pod/datanode-0 Container image "apache/ozone:1.1.0" already present on machine
7m48s Normal SuccessfulCreate statefulset/datanode create Claim data-datanode-1 Pod datanode-1 in StatefulSet datanode success
7m48s Warning FailedScheduling pod/datanode-1 pod has unbound immediate PersistentVolumeClaims (repeated 2 times)
7m48s Normal Pulled pod/scm-0 Container image "apache/ozone:1.1.0" already present on machine
7m48s Normal Created pod/scm-0 Created container init
7m48s Normal Started pod/scm-0 Started container init
7m48s Normal Pulled pod/s3g-0 Container image "apache/ozone:1.1.0" already present on machine
7m48s Normal Created pod/s3g-0 Created container s3g
7m48s Normal Started pod/s3g-0 Started container s3g
7m48s Normal SuccessfulCreate statefulset/datanode create Pod datanode-1 in StatefulSet datanode successful
7m46s Normal Scheduled pod/datanode-1 Successfully assigned default/datanode-1 to hadoop104
7m45s Normal Created pod/datanode-1 Created container datanode
7m45s Normal Pulled pod/datanode-1 Container image "apache/ozone:1.1.0" already present on machine
7m44s Normal Created pod/scm-0 Created container scm
7m44s Normal Started pod/scm-0 Started container scm
7m44s Normal Started pod/datanode-1 Started container datanode
7m44s Normal Pulled pod/scm-0 Container image "apache/ozone:1.1.0" already present on machine
7m43s Warning FailedScheduling pod/datanode-2 pod has unbound immediate PersistentVolumeClaims (repeated 2 times)
7m43s Normal SuccessfulCreate statefulset/datanode create Pod datanode-2 in StatefulSet datanode successful
7m43s Normal SuccessfulCreate statefulset/datanode create Claim data-datanode-2 Pod datanode-2 in StatefulSet datanode success
7m42s Normal Scheduled pod/datanode-2 Successfully assigned default/datanode-2 to hadoop103
7m38s Normal Pulled pod/datanode-2 Container image "apache/ozone:1.1.0" already present on machine
7m38s Normal Created pod/datanode-2 Created container datanode
7m38s Normal Started pod/datanode-2 Started container datanode
7m23s Normal ScalingReplicaSet deployment/csi-provisioner Scaled up replica set csi-provisioner-5649bc9474 to 1
7m23s Warning FailedCreate daemonset/csi-node Error creating: pods "csi-node-" is forbidden: error looking up service account default/csi-ozone: serviceaccount "csi-ozone" not found
7m22s Normal Scheduled pod/csi-node-nbfnw Successfully assigned default/csi-node-nbfnw to hadoop104
7m22s Normal Scheduled pod/csi-provisioner-5649bc9474-n5jf2 Successfully assigned default/csi-provisioner-5649bc9474-n5jf2 to hadoop103
7m22s Normal SuccessfulCreate replicaset/csi-provisioner-5649bc9474 Created pod: csi-provisioner-5649bc9474-n5jf2
7m22s Normal Scheduled pod/csi-node-c97fz Successfully assigned default/csi-node-c97fz to hadoop103
7m22s Normal SuccessfulCreate daemonset/csi-node Created pod: csi-node-c97fz
7m22s Normal SuccessfulCreate daemonset/csi-node Created pod: csi-node-nbfnw
7m14s Normal Pulling pod/csi-node-c97fz Pulling image "quay.io/k8scsi/csi-node-driver-registrar:v1.0.2"
7m14s Normal Pulling pod/csi-provisioner-5649bc9474-n5jf2 Pulling image "quay.io/k8scsi/csi-provisioner:v1.0.1"
7m13s Normal Pulling pod/csi-node-nbfnw Pulling image "quay.io/k8scsi/csi-node-driver-registrar:v1.0.2"
6m56s Warning Unhealthy pod/om-0 Liveness probe failed: dial tcp 10.244.1.7:9862: connect: connection refused
6m56s Normal Killing pod/om-0 Container om failed liveness probe, will be restarted
6m55s Normal Created pod/om-0 Created container om
6m55s Normal Started pod/om-0 Started container om
6m55s Normal Pulled pod/om-0 Container image "apache/ozone:1.1.0" already present on machine
6m48s Normal Pulled pod/csi-provisioner-5649bc9474-n5jf2 Successfully pulled image "quay.io/k8scsi/csi-provisioner:v1.0.1"
6m48s Normal Started pod/csi-provisioner-5649bc9474-n5jf2 Started container ozone-csi
6m48s Normal Created pod/csi-provisioner-5649bc9474-n5jf2 Created container ozone-csi
6m48s Normal Pulled pod/csi-provisioner-5649bc9474-n5jf2 Container image "apache/ozone:1.1.0" already present on machine
6m48s Normal Started pod/csi-provisioner-5649bc9474-n5jf2 Started container csi-provisioner
6m48s Normal Created pod/csi-provisioner-5649bc9474-n5jf2 Created container csi-provisioner
6m45s Normal Pulled pod/csi-node-nbfnw Successfully pulled image "quay.io/k8scsi/csi-node-driver-registrar:v1.0.2"
6m44s Normal Started pod/csi-node-nbfnw Started container driver-registrar
6m44s Normal Started pod/csi-node-nbfnw Started container csi-node
6m44s Normal Created pod/csi-node-nbfnw Created container csi-node
6m44s Normal Created pod/csi-node-nbfnw Created container driver-registrar
6m44s Normal Pulled pod/csi-node-nbfnw Container image "apache/ozone:1.1.0" already present on machine
6m25s Normal Pulled pod/csi-node-c97fz Successfully pulled image "quay.io/k8scsi/csi-node-driver-registrar:v1.0.2"
6m25s Normal Pulled pod/csi-node-c97fz Container image "apache/ozone:1.1.0" already present on machine
6m25s Normal Started pod/csi-node-c97fz Started container csi-node
6m25s Normal Created pod/csi-node-c97fz Created container csi-node
6m17s Normal Created pod/csi-node-c97fz Created container driver-registrar
6m17s Normal Pulled pod/csi-node-c97fz Container image "quay.io/k8scsi/csi-node-driver-registrar:v1.0.2" already present on machine
6m17s Normal Started pod/csi-node-c97fz Started container driver-registrar
6m3s Normal Provisioning persistentvolumeclaim/ozone-csi-test-webserver External provisioner is provisioning volume for claim "default/ozone-csi-test-webserver"
6m3s Normal ScalingReplicaSet deployment/ozone-csi-test-webserver Scaled up replica set ozone-csi-test-webserver-7cbdc5d65c to 1
6m3s Normal SuccessfulCreate replicaset/ozone-csi-test-webserver-7cbdc5d65c Created pod: ozone-csi-test-webserver-7cbdc5d65c-dpzhc
6m3s Normal ExternalProvisioning persistentvolumeclaim/ozone-csi-test-webserver waiting for a volume to be created, either by external provisioner "org.apache.hadoop.ozone" or manually created by system administrator
6m2s Warning FailedScheduling pod/ozone-csi-test-webserver-7cbdc5d65c-dpzhc pod has unbound immediate PersistentVolumeClaims (repeated 2 times)
6m1s Normal ProvisioningSucceeded persistentvolumeclaim/ozone-csi-test-webserver Successfully provisioned volume pvc-cd01c58d-793f-41ce-9e12-057ade02e07c
5m59s Normal Scheduled pod/ozone-csi-test-webserver-7cbdc5d65c-dpzhc Successfully assigned default/ozone-csi-test-webserver-7cbdc5d65c-dpzhc to hadoop104
97s Warning FailedMount pod/ozone-csi-test-webserver-7cbdc5d65c-dpzhc Unable to attach or mount volumes: unmounted volumes=[webroot], unattached volumes=[webroot default-token-l9lng]: timed out waiting for the condition
94s Warning FailedMount pod/ozone-csi-test-webserver-7cbdc5d65c-dpzhc MountVolume.SetUp failed for volume "pvc-cd01c58d-793f-41ce-9e12-057ade02e07c" : kubernetes.io/csi: mounter.SetupAt failed: rpc error: code = Unknown desc =
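The rpc error description is empty, so the CSI plugin's own logs are probably the next place to look, e.g. (pod and container names taken from the events above):
kubectl logs csi-provisioner-5649bc9474-n5jf2 -c ozone-csi
kubectl logs csi-node-nbfnw -c csi-node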

helm3 installation fails with: failed post-install: timed out waiting for the condition

I'm new to Kubernetes and Helm. I have installed k3d and helm:
k3d version v1.7.0
k3s version v1.17.3-k3s1
helm version
version.BuildInfo{Version:"v3.2.4", GitCommit:"0ad800ef43d3b826f31a5ad8dfbb4fe05d143688", GitTreeState:"clean", GoVersion:"go1.13.12"}
I do have a cluster created with 10 worker nodes. When I try to install stackstorm-ha on the cluster I see the following issues:
helm install stackstorm/stackstorm-ha --generate-name --debug
client.go:534: [debug] stackstorm-ha-1592860860-job-st2-apikey-load: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
Error: failed post-install: timed out waiting for the condition
helm.go:84: [debug] failed post-install: timed out waiting for the condition
njbbmacl2813:~ gangsh9$ kubectl get pods
Unable to connect to the server: net/http: TLS handshake timeout
kubectl describe pods shows either:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned default/stackstorm-ha-1592857897-st2api-7f6c877b9c-dtcp5 to k3d-st2hatest-worker-5
Warning Failed 23m kubelet, k3d-st2hatest-worker-5 Error: context deadline exceeded
Normal Pulling 17m (x5 over 37m) kubelet, k3d-st2hatest-worker-5 Pulling image "stackstorm/st2api:3.3dev"
Normal Pulled 17m (x5 over 28m) kubelet, k3d-st2hatest-worker-5 Successfully pulled image "stackstorm/st2api:3.3dev"
Normal Created 17m (x5 over 28m) kubelet, k3d-st2hatest-worker-5 Created container st2api
Normal Started 17m (x4 over 28m) kubelet, k3d-st2hatest-worker-5 Started container st2api
Warning BackOff 53s (x78 over 20m) kubelet, k3d-st2hatest-worker-5 Back-off restarting failed container
or
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned default/stackstorm-ha-1592857897-st2timersengine-c847985d6-74h5k to k3d-st2hatest-worker-2
Warning Failed 6m23s kubelet, k3d-st2hatest-worker-2 Failed to pull image "stackstorm/st2timersengine:3.3dev": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/stackstorm/st2timersengine:3.3dev": failed to resolve reference "docker.io/stackstorm/st2timersengine:3.3dev": failed to authorize: failed to fetch anonymous token: Get https://auth.docker.io/token?scope=repository%3Astackstorm%2Fst2timersengine%3Apull&service=registry.docker.io: net/http: TLS handshake timeout
Warning Failed 6m23s kubelet, k3d-st2hatest-worker-2 Error: ErrImagePull
Normal BackOff 6m22s kubelet, k3d-st2hatest-worker-2 Back-off pulling image "stackstorm/st2timersengine:3.3dev"
Warning Failed 6m22s kubelet, k3d-st2hatest-worker-2 Error: ImagePullBackOff
Normal Pulling 6m10s (x2 over 6m37s) kubelet, k3d-st2hatest-worker-2 Pulling image "stackstorm/st2timersengine:3.3dev"
Kind of stuck here.
Any help would be greatly appreciated.
The TLS handshake timeout error is very common when the machine you are running your deployment on is running out of resources. It can also be caused by a slow internet connection or proxy settings, but we can rule those out, since you can pull and run Docker images locally and deploy a small nginx webserver in your cluster.
As you may notice, the stackstorm helm chart installs a large number of services/pods inside your cluster, which can take up a lot of resources:
It will install 2 replicas for each component of StackStorm microservices for redundancy, as well as backends like RabbitMQ HA, MongoDB HA Replicaset and etcd cluster that st2 relies on for MQ, DB and distributed coordination respectively.
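To confirm resource pressure, you can compare what the nodes actually have against what the chart asks for (a sketch; kubectl top requires metrics-server to be installed):
kubectl top nodes
kubectl describe nodes | grep -A 7 "Allocated resources"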
I deployed stackstorm on both k3d and GKE, but I had to use fast machines in order to deploy it quickly and successfully.
NAME: stackstorm
LAST DEPLOYED: Mon Jun 29 15:25:52 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
NOTES:
Congratulations! You have just deployed StackStorm HA!

Trouble installing GlusterFS on a Kubernetes cluster using Heketi

I am trying to install GlusterFS on my Kubernetes cluster using Heketi. When I start gk-deploy, it shows that pods aren't found:
Using Kubernetes CLI.
Using namespace "default".
Checking for pre-existing resources...
GlusterFS pods ... not found.
deploy-heketi pod ... not found.
heketi pod ... not found.
gluster-s3 pod ... not found.
Creating initial resources ... Error from server (AlreadyExists): error when creating "/heketi/gluster-kubernetes/deploy/kube-templates/heketi-service-account.yaml": serviceaccounts "heketi-service-account" already exists
Error from server (AlreadyExists): clusterrolebindings.rbac.authorization.k8s.io "heketi-sa-view" already exists
clusterrolebinding.rbac.authorization.k8s.io/heketi-sa-view not labeled
OK
node/sapdh2wrk1 not labeled
node/sapdh2wrk2 not labeled
node/sapdh2wrk3 not labeled
daemonset.extensions/glusterfs created
Waiting for GlusterFS pods to start ... pods not found.
I've started gk-deploy more than once.
I have 3 nodes in my Kubernetes cluster, and it seems like pods can't start up on any of them, but I don't understand why.
The pods are created but aren't ready:
kubectl get pods
NAME                      READY   STATUS              RESTARTS   AGE
glusterfs-65mc7           0/1     Running             0          16m
glusterfs-gnxms           0/1     Running             0          16m
glusterfs-htkmh           0/1     Running             0          16m
heketi-754dfc7cdf-zwpwn   0/1     ContainerCreating   0          74m
Here are the events of one GlusterFS Pod; they end with warnings:
Events:
Type Reason Age From Message
Normal Scheduled 19m default-scheduler Successfully assigned default/glusterfs-65mc7 to sapdh2wrk1
Normal Pulled 19m kubelet, sapdh2wrk1 Container image "gluster/gluster-centos:latest" already present on machine
Normal Created 19m kubelet, sapdh2wrk1 Created container
Normal Started 19m kubelet, sapdh2wrk1 Started container
Warning Unhealthy 13m (x12 over 18m) kubelet, sapdh2wrk1 Liveness probe failed: /usr/local/bin/status-probe.sh
failed check: systemctl -q is-active glusterd.service
Warning Unhealthy 3m58s (x35 over 18m) kubelet, sapdh2wrk1 Readiness probe failed: /usr/local/bin/status-probe.sh
failed check: systemctl -q is-active glusterd.service
Glusterfs-5.8-100.1 is installed and started on every node, including the master.
What is the reason the Pods don't start up?
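One way to dig further (a sketch, using a pod name from the output above) is to run the failing probe check by hand inside the container, since the probes call systemctl:
kubectl exec glusterfs-65mc7 -- systemctl is-active glusterd.service
kubectl exec glusterfs-65mc7 -- systemctl status glusterd.service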

Kubernetes pod deployment error: FailedSync | Error syncing pod

Env:
VirtualBox on a Windows 10 desktop machine
Two Ubuntu VMs: one is the master and the other is a k8s (1.7) worker.
I can see both nodes are "Ready" when I get nodes. But even when deploying a very simple nginx pod, I get these error messages from the pod describe output:
"Normal | SandboxChanged | Pod sandbox changed, it will be killed and re-created." and "Warning | FailedSync | Error syncing pod".
But if I run the Docker container directly on the worker, the container comes up and runs fine. Does anyone have a suggestion for what I can check?
k8s-master@k8smaster-VirtualBox:~$ kubectl get pods
NAME                            READY   STATUS             RESTARTS   AGE
movie-server-1517284798-lbb01   0/1     CrashLoopBackOff   6          16m

k8s-master@k8smaster-VirtualBox:~$ kubectl describe pod movie-server-1517284798-lbb01
--- clip ---
kubelet, master-virtualbox  spec.containers{movie-server}  Warning  Failed          Error: failed to start container "movie-server": Error response from daemon: {"message":"cannot join network of a non running container: 3f59947dbd404ecf2f6dd0b65dd9dad8b25bf0c418aceb8cf666ad0761402b53"}
kubelet, master-virtualbox  spec.containers{movie-server}  Warning  BackOff         Back-off restarting failed container
kubelet, master-virtualbox                                 Normal   SandboxChanged  Pod sandbox changed, it will be killed and re-created.
kubelet, master-virtualbox  spec.containers{movie-server}  Normal   Pulled          Container image "nancyfeng/movie-server:0.1.0" already present on machine
kubelet, master-virtualbox  spec.containers{movie-server}  Normal   Created         Created container
kubelet, master-virtualbox                                 Warning  FailedSync      Error syncing pod
kubelet, master-virtualbox  spec.containers{movie-server}  Warning  Failed          Error: failed to start container "movie-server": Error response from daemon: {"message":"cannot join network of a non running container: 72ba77b25b6a3969e8921214f0ca73ffaab4c82d8a2852e3d1b1f3ac5dde6ce1"}
--- clip ---
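The "cannot join network of a non running container" errors say the app container is trying to join the network namespace of a pod sandbox (pause container) that has already died, which matches the SandboxChanged events. A hedged first step is to inspect the kubelet logs and the sandbox containers on the worker:
journalctl -u kubelet --no-pager | tail -n 100
docker ps -a | grep pause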