Kubernetes NFS volume mount fails with exit status 32

I have a Kubernetes setup installed on my Ubuntu machine. I'm trying to set up an NFS volume and mount it into a container, following this document: http://kubernetes.io/v1.1/examples/nfs/.
NFS service and pod configurations:
kind: Service
apiVersion: v1
metadata:
  name: nfs-server
spec:
  ports:
  - port: 2049
  selector:
    role: nfs-server
---
apiVersion: v1
kind: Pod
metadata:
  name: nfs-server
  labels:
    role: nfs-server
spec:
  containers:
  - name: nfs-server
    image: jsafrane/nfs-data
    ports:
    - name: nfs
      containerPort: 2049
    securityContext:
      privileged: true
Pod configuration that mounts the NFS volume:
apiVersion: v1
kind: Pod
metadata:
  name: nfs-web
spec:
  containers:
  - name: web
    image: nginx
    ports:
    - name: web
      containerPort: 80
    volumeMounts:
    # name must match the volume name below
    - name: nfs
      mountPath: "/usr/share/nginx/html"
  volumes:
  - name: nfs
    nfs:
      # FIXME: use the right hostname
      server: 192.168.3.201
      path: "/"
When I run kubectl describe pod nfs-web I get the following output, which shows it was unable to mount the NFS volume. What could be the reason for that?
Name: nfs-web
Namespace: default
Image(s): nginx
Node: 192.168.1.114/192.168.1.114
Start Time: Sun, 06 Dec 2015 08:31:06 +0530
Labels: <none>
Status: Pending
Reason:
Message:
IP:
Replication Controllers: <none>
Containers:
web:
Container ID:
Image: nginx
Image ID:
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment Variables:
Conditions:
Type Status
Ready False
Volumes:
nfs:
Type: NFS (an NFS mount that lasts the lifetime of a pod)
Server: 192.168.3.201
Path: /
ReadOnly: false
default-token-nh698:
Type: Secret (a secret that should populate this volume)
SecretName: default-token-nh698
Events:
FirstSeen LastSeen Count From SubobjectPath Reason Message
───────── ──────── ───── ──── ───────────── ────── ───────
36s 36s 1 {scheduler } Scheduled Successfully assigned nfs-web to 192.168.1.114
36s 2s 5 {kubelet 192.168.1.114} FailedMount Unable to mount volumes for pod "nfs-web_default": exit status 32
36s 2s 5 {kubelet 192.168.1.114} FailedSync Error syncing pod, skipping: exit status 32

I had the same problem, and I solved it by installing nfs-common on every Kubernetes node.
apt-get install -y nfs-common
My nodes were installed without nfs-common. Kubernetes asks each node to mount the NFS export into a specific directory so it is available to the pod. Because mount.nfs was not found on the node, the mount failed.
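You can confirm whether the mount helper is present before retrying; this is just an optional sanity check with standard tools:
# If this prints nothing, mount.nfs is missing and nfs-common needs to be installed
which mount.nfs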
Good luck!

It looks like volumes.nfs.server=192.168.3.201 is incorrectly configured on your client. It should be set to the ClusterIP address of your nfs-server Service.
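A minimal sketch of that change, assuming the Service and Pod definitions from the question (the actual ClusterIP will differ on your cluster):
# Look up the ClusterIP assigned to the nfs-server Service
kubectl get svc nfs-server -o jsonpath='{.spec.clusterIP}'

# ...then use that address in the pod's volume definition instead of 192.168.3.201
volumes:
- name: nfs
  nfs:
    server: <cluster-ip-of-nfs-server-service>   # placeholder
    path: "/"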

Had the same issue with an NFS server that only allowed root mounts.
Fixed by either:
a. allowing non-root users to mount NFS (on the server), or
b. adding mount options to the PersistentVolume:
mountOptions:
- nfsvers=4.1
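For reference, a minimal PersistentVolume sketch with that option set (the name, capacity, and server address are placeholders, not values from the question):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv                       # placeholder name
spec:
  capacity:
    storage: 1Gi                     # placeholder size
  accessModes:
  - ReadWriteMany
  mountOptions:
  - nfsvers=4.1
  nfs:
    server: <nfs-server-address>     # placeholder
    path: /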

I fixed this issue by installing nfs-utils on the worker nodes.

In my case the issue was that I hadn't declared the host in the NFS server's /etc/exports file. After adding an entry there for my host, the volume worked correctly.
If you modify the file in any way, you also need to restart the service:
sudo systemctl restart nfs-kernel-server
An example of an entry in the /etc/exports file:
/var/nfs/home 192.111.222.333(rw,sync,no_subtree_check)
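After editing /etc/exports you can also re-export and check the result without a full restart; these are standard NFS server utilities, shown here as an optional verification step:
sudo exportfs -ra          # re-read /etc/exports
showmount -e localhost     # list what the server currently exports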

In my case, the issue was that the folder defined in the hostPath volume had not been created locally. Once the folder was created on the worker node, the issue was resolved (see the sketch after the events below).
Warning FailedMount 3m18s kubelet Unable to attach or mount volumes: unmounted volumes=[temp-volume], unattached volumes=[nfsvol-vre-data temp1-volume consumer1-serviceaccount-token-sdfsdf nfsvol]: timed out waiting for the condition
Warning FailedMount 71s (x10 over 5m20s) kubelet MountVolume.SetUp failed for volume "temp-volume" : hostPath type check failed: /tmp/folder is not a directory
Warning FailedMount 63s kubelet Unable to attach or mount volumes: unmounted volumes=[temp-volume], unattached volumes=[nfsvol nfsvol-vre-data temp1-volume consumer1-serviceaccount-token-sdfsdf]: timed out waiting for the condition
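If it fits your use case, you can also let the kubelet create the directory for you by giving the hostPath volume an explicit type; this sketch reuses the volume name and path from the events above:
volumes:
- name: temp-volume
  hostPath:
    path: /tmp/folder
    type: DirectoryOrCreate   # creates the directory on the node if it does not already exist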

You need to execute the following on each master and worker node:
sudo yum install nfs-utils -y

Related

Back-off restarting failed container in Azure AKS

A Linux container pod, using Docker images from Azure Container Registry, keeps restarting with restartPolicy set to Always. The pod description is below.
kubectl describe pod example-pod
...
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Thu, 11 Jun 2020 03:27:11 +0000
Finished: Thu, 11 Jun 2020 03:27:12 +0000
...
Back-off restarting failed container
This pod is created with a secret to access the ACR registry repository.
The reason is that the pod completes execution successfully with exit code 0, whereas it should keep listening on a particular port number. The relevant Microsoft documentation is the Container Group Runtime page, under the heading "Container continually exits and restarts".
The content of the deployment-example.yml file is below.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
  namespace: development
  labels:
    app: example
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
      - name: example
        image: contentocr.azurecr.io/example:latest
        #command: ["ping -t localhost"]
        imagePullPolicy: Always
        ports:
        - name: http-port
          containerPort: 3000
      imagePullSecrets:
      - name: regpass
      restartPolicy: Always
      nodeSelector:
        agent: linux
---
apiVersion: v1
kind: Service
metadata:
  name: example
  namespace: development
  labels:
    app: example
spec:
  ports:
  - name: http-port
    port: 3000
    targetPort: 3000
  selector:
    app: example
  type: LoadBalancer
The output of kubectl get events is below.
3m39s Normal Scheduled pod/example-deployment-5dc964fcf8-gbm5t Successfully assigned development/example-deployment-5dc964fcf8-gbm5t to aks-agentpool-18342716-vmss000000
2m6s Normal Pulling pod/example-deployment-5dc964fcf8-gbm5t Pulling image "contentocr.azurecr.io/example:latest"
2m5s Normal Pulled pod/example-deployment-5dc964fcf8-gbm5t Successfully pulled image "contentocr.azurecr.io/example:latest"
2m5s Normal Created pod/example-deployment-5dc964fcf8-gbm5t Created container example
2m49s Normal Started pod/example-deployment-5dc964fcf8-gbm5t Started container example
2m20s Warning BackOff pod/example-deployment-5dc964fcf8-gbm5t Back-off restarting failed container
6m6s Normal SuccessfulCreate replicaset/example-deployment-5dc964fcf8 Created pod: example-deployment-5dc964fcf8-2fdt5
3m39s Normal SuccessfulCreate replicaset/example-deployment-5dc964fcf8 Created pod: example-deployment-5dc964fcf8-gbm5t
6m6s Normal ScalingReplicaSet deployment/example-deployment Scaled up replica set example-deployment-5dc964fcf8 to 1
3m39s Normal ScalingReplicaSet deployment/example-deployment Scaled up replica set example-deployment-5dc964fcf8 to 1
3m38s Normal EnsuringLoadBalancer service/example Ensuring load balancer
3m34s Normal EnsuredLoadBalancer service/example Ensured load balancer
The Dockerfile entry point is ENTRYPOINT ["npm", "start"] with CMD ["tail -f /dev/null/"].
It runs locally; implicitly, it assigns the CI="true" flag. However, in docker-compose, stdin_open: true or tty: true has to be set, and in the Kubernetes deployment file an environment variable named CI has to be set to "true" (a sketch follows below).
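A sketch of what that looks like in the Deployment's container spec (only the relevant fields are shown; whether you need the CI variable, a TTY, or both depends on how npm start behaves in your image):
containers:
- name: example
  image: contentocr.azurecr.io/example:latest
  env:
  - name: CI
    value: "true"
  # Kubernetes equivalents of docker-compose's stdin_open / tty
  stdin: true
  tty: true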
The command below solved my problem:
az aks update -n aks-nks-k8s-cluster -g aks-nks-k8s-rg --attach-acr aksnksk8s
After executing the above command, the following will be displayed:
Add ROLE Propagation done [###############] 100.0000%
and then Running.., followed by a response trail after some time.
Here,
aks-nks-k8s-cluster: the cluster name I created and am using
aks-nks-k8s-rg: the resource group I created and am using
aksnksk8s: the container registry I created and am using

Kubernetes Pod's containers not running when using sh commands

Pod containers are not ready and get stuck in the Waiting state over and over, every single time after they run sh commands (/bin/sh as well).
For example, all of the pod's containers from https://v1-17.docs.kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#define-container-environment-variables-with-data-from-multiple-configmaps just go to "Completed" status after executing the sh command, or, if I set restartPolicy: Always, they sit in the Waiting state with the reason CrashLoopBackOff.
(Containers work fine if I do not set any command on them.)
If I use the sh command within the container, after creating it I can confirm with kubectl logs that the env variable was set correctly.
The expected behaviour is to get the pod's containers running after they execute the sh command.
I cannot find references regarding this particular problem and I need a little help if possible, thank you very much in advance!
Note: I tried different images; the problem happens either way.
Environment: Kubernetes v1.17.1 on a QEMU VM
YAML:
apiVersion: v1
kind: ConfigMap
metadata:
  name: special-config
data:
  how: very
---
apiVersion: v1
kind: Pod
metadata:
  name: dapi-test-pod
spec:
  containers:
  - name: test-container
    image: nginx
    ports:
    - containerPort: 88
    command: [ "/bin/sh", "-c", "env" ]
    env:
    # Define the environment variable
    - name: SPECIAL_LEVEL_KEY
      valueFrom:
        configMapKeyRef:
          # The ConfigMap containing the value you want to assign to SPECIAL_LEVEL_KEY
          name: special-config
          # Specify the key associated with the value
          key: how
  restartPolicy: Always
describe pod:
kubectl describe pod dapi-test-pod
Name: dapi-test-pod
Namespace: default
Priority: 0
Node: kw1/10.1.10.31
Start Time: Thu, 21 May 2020 01:02:17 +0000
Labels: <none>
Annotations: cni.projectcalico.org/podIP: 192.168.159.83/32
kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"dapi-test-pod","namespace":"default"},"spec":{"containers":[{"command...
Status: Running
IP: 192.168.159.83
IPs:
IP: 192.168.159.83
Containers:
test-container:
Container ID: docker://63040ec4d0a3e78639d831c26939f272b19f21574069c639c7bd4c89bb1328de
Image: nginx
Image ID: docker-pullable://nginx@sha256:30dfa439718a17baafefadf16c5e7c9d0a1cde97b4fd84f63b69e13513be7097
Port: 88/TCP
Host Port: 0/TCP
Command:
/bin/sh
-c
env
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Thu, 21 May 2020 01:13:21 +0000
Finished: Thu, 21 May 2020 01:13:21 +0000
Ready: False
Restart Count: 7
Environment:
SPECIAL_LEVEL_KEY: <set to the key 'how' of config map 'special-config'> Optional: false
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-zqbsw (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-zqbsw:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-zqbsw
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 13m default-scheduler Successfully assigned default/dapi-test-pod to kw1
Normal Pulling 12m (x4 over 13m) kubelet, kw1 Pulling image "nginx"
Normal Pulled 12m (x4 over 13m) kubelet, kw1 Successfully pulled image "nginx"
Normal Created 12m (x4 over 13m) kubelet, kw1 Created container test-container
Normal Started 12m (x4 over 13m) kubelet, kw1 Started container test-container
Warning BackOff 3m16s (x49 over 13m) kubelet, kw1 Back-off restarting failed container
You can use this manifest. The command ["/bin/sh", "-c"] says "run a shell, and execute the following instructions". The args are then passed as commands to the shell. Multi-line args make it simple and easy to read. Your pod will display its environment variables and also start the NGINX process without stopping:
apiVersion: v1
kind: ConfigMap
metadata:
  name: special-config
data:
  how: very
---
apiVersion: v1
kind: Pod
metadata:
  name: dapi-test-pod
spec:
  containers:
  - name: test-container
    image: nginx
    ports:
    - containerPort: 88
    command: ["/bin/sh", "-c"]
    args:
    - env;
      nginx -g 'daemon off;';
    env:
    # Define the environment variable
    - name: SPECIAL_LEVEL_KEY
      valueFrom:
        configMapKeyRef:
          # The ConfigMap containing the value you want to assign to SPECIAL_LEVEL_KEY
          name: special-config
          # Specify the key associated with the value
          key: how
  restartPolicy: Always
This happens because the process in the container you are running has completed and the container shuts down, so Kubernetes marks the pod as completed.
If a command is defined in the Docker image as part of CMD, or if you've added your own command as you have done, the container shuts down after the command completes. It's the same reason why, when you run Ubuntu using plain Docker, it starts up and then shuts down directly afterwards.
For pods, and their underlying Docker containers, to continue running, you need to start a process that keeps running. In your case, running the env command completes right away.
If you set the pod to restart Always, then Kubernetes will keep trying to restart it until it has reached its back-off threshold.
One-off commands like the one you're running are useful for utility-type tasks: do one thing, then get rid of the pod.
For example:
kubectl run tester --generator run-pod/v1 --image alpine --restart Never --rm -it -- /bin/sh -c env
To run something longer, start a process that continues running.
For example:
kubectl run tester --generator run-pod/v1 --image alpine -- /bin/sh -c "sleep 30"
That command will run for 30 seconds, and so the pod will also run for 30 seconds. It will also use the default restart policy of Always. So after 30 seconds the process completes, Kubernetes marks the pod as complete, and then restarts it to do the same things again.
Generally pods will start a long-running process, like a web server. For Kubernetes to know whether that pod is healthy, so it can do its high-availability magic and restart it if it crashes, it can use readiness and liveness probes.
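As a rough example of such probes (the image, path, and port are placeholders; point them at whatever your long-running process actually serves):
containers:
- name: web
  image: nginx
  ports:
  - containerPort: 80
  readinessProbe:
    httpGet:
      path: /
      port: 80
    initialDelaySeconds: 5
    periodSeconds: 10
  livenessProbe:
    httpGet:
      path: /
      port: 80
    initialDelaySeconds: 15
    periodSeconds: 20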

Kubernetes master doesn't attach FlexVolume

I'm trying to attach the dummy-attachable FlexVolume sample for Kubernetes which seems to initialize normally according to my logs on both the nodes and master:
Loaded volume plugin "flexvolume-k8s/dummy-attachable"
But when I try to attach the volume to a pod, the attach method never gets called from the master. The logs from the node read:
flexVolume driver k8s/dummy-attachable: using default GetVolumeName for volume dummy-attachable
operationExecutor.VerifyControllerAttachedVolume started for volume "dummy-attachable"
Operation for "\"flexvolume-k8s/dummy-attachable/dummy-attachable\"" failed. No retries permitted until 2019-04-22 13:42:51.21390334 +0000 UTC m=+4814.674525788 (durationBeforeRetry 500ms). Error: "Volume has not been added to the list of VolumesInUse in the node's volume status for volume \"dummy-attachable\" (UniqueName: \"flexvolume-k8s/dummy-attachable/dummy-attachable\") pod \"nginx-dummy-attachable\"
Here's how I'm attempting to mount the volume:
apiVersion: v1
kind: Pod
metadata:
  name: nginx-dummy-attachable
  namespace: default
spec:
  containers:
  - name: nginx-dummy-attachable
    image: nginx
    volumeMounts:
    - name: dummy-attachable
      mountPath: /data
    ports:
    - containerPort: 80
  volumes:
  - name: dummy-attachable
    flexVolume:
      driver: "k8s/dummy-attachable"
Here is the output of kubectl describe pod nginx-dummy-attachable:
Name: nginx-dummy-attachable
Namespace: default
Priority: 0
PriorityClassName: <none>
Node: [node id]
Start Time: Wed, 24 Apr 2019 08:03:21 -0400
Labels: <none>
Annotations: kubernetes.io/limit-ranger: LimitRanger plugin set: cpu request for container nginx-dummy-attachable
Status: Pending
IP:
Containers:
nginx-dummy-attachable:
Container ID:
Image: nginx
Image ID:
Port: 80/TCP
Host Port: 0/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Requests:
cpu: 100m
Environment: <none>
Mounts:
/data from dummy-attachable (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-hcnhj (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
dummy-attachable:
Type: FlexVolume (a generic volume resource that is provisioned/attached using an exec based plugin)
Driver: k8s/dummy-attachable
FSType:
SecretRef: nil
ReadOnly: false
Options: map[]
default-token-hcnhj:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-hcnhj
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedMount 41s (x6 over 11m) kubelet, [node id] Unable to mount volumes for pod "nginx-dummy-attachable_default([id])": timeout expired waiting for volumes to attach or mount for pod "default"/"nginx-dummy-attachable". list of unmounted volumes=[dummy-attachable]. list of unattached volumes=[dummy-attachable default-token-hcnhj]
I added debug logging to the FlexVolume, so I was able to verify that the attach method was never called on the master node. I'm not sure what I'm missing here.
I don't know if this matters, but the cluster is being launched with KOPS. I've tried with both k8s 1.11 and 1.14 with no success.
So this is a fun one.
Even though kubelet initializes the FlexVolume plugin on master, kube-controller-manager, which is containerized in KOPs, is the application that's actually responsible for attaching the volume to the pod. KOPs doesn't mount the default plugin directory /usr/libexec/kubernetes/kubelet-plugins/volume/exec into the kube-controller-manager pod, so it doesn't know anything about your FlexVolume plugins on master.
There doesn't appear to be a non-hacky way to do this other than to use a different Kubernetes deployment tool until KOPs addresses this problem.
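For reference, the hack amounts to making the plugin directory visible to kube-controller-manager, either by pointing it at a directory it can already see via its --flex-volume-plugin-dir flag or by mounting the default directory into its static pod manifest. A rough sketch of the latter (where and how you edit that manifest depends on your deployment, so treat this as illustrative only):
# Added to the kube-controller-manager pod spec
volumeMounts:
- name: flexvolume-dir
  mountPath: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
  readOnly: true
volumes:
- name: flexvolume-dir
  hostPath:
    path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec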

Failed to mount a volume on gcePersistentDisk for mongo pod on gke

I'm trying to run a pod on GKE containing a mongo container, and to mount a persistent volume for its data using gcePersistentDisk, but the volume fails to mount.
First, I created the persistent disk by issuing:
gcloud compute disks create --size=1GiB --zone=europe-west3-a mongodb
Then, I created the pod using the following code:
apiVersion: v1
kind: Pod
metadata:
  name: mongodb
spec:
  volumes:
  - name: mongodb-data
    gcePersistentDisk:
      pdName: mongodb
      fsType: nfs4
  containers:
  - image: mongo
    name: mongodb
    volumeMounts:
    - name: mongodb-data
      mountPath: /data/db
    ports:
    - containerPort: 27017
      protocol: TCP
After a while, when I list pods I get this as a result:
NAME      READY   STATUS              RESTARTS   AGE
mongodb   0/1     ContainerCreating   0          23m
And as a description of what's happened I get:
Warning FailedMount 5m (x18 over 26m) kubelet, gke-mongo-default-pool-02c59988-vmhz MountVolume.MountDevice failed for volume "mongodb-data" : executable file not found in $PATH
Warning FailedMount 4m (x10 over 24m) kubelet, gke-mongo-default-pool-02c59988-vmhz Unable to mount volumes for pod "mongodb_default(f1625bde-579d-11e9-a35f-42010a8a00a0)": timeout expired waiting for volumes to attach or mount for pod "default"/"mongodb". list of unmounted volumes=[mongodb-data]. list of unattached volumes=[mongodb-data default-token-5dxps]
I still can't figure out why it's not ready. Any suggestions, please?
fsType: ext4 instead of fsType: nfs4, that was the problem!
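For completeness, the corrected volume definition from the pod spec above:
volumes:
- name: mongodb-data
  gcePersistentDisk:
    pdName: mongodb
    fsType: ext4   # a filesystem type such as ext4; nfs4 is not valid here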

No nodes available to schedule pods, using google container engine

I'm having an issue where a container I'd like to run doesn't appear to be getting started on my cluster.
I've tried searching around for possible solutions, but there's a surprising lack of information out there to assist with this issue or anything of its nature.
Here's the most I could gather:
$ kubectl describe pods/elasticsearch
Name: elasticsearch
Namespace: default
Image(s): my.image.host/my-project/elasticsearch
Node: /
Labels: <none>
Status: Pending
Reason:
Message:
IP:
Replication Controllers: <none>
Containers:
elasticsearch:
Image: my.image.host/my-project/elasticsearch
Limits:
cpu: 100m
State: Waiting
Ready: False
Restart Count: 0
Events:
FirstSeen LastSeen Count From SubobjectPath Reason Message
Mon, 19 Oct 2015 10:28:44 -0500 Mon, 19 Oct 2015 10:34:09 -0500 12 {scheduler } failedScheduling no nodes available to schedule pods
I also see this:
$ kubectl get pod elasticsearch -o wide
NAME READY STATUS RESTARTS AGE NODE
elasticsearch 0/1 Pending 0 5s
I guess I'd like to know: What prerequisites exist so that I can be confident that my container is going to run in container engine? What do I need to do in this scenario to get it running?
Here's my yml file:
apiVersion: v1
kind: Pod
metadata:
  name: elasticsearch
spec:
  containers:
  - name: elasticsearch
    image: my.image.host/my-project/elasticsearch
    ports:
    - containerPort: 9200
    resources:
    volumeMounts:
    - name: elasticsearch-data
      mountPath: /usr/share/elasticsearch
  volumes:
  - name: elasticsearch-data
    gcePersistentDisk:
      pdName: elasticsearch-staging
      fsType: ext4
Here's some more output about my node:
$ kubectl get nodes
NAME LABELS STATUS
gke-elasticsearch-staging-00000000-node-yma3 kubernetes.io/hostname=gke-elasticsearch-staging-00000000-node-yma3 NotReady
You only have one node in your cluster and its status is NotReady, so you won't be able to schedule any pods. You can try to determine why your node isn't ready by looking in /var/log/kubelet.log. You can also add new nodes to your cluster (scale the cluster size up to 2) or delete the node (it will be automatically replaced by the instance group manager) to see if either of those options gets you a working node.
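For example, resizing the node pool can be done with gcloud (the cluster name and zone are placeholders for your own values; older gcloud releases used --size instead of --num-nodes):
gcloud container clusters resize <cluster-name> --num-nodes=2 --zone=<zone>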
It appears that the scheduler couldn't see any nodes in your cluster. You can run kubectl get nodes and gcloud compute instances list to confirm whether you have any nodes in the cluster. Did you correctly specify the number of nodes (--num-nodes) when creating the cluster?