CSI CephFS cannot mount successfully on K8s - kubernetes

Could anyone help me? I cannot mount successfully.
I am using the ceph-csi plugin to mount CephFS into a pod: https://github.com/ceph/ceph-csi
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedMount 11m (x26 over 169m) kubelet, sprl-pbkh-kubenode03 Unable to attach or mount volumes: unmounted volumes=[cephfs-pvc], unattached volumes=[default-token-bms74 cephfs-pvc]: timed out waiting for the condition
Warning FailedMount 6m53s (x47 over 163m) kubelet, sprl-pbkh-kubenode03 Unable to attach or mount volumes: unmounted volumes=[cephfs-pvc], unattached volumes=[cephfs-pvc default-token-bms74]: timed out waiting for the condition
Warning FailedMount 58s (x92 over 172m) kubelet, sprl-pbkh-kubenode03 MountVolume.MountDevice failed for volume "pvc-c266c4e3-9ea2-4b26-9759-b73a5ba3516a" : rpc error: code = Internal desc = an error (exit status 1) occurred while running nsenter args: [--net=/ -- ceph-fuse /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-c266c4e3-9ea2-4b26-9759-b73a5ba3516a/globalmount -m 172.18.4.26,172.18.4.31,172.18.4.32 -c /etc/ceph/ceph.conf -n client.admin --keyfile=***stripped*** -r /volumes/csi/csi-vol-83e27006-59a6-11ed-97f7-7e2180fc1e5e/66900fdf-648b-49ba-ac19-cf3f32cb874e -o nonempty --client_mds_namespace=cephfs] stderr: nsenter: reassociate to namespace 'ns/net' failed: Invalid argument
I used https://github.com/ceph/ceph-csi to create the PVC and StorageClass, then referenced the PVC from a pod, but the mount never succeeds.
I can confirm that I can mount successfully from my local machine using ceph-fuse.
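Since ceph-fuse works from a local machine, the problem is likely on the CSI side rather than the Ceph side; the nsenter "reassociate to namespace 'ns/net' failed" error suggests the nodeplugin could not enter the host network namespace. A diagnostic sketch (the `ceph-csi` namespace and `app=csi-cephfsplugin` label are assumptions; adjust to your deployment):

```shell
# Sketch: inspect the ceph-csi nodeplugin on the affected node.
# Namespace and label values below are assumptions; adjust to your deployment.
kubectl -n ceph-csi get pods -l app=csi-cephfsplugin -o wide

# Logs of the plugin container that performed the failed mount
kubectl -n ceph-csi logs -l app=csi-cephfsplugin -c csi-cephfsplugin --tail=100

# On the node itself, try the same ceph-fuse mount manually to separate
# CSI problems from Ceph problems (monitor IPs taken from the event message):
sudo mkdir -p /mnt/test
sudo ceph-fuse /mnt/test -m 172.18.4.26,172.18.4.31,172.18.4.32 \
  -n client.admin --client_mds_namespace=cephfs
```

If the manual mount on the node works but the CSI mount does not, compare the nodeplugin DaemonSet's hostNetwork/hostPID settings against the upstream ceph-csi deployment manifests.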

Related

Kubernetes - how to list conditions that are not met

I upgraded k8s version on GCP to 1.21.6-gke.1500. Some of my pods are stuck in the status "ContainerCreating". When I describe them, I see these errors:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 12m default-scheduler Successfully assigned gamma/xxxx-d58747f46-j7fzs to gke-us-east4-gke-us-east4-xxxx--6c23c312-p5q2
Warning FailedMount 10m kubelet Unable to attach or mount volumes: unmounted volumes=[nfs-data], unattached volumes=[my-license kube-api-access-b32js nfs-data]: timed out waiting for the condition
Warning FailedMount 3m56s (x2 over 6m13s) kubelet Unable to attach or mount volumes: unmounted volumes=[nfs-data], unattached volumes=[nfs-data my-license kube-api-access-b32js]: timed out waiting for the condition
Warning FailedMount 100s (x2 over 8m31s) kubelet Unable to attach or mount volumes: unmounted volumes=[nfs-data], unattached volumes=[kube-api-access-b32js nfs-data my-license]: timed out waiting for the condition
How can I list the conditions that are not met and that the pods are waiting for?
Try the following commands:
kubectl describe pod <name>
kubectl get nodes -o wide
kubectl get volumeattachments
kubectl get componentstatus
You can also check your GKE Logs.
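To answer the question directly, a JSONPath sketch that prints only the conditions whose status is not True (`<pod-name>` is a placeholder for the stuck pod):

```shell
# Sketch: list only the pod conditions that are NOT met (status != "True").
# <pod-name> is a placeholder for your stuck pod.
kubectl get pod <pod-name> -o jsonpath='{range .status.conditions[?(@.status!="True")]}{.type}{"\t"}{.reason}{"\t"}{.message}{"\n"}{end}'
```

For a FailedMount like the one above, this typically shows a ContainersReady/Ready condition with a message pointing at the unmounted volume.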

ceph rook deployment issue found at mount pvc

I am Warren, trying to set up Ceph via Rook in my k8s environment. I followed the official document
https://rook.io/docs/rook/v1.4/ceph-quickstart.html, and almost everything looked fine during the Ceph setup. I also verified it with
ceph status
cluster:
id: 356efdf1-a1a7-4365-9ee6-b65ecf8481f9
health: HEALTH_OK
But it failed at the examples in https://rook.io/docs/rook/v1.4/ceph-block.html when trying to use block storage in the k8s environment (my k8s version is v1.18.2).
After deploying MySQL and WordPress, I found the error below on the pod. I also checked the PV and PVC; all of them were created successfully and are Bound, so I think something is wrong with mount compatibility. Please help.
-----------------------------------------------------
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling <unknown> default-scheduler running "VolumeBinding" filter plugin for pod "wordpress-mysql-764fc64f97-qwtjd": pod has unbound immediate PersistentVolumeClaims
Warning FailedScheduling <unknown> default-scheduler running "VolumeBinding" filter plugin for pod "wordpress-mysql-764fc64f97-qwtjd": pod has unbound immediate PersistentVolumeClaims
Normal Scheduled <unknown> default-scheduler Successfully assigned default/wordpress-mysql-764fc64f97-qwtjd to master1
Normal SuccessfulAttachVolume 7m14s attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-dc8567bb-c2e3-44a4-a56a-c74616059db4"
Warning FailedMount 5m11s kubelet, master1 Unable to attach or mount volumes: unmounted volumes=[mysql-persistent-storage], unattached volumes=[default-token-czg9j mysql-persistent-storage]: timed out waiting for the condition
Warning FailedMount 40s (x2 over 2m54s) kubelet, master1 Unable to attach or mount volumes: unmounted volumes=[mysql-persistent-storage], unattached volumes=[mysql-persistent-storage default-token-czg9j]: timed out waiting for the condition
Warning FailedMount 6s (x4 over 6m6s) kubelet, master1 MountVolume.MountDevice failed for volume "pvc-dc8567bb-c2e3-44a4-a56a-c74616059db4" : rpc error: code = Internal desc = rbd: map failed with error an error (exit status 110) occurred while running rbd args: [--id csi-rbd-node -m 10.109.63.94:6789,10.96.135.241:6789,10.110.131.193:6789 --keyfile=***stripped*** map replicapool/csi-vol-5ccc546b-0914-11eb-9135-62dece6c0d98 --device-type krbd], rbd error output: rbd: sysfs write failed
-------------------------------------------------
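One way to narrow this down: on Linux, exit status 110 is ETIMEDOUT, so the `rbd map` most likely timed out talking to the cluster. A diagnostic sketch to run on the node (monitor IPs taken from the event message; command availability varies by distro):

```shell
# Sketch: exit status 110 is ETIMEDOUT, so first verify the node can
# actually reach the Ceph monitors listed in the event (port 6789):
for mon in 10.109.63.94 10.96.135.241 10.110.131.193; do
  nc -z -v -w 3 "$mon" 6789
done

# Verify the kernel rbd module is available, since the CSI driver used
# --device-type krbd (the kernel RBD client):
sudo modprobe rbd && lsmod | grep '^rbd'

# Kernel messages often explain a failed map (e.g. unsupported image
# features on an older kernel):
dmesg | grep -iE 'rbd|libceph' | tail -n 20
```

A common cause on older kernels is an RBD image created with features the kernel client does not support; the dmesg output will say so explicitly if that is the case.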

Why would Kubelet say a file exists during mounting when it doesn't?

I have modified the Kubernetes hostpath CSI driver to use the https://github.com/mesosphere/csilvm driver. My deployment creates the LV for the PVC but fails to mount it into the pod with this description:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedMount 23m (x1127 over 8d) kubelet, node2 Unable to attach or mount volumes: unmounted volumes=[clvm-volume], unattached volumes=[default-token-bzfcz clvm-volume]: timed out waiting for the condition
Warning FailedMount 9m6s (x6021 over 8d) kubelet, node2 MountVolume.SetUp failed for volume "pvc-3df13ff4-0342-4522-a70d-f424dd6c6d5f" : rpc error: code = Internal desc = Cannot create mount target /var/lib/kubelet/pods/901a8f98-7162-4168-a822-5c973bd0e6dc/volumes/kubernetes.io~csi/pvc-3df13ff4-0342-4522-a70d-f424dd6c6d5f/mount: err=mkdir /var/lib/kubelet/pods/901a8f98-7162-4168-a822-5c973bd0e6dc/volumes/kubernetes.io~csi/pvc-3df13ff4-0342-4522-a70d-f424dd6c6d5f/mount: file exists
Warning FailedMount 3m12s (x4278 over 8d) kubelet, node2 Unable to attach or mount volumes: unmounted volumes=[clvm-volume], unattached volumes=[clvm-volume default-token-bzfcz]: timed out waiting for the condition
I think this is telling me kubelet issued a mount and it failed because a file with that name already exists at the mount point. But when I go onto the node, I see nothing in the directory specified.
[root@node2 kubernetes.io~csi]# pwd
/var/lib/kubelet/pods/901a8f98-7162-4168-a822-5c973bd0e6dc/volumes/kubernetes.io~csi
[root@node2 kubernetes.io~csi]# ls -lta
total 0
drwxr-x---. 2 root root 6 Jul 30 10:06 .
drwxr-x---. 4 root root 59 Jul 22 20:25 ..
[root@node2 kubernetes.io~csi]#
I don't understand how kubelet could end up in this state.
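One common explanation for "mkdir: file exists" on a path that `ls` shows as empty is a stale or broken mount still recorded by the kernel at that path. A diagnostic sketch to run on the node (the path is taken from the event message):

```shell
# Sketch: check whether the kernel still holds a mount record for the
# target path, even though the directory listing looks empty.
TARGET=/var/lib/kubelet/pods/901a8f98-7162-4168-a822-5c973bd0e6dc/volumes/kubernetes.io~csi/pvc-3df13ff4-0342-4522-a70d-f424dd6c6d5f/mount
findmnt "$TARGET"
grep "$TARGET" /proc/mounts

# If a stale entry shows up, a lazy unmount usually clears it so the
# driver's mkdir can succeed on the next retry:
sudo umount -l "$TARGET"
```

If nothing shows up in /proc/mounts either, the next place to look is the driver's NodePublishVolume implementation, since the error text ("Cannot create mount target") comes from the CSI driver rather than from kubelet itself.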

failed to mount volume /dev/rbd1 [xfs]

I installed Kubernetes with 1 master and 3 workers, deployed Rook on it, and the Ceph status in rook-tools shows OK.
There is a WordPress template in https://github.com/rook/rook.git,
but when I create it the pod is not created:
#kubectl describe pods wordpress-mysql-b78774f44-m548z -n default
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedMount 15m (x51 over 128m) kubelet, ubuntu2 Unable to mount volumes for pod "test-pod-rbd_rook-ceph(15abe007-53a4-11e9-abd9-7c8bca00f216)": timeout expired waiting for volumes to attach or mount for pod "rook-ceph"/"test-pod-rbd". list of unmounted volumes=[data]. list of unattached volumes=[data default-token-8p9br]
Warning FailedMount 18s (x72 over 130m) kubelet, ubuntu2 MountVolume.SetUp failed for volume "pvc-fd3fdbc4-53b7-11e9-abd9-7c8bca00f216" : mount command failed, status: Failure, reason: failed to mount volume /dev/rbd1 [xfs] to /var/lib/kubelet/plugins/ceph.rook.io/rook-ceph-system/mounts/pvc-fd3fdbc4-53b7-11e9-abd9-7c8bca00f216, error executable file not found in $PATH
#kubectl get events
18h Warning FailedMount pod/wordpress-mysql-b78774f44-m548z MountVolume.SetUp failed for volume "pvc-e1e758a8-53b6-11e9-abd9-7c8bca00f216" : mount command failed, status: Failure, reason: failed to mount volume /dev/rbd0 [xfs] to /var/lib/kubelet/plugins/ceph.rook.io/rook-ceph-system/mounts/pvc-e1e758a8-53b6-11e9-abd9-7c8bca00f216, error executable file not found in $PATH
18h Warning FailedMount pod/wordpress-mysql-b78774f44-m548z Unable to mount volumes for pod "wordpress-mysql-b78774f44-m548z_default(e1f6de90-53b6-11e9-abd9-7c8bca00f216)": timeout expired waiting for volumes to attach or mount for pod "default"/"wordpress-mysql-b78774f44-m548z". list of unmounted volumes=[mysql-persistent-storage]. list of unattached volumes=[mysql-persistent-storage default-token-bktfl]
35m Warning FailedMount pod/wordpress-mysql-b78774f44-m548z MountVolume.SetUp failed for volume "pvc-e1e758a8-53b6-11e9-abd9-7c8bca00f216" : mount command failed, status: Failure, reason: Rook: Error getting RPC client: error connecting to socket /usr/libexec/kubernetes/kubelet-plugins/volume/exec/ceph.rook.io~rook-ceph-system/.rook.sock: dial unix /usr/libexec/kubernetes/kubelet-plugins/volume/exec/ceph.rook.io~rook-ceph-system/.rook.sock: connect: connection refused
6m32s Warning FailedMount pod/wordpress-mysql-b78774f44-m548z Unable to mount volumes for pod "wordpress-mysql-b78774f44-m548z_default(e1f6de90-53b6-11e9-abd9-7c8bca00f216)": timeout expired waiting for volumes to attach or mount for pod "default"/"wordpress-mysql-b78774f44-m548z". list of unmounted volumes=[mysql-persistent-storage]. list of unattached volumes=[mysql-persistent-storage default-token-bktfl]
4m17s Warning FailedMount pod/wordpress-mysql-b78774f44-m548z MountVolume.SetUp failed for volume "pvc-e1e758a8-53b6-11e9-abd9-7c8bca00f216" : mount command failed, status: Failure, reason: Rook: Mount volume failed: failed to attach volume replicapool/pvc-e1e758a8-53b6-11e9-abd9-7c8bca00f216: failed to map image replicapool/pvc-e1e758a8-53b6-11e9-abd9-7c8bca00f216 cluster rook-ceph. failed to map image replicapool/pvc-e1e758a8-53b6-11e9-abd9-7c8bca00f216: Failed to complete 'rbd': signal: interrupt. . output:
How can I solve it? Is it a bug?
This looks like a problem with the PV/PVC. Please run some more basic checks, such as:
kubectl get pods,pv,pvc --all-namespaces
kubectl describe pvc pvc-fd3fdbc4-53b7-11e9-abd9-7c8bca00f216
kubectl get pvc pvc-fd3fdbc4-53b7-11e9-abd9-7c8bca00f216 -o yaml
kubectl get pod wordpress-mysql-b78774f44-m548z -o yaml
kubectl -n rook-ceph get all
kubectl get storageclasses --all-namespaces
You can find more helpful information about troubleshooting techniques and commands in the Rook documentation.
Hope this helps; please share what you find so we can troubleshoot this further.
To resolve this issue, change fstype: xfs to fstype: ext4 in your StorageClass.
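A sketch of what that change looks like, assuming the flex-driver StorageClass from the Rook examples (which matches the ceph.rook.io paths in the error output); field values here mirror the question's replicapool setup and may differ in your deployment:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: ceph.rook.io/block
parameters:
  pool: replicapool
  clusterNamespace: rook-ceph
  # was: fstype: xfs -- the node lacks the xfs userspace tools
  # ("executable file not found in $PATH"), so use ext4 instead
  fstype: ext4
```

The "executable file not found in $PATH" error fits this explanation: mounting xfs needs mkfs.xfs/xfs utilities on the node, and switching to ext4 avoids that dependency.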

Error ICP 3.1.1 Grafana Prometheus Kubernetes Status Pods Always 'Init'

I completed installing ICP with VA, using 1 master, 1 proxy, 1 management, 1 VA, and 3 workers with GlusterFS inside.
Several Kubernetes pods are not running (storage: PersistentVolume GlusterFS on ICP).
These are the error events from describing the pods:
custom-metrics-adapter
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 17m default-scheduler Successfully assigned kube-system/custom-metrics-adapter-5d5b694df7-cggz8 to 192.168.10.126
Normal Pulled 17m kubelet, 192.168.10.126 Container image "swgcluster.icp:8500/ibmcom/curl:4.0.0" already present on machine
Normal Created 17m kubelet, 192.168.10.126 Created container
Normal Started 17m kubelet, 192.168.10.126 Started container
monitoring-grafana
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 18m default-scheduler Successfully assigned kube-system/monitoring-grafana-799d7fcf97-sj64j to 192.168.10.126
Warning FailedMount 1m (x8 over 16m) kubelet, 192.168.10.126 (combined from similar events): MountVolume.SetUp failed for volume "pvc-251f69e3-fd60-11e8-9779-000c2914ff99" : mount failed: mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/e2c85434-fd67-11e8-822b-000c2914ff99/volumes/kubernetes.io~glusterfs/pvc-251f69e3-fd60-11e8-9779-000c2914ff99 --scope -- mount -t glusterfs -o log-file=/var/lib/kubelet/plugins/kubernetes.io/glusterfs/pvc-251f69e3-fd60-11e8-9779-000c2914ff99/monitoring-grafana-799d7fcf97-sj64j-glusterfs.log,backup-volfile-servers=192.168.10.115:192.168.10.116:192.168.10.119,auto_unmount,log-level=ERROR 192.168.10.115:vol_946f98c8a92ce2930acd3181d803943c /var/lib/kubelet/pods/e2c85434-fd67-11e8-822b-000c2914ff99/volumes/kubernetes.io~glusterfs/pvc-251f69e3-fd60-11e8-9779-000c2914ff99
Output: Running scope as unit run-r6ba2425d0e7f437d922dbe0830cd5a97.scope.
mount: unknown filesystem type 'glusterfs'
the following error information was pulled from the glusterfs log to help diagnose this issue: could not open log file for pod monitoring-grafana-799d7fcf97-sj64j
Warning FailedMount 50s (x8 over 16m) kubelet, 192.168.10.126 Unable to mount volumes for pod "monitoring-grafana-799d7fcf97-sj64j_kube-system(e2c85434-fd67-11e8-822b-000c2914ff99)": timeout expired waiting for volumes to attach or mount for pod "kube-system"/"monitoring-grafana-799d7fcf97-sj64j". list of unmounted volumes=[grafana-storage]. list of unattached volumes=[grafana-storage config-volume dashboard-volume dashboard-config ds-job-config router-config monitoring-ca-certs monitoring-certs router-entry default-token-f6d9q]
monitoring-prometheus
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 19m default-scheduler Successfully assigned kube-system/monitoring-prometheus-85546d8575-jr89h to 192.168.10.126
Warning FailedMount 4m (x6 over 17m) kubelet, 192.168.10.126 Unable to mount volumes for pod "monitoring-prometheus-85546d8575-jr89h_kube-system(e2ca91a8-fd67-11e8-822b-000c2914ff99)": timeout expired waiting for volumes to attach or mount for pod "kube-system"/"monitoring-prometheus-85546d8575-jr89h". list of unmounted volumes=[storage-volume]. list of unattached volumes=[config-volume rules-volume etcd-certs storage-volume router-config monitoring-ca-certs monitoring-certs monitoring-client-certs router-entry lua-scripts-config-config default-token-f6d9q]
Warning FailedMount 55s (x11 over 17m) kubelet, 192.168.10.126 (combined from similar events): MountVolume.SetUp failed for volume "pvc-252001ed-fd60-11e8-9779-000c2914ff99" : mount failed: mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/e2ca91a8-fd67-11e8-822b-000c2914ff99/volumes/kubernetes.io~glusterfs/pvc-252001ed-fd60-11e8-9779-000c2914ff99 --scope -- mount -t glusterfs -o auto_unmount,log-level=ERROR,log-file=/var/lib/kubelet/plugins/kubernetes.io/glusterfs/pvc-252001ed-fd60-11e8-9779-000c2914ff99/monitoring-prometheus-85546d8575-jr89h-glusterfs.log,backup-volfile-servers=192.168.10.115:192.168.10.116:192.168.10.119 192.168.10.115:vol_f101b55d8b1dc3021ec7689713a74e8c /var/lib/kubelet/pods/e2ca91a8-fd67-11e8-822b-000c2914ff99/volumes/kubernetes.io~glusterfs/pvc-252001ed-fd60-11e8-9779-000c2914ff99
Output: Running scope as unit run-r638272b55bca4869b271e8e4b1ef45cf.scope.
mount: unknown filesystem type 'glusterfs'
the following error information was pulled from the glusterfs log to help diagnose this issue: could not open log file for pod monitoring-prometheus-85546d8575-jr89h
monitoring-prometheus-alertmanager
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 20m default-scheduler Successfully assigned kube-system/monitoring-prometheus-alertmanager-65445b66bd-6bfpn to 192.168.10.126
Warning FailedMount 1m (x9 over 18m) kubelet, 192.168.10.126 (combined from similar events): MountVolume.SetUp failed for volume "pvc-251ed00f-fd60-11e8-9779-000c2914ff99" : mount failed: mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/e2cbe5e7-fd67-11e8-822b-000c2914ff99/volumes/kubernetes.io~glusterfs/pvc-251ed00f-fd60-11e8-9779-000c2914ff99 --scope -- mount -t glusterfs -o backup-volfile-servers=192.168.10.115:192.168.10.116:192.168.10.119,auto_unmount,log-level=ERROR,log-file=/var/lib/kubelet/plugins/kubernetes.io/glusterfs/pvc-251ed00f-fd60-11e8-9779-000c2914ff99/monitoring-prometheus-alertmanager-65445b66bd-6bfpn-glusterfs.log 192.168.10.115:vol_7766e36a77cbd2c0afe3bd18626bd2c4 /var/lib/kubelet/pods/e2cbe5e7-fd67-11e8-822b-000c2914ff99/volumes/kubernetes.io~glusterfs/pvc-251ed00f-fd60-11e8-9779-000c2914ff99
Output: Running scope as unit run-r35994e15064e48e2a36f69a88009aa5d.scope.
mount: unknown filesystem type 'glusterfs'
the following error information was pulled from the glusterfs log to help diagnose this issue: could not open log file for pod monitoring-prometheus-alertmanager-65445b66bd-6bfpn
Warning FailedMount 23s (x9 over 18m) kubelet, 192.168.10.126 Unable to mount volumes for pod "monitoring-prometheus-alertmanager-65445b66bd-6bfpn_kube-system(e2cbe5e7-fd67-11e8-822b-000c2914ff99)": timeout expired waiting for volumes to attach or mount for pod "kube-system"/"monitoring-prometheus-alertmanager-65445b66bd-6bfpn". list of unmounted volumes=[storage-volume]. list of unattached volumes=[config-volume storage-volume router-config monitoring-ca-certs monitoring-certs router-entry default-token-f6d9q]
I just resolved this issue after reinstalling ICP (IBM Cloud Private).
While checking a few possible causes, I found that some nodes did not have the GlusterFS client completely installed.
I installed the GlusterFS client on ALL nodes (using Ubuntu as the OS):
sudo apt-get install glusterfs-client -y
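After installing, it is worth verifying the client on each node, since the "mount: unknown filesystem type 'glusterfs'" error in the events means mount(8) could not find the mount.glusterfs helper. A verification sketch:

```shell
# Sketch: verify the GlusterFS client is actually usable on this node.
# "unknown filesystem type 'glusterfs'" means the mount.glusterfs helper
# was missing from the mount helper search path.
command -v mount.glusterfs || ls -l /sbin/mount.glusterfs
glusterfs --version | head -n 1
```

Run this on every node that can schedule the affected pods; a single node missing the client is enough to reproduce the error intermittently.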