Kubernetes provisioning GCE Persistent disk sometimes fails - kubernetes

I'm currently using the GCE standard container cluster with lot of success and pleasure. But I had a question about the provisioning of GCE Persistent disks.
As described in this document form Kubernetes. I created two YAML files:
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
annotations:
storageclass.beta.kubernetes.io/is-default-class: "true"
name: slow
provisioner: kubernetes.io/gce-pd
parameters:
type: pd-standard
and
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
name: fast
provisioner: kubernetes.io/gce-pd
parameters:
type: pd-ssd
If I now create a following Volume Claim:
{
"kind": "PersistentVolumeClaim",
"apiVersion": "v1",
"metadata": {
"name": "claim-test",
"annotations": {
"volume.beta.kubernetes.io/storage-class": "hdd"
}
},
"spec": {
"accessModes": [
"ReadWriteOnce"
],
"resources": {
"requests": {
"storage": "3Gi"
}
}
}
}
The disk gets created perfectly!
And if I now start following unit
apiVersion: v1
kind: ReplicationController
metadata:
name: nfs-server
spec:
replicas: 1
selector:
role: nfs-server
template:
metadata:
labels:
role: nfs-server
spec:
containers:
- name: nfs-server
image: gcr.io/google_containers/volume-nfs
ports:
- name: nfs
containerPort: 2049
- name: mountd
containerPort: 20048
- name: rpcbind
containerPort: 111
securityContext:
privileged: true
volumeMounts:
- mountPath: /exports
name: mypvc
volumes:
- name: mypvc
persistentVolumeClaim:
claimName: claim-test
The disk gets mounted perfectly but many times I stumble upon the following error (not more can be found in the kubelet.log file):
Failed to attach volume "claim-test" on node "...." with: GCE persistent disk not found: diskName="....." zone="europe-west1-b"
Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "....". list of unattached/unmounted volumes=[....]
Sometimes the pod boots perfectly, but sometimes it crashes. The only thing I could find is that there needs to be enough time between creating the PVC and the RC itself. I tried this many times but with the same uncertain results.
I hope someone can give me some kind of suggestion or help.
Thanks in advance!
Best regards,
Hacor

Thanks in advance for your comments! After a few days of searching I was finally able to determine what the problem was, I'm posting it because it may be useful other users.
I was using the NFS example for Kubernetes as a replication controller to provide my apps with NFS storage, but it seems that when the NFS server and the PV,PVC get deleted sometimes the NFS share gets stuck on the node itself and I think it has to do with the fact that I didn't delete this elements in a particular order and therefore the node got stuck with the share becoming incapable of mounting new shares to itself or the pod!
I noticed that the problem always occurred after I deleted some app (NFS, PV, PVC and other components) from the cluster. If I created a new cluster on GCE it works perfectly to create apps, until I delete one and it goes wrong...
What the correct deletion order is I don't know for sure, but I think:
Pods using the NFS share
PV, PVC of the NFS share
NFS server
If the pod takes longer to delete, and it isn't completely gone before PV is deleted, the node hangs with a mount it can't delete because it's in use, and that's where the problems occur.
I must honestly say that now I'm moving to an externally provisioned GlusterFS cluster.
Hope it helps someone!
Regards,
Hacor

Related

pod has unbound immediate PersistentVolumeClaims, I deployed milvus server using

etcd:
enabled: true
name: etcd
replicaCount: 3
pdb:
create: false
image:
repository: "milvusdb/etcd"
tag: "3.5.0-r7"
pullPolicy: IfNotPresent
service:
type: ClusterIP
port: 2379
peerPort: 2380
auth:
rbac:
enabled: false
persistence:
enabled: true
storageClass:
accessMode: ReadWriteOnce
size: 10Gi
Enable auto compaction
compaction by every 1000 revision
autoCompactionMode: revision
autoCompactionRetention: "1000"
Increase default quota to 4G
extraEnvVars:
name: ETCD_QUOTA_BACKEND_BYTES
value: "4294967296"
name: ETCD_HEARTBEAT_INTERVAL
value: "500"
name: ETCD_ELECTION_TIMEOUTenter code here
value: "2500"
Configuration values for the pulsar dependency
ref: https://github.com/apache/pulsar-helm-chart
enter image description here
I am trying to run the milvus cluster using kubernete in ubuntu server.
I used helm menifest https://milvus-io.github.io/milvus-helm/
Values.yaml
https://raw.githubusercontent.com/milvus-io/milvus-helm/master/charts/milvus/values.yaml
I checked PersistentValumeClaim their was an error
no persistent volumes available for this claim and no storage class is set
This error comes because you dont have a Persistent Volume.
A pvc needs a a pv with at least the same capacity of the pvc.
This can be done manually or with a Volume provvisioner.
The most easy way someone would say is to use the local storageClass, which uses the diskspace from the node where the pod is instanciated, adds a pod affinity so that the pod starts allways on the same node and can use the volume on that disk. In your case you are using 3 replicas. Allthough its possible to start all 3 instances on the same node, this is mostlikly not what you want to achieve with Kubernetes. If that node breaks you wont have any other instance running on another node.
You need first to thing about the infrastructure of your cluster. Where should the data of the volumes be stored?
An Network File System, nfs, might be a could solution.
In this case you have an nfs somewhere in your infrastructure and all the nodes can reach it.
So you can create a PV which is accessible from all your node.
To not allocate a PV always manualy you can install a Volumeprovisioner inside your cluster.
I use in some cluster this one here:
https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner
As i said you must have already an nfs and configure the provvisioner.yaml with the path.
it looks like this:
# patch_nfs_details.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: nfs-client-provisioner
name: nfs-client-provisioner
spec:
template:
spec:
containers:
- name: nfs-client-provisioner
env:
- name: NFS_SERVER
value: <YOUR_NFS_SERVER_IP>
- name: NFS_PATH
value: <YOUR_NFS_SERVER_SHARE>
volumes:
- name: nfs-client-root
nfs:
server: <YOUR_NFS_SERVER_IP>
path: <YOUR_NFS_SERVER_SHARE>
If you use an nfs without provvisioner, you need to define a storageClass which is linked to your nfs.
There are a lot of solutions to hold persitent volumes.
Here you can find a list of StorageClasses:
https://kubernetes.io/docs/concepts/storage/storage-classes/
At the end it depends also where your cluster is provvisioned if you are not managing it by yourself.

AKS SMB Mount works on Node, but not on Pod

We are trying to debug an issue where the following works on an Azure AKS node, but not within a pod (this is a share through a private endpoint wrapping a failover solution, this is NOT a Storage Account):
mount -t cifs //pename/datadir /opt/testing -o credentials=/root/creds
The sample pod I am testing in is using (it's also Fedora 38, if that makes any different):
securityContext:
capabilities:
add:
- SYS_ADMIN
- DAC_READ_SEARCH
The error from this "hands-on" pod is:
mount error(13): Permission denied
Refer to the mount.cifs(8) manual page (e.g. man mount.cifs) and kernel log messages (dmesg)
The problem is that when we try and use the kubernetes-csi driver for pods, things are not working. Is there something in Kubernetes pods configurations that prevent it from mounting?
The secret setup is simple:
apiVersion: v1
kind: Secret
...
type: microsoft.com/smb
stringData:
username: blah
password: blah
Storage class is set up as:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadta:
name: testing
provisioner: smb.csi.k8s.io
parameters:
source: "//pename/datadir"
csi.storage.k8s.io/node-stage-secret-name: 'my-secret-name'
csi.storage.k8s.io/node-stage-secret-namespace: 'default'
reclaimPolicy: Retain
volumeBindingMode: Immediate
mountOptions:
- dir_mode=0777
- file_mode=0777
- uid=1001
- gid=1001
PVC:
apiVersion: v1
kind: PersistentVolumeClaim
...
spec:
storageClassName: testing
...
"Hands-off" test Pod:
...
spec:
volumes:
- name: testing-smb
persistentVolumeClaim:
claimName: testing-smb
containers:
- name: test
...
volumeMounts:
- name: testing-smb
mountPath: /opt/testing-smb
...
I think it's potentially "invalid" to test the mount in the pod, but I'm not sure why. It's odd that we can mount it on the Kubernetes node directly, but not in the configuration nor directly within a pod. We tested firewall rules and everything. Any pointers or links to knowledgebase articles about this would be helpful.

Why is my Host Path Persistent Volume reachable from all pods?

I'm pretty stuck with this learning step of Kubernetes named PV and PVC.
What I'm trying to do here is understand how to handle shared read-write volume on multiple pods.
What I understood here is that a PVC cannot be shared between pods unless a NFS-like storage class has been configured.
I'm still with my hostPath Storage Class and I tried the following (Docker Desktop and 3 nodes microK8s cluster) :
This PVC with dynamic Host Path provisionning
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: pvc-desktop
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 50Mi
Deployment with 3 replicated pods writing on the same PVC.
apiVersion: apps/v1
kind: Deployment
metadata:
name: busybox
spec:
replicas: 3
selector:
matchLabels:
app: busybox
template:
metadata:
labels:
app: busybox
spec:
containers:
- name: busybox
image: library/busybox:stable
command: ["/bin/sh"]
args:
["-c", 'while true; do echo "1: $(hostname)" >> /root/index.html; sleep 2; done;',]
volumeMounts:
- mountPath: /root
name: vol-desktop
volumes:
- name: vol-desktop
persistentVolumeClaim:
claimName: pvc-desktop
Nginx server for serving volume content
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx
spec:
replicas: 1
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:stable
volumeMounts:
- mountPath: /usr/share/nginx/html
name: vol-desktop
ports:
- containerPort: 80
volumes:
- name: vol-desktop
persistentVolumeClaim:
claimName: pvc-desktop
Following what I understood on the documentation, this could not be possible, but in reality everything run pretty smoothly and my Nginx server displayed the up to date index.html file pretty well.
It actually worked on a single-node cluster and multi-node cluster.
What am I not getting here? Why this thing works?
Is every pod mounting is own host path volume on start?
How can a hostPath storage works between multiple nodes?
EDIT: For the multi-node case, a network folder has been created between the same storage path of each machine this is why everything has been replicated successfully. I didn't understand that the same host path is created on each node with that PVC mounted.
To anyone with the same problem: each node mounting this hostpath PVC will have is own folder created at the PV path.
So without network replication between nodes, only pods of the same node will share the same folder.
This is why it's discouraged on a multi-node cluster due to the unpredictable location of a pod on the cluster.
Thanks!
how to handle shared read-write volume on multiple pods.
Redesign your application to avoid it. It tends to be fragile and difficult to manage multiple writers safely; you depend on both your application correctly performing things like file locking, the underlying shared filesystem implementation handling things properly, and the system being tolerant of any sort of network hiccup that might happen.
The example you give is something that frequently appears in Docker Compose setups: have an application with a mix of backend code and static files, and then try to publish the static files at runtime through a volume to a reverse proxy. Instead, you can build an image that copies the static files at build time:
FROM nginx
ARG app_version=latest
COPY --from=my/app:${app_version} /app/static /usr/share/nginx/html
Have your CI system build this and push it immediately after the backend image is built. The resulting image serves the corresponding static files, but doesn't require a shared volume or any manual management of the volume contents.
For other types of content, consider storing data in a database, or use an object-storage service that maintains its own backing store and can handle the concurrency considerations. Then most of your pods can be totally stateless, and you can manage the data separately (maybe even outside Kubernetes).
How can a hostPath storage works between multiple nodes?
It doesn't. It's an instruction to Kubernetes, on whichever node the pod happens to be scheduled on, to mount that host directory into the container. There's no management of any sort of the directory content; if two pods get scheduled on the same node, they'll share the directory, and if not, they won't; and if your pod's Deployment is updated and the pod is deleted and recreated somewhere else, it might not be the same node and might not have the same data.
With some very specific exceptions you shouldn't use hostPath volumes at all. The exceptions are things like log collectors run as DaemonSets, where there is exactly one pod on every node and you're interested in picking up the host-directory content that is different on each node.
In your specific setup either you're getting lucky with where the data producers and consumers are getting colocated, or there's something about your MicroK8s setup that's causing the host directories to be shared. It is not in general reliable storage.

Kubernetes persistent volume data corrupted after multiple pod deletions

I am stuggling with a simple one replica deployment of the official event store image on a Kubernetes cluster. I am using a persistent volume for the data storage.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: my-eventstore
spec:
strategy:
type: Recreate
replicas: 1
template:
metadata:
labels:
app: my-eventstore
spec:
imagePullSecrets:
- name: runner-gitlab-account
containers:
- name: eventstore
image: eventstore/eventstore
env:
- name: EVENTSTORE_DB
value: "/usr/data/eventstore/data"
- name: EVENTSTORE_LOG
value: "/usr/data/eventstore/log"
ports:
- containerPort: 2113
- containerPort: 2114
- containerPort: 1111
- containerPort: 1112
volumeMounts:
- name: eventstore-storage
mountPath: /usr/data/eventstore
volumes:
- name: eventstore-storage
persistentVolumeClaim:
claimName: eventstore-pv-claim
And this is the yaml for my persistent volume claim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: eventstore-pv-claim
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
The deployments work fine. It's when I tested for durability that I started to encounter a problem. I delete a pod to force actual state from desired state and see how Kubernetes reacts.
It immediately launched a new pod to replace the deleted one. And the admin UI was still showing the same data. But after deleting a pod for the second time, the new pod did not come up. I got an error message that said "record too large" that indicated corrupted data according to this discussion. https://groups.google.com/forum/#!topic/event-store/gUKLaxZj4gw
I tried again for a couple of times. Same result every time. After deleting the pod for the second time the data is corrupted. This has me worried that an actual failure will cause similar result.
However, when deploying new versions of the image or scaling the pods in the deployment to zero and back to one no data corruption occurs. After several tries everything is fine. Which is odd since that also completely replaces pods (I checked the pod id's and they changed).
This has me wondering if deleting a pod using kubectl delete is somehow more forcefull in the way that a pod is terminated. Do any of you have similar experience? Of insights on if/how delete is different? Thanks in advance for your input.
Regards,
Oskar
I was refered to this pull request on Github that stated the the proces was not killed properly: https://github.com/EventStore/eventstore-docker/pull/52
After building a new image with the Docker file from the pull request put this image in the deployment. I am killing pods left and right, no data corruption issues anymore.
Hope this helps someone facing the same issue.

Unable to mount Amazon Web Services (AWS) EBS Volume into Kubernetes pod

I created a volume using the following command.
aws ec2 create-volume --size 10 --region us-east-1 --availability-zone us-east-1c --volume-type gp2
Then I used the file below to create a pod that uses the volume. But when I login to the pod, I don't see the volume. Is there something that I might be doing wrong? Did I miss a step somewhere? Thanks for any insights.
---
kind: "Pod"
apiVersion: "v1"
metadata:
name: "nginx"
labels:
name: "nginx"
spec:
containers:
-
name: "nginx"
image: "nginx"
volumeMounts:
- mountPath: /test-ebs
name: test-volume
volumes:
- name: test-volume
# This AWS EBS volume must already exist.
awsElasticBlockStore:
volumeID: aws://us-east-1c/vol-8499707e
fsType: ext4
I just stumbled across the same thing and found out after some digging, that they actually changed the volume mount syntax. Based on that knowledge I created this PR for documentation update. See https://github.com/kubernetes/kubernetes/pull/17958 for tracking that and more info, follow the link to the bug and the original change which doesn't include the doc update. (SO prevents me from posting more than two links apparently.)
If that still doesn't do the trick for you (as it does for me) it's probably because of https://stackoverflow.com/a/32960312/3212182 which will be fixed in one of the next releases I guess. At least I can't see it in the latest release notes.
Ensure that the volume is in the same availability zone as the node.
http://kubernetes.io/docs/user-guide/volumes/
If that's not the issue, are there any events in kubectl describe pod nginx?