Share a volume in KubernetesPodOperator?

I'm using an ffmpeg Docker image from a KubernetesPodOperator() inside Airflow to extract frames from a video.
It works fine, but I am not able to retrieve the frames it stores: how can I store the frames generated by the Pod directly on my file system (host machine)?
Update:
From https://airflow.apache.org/kubernetes.html# I think I figured out that I need to work on the volume_mount, volume_config and volume parameters, but still no luck.
Error message:
"message":"Not found: \"test-volume\"","field":"spec.containers[0].volumeMounts[0].name"
PV and PVC:
command kubectl get pv,pvc test-volume gives:
NAME                           CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                 STORAGECLASS   REASON   AGE
persistentvolume/test-volume   10Gi       RWO            Retain           Bound    default/test-volume   manual                  3m

NAME                                STATUS   VOLUME        CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/test-volume   Bound    test-volume   10Gi       RWO            manual         3m
Code:
volume_mount = VolumeMount('test-volume',
                           mount_path='/',
                           sub_path=None,
                           read_only=False)

volume_config = {
    'persistentVolumeClaim': {
        'claimName': 'test-volume'  # uses the persistentVolumeClaim given in the Kube yaml
    }
}

volume = Volume(name="test-volume", configs=volume_config)

with DAG('test_kubernetes',
         default_args=default_args,
         schedule_interval=schedule_interval,
         ) as dag:

    extract_frames = KubernetesPodOperator(namespace='default',
                                           image="jrottenberg/ffmpeg:3.4-scratch",
                                           arguments=[
                                               "-i", "http://www.jell.yfish.us/media/jellyfish-20-mbps-hd-hevc-10bit.mkv",
                                               "test_%04d.jpg"
                                           ],
                                           name="extract-frames",
                                           task_id="extract_frames",
                                           volume=[volume],
                                           volume_mounts=[volume_mount],
                                           get_logs=True)

Here's some speculation as to what may be wrong:
1. (This is most likely where your error comes from.) KubernetesPodOperator expects the parameter "volumes", not "volume".
2. In general, it's bad practice to mount onto "/", since you will be hiding everything that ships on the image you're running; you should probably change "mount_path" in your VolumeMount object to something else, like "/stored_frames". A corrected sketch follows.
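A minimal sketch of those two fixes, assuming the Airflow 1.10-era contrib imports used by the code above (note the output pattern is pointed at the mount so the frames land on the volume):

from airflow.contrib.kubernetes.volume import Volume
from airflow.contrib.kubernetes.volume_mount import VolumeMount
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

# Mount under a dedicated directory instead of "/"
volume_mount = VolumeMount('test-volume',
                           mount_path='/stored_frames',
                           sub_path=None,
                           read_only=False)

extract_frames = KubernetesPodOperator(namespace='default',
                                       image="jrottenberg/ffmpeg:3.4-scratch",
                                       arguments=[
                                           "-i", "http://www.jell.yfish.us/media/jellyfish-20-mbps-hd-hevc-10bit.mkv",
                                           "/stored_frames/test_%04d.jpg"
                                       ],
                                       name="extract-frames",
                                       task_id="extract_frames",
                                       volumes=[volume],        # "volumes", not "volume"
                                       volume_mounts=[volume_mount],
                                       get_logs=True)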

You should create a test pod to verify your k8s objects (volumes, pod, configmap, secrets, etc.) before wrapping that pod creation in the DAG with KubernetesPodOperator. Based on your code above, it can look like this:
apiVersion: v1
kind: Pod
metadata:
  name: "extract-frames-pod"
  namespace: "default"
spec:
  containers:
    - name: "extract-frames"
      image: "jrottenberg/ffmpeg:3.4-scratch"
      # no command override; the image entrypoint (ffmpeg) is used
      args: ["-i", "http://www.jell.yfish.us/media/jellyfish-20-mbps-hd-hevc-10bit.mkv", "test_%04d.jpg"]
      imagePullPolicy: IfNotPresent
      volumeMounts:
        - name: "test-volume"
          # do not use "/" for mountPath.
          mountPath: "/images"
  restartPolicy: Never
  volumes:
    - name: "test-volume"
      persistentVolumeClaim:
        claimName: "test-volume"
  serviceAccountName: default
I expect you will get the same error that you had: "message":"Not found: \"test-volume\"","field":"spec.containers[0].volumeMounts[0].name", which I think is an issue with your PersistentVolume manifest file.
Did you set the path test-volume? Something like:
path: /test-volume
And does that path exist on the target volume? If not, create that directory/folder. That might solve your problem.
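For reference, a minimal hostPath-style PersistentVolume matching the kubectl get pv,pvc output above (10Gi, RWO, storage class "manual") might look like the sketch below; the hostPath type and the exact path are assumptions:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: test-volume
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/test-volume"   # this directory must already exist on the node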

Related

Unable to mount NFS on Kubernetes Pod

I am working on deploying a Hyperledger Fabric test network on a Kubernetes minikube cluster. I intend to use PersistentVolume to share crypto-config and channel artifacts among various peers and orderers. Following are my PersistentVolume.yaml and PersistentVolumeClaim.yaml:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: persistent-volume
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  nfs:
    path: "/nfsroot"
    server: "3.128.203.245"
    readOnly: false
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: persistent-volume-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
Following is the pod where the above claim is mounted on /data:
kind: Pod
apiVersion: v1
metadata:
  name: test-shell
  labels:
    name: test-shell
spec:
  containers:
    - name: shell
      image: ubuntu
      command: ["/bin/bash", "-c", "while true ; do sleep 10 ; done"]
      volumeMounts:
        - mountPath: "/data"
          name: pv
  volumes:
    - name: pv
      persistentVolumeClaim:
        claimName: persistent-volume-claim
NFS is set up on my EC2 instance. I have verified the NFS server is working fine, and I was able to mount it inside minikube. I don't understand what I am doing wrong, but any file present inside 3.128.203.245:/nfsroot is not present in test-shell:/data.
What am I missing? I even tried a hostPath mount, but to no avail. Please help me out.
I think you should check the following things to verify whether NFS is mounted successfully or not:
1. Run this command on the node where you want to mount:
$ showmount -e nfs-server-ip
In my case, $ showmount -e 172.16.10.161 gives:
Export list for 172.16.10.161:
/opt/share *
2. Use the $ df -hT command to see whether NFS is mounted or not; in my case it gives:
172.16.10.161:/opt/share nfs4 91G 32G 55G 37% /opt/share
3. If it is not mounted, use the following command:
$ sudo mount -t nfs 172.16.10.161:/opt/share /opt/share
4. If the above commands show an error, check whether the firewall is allowing NFS or not:
$ sudo ufw status
If not, allow it with:
$ sudo ufw allow from nfs-server-ip to any port nfs
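If the export itself is the problem, a typical server-side entry looks like the sketch below; this is generic, not taken from the question's setup (run sudo exportfs -ra after editing):

# /etc/exports on the NFS server
/opt/share *(rw,sync,no_subtree_check)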
I made the same setup and I don't face any issues. My k8s cluster for Fabric is running successfully. The Hyperledger Fabric k8s yaml files can be found in my GitHub repo. There I have deployed a consortium of banks on Hyperledger Fabric, which is a dynamic multi-host blockchain network: you can add orgs and peers, join peers, create channels, and install and instantiate chaincode on the go in an existing running blockchain network.
By default in minikube you should have a default StorageClass:
Each StorageClass contains the fields provisioner, parameters, and reclaimPolicy, which are used when a PersistentVolume belonging to the class needs to be dynamically provisioned.
For example, NFS doesn't provide an internal provisioner, but an external provisioner can be used. There are also cases when 3rd party storage vendors provide their own external provisioner.
Change the default StorageClass
In your example, having a default StorageClass can lead to problems.
In order to list enabled addons in minikube please use:
minikube addons list
To list all StorageClasses in your cluster use:
kubectl get sc
NAME                 PROVISIONER
standard (default)   k8s.io/minikube-hostpath
Please note that at most one StorageClass can be marked as default. If two or more of them are marked as default, a PersistentVolumeClaim without storageClassName explicitly specified cannot be created.
In your example the most probable scenario is that you already have a default StorageClass. Applying those resources caused: new PV creation (without a StorageClass), and new PVC creation (with a reference to the existing default StorageClass). In this situation there is no reference between your custom PV and your PVC, so no binding occurs. As an example, please take a look:
kubectl get pv,pvc,sc
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM             STORAGECLASS   REASON   AGE
persistentvolume/nfs                                        3Gi        RWX            Retain           Available                                             50m
persistentvolume/pvc-8aeb802f-cd95-4933-9224-eb467aaa9871   1Gi        RWX            Delete           Bound       default/pvc-nfs   standard                50m

NAME                            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/pvc-nfs   Bound    pvc-8aeb802f-cd95-4933-9224-eb467aaa9871   1Gi        RWX            standard       50m

NAME                                             PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
storageclass.storage.k8s.io/standard (default)   k8s.io/minikube-hostpath   Delete          Immediate           false                  103m
This example will not work due to:
- A new persistentvolume/nfs has been created (without any reference to a PVC).
- A new persistentvolume/pvc-8aeb802f-cd95-4933-9224-eb467aaa9871 has been created using the default StorageClass. In the Claim column we can see that this PV was created by dynamic PV provisioning using the default StorageClass, with a reference to the default/pvc-nfs claim (persistentvolumeclaim/pvc-nfs).
Solution 1.
According to the information from the comments:
Also I am able to connect to it within my minikube and also my actual ubuntu system.
If you are able to mount this NFS share from inside the minikube host, i.e. you have mounted the NFS share on your minikube node, please try this example with a hostPath volume directly from your pod:
apiVersion: v1
kind: Pod
metadata:
  name: test-shell
  namespace: default
spec:
  volumes:
    - name: pv
      hostPath:
        path: /path/shares   # path to nfs mount point on minikube node
  containers:
    - name: shell
      image: ubuntu
      command: ["/bin/bash", "-c", "sleep 1000"]
      volumeMounts:
        - name: pv
          mountPath: /data
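To verify the mount from inside the pod, a quick check using the pod name and mount path above:

$ kubectl exec test-shell -- ls -la /data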
Solution 2.
If you are using the PV/PVC approach:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: persistent-volume
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: ""   # Empty string must be explicitly set, otherwise the default StorageClass will be used (or set a custom storageClassName)
  nfs:
    path: "/nfsroot"
    server: "3.128.203.245"
    readOnly: false
  claimRef:
    name: persistent-volume-claim
    namespace: default
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: persistent-volume-claim
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: ""   # Empty string must be explicitly set, otherwise the default StorageClass will be used (or set a custom storageClassName)
  volumeName: persistent-volume
Note:
If you are not referencing any provisioner associated with your StorageClass, helper programs relating to the volume type may be required for consumption of a PersistentVolume within a cluster. In this example, the PersistentVolume is of type NFS, and the helper program /sbin/mount.nfs is required to support the mounting of NFS filesystems.
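On Debian/Ubuntu-based nodes that helper typically comes from the nfs-common package (this assumes your nodes are apt-based):

$ sudo apt-get install -y nfs-common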
Please keep in mind that when you create a PVC, the Kubernetes persistent-volume controller tries to bind the PVC to a proper PV. During this process different factors are taken into account, such as storageClassName (default/custom), accessModes, claimRef, and volumeName.
In this case you can use:
PersistentVolume.spec.claimRef.name: persistent-volume-claim
PersistentVolumeClaim.spec.volumeName: persistent-volume
Note:
The control plane can bind PersistentVolumeClaims to matching PersistentVolumes in the cluster. However, if you want a PVC to bind to a specific PV, you need to pre-bind them.
By specifying a PersistentVolume in a PersistentVolumeClaim, you declare a binding between that specific PV and PVC. If the PersistentVolume exists and has not reserved PersistentVolumeClaims through its claimRef field, then the PersistentVolume and PersistentVolumeClaim will be bound.
The binding happens regardless of some volume matching criteria, including node affinity. The control plane still checks that storage class, access modes, and requested storage size are valid.
Once the PV/PVC have been created, or in case of any problem with PV/PVC binding, please use the following commands to inspect the current state:
kubectl get pv,pvc,sc
kubectl describe pv
kubectl describe pvc
kubectl describe pod
kubectl get events

How do I run "ls" and then print the output to the console?

I am testing a connection between a persistent volume and a Kubernetes pod by running busybox, but am getting "can't open" and "no such file or directory" errors. In order to do further testing, I tried running
echo ls /mntpoint/filename
This is obviously not the correct command. I have tried a few other iterations, too many to list here.
I want to run ls on the mountpoint and print the result to the console. How do I do this?
EDIT
My code was closest to Rohit's suggestion (below), so I made the following edits, but the code still does not work. Please help.
Persistent Volume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: data
  labels:
    type: local
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 1Gi
  hostPath:
    path: "/mnt/data"
  storageClassName: test
Persistent Volume Claim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: test
Pod
apiVersion: v1
kind: Pod
metadata:
  name: persistent-volume
spec:
  containers:
    - name: busybox
      command: ['tail', '-f', '/dev/null']
      image: busybox
      volumeMounts:
        - name: data
          mountPath: "/data"
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data
EDIT 2
So, after taking a day off, I came back to my (still running) pod and the command (ls) worked. It works as expected on any directory (e.g. "ls /" or "ls /data").
My current interpretation is that I did not wait long enough before running the command, although that does not seem to explain it, since I had been monitoring with "kubectl describe pod". Also, I have run the same test several times with short latency between the "apply" and "exec" commands, and the behavior has been consistent today.
I am going to keep playing with this, but I think the current problem has been solved. Thank you!
You cannot directly access a volume mount on the pod without creating a claim. You are missing some steps here:
1. Create a Persistent Volume. I think you have done this part.
2. Create a Persistent Volume Claim. This will bind your claim to your volume.
3. Attach your Persistent Volume Claim to the pod.
Once these steps are done you can access the files from your volume in the pod.
Go through the link for detailed information and steps.
Steps you need to follow when dealing with volumes and Kubernetes resources:
1. Create a Persistent Volume.
2. Create a Persistent Volume Claim and make sure its state is Bound.
3. Once the PV and PVC are bound, try to use the PV from a pod/deployment through the PVC.
4. Check the logs of the pod/deployment. You might see the output of the command execution.
Reference: https://kubernetes.io/docs/tasks/configure-pod-container/configure-persistent-volume-storage/
Hope this helps; please try to elaborate more and paste the logs of the above-mentioned steps.
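For completeness, once the pod above is Running, the command the asker was looking for is a plain kubectl exec (pod name and mount path taken from the manifests above):

$ kubectl exec persistent-volume -- ls /data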

Creating a link to an NFS share in K3s Kubernetes

I'm very new to Kubernetes and trying to get node-red running on a small cluster of Raspberry Pis.
I happily managed that, but noticed that once the cluster is powered down, the next time I bring it up the flows in node-red have vanished.
So, I've created an NFS share on a FreeNAS box on my local network and can mount it from another RPi, so I know the permissions work.
However, I cannot get my mount to work in a Kubernetes deployment.
Any help as to where I have gone wrong please?
apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-red
  labels:
    app: node-red
spec:
  replicas: 1
  selector:
    matchLabels:
      app: node-red
  template:
    metadata:
      labels:
        app: node-red
    spec:
      containers:
        - name: node-red
          image: nodered/node-red:latest
          ports:
            - containerPort: 1880
              name: node-red-ui
          securityContext:
            privileged: true
          volumeMounts:
            - name: node-red-data
              mountPath: /data
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: TZ
              value: Europe/London
      volumes:
        - name: node-red-data
      nfs:
        server: 192.168.1.96
        path: /mnt/Pool1/ClusterStore/nodered
The error I am getting is
error: error validating "node-red-deploy.yml": error validating data:
ValidationError(Deployment.spec.template.spec): unknown field "nfs" in io.k8s.api.core.v1.PodSpec; if
you choose to ignore these errors, turn validation off with --validate=false
New Information
I now have the following
apiVersion: v1
kind: PersistentVolume
metadata:
  name: clusterstore-nodered
  labels:
    type: nfs
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  nfs:
    path: /mnt/Pool1/ClusterStore/nodered
    server: 192.168.1.96
  persistentVolumeReclaimPolicy: Recycle
claim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: clusterstore-nodered-claim
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
Now when I start the deployment it waits at Pending forever, and I see the following in the events for the PVC:

Events:
  Type     Reason                Age                   From                                                                                                Message
  ----     ------                ----                  ----                                                                                                -------
  Normal   WaitForFirstConsumer  5m47s (x7 over 7m3s)  persistentvolume-controller                                                                         waiting for first consumer to be created before binding
  Normal   Provisioning          119s (x5 over 5m44s)  rancher.io/local-path_local-path-provisioner-58fb86bdfd-rtcls_506528ac-afd0-11ea-930d-52d0b85bb2c2  External provisioner is provisioning volume for claim "default/clusterstore-nodered-claim"
  Warning  ProvisioningFailed    119s (x5 over 5m44s)  rancher.io/local-path_local-path-provisioner-58fb86bdfd-rtcls_506528ac-afd0-11ea-930d-52d0b85bb2c2  failed to provision volume with StorageClass "local-path": Only support ReadWriteOnce access mode
  Normal   ExternalProvisioning  92s (x19 over 5m44s)  persistentvolume-controller                                                                         waiting for a volume to be created, either by external provisioner "rancher.io/local-path" or manually created by system administrator
I assume that this is because I don't have an NFS provisioner; in fact, if I do kubectl get storageclass I only see local-path.
New question: how do I add a StorageClass for NFS? A little googling around has left me without a clue.
Ok, solved the issue. Kubernetes tutorials are really esoteric and miss lots of assumed steps.
My problem was down to k3s on the Pi only shipping with a local-path storage provisioner.
I finally found a tutorial that installed an NFS client storage provisioner, and now my cluster works!
This was the tutorial I found the information in.
In the stated tutorial there are basically these steps to fulfill:
1.
showmount -e 192.168.1.XY
to check if the share is reachable from outside the NAS.
2.
helm install nfs-provisioner stable/nfs-client-provisioner --set nfs.server=192.168.1.**XY** --set nfs.path=/samplevolume/k3s --set image.repository=quay.io/external_storage/nfs-client-provisioner-arm
where you replace the IP with your NFS server and the NFS path with your specific path on your NAS (both should be visible from your showmount -e IP command).
Update 23.02.2021
It seems that you have to use another chart and image now:
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner --set nfs.server=192.168.1.**XY** --set nfs.path=/samplevolume/k3s --set image.repository=gcr.io/k8s-staging-sig-storage/nfs-subdir-external-provisioner
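Note: the chart reference in the updated command assumes its Helm repo has already been added; the upstream project documents it as:

$ helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/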
3.
kubectl get storageclass
to check if the StorageClass now exists.
4.
kubectl patch storageclass nfs-client -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}' && kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
to configure the new StorageClass as "default". Replace nfs-client and local-path with what kubectl get storageclass tells you.
5.
kubectl get storageclass
as a final check, to confirm it's marked as "default".
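Alternatively, you can skip step 4 and reference the class explicitly in the claim; a sketch based on the claim.yaml above, assuming the class name nfs-client matches what kubectl get storageclass reports:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: clusterstore-nodered-claim
spec:
  storageClassName: nfs-client
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi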
This is a validation error pointing at the very last part of your Deployment yaml, making it an invalid object. It looks like you've made a mistake with indentation. It should look more like this:
      volumes:
        - name: node-red-data
          nfs:
            server: 192.168.1.96
            path: /mnt/Pool1/ClusterStore/nodered
Also, as you are new to Kubernetes, I strongly recommend getting familiar with the concepts of PersistentVolumes and PersistentVolumeClaims. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual pod that uses the PV.
Please let me know if that helped.

How do I mount and format new google compute disk to be mounted in a GKE pod?

I have created a new disk in Google Compute Engine.
gcloud compute disks create --size=10GB --zone=us-central1-a dane-disk
It says I need to format it, but I have no idea how I could mount/access the disk.
gcloud compute disks list
NAME            LOCATION       LOCATION_SCOPE  SIZE_GB  TYPE         STATUS
notowania-disk  us-central1-a  zone            10       pd-standard  READY
New disks are unformatted. You must format and mount a disk before it
can be used. You can find instructions on how to do this at:
https://cloud.google.com/compute/docs/disks/add-persistent-disk#formatting
I tried the instructions above, but lsblk is not showing the disk at all.
Do I need to create a VM and somehow attach the disk to it in order to use it? My goal was to mount the disk as a persistent GKE volume independent of the VM (the last GKE upgrade caused recreation of the VM and data loss).
Thanks for the clarification of what you are trying to do in the comments.
I have 2 different answers here.
The first is that my testing shows that the Kubernetes GCE PD documentation is exactly right, and the warning about formatting seems like it can be safely ignored.
If you just issue:
gcloud compute disks create --size=10GB --zone=us-central1-a my-test-data-disk
And then use it in a pod:
apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
    - image: nginx
      name: nginx-container
      volumeMounts:
        - mountPath: /test-pd
          name: test-volume
  volumes:
    - name: test-volume
      # This GCE PD must already exist.
      gcePersistentDisk:
        pdName: my-test-data-disk
        fsType: ext4
It will be formatted when it is mounted. This is likely because the fsType parameter instructs the system how to format the disk. You don't need to do anything with a separate GCE instance. The disk is retained even if you delete the pod or even the entire cluster. It is not reformatted on subsequent uses, and the data is kept around.
So, the warning message from gcloud is confusing but can be safely ignored in this case.
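A quick way to confirm the mount and filesystem type from inside the pod (pod name and mount path taken from the example above):

$ kubectl exec test-pd -- df -hT /test-pd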
Now, in order to dynamically create a persistent volume based on GCE PD that isn't automatically deleted, you will need to create a new StorageClass that sets the Reclaim Policy to Retain, and then create a PersistentVolumeClaim based on that StorageClass. This also keeps basically the entire operation inside of Kubernetes, without needing to do anything with gcloud. Likewise, a similar approach is what you would want to use with a StatefulSet as opposed to a single pod, as described here.
Most of what you are looking to do is described in this GKE documentation about dynamically allocating PVCs as well as the Kubernetes StorageClass documentation. Here's an example:
gce-pd-retain-storageclass.yaml:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gce-pd-retained
reclaimPolicy: Retain
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
  replication-type: none
pvc-demo.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-demo-disk
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gce-pd-retained
  resources:
    requests:
      storage: 10Gi
Applying the above will dynamically create a disk that will be retained when you delete the claim.
And finally a demo-pod.yaml that mounts the PVC as a volume (this is really a nonsense example using nginx, but it demonstrates the syntax):
apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
    - image: nginx
      name: nginx-container
      volumeMounts:
        - mountPath: /test-pd
          name: test-volume
  volumes:
    - name: test-volume
      persistentVolumeClaim:
        claimName: pvc-demo-disk
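Applying the three manifests in order (file names as given above):

$ kubectl apply -f gce-pd-retain-storageclass.yaml
$ kubectl apply -f pvc-demo.yaml
$ kubectl apply -f demo-pod.yaml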
Now, if you apply these three in order, you'll get a container running using the PersistentVolumeClaim which has automatically created (and formatted) a disk for you. When you delete the pod, the claim keeps the disk around. If you delete the claim the StorageClass still keeps the disk from being deleted.
Note that the PV that is left around after this won't be automatically reused, as the data is still on the disk. See the Kubernetes documentation about what you can do to reclaim it in this case. Really, this mostly says that you shouldn't delete the PVC unless you're ready to do work to move the data off the old volume.
Note that these disks will even continue to exist when the entire GKE cluster is deleted as well (and you will continue to be billed for them until you delete them).
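To find and remove such leftover disks once you no longer need them (the zone flag is an assumption; use wherever the disks were created):

$ gcloud compute disks list
$ gcloud compute disks delete DISK_NAME --zone=us-central1-a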

Volume is already exclusively attached to one node and can't be attached to another

I have a pretty simple Kubernetes pod. I want a StatefulSet with the following process:
1. An initContainer downloads and uncompresses a tarball from S3 into a volume mounted to the initContainer.
2. That volume is mounted to my main container to be used.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: app
  namespace: test
  labels:
    name: app
spec:
  serviceName: app
  replicas: 1
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      initContainers:
        - name: preparing
          image: alpine:3.8
          imagePullPolicy: IfNotPresent
          command:
            - "sh"
            - "-c"
            - |
              echo "Downloading data"
              wget https://s3.amazonaws.com/.........
              tar -xvzf xxxx-........ -C /root/
          volumeMounts:
            - name: node-volume
              mountPath: /root/data/
      containers:
        - name: main-container
          image: ecr.us-west-2.amazonaws.com/image/:latest
          imagePullPolicy: Always
          volumeMounts:
            - name: node-volume
              mountPath: /root/data/
  volumeClaimTemplates:
    - metadata:
        name: node-volume
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: gp2-b
        resources:
          requests:
            storage: 80Gi
I continue to get the following error. At first I run this and I can see the logs of my tarball being downloaded by the initContainer; about halfway done, it terminates and gives me:
Multi-Attach error for volume "pvc-faedc8" Volume is already exclusively attached to one node and can't be attached to another
Looks like you have a dangling PVC and/or PV that is attached to one of your nodes. You can ssh into the node and run df or mount to check.
If you look at this, the PVCs in a StatefulSet are always mapped to their pod names, so it may be possible that you still have a dangling pod(?)
If you have a dangling pod:
$ kubectl -n test delete pod <pod-name>
You may have to force it:
$ kubectl -n test delete pod <pod-name> --grace-period=0 --force
Then, you can try deleting the PVC and its corresponding PV:
$ kubectl delete pvc pvc-faedc8
$ kubectl delete pv <pv-name>
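An alternative to sshing into nodes, assuming a CSI-based volume plugin, is to ask the API server which node the volume is attached to:

$ kubectl get volumeattachments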
I had the same issue right now, and the problem was that the node on which the pod usually runs was down, and another one took over (which didn't work as expected, for whatever reason). I'd had the "node down" scenario a few times before and it never caused any issues. I couldn't get the StatefulSet and Deployment back up and running without booting the node that was down, but as soon as the node was up again, the StatefulSet and Deployment immediately came back to life as well.
I had a similar error:
The volume pvc-2885ea01-f4fb-11eb-9528-00505698bd8b cannot be attached to the node node1 since it is already attached to the node node2
I use Longhorn as a storage provisioner and manager, so I just detached the PV named in the error and restarted the StatefulSet. This time it automatically attached to the PV correctly.
I'll add an answer that will prevent this from happening again.
Short answer
Access modes: switch from ReadWriteOnce to ReadWriteMany.
In a bit more detail
You're using a StatefulSet where each replica has its own state, with a unique persistent volume claim (PVC) created for each pod.
Each PVC refers to a Persistent Volume for which you decided that the access mode is ReadWriteOnce.
As you can see from here:
ReadWriteOnce
the volume can be mounted as read-write by a single node. ReadWriteOnce access mode still can allow multiple pods to access the volume when the pods are running on the same node.
So if the K8s scheduler shifts the pod to a different node (due to priorities, resource calculations, or a cluster autoscaler decision), you will receive an error that the volume is already exclusively attached to one node and can't be attached to another.
Please consider using ReadWriteMany, where the volume can be mounted as read-write by many nodes. Note that the underlying volume plugin must actually support this; EBS-backed gp2 volumes, as used in the question, do not, so an RWX-capable backend such as EFS would be needed.