kubevirt had volume node affinity conflict - kubernetes

I am trying to set up KubeVirt by following https://kubevirt.io/2019/How-To-Import-VM-into-Kubevirt.html and https://github.com/kubevirt/containerized-data-importer
The instructions are incomplete, so I had to create the PV manually.
The error I get is:
0/1 nodes are available: 1 node(s) had volume node affinity conflict. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
I have tried the fixes described at https://www.datree.io/resources/kubernetes-troubleshooting-fixing-persistentvolumeclaims-error#anchor0 but I still get the same error.
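For context, "volume node affinity conflict" means the PersistentVolume bound to the claim has a spec.nodeAffinity that the node does not satisfy. With a manually created PV, a likely cause is a hostname or zone value in the PV that does not match the node's labels. Here is a minimal Python sketch of that matching, assuming the PV and the node labels are already loaded as plain dicts (for example from kubectl get pv -o json). The hostnames and helper names are hypothetical, and only the In, NotIn, Exists and DoesNotExist operators are handled (matchFields, Gt and Lt are ignored).

```python
def term_matches(term, node_labels):
    """Return True if every matchExpression in one nodeSelectorTerm holds."""
    for expr in term.get("matchExpressions", []):
        key, op = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if op == "In":
            ok = node_labels.get(key) in values
        elif op == "NotIn":
            ok = node_labels.get(key) not in values
        elif op == "Exists":
            ok = key in node_labels
        elif op == "DoesNotExist":
            ok = key not in node_labels
        else:
            raise ValueError(f"operator not handled in this sketch: {op}")
        if not ok:
            return False
    return True


def pv_fits_node(pv, node_labels):
    """A PV without nodeAffinity fits anywhere; otherwise one term must match."""
    affinity = pv.get("spec", {}).get("nodeAffinity")
    if not affinity:
        return True
    terms = affinity["required"]["nodeSelectorTerms"]
    return any(term_matches(term, node_labels) for term in terms)


# Hypothetical PV pinned to a host called "node01".
pv = {"spec": {"nodeAffinity": {"required": {"nodeSelectorTerms": [
    {"matchExpressions": [
        {"key": "kubernetes.io/hostname", "operator": "In", "values": ["node01"]},
    ]},
]}}}}

# Hypothetical labels of the only node in the cluster.
node_labels = {"kubernetes.io/hostname": "minikube"}

print(pv_fits_node(pv, node_labels))  # False: the hostname does not match
```

If the check fails for every node, compare the values under nodeAffinity in the PV with the output of kubectl get nodes --show-labels and correct whichever side is wrong.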

Related

Pod creation in EKS cluster fails with FailedScheduling error

I have created a new EKS cluster with 1 worker node in a public subnet. I can query the node, connect to the cluster, and run the pod creation command. However, when I try to create a pod, it fails with the error below, taken from describing the pod. Please advise.
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 81s default-scheduler 0/1 nodes are available: 1 Too many pods. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
Warning FailedScheduling 16m default-scheduler 0/2 nodes are available: 2 Too many pods, 2 node(s) had untolerated taint {node.kubernetes.io/unschedulable: }, 2 node(s) were unschedulable. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling.
Warning FailedScheduling 16m default-scheduler 0/3 nodes are available: 2 node(s) had untolerated taint {node.kubernetes.io/unschedulable: }, 2 node(s) were unschedulable, 3 Too many pods. preemption: 0/3 nodes are available: 1 No preemption victims found for incoming pod, 2 Preemption is not helpful for scheduling.
Warning FailedScheduling 14m (x3 over 22m) default-scheduler 0/2 nodes are available: 1 node(s) had untolerated taint {node.kubernetes.io/unschedulable: }, 1 node(s) were unschedulable, 2 Too many pods. preemption: 0/2 nodes are available: 1 No preemption victims found for incoming pod, 1 Preemption is not helpful for scheduling.
Warning FailedScheduling 12m default-scheduler 0/2 nodes are available: 1 Too many pods, 2 node(s) had untolerated taint {node.kubernetes.io/unschedulable: }, 2 node(s) were unschedulable. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling.
Warning FailedScheduling 7m14s default-scheduler no nodes available to schedule pods
Warning FailedScheduling 105s (x5 over 35m) default-scheduler 0/1 nodes are available: 1 Too many pods. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
I am able to get the status of the node, and it looks ready:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-10-0-12-61.ec2.internal Ready <none> 15m v1.24.7-eks-fb459a0
While troubleshooting I tried the following options:
recreated the complete demo cluster - still the same error
recreated the pods with different images - still the same error
increased the instance type to t3.micro - still the same error
reviewed the security groups and other cluster parameters - could not find the root cause
This is due to the pod limit (or IP limit) of the node.
According to the official Amazon documentation, a t3.micro supports at most 2 network interfaces with 2 private IPs per interface, so you get roughly 4 IPs to use. The first IP is used by the node itself, and the default system pods that run as DaemonSets take up some of the remaining capacity.
Add a new instance, or upgrade to a larger instance type that can handle more pods.
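The arithmetic behind the limit can be sketched as follows. This assumes the default AWS VPC CNI without prefix delegation, where the commonly cited formula is max pods = ENIs * (IPv4 addresses per ENI - 1) + 2. The function name is hypothetical; the t3.micro numbers are the ones quoted above.

```python
def eni_max_pods(max_enis, ipv4_per_eni):
    """Pod limit under the default VPC CNI (assumption: no prefix delegation).

    One address per ENI is reserved for the interface itself; the "+ 2"
    covers pods that use host networking.
    """
    return max_enis * (ipv4_per_eni - 1) + 2


# t3.micro: 2 network interfaces, 2 private IPv4 addresses per interface.
print(eni_max_pods(2, 2))  # 4
```

With only 4 pod slots and several of them taken by system pods, "Too many pods" on a freshly created single-node cluster is the expected outcome.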

List nodes under nodePool

I am trying to redeploy a Jenkins pod on Kubernetes.
When I do, I get an error and the pod does not initialize.
When I describe the pod, I see:
Warning FailedScheduling 46s default-scheduler 0/12 nodes are available: 12 node(s) didn't match node selector.
Normal NotTriggerScaleUp 7s (x4 over 38s) cluster-autoscaler pod didn't trigger scale-up (it wouldn't fit if a new node is added): 2 node(s) didn't match node selector
I also see that the pod has a node selector defined: Node-Selectors: nodePool=default
I have a Kubernetes deployment called jenkins in which this value is defined.
I am not sure what the nodePool should be, because I do not know how to list all the nodePools I have available.
I can list all nodes using kubectl get nodes, but I do not see any nodePool information there.
Any advice on how to do this?
Please check the node labels using kubectl get nodes --show-labels. It looks like the node selector in your deployment does not match the labels on any node.
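To see which nodePool values actually exist, the labels can be grouped per node. A minimal Python sketch, assuming you have copied the LABELS column from kubectl get nodes --show-labels for each node; the node names, label values and helper names below are made up for illustration.

```python
from collections import defaultdict


def parse_labels(column):
    """Turn a 'k1=v1,k2=v2' LABELS column into a dict."""
    labels = {}
    for pair in column.split(","):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        labels[key] = value
    return labels


def group_by_label(nodes, key):
    """Map each value of `key` to the nodes carrying it ('<unset>' if absent)."""
    groups = defaultdict(list)
    for name, labels in nodes.items():
        groups[labels.get(key, "<unset>")].append(name)
    return dict(groups)


# Hypothetical nodes and labels.
nodes = {
    "node-a": parse_labels("kubernetes.io/os=linux,nodePool=build"),
    "node-b": parse_labels("kubernetes.io/os=linux"),
}

print(group_by_label(nodes, "nodePool"))
```

If no group is called default, the selector nodePool=default can never match, which is what the scheduler message is reporting.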

AWS has per node Pod IP restrictions, pods are stuck at ContainerCreating state

As we all know, AWS has a per-node pod IP restriction, and Kubernetes does not take it into account while scheduling. Pods get scheduled on nodes where no pod IPs can be allocated, and they get stuck in the ContainerCreating state as follows:
Normal Scheduled 114s default-scheduler Successfully assigned default/whoami-deployment-9f9c86c4f-r4flx to ip-192-168-15-248.ec2.internal
Warning FailedCreatePodSandBox 111s kubelet, ip-192-168-15-248.ec2.internal Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "8d4b5f98f9b600ad9ec486f994fa2f9223d5224842df7f78802616f014b52970" network for pod "whoami-deployment-9f9c86c4f-r4flx": NetworkPlugin cni failed to set up pod "whoami-deployment-9f9c86c4f-r4flx_default" network: add cmd: failed to assign an IP address to container
Normal SandboxChanged 86s (x12 over 109s) kubelet, ip-192-168-15-248.ec2.internal Pod sandbox changed, it will be killed and re-created.
Warning FailedCreatePodSandBox 61s (x4 over 76s) kubelet, ip-192-168-15-248.ec2.internal (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "e2a3c54ba7d9a33a45248f7c276f4a2d5b0c8ba6c3deb5184392156b35638553" network for pod "whoami-deployment-9f9c86c4f-r4flx": NetworkPlugin cni failed to set up pod "whoami-deployment-9f9c86c4f-r4flx_default" network: add cmd: failed to assign an IP address to container
So I tried to overcome the issue by tainting the nodes with key=value:NoSchedule, so that the default scheduler does not schedule pods onto nodes that have already reached the pod IP limit, and I deleted all the pods that were stuck in the ContainerCreating state. I was hoping this would stop the scheduler from placing any more pods on the tainted nodes, and that is what happened. However, since the pods were no longer scheduled, I was also hoping that cluster-autoscaler would scale the ASG so that my pods would run on new nodes, and that did not happen.
When I describe the pod I see:
Warning FailedScheduling 40s (x5 over 58s) default-scheduler 0/5 nodes are available: 5 node(s) had taints that the pod didn't tolerate.
Normal NotTriggerScaleUp 5s (x6 over 56s) cluster-autoscaler pod didn't trigger scale-up (it wouldn't fit if a new node is added): 1 node(s) had taints that the pod didn't tolerate
When I look at cluster-autoscaler logs I see:
I1108 16:30:47.521026 1 event.go:209] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"whoami-deployment-9f9c86c4f-x5h4d", UID:"158cc806-0245-11ea-a67a-0efb4254edc4", APIVersion:"v1", ResourceVersion:"2483839", FieldPath:""}): type: 'Normal' reason: 'NotTriggerScaleUp' pod didn't trigger scale-up (it wouldn't fit if a new node is added): 1 node(s) had taints that the pod didn't tolerate
Next, I tried an alternative way to mark my nodes unschedulable: I removed the above NoSchedule taint and patched the nodes with:
kubectl patch nodes node1.internal -p '{"spec": {"unschedulable": true}}'
And these are the logs I see in cluster-autoscaler:
I1109 10:47:50.894680 1 static_autoscaler.go:138] Starting main loop
W1109 10:47:50.894719 1 static_autoscaler.go:562] Cluster has no ready nodes.
I1109 10:47:50.901157 1 event.go:209] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"kube-system", Name:"cluster-autoscaler-status", UID:"7c949105-0153-11ea-9a39-12e5fc698b6e", APIVersion:"v1", ResourceVersion:"2629645", FieldPath:""}): type: 'Warning' reason: 'ClusterUnhealthy' Cluster has no ready nodes.
So my idea for overcoming the issue did not work. How should I overcome this?
Kubernetes version: 1.14
Cluster Autoscaler: 1.14.6
Let me know if you guys need more details.

default-scheduler 0/5 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 4 node(s) had volume node affinity conflict

Could someone please advise me what the issue is?
Warning FailedScheduling 78s (x31 over 40m) default-scheduler 0/5 nodes are available: 1 node(s) had taints that the pod didn't tolerate, 4 node(s) had volume node affinity conflict
My node and EBS volume are in the same AWS zone.
My nodes are in pending status.

istio-pilot on minikube is always in pending state

The istio-pilot pod on a minikube Kubernetes cluster is always in the Pending state. I increased the CPU to 4 and the memory to 8GB, but the istio-pilot pod is still Pending.
Is any specific change required to run Istio on minikube other than the ones mentioned in the documentation?
Resolved the issue. I am running minikube with VirtualBox, and starting minikube with more memory and CPU does not take effect until minikube is deleted and started again with the new parameters. Without that, the pod failed with Insufficient memory.
I saw istio-pilot in 1.1 rc3 consume a lot of CPU; it was in the Pending state due to the following message in kubectl describe <istio-pilot pod name> -n=istio-system:
Warning FailedScheduling 1m (x25 over 3m) default-scheduler 0/2 nodes are available:
1 Insufficient cpu, 1 node(s) had taints that the pod didn't tolerate.
I was able to reduce it by passing --set pilot.resources.requests.cpu=30m when installing Istio using Helm.
https://github.com/istio/istio/blob/1.1.0-rc.3/install/kubernetes/helm/istio/charts/pilot/values.yaml#L16
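"Insufficient cpu" means the pod's CPU request does not fit into what is left of the node's allocatable CPU after the requests of the pods already placed there. A rough Python sketch of that check, assuming CPU quantities are given either in millicores ("30m") or in whole or fractional cores ("2"); the node size and the existing requests below are hypothetical.

```python
def cpu_to_millicores(quantity):
    """Convert a Kubernetes CPU quantity such as '30m' or '0.5' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def cpu_request_fits(request, allocatable, existing_requests):
    """True if `request` fits into the CPU not yet requested on the node."""
    used = sum(cpu_to_millicores(q) for q in existing_requests)
    free = cpu_to_millicores(allocatable) - used
    return cpu_to_millicores(request) <= free


# Hypothetical node with 2 allocatable cores, 1900m already requested.
print(cpu_request_fits("500m", "2", ["1900m"]))  # False
print(cpu_request_fits("30m", "2", ["1900m"]))   # True
```

Lowering the request only changes what the scheduler reserves; the pod can still use more CPU at runtime if no limit is set.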