flexVolume plugin not working on GKE Windows node

I'm trying to run a simple flexVolume plugin driver on a Windows node to enable connectivity with an external SMB share. I followed the steps listed here:
https://github.com/microsoft/K8s-Storage-Plugins/tree/master/flexvolume/windows
I placed the driver plugin in the path mentioned there, but the plugin is not being picked up by GKE. The error details are below.
Warning FailedMount 8s (x2 over 21s) kubelet, gke-windows-node-pool-e4e7a7bf-f2pc Unable to attach or mount volumes: unmounted volumes=[smb-volume], unattached volumes=[default-token-jf28b smb-volume]: failed to get Plugin from volumeSpec for volume "smb-volume" err=no volume plugin matched
Not sure what I'm missing here. Any help would be great. Thanks in advance.

I just faced a similar issue on an on-prem kubeadm configuration and used Process Monitor to find the location where the kubelet.exe process looks for volume plugins.
As a result, my actual Windows node SMB preparation is:
curl -L https://github.com/microsoft/K8s-Storage-Plugins/releases/download/V0.0.3/flexvolume-windows.zip -o flexvolume-windows.zip
Expand-Archive flexvolume-windows.zip C:\var\lib\kubelet\usr\libexec\kubernetes\kubelet-plugins\volume\exec\
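Once the plugin sits in a directory the kubelet actually scans, a pod references it by the flexVolume driver name. Below is a minimal sketch, assuming the SMB driver from that repo is exposed as microsoft.com/smb.cmd and that credentials live in a hypothetical secret named smb-creds; adjust the names to your environment.

apiVersion: v1
kind: Pod
metadata:
  name: smb-test
spec:
  nodeSelector:
    kubernetes.io/os: windows         # may be beta.kubernetes.io/os on older clusters
  containers:
  - name: app
    image: mcr.microsoft.com/windows/servercore:ltsc2019
    command: ["powershell.exe", "-Command", "Start-Sleep 3600"]
    volumeMounts:
    - name: smb-volume
      mountPath: C:\data
  volumes:
  - name: smb-volume
    flexVolume:
      driver: microsoft.com/smb.cmd   # assumption: matches the plugin's vendor~driver folder name
      secretRef:
        name: smb-creds               # hypothetical secret holding the share username/password
      options:
        source: \\my-smb-server\share # hypothetical UNC path of the SMB share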

Related

Kubernetes issue: runtime network not ready

I am a beginner in Kubernetes and I'm trying to set up my first cluster. My worker node has joined the cluster successfully, but when I run kubectl get nodes it is in NotReady status,
and this message appears when I run
kubectl describe node k8s-node-1
runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
I have run this command to install a Pod network add-on:
kubectl apply -f https://docs.projectcalico.org/v3.14/manifests/calico.yaml
how can I solve this issue?
Adding this answer as community wiki for better visibility. The OP already solved the problem by rebooting the machine.
It is worth remembering that going through all the steps of bootstrapping the cluster and installing all the prerequisites should leave your cluster running successfully. If you had any previous installations, remember to perform kubeadm reset and remove the .kube folder from the home or root directory (see the sketch below).
I'm also linking this GitHub case with the same issue, where people provide solutions for this problem.
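A rough sketch of that clean-up on a kubeadm node before re-initializing; the commands assume a standard kubeadm install and re-use the Calico manifest from the question:

sudo kubeadm reset -f
sudo rm -rf $HOME/.kube /root/.kube
# re-initialize the control plane, copy the new admin kubeconfig, then re-apply the CNI manifest
sudo kubeadm init
mkdir -p $HOME/.kube
sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl apply -f https://docs.projectcalico.org/v3.14/manifests/calico.yaml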

Fluentd DaemonSet container for Papertrail failing to start in Kubernetes cluster

I am trying to set up Fluentd in a Kubernetes cluster to aggregate logs in Papertrail, as per the documentation provided here.
The configuration file is fluentd-daemonset-papertrail.yaml
It basically creates a DaemonSet for the Fluentd container and a ConfigMap for the Fluentd configuration.
When I apply the configuration, the pod is assigned to a node and the container is created. However, it's either not completing its initialization or the pod gets killed immediately after it is started.
As the pods are getting killed, I am losing the logs too, so I couldn't investigate the cause of the issue.
Looking through the events for the kube-system namespace shows the errors below:
Error: failed to start container "fluentd": Error response from daemon: OCI runtime create failed: container_linux.go:338: creating new parent process caused "container_linux.go:1897: running lstat on namespace path \"/proc/75026/ns/ipc\" caused \"lstat /proc/75026/ns/ipc: no such file or directory\"": unknown
Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "9559643bf77e29d270c23bddbb17a9480ff126b0b6be10ba480b558a0733161c" network for pod "fluentd-papertrail-b9t5b": NetworkPlugin kubenet failed to set up pod "fluentd-papertrail-b9t5b_kube-system" network: Error adding container to network: failed to open netns "/proc/111610/ns/net": failed to Statfs "/proc/111610/ns/net": no such file or directory
I am not sure what's causing these errors. I'd appreciate any help to understand and troubleshoot them.
Also, is it possible to look at logs/events that could tell us why a pod is given a terminate signal?
Please ensure that /etc/cni/net.d and its companion /opt/cni/bin both exist and are correctly populated with the CNI configuration files and binaries on all nodes.
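A quick check to run on each node; the exact file names depend on your network plugin, so treat the comments as typical examples rather than required contents:

ls /etc/cni/net.d/   # should list a .conf/.conflist/.json file for your network plugin
ls /opt/cni/bin/     # should list the CNI plugin binaries (e.g. bridge, loopback, host-local)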
With help from the Papertrail support team, I was able to resolve the issue by removing the entry below from the manifest file.
kubernetes.io/cluster-service: "true"
The above annotation seems to have been deprecated (a sketch of the trimmed manifest metadata follows the links below).
Relevant GitHub issues:
https://github.com/fluent/fluentd-kubernetes-daemonset/issues/296
https://github.com/kubernetes/kubernetes/issues/72757
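For reference, a hedged sketch of what the DaemonSet metadata in fluentd-daemonset-papertrail.yaml might look like after the fix; apart from the removed label, the names and image tag are illustrative:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-papertrail
  namespace: kube-system
  labels:
    k8s-app: fluentd-papertrail
    # kubernetes.io/cluster-service: "true"   # removed: this deprecated label was causing the pods to be killed
spec:
  selector:
    matchLabels:
      k8s-app: fluentd-papertrail
  template:
    metadata:
      labels:
        k8s-app: fluentd-papertrail
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1-debian-papertrail   # image name/tag are assumptions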

Failed to create pod sandbox kubernetes cluster

I have a Weave network plugin.
Inside my folder /etc/cni/net.d there is a 10-weave.conf:
{
  "name": "weave",
  "type": "weave-net",
  "hairpinMode": true
}
My Weave pods are running and the DNS pod is also running.
But when I want to run a pod, like a simple nginx which will pull an nginx image,
the pod gets stuck at ContainerCreating, and describing the pod gives me the error: failed create pod sandbox.
When I run journalctl -u kubelet I get this error:
cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Is my network plugin not configured correctly?
I used this command to configure my Weave network:
kubectl apply -f https://git.io/weave-kube-1.6
After this didn't work I also tried this command:
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
I even tried Flannel and that gives me the same error.
The system I am setting Kubernetes up on is a Raspberry Pi.
I am trying to build a Raspberry Pi cluster with 3 nodes and 1 master with Kubernetes.
Does anyone have ideas on this?
Thank you all for responding to my question. I have solved my problem now. For anyone who comes to my question in the future, the solution was as follows.
I cloned my Raspberry Pi images because I wanted a basicConfig.img for when I needed to add a new node to my cluster or when one goes down.
Weave (the network plugin I used) got confused because on every node and the master the OS had the same machine-id. When I deleted the machine-id and created a new one (and rebooted the nodes) my error got fixed. The commands to do this were:
sudo rm /etc/machine-id
sudo rm /var/lib/dbus/machine-id
sudo dbus-uuidgen --ensure=/etc/machine-id
Once again my patience was tested, because my Kubernetes setup was normal and my Raspberry Pi OS was normal. I found this with the help of someone in the Kubernetes community. This again shows how important and great our IT community is. To the people of the future who come to this question: I hope this solution fixes your error and reduces the amount of time you spend searching for a stupidly small thing.
Looking at the pertinent code in Kubernetes and in CNI, the specific error you see seems to indicate that it cannot find any files ending in .json, .conf or .conflist in the directory given.
This makes me think it could be something like the conf file not being present on all the hosts, so I would verify that as a first step.
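A quick way to verify that across all the Pis, assuming SSH access and hypothetical hostnames:

for host in k8s-master k8s-node-1 k8s-node-2 k8s-node-3; do
  echo "== $host =="
  ssh "$host" 'ls /etc/cni/net.d/ 2>/dev/null || echo "no CNI conf found"'
done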

CoreOS v1.6.1 not starting

I am working on setting up a new Kubernetes cluster using the CoreOS documentation. This one uses the CoreOS v1.6.1 image. I am following the documentation from the CoreOS Master setup link. I looked in the journalctl logs and I see that the kube-apiserver seems to exit and restart.
The following is a journalctl log concerning the kube-apiserver:
checking backoff for container "kube-apiserver" in pod "kube-apiserver-10.138.192.31_kube-system(16c7e04edcd7e775efadd4bdcb1940c4)"
Back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-10.138.192.31_kube-system(16c7e04edcd7e775efadd4bdcb1940c4)
Error syncing pod 16c7e04edcd7e775efadd4bdcb1940c4 ("kube-apiserver-10.138.192.31_kube-system(16c7e04edcd7e775efadd4bdcb1940c4)"), skipping: failed to "StartContainer" for "kube-apiserver" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-10.138.192.31_kube-system(16c7e04edcd7e775efadd4bdcb1940c4)"
I am wondering if it's because I need to start the new etcd3 version instead of etcd2? Any hints or suggestions are appreciated.
The following is my cloud-config:
coreos:
  etcd2:
    # generate a new token for each unique cluster from https://discovery.etcd.io/new:
    discovery: https://discovery.etcd.io/33e3f7c20be0b57daac4d14d478841b4
    # multi-region deployments, multi-cloud deployments, and Droplets without
    # private networking need to use $public_ipv4:
    advertise-client-urls: http://$private_ipv4:2379,http://$private_ipv4:4001
    initial-advertise-peer-urls: http://$private_ipv4:2380
    # listen on the official ports 2379, 2380 and one legacy port 4001:
    listen-client-urls: http://0.0.0.0:2379,http://0.0.0.0:4001
    listen-peer-urls: http://$private_ipv4:2380
  fleet:
    public-ip: $private_ipv4 # used for fleetctl ssh command
  units:
    - name: etcd2.service
      command: start
However, I have tried CoreOS v1.5 images and they work fine. It's with the CoreOS v1.6 images that I am not able to get the kube-apiserver running, for some reason.
You are using etcd2, so you need to pass the flag '--storage-backend=etcd2' to kube-apiserver in your manifest.
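A hedged sketch of where that flag goes in the kube-apiserver static pod manifest from the CoreOS guide; the image tag and the other flags are illustrative, keep whatever your manifest already contains:

apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-apiserver
    image: quay.io/coreos/hyperkube:v1.6.1_coreos.0   # illustrative tag
    command:
    - /hyperkube
    - apiserver
    - --etcd-servers=http://127.0.0.1:2379
    - --storage-backend=etcd2        # required while the API server data still lives in etcd2
    # keep the rest of your existing apiserver flags here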
You are using etcd2; I think maybe you can try etcd3.
You said:
I am wondering if it's because I need to start the new etcd3 version instead of the etcd2? Any hints or suggestion is appreciated.
I would recommend that you read this doc to learn how to upgrade etcd.

Could not attach GCE PD, Timeout waiting for mount paths

This is getting out of hand... I have a good GKE setup, yet I'm getting a timeout for mount paths. I posted this issue on GitHub, but they said it would be better to post it on SO. Please help me fix this.
2m 2m 1 {scheduler } Scheduled Successfully assigned mongodb-shard1-master-gp0qa to gke-cluster-1-micro-a0f27b19-node-0p2j
1m 1m 1 {kubelet gke-cluster-1-micro-a0f27b19-node-0p2j} FailedMount Unable to mount volumes for pod "mongodb-shard1-master-gp0qa_default": Could not attach GCE PD "shard1-node1-master". Timeout waiting for mount paths to be created.
1m 1m 1 {kubelet gke-cluster-1-micro-a0f27b19-node-0p2j} FailedSync Error syncing pod, skipping: Could not attach GCE PD "shard1-node1-master". Timeout waiting for mount paths to be created.
This problem has been documented several times, for example here https://github.com/kubernetes/kubernetes/issues/14642. Kubernetes v1.3.0 should have a fix.
As a workaround (in GCP) you can restart your VMs.
Hope this helps!
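For reference, restarting the affected node VM from the CLI looks roughly like this; the instance name is taken from the events above and the zone is a placeholder:

gcloud compute instances reset gke-cluster-1-micro-a0f27b19-node-0p2j --zone us-central1-a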
It's possible that your GCE service account may not be authorized on your project. Try re-adding $YOUR_PROJECT_NUMBER-compute@developer.gserviceaccount.com as "Can edit" on the Permissions page of the Developers Console.
I ran into this recently, and the issue ended up being that the application running inside the Docker container was actually shutting down immediately. This caused GCE to try to restart it, but it would fail when GCE tried to attach the disk (already attached).
So it seems like a bit of a bug in GCE, but don't go down the rabbit hole trying to figure that out; I ended up running things locally and debugging the crash using local volume mounts.
This is an old question, but I'd like to share how I fixed the problem: I manually un-mounted the problematic disks from their host via the Google Cloud console.
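A CLI sketch of the same kind of fix, detaching the disk from the node; the instance and disk names come from the events above, the zone is a placeholder, and you should make sure no pod on that node still needs the disk:

gcloud compute instances detach-disk gke-cluster-1-micro-a0f27b19-node-0p2j --disk shard1-node1-master --zone us-central1-a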