Install GPU Driver on autoscaling Node in GKE (Cloud Composer) - kubernetes

I'm running a google cloud composer GKE cluster. I have a default node pool of 3 normal CPU nodes and one nodepool with a GPU node. The GPU nodepool has autoscaling activated.
I want to run a script inside a docker container on that GPU node.
For the GPU operating system I decided to go with cos_containerd instead of ubuntu.
I've followed https://cloud.google.com/kubernetes-engine/docs/how-to/gpus and ran this line:
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded.yaml
The GPU now shows up when I run "kubectl describe" on the GPU node, however my test scripts debug information tells me, that the GPU is not being used.
When I connect to the autoprovisioned GPU node via ssh, I can see, that I still need to run the
cos extensions gpu install
in order to use the GPU.
I now want to make my cloud composer GKE cluster to run "cos-extensions gpu install" whenever a node is being created by the autoscaler feature.
I would like to apply something like this yaml:
#cloud-config
runcmd:
- cos-extensions install gpu
to my cloud composer GKE cluster.
Can i do that with kubectl apply ? Ideally I would like to only run that yaml code onto the GPU node. How can I achieve that?
I'm new to Kubernetes and I've already spent a lot of time on this without success. Any help would be much appreciated.
Best,
Phil
UPDATE:
ok thx to Harsh I realized I have to go via Daemonset + ConfigMap like here:
https://github.com/GoogleCloudPlatform/solutions-gke-init-daemonsets-tutorial
My GPU node has the label
gpu-type=t4
so I've created and kubectl applied this ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
name: phils-init-script
labels:
gpu-type: t4
data:
entrypoint.sh: |
#!/usr/bin/env bash
ROOT_MOUNT_DIR="${ROOT_MOUNT_DIR:-/root}"
chroot "${ROOT_MOUNT_DIR}" cos-extensions gpu install
and here is my DaemonSet (I also kubectl applied this one):
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: phils-cos-extensions-gpu-installer
labels:
gpu-type: t4
spec:
selector:
matchLabels:
gpu-type: t4
updateStrategy:
type: RollingUpdate
template:
metadata:
labels:
name: phils-cos-extensions-gpu-installer
gpu-type: t4
spec:
volumes:
- name: root-mount
hostPath:
path: /
- name: phils-init-script
configMap:
name: phils-init-script
defaultMode: 0744
initContainers:
- image: ubuntu:18.04
name: phils-cos-extensions-gpu-installer
command: ["/scripts/entrypoint.sh"]
env:
- name: ROOT_MOUNT_DIR
value: /root
securityContext:
privileged: true
volumeMounts:
- name: root-mount
mountPath: /root
- name: phils-init-script
mountPath: /scripts
containers:
- image: "gcr.io/google-containers/pause:2.0"
name: pause
but nothing happens, i get the message "Pods are pending".
During the run of the script I connect via ssh to the GPU node and can see that the ConfigMap shell code didn't get applied.
What am I missing here?
I'm desperately trying to make this work.
Best,
Phil
Thanks for all your help so far!

Can i do that with kubectl apply ? Ideally I would like to only run
that yaml code onto the GPU node. How can I achieve that?
Yes, You can run the Deamon set on each node which will run the command on Nodes.
As you are on GKE and Daemon set will also run the command or script on New nodes also which are getting scaled up also.
Daemon set is mainly for running applications or deployment on each available node in the cluster.
We can leverage this deamon set and run the command on each node that exist and is also upcoming.
Example YAML :
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: node-initializer
labels:
app: default-init
spec:
selector:
matchLabels:
app: default-init
updateStrategy:
type: RollingUpdate
template:
metadata:
labels:
name: node-initializer
app: default-init
spec:
volumes:
- name: root-mount
hostPath:
path: /
- name: entrypoint
configMap:
name: entrypoint
defaultMode: 0744
initContainers:
- image: ubuntu:18.04
name: node-initializer
command: ["/scripts/entrypoint.sh"]
env:
- name: ROOT_MOUNT_DIR
value: /root
securityContext:
privileged: true
volumeMounts:
- name: root-mount
mountPath: /root
- name: entrypoint
mountPath: /scripts
containers:
- image: "gcr.io/google-containers/pause:2.0"
name: pause
Github link for example : https://github.com/GoogleCloudPlatform/solutions-gke-init-daemonsets-tutorial
Exact deployment step : https://cloud.google.com/solutions/automatically-bootstrapping-gke-nodes-with-daemonsets#deploying_the_daemonset
Full article : https://cloud.google.com/solutions/automatically-bootstrapping-gke-nodes-with-daemonsets

If you've installed the driver so many times and nvidia-smi is still failing to communicate, take a look into prime-select.
Run prime-select query, this way you are going to get all possible options, it must show at least nvidia | intel.
Select prime-select nvidia.
Then, if you see nvidia is already selected, choose a different one, e.g. prime-select intel. Next, switch back to nvidia prime-select nvidia.
Reboot and check nvidia-smi.
Plus, it could be a good idea to run again:
sudo apt install nvidia-cuda-toolkit
When it finishes, reboot the machine, and nvidia-smi has to work then.
Now, in other cases it works to follow these instructions to install CuDNn and Cuda on VMs cuda_11.2_installation_on_Ubuntu_20.04.
And finally, in some other cases, it is caused by unattended-upgrades. Take a look into the settings and adjust them if it is causing unexpected results. This URL has the documentation for Debian, and I was able to see that you already tested with that distro UnattendedUpgrades.

Related

How can I run a cli app in a pod inside a Kubernetes cluster?

I have a cli app written in NodeJS [not by me].
I want to deploy this on a k8s cluster like I have done many times with web servers.
I have not deployed something like this before, so I am in a kind of a loss.
I have worked with dockerized cli apps [like Terraform] before, and i know how to use them in a CICD.
But how should I deploy them in a pod so they are always available for usage from another app in the cluster?
Or is there a completely different approach that I need to consider?
#EDIT#
I am using this in the end of my Dockerfile ..
# the main executable
ENTRYPOINT ["sleep", "infinity"]
# a default command
CMD ["mycli help"]
That way the pod does not restart and the cli inside is waiting for commands like mycli do this
Is it a hacky way that is frowned upon or a legit solution?
Your edit is one solution, another one if you do not want or cannot change the Docker image is to Define a Command for a Container to loop infinitely, this would achieve the same as the Dockerfile ENTRYPOINT but without having to rebuild the image.
Here's an example of such implementation:
apiVersion: v1
kind: Pod
metadata:
name: command-demo
labels:
purpose: demonstrate-command
spec:
containers:
- name: command-demo-container
image: debian
command: ["/bin/sh", "-ec", "while :; do echo '.'; sleep 5 ; done"]
restartPolicy: OnFailure
As for your question about if this is a legit solution, this is hard to answer; I would say it depends on what your application is designed to do. Kubernetes Pods are designed to be ephemeral, so a good solution would be one that is running until the job is completed; for a web server, for example, the job is never completed because it should be constantly listening to requests.
If your pods are in the same cluster they are already available to other pods through Core-DNS. An internal DNS service which allows you to access them by their internal DNS name. Something like my-cli-app.my-namespace.svc.cluster. DNS for service and pods
You would then create a deployment file with all your apps. Note this doesn't need ports to work and also doesn't include communication through the internet.
#deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.14.2
ports:
- containerPort: 80

Kubernetes in GCP: How a pod can access its parent node to perform some operation e.g. iptables update in node

Scenario is like this:
I have a pod running in a node in K8s cluster in GCP. cluster is created using kops and pod is created using kne_cli.
I know only the name of the pod e.g. "test-pod".
My requirement is to configure something in the node where this pod is running. e.g. I want to update "iptables -t nat" table in node.
how to access the node and configure it from within a pod?
any suggestion will be helpful.
You the Job or deployment or POD, not sure how POD is getting managed. If you just want to run that task Job is good fir for you.
One option is to use SSH way :
You can run one POD inside that you get a list of Nodes or specific node as per need and run SSH command to connect with that node.
That way you will be able to access Node from POD and run commands top of Node.
You can check this document for ref : https://alexei-led.github.io/post/k8s_node_shell/
Option two :
You can mount sh file on Node with IP table command and invoke that shell script from POD to execute which will run the command whenever you want.
Example :
apiVersion: v1
kind: ConfigMap
metadata:
name: command
data:
command.sh: |
#!/bin/bash
echo "running sh script on node..!"
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
name: command
spec:
schedule: "*/1 * * * *"
jobTemplate:
spec:
template:
spec:
serviceAccountName: cron-namespace-admin
containers:
- name: command
image: IMAGE:v1
imagePullPolicy: IfNotPresent
volumeMounts:
- name: commandfile
mountPath: /test/command.sh
subPath: command.sh
- name: script-dir
mountPath: /test
restartPolicy: OnFailure
volumes:
- name: commandfile
configMap:
name: command
defaultMode: 0777
- name: script-dir
hostPath:
path: /var/log/data
type: DirectoryOrCreate
Use privileged mode
securityContext:
privileged: true
Privileged - determines if any container in a pod can enable
privileged mode. By default a container is not allowed to access any
devices on the host, but a "privileged" container is given access to
all devices on the host. This allows the container nearly all the same
access as processes running on the host. This is useful for containers
that want to use linux capabilities like manipulating the network
stack and accessing devices.
Read more : https://kubernetes.io/docs/concepts/security/pod-security-policy/#privileged
You might be better off using GKE and configuring the ip-masq-agent as described here: https://cloud.google.com/kubernetes-engine/docs/how-to/ip-masquerade-agent
In case you stick with kops on GCE, I would suggest following the guide for ip-masq-agent here instead of the GKE docs: https://kubernetes.io/docs/tasks/administer-cluster/ip-masq-agent/
In case you really need to run custom iptables rules on the host then your best option is to create a DaemonSet with pods that are privileged and have hostNetwork: true. That should allow you to modify iptable rules directly on the host from the pod.

Kubectl error upon applying agones fleet: ensure CRDs are installed first

I am using minikube (docker driver) with kubectl to test an agones fleet deployment. Upon running kubectl apply -f lobby-fleet.yml (and when I try to apply any other agones yaml file) I receive the following error:
error: resource mapping not found for name: "lobby" namespace: "" from "lobby-fleet.yml": no matches for kind "Fleet" in version "agones.dev/v1"
ensure CRDs are installed first
lobby-fleet.yml:
apiVersion: "agones.dev/v1"
kind: Fleet
metadata:
name: lobby
spec:
replicas: 2
scheduling: Packed
template:
metadata:
labels:
mode: lobby
spec:
ports:
- name: default
portPolicy: Dynamic
containerPort: 7600
container: lobby
template:
spec:
containers:
- name: lobby
image: gcr.io/agones-images/simple-game-server:0.12 # Modify to correct image
I am running this on WSL2, but receive the same error when using the windows installation of kubectl (through choco). I have minikube installed and running for ubuntu in WSL2 using docker.
I am still new to using k8s, so apologies if the answer to this question is clear, I just couldn't find it elsewhere.
Thanks in advance!
In order to create a resource of kind Fleet, you have to apply the Custom Resource Definition (CRD) that defines what is a Fleet first.
I've looked into the YAML installation instructions of agones, and the manifest contains the CRDs. you can find it by searching kind: CustomResourceDefinition.
I recommend you to first try to install according to the instructions in the docs.

Monitoring Kubernetes PVC disk usage

I'm trying to monitor Kubernetes PVC disk usage. I need the memory that is in use for Persistent Volume Claim. I found the command:
kubectl get --raw / api / v1 / persistentvolumeclaims
Return:
"status":{
"phase":"Bound",
"accessModes":[
"ReadWriteOnce"
],
"capacity":{
"storage":"1Gi"
}
}
But it only brings me the full capacity of the disk, and as I said I need the used one
Does anyone know which command could return this information to me?
I don't have a definitive anwser, but I hope this will help you. Also, I would be interested if someone has a better anwser.
Get current usage
The PersistentVolume subsystem provides an API for users and administrators that abstracts details of how storage is provided from how it is consumed.
-- Persistent Volume | Kubernetes
As stated in the Kubernetes documentation, PV (PersistentVolume) and PVC (PersistentVolumeClaim) are abstractions over storage. As such, I do not think you can inspect PV or PVC, but you can inspect the storage medium.
To get the usage, create a debugging pod which will use your PVC, from which you will check the usage. This should work depending on your storage provider.
# volume-size-debugger.yaml
kind: Pod
apiVersion: v1
metadata:
name: volume-size-debugger
spec:
volumes:
- name: debug-pv
persistentVolumeClaim:
claimName: <pvc-name>
containers:
- name: debugger
image: busybox
command: ["sleep", "3600"]
volumeMounts:
- mountPath: "/data"
name: debug-pv
Apply the above manifest with kubectl apply -f volume-size-debugger.yaml, and run a shell inside it with kubectl exec -it volume-size-debugger sh. Inside the shell run du -sh to get the usage in a human readable format.
Monitoring
As I am sure you have noticed, this is not especially useful for monitoring. It may be useful for a one-time check from time to time, but not for monitoring or low disk space alerts.
One way to setup monitoring would be to have a similar sidecar pod like ours above and gather our metrics from there. One such example seems to be the node_exporter.
Another way would be to use CSI (Container Storage Interface). I have not used CSI and do not know enough about it to really explain more. But here are a couple of related issues and related Kubernetes documentation:
Monitoring Kubernetes PersistentVolumes - prometheus-operator
Volume stats missing - csi-digitalocean
Storage Capacity | Kubernetes
+1 to touchmarine's answer however I'd like to expand it a bit and add also my three cents.
But it only brings me the full capacity of the disk, and as I said I
need the used one
PVC is an abstraction which represents a request for a storage and simply doesn't store such information as disk usage. As a higher level abstraction it doesn't care at all how the underlying storage is used by its consumer.
#touchmarine, Instead of using a Pod whose only function is to sleep and every time you need to check the disk usage you need to attach to it maually, I would propose to use something like this:
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
spec:
replicas: 1
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
volumes:
- name: media
persistentVolumeClaim:
claimName: media
containers:
- name: nginx
image: nginx
ports:
- containerPort: 80
volumeMounts:
- mountPath: "/data"
name: media
- name: busybox
image: busybox
command: ["/bin/sh"]
args: ["-c", "while true; do du -sh /data; sleep 10;done"]
volumeMounts:
- mountPath: "/data"
name: media
It can be of course a single-container busybox Pod as in #touchmarine's example but here I decided to to show also how it can be used as a sidecar running next to nginx container within a single Pod.
As it runs a simple bash script - an infinite while loop, which prints out current disk usage to the standard output it can be read with kubectl logs without a need of using kubectl exec and attaching to the Pod:
$ kubectl logs nginx-deployment-56bb5c87f6-dqs5h busybox
20.0K /data
20.0K /data
20.0K /data
I guess it can be also used more effectively to configure some sort of monitoring of disk usage.

Container not maintaining its state using kubernetes?

I have a service which runs in apache. The container status is showing as completed and restarting. Why container is not maintaining its state as running even though the arguments passed does not have issues?
apiVersion: apps/v1
kind: Deployment
metadata:
name: ***
spec:
selector:
matchLabels:
app: ***
replicas: 1
template:
metadata:
labels:
app: ***
spec:
containers:
- name: ***
image: ****
command: ["/bin/sh", "-c"]
args: ["echo\ sid\ |\ sudo\ -S\ service\ mysql\ start\ &&\ sudo\ service\ apache2\ start"]
volumeMounts:
- mountPath: /var/log/apache2/
name: apache
- mountPath: /var/log/***/
name: ***
imagePullSecrets:
- name: regcred
volumes:
- name: apache
hostPath:
path: "/home/sandeep/logs/apache"
- name: vusmartmaps
hostPath:
path: "/home/sandeep/logs/***"
Soon after executing this arguments it is showing its status as completed and going to a loop. What we can do to maintain it status as running.
Please be advised this is not a good practice.
If you really want this working that way your last process must not end.
For example add sleep 9999 to your container.args
Best options would be splitting those into 2 separate Deployments.
First, would be easy to scale them independently.
Second, image would be smaller for each Deployment.
Third, Kubernetes would have a full control over those Deployments and you could utilize self-healing and rolling-updates.
There is a really good guide and examples on Deploying WordPress and MySQL with Persistent Volumes, which I think would be perfect for you.
But if you prefer to use just one pod then you would need to split you image or using official Docker images and your pod might look like this:
apiVersion: v1
kind: Pod
metadata:
name: app
labels:
app: test
spec:
containers:
- name: mysql
image: mysql:5.6
- name: apache
image: httpd:alpine
ports:
- containerPort: 80
volumeMounts:
- name: apache
mountPath: /var/log/apache2/
volumes:
- name: apache
hostPath:
path: "/home/sandeep/logs/apache"
You would need to expose the pod using Service:
$ kubectl expose pod app --type=NodePort --port=80
service "app" exposed
Checking what port it has:
$ kubectl describe service app
...
NodePort: <unset> 31418/TCP
...
Also you should read Communicate Between Containers in the Same Pod Using a Shared Volume.
You want to start apache and mysql in the same container and keep it running, aren't you?
Well, lets break down why it exits first. Kubernetes, just like Docker, will run whatever command you would give inside the container. If that command finishes, container would stop. echo sid | sudo -S service mysql start && sudo service apache2 start will ask your init process to start both mysql and apache, but the thing is that Kubernetes is not aware of your init inside the container.
In fact, the command statement will become instead of init process with pid 1, overriding whatever default startup command you have in your container image. Whenever process with pid 1 exits, container stops.
Therefore in your case you have to start whatever init system you have in your container.
However we come closer to another problem - Kubernetes already acts as init system. It starts your pods and supervises them. Therefore all you need is to start two containers instead - one for mysql and another one for apache.
For example you could use official dockerhub images from https://hub.docker.com//httpd/ and https://hub.docker.com//mysql. They already come with both services configured to startup correctly, therefore you don't even have to specify command and args in your deployment manifest.
Containers are not tiny VMs. You need two in this case, one running MySQL and another running Apache. Both have standard community images available, which I would probably start with.