We’re providing our own AMI node images for EKS using the self-managed node feature.
The challenge I’m currently having is how to fetch the kubernetes version from within the EKS node as it starts up.
I’ve tried IMDS - which unfortunately doesn’t seem to have it:
root#ip-10-5-158-249:/# curl -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/
ami-id
ami-launch-index
ami-manifest-path
autoscaling/
block-device-mapping/
events/
hostname
iam/
identity-credentials/
instance-action
instance-id
instance-life-cycle
instance-type
local-hostname
local-ipv4
mac
metrics/
network/
placement/
profile
reservation-id
It also doesn’t seem to be passed in by the EKS bootstrap script - seems AWS is baking a single K8s version into each AMI. (install-worker.sh).
This is different from Azure’s behaviour of baking a bunch of Kubelets into a single VHD.
I’m hoping for something like IMDS or a passed in user-data param which can be used at runtime to symlink kubelet to the correct kubelet version binary.
Assumed you build your AMI base on the EKS optimized AMI; one of the possible way is use kubelet --version to capture the K8s version in your custom built; as you knew EKS AMI is coupled with the control plane version. If you are not using EKS AMI, you will need to make aws eks describe-cluster call to get cluster information in order to join the cluster; which the version is provided at cluster.version.
Related
I am trying to reproduce an issue that requires me to use containerd v1.4.4 for my container-runtime and kubernetes v1.19.8. When I try to use minikube to create a multi-node cluster locally, it allows me to specify the kubernetes version but I am unable to specify the containerd version(i.e. it always uses v1.4.9) and based on this github discussion, it doesn't seem to support it. I then turned to kind but was unable to find a way to specify the same from the documentation. Is there a way either in kind or in minikube to specify the containerd version?
I ended up using kubeadm and set up a master and worker node using 2 VMs. This allowed me to specify the versions I want on the worker node. Building a base image on kind should also work as user Mikolaj.S mentioned
I've a EKS setup (v1.16) with 2 ASG: one for compute ("c5.9xlarge") and the other gpu ("p3.2xlarge").
Both are configured as Spot and set with desiredCapacity 0.
K8S CA works as expected and scale out each ASG when necessary, the issue is that the newly created gpu instance is not recognized by the master and running kubectl get nodes emits nothing.
I can see that the ec2 instance was in Running state and also I could ssh the machine.
I double checked the the labels and tags and compared them to the "compute".
Both are configured almost similarly, the only difference is that the gpu nodegroup has few additional tags.
Since I'm using eksctl tool (v.0.35.0) and the compute nodeGroup vs. gpu nodeGroup is basically copy&paste, I can't figured out what could be the problem.
UPDATE:
ssh the instance I could see the following error (/var/log/messages)
failed to run Kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
and the kubelet service crashed.
would it possible the my GPU uses wrong AMI (amazon-eks-gpu-node-1.18-v20201211)?
As a simple you can use this preBootstrapCommands in eksctl yaml config file:
- name: test-node-group
preBootstrapCommands:
- "sed -i 's/cgroupDriver:.*/cgroupDriver: cgroupfs/' /etc/eksctl/kubelet.yaml"
There is some issue with EKS 1.16, even the graviton processors machine won't join the cluster. To fix it first you try upgrading your CNI version. Please refer the documentation here:
https://docs.aws.amazon.com/eks/latest/userguide/cni-upgrades.html
And if that doesn't work, then upgrade your EKS version to the latest available version then should work.
I've found out the issue. It seems to be mis-alignment between eksctl (v0.35.0) and the AL2-GPU AMI.
AWS team change the control group in docker to be "systemd" instead of "cgroup" (github) while the eksctl tool I used didn't absorb the changes.
A temporary solution is to edit the /etc/eksctl/kubelet.yaml file using preBootstrapCommands
How to get Kubernetes cluster name from K8s API mentions that
curl http://metadata/computeMetadata/v1/instance/attributes/cluster-name -H "Metadata-Flavor: Google"
(from within the cluster), or
kubectl run curl --rm --restart=Never -it --image=appropriate/curl -- -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/attributes/cluster-name
(from outside the cluster), can be used to retrieve the cluster name. That works.
Is there a way to perform the same programmatically using the k8s client-go library? Maybe using the RESTClient()? I've tried but kept getting the server could not find the requested resource.
UPDATE
What I'm trying to do is to get the cluster-name from an app that runs either in a local computer or within a k8s cluster. the k8s client-go allows to initialise the clientset via in cluster or out of cluster authentication.
With the two commands mentioned at the top that is achievable. I was wondering if there was a way from the client-go library to achieve the same, instead of having to do kubectl or curl depending on where the service is run from.
The data that you're looking for (name of the cluster) is available at GCP level. The name itself is a resource within GKE, not Kubernetes. This means that this specific information is not available using the client-go.
So in order to get this data, you can use the Google Cloud Client Libraries for Go, designed to interact with GCP.
As a starting point, you can consult this document.
First you have to download the container package:
➜ go get google.golang.org/api/container/v1
Before you will launch you code you will have authenticate to fetch the data:
Google has a very good document how to achieve that.
Basically you have generate a ServiceAccount key and pass it in GOOGLE_APPLICATION_CREDENTIALS environment:
➜ export GOOGLE_APPLICATION_CREDENTIALS=sakey.json
Regarding the information that you want, you can fetch the cluster information (including name) following this example.
Once you do do this you can launch your application like this:
➜ go run main.go -project <google_project_name> -zone us-central1-a
And the result would be information about your cluster:
Cluster "tom" (RUNNING) master_version: v1.14.10-gke.17 -> Pool "default-pool" (RUNNING) machineType=n1-standard-2 node_version=v1.14.10-gke.17 autoscaling=false%
Also it is worth mentioning that if you run this command:
curl http://metadata/computeMetadata/v1/instance/attributes/cluster-name -H "Metadata-Flavor: Google"
You are also interacting with the GCP APIs and can go unauthenticated as long as it's run within a GCE machine/GKE cluster. This provided automatic authentication.
You can read more about it under google`s Storing and retrieving instance metadata document.
Finally, one great advantage of doing this with the Cloud Client Libraries, is that it can be launched externally (as long as it's authenticated) or internally within pods in a deployment.
Let me know if it helps.
If you're running inside GKE, you can get the cluster name through the instance attributes: https://pkg.go.dev/cloud.google.com/go/compute/metadata#InstanceAttributeValue
More specifically, the following should give you the cluster name:
metadata.InstanceAttributeValue("cluster-name")
The example shared by Thomas lists all the clusters in your project, which may not be very helpful if you just want to query the name of the GKE cluster hosting your pod.
There was recently a Kubernetes security hole that was patched in v1.10.11 (among other versions), so I would like to upgrade to that version. I am currently on v1.10.9. However, when running the command gcloud container get-server-config to get the list of valid node versions, v1.10.11 doesn't show up. Instead, it jumps straight from v1.10.9 to v1.11.2.
Does anyone have any idea why I cannot seem to use the usual gcloud container clusters upgrade [CLUSTER_NAME] --cluster-version [CLUSTER_VERSION] to upgrade to this version?
Thanks in advance!
Based on:
https://cloud.google.com/kubernetes-engine/docs/security-bulletins#december-3-2018
If you have Kubernetes in v1.10.9 you should (to patch this security hole) update your GKE Cluster to 1.10.9-gke.5.
The following Kubernetes versions are now available for new clusters and for opt-in master upgrades for existing clusters:
1.9.7-gke.11,
1.10.6-gke.11,
1.10.7-gke.11,
1.10.9-gke.5,
1.11.2-gke.18
Please validate your Scheduled master auto-upgrades option in GKE.
If it's enabled your cluster masters were auto-upgraded by Google and the next possible version to update is further version so v1.11.2, what is showing by GKE for you.
I am working with Kube-Aws by coreos to generate a cloud formation script and deploy it as part of my stack,
I would like to upgrade my kubernetes cluster to a newer version.
I don't mind creating a new cluster, but what I do mind is recreating all the deployments/services etc...
Is there any way to take the configuration and replace/transfer them to the new cluster? maybe copy the entire etcd data? will that help?
Use kubectl get --export=true on all the resources that you want to move into a new cluster and then restore them that way.
kubectl get <pods,services,deployments,whatever> --export=true --all-namespaces=true