I am trying to upgrade my EKS cluster version and node group version via CDK.
For the EKS cluster version, I bumped the version on the eks.Cluster construct in CDK:
this.cluster = new eks.Cluster(this, 'eks-cluster', {
  vpc: props.vpc,
  clusterName: props.clusterName,
  version: eks.KubernetesVersion.V1_22,
});
This change deployed successfully and I can observe that the cluster version has been updated (v1.22). However, the node group version did not get updated (it is still v1.21).
I was only able to find documentation for upgrading the node group version using eksctl or the AWS console, but those approaches are manual and I would have to repeat them for each node group.
reference doc - https://docs.aws.amazon.com/eks/latest/userguide/update-managed-node-group.html
How can I upgrade my node group version using cdk?
I used releaseVersion in NodegroupProps to specify the AMI release version.
The string for releaseVersion has the form k8s_major_version.k8s_minor_version.k8s_patch_version-release_date according to this doc. The list of AMI versions can be found in the changelogs.
const nodeGroup = new eks.Nodegroup(this, 'myNodeGroup', {
  cluster: this.cluster,
  forceUpdate: false,
  amiType: eks.NodegroupAmiType.AL2_X86_64,
  releaseVersion: '<AMI release version from the changelog>',
  capacityType: eks.CapacityType.ON_DEMAND,
  desiredSize: 5,
});
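Once this deploys, one way to confirm the node group actually moved to the new release is to query it with the AWS CLI (a sketch; the cluster and node group names below are placeholders for your own):

```shell
# Show the Kubernetes version and AMI release version the node group reports.
# "my-cluster" and "myNodeGroup" are placeholder names.
aws eks describe-nodegroup \
  --cluster-name my-cluster \
  --nodegroup-name myNodeGroup \
  --query '{version: nodegroup.version, releaseVersion: nodegroup.releaseVersion}' \
  --output table
```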
Related
We’re providing our own AMI node images for EKS using the self-managed node feature.
The challenge I’m currently having is how to fetch the kubernetes version from within the EKS node as it starts up.
I’ve tried IMDS - which unfortunately doesn’t seem to have it:
root@ip-10-5-158-249:/# curl -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/
ami-id
ami-launch-index
ami-manifest-path
autoscaling/
block-device-mapping/
events/
hostname
iam/
identity-credentials/
instance-action
instance-id
instance-life-cycle
instance-type
local-hostname
local-ipv4
mac
metrics/
network/
placement/
profile
reservation-id
It also doesn’t seem to be passed in by the EKS bootstrap script (install-worker.sh) - it appears AWS bakes a single K8s version into each AMI.
This is different from Azure’s behaviour of baking a bunch of Kubelets into a single VHD.
I’m hoping for something like IMDS or a passed in user-data param which can be used at runtime to symlink kubelet to the correct kubelet version binary.
Assuming you build your AMI on top of the EKS-optimized AMI, one possible approach is to run kubelet --version to capture the K8s version in your custom build, since the EKS AMI is coupled to a control plane version. If you are not using the EKS AMI, you will need to make an aws eks describe-cluster call to get the cluster information in order to join the cluster; the version is provided at cluster.version.
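As a rough sketch of the first approach (assuming the image is built on the EKS-optimized AMI, so kubelet is already on the box):

```shell
# kubelet prints a line like "Kubernetes v1.22.9"; the second field is the version.
# On a node this would be: K8S_VERSION="$(kubelet --version | awk '{print $2}')"
# The awk step, shown here against a sample string:
echo "Kubernetes v1.22.9" | awk '{print $2}'   # -> v1.22.9

# Without an EKS AMI, query the control plane instead (cluster name is a placeholder):
#   aws eks describe-cluster --name my-cluster --query 'cluster.version' --output text
```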
I'm trying to upgrade the VM size of my AKS cluster using this approach with Terraform. Basically I create a new nodepool with the required amount of nodes, then I cordon the old node to disallow scheduling of new pods. Next, I drain the old node to reschedule all the pods in the newly created node. Then, I proceed to upgrade the VM size.
The problem I am facing is that the azurerm_kubernetes_cluster resource allows for the creation of the default_node_pool, and another resource, azurerm_kubernetes_cluster_node_pool, allows me to create new node pools with extra nodes.
Everything works until I create the new node pool and cordon and drain the old one. However, when I change default_node_pool.vm_size and run terraform apply, it tells me that the whole resource has to be recreated, including the new node pool I just created, because it is linked to the cluster id, which will be replaced.
How should I manage this upgrade with Terraform if upgrading the default node pool always forces replacement, even when a new node pool is in place?
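For reference, the cordon/drain steps described above look roughly like this (the node name is made up; list your real ones with kubectl get nodes):

```shell
# Prevent new pods from being scheduled on the old node...
kubectl cordon aks-oldpool-12345678-vmss000000
# ...then evict its pods so they reschedule onto the new node pool.
kubectl drain aks-oldpool-12345678-vmss000000 \
  --ignore-daemonsets --delete-emptydir-data
```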
terraform version
Terraform v1.1.7
on linux_amd64
+ provider registry.terraform.io/hashicorp/azurerm v2.82.0
+ provider registry.terraform.io/hashicorp/local v2.2.2
I have an EKS cluster that has gone through an upgrade from 1.17 to 1.18.
The cluster has 2 node groups (updated using the AWS console).
EKS control plane and one of the node groups upgrades were ok.
The upgrade of the last node group is failing due to a health issue - AsgInstanceLaunchFailures - One or more target groups not found. Validating load balancer configuration failed. - and now the node group is marked as Degraded.
When I look at the update ID, I see the following error:
NodeCreationFailure - Couldn't proceed with upgrade process as new nodes are not joining node group {NODE_GROUP_NAME}
I tried accessing the ASG with that ID and I can see it has several load-balancing target groups attached to it.
I could not find any way to fix this in the AWS docs.
Any advice?
Issue resolved.
It appears an empty target group had been added manually to the cluster (three other target groups were created automatically). Once the empty target group was deleted, the upgrade completed successfully.
I am still unclear as to how EKS chooses the proper target group to update when there is more than one.
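For anyone hitting the same error, the target groups attached to the node group's ASG can be listed with the AWS CLI (the ASG name below is a placeholder); an empty or manually added one is a candidate for removal:

```shell
# List the target groups attached to the node group's Auto Scaling group.
aws autoscaling describe-load-balancer-target-groups \
  --auto-scaling-group-name eks-my-nodegroup-asg \
  --query 'LoadBalancerTargetGroups[].[LoadBalancerTargetGroupARN, State]' \
  --output table
```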
Are you able to launch new nodes that come up in the Ready state and join the cluster? Based on the EKS public docs, the upgrade request succeeds only when the ASG can launch new instances into the Ready state in all the AZs of the node group.
To debug this further, you can trigger a new upgrade request and check the health of new nodes which EKS brings up in your cluster.
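A possible way to watch this during the retried upgrade (cluster and node group names are placeholders):

```shell
# Watch whether replacement nodes register and reach Ready.
kubectl get nodes --watch

# Inspect the health issues EKS records on the node group.
aws eks describe-nodegroup --cluster-name my-cluster \
  --nodegroup-name my-nodegroup --query 'nodegroup.health'
```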
I want to create an EKS cluster on a version that was recently deprecated (1.15) to test something version-specific.
My command below is failing:
eksctl create cluster --name playgroundkubernetes --region us-east-1 --version 1.15 --nodegroup-name standard-workers --node-type t2.medium --managed
Is there a workaround where I can create a cluster in version 1.15?
No, it's not possible to create a brand new EKS cluster with a deprecated version. The only option would be to deploy your own cluster (DIY) with something like kOps or the like.
In addition to mreferre's comment, if you're trying to just create a Kubernetes cluster and don't need it to be in AWS, you could use Kind (https://kind.sigs.k8s.io/docs/user/quick-start/) or similar to create something much more quickly and probably more cheaply.
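For example, kind lets you pin the node image to an old release (assuming a matching kindest/node tag exists on Docker Hub; v1.15.12 is one such tag):

```shell
# Create a local 1.15 cluster; the image tag must match a published kindest/node release.
kind create cluster --name k8s-1-15 --image kindest/node:v1.15.12
kubectl cluster-info --context kind-k8s-1-15
```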
There was recently a Kubernetes security hole that was patched in v1.10.11 (among other versions), so I would like to upgrade to that version. I am currently on v1.10.9. However, when running the command gcloud container get-server-config to get the list of valid node versions, v1.10.11 doesn't show up. Instead, it jumps straight from v1.10.9 to v1.11.2.
Does anyone have any idea why I cannot seem to use the usual gcloud container clusters upgrade [CLUSTER_NAME] --cluster-version [CLUSTER_VERSION] to upgrade to this version?
Thanks in advance!
Based on:
https://cloud.google.com/kubernetes-engine/docs/security-bulletins#december-3-2018
If you have Kubernetes v1.10.9, then to patch this security hole you should upgrade your GKE cluster to 1.10.9-gke.5.
The following Kubernetes versions are now available for new clusters and for opt-in master upgrades for existing clusters:
1.9.7-gke.11,
1.10.6-gke.11,
1.10.7-gke.11,
1.10.9-gke.5,
1.11.2-gke.18
Please check the Scheduled master auto-upgrades option in GKE.
If it is enabled, your cluster masters were auto-upgraded by Google, and the next version offered for upgrade is a later one, v1.11.2, which is what GKE is showing you.
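To see what your cluster's location currently offers and to opt the master into the patched release, something like the following should work (the zone, cluster name, and version are examples):

```shell
# List the master and node versions available in this zone.
gcloud container get-server-config --zone us-central1-a

# Opt-in master upgrade to the patched release.
gcloud container clusters upgrade my-cluster --zone us-central1-a \
  --master --cluster-version 1.10.9-gke.5
```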