I'm getting the following error when trying to modify the instance types of the worker/master nodes of my k8s cluster.
error reading InstanceGroup "nodes": InstanceGroup.kops.k8s.io "nodes" not found
I run the following:
kops edit ig nodes --name ${NAME}
error reading InstanceGroup "nodes": InstanceGroup.kops.k8s.io "nodes" not found
Am I missing something here?
$ kops get instancegroups --name ${NAME}
NAME                ROLE    MACHINETYPE  MIN  MAX  ZONES
master-us-east-2a   Master  t3.medium    1    1    us-east-2a
nodes-us-east-2a    Node    t3.medium    1    1    us-east-2a
nodes-us-east-2b    Node    t3.medium    1    1    us-east-2b
This works.
Maybe kOps has changed recently so that all nodes are no longer grouped under a single instance group called "nodes", as they used to be?
kOps did indeed change recently in that new clusters are provisioned with one instance group per availability zone (AZ) instead of having one node IG that spans all AZs.
So in your case, you want to edit both nodes-us-east-2a and nodes-us-east-2b.
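For example, using the same command shape you ran, just with the per-AZ IG names from the kops get output above:
kops edit ig nodes-us-east-2a --name ${NAME}
kops edit ig nodes-us-east-2b --name ${NAME}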
As a bonus comment, I really recommend that you both:
- use kops get -o yaml to dump the specs and put them under version control, and
- template your IG specs so that they stay consistent.
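For example, a minimal sketch of dumping the specs for version control (assuming ${NAME} is your cluster name, as above):
kops get cluster --name ${NAME} -o yaml > cluster.yaml
kops get instancegroups --name ${NAME} -o yaml > instancegroups.yaml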
I'm provisioning EKS with managed nodes through Terraform. No issues there, it's all working fine.
My problem is that I want to add a label to one of my nodes to use as a nodeSelector in one of my deployments. I have an app that is backed by an EBS persistent volume which obviously is only available in a single AZ, so I want my pod to schedule there.
I can add a label pretty easily with:
kubectl label nodes <my node> <key>=<value>
And this actually works fine, until you do something like upgrade the node group to the next version. The labels don't persist, which makes sense since they are not managed by Amazon.
Is there a way, either through Terraform or something else, to set these labels and make them persist? I notice that the EKS provider for Terraform has a labels option, but it seems like that will add the label to all nodes in the node group, and that's not what I want. I've looked around, but can't find anything.
You may not need to add a label to a specific node to solve your problem. Amazon as a cloud provider adds some Kubernetes labels to each node in a managed node group. Example:
labels:
failure-domain.beta.kubernetes.io/region: us-east-1
failure-domain.beta.kubernetes.io/zone: us-east-1a
kubernetes.io/hostname: ip-10-10-10-10.ec2.internal...
kubernetes.io/os: linux
topology.ebs.csi.aws.com/zone: us-east-1a
topology.kubernetes.io/region: us-east-1
topology.kubernetes.io/zone: us-east-1a
The exact labels available to you will depend on the version of Kubernetes you are running. Try running kubectl get nodes -o json | jq '.items[].metadata.labels' to see the labels set on each node in your cluster.
I recommend using topology.kubernetes.io/zone to match the availability zone containing your EBS volume. According to the Kubernetes documentation, both nodes and persistent volumes should have this label populated by the cloud provider.
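For example, if your EBS volume lives in us-east-1a (assumed here just for illustration), a nodeSelector in the Deployment's pod template could look like this:
spec:
  template:
    spec:
      nodeSelector:
        topology.kubernetes.io/zone: us-east-1a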
Hope this helps. Let me know if you still have questions.
You can easily achieve that with Terraform:
resource "aws_eks_node_group" "example" {
...
labels = {
label_key = "label_value"
}
}
Add a second node group (with the desired node info) and label that node group.
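A hedged sketch of what that second node group could look like; the referenced cluster, subnet, and IAM role resources are assumptions about your existing Terraform configuration:
resource "aws_eks_node_group" "single_az" {
  cluster_name    = aws_eks_cluster.example.name   # assumed existing EKS cluster resource
  node_group_name = "single-az"
  node_role_arn   = aws_iam_role.node.arn          # assumed existing node IAM role
  subnet_ids      = [aws_subnet.us_east_1a.id]     # one subnet = one AZ, matching the EBS volume

  scaling_config {
    desired_size = 1
    min_size     = 1
    max_size     = 2
  }

  labels = {
    workload = "ebs-app"   # match this key/value in your nodeSelector
  }
}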
While creating a cluster, kops gives us a set of arguments to configure the images to be used for the master instances and the node instances, as mentioned in the kops documentation for the create cluster command: https://github.com/kubernetes/kops/blob/master/docs/cli/kops_create_cluster.md
--image string          Set image for all instances.
--master-image string   Set image for masters. Takes precedence over --image
--node-image string     Set image for nodes. Takes precedence over --image
Suppose I forgot to add these parameters when I created the cluster; how can I edit the cluster and update these things?
When running kops edit cluster, the cluster configuration opens up as YAML, but where should I add these things in there?
Is there a complete kops cluster YAML that I can refer to in order to modify my cluster?
You would need to edit the instance group after the cluster is created to add/edit the image name.
kops get ig
kops edit ig <ig-name>
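For reference, the image is set under spec in the InstanceGroup manifest that kops edit ig opens; a minimal sketch, where the image and machineType values are just placeholders:
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes
spec:
  image: <image-name-or-AMI-id>
  machineType: t3.medium
  role: Node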
After the edit is done for all masters and nodes, perform
kops update cluster <cluster-name>
kops update cluster <cluster-name> --yes
and then perform a rolling update, or restart/stop one instance at a time from the cloud console
kops rolling-update cluster <cluster-name>
kops rolling-update cluster <cluster-name> --yes
In another terminal, run kops validate cluster <cluster-name> to validate the cluster.
There are other flags we can use as well while performing the rolling update.
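For example, the --instance-group flag limits the roll to a single instance group:
kops rolling-update cluster <cluster-name> --instance-group nodes --yes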
There are other parameters as well that you can add, update, or edit in the instance group - take a look at the documentation for more information.
Found a solution for this question. My intention was to update a huge number of instance groups in one shot for a cluster; editing each instance group one by one is a lot of work.
run kops get <cluster name> -o yaml > cluster.yaml
edit it there, then run kops replace -f cluster.yaml
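Putting it together, the batch workflow looks roughly like this (same commands as above; <cluster name> is a placeholder):
kops get <cluster name> -o yaml > cluster.yaml
# edit the machineType / image / size fields for every InstanceGroup in cluster.yaml
kops replace -f cluster.yaml
kops update cluster <cluster name> --yes
kops rolling-update cluster <cluster name> --yes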
Once we increase the load using a JMeter client, my deployed service is interrupted, and on the GCP/GKE console it says:
Upgrading cluster master
The values shown below are going to change soon.
And my kubectl client throws this error during the upgrade:
Unable to connect to the server: dial tcp 35.236.238.66:443: connectex: No connection could be made because the target machine actively refused it.
How can I stop this upgrade or prevent my service interruption? If the service gets interrupted, there is no benefit to this autoscaling. I am new to GKE, so please let me know if I am missing any configuration or parameter here.
I am using this command to create my cluster:
gcloud container clusters create ajeet-gke --zone us-east4-b --node-locations us-east4-b --machine-type n1-standard-8 --num-nodes 1 --enable-autoscaling --min-nodes 4 --max-nodes 16
It is not upgrading the k8s version, because it works fine with a smaller load; but as I increase the load, the cluster starts an upgrade of the master. So it looks like the master is resizing itself to accommodate more nodes. After the upgrade I can see more nodes in the GCP console. https://github.com/terraform-providers/terraform-provider-google/issues/3385
The command below says autoscaling is not enabled on the instance group.
> gcloud compute instance-groups managed list
NAME                                   AUTOSCALED  LOCATION    SCOPE  ---
ajeet-gke-cluster-default-pool-4***0   no          us-east4-b  zone   ---
Workaround
Sorry, I forgot to update it here; I found a workaround to fix it - after splitting the cluster creation command into two steps, the cluster autoscales without restarting the master node:
gcloud container clusters create ajeet-ggs --zone us-east4-b --node-locations us-east4-b --machine-type n1-standard-8 --num-nodes 1
gcloud container clusters update ajeet-ggs --enable-autoscaling --min-nodes 1 --max-nodes 10 --zone us-east4-b --node-pool default-pool
To prevent this, you should always create your cluster with the cluster version pinned to the latest available version.
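For example (a sketch; the exact versions available depend on your zone, which you can list first):
gcloud container get-server-config --zone us-east4-b
gcloud container clusters create ajeet-gke --zone us-east4-b --cluster-version <version-from-the-list-above> ...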
See the documentation: https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-architecture#master
This means that Google is managing the master: if your master is not up to date, it will be upgraded to the latest version, which allows Google to limit the number of versions it currently has to manage. https://cloud.google.com/kubernetes-engine/docs/concepts/regional-clusters
Now, why do you have an interruption of service during the update? Because you are in zonal mode with only one master. To prevent this, you should use regional cluster mode with more than one master, allowing for a clean rolling update.
The master won't resize the node pool unless the autoscaling feature is enabled on it.
As mentioned in the above answer, this is a feature at the node-pool level. Looking at the description of the issue, it does seem like autoscaling is enabled on your node pool, and eventually GKE's cluster autoscaler automatically resizes the cluster based on the demands of the workloads you want to run (i.e. when there are pods that cannot be scheduled due to resource shortages such as CPU).
Additionally, Kubernetes cluster autoscaling does not use the Managed Instance Group autoscaler. It runs a cluster-autoscaler controller on the Kubernetes master that uses Kubernetes-specific signals to scale your nodes.
It is therefore highly recommended not to use Compute Engine's autoscaling feature on instance groups created by Kubernetes Engine (or to rely on the autoscaling status shown by the MIG).
I created a cluster with kops in AWS.
sudo kops create cluster --name=k8s.ehh.fun --state=s3://kops-state-ehh000 --zones=us-east-1a --node-count=3 --node-size=t2.micro --master-size=t2.micro --dns-zone=k8s.ehh.fun
And now I would like to change the node count without destroying the cluster. How can I do that?
I tried :
sudo kops update cluster --name=k8s.ehh.fun --state=s3://kops-state-ehh000 --node-count=3 --node-size=t2.micro
But I got: Error: unknown flag: --node-count
You can change the node count by editing the nodes instance group:
kops edit instancegroup nodes
This will open an editor in which you can edit your instance group's specification and increase the node count. After saving and exiting, call:
kops update cluster <cluster-name> --yes
This will automatically update your auto-scaling group and start additional instances (or terminate them if you decreased the node count).
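For reference, the relevant fields in the editor look roughly like this (a minimal sketch of the kops InstanceGroup spec; the values are placeholders based on your create command):
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes
spec:
  machineType: t2.micro
  maxSize: 5
  minSize: 5
  role: Node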
See the documentation for more information.
I created a cluster using kops. It worked fine and the cluster is healthy. I can see my nodes using kubectl and have created some deployments and services. I tried adding a node using "kops edit ig nodes" and got an error "cluster not found". Now I get that error for all kops commands:
kops validate cluster
Using cluster from kubectl context: <clustername>
cluster "<clustername>" not found
So my question is: where does kops look for clusters, and how do I configure it to see my cluster?
My KOPS_STATE_STORE environment variable got messed up. I corrected it to be the correct s3 bucket and everything is fine.
export KOPS_STATE_STORE=s3://correctbucketname
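You can verify the fix afterwards by listing the clusters in the state store:
kops get clusters
# should now list <clustername> instead of returning "cluster not found"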
kubectl and kops access the configuration file from the following location.
When the cluster is created, the configuration is saved into the user's
$HOME/.kube/config
I have attached the link for further insight; for instance, if you have another config file you can export it: kube-config
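A minimal sketch of pointing kubectl at a different config file (the path below is just a hypothetical example):
export KUBECONFIG=$HOME/other-cluster.config
kubectl config current-context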