I want to delete a single, specific node from my cluster.
Here is my problem: I have a cluster where only 2 nodes are normally running, but from time to time I need extra nodes for a few minutes. After that, when scaling down, I want to drain and delete only those extra nodes from the cluster.
I do the scaling up/down manually.
Here are the steps I follow:
Create the cluster with 2 nodes.
Scale up the cluster and add 2 more nodes.
Afterwards, delete only those 2 extra nodes, along with the pods running on them.
I tried this command:
eksctl scale nodegroup --cluster=cluster-name --name=name --nodes=4 --nodes-min=1 --nodes-max=4
but it doesn't help: it deletes random nodes, and the manager can also crash.
One option is to use a separate node group for the transient load: use taints/tolerations so that only that load is scheduled on the node group, then drain and delete that particular node group when it is no longer needed, as sketched below.
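A rough sketch of that approach with eksctl and kubectl; the node group name "transient", the taint key "dedicated", and the alpha.eksctl.io/nodegroup-name node label are assumptions to verify against your setup:
eksctl create nodegroup --cluster=cluster-name --name=transient --nodes=2
# taint the new nodes so only workloads that tolerate the taint land there
kubectl taint nodes -l alpha.eksctl.io/nodegroup-name=transient dedicated=transient:NoSchedule
# give the burst workload a matching toleration in its pod spec:
#   tolerations:
#   - key: "dedicated"
#     operator: "Equal"
#     value: "transient"
#     effect: "NoSchedule"
# when the extra capacity is no longer needed, drain and delete just that node group
kubectl drain -l alpha.eksctl.io/nodegroup-name=transient --ignore-daemonsets
eksctl delete nodegroup --cluster=cluster-name --name=transient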
Do you scale the nodes up/down manually? If you are using something like the cluster autoscaler, there are annotations like "cluster-autoscaler.kubernetes.io/safe-to-evict": "false" to protect pods from being evicted during scale-down.
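For example, the annotation goes on the pod itself, typically via the Deployment's pod template; a minimal sketch (the deployment name and image are placeholders):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: manager                # placeholder
spec:
  replicas: 1
  selector:
    matchLabels:
      app: manager
  template:
    metadata:
      labels:
        app: manager
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: "false"   # the autoscaler will not remove a node running this pod
    spec:
      containers:
      - name: manager
        image: myrepo/manager:latest   # placeholder image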
This is somewhat related to this other post, but I have a concrete example that I think justifies this use case.
I am installing a Redis cluster using the bitnami/redis-cluster Helm chart. It runs 6 instances, 3 masters + 3 slaves, deployed over 3 nodes. My idea is to spread these instances over the 3 nodes so that master 1 and slave 1 (for example) never end up on the same node; that way, if one node fails, I still have its replica on one of the other nodes.
What I am currently doing is setting a custom scheduler for these pods and, at the end of the deployment, running a script that manually places each instance on the node I want.
Is there any way to do this simply with nodeSelector, affinity, taints, and so on, directly from the values file? I would like to stay on this chart and not create a YAML for each Redis instance.
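For reference, a sketch of the kind of values-file setting this question is after, assuming the chart exposes the Bitnami common podAntiAffinityPreset key (check the chart's values.yaml for your version):
podAntiAffinityPreset: soft   # spread the 6 redis-cluster pods across the nodes as evenly as possible
# a "hard" preset would leave pods Pending here, since 6 pods cannot all land on distinct nodes when only 3 exist
Note that anti-affinity only spreads the pods across nodes; which instance ends up as the replica of which master is decided by Redis Cluster itself, so a values-file setting alone cannot guarantee that a master and its own replica never share a node.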
I am upgrading a kube-aws cluster from v1.15.5 to the next version, v1.16.8.
Use Case:
I want to keep the same node labels for the master and worker nodes that I was using in v1.15.
When I tried to upgrade the cluster to v1.16, the kubelet's --node-labels option is restricted from using 'node-role' labels.
If I keep the node role label "node-role.kubernetes.io/master", the kubelet fails to start after the upgrade. If I remove the label, kubectl get node shows <none> as the role for the upgraded node.
How do I reproduce?
Before the upgrade I took a backup of the kubelet sysconfig (cp /etc/sysconfig/kubelet /etc/sysconfig/kubelet-bkup) and removed "-role" from it. Once the upgrade completed, I moved the kubelet sysconfig back by replacing the edited file (mv /etc/sysconfig/kubelet-bkup /etc/sysconfig/kubelet). Now I can see the node role as master/worker even after a kubelet service restart.
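In other words, the sequence is roughly (paths as in the question; the final kubelet restart is implied rather than stated):
cp /etc/sysconfig/kubelet /etc/sysconfig/kubelet-bkup   # back up the kubelet sysconfig
# edit the file to drop "-role" from the node labels, run the upgrade, then once it has completed:
mv /etc/sysconfig/kubelet-bkup /etc/sysconfig/kubelet   # put the backed-up config back in place
systemctl restart kubelet                               # pick up the restored configuration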
The problem I'm facing now:
I performed the upgrade on the existing cluster successfully. The cluster runs in AWS under the kube-aws model, so the ASG spins up a new node whenever the Cluster Autoscaler triggers it.
However, the new node fails to join the cluster, since the node label "node-role.kubernetes.io/master" still exists in the code base.
How can I add the node-role label dynamically when the ASG brings up a new node? Any solution would be appreciated.
Note: kubeadm, kubelet, and kubectl are all v1.16.8.
I have sorted out the issue. I wrote a Python program that watches node events: whenever the ASG spins up a new node and it joins the cluster, the node initially has an empty role, and the Python code then adds the appropriate label to the node dynamically.
I have also built a Docker image based on that Python node-labelling script, and it runs as a pod. The pod is deployed into the cluster and does the job of labelling the new nodes.
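For reference, the label the watcher applies is the ordinary role label set from outside the kubelet (which is still permitted in v1.16+); applied by hand to a single node it would look like this (the node name is a placeholder):
kubectl label node ip-10-0-1-23.ec2.internal node-role.kubernetes.io/worker=
# kubectl get nodes now shows "worker" in the ROLES column for that node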
See my solution in this GitHub issue:
https://github.com/kubernetes/kubernetes/issues/91664
The Docker image is publicly available:
https://hub.docker.com/r/shaikjaffer/node-watcher
Thanks,
Jaffer
I apologize for my poor English.
I created a cluster with 1 master node and 1 worker node, and deployed a container (replicas: 4).
Then kubectl get all shows the following (other columns omitted):
NAME             NODE
pod/container1   k8s-worker-1.local
pod/container2   k8s-worker-1.local
pod/container3   k8s-worker-1.local
pod/container4   k8s-worker-1.local
Next, I added 1 more worker node to the cluster, but all the containers stay deployed on worker-1.
Ideally, I would like 2 of the containers to stop and start up on worker-2, like this:
NAME             NODE
pod/container1   k8s-worker-1.local
pod/container2   k8s-worker-1.local
pod/container3   k8s-worker-2.local
pod/container4   k8s-worker-2.local
Do I need to run some command after adding the additional node?
Scheduling only happens when a pod is started; after that, it won't be moved. There are tools out there for deleting (evicting) pods when nodes get too imbalanced, but if you're just starting out I wouldn't go that far for now. If you delete your 4 pods and recreate them (or let the Deployment system recreate them, as is more common in a real situation), they should end up more balanced (though possibly not exactly 2 and 2, since the scheduler isn't exact and spreading pods out is only one of the factors it uses).
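A minimal way to trigger that recreation, assuming the 4 pods are managed by a Deployment (the deployment name below is a placeholder):
kubectl rollout restart deployment container-deployment   # recreate the pods; the scheduler reconsiders both workers
# or delete the pods directly and let the Deployment replace them:
kubectl delete pod container1 container2 container3 container4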
I have 3 nodes in a k8s cluster and I need exactly 2 pods to be scheduled on each node, so I would end up with 3 nodes running 2 pods each (6 replicas).
I found that k8s has a Pod Affinity/Anti-Affinity feature, and that seems to be the correct way of doing this.
My problem is: I want to run 2 pods per node, but I often use kubectl apply to upgrade my Docker image version, and in that case k8s needs to be able to schedule 2 new pods on each node before terminating the old ones. Will the new pods still be scheduled if I use Pod Affinity/Anti-Affinity to allow only 2 pods per node?
How can I do this in my deployment configuration? I cannot get it to work.
I believe this is part of the kubelet's settings, so you would have to look into the kubelet's --max-pods flag, depending on what your cluster configuration is.
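For illustration, that setting looks like this in a kubelet configuration file (it can also be passed as --max-pods on the kubelet command line). Note that the limit counts every pod on the node, including DaemonSet and system pods, so it is a fairly blunt instrument for enforcing "exactly 2 application pods per node":
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
maxPods: 10    # example value; applies to all pods on the node, not just one Deployment's replicas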
The following links could be useful:
https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/#kubelet
and
https://kubernetes.io/docs/tasks/administer-cluster/reconfigure-kubelet/
I'm wondering about the graceful way to reduce the number of nodes in a Kubernetes cluster on GKE.
I have some nodes, each of which runs pods that watch a shared job queue and execute jobs. I also have a script that monitors the length of the job queue and increases the number of instances when the length exceeds a threshold, by executing the gcloud compute instance-groups managed resize command, and that works fine.
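For concreteness, the kind of call that script makes (the instance group name, size, and zone are placeholders):
gcloud compute instance-groups managed resize my-worker-group --size=5 --zone=us-central1-a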
But I don't know a graceful way to reduce the number of instances when the length falls below the threshold.
Is there a good way to stop the pods running on an instance before that instance gets terminated? Or is there any other good practice?
Note:
Each job can take roughly between 30 minutes and 1 hour.
It is acceptable if a job gets executed more than once (in the worst case...).
I think the best approach is, instead of using a plain pod to run your tasks, to use the Kubernetes Job object. That way, when a task is completed, the Job terminates its container. You would only need a small pod that creates Kubernetes Jobs based on the queue.
The more Jobs are created, the more resources are consumed, and the cluster autoscaler will see that it needs to add more nodes. A Job needs to run to completion even if its pod gets terminated; it will be re-scheduled so it can finish.
There is no direct statement in the GKE docs about whether a downsize will happen while a Job is running on a node, but the rule seems to be that if a pod can easily be moved to another node and the node's resources are under-utilized, the autoscaler will drain the node.
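A minimal Job sketch for that pattern (the names and image below are placeholders); the small queue-watching pod would create one of these per task, e.g. with kubectl create -f or through the API:
apiVersion: batch/v1
kind: Job
metadata:
  generateName: queue-task-     # a fresh name per task (requires create, not apply)
spec:
  backoffLimit: 3               # re-run the task if its pod fails or is terminated
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: myrepo/queue-worker:latest   # placeholder worker image
        args: ["--one-task"]                # process a single job from the queue, then exit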
References
https://cloud.google.com/container-engine/docs/cluster-autoscaler
http://kubernetes.io/docs/user-guide/kubectl/kubectl_drain/
http://kubernetes.io/docs/user-guide/jobs/
Before resizing the cluster, let's set the project context in Cloud Shell by running the commands below:
gcloud config set project [PROJECT_ID]
gcloud config set compute/zone [COMPUTE_ZONE]
gcloud config set compute/region [COMPUTE_REGION]
gcloud components update
Note: You can also set the project, compute zone, and region directly on the command below using the --project, --zone, and --region flags.
gcloud container clusters resize [CLUSTER_NAME] --node-pool [POOL_NAME] --num-nodes [NUM_NODES]
Run the above command for each node pool. You can omit the --node-pool flag if you have only one node pool.
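For example (the cluster, pool, and zone names are placeholders), shrinking a single pool to 2 nodes:
gcloud container clusters resize my-cluster --node-pool default-pool --num-nodes 2 --zone us-central1-a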
Reference: https://cloud.google.com/kubernetes-engine/docs/how-to/resizing-a-cluster