How can I get kubectl to recognize the newly scaled az aks nodepool nodes?

I updated my Azure AKS nodepool size from within the Azure Portal to go from 2 to 4 nodes. When I run az aks nodepool show ..., I see that the count has correctly been updated. However, when I run kubectl get nodes, I still only see the two nodes that previously existed.
According to the Kubernetes documentation on node management,
There are two main ways to have Nodes added to the API server:
The kubelet on a node self-registers to the control plane
You, or another human user, manually add a Node object
(Emphasis mine)
My expectation, therefore, is that having scaled up my node pool, these new nodes should automatically register, and kubectl get nodes should just pick them up, but this appears to not be the case.
Now that my nodepool has more nodes, how do I get my AKS cluster to recognize and utilize them? Once kubectl get nodes shows them, will applying an updated manifest (with more replicas) be all I need to do to use the additional hardware?

It's difficult to tell without access to your setup, but here is what you can check:
Check that the control plane hasn't been automatically upgraded to a version that is incompatible with the kubelet version in your node pool when it registers with the cluster. (Best if the versions match.)
Connect (ssh) to the nodes that are not registering and check the kubelet logs to see why it is not starting, e.g. systemctl status kubelet.
Check that you can reach the IP address and port (e.g. 8443) your kube-apiserver is listening on from the nodes that are not registering, e.g. curl <ip-address>:8443. (A quick diagnostic sketch follows this list.)
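A minimal diagnostic sketch, assuming you can SSH to one of the new nodes; the resource group, cluster name, node IP and port below are placeholders for your own values:

# compare the control plane version with the node kubelet versions
az aks show --resource-group <resource-group> --name <cluster-name> --query kubernetesVersion
kubectl get nodes -o wide
# on a node that is not registering
sudo systemctl status kubelet
sudo journalctl -u kubelet --no-pager | tail -n 50
# verify the node can reach the kube-apiserver
curl -k https://<api-server-ip>:<port>/healthz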
Possible solutions:
Upgrade the VM image of your node pool to one that is compatible with the control plane (as sketched below).
Remove any firewall rule that prevents your nodes from reaching the kube-apiserver.
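As a sketch of the node image upgrade, with placeholder resource group, cluster and node pool names:

# refresh the node pool's VM image without changing the Kubernetes version
az aks nodepool upgrade --resource-group <resource-group> --cluster-name <cluster-name> --name <nodepool-name> --node-image-only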
will applying an updated manifest (with more replicas) be all I need to do to use the additional hardware?
Should work.
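For example (the deployment and file names below are placeholders):

kubectl scale deployment my-app --replicas=8
# or bump replicas in the manifest and re-apply it
kubectl apply -f my-app.yaml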
✌️

Related

How to deregister a kubernetes node from a kubernetes cluster

I have a node mistakenly registered on a cluster B while it is actually serving for cluster A.
Here 'registered on a cluster B' means I can see the node from kubectl get node from cluster B.
I want to deregister this node from cluster B, but keep the node intact.
I know regular process to delete a node is:
kubectl drain node xxx
kubectl delete node xxx
# on node
kubeadm reset
But I do not want pods on the node from cluster A to be deleted or transferred. And I want to make sure the node will not self-register to cluster B afterwards.
To be clear, let's say, cluster A has Pod A on the node, cluster B has Pod B on the node as well, I want to delete node from cluster B, but keep Pod A intact. (By the way, can I see Pod A from cluster B?)
Thank you in advance!
To deregister the node without removing any pods, run the command below:
kubectl delete node nodename
After this is done, the node will no longer appear in kubectl get nodes.
To stop the node from self-registering again, stop the kubelet process on that node by logging into it and running the command below.
systemctl stop kubelet
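If the machine might reboot, a small extra step (a sketch; the unit name could differ if you run a dedicated kubelet per cluster) keeps that kubelet from coming back:

# prevent the stopped kubelet from starting again on boot
systemctl disable kubelet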
As this case has already been clarified, I decided to publish a Community Wiki answer based on the following comment:
@mario nvm, I thought different clusters in one node affect each
other, actually they do not, they just share container runtime which
is more like 'read-only', and they have different kubelets of
themselves listening on different port. – Li Ziyan Aug 17 at 5:29
to make it clear to other users what the actual issue was here and how it was resolved or clarified.
So if you design your infrastructure in such a way that one physical (or virtual) machine serves as a Node for more than one kubernetes cluster (which I believe is not a very common case), the infrastructure looks as follows:
Components that are shared:
physical (or virtual) node
common container runtime environment (e.g. docker)
Components that are separate:
two separate kubelets. Although they run on the same physical/virtual node, they are configured to listen on different ports (as sketched below) and are registered with two different masters (or more specifically, with two different kube-apiservers that are part of two different kubernetes control planes)
two logically separate, independent kubernetes Nodes which, although backed by the same physical host, are part of two completely different kubernetes clusters that don't interfere with each other in any way
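A rough sketch of what the two kubelet invocations on such a node might look like (the paths and file names are illustrative assumptions, not a tested setup):

# kubelet registered with cluster A
kubelet --kubeconfig=/etc/kubernetes-a/kubelet.conf --config=/etc/kubernetes-a/kubelet-config.yaml --root-dir=/var/lib/kubelet-a
# kubelet registered with cluster B, with a different port set in its config file
kubelet --kubeconfig=/etc/kubernetes-b/kubelet.conf --config=/etc/kubernetes-b/kubelet-config.yaml --root-dir=/var/lib/kubelet-b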
I hope this clears up possible confusion about this question and helps anyone with similar doubts.

Can I make sure that a "cordoned" node is the one deleted when I downsize the cluster in Azure AKS?

I need to downsize a cluster from 3 to 2 nodes.
I have critical pods running on some nodes (0 and 1). As I found that the last node (2) in the cluster is the one that has the non-critical pods, I have "cordoned" it so it won't get any new ones.
What I wonder is whether I can make sure that this last node (2) is the one that will be removed when I go to the Azure portal and downsize my cluster to 2 nodes (it is the last node and it is cordoned).
I have read that if I manually delete the node, the system will still consider there are 3 nodes running so it's important to use the cluster management to downsize it.
You cannot control which node will be removed when scaling down the AKS cluster.
However, there are some workarounds for that:
Delete the cordoned node manually via the portal and then launch an upgrade. It would try to add the node back, but with no success because the subnet has no space left.
Another option is to:
Set up cluster autoscaler with two nodes
Scale up the number of nodes in the UI
Drain the node you want to delete and wait for the autoscaler to do its job (see the sketch below)
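A sketch of the second workaround, with placeholder resource, cluster and node names:

# enable the cluster autoscaler (the bounds are placeholders)
az aks update --resource-group <resource-group> --name <cluster-name> --enable-cluster-autoscaler --min-count 2 --max-count 3
# drain the node you want removed; once it is empty the autoscaler can scale it away
kubectl drain aks-nodepool1-12345678-2 --ignore-daemonsets --delete-local-data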
Here are some sources and useful info:
Scale the node count in an Azure Kubernetes Service (AKS) cluster
Support selection of nodes to remove when scaling down
az aks scale
Please let me know if that helped.

Update API Server pods when there is more than 1 master

I have installed a kubernetes cluster of 10 nodes (2 masters, 3 etcds, 5 minions) using Kubespray. Presently my cluster supports token-based authentication. I want to add Basic-Auth capability as well.
I couldn't find any higher-level resource than Pod in the kube-system namespace. So I tried manually updating the pod: I added known_users.csv in the specified location, updated the kube-apiserver.manifest file on one of the master nodes, and tried updating the pod using kubectl apply, which resulted in that master node going offline.
Is there a way to update this config after deploying the cluster, as I don't want to re-spin the whole cluster just to enable this?
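A sketch of the kind of change involved, assuming an older Kubernetes release where the kube-apiserver still had the (since removed) --basic-auth-file flag, and assuming typical static pod manifest paths; the kubelet watches the manifest directory and recreates the static pod on its own, so no kubectl apply is needed:

# on each master node (paths are illustrative)
sudo cp known_users.csv /etc/kubernetes/users/known_users.csv
sudo vi /etc/kubernetes/manifests/kube-apiserver.manifest
# add this flag to the kube-apiserver command section:
#   --basic-auth-file=/etc/kubernetes/users/known_users.csv
# make sure that path sits inside a hostPath volume already mounted into the pod,
# then wait for the kubelet to restart the kube-apiserver static pod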

Redistribute pods after adding a node in Kubernetes

What should I do with pods after adding a node to the Kubernetes cluster?
I mean, ideally I want some of them to be stopped and started on the newly added node. Do I have to manually pick some for stopping and hope that they'll be scheduled for restarting on the newly added node?
I don't care about affinity, just semi-even distribution.
Maybe there's a way to always have the number of pods be equal to the number of nodes?
For the sake of having an example:
I'm using juju to provision a small Kubernetes cluster on AWS. One master and two workers. This is just a playground.
My application is apache serving PHP and static files. So I have a deployment, a service of type NodePort and an ingress using nginx-ingress-controller.
I've turned off one of the worker instances and my application pods were recreated on the one that remained working.
I then started the instance back, master picked it up and started nginx ingress controller there. But when I tried deleting my application pods, they were recreated on the instance that kept running, and not on the one that was restarted.
Not sure if it's important, but I don't have any DNS setup. I just added the IP of one of the instances to /etc/hosts with the host value from my ingress.
The descheduler, a kubernetes incubator project, could be helpful. The following is from its introduction:
As Kubernetes clusters are very dynamic and their state changes over time, it may be desirable to move already running pods to some other nodes for various reasons:
Some nodes are under or over utilized.
The original scheduling decision does not hold true any more, as taints or labels are added to or removed from nodes, pod/node affinity requirements are not satisfied any more.
Some nodes failed and their pods moved to other nodes.
New nodes are added to clusters.
Redistribution in Kubernetes happens gradually after you add a new node: the scheduler will prefer the new node when placing new pods, so the spread evens out over time. You can force the redistribution of individual pods by deleting them while a host-based anti-affinity policy is in place.
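For instance (a sketch with a placeholder label):

# app=my-app is a placeholder label for your deployment's pods
kubectl delete pod -l app=my-app
# check how the replacements are spread across nodes
kubectl get pods -o wide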
What are your reasons for a manually triggered redistribution?

DaemonSets on Google Container Engine (Kubernetes)

I have a Google Container Engine cluster with 21 nodes, there is one pod in particular that I need to always be running on a node with a static IP address (for outbound purposes).
Kubernetes supports DaemonSets
This is a way to have a pod deployed to a specific node (or to a set of nodes) by giving the node a label that matches the nodeSelector in the DaemonSet. You can then assign a static IP to the VM instance that the labeled node is on. However, GKE doesn't appear to support the DaemonSet kind.
$ kubectl create -f go-daemonset.json
error validating "go-daemonset.json": error validating data: the server could not find the requested resource; if you choose to ignore these errors, turn validation off with --validate=false
$ kubectl create -f go-daemonset.json --validate=false
unable to recognize "go-daemonset.json": no kind named "DaemonSet" is registered in versions ["" "v1"]
When will this functionality be supported and what are the workarounds?
If you only want to run the pod on a single node, you actually don't want to use a DaemonSet. DaemonSets are designed for running a pod on every node, not a single specific node.
To run a pod on a specific node, you can use a nodeSelector in the pod specification, as documented in the Node Selection example in the docs.
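A sketch of that approach with made-up names; the node gets a label and the pod selects it via nodeSelector:

# label the node that has the static IP (node name is a placeholder)
kubectl label nodes gke-mycluster-node-1 static-egress=true
# create a pod pinned to that label
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: outbound-pod
spec:
  nodeSelector:
    static-egress: "true"
  containers:
  - name: app
    image: nginx
EOF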
edit: But for anyone reading this who does want to run something on every node in GKE, there are two things I can say:
First, DaemonSet will be enabled in GKE in version 1.2, which is planned for March. It isn't enabled in GKE in version 1.1 because it wasn't considered stable enough at the time 1.1 was cut.
Second, if you want to run something on every node before 1.2 is out, we recommend creating a replication controller with a number of replicas greater than your number of nodes and asking for a hostPort in the container spec. The hostPort will ensure that no more than one pod from the RC will be run per node.
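A sketch of that workaround (the name, image and replica count are placeholders); the hostPort prevents two of these pods from being scheduled onto the same node:

cat <<EOF | kubectl create -f -
apiVersion: v1
kind: ReplicationController
metadata:
  name: every-node-rc
spec:
  replicas: 25
  selector:
    app: every-node
  template:
    metadata:
      labels:
        app: every-node
    spec:
      containers:
      - name: agent
        image: busybox
        command: ["sleep", "3600"]
        ports:
        - containerPort: 8080
          hostPort: 8080
EOF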
DaemonSets are still an alpha feature, and Google Container Engine supports only production Kubernetes features. Workaround: build your own Kubernetes cluster (GCE, AWS, bare metal, ...) and enable alpha/beta features.
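As a very rough sketch of that last point, on a self-managed cluster of that era the DaemonSet API had to be switched on through the apiserver's --runtime-config flag (the exact group/version depends on the release, so treat this as illustrative):

# add to the kube-apiserver flags, then restart it:
#   --runtime-config=extensions/v1beta1/daemonsets=true
# afterwards the original command should validate:
kubectl create -f go-daemonset.json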