I have a Google Container Engine cluster with 21 nodes. There is one pod in particular that I need to always be running on a node with a static IP address (for outbound purposes).
Kubernetes supports DaemonSets.
This is a way to deploy a pod to a specific node (or set of nodes) by giving the node a label that matches the nodeSelector in the DaemonSet. You can then assign a static IP to the VM instance that the labeled node is on. However, GKE doesn't appear to support the DaemonSet kind.
$ kubectl create -f go-daemonset.json
error validating "go-daemonset.json": error validating data: the server could not find the requested resource; if you choose to ignore these errors, turn validation off with --validate=false
$ kubectl create -f go-daemonset.json --validate=false
unable to recognize "go-daemonset.json": no kind named "DaemonSet" is registered in versions ["" "v1"]
When will this functionality be supported and what are the workarounds?
If you only want to run the pod on a single node, you actually don't want to use a DaemonSet. DaemonSets are designed for running a pod on every node, not a single specific node.
To run a pod on a specific node, you can use a nodeSelector in the pod specification, as documented in the Node Selection example in the docs.
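As a minimal sketch of the nodeSelector approach, the snippet below builds a Pod manifest as JSON (the same format as the go-daemonset.json file in the question). The pod name, image, and the `static-ip` label key are hypothetical placeholders, not anything from the original setup.

```python
import json


def pod_with_node_selector(name, image, node_labels):
    """Build a v1 Pod manifest that only schedules onto nodes carrying node_labels."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # The scheduler only places the pod on nodes whose labels match all of these.
            "nodeSelector": dict(node_labels),
            "containers": [{"name": name, "image": image}],
        },
    }


# Hypothetical names. The target node must be labeled first, for example:
#   kubectl label nodes <node-name> static-ip=true
manifest = pod_with_node_selector(
    "outbound-proxy", "example/outbound-proxy:1.0", {"static-ip": "true"}
)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and passed to kubectl create -f. In practice you would likely put the same nodeSelector into the pod template of a replication controller so the pod is recreated if it dies.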
edit: But for anyone reading this that does want to run something on every node in GKE, there are two things I can say:
First, DaemonSet will be enabled in GKE in version 1.2, which is planned for March. It isn't enabled in GKE in version 1.1 because it wasn't considered stable enough at the time 1.1 was cut.
Second, if you want to run something on every node before 1.2 is out, we recommend creating a replication controller with a number of replicas greater than your number of nodes and asking for a hostPort in the container spec. The hostPort will ensure that no more than one pod from the RC will be run per node.
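The replication controller workaround can be sketched roughly as follows. The controller name, image, and port are hypothetical, and the number of surplus replicas is an arbitrary choice for illustration.

```python
import json


def per_node_rc(name, image, node_count, host_port, extra_replicas=1):
    """Build a ReplicationController that asks for more replicas than nodes.

    Because each pod claims host_port on its node, at most one pod from this
    controller can be scheduled per node; the surplus replicas stay Pending.
    """
    labels = {"app": name}
    return {
        "apiVersion": "v1",
        "kind": "ReplicationController",
        "metadata": {"name": name},
        "spec": {
            "replicas": node_count + extra_replicas,
            "selector": labels,
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [
                                {"containerPort": host_port, "hostPort": host_port}
                            ],
                        }
                    ]
                },
            },
        },
    }


# Hypothetical values: a 21-node cluster and an agent listening on port 9100.
rc = per_node_rc("node-agent", "example/node-agent:1.0", node_count=21, host_port=9100)
print(json.dumps(rc, indent=2))
```

One consequence of this design is that the surplus pods will show up as unschedulable, which is expected; they get placed automatically when new nodes join.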
DaemonSets are still an alpha feature, and Google Container Engine supports only production Kubernetes features. Workaround: build your own Kubernetes cluster (GCE, AWS, bare metal, ...) and enable alpha/beta features.
Related
I updated my Azure AKS nodepool size from within the Azure Portal to go from 2 to 4 nodes. When I run az aks nodepool show ..., I see that the count has correctly been updated. However, when I run kubectl get nodes, I still only see the two nodes that previously existed.
According to the Kubernetes documentation on node management,
There are two main ways to have Nodes added to the API server:
The kubelet on a node self-registers to the control plane
You, or another human user, manually add a Node object
(Emphasis mine)
My expectation, therefore, is that having scaled up my node pool, these new nodes should automatically register, and kubectl get nodes should just pick them up, but this appears to not be the case.
Now that my nodepool has more nodes, how do I get my AKS cluster to recognize and utilize them? Once kubectl get nodes shows them, will applying an updated manifest (with more replicas) be all I need to do to use the additional hardware?
It's difficult to say without access to your setup, but here are some things you can check:
Check that the control plane hasn't been automatically upgraded to a new version that is incompatible with the kubelet version in your nodepool when it registers with the cluster. (Best if the versions match)
Connect to the nodes that are not registering (over SSH) and check the logs to see why the kubelet is not starting, e.g. systemctl status kubelet.
Check that, from the nodes that are not registering, you can connect to the IP address and port (e.g. 8443) where your kube-apiserver is listening, e.g. curl <ip-address>:8443
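If curl is not available on the node, the connectivity check above can be approximated with a few lines of Python. This is only a sketch: it tests that a TCP connection can be opened and says nothing about TLS or authentication. The port 8443 is the example from above and may differ in your cluster.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example usage on a node that is not registering (replace the placeholder):
#   can_reach("<ip-address>", 8443)
```

A False result points towards a firewall or routing problem rather than a kubelet configuration problem.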
Possible solutions:
Upgrade the VM image of your node pool to one compatible with the control plane.
Remove any firewall rule that prevents your nodes from reaching the kube-apiserver.
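To make the version compatibility check concrete, here is a small sketch that compares the control plane version with a kubelet version. It assumes the kubelet must not be newer than the kube-apiserver, and the default of two allowed minor versions of skew is an assumption; the permitted skew depends on your Kubernetes release, so check the version skew policy for your version.

```python
import re


def major_minor(version):
    """Extract (major, minor) from strings such as 'v1.27.3' or '1.27.3'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def kubelet_compatible(apiserver_version, kubelet_version, max_minor_skew=2):
    """Return True if the kubelet is not newer than the API server and is at
    most max_minor_skew minor versions older (max_minor_skew is an assumption)."""
    api_major, api_minor = major_minor(apiserver_version)
    kubelet_major, kubelet_minor = major_minor(kubelet_version)
    if api_major != kubelet_major:
        return False
    return 0 <= api_minor - kubelet_minor <= max_minor_skew


print(kubelet_compatible("v1.27.3", "v1.26.6"))
```

The two version strings can be read from kubectl version and from the VERSION column of kubectl get nodes.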
will applying an updated manifest (with more replicas) be all I need to do to use the additional hardware?
Should work.
✌️
We have a deployment of Kubernetes in Google Cloud Platform. Recently we hit one of the well-known kube-dns issues that occurs under a high volume of requests: https://github.com/kubernetes/kubernetes/issues/56903 (it's more related to SNAT/DNAT and conntrack, but the end result is that kube-dns goes out of service).
After a few days of digging into the topic, we found that k8s already has a solution, which is currently in alpha (https://kubernetes.io/docs/tasks/administer-cluster/nodelocaldns/).
The solution is to run a caching CoreDNS as a DaemonSet on each k8s node. So far so good.
The problem is that after you create the DaemonSet, you have to tell the kubelet to use it with the --cluster-dns option, and we can't find any way to do that in the GKE environment. Google bootstraps the cluster with a "configure-sh" script in the instance metadata. There is an option to edit the instance template and "hardcode" the required values, but that is not viable: if you upgrade the cluster or use horizontal autoscaling, all of the modified values will be lost.
The last idea was to use a custom startup script that pulls the configuration and updates the metadata server, but that is too complicated a task.
As of 2019/12/10, GKE supports this through the gcloud CLI, in beta:
Kubernetes Engine
Promoted NodeLocalDNS Addon to beta. Use --addons=NodeLocalDNS with gcloud beta container clusters create. This addon can be enabled or disabled on existing clusters using --update-addons=NodeLocalDNS=ENABLED or --update-addons=NodeLocalDNS=DISABLED with gcloud container clusters update.
See https://cloud.google.com/sdk/docs/release-notes#27300_2019-12-10
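For scripting the toggle on an existing cluster, the command quoted from the release notes can be assembled as below. This is a sketch only: `my-cluster` is a hypothetical cluster name, and a real invocation may also need location flags that are not shown here.

```python
import shlex


def node_local_dns_update_cmd(cluster, enabled=True):
    """Build the argv for toggling the NodeLocalDNS addon on an existing cluster,
    using the --update-addons flag quoted in the release notes."""
    state = "ENABLED" if enabled else "DISABLED"
    return [
        "gcloud",
        "container",
        "clusters",
        "update",
        cluster,
        f"--update-addons=NodeLocalDNS={state}",
    ]


# Hypothetical cluster name; print the command instead of running it.
print(shlex.join(node_local_dns_update_cmd("my-cluster")))
```

Passing the returned list to subprocess.run would execute it, which is left out here so the sketch has no side effects.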
You can spin up another kube-dns deployment, e.g. in a different node pool, so that pods have two nameservers in their resolv.conf.
This would mitigate the evictions and other failures and generally allow you to completely control your kube-dns service in the whole cluster.
In addition to what was mentioned in this answer: with beta support on GKE, the nodelocal caches now listen on the kube-dns service IP, so there is no need for a kubelet flag change.
I'm trying to deploy a scalable MariaDB Galera cluster in Kubernetes or Docker Swarm. Since each pod or container needs its own Galera config, how should I create my deployment so that I can scale it without any manual work? I don't think we can use a ConfigMap, because a 10-node cluster would need 10 ConfigMaps!
Example MariaDB Galera config for one node:
wsrep_cluster_address="gcomm://ip_1,ip_2,ip_3"
wsrep_node_address="ip_1"
wsrep_node_name="node_1"
wsrep_cluster_name="mariadb-cluster"
For applications like this, which need a different config for each node, what is the best way to deploy?
Note: I can create the pods/containers and do the configuration myself (joining new nodes to the cluster), but I don't think that is the right way; I need it to scale automatically.
You almost definitely want to use a StatefulSet to deploy this in Kubernetes. Among other things, this has the property that each Pod will get its own PersistentVolumeClaim for storage, and that the names of individual Pods are predictable and sequential. You should create a matching headless Service and then each Pod will have a matching DNS name.
That solves a couple of parts of the riddle:
# You pick this
wsrep_cluster_name="mariadb-cluster"
# You know what all of these DNS names will be up front
wsrep_cluster_address="gcomm://galera-0.galera.default.svc.cluster.local,...,galera-9.galera.default.svc.cluster.local"
For wsrep_node_name, the MariaDB documentation indicates that it defaults to the host name. In Kubernetes, the host name defaults to the pod name, and the pod name is one of the sequential galera-n for pods managed by a StatefulSet, so you don't need to manually set this.
wsrep_node_address is trickier. Here the documentation indicates that there are heuristics to guess it (with a specific caveat that it might not be reliable for containers). You can't know an individual pod's IP address before it's created. You can in principle use the downward API to inject a pod's IP address into an environment variable. I'd start by hoping the heuristics would guess the pod IP address and this works well enough (it is what the headless Service would ultimately resolve to).
That leaves you with the block above in the ConfigMap, and it's global across all of the replicas. The other remaining per-Galera-node values should be automatically guessable.
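Since the pod DNS names follow a fixed pattern, the shared wsrep_cluster_address value can be generated rather than typed out. The sketch below assumes a StatefulSet and a headless Service both named `galera` in the `default` namespace with the `cluster.local` cluster domain, as in the example above. The `POD_IP` environment variable name is a hypothetical choice for the downward API fallback.

```python
def galera_peer_hosts(replicas, statefulset="galera", service="galera",
                      namespace="default", cluster_domain="cluster.local"):
    """DNS names of the StatefulSet pods behind a headless Service.

    Pod i of a StatefulSet is named <statefulset>-<i>, counting from 0.
    """
    return [
        f"{statefulset}-{i}.{service}.{namespace}.svc.{cluster_domain}"
        for i in range(replicas)
    ]


def wsrep_cluster_address(replicas, **names):
    """Build the gcomm:// address listing every expected Galera peer."""
    return "gcomm://" + ",".join(galera_peer_hosts(replicas, **names))


# Downward API entry for the container's "env" list, in case the
# wsrep_node_address heuristics do not pick the pod IP on their own.
POD_IP_ENV = {
    "name": "POD_IP",  # hypothetical variable name
    "valueFrom": {"fieldRef": {"fieldPath": "status.podIP"}},
}

print(wsrep_cluster_address(3))
```

Listing peers that do not exist yet relies on Galera tolerating unreachable entries in the address list, so it is worth confirming that behavior before sizing the list for the maximum replica count you expect.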
I have installed a Kubernetes cluster of 10 nodes (2 masters, 3 etcd nodes, 5 minions) using Kubespray. At present my cluster supports token-based authentication. I want to add Basic Auth capability as well.
I couldn't find any higher-level resource than Pod in the kube-system namespace, so I tried manually updating the pod. I added known_users.csv in the specified location, updated the kube-apiserver.manifest file on one of the master nodes, and tried updating the pod using kubectl apply, which resulted in that master node going offline.
Is there a way to update this config after deploying the cluster? I don't want to re-spin the whole cluster just to enable this.
Related question
What should I do with pods after adding a node to the Kubernetes cluster?
I mean, ideally I want some of them to be stopped and started on the newly added node. Do I have to manually pick some for stopping and hope that they'll be scheduled for restarting on the newly added node?
I don't care about affinity, just semi-even distribution.
Maybe there's a way to always have the number of pods be equal to the number of nodes?
For the sake of having an example:
I'm using juju to provision a small Kubernetes cluster on AWS: one master and two workers. This is just a playground.
My application is Apache serving PHP and static files. So I have a deployment, a service of type NodePort and an ingress using nginx-ingress-controller.
I've turned off one of the worker instances and my application pods were recreated on the one that remained working.
I then started the instance again; the master picked it up and started the nginx ingress controller there. But when I tried deleting my application pods, they were recreated on the instance that had kept running, not on the one that was restarted.
Not sure if it's important, but I don't have any DNS set up. I just added the IP of one of the instances to /etc/hosts with the host value from my ingress.
descheduler, a Kubernetes incubator project, could be helpful. The following is from its introduction:
As Kubernetes clusters are very dynamic and their state changes over time, there may be a desire to move already running pods to some other nodes for various reasons:
Some nodes are under or over utilized.
The original scheduling decision does not hold true any more, as taints or labels are added to or removed from nodes, pod/node affinity requirements are not satisfied any more.
Some nodes failed and their pods moved to other nodes.
New nodes are added to clusters.
There is no automatic redistribution of running pods in Kubernetes when you add a new node. You can force a redistribution of single pods by deleting them and having a host-based anti-affinity policy in place. Otherwise Kubernetes will prefer using the new node for scheduling and thus achieve a redistribution over time.
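A host-based anti-affinity policy of the kind mentioned above could look roughly like this. The sketch builds the `affinity` section that would go under the pod template's `spec` in a Deployment; the `app` label key and its value are hypothetical and must match the labels on your own pods.

```python
import json


def spread_across_nodes(app_label, weight=100):
    """Build a soft pod anti-affinity rule that discourages placing two pods
    with the same app label on the same node (topology key: hostname)."""
    return {
        "podAntiAffinity": {
            # "preferred" is a soft rule: the scheduler tries to honor it but
            # will still co-locate pods if no other node is available.
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {
                    "weight": weight,
                    "podAffinityTerm": {
                        "labelSelector": {"matchLabels": {"app": app_label}},
                        "topologyKey": "kubernetes.io/hostname",
                    },
                }
            ]
        }
    }


# Hypothetical label value; place the result at spec.template.spec.affinity.
affinity = spread_across_nodes("php-app")
print(json.dumps(affinity, indent=2))
```

With this in place, deleting a pod gives the scheduler a reason to prefer the node that has no matching pods yet. A soft rule is used here so that pods can still start when only one node is healthy.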
What are your reasons for a manual triggered redistribution?