Enable PodTolerationRestriction on gcloud k8s cluster

I have a namespace in k8s with the setting:
scheduler.alpha.kubernetes.io/defaultTolerations:
'[{"key": "role_va", "operator": "Exists"}]'
If I am not mistaken, all pods created in this namespace should get this toleration.
But the pods don't get it.
I read this and understood that I must enable the PodTolerationRestriction admission controller.
How can I do this on Google Cloud (GKE)?

To enable PodTolerationRestriction you need to add it to the --enable-admission-plugins flag in the kube-apiserver configuration. According to the official documentation, this plugin is not included in the default list of admission controller plugins.
However, in GKE there is no way to set custom flags on the API server at run time, because the core control plane components of a managed cluster are not exposed to the user (see the related Stack Overflow thread).
Given that, you can consider provisioning your own cluster on GCE VMs instead, bootstrapping it with whichever cluster-building solution you prefer.
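On a self-managed cluster (for example one bootstrapped with kubeadm on a GCE VM), this boils down to adding the plugin to the API server's flag; a rough sketch, where the rest of the plugin list depends on your setup:

kube-apiserver \
  --enable-admission-plugins=NodeRestriction,PodTolerationRestriction \
  ...   # keep all of your other existing kube-apiserver flags as they are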

Related

How to add external GCP loadbalancer to kubespray cluster?

I deployed a kubernetes cluster on Google Cloud using VMs and Kubespray.
Right now, I am trying to expose a simple Node app on an external IP using a load balancer, but assigning my external IP from gcloud to the service does not work. The service stays in a pending state when I query kubectl get services.
According to this, Kubespray does not include or integrate any load balancer mechanism by default. How should I proceed?
Let me start off by summarizing the problem we are trying to solve here.
The problem is that you have a self-hosted Kubernetes cluster and you want to be able to create a service of type=LoadBalancer and have k8s create an LB for you with an external IP, in a fully automated way, just like it would if you used GKE (a Kubernetes-as-a-service solution).
Additionally I have to mention that I don't know much about Kubespray, so I will only describe all the steps that need to be done to make it work, and leave the rest to you. So if you want to make changes in the Kubespray code, that's on you.
I did all my tests on a kubeadm cluster, but it should not be very difficult to apply this to Kubespray.
I will start off by summarizing everything that has to be done in 4 steps:
tagging the instances
enabling cloud-provider functionality
IAM and service accounts
additional info
Tagging the instances
All worker node instances on GCP have to be tagged with a unique tag that is the name of the instance; these tags are later used to create firewall rules and target lists for the LB. So let's say you have an instance called worker-0; you need to tag that instance with the tag worker-0 (see the command below).
Otherwise it will result in an error (that can be found in controller-manager logs):
Error syncing load balancer: failed to ensure load balancer: no node tags supplied and also failed to parse the given lists of hosts for tags. Abort creating firewall rule
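Assuming the instance is called worker-0 and lives in zone us-central1-a (adjust both to your setup), the tagging can be done with:

gcloud compute instances add-tags worker-0 --tags=worker-0 --zone=us-central1-a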
Enabling cloud-provider functionality
K8s has to be informed that it is running in a cloud and which cloud provider it is, so that it knows how to talk to the API.
Otherwise you will find controller manager logs informing you that it won't create an LB:
WARNING: no cloud provider provided, services of type LoadBalancer will fail
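If you are not sure where to look, and the controller manager runs as a static pod (as it does with kubeadm), its logs can usually be pulled with something like the following, where the pod name depends on your master's hostname:

kubectl -n kube-system logs kube-controller-manager-<master-node-name>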
The Controller Manager is responsible for the creation of a LoadBalancer. It can be passed a --cloud-provider flag. You can manually add this flag to the controller manager pod manifest file; or, since you are running Kubespray, you can set this flag somewhere in the Kubespray code (maybe it's already automated and just requires you to set some variable, but you need to find that out yourself).
Here is how this file looks with the flag:
apiVersion: v1
kind: Pod
metadata:
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    ...
    - --cloud-provider=gce # <----- HERE
As you can see, the value in our case is gce, which stands for Google Compute Engine. It informs k8s that it is running on GCE/GCP.
IAM and service accounts
Now that you have your provider enabled and tags covered, let's talk about IAM and permissions.
For k8s to be able to create an LB in GCE, it needs to be allowed to do so. Every GCE instance has a default service account assigned. The Controller Manager uses the instance service account, stored within the instance metadata, to access the GCP API.
For this to happen you need to set Access Scopes for the GCE instance (the master node; the one where the controller manager is running) so that it can use the Compute Engine API.
Access scopes -> Set access for each API -> compute engine=Read Write
To do this the instance has to be stopped, so stop it now. It's better to set these scopes during instance creation so that you don't need to take any extra steps.
You also need to go to the IAM & Admin page in the GCP Console and add permissions so that the master instance's service account has the Kubernetes Engine Service Agent role assigned. This is a predefined role that has far more permissions than you probably need, but I have found that everything works with this role, so I decided to use it for demonstration purposes; you probably want to follow the principle of least privilege.
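For reference, the scope change can also be made from the CLI on an already-created master instance; a sketch, assuming the instance is called master-0 in zone us-central1-a:

# stop the instance first - scopes can only be changed while it is stopped
gcloud compute instances stop master-0 --zone=us-central1-a
# grant read/write access to the Compute Engine API
# (you may also need --service-account=<the instance's current service account email>)
gcloud compute instances set-service-account master-0 --zone=us-central1-a --scopes=compute-rw
gcloud compute instances start master-0 --zone=us-central1-a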
Additional info
There is one more thing I need to mention. It does not affect you, but while testing I found out something interesting.
At first I created a cluster with only one node (a single master node). Even though this is allowed from the k8s point of view, the controller manager would not let me create an LB and point it at the master node where my application was running. The conclusion is that you cannot use an LB with only a master node and have to create at least one worker node.
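Once the steps above are in place, a plain Service of type LoadBalancer should get an external IP automatically; a minimal sketch (name, label and ports are just examples):

apiVersion: v1
kind: Service
metadata:
  name: my-app            # example name
spec:
  type: LoadBalancer
  selector:
    app: my-app           # example label on the node app's pods
  ports:
  - port: 80
    targetPort: 3000      # example container port of the app

After a minute or so, kubectl get services should show an EXTERNAL-IP for it instead of pending.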
PS
I had to figure this out the hard way: by looking at logs, changing things, and looking at logs again to see if the issue got solved. I didn't find a single article or documentation page where all of this is documented in one place. If you manage to solve it for your own setup, write up the answer for others. Thank you.

pass different configmap or env property to pod depending on which node it's running on

I'm trying to configure a pod that can run in several nodes of my kubernetes cluster.
Depending on the node chosen to schedule/run the pod, the configuration needed by the pod is different.
For example, if the pod ends up running on node1, a configuration value (e.g. a username to connect to some outside service) needs to be "user-node1".
If the same pod ends up running on node2, I need that configuration value to be different (e.g. "user-node2").
Ideally, I would have just one cluster/app configuration, where both user-node1 and user-node2 are present somewhere, and when my pod is scheduled to run on node1 or node2 the appropriate configuration is passed to the pod (I'm trying to keep the pod unaware of the fact that there are multiple per-node configurations).
Is there any way to achieve this?
It's quite a specific scenario. You can try to use one of the Admission Controllers - MutatingAdmissionWebhook.
An admission controller is a piece of code that intercepts requests to the Kubernetes API server prior to persistence of the object, but after the request is authenticated and authorized. The controllers consist of the list below, are compiled into the kube-apiserver binary, and may only be configured by the cluster administrator. In that list, there are two special controllers: MutatingAdmissionWebhook and ValidatingAdmissionWebhook. These execute the mutating and validating (respectively) admission control webhooks which are configured in the API.
Admission controllers may be "validating", "mutating", or both. Mutating controllers may modify the objects they admit.
With this option you would need to inject some labels or annotations into the pod and then also reference them in your ConfigMap.
This solution is used for example in Istio and was mentioned in this case.
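A very rough sketch of what registering such a webhook could look like is below; the names, namespace and service are hypothetical, and you would still have to write the webhook service that performs the actual mutation:

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: per-node-config-injector            # hypothetical name
webhooks:
- name: per-node-config.example.com         # hypothetical webhook name
  clientConfig:
    service:
      name: config-injector                 # hypothetical in-cluster webhook service
      namespace: default
      path: /mutate
    caBundle: <base64-encoded-CA-cert>      # CA that signed the webhook's serving certificate
  rules:
  - operations: ["CREATE"]
    apiGroups: [""]
    apiVersions: ["v1"]
    resources: ["pods"]
  admissionReviewVersions: ["v1"]
  sideEffects: None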
Another option is to create some kind of Operator which will monitor events in the namespaces and modify the pods.
This solution would look like the one in this article, but instead of monitoring nodes in a node pool, it would need to monitor pods in namespaces.

GKE and NodeLocal DNSCache

We have a Kubernetes deployment on Google Cloud Platform. Recently we hit one of the well-known issues with kube-dns that happens under a high volume of requests, https://github.com/kubernetes/kubernetes/issues/56903 (it's more related to SNAT/DNAT and conntrack, but the end result is that kube-dns goes out of service).
After a few days of digging into the topic we found that k8s already has a solution, which is currently in alpha (https://kubernetes.io/docs/tasks/administer-cluster/nodelocaldns/).
The solution is to run a caching CoreDNS as a DaemonSet on each k8s node - so far so good.
The problem is that after you create the DaemonSet you have to tell the kubelet to use it via the --cluster-dns option, and we can't find any way to do that in the GKE environment. Google bootstraps the cluster with a "configure-sh" script in the instance metadata. There is an option to edit the instance template and "hardcode" the required values, but that is not really an option: if you upgrade the cluster or use horizontal autoscaling, all of the modified values will be lost.
The last idea was to use a custom startup script that pulls the configuration and updates the metadata server, but this is too complicated a task.
As of 2019/12/10, GKE now supports this through the gcloud CLI in beta:
Kubernetes Engine
Promoted NodeLocalDNS Addon to beta. Use --addons=NodeLocalDNS with gcloud beta container clusters create. This addon can be enabled or disabled on existing clusters using --update-addons=NodeLocalDNS=ENABLED or --update-addons=NodeLocalDNS=DISABLED with gcloud container clusters update.
See https://cloud.google.com/sdk/docs/release-notes#27300_2019-12-10
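In practice, for a cluster named my-cluster (adjust the name and any location flags to your setup), that translates to something like:

# new cluster with the addon enabled
gcloud beta container clusters create my-cluster --addons=NodeLocalDNS
# enable the addon on an existing cluster
gcloud container clusters update my-cluster --update-addons=NodeLocalDNS=ENABLED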
You can spin up another kube-dns deployment, e.g. in a different node pool, and thus have two nameservers in the pods' resolv.conf.
This would mitigate the evictions and other failures and generally allow you to fully control your kube-dns service in the whole cluster.
In addition to what was mentioned in this answer: with beta support on GKE, the NodeLocal caches now listen on the kube-dns service IP, so there is no need for a kubelet flag change.

Use Calico for policy and networking on AWS EKS?

AWS EKS makes use of its own CNI plugin, and there are docs that show you how to install Calico for managing policy. For a number of reasons, I'd like to have Calico manage networking as well.
Based on the installation instructions I can't seem to find a way to do either option:
etcd
Doesn't seem viable as I can't find a way to access the EKS control plane etcd endpoints. If I were to deploy my own etcd pods inside the cluster, I would need to use the AWS CNI plugin for them to get an IP address, so that doesn't work. I could bring my own etcd cluster outside of Kubernetes, but that seems a bit ridiculous.
Kubernetes API datastore
This option requires me to change settings on the controller, which I don't have access to in the AWS EKS managed control plane.
The short answer is that, as of this writing, neither EKS nor GKE gives you direct access to any of the control plane components: etcd, kube-apiserver, kube-controller-manager, coredns/kube-dns, kube-scheduler.
They do have some docs on how to install Calico on an EKS cluster, but if you want more control you'll have to set up your own standalone cluster.
They might allow you access to the master components in the future but the bottom line is that EKS is a 'managed' service where they are supposed to take care of all your control plane components.

Change kubernetes master env variable on GKE

I want to enable Stackdriver logging with my Kubernetes cluster on GKE.
It's stated here: https://kubernetes.io/docs/user-guide/logging/stackdriver/
This article assumes that you have created a Kubernetes cluster with cluster-level logging support for sending logs to Stackdriver Logging. You can do this either by selecting the Enable Stackdriver Logging checkbox in the create cluster dialogue in GKE, or by setting the KUBE_LOGGING_DESTINATION flag to gcp when manually starting a cluster using kube-up.sh.
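For a cluster started manually with kube-up.sh (i.e. not GKE), that would look roughly like:

KUBE_LOGGING_DESTINATION=gcp ./cluster/kube-up.sh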
But my cluster was created without this option enabled.
How do I change the environment variable while my cluster is running?
Unfortunately, logging isn't a setting that can be enabled or disabled on a cluster once it is running. This is something we hope to change in the near future, but in the meantime your best bet is to delete and recreate your cluster (sorry!).