Master - Worker Node communication in Kubernetes

I have 4 worker nodes and 1 master in a Kubernetes cluster. I created a DaemonSet from the master, and it starts its pods on all the worker nodes. I have a script running in the background on each worker that monitors a git repository and checks whether it needs to be pulled; if so, it pulls the new changes locally. A pod only reads the local files once, when it starts, and then keeps using that configuration. I want to somehow restart the pod on that worker so that it picks up the new changes.
Is there any way we can notify the master about the new changes so that the master can restart the pod? Or
can the master keep track of the git repo, send the new changes to that worker, and restart the pod?
Is there any other way of achieving this functionality?

Setting up a CronJob on a master node and creating a persistent volume that is shared between the pods might help. This way, whatever happens on the master side would be passed on to the pods, and the pods would be able to read the configuration files from the volume.
You can find an example of a CronJob in the Kubernetes documentation.
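As a rough sketch of that idea (using the official Kubernetes Python client; the image, schedule, namespace, and the shared-repo PVC name below are placeholders, and CronJob is assumed to be available in batch/v1), a CronJob could pull the repository into a ReadWriteMany volume once a minute, and the DaemonSet pods would mount the same claim:

    from kubernetes import client, config

    config.load_kube_config()  # or config.load_incluster_config() when run from a pod

    # Container that pulls the repo into the shared volume; image and path are placeholders.
    git_sync = client.V1Container(
        name="git-sync",
        image="alpine/git",
        args=["-C", "/shared/repo", "pull"],
        volume_mounts=[client.V1VolumeMount(name="shared", mount_path="/shared")],
    )

    cron = client.V1CronJob(
        metadata=client.V1ObjectMeta(name="repo-sync"),
        spec=client.V1CronJobSpec(
            schedule="*/1 * * * *",  # pull once a minute
            job_template=client.V1JobTemplateSpec(
                spec=client.V1JobSpec(
                    template=client.V1PodTemplateSpec(
                        spec=client.V1PodSpec(
                            restart_policy="OnFailure",
                            containers=[git_sync],
                            volumes=[
                                client.V1Volume(
                                    name="shared",
                                    persistent_volume_claim=client.V1PersistentVolumeClaimVolumeSource(
                                        claim_name="shared-repo"  # hypothetical ReadWriteMany PVC
                                    ),
                                )
                            ],
                        )
                    )
                )
            ),
        ),
    )

    client.BatchV1Api().create_namespaced_cron_job(namespace="default", body=cron)

The DaemonSet pods would then mount the same shared-repo claim and read the files the job keeps pulling.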
It would be better to follow the containerization philosophy and set up CI/CD tooling (for example Jenkins, Bamboo, or TeamCity) to automate this. These tools have built-in functions that would perform the tasks you need.

Related

Kubernetes cluster recovery after Linux host reboot

We are still in the design phase of moving away from a monolithic architecture towards microservices with Docker and Kubernetes. We have done some basic research on Docker and Kubernetes and have a basic understanding. We still have a couple of open questions, considering we will be creating a K8s cluster with multiple Linux hosts (for various reasons we cannot consider the cloud right now).
Consider a scenario where we have a K8s cluster spanning multiple Linux hosts (5+).
1) If one of the Linux worker nodes crashes and we bring it back up, is having the kubelet enabled in advance as a systemd service (via systemctl) sufficient to bring up the required K8s components so that the node is detected by the master again?
2) I believe that once a worker node crashes (running X pods), the master will, after the pod eviction timeout, reschedule those X pods onto other healthy node(s). Once the node is back UP it will not run those X pods again, since the master has already scheduled them onto other nodes, but it will be ready to accept new work from the master.
Is this correct?
Yes, that should be the default behavior; check your cluster deployment tool.
Yes, Kubernetes handles these things automatically for Deployments. For StatefulSets (with local volumes) and DaemonSets things can be node-specific, and Kubernetes will wait for the node to come back.
It is better to create a test environment and see/test the failure scenarios yourself.
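As a rough sketch of such a test (using the Kubernetes Python client and a local kubeconfig), you can watch node readiness while you reboot a worker and observe when the master marks it NotReady and when it rejoins:

    import time

    from kubernetes import client, config

    config.load_kube_config()  # talk to the cluster with your local kubeconfig
    v1 = client.CoreV1Api()

    # Poll node readiness while a worker is rebooted.
    while True:
        for node in v1.list_node().items:
            ready = next(
                (c.status for c in node.status.conditions if c.type == "Ready"), "Unknown"
            )
            print(f"{node.metadata.name}: Ready={ready}")
        print("---")
        time.sleep(10)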

Why does Kubernetes taint the master node with "NoSchedule" by default?

A few days ago, I looked up why none of my pods are being scheduled to the master node, and found this question: Allow scheduling of pods on Kubernetes master?
It says this is because the master node is tainted with the "NoSchedule" effect, and gives the command to remove that taint.
But before I execute that command on my cluster, I want to understand why it was there in the first place.
Is there a reason why the master node should not run pods? Are there any best practices this relates to?
The purpose of Kubernetes is to deploy applications easily and scale them based on demand. The pod is the basic entity that runs the application, and the number of pods can be increased or decreased based on high and low demand respectively (Horizontal Pod Autoscaler).
These workload pods need to run on worker nodes, especially if you're looking at a big application where your cluster might scale up to hundreds of nodes based on demand (Cluster Autoscaler). The growing number of pods can put pressure on your nodes, and once they do, you can always add worker nodes to the cluster using the cluster autoscaler.
Suppose you made your master schedulable: the high memory and CPU pressure would put the master at risk of crashing, and mind you, you can't autoscale the master using the autoscaler. This way you're putting your whole cluster at risk. If you have a single master, you will not be able to schedule anything if the master crashes. If you have 3 masters and one of them crashes, the other two masters have to take on the extra load of scheduling and managing worker nodes, which increases the load on them and hence the risk of failure.
Also, in the case of a larger cluster, you already need master nodes with high resources just to manage your worker nodes. In that case you can't put additional load on the master nodes to run workloads as well. Please have a look at setting up large clusters in Kubernetes here.
If you have a manageable workload and you know it will not grow beyond a certain level, you can make the master schedulable. However, for a production cluster it is not recommended at all.
The primary role of the master is cluster management, and many K8s components already run on the master. If pods were scheduled on the master without resource limits and consumed all the resources (CPU or memory), then the master, and in turn the whole cluster, would be at risk.
So when designing a highly available production cluster, a minimum of 3 master, 3 etcd, and 3 infra nodes are created, and application pods are not scheduled on these nodes. Separate worker nodes are added to carry the workload.
The master is intended for cluster management tasks and should not be used to run workloads. In development and test environments it is OK to schedule pods on master servers, but in production it is better to keep the master only for cluster-level management activities. Use worker nodes to schedule workloads.
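To see the taint the question is asking about, a short sketch with the Kubernetes Python client lists each node's taints (on a kubeadm-style cluster the control-plane node typically carries node-role.kubernetes.io/master:NoSchedule, or node-role.kubernetes.io/control-plane:NoSchedule on newer releases):

    from kubernetes import client, config

    config.load_kube_config()
    v1 = client.CoreV1Api()

    # Print every node's taints; the master/control-plane node should show a
    # NoSchedule taint, which is why ordinary pods are not scheduled there.
    for node in v1.list_node().items:
        for taint in node.spec.taints or []:
            print(f"{node.metadata.name}: {taint.key}={taint.value}:{taint.effect}")

Control-plane components still run on the master because they are static pods or carry tolerations for that taint, rather than relying on the taint being removed.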

Handling Data Consistency In Pods in GKE

I have my data stored in GitHub in JSON format. My pods clone this repo and use the data, and whenever an update is made to the data, a git hook fires and the expectation is that my pods update to the latest data (by doing a git pull). I have exposed this update service via a load balancer and configured it in the git hook.
However, when the git hook fires, only one of the pods gets the request and does a git pull. Is there a way to notify all the pods under that service to update their local store?
To achieve that, I looked for some kind of shared storage that can be mounted in all the containers running in the Kubernetes cluster, e.g. Google Cloud Filestore, the equivalent of AWS EFS. Whenever there is a new commit on GitHub, the load balancer would ask one of the containers to update the Filestore. Since the same file store is mounted in all the containers, they would all serve the latest data.
But,
1. Cloud Filestore is still in beta, not GA.
How does one solve this problem in a Kubernetes environment?
If you are asking for a way to set up a common volume in Kubernetes that multiple pods can read, you can set up an NFS pod as explained in this official example.
I use this for my Jenkins setup in Kubernetes and it does the job well.
There are two approaches you can try:
1. Every pod (via a cron job) tries to pull data from central storage, say once a minute, and updates its working directory when updates are available.
2. The central server pushes updates to each pod individually (load balancing is not appropriate here).
You can also think of implementing this via Deployments. As mentioned in another answer, NFS can be useful for your sharing purpose.
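As a rough sketch of that shared-volume idea (the NFS server address, export path, sizes, and namespace below are placeholders), you can register an NFS-backed PersistentVolume and a ReadWriteMany claim with the Kubernetes Python client and mount the claim in every pod that serves the data:

    from kubernetes import client, config

    config.load_kube_config()
    v1 = client.CoreV1Api()

    # NFS-backed PersistentVolume; server and path are placeholders for your NFS export.
    pv = client.V1PersistentVolume(
        metadata=client.V1ObjectMeta(name="shared-data"),
        spec=client.V1PersistentVolumeSpec(
            capacity={"storage": "1Gi"},
            access_modes=["ReadWriteMany"],
            nfs=client.V1NFSVolumeSource(server="10.0.0.10", path="/exports/data"),
        ),
    )
    v1.create_persistent_volume(body=pv)

    # Claim that all serving pods mount, so they all see the same files after a pull.
    pvc = client.V1PersistentVolumeClaim(
        metadata=client.V1ObjectMeta(name="shared-data"),
        spec=client.V1PersistentVolumeClaimSpec(
            access_modes=["ReadWriteMany"],
            resources=client.V1ResourceRequirements(requests={"storage": "1Gi"}),
        ),
    )
    v1.create_namespaced_persistent_volume_claim(namespace="default", body=pvc)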

How to properly use Kubernetes for job scheduling?

I have the following system in mind: A master program that polls a list of tasks to see if they should be launched (based on some trigger information). The tasks themselves are container images in some repository. Tasks are executed as jobs on a Kubernetes cluster to ensure that they are run to completion. The master program is a container executing in a pod that is kept running indefinitely by a replication controller.
However, I have not stumbled upon this pattern of launching jobs from a pod. Every tutorial seems to assume that I just call kubectl from outside the cluster. Of course I could do this, but then I would have to ensure the master program's availability and reliability through some other system. So am I missing something? Launching one-off jobs from inside an indefinitely running pod seems to me like a perfectly valid use case for Kubernetes.
Your master program can utilize the Kubernetes client libraries to perform operations on a cluster. Find a complete example here.
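For example, with the official Kubernetes Python client the master program can authenticate with its pod's service account and submit Jobs directly from inside the cluster (the image and job name below are placeholders, and the service account needs RBAC permission to create Jobs):

    from kubernetes import client, config

    config.load_incluster_config()  # authenticate with the pod's own service account
    batch = client.BatchV1Api()

    # A one-off Job for a single task; name and image are placeholders.
    job = client.V1Job(
        metadata=client.V1ObjectMeta(name="task-example-001"),
        spec=client.V1JobSpec(
            backoff_limit=2,
            template=client.V1PodTemplateSpec(
                spec=client.V1PodSpec(
                    restart_policy="Never",
                    containers=[
                        client.V1Container(
                            name="task",
                            image="registry.example.com/tasks/report:latest",
                        )
                    ],
                )
            ),
        ),
    )
    batch.create_namespaced_job(namespace="default", body=job)

The same pattern works with the Go or Java client libraries; the only cluster-side requirement is that the pod's service account is allowed to create Jobs in the target namespace.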

How can I protect my GKE cluster against master node failure?

In GKE every cluster has a single master endpoint, which is managed by Google Container Engine. Is this master node highly available?
I deploy a nicely redundant cluster of nodes with Kubernetes, but what happens if the master node goes down? How can I test this situation?
In Google Container Engine the master is managed for you and kept running by Google. According to the SLA for Google Container Engine the master should be available at least 99.5% of the time.
In addition to what Robert Bailey said about GKE keeping the master available for you, it's worth noting that Kubernetes / GKE clusters are designed (and tested) to continue operating properly in the presence of failures. If the master is unavailable, you temporarily lose the ability to change what's running in the cluster (i.e. schedule new work or modify existing resources), but everything that's already running will continue working properly.