Handling Data Consistency In Pods in GKE - kubernetes

I have my data stored in GitHub in JSON format. My pods clone this repo and use the data, and whenever the data is updated, a git hook fires and the expectation is that my pods update to the latest data (by running git pull). I have exposed this update service via a load balancer and configured it in the git hook.
However, when the git hook fires, only one of the pods receives the request and does a git pull. Is there a way to notify all the pods behind that service to update their local store?
To achieve that, I looked for some kind of shared storage which can be mounted in all the containers running in the Kubernetes cluster, e.g. Google Cloud Filestore, the equivalent of AWS EFS. Whenever there is a new commit in GitHub, the load balancer would ask one of the containers to update the Filestore. Since the same file store is mounted in all the containers, they would all serve the latest data.
But,
1. Cloud Filestore is still in beta, not GA.
How does one solve this problem in a Kubernetes environment?

If you are asking for a way to set up a common volume in Kubernetes that multiple pods can read from, you can set up an NFS pod as explained in This Official Example.
I use this for my Jenkins setup in Kubernetes and it does the job well.
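For reference, here is a minimal sketch of what the PersistentVolume/PersistentVolumeClaim pair might look like once an NFS server from that example is running; the server address, export path and size are assumptions you would replace with your own values:

```yaml
# Sketch only: assumes an in-cluster NFS server exporting a directory.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-data
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany            # allows many pods to mount it read-write
  nfs:
    server: 10.3.0.55          # placeholder: ClusterIP of the NFS service
    path: "/"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""         # bind to the pre-created PV above
  resources:
    requests:
      storage: 1Gi
```

Any pod that mounts the `shared-data` claim then sees the same files, so a single `git pull` into that volume updates what every replica serves.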

There are two ways you can try:
1. Every pod (via a cron job) pulls data from the central storage, say once a minute, and updates its working directory when updates are available (a sidecar-based sketch of this idea follows below).
2. The central server pushes updates to each pod individually (load balancing is not appropriate here).
You can also think of implementing this via Deployments. As mentioned in another answer, NFS can be useful for your sharing purpose.
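As a rough sketch of the first option (a polling sidecar rather than a cron job, but the same idea), each replica can run a small git container beside the application that refreshes a shared volume; the repository URL, images and interval are placeholders:

```yaml
# Sketch only: a sidecar polls the Git repo once a minute and the main
# container serves whatever is in the shared emptyDir volume.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: data-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: data-server
  template:
    metadata:
      labels:
        app: data-server
    spec:
      volumes:
        - name: data
          emptyDir: {}
      containers:
        - name: app
          image: my-app:latest              # placeholder: your application image
          volumeMounts:
            - name: data
              mountPath: /data
        - name: git-sync
          image: alpine/git                 # any image with git installed
          command: ["/bin/sh", "-c"]
          args:
            - |
              git clone https://github.com/example/data-repo.git /data || true
              while true; do
                cd /data && git pull
                sleep 60
              done
          volumeMounts:
            - name: data
              mountPath: /data
```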

Related

Kubernetes Job to create a volume snapshot

I have a job, which I want to run regularly in Kubernetes 1.19.3 (DigitalOcean).
For this job, I need to take a snapshot of a PVC and do stuff to it. I know how I can run a job and mount a volume into the pod it runs, but I'm having a hard time finding out how to take that snapshot at the beginning of the job.
Is there any way to do it?
The tool of choice to take PV snapshots in K8s is VolumeSnapshots.
The trouble with them is that they don't (yet) come with functionality for periodic triggering. So you would have to create them from a K8s CronJob. However, doing so is not terribly straightforward, since your CronJob Pod would need to have a K8s client installed and would require access to the K8s API server via RBAC.
There are a couple of options to get there, ranging from writing your own image from scratch to using open-source solutions based on the clients from this project's k8s client libraries.
Seeing that dynamically applying K8s manifests is somewhat poorly supported by K8s, I actually started an open-source project myself that you could use for this purpose: K8sCrud.
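For completeness, here is a hedged sketch of the CronJob-plus-RBAC approach described above, using a kubectl image to create a VolumeSnapshot at the start of each run; all names, the schedule and the VolumeSnapshotClass are assumptions to adapt to your cluster (the beta API versions shown are the ones most likely present on 1.19):

```yaml
# Sketch only: CronJob with just enough RBAC to create VolumeSnapshots.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: snapshotter
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: snapshotter
rules:
  - apiGroups: ["snapshot.storage.k8s.io"]
    resources: ["volumesnapshots"]
    verbs: ["create", "get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: snapshotter
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: snapshotter
subjects:
  - kind: ServiceAccount
    name: snapshotter
---
apiVersion: batch/v1beta1                  # batch/v1 on Kubernetes >= 1.21
kind: CronJob
metadata:
  name: pvc-snapshot
spec:
  schedule: "0 2 * * *"                    # nightly at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: snapshotter
          restartPolicy: OnFailure
          containers:
            - name: kubectl
              image: bitnami/kubectl       # any image that contains kubectl
              command: ["/bin/sh", "-c"]
              args:
                - |
                  cat <<EOF | kubectl apply -f -
                  apiVersion: snapshot.storage.k8s.io/v1beta1   # v1 with newer snapshot CRDs
                  kind: VolumeSnapshot
                  metadata:
                    name: my-pvc-$(date +%Y%m%d%H%M)
                  spec:
                    volumeSnapshotClassName: do-block-storage   # check: kubectl get volumesnapshotclass
                    source:
                      persistentVolumeClaimName: my-pvc         # placeholder PVC name
                  EOF
```

The snapshot-taking step can then be followed by the actual work of the job, either in the same container or in a second one.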

Sharing files between pods

I have a service that generates a picture. Once it's ready, the user will be able to download it.
What is the recommended way to share a storage volume between a worker pod and a backend service?
In general the recommended way is "don't". While a few volume providers support multi-mounting, it's very hard to do that in a way that isn't sadmaking. Preferably use an external service like AWS S3 for hosting the actual file content and store references in your existing database(s). If you need a local equivalent, check out Minio for simple cases.
Personally, I would not recommend doing it. Better than that, run the two containers side by side in one pod if they depend on each other; that way, if the pod fails, the file-handling container is deleted and recreated along with it.
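If the picture generator and the web server really can live in the same pod, a shared emptyDir is the simplest version of that suggestion; the images and paths below are placeholders:

```yaml
# Sketch only: the worker writes generated pictures into an emptyDir
# volume and the web container serves them from the same volume.
apiVersion: v1
kind: Pod
metadata:
  name: picture-service
spec:
  volumes:
    - name: pictures
      emptyDir: {}
  containers:
    - name: generator
      image: my-picture-worker:latest   # placeholder: writes finished files to /output
      volumeMounts:
        - name: pictures
          mountPath: /output
    - name: web
      image: nginx
      volumeMounts:
        - name: pictures
          mountPath: /usr/share/nginx/html
          readOnly: true
```

If the worker and the backend have to stay in separate deployments, the S3/Minio approach above is the safer route.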

Master - Worker Node communication in kubernetes

I have 4 worker nodes and 1 master in my Kubernetes cluster. I created a DaemonSet deployment from the master and it starts its pods on all the worker nodes. I have a script which keeps running in the background and basically monitors a git repository, checking whether it needs to be pulled. If so, it pulls the new changes locally. The pods only read the local files once, when they start, and then keep using that configuration. I want to somehow restart the pod on that worker so that it picks up the new changes.
Is there any way we can notify the master about the new changes so that the master can restart the pod? Or
can the master keep track of the git repo and send the new changes to that worker as well as restart the pod?
Is there any other way of achieving this functionality?
Setting up a CronJob on the master node and creating a persistent volume shared between the pods might help. This way everything that happens on the master would be passed to the pods, and the pods would be able to read the configuration file from the volume.
You can find an example of a CronJob in the Kubernetes documentation.
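A rough sketch of that suggestion, assuming a ReadWriteMany-capable claim (for example NFS-backed) that the DaemonSet pods also mount; the repository URL, claim name and schedule are placeholders:

```yaml
# Sketch only: a CronJob that keeps a shared volume in sync with Git.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: config-puller
spec:
  schedule: "*/5 * * * *"                  # every five minutes
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          volumes:
            - name: config
              persistentVolumeClaim:
                claimName: shared-config   # must support ReadWriteMany
          containers:
            - name: git-pull
              image: alpine/git
              command: ["/bin/sh", "-c"]
              args:
                - |
                  if [ -d /config/.git ]; then
                    cd /config && git pull
                  else
                    git clone https://github.com/example/config-repo.git /config
                  fi
              volumeMounts:
                - name: config
                  mountPath: /config
```

Note that this only refreshes the files on the volume; the DaemonSet pods would still need to be restarted, or to re-read the files, to pick up the changes.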
It would be better to follow the containerization philosophy and set up CI/CD tools (for example Jenkins/Bamboo/TeamCity) to automate this. They have built-in functions that would perform the tasks you need.

decentralised, updatable configuration with kubernetes

I need to keep some configuration, maybe as files or otherwise, in all instances of a Kubernetes Docker image deployment.
I need the ability to remotely update the configuration in all of the running pods of the deployment. This is to be followed by invoking some Java code in all of the running pods of the deployment.
Whenever a new pod of the same deployment comes up, it should have the updated configuration.
As much as possible, I don't want the configuration stored anywhere centrally. I want it in each pod of the deployment.
What are my choices?
As a last resort I could do it as a rolling deployment update.
A rolling deployment, or something similar such as an update to a mounted ConfigMap, is the Kubernetes option. It always results in an application restart.
Having an application support live configuration updates, running some code after receiving those updates, without restart- that's an application feature.
A handwavy way of doing this:
Have the correct configuration live in a ConfigMap (a minimal sketch of this first step follows after this list).
Have the application listen on a separate port for either a signal to retrieve updated configuration (if the application is k8s aware) or to actually receive the configuration bits themselves. Have the application be able to handle this live configuration update process, the difficulty of which depends on the framework in use.
Have another application be responsible for delivering these updates- watch for changes to the ConfigMap, get the list of Pods in the deployment, deliver either a signal or the updated configuration to each of the Pods.
Have the first application not get to what k8s recognizes as Ready state without having received updated configuration from the second.
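Here is a minimal sketch of the first step, with assumed names and values; note that a ConfigMap mounted as a volume (not via subPath) is eventually refreshed inside running pods by the kubelet, which gives the second application, or the app's own file watcher, something to react to:

```yaml
# Sketch only: the ConfigMap holding the configuration, mounted into
# every pod of the Deployment as a volume.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  app.properties: |
    feature.x.enabled=true
    cache.ttl.seconds=300
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      volumes:
        - name: config
          configMap:
            name: app-config
      containers:
        - name: app
          image: my-java-app:latest       # placeholder: the app that reloads its config
          volumeMounts:
            - name: config
              mountPath: /etc/app          # app watches /etc/app/app.properties
```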

How to get files into pod?

I have a fully functioning Kubernetes cluster with one master and one worker, running on CoreOS.
Everything is working and my pods and services are running fine. Now I have no clue how to proceed with a webserver idea.
Before I go further: I have no configs at the moment for the idea I am going to explain. I just did a lot of research.
When setting up a pod (nginx) with a service, you get the default nginx page. After that you can set up a mounted volume with a host volume (volume mapping from host to container).
But let's say I want to separate every site (multiple sites separated into different pods), how can I let my users add files to their pod's nginx document root? Having FTP on the CoreOS node removes the Kubernetes way and adds security vulnerabilities.
If someone can help me shed some light on this issue, that would be great.
Thanks for your time.
I'm assuming that you want to have multiple nginx servers running. The content of each nginx server is managed by a different admin (you called them users).
TL;DR:
Option 1: Each admin needs to build their own nginx Docker image every time the static files change and deploy that new image. Choose this if you consider these static files part of the source code of the nginx application.
Option 2: Use a persistent volume for nginx; the init script for the nginx image should use something like S3 to sync all its files onto the volume and then start nginx.
Before you proceed with building an application with Kubernetes, the most important thing is to separate your services into two conceptual categories, and give up your desire to touch the underlying nodes directly:
1) Stateless: These are services that are built by the developers and can be released. They can be stopped, started, moved from one node to another, and have their filesystem reset during restart, and they will work perfectly fine. The majority of your web services will fit this category.
2) Stateful: These services cannot be stopped and restarted willy-nilly like the ones above. Primarily, their underlying filesystem must be persistent and remain the same across runs of the service. Databases, file servers and similar services are in this category. These need special care and should use k8s persistent volumes and, more recently, StatefulSets.
Typical application:
nginx: build the nginx.conf into the docker image, and deploy it as a stateless service
rails/nodejs/python service: build the source code into the docker image, configure with env-vars, deploy as a stateless service
database: mount a persistent volume, configure with env-vars, deploy as a stateful service.
Separate sites:
Typically, I think at the level of a k8s Deployment and a k8s Service. Each site can be its own Deployment and Service pair. You can then have separate ways to expose them (different external DNS names/IPs).
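A hedged sketch of one such pair; a second site would get an identical pair with different names, image and exposure, and the image name and service type here are assumptions:

```yaml
# Sketch only: one site as its own Deployment plus Service.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: site-a
spec:
  replicas: 2
  selector:
    matchLabels:
      app: site-a
  template:
    metadata:
      labels:
        app: site-a
    spec:
      containers:
        - name: nginx
          image: registry.example.com/site-a:1.0   # image built with the site's content
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: site-a
spec:
  selector:
    app: site-a
  ports:
    - port: 80
      targetPort: 80
  type: LoadBalancer            # or ClusterIP behind an Ingress
```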
Application users storing files:
This is firmly in the category of a stateful service. Use a persistent volume mounted at a /media-style directory.
Developers changing files:
Say developers or admins want to use FTP to change the files that nginx serves. The correct pattern is to build a Docker image with the new files and then use that image. If there are too many files, and you don't consider those files to be part of the 'source' of nginx, then use something like S3 and a persistent volume. In your Docker image's init script, don't directly start nginx: contact S3, sync all your files onto your persistent volume, then start nginx.
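As an illustration of that option, here is a sketch using an init container to sync from S3 into a shared volume before nginx starts; the bucket name, image, volume type and credentials handling are all assumptions:

```yaml
# Sketch only: sync site content from S3 into a volume, then serve it.
apiVersion: v1
kind: Pod
metadata:
  name: nginx-s3-synced
spec:
  volumes:
    - name: webroot
      emptyDir: {}                        # or a persistentVolumeClaim
  initContainers:
    - name: sync
      image: amazon/aws-cli
      # Assumes AWS credentials are provided, e.g. via a Secret or node IAM role.
      command: ["aws", "s3", "sync", "s3://example-site-bucket", "/webroot"]
      volumeMounts:
        - name: webroot
          mountPath: /webroot
  containers:
    - name: nginx
      image: nginx
      volumeMounts:
        - name: webroot
          mountPath: /usr/share/nginx/html
          readOnly: true
```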
While the options and reasoning listed by iamnat are right, there's at least one more option to add to the list. You could consider using ConfigMap objects, maintaining your files within the ConfigMap and mounting them into your containers.
A good example can be found in the official documentation - check the Real World Example configuring Redis section to get some actionable input.
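To make that concrete, here is a small sketch of the ConfigMap approach applied to nginx; the content and names are placeholders, and it only suits small amounts of static content since ConfigMaps are limited to roughly 1 MiB:

```yaml
# Sketch only: static files kept in a ConfigMap and mounted into the
# nginx document root.
apiVersion: v1
kind: ConfigMap
metadata:
  name: site-content
data:
  index.html: |
    <html><body><h1>Hello from a ConfigMap</h1></body></html>
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx-configmap
spec:
  volumes:
    - name: content
      configMap:
        name: site-content
  containers:
    - name: nginx
      image: nginx
      volumeMounts:
        - name: content
          mountPath: /usr/share/nginx/html
```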