Kubernetes - deployment initialization - how to ensure it happens only once?

I am using Kubernetes 1.12. I have a service (i.e. a pod) which may have multiple instances (replicas > 1).
My goal is to perform a maintenance task (e.g. create/upgrade a database, generate a certificate, etc.) before any of the service instances come up.
I was considering using an Init Container, but as far as I understand, an Init Container runs every time an additional replica (pod) is created, and worse, those runs might happen in parallel. In that case, multiple Init Containers might work in parallel and corrupt my database and everything else.
I need a clear solution to perform a bootstrap maintenance task only once per deployment. How would you suggest doing that?

I encountered the same problem running db migrations before each deployment. Here's a solution based on a Job resource:
kubectl apply -f migration-job.yml
kubectl wait --for=condition=complete --timeout=60s job/migration
kubectl delete job/migration
kubectl apply -f deployment.yml
migration-job.yml defines a Job configured with restartPolicy: Never and a reasonably low activeDeadlineSeconds. Using kubectl wait ensures that any error or timeout in the migration Job causes the script to fail and thus prevents applying deployment.yml.
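A minimal sketch of what such a migration-job.yml could look like; the Job name matches the kubectl commands above, while the image and command are placeholders, not from the original answer:

apiVersion: batch/v1
kind: Job
metadata:
  name: migration
spec:
  backoffLimit: 0             # do not retry a failed migration automatically
  activeDeadlineSeconds: 120  # give up if the migration runs longer than this
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: migrate
        image: registry.example.com/my-app:1.0   # hypothetical image
        command: ["./manage.py", "migrate"]      # hypothetical migration command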

One of the ways to retain control over the startup sequence would be to use a StatefulSet. With sequential startup, the next pod will not start until the previous one is ready, removing the risk of parallel init.
Personally, I would prefer this init to have its own locking mechanism and stick to regular Deployments.
Remember that you need to take into account not only the first startup on Deployment creation, but also rolling releases, scaling, outages, etc.
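For reference, a rough sketch of a StatefulSet relying on ordered startup; the names and image are placeholders. With podManagementPolicy: OrderedReady (the default), pod N+1 is only started once pod N is Running and Ready:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-service
spec:
  serviceName: my-service            # headless Service associated with the StatefulSet
  replicas: 3
  podManagementPolicy: OrderedReady  # the default: start pods one at a time, in order
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
    spec:
      containers:
      - name: app
        image: registry.example.com/my-service:1.0  # placeholder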

Related

Why does deleting a kubernetes namespace take so long?

I'm attempting to write some integration tests that set up a deployment and an ingress and then make web requests, effectively curl commands, against the ingress to test its configuration. Backends and services are also created to guarantee that the ingress is correctly routing and proxying to the backends.
However, tear down of the setup, to run a new set of tests is slow. By 'teardown' here I mean I simply delete the namespace in which all of these deployments live. This can take quite a while. Why is that? And what are the best ways to quickly tear down such a setup?
Kubernetes works largely through controllers, which loop endlessly looking for small pieces of work to do (like schedule a pod somewhere, unschedule a pod, remove an ingress route, etc.); this makes it highly reliable but sometimes comes at the cost of relatively high latency for your operations. Namespace deletion requires bringing down all the resources in the namespace, which involves a lot of small steps and therefore can take a while to finish.
There is a --force option for kubectl delete, but it comes with some scary-sounding warnings:
--force=false: If true, immediately remove resources from API and
bypass graceful deletion. Note that immediate deletion of some
resources may result in inconsistency or data loss and requires
confirmation.
So, this probably isn't advisable as a regular thing to do (perhaps someone more familiar with its behavior can add on to this).
Another option is to let the delete proceed asynchronously and just not block your CI jobs on it. The --wait=false flag (by default, set to true) will make sure the request is entered successfully but won't block kubectl from exiting while the delete actually happens. Your namespace will enter the Terminating state and eventually get deleted (unless something prevents it from coming down).
kubectl delete namespace my-test-namespace-1 --wait=false
This does mean that your next CI run may find the namespace is still there. To avoid a conflict, you could use a random suffix or incrementing counter for the namespace's name.
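One way to get such a unique name, assuming the namespace is created from a manifest with kubectl create -f (kubectl apply does not support generateName), is to let the API server append a random suffix:

apiVersion: v1
kind: Namespace
metadata:
  generateName: my-test-namespace-   # API server appends a random suffix, e.g. my-test-namespace-x7k2q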

Initcontainer vs Helm Hook post-install

What is the difference between the Helm post-install hook and Kubernetes init containers? What I understand is that hooks are used to define some actions during different stages of the Pod lifecycle, in this case post-install, while init containers, on the other hand, allow initializing the container before it is deployed to the Pod.
As I understand it, post-install hooks and init containers allow doing the same thing, i.e. initializing a database.
Is that correct? Which is the better approach?
For database setup I would prefer a Helm hook, but even then there are some subtleties.
Say your service is running as a Deployment with replicas: 3 for a little bit of redundancy. Every one of these replicas will run an init container, if it's specified in the pod spec, without any sort of synchronization. If one of the pods crashes, or its node fails, its replacement will run the init container again. For the sort of setup task you're talking about, you don't want to repeat it that often.
The fundamental difference here is that a Helm hook is a separate Kubernetes object, typically a Job. You can arrange for this Job to be run exactly once on each helm upgrade and at no other times, which makes it a reasonable place to run things like migrations.
The one important subtlety here is that you can have multiple versions of a service running at once. Take the preceding Deployment with replicas: 3, but then helm upgrade --set tag=something-newer. The Deployment controller will first start a new pod with the new image, and only once it's up and running will it tear down an old pod, and now you have both versions going together. Similar things will happen if you helm rollback to an older version. This means you need some tolerance for the database not quite having the right schema.
If the job is more like a "seed" job that preloads some initial data, this is easier to manage: do it in a post-install hook, which you expect to run only once ever. You don't need to repeat it on every upgrade (as a post-upgrade hook) or on every pod start (as an init container).
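As a rough illustration, a hook Job in a Helm chart might look something like the sketch below; the values, image, and command are made up. Switching the annotation to just post-install would give the run-once seed variant described above:

apiVersion: batch/v1
kind: Job
metadata:
  name: {{ .Release.Name }}-db-migrate
  annotations:
    "helm.sh/hook": post-install,post-upgrade
    "helm.sh/hook-delete-policy": before-hook-creation  # remove the previous hook Job before creating a new one
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: migrate
        image: "my-app:{{ .Values.image.tag }}"   # hypothetical values
        command: ["./manage.py", "migrate"]       # hypothetical migration command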
Helm install hooks and init containers are fundamentally different. An install hook in Helm creates a completely separate pod altogether, which means that pod cannot reach the main pod directly via localhost and cannot share the same volume mounts, etc., while an init container can.
An init container, which is comparable to a Helm pre-install hook, is limited in that it can only perform initial tasks before the main pod is started; it cannot perform tasks which need to be executed after the pod is started, for example any clean-up activity.
Initialization of the DB etc. needs to be done before the actual container is started, and I think an init container is sufficient for this use case, but a Helm pre-install hook can also be used.
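For completeness, a sketch of the init-container variant inside a Deployment's pod template spec; the image and command are placeholders, and keep in mind it runs for every replica, as discussed above:

spec:
  initContainers:
  - name: init-db
    image: registry.example.com/my-app:1.0   # placeholder
    command: ["./manage.py", "migrate"]      # placeholder; runs before the main container starts
  containers:
  - name: app
    image: registry.example.com/my-app:1.0   # placeholder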
You need to use a post hook, since you first have to create the DB pod and then initialize the DB. You will notice that the pod for the post hook comes up after the DB pod starts running. The post-hook pod will be removed after the hook is executed.

Designing K8s pods and processes for initialization

I have a problem statement wherein there is a Kubernetes cluster and I have some pods running on it.
Now, I want some functions/processes to run once per deployment, independent of the number of replicas.
These processes use the same image as the one in the deployment YAML.
I cannot use init containers or sidecars, because they would run along with the main container on the pod for each replica.
I tried to create a new image and then a pod out of it. But this pod keeps on running, which is not good for cluster resources, as it should be destroyed after it has done its job. Also, the main container depends on the completion of this process in order to run the "command" part of the K8s spec.
Looking for suggestions on how to tackle this?
Theoretically, you could write an admission controller webhook to intercept Deployment create/update requests and trigger your functions as you want. If the result of your functions needs to be checked, use a ValidatingWebhookConfiguration to validate the process and then accept or deny the request.
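A bare-bones sketch of such a ValidatingWebhookConfiguration; the webhook name, Service name, namespace, and path are placeholders, and the webhook service itself plus its TLS setup (caBundle) are not shown:

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: deployment-init-check
webhooks:
- name: deployment-init.example.com     # placeholder webhook name
  rules:
  - apiGroups: ["apps"]
    apiVersions: ["v1"]
    operations: ["CREATE", "UPDATE"]
    resources: ["deployments"]
  clientConfig:
    service:
      name: init-webhook                # placeholder Service running your webhook
      namespace: default
      path: /validate
  admissionReviewVersions: ["v1"]
  sideEffects: None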

Kubernetes rolling deploy: terminate a pod only when there are no containers running

I am trying to deploy updates to pods. However, I want the current pods to terminate only when all the containers inside the pod have terminated and their process is complete.
The new pods can keep waiting to start until all containers in the old pods have completed. We have a mechanism to stop old pods from picking up new tasks, so they should eventually terminate.
It's okay if twice the number of pods exists at some instant in time. I tried finding a solution for this in the Kubernetes docs but wasn't successful. Pointers on how / if this is possible would be helpful.
Well, I guess you may have to create a duplicate Deployment with the new image and change the selector in the Service to the new Deployment, which will prevent external traffic from reaching the pre-existing pods while new calls go to the new pods. Then later you can check something like:
kubectl top pods --containers
and if the load appears to be static and low, you can preferably delete the old pods' Deployment later.
But for this, the Service selectors have to be updated every time; to keep track of things you can append the git commit hash to the Service selector so it stays unique for each release.
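For example, a Service selecting only the new Deployment's pods by a release label might look roughly like this; the label names and hash are made up, and the new Deployment's pod template would carry the same labels:

apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
    release: abc1234   # hypothetical git commit hash; change this to switch traffic to the new Deployment
  ports:
  - port: 80
    targetPort: 8080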
But rolling back to previous versions, if required, from inside the Kubernetes cluster will be difficult, so preferably you can trigger the desired build again.
I hope this makes some sense !!

Does Kubernetes provide a colocated Job container?

I wonder how one would implement a colocated auxiliary container in a Pod within a Deployment which does not provide a service but rather a job/batch workload?
Background of my question is that I want to deploy a scalable service where each instance needs configuration after its start. This configuration is done via an HTTP POST to its local colocated service instance. I've implemented an auxiliary container for this in order to benefit from colocation, so the auxiliary container always knows which instance needs to be configured.
The problem is that the restartPolicy needs to be defined at the Pod level. I am looking for something like restart policy Always for the service and a different restart policy OnFailure for the configuration job.
I know that k8s provides the Job resource for such workloads. But is there an option to colocate those jobs to Pods?
Furthermore, I've stumbled across the so-called init containers, which might be defined via annotations. But they suffer from the drawback that k8s ensures the actual Pod is only started after the init container has run, so for my scenario this seems unsuitable.
As I understand you need your service running to configure it.
Your solution is workable: you can set restartPolicy: Always, you just need a way to tell your one-off configuration container that it already ran. You could create and attach an emptyDir volume to your configuration container, create a file on it to mark the configuration as successful, and check for this file from your process. After your initialization you enter a sleep loop. The downside is that some resources will be taken up by that container too.
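A rough sketch of that first variant's pod spec, with an emptyDir volume for the guard file; the names and images are placeholders:

spec:
  volumes:
  - name: guard-vol
    emptyDir: {}                       # survives container restarts within the same pod
  containers:
  - name: main
    image: registry.example.com/my-service:1.0       # placeholder
  - name: configure
    image: registry.example.com/my-configurator:1.0  # placeholder
    volumeMounts:
    - name: guard-vol
      mountPath: /mnt/guard-vol
    # checks for /mnt/guard-vol/stamp, configures the service via localhost
    # if the file is missing, touches the file, then sleeps in a loop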
Or you can just add an extra process in the same container and do the configuration (maybe with the file mentioned above as a guard to avoid configuring twice). So write a simple shell script like this and run it instead of your main process:
#!/bin/sh
# Run the one-off configuration in the background, guarded by a stamp file.
(
  [ -f /mnt/guard-vol/stamp ] && exit 0   # already configured on a previous run
  /opt/my-config-process parameters && touch /mnt/guard-vol/stamp
) &
# Hand over to the main process, passing all arguments through.
exec /opt/my-main-process "$@"
Alternatively, you could implement a separate pod that queries the Kubernetes API for pods of your service with the label configured=false, configures them, and removes or flips the label via the API. You should also modify your Service to select configured=true pods.
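In that variant, the selector portion of the Service would include the label, roughly like this (label key/value as in the answer above, the app label a placeholder):

  selector:
    app: my-service
    configured: "true"   # only pods the configurator has labelled receive traffic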