Designing K8s pods and processes for initialization - kubernetes

I have a problem statement wherein there is a Kubernetes cluster and I have some pods running on it.
Now, I want some functions/processes to run once per deployment, independent of the number of replicas.
These processes use the same image as the one in the deployment YAML.
I cannot use init containers or sidecars, because they run alongside the main container in every replica's pod.
I tried to create a new image and then a pod from it, but this pod keeps running, which wastes cluster resources; it should be destroyed after it has done its job. Also, the main container depends on the completion of this process before it runs the "command" part of the K8s spec.
Looking for suggestions on how to tackle this?

Theoretically, you could write an admission controller webhook that intercepts deployment create/update operations and triggers your functions as you want. If the result of your functions needs to be checked, use a ValidatingWebhookConfiguration to validate the process and then accept or deny the request.
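A simpler alternative sketch, if it fits your setup, is a one-shot Kubernetes Job that reuses the deployment's image: the Job's pod terminates once its command completes, so it does not keep consuming cluster resources. The names my-app-init, my-app:latest, and the command below are placeholders for your own image and init process:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: my-app-init               # hypothetical name
spec:
  backoffLimit: 3
  ttlSecondsAfterFinished: 300    # clean up the finished pod automatically
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: init
        image: my-app:latest                  # same image as the Deployment
        command: ["/opt/run-once-init.sh"]    # placeholder init command
```

The main containers can then wait for this Job to finish, e.g. by polling for a marker the Job writes to shared storage, or with `kubectl wait --for=condition=complete job/my-app-init` in a deployment pipeline.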

Related

Configure pod liveness after pod was created

I am using Spark, which has a predefined script to create a pod in my kubernetes cluster.
After the pod is created and running, I want to check whether it's still alive. I could do this by using a livenessProbe; however, that is configured in the Pod's configuration file, which I do not have control over, as my pod is created by Spark and I cannot change its config file.
So my question is: after the pod has already been created and is running, how can I change its configuration so that it uses a livenessProbe?
Or is there any other way to check the liveness of the pod?
I am a beginner to Kubernetes, sorry for this question!
After a Pod is created you can't change its livenessProbe definition.
You could use a second Pod to report on the status of your workload, if that works for your use case.
The other option is to use a Mutating Admission Controller to modify the Pod definition from your Spark script, though I would consider this not exactly beginner friendly.
https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#mutatingadmissionwebhook
https://www.trion.de/news/2019/04/25/beispiel-kubernetes-mutating-admission-controller.html
https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/
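To illustrate the second-Pod approach, here is a minimal watcher sketch. It assumes the Spark driver exposes an HTTP endpoint reachable at spark-driver-svc:4040; that address, the pod name, and the polling interval are all placeholders for your environment:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: spark-liveness-watcher    # hypothetical name
spec:
  containers:
  - name: watcher
    image: curlimages/curl:8.5.0
    command: ["/bin/sh", "-c"]
    args:
    - |
      # Poll the Spark driver every 30s and log failures
      while true; do
        if ! curl -fsS http://spark-driver-svc:4040 >/dev/null; then
          echo "$(date) spark driver not responding"
        fi
        sleep 30
      done
```

This only reports liveness (via its logs); acting on a failure, e.g. deleting the unhealthy pod via the API, would need additional RBAC permissions.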

Initcontainer vs Helm Hook post-install

What is the difference between Helm post-install hooks and Kubernetes init containers? What I understood is that hooks are used to define actions during different stages of the release lifecycle, in this case post-install, and init containers, on the other hand, allow you to initialize things before the main container starts in a Pod.
post-install hooks and init containers, as I understand them, allow you to do the same thing, i.e. initialize a database.
Is that correct? Which is the better approach?
For database setup I would prefer a Helm hook, but even then there are some subtleties.
Say your service is running as a Deployment with replicas: 3 for a little bit of redundancy. Every one of these replicas will run an init container, if it's specified in the pod spec, without any sort of synchronization. If one of the pods crashes, or its node fails, its replacement will run the init container again. For the sort of setup task you're talking about, you don't want to repeat it that often.
The fundamental difference here is that a Helm hook is a separate Kubernetes object, typically a Job. You can arrange for this Job to be run exactly once on each helm upgrade and at no other times, which makes it a reasonable place to run things like migrations.
The one important subtlety here is that you can have multiple versions of a service running at once. Take the preceding Deployment with replicas: 3, but then helm upgrade --set tag=something-newer. The Deployment controller will first start a new pod with the new image, and only once it's up and running will it tear down an old pod, and now you have both versions going together. Similar things will happen if you helm rollback to an older version. This means you need some tolerance for the database not quite having the right schema.
If the job is more like a "seed" job that preloads some initial data, this is easier to manage: do it in a post-install hook, which you expect to run only once ever. You don't need to repeat it on every upgrade (as a post-upgrade hook) or on every pod start (as an init container).
Helm install hooks and init containers are fundamentally different. An install hook in Helm creates a completely separate pod, which means that pod cannot reach the main pod's containers directly via localhost or share their volume mounts, while an init container can.
An init container, which is comparable to a Helm pre-install hook, is limited in that it can only perform initial tasks before the main containers start; it cannot run tasks that need to execute after the pod is up, for example any clean-up activity.
Initialization of a DB etc. needs to be done before the actual container is started, and I think an init container is sufficient for this use case, but a Helm pre-install hook can also be used.
You need to use a post-install hook, since first you have to create the DB pod and then initialize the DB. You will notice that the pod for the post-install hook comes up after the DB pod starts running. The hook pod is removed after the hook has executed.
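For reference, a post-install hook of the kind described above can be sketched as a Helm template like this; the names db-seed and myrepo/db-migrate and the command are placeholders:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ .Release.Name }}-db-seed"    # hypothetical name
  annotations:
    "helm.sh/hook": post-install
    "helm.sh/hook-delete-policy": hook-succeeded   # remove the Job once it has run
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: seed
        image: "myrepo/db-migrate:1.0"    # placeholder image
        command: ["./seed-database.sh"]   # placeholder command
```

With `"helm.sh/hook": post-install` the Job runs only on the first install; using post-upgrade instead (or in addition) would re-run it on every `helm upgrade`, which is the trade-off discussed above.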

Running init container on deployment update

What triggers init container to be run?
Will editing deployment descriptor (or updating it with helm), for example, changing the image tag, trigger the init container?
Will deleting the pod trigger the init container?
Will reducing replica set to null and then increasing it trigger the init container?
Is it possible to manually trigger init container?
What triggers init container to be run?
Basically, initContainers are run every time a Pod which has such containers in its definition is created, and the reasons for creation of a Pod can be quite different. As you can read in the official documentation, init containers run before app containers in a Pod and they always run to completion. If a Pod's init container fails, Kubernetes repeatedly restarts the Pod until the init container succeeds. So one of the things that triggers starting an initContainer is, among others, a previous failed attempt at starting it.
Will editing deployment descriptor (or updating it with helm), for
example, changing the image tag, trigger the init container?
Yes, basically every change to the Deployment definition that triggers creation/re-creation of the Pods managed by it also triggers their initContainers to be run. It doesn't matter whether you manage it with helm or manually. Some slight changes, like adding a new set of labels to your Deployment, don't cause it to re-create its Pods, but changing the container image certainly causes the controller (Deployment, ReplicationController or ReplicaSet) to re-create its Pods.
Will deleting the pod trigger the init container?
No, deleting a Pod will not trigger the init container. If you delete a Pod which is not managed by any controller, it will simply be gone and no automatic mechanism will re-create it and run its initContainers. If you delete a Pod which is managed by a controller, let's say a ReplicaSet, it will detect that there are fewer Pods than declared in its yaml definition and it will create such a missing Pod to match the desired/declared state. So I would like to highlight again that it is not the deletion of the Pod that triggers its initContainers to be run, but Pod creation, whether manual or managed by a controller such as a ReplicaSet, which of course can be triggered by manual deletion of a Pod managed by such a controller.
Will reducing replica set to null and then increasing it trigger the
init container?
Yes, because when you reduce the number of replicas to 0, you make the controller delete all Pods that fall under its management. When they are re-created, all their startup processes are repeated including running initContainers being part of such Pods.
Is it possible to manually trigger init container?
As @David Maze already stated in his comment, The only way to run an init container is by creating a new pod, but both updating a deployment and deleting a deployment-managed pod should trigger that. I would say it depends on what you mean by the term manually. If you ask whether it is possible to somehow trigger an initContainer without restarting / re-creating a Pod - no, it is not possible. Starting initContainers is tightly related to Pod creation, or in other words to its startup process.
Btw., everything you're asking about in your question is quite easy to test. There are a lot of working examples in the official kubernetes docs that you can use for testing different scenarios, and you can also create a simple initContainer yourself, e.g. using the busybox image, whose only task is to sleep for the required number of seconds. Here are some useful links from different k8s docs sections related to initContainers:
Init Containers
Debug Init Containers
Configure Pod Initialization
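The busybox idea mentioned above can be sketched as a minimal test pod; the name and sleep durations are arbitrary:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: init-demo               # hypothetical name
spec:
  initContainers:
  - name: wait-a-bit
    image: busybox:1.36
    command: ["sh", "-c", "echo init starting; sleep 10; echo init done"]
  containers:
  - name: main
    image: busybox:1.36
    command: ["sh", "-c", "echo main started; sleep 3600"]
```

Deleting this pod and re-creating it (or letting a controller re-create it) re-runs the init container each time, which makes the scenarios from the question easy to observe with `kubectl logs init-demo -c wait-a-bit`.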

Kubernetes: Is there a way to update "nodeAffinity" and "schedulerName" in Pod configuration via the Kubernetes API when the Pod is in Pending state?

I have some specific pod-scheduling needs. So I am writing my own custom scheduler.
However, I wonder if the custom scheduler can simply update the pod configuration, specifically the "nodeAffinity" and "schedulerName" fields, so the scheduling can be delegated back to kube-scheduler.
This way, the custom scheduler need not emulate the entire kube-scheduler, and can make use of any updates in kube-scheduler in the future versions.
This is why I asked if a Job or Pod running in the cluster can update the nodeAffinity and schedulerName fields of the pod configuration - say via the Kubernetes API or any other way?
You can modify the validating part of validation.go. I did that and it succeeded.

Does Kubernetes provide a colocated Job container?

I wonder how one would implement a colocated auxiliary container in a Pod within a Deployment which does not provide a service but rather runs a job/batch workload.
The background of my question is that I want to deploy a scalable service in which each instance needs configuration after its start. This configuration is done via an HTTP POST to its local colocated service instance. I've implemented an auxiliary container for this in order to benefit from colocation, so the auxiliary container always knows which instance needs to be configured.
The problem is that the restartPolicy is defined at the Pod level. I am looking for something like restart policy Always for the service and a different restart policy OnFailure for the configuration job.
I know that k8s provides the Job resource for such workloads. But is there an option to colocate those jobs with Pods?
Furthermore, I've stumbled across the so-called init containers, which might be defined via annotations. But these suffer from the drawback that k8s ensures the actual Pod is only started after the init container has run. So for my scenario it seems unsuitable.
As I understand it, you need your service running in order to configure it.
Your solution is workable and you can set restartPolicy: Always; you just need a way to tell your one-off configuration container that it has already run. You could create and attach an emptyDir volume to your configuration container, create a file on it to mark the configuration as successful, and check for this file from your process. After the initialization you enter a sleep loop. The downside is that some resources will be taken up by that container too.
Or you can just add an extra process in the same container and do the configuration there (maybe with the file mentioned above as a guard to avoid configuring twice). So write a simple shell script like this and run it instead of your main process:
#!/bin/sh
(
  # Run the one-off configuration in the background, guarded by a stamp file
  [ -f /mnt/guard-vol/stamp ] && exit 0
  /opt/my-config-process parameters && touch /mnt/guard-vol/stamp
) &
# Replace this shell with the main process, passing through all arguments
exec /opt/my-main-process "$@"
Alternatively, you could implement a separate pod that queries the Kubernetes API for pods of your service with the label configured=false, configures them, and removes the label via the API. You should also modify your Service to select only configured=true pods.
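The Service-selector part of that approach would look roughly like this; the names and ports are placeholders:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-scalable-service       # hypothetical name
spec:
  selector:
    app: my-scalable-service
    configured: "true"            # only route traffic to configured instances
  ports:
  - port: 80
    targetPort: 8080
```

The configurator pod would then list pods carrying `configured=false` (or missing the label), POST the configuration to each one, and patch the label to `"true"` so the Service starts routing to it.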