Does Kubernetes provide a colocated Job container? - kubernetes

I wonder how would one implement a colocated auxiliary container in a Pod within a Deployment which does not provide a service but rather a job/batch workload?
Background of my questions is, that I want to deploy a scalable service at which each instance needs configuration after its start. This configuration is done via a HTTP POST to its local colocated service instance. I've implemented a auxiliary container for this in order to benefit from the feature of colocation. So the auxiliary container always knows which instance needs to be configured.
Problem is, that the restartPolicy needs to be defined at the Pod level. I am looking for something like restart policy always for the service and a different restart policy onFailurefor the configuration job.
I know that k8s provides the Job resource for such workloads. But is there an option to colocate those jobs to Pods?
Furthermore I've stumbled across the so called init containers which might be defined via annotations. But these suffer the drawback, that k8s ensures that the actual Pod is only started after the init container did run. So for my very scenario it seems unsuitable.

As I understand you need your service running to configure it.
Your solution is workable and you can set restartPolicy: always you just need a way to tell your one off configuration container that it already ran. You could create and attach an emptyDir volume to your configuration container, create a file on it to mark your configuration successful and check for this file from your process. After your initialization you enter sleep in a loop. The downside is that some resources will be taken up by that container too.
Or you can just add an extra process in the same container and do the configuration (maybe with the file mentioned above as a guard to avoid configuring twice). So write a simple shell script like this and run it instead of your main process:
#!/bin/sh
(
[ -f /mnt/guard-vol/stamp ] && exit 0
/opt/my-config-process parameters && touch /mnt/guard-vol/stamp
) &
exec /opt/my-main-process "$#"
Alternatively you could implement a separate pod that queries the kubernetes API for pods of your service with label configured=false. Configure it and remove the label with the API. You should also modify your Service to select configured=true pods.

Related

Designing K8 pod and proceses for initialization

I have a problem statement where in there is a Kubernetes cluster and I have some pods running on it.
Now, I want some functions/processes to run once per deployment, independent of number of replicas.
These processes use the same image like the image in deployment yaml.
I cannot use initcontainers and sidecars, because they will run along with main container on pod for each replica.
I tried to create a new image and then a pod out of it. But this pod keeps on running, which is not good for cluster resource, as it should be destroyed after it has done its job. Also, the main container depends on the completion on this process, in order to run the "command" part of K8 spec.
Looking for suggestions on how to tackle this?
Theoretically, You could write an admission controller webhook for intercepting create/update deployments and triggering your functions as you want. If your functions need to be checked, use ValidatingWebhookConfiguration for validating the process and then deny or accept commands.

Init container to be ran only once per deployment

I use initContainers inside a deployment. An init container is created for each pods.
It is possible to get only one init container for the whole deployment ?
edit:
use case: need to run some db migration commands before creating the pods. So i have put these commands inside init pods.
problems:
- each time a pod is created, the init container is created
- on scale up init container is created
solution:
I have finally found a good example for solving this problem in this article
While it is not possible to have one init container shared by whole deployment (which by definition is impossible as the unit of workload scheduling in kube is a pod), there are things you can use to get a similar functionality easily.
One way to achieve this would be to have an init container on every pod waiting for a specific job to be successfully executed, and a job that would do the actual init logic.
Another one would be to use leader election among your init containers so only one executes and the rest just wait for the lock to be release before they exit successfully.
No, pods are identical so if you use an init container it has to be attached to every pod in your deployment.
You're welcome to comment and tell us more about your use case, because it sounds like the problem you are trying to solve is not the best fit for init containers.

Using initcontainers in a job to do some stuff after pod initialization

I'm currently trying to create a job in Kubernetes in a project I've just joined.
This job should wait until 2 other pods are up and running, then run a .sh script to create some data for the testers in the application.
I've figured out i should use initContainers. However, there is one point i don't understand.
Why should i include some environmental values under env tag under initContainers in job description .yaml file ?
I thought i was just waiting for the pods to be initialised, not creating them again. Am i missing something ?
Thanks in advance.
initContainers are like containers running in the Pod, but executed before the ones defined in the spec key containers.
Like containers, they share some namespaces and IPC, so it means that the Scheduler will check if the declared initContainers are successful, then it will schedule the containers.
Keep in mind that when you create a Pod, basically, you're creating an empty container named pause that will provide a namespace-base for the following containers: so, in the end, the initContainer is not really creating again a new Pod, like its name suggests, it's an initializer.

Is it possible to add/modify kubernetes container spec based on clusterwide setting

I have a kubernetes-based application that uses an operator to build and deploy containers in pods. Sometimes I'd like to run containers in privileged mode to enable performance tracing, but since I'm not deploying the pod/containers directly from a manifest, I cannot simply add privileged mode and the debugfs filesystem mount.
That leaves me to fork the operator code, change where it builds the container spec, and redeploy with the modified operator. Doable, but awkward.
So my question is, is it possible to impose additional attributes to be added to container specs based on some clusterwide setting, either before pods are deployed by the operator? Or to modify the container spec after deployment? I tried that with kubectl edit pod mypod, but that didn't work.
This is on a physical cluster installed with kubespray.
There are three things to consider:
Your operator can create a controller (e.g. Deployment) instead of Pod, which allows modifications in the Pod Spec area, thus triggering Deployment's rollout (see rolling update strategy).
Use MutatingAdmissionWebhook
so before creating the Pod, its manifest would be modified/overwritten on the fly.
More info regarding MutatingAdmissionWebhook can be found here and here.
A workaround solution in a form of modifying the supply spec -> swapping the pod-a.
More about this was discussed here.
Please let me know if any of the above helped.

Deploy a scalable application on Kubernetes which requires each replica Pod to have different args

I am trying to understand how to deploy an application on Kubernetes which requires each Pod of the same deployment to have different args used with the starting command.
I have this application which runs spark on Kubernetes and needs to spawn executor Pods on start. The problem is that each Pod of the application needs to spawn its own executors using its own port and spark app name.
I've read of stateful sets and searched the documentation but I didn't found a solution to my problem. Since every Pod needs to use a different port, I need that port to be declared in a service if I understood correctly, and also directly passed as an argument to the pod command in the args.
Is there a way to obtain this without using multiple deployments, one for each pod I need to create? Because this is the only solution i can think of but it can't be scaled after being deployed.
I'm using Helm to deploy the application, so I can easily create as many deployments and / or services as needed, but I would like to find a solution which can scale at runtime, if possible.
I don't think you can have a Deployment which creates PODs from different Specs. You can't have it in Kubernetes and Helm won't help here (since Helm is just a template manager over Kubernetes configurations).
What you can do is to specify each Pod as a separate configuration (if single Pod, you don't necessarily need Deployment) and let Helm manage it.
Posting the solution I used since it could be useful for other people searching around.
In the end I found a great configuration to solve my problem. I used a StatefulSet to declare the deployment of the Spark application. Associated with the StatefulSet, a headless Service which expose each pod on a specific port.
StatefulSet can declare a property spec.serviceName which can have the same name of a headless service to create a unique network name for each Pod. Something like <pod_name>.<service_name>
Additionally, each Pod has a unique and not-changing name which is created using the application name and an ordinal starting from 0 for each replica Pod.
Using a starting script in the docker image and inserting in the environment of each Pod the pod name from the metadata, I was able to use different configurations for each pod since, even with the same deployment, each pod have their own unique metadata name and I can use the StatefulSet service to obtain what I needed.
This way, the StatefulSet is scalable at run time and works as expected.
hey I am not sure if this will exactly match your scenario but I think this is what you can try. Use a sidecar container to run the replica instances, A sidecar is a container which runs along with the main container and also shares the same namespace and can share volumes across each container.
Now to pass the different arguments to each container or sidecar, you will have to tweak the dockerfile or rather tweak the way your container starts.
Create a start.sh script file which accepts the arguments and starts the container with those arguments, the trick here is to accept the argument from environment variables thus allowing you to later configure these from configmaps or pod env.
So here is an example of php/laravel application running the same code and starting with different arguments. And the start.sh the file looks like this.
#!/bin/sh
if [ "${CONTAINER_ROLE}" = "queue" ];
then
echo "Running the queue..."
php artisan queue:work --queue=${QUEUENAME}
echo "Queue Started"
else
echo "Running Iceberg."
exec apache2-foreground
fi
So a sample dockerfile looks like this
FROM php:7.1.24-apache
COPY . /srv/myapp
...
...
RUN chown -R www-data:www-data /srv/app \
&& a2enmod remoteip && a2enmod rewrite
WORKDIR /srv/app
RUN chmod +x .docker/start.sh
CMD [ "sh",".docker/start.sh"]
Let me know how it goes.