Can a Job have multiple distinct tasks running in different pods under it? - kubernetes

In a K8S cluster when a Job has multiple pods under it, are these all replicas?
Can a Job have 5 pods running under it and each of the pod is basically a different task?

Yes, there is a provision for running multiple pods under a Job, either sequentially or in parallel. In the spec section of a Job you can set completions, which is the number of pods that must finish successfully.
If you want pods to run in parallel, you can similarly set parallelism in the Job spec.
The former controls how many pods must complete execution successfully, and the latter limits how many pods can run at the same time.
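As a sketch, a Job that runs 5 pods with at most 2 in parallel might look like this (the name, image, and command are placeholders):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job        # hypothetical name
spec:
  completions: 5           # 5 pods must finish successfully
  parallelism: 2           # at most 2 pods run at the same time
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: busybox     # placeholder image
        command: ["sh", "-c", "echo processing && sleep 5"]
```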

I also started a discussion about this topic on the Kubernetes community forum: https://discuss.kubernetes.io/t/can-a-job-have-multiple-distinct-tasks-running-in-different-pods-under-it/11575
According to that discussion, a Job can only have a single pod template. As a result, even if a Job has multiple pods deployed under it, they are all mirror images of each other. There needs to be a separate arrangement to make each of the pods do different things.

If your steps are sequential and not parallel, you can define multiple containers under the same Job; note, though, that regular containers in a pod start concurrently, so sequential steps are better expressed as init containers, which run one after another.
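As a hedged sketch (names, images, and commands are placeholders), sequential steps can be expressed as init containers in a Job's pod template, since init containers run in order and each must succeed before the next starts:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: sequential-steps   # hypothetical name
spec:
  template:
    spec:
      restartPolicy: Never
      initContainers:      # init containers run one after another
      - name: step-1
        image: busybox     # placeholder image
        command: ["sh", "-c", "echo step 1"]
      - name: step-2
        image: busybox
        command: ["sh", "-c", "echo step 2"]
      containers:          # runs only after all init containers succeed
      - name: final-step
        image: busybox
        command: ["sh", "-c", "echo done"]
```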

Related

Run different replica count for different containers within same pod

I have a pod with 2 closely related services running as containers. I am running it as a StatefulSet with replicas set to 5, so 5 pods are created, each with both containers.
My requirement is to have the second container run in only 1 pod, not all 5, while the first service still runs in all 5 pods.
Is there a way to define this in the Kubernetes deployment YAML? Please help.
A pod is the smallest entity managed by Kubernetes, and one pod can contain multiple containers, but you can only specify one pod template per Deployment/StatefulSet, so there is no way to accomplish what you are asking for with only one Deployment/StatefulSet.
However, if you want to be able to scale them independently of each other, you can create two Deployments/StatefulSets. In my opinion this is the only way to do so.
See https://kubernetes.io/docs/concepts/workloads/pods/ for more information.
Containers are like processes,
Pods are like VMs,
and Statefulsets/Deployments are like the supervisor program controlling the VM's horizontal scaling.
The only way to handle your scenario is to define the second container in a new Deployment's pod template with replicas set to 1, while keeping the old StatefulSet at 5 replicas.
Here are some definitions from documentations (links in the references):
Containers are technologies that allow you to package and isolate applications with their entire runtime environment—all of the files necessary to run. This makes it easy to move the contained application between environments (dev, test, production, etc.) while retaining full functionality. [1]
Pods are the smallest, most basic deployable objects in Kubernetes. A Pod represents a single instance of a running process in your cluster. Pods contain one or more containers. When a Pod runs multiple containers, the containers are managed as a single entity and share the Pod's resources. [2]
A deployment provides declarative updates for Pods and ReplicaSets. [3]
StatefulSet is the workload API object used to manage stateful applications. Manages the deployment and scaling of a set of Pods, and provides guarantees about the ordering and uniqueness of these Pods. [4]
Based on all that information, it is impossible to match your requirements with one Deployment/StatefulSet.
I advise you to try the idea @David Maze mentioned in a comment under your question:
If it's possible to have 4 of the main application container not having a matching same-pod support container, then they're not so "closely related" they need to run in the same pod. Run the second container in a separate Deployment/StatefulSet (also with a separate Service) and you can independently control the replica counts.
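A minimal sketch of that split, with hypothetical names and images: keep the existing StatefulSet at 5 replicas for the first service, and add a separate Deployment running only the second container once:

```yaml
# Hypothetical second workload: runs the support container alone, once.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: support-service       # placeholder name
spec:
  replicas: 1                 # only one pod for the second container
  selector:
    matchLabels:
      app: support-service
  template:
    metadata:
      labels:
        app: support-service
    spec:
      containers:
      - name: support
        image: support-image:latest   # placeholder image
```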
References:
Documentation about Containers
Documentation about Pods
Documentation about Deployments
Documentation about StatefulSet

How to make sure cronjobs get spread around a Kubernetes cluster?

I need to run cronjobs on a K8s cluster that test the health of nodes. I need to ensure that it will continue to work if there's a network partition.
I've thought of 2 possibilities:
Run multiple replicas of the cronjob on each invocation, and use anti-affinity rules to make sure they're run on different nodes
Find a way of configuring K8s to distribute successive cronjobs on nodes in different zones.
I can't find any information on either. Is this possible?
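For the first option, a hedged sketch of what the anti-affinity rule might look like in the CronJob's pod template (the name, schedule, label, and image are placeholders): the required pod anti-affinity on the node's hostname keeps each invocation's replicas on different nodes.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: node-health-check     # hypothetical name
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      parallelism: 3          # several replicas per invocation
      completions: 3
      template:
        metadata:
          labels:
            app: node-health-check
        spec:
          restartPolicy: Never
          affinity:
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchLabels:
                    app: node-health-check
                topologyKey: kubernetes.io/hostname   # at most one pod per node
          containers:
          - name: health-check
            image: health-check:latest   # placeholder image
```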

One pod per DAG instead of one pod per task with Airflow Kubernetes Executor?

We decided to run Airflow on Kubernetes. We would like to make use of the power of Kubernetes, but in a balanced way.
We have some very small tasks in our DAGs, for example create a directory. The KubernetesExecutor spins up a pod for every task, this takes long and therefore is overkill for many short tasks.
My question is, is it possible to configure Airflow to spin up a Kubernetes pod for a whole DAG, instead of a pod per task? (Preferably without Celery)
I do not think it is possible to use one pod per DAG, because the KubernetesExecutor is designed to request a pod per task:
When a DAG submits a task, the KubernetesExecutor requests a worker pod from the Kubernetes API. The worker pod then runs the task, reports the result, and terminates.
Maybe combining multiple smaller tasks into one is a way to go.
https://airflow.apache.org/docs/apache-airflow/stable/executor/celery_kubernetes.html
The CeleryKubernetesExecutor allows you to use the immediate resources of a Celery worker or spin up a pod for a task. I haven't configured this setup myself, but it seems to match your use case.

Kubernetes batch performance with activation of thousands of pods using jobs

I am writing a pipeline with Kubernetes on Google Cloud.
I sometimes need to start several pods within a second, where each pod runs a single task.
I plan to create a Kubernetes Job with kubectl, wait for it to complete (polling all running pods every second), and then activate the next step in the pipeline.
I will also monitor the cluster size to make sure I am not exceeding the max CPU/RAM usage.
I can run tens of thousands of jobs at the same time.
I am not using standard pipelines because I need to create a dynamic number of tasks in the pipeline.
I am running the batch operation so I can handle the delay.
Is it the best approach? How long does it take to create a pod in Kubernetes?
If you want to run tens of thousands of jobs at the same time, you will definitely need to plan resource allocation. Estimate the number of nodes you need; then you can either create all nodes up front or use the GKE cluster autoscaler to add nodes automatically in response to resource demand. If you preallocate all nodes at once, you will probably have a high bill at the end of the month, but pods can be created very quickly. If you start with only a small number of nodes and rely on the cluster autoscaler, you will face large delays, because new nodes take several minutes to start. You must decide which trade-off suits you.
If you use the cluster autoscaler, do not forget to set a maximum node count for the cluster.
Another important thing: put your jobs into the Guaranteed quality-of-service class. Otherwise, with Best-Effort or Burstable pods, you will end up in an eviction nightmare that is really hard to control.
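A pod gets the Guaranteed QoS class when every container's resource requests equal its limits, for both CPU and memory. A minimal sketch (name, image, and values are placeholders):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-task         # placeholder name
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: task
        image: task-image:latest   # placeholder image
        resources:
          requests:        # requests == limits => Guaranteed QoS
            cpu: "500m"
            memory: "256Mi"
          limits:
            cpu: "500m"
            memory: "256Mi"
```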

Kubernetes pods N:M scheduling how-to

Batch computations (Monte Carlo) using a Docker image, multiple jobs running on Google Cloud and managed by Kubernetes. No Replication Controllers, just multiple pods with a Never restart policy delivering computed payloads to our server. So far so good. The problem: I have a cluster with N nodes/minions and M jobs to compute, where M > N. I would like to fire off M pods at once and have Kubernetes schedule them so that only N run at a given time while the rest are kept Pending. As soon as one pod finishes, the next is scheduled, moving from Pending to Running, and so on until all M pods are done.
Is it possible to do so?
Yes. Have every pod request a resource of which there is only one per node; then the scheduler cannot schedule more than N at a time. The most common way to do this is to have each pod ask for a hostPort in the ports section of its container spec.
However, I'm not completely sure why you would want to limit the system to one such pod per node. If there are enough resources available to run multiple pods at a time on each node, letting them run should speed up your job.
Just for the record, after a discussion with Alex, some trial and error, and a binary search for a good number, what worked for me was setting the CPU resource limit in the pod JSON to:
"resources": {
  "limits": {
    "cpu": "490m"
  }
}
I have no idea how or why this particular value influences the Kubernetes scheduler, but it keeps the nodes churning through the jobs, with exactly one pod per node running at any given moment.