One pod per DAG instead of one pod per task with Airflow Kubernetes Executor? - kubernetes

We decided to run Airflow on Kubernetes. We would like to make use of the power of Kubernetes, but in a balanced way.
We have some very small tasks in our DAGs, for example creating a directory. The KubernetesExecutor spins up a pod for every task, which takes a long time and is overkill for many short tasks.
Is it possible to configure Airflow to spin up one Kubernetes pod for a whole DAG instead of one pod per task? (Preferably without Celery.)

I do not think it is possible to use one pod per DAG, because KubernetesExecutor is designed to request a pod per task:
When a DAG submits a task, the KubernetesExecutor requests a worker pod from the Kubernetes API. The worker pod then runs the task, reports the result, and terminates.
Maybe combining multiple smaller tasks into one is a way to go.
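As a rough illustration of combining small tasks, the sketch below folds two tiny steps into one callable, so a single task (and therefore a single worker pod) runs both. The step names, the "staging" directory, and the "_READY" marker file are hypothetical and only serve the example.

```python
import os


def make_directory(path: str) -> str:
    """A deliberately tiny step: create a directory if it is missing."""
    os.makedirs(path, exist_ok=True)
    return path


def write_marker(path: str) -> str:
    """Another tiny step: drop an empty marker file into the directory."""
    marker = os.path.join(path, "_READY")  # hypothetical file name
    with open(marker, "w"):
        pass
    return marker


def run_small_steps(base_dir: str) -> list:
    """Run all small steps in order inside one process (one task, one pod)."""
    directory = make_directory(os.path.join(base_dir, "staging"))
    marker = write_marker(directory)
    return [directory, marker]
```

Assuming the DAG uses a PythonOperator, run_small_steps could be passed as its python_callable (with base_dir supplied through op_kwargs), replacing two separate tasks with one. The trade-off is coarser retries: if the second step fails, the whole combined task is retried.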

https://airflow.apache.org/docs/apache-airflow/stable/executor/celery_kubernetes.html
The CeleryKubernetesExecutor lets you run a task either on an already-running Celery worker or in a newly created pod. I haven’t configured this setup, but it seems to match your use case.

Related

Kubernetes Job should use all available resources

I run a Kubernetes job with thousands of workers following the pattern described in Coarse Parallel Processing Using a Work Queue. I use the Python client for the Kubernetes API to define the job programmatically. The cluster does not scale automatically. The available resources are unknown at the time of programming.
The goal is to use all available resources of the cluster for my job. I have tried to optimise the .spec.parallelism setting. If I set .spec.parallelism and .spec.completions to the same value, all pods for the job are created at the beginning, but most of them cannot be scheduled due to resource requirements (e.g. insufficient CPU). When the first pods finish, their resources are freed and more pods are scheduled. But after some time (2.4 hours on my cluster) Kubernetes gives up scheduling the remaining pods and marks them as failed, which eventually causes the whole job to fail.
Is there a pattern for a job on a Kubernetes cluster to use all available resources?

Can a Job have multiple distinct tasks running in different pods under it?

In a K8S cluster when a Job has multiple pods under it, are these all replicas?
Can a Job have 5 pods running under it and each of the pod is basically a different task?
Yes, a Job can run multiple pods either sequentially or in parallel. In the spec section of a Job, completions sets the number of pods that must finish successfully; with the default parallelism of 1, they run one after another.
To run them in parallel, also set parallelism in the spec section of the Job YAML.
The former determines how many pods must complete successfully, and the latter limits how many pods can run at the same time.
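To make the two fields concrete, here is a minimal sketch that builds a batch/v1 Job manifest as a plain Python dict. The job name, container name, and image are placeholders, not values from the question.

```python
import json


def build_job_manifest(name: str, image: str, completions: int, parallelism: int) -> dict:
    """Return a batch/v1 Job manifest as a plain dict."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Number of pods that must finish successfully.
            "completions": completions,
            # Upper bound on pods running at the same time.
            "parallelism": parallelism,
            "template": {
                "spec": {
                    "containers": [{"name": "worker", "image": image}],
                    "restartPolicy": "Never",
                }
            },
        },
    }


if __name__ == "__main__":
    # Placeholder values: 5 successful pods in total, at most 2 at a time.
    print(json.dumps(build_job_manifest("example-job", "busybox", 5, 2), indent=2))
```

The printed JSON can be saved to a file and applied with kubectl apply -f, since kubectl accepts JSON as well as YAML. Every pod still comes from the same template, so all five pods run the same image and command.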
I also started a discussion about this topic on the Kubernetes community forum: https://discuss.kubernetes.io/t/can-a-job-have-multiple-distinct-tasks-running-in-different-pods-under-it/11575
As per that discussion, a Job can only have a single pod spec. As a result, even if a Job has multiple pods under it, all of them will be mirror images of each other. There needs to be a separate arrangement to make each of the pods do different things.
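One possible "separate arrangement", sketched under the assumption that the cluster supports Indexed Jobs (completionMode: Indexed in the Job spec): each pod receives its index in the JOB_COMPLETION_INDEX environment variable, and the identical container image uses that index to pick its own task. The three task functions are hypothetical stand-ins.

```python
import os


def extract() -> str:
    return "extract"


def transform() -> str:
    return "transform"


def load() -> str:
    return "load"


# Hypothetical task table: pod index 0 runs extract, 1 runs transform, 2 runs load.
TASKS = [extract, transform, load]


def select_task(environ=os.environ):
    """Pick this pod's task from the index Kubernetes injects for Indexed Jobs."""
    index = int(environ["JOB_COMPLETION_INDEX"])
    return TASKS[index]


if __name__ == "__main__":
    # Inside an Indexed Job pod, the variable is set by Kubernetes.
    print(select_task()())
```

With completions set to the length of TASKS, every pod still shares one pod spec, but each does different work. Another common arrangement is to have every pod pull its task from a work queue.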
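If your steps are sequential rather than parallel, you can define multiple containers under the same Job. Note that regular containers in a pod start together, so strictly ordered steps belong in initContainers, which run one after another.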

Deployment "A" checks a set of checks and scales deployment "B" to run tasks

I have a GKE cluster running (v1.12.8-gke.10). I am trying to set up a specific app that will work the way I want, but I can't seem to find any documentation to piece it together. What I am trying to accomplish may not even be possible.
I would like to set up a deployment (1 pod) using a Python Docker image that runs a looping Python script performing checks. If the checks all pass, I would like this deployment/pod to start/scale another deployment that will do a simple task and then kill the pod that was started.
I am not sure if I should be using a deployment or if I need an HPA mixed somewhere in this process. I have also tried looking at KEDA, but it only has specified triggers and doesn't fit what I am trying to do.
I am expecting two different deployments.
Deploy A = 1 pod constantly running a python script that is checking if it should be sending any commands to Deploy B.
Deploy B = listening for Deploy A to reach out to tell it to start a pod to run a task. After the task is completed, have the pod terminate.
The workflow you describe is possible. The controller would need access to the Kubernetes API, probably using the official Python client. When you receive a request, you would create a Job, and probably pass information about what to run as command-line arguments. The process inside the Job's Pod would do the work and then exit normally. You'd then be responsible for monitoring the Job's status and noticing when it finished, but you wouldn't have to explicitly scale it down; deleting the completed Job would be polite.
The architecture I'd generally recommend here would be to use a job queue like RabbitMQ. You'd have a Deployment for your controller, a separate Deployment for your worker, and a StatefulSet to run the job queue (or perhaps something like the stable/rabbitmq Helm chart). None of these would directly interact with the Kubernetes API. When a new request came in, the controller would post a message to RabbitMQ, and when the worker received a message off the queue, it would do a job.
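A minimal sketch of the Job-creating controller described above, assuming the official kubernetes Python package is installed in the controller image and the pod's service account is allowed to create Jobs. The "task" name prefix, the "default" namespace, and the helper names are illustrative choices, not part of the original answer.

```python
import uuid


def build_task_job(image: str, args: list, name_prefix: str = "task") -> dict:
    """Build a one-off Job manifest; what to run is passed as container args."""
    name = f"{name_prefix}-{uuid.uuid4().hex[:8]}"  # unique, DNS-safe Job name
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 0,  # assumption: report failure instead of retrying
            "template": {
                "spec": {
                    "containers": [{"name": "task", "image": image, "args": args}],
                    "restartPolicy": "Never",
                }
            },
        },
    }


def submit_job(manifest: dict, namespace: str = "default"):
    """Create the Job from inside the cluster (needs the kubernetes package)."""
    from kubernetes import client, config

    config.load_incluster_config()  # use the pod's service account credentials
    return client.BatchV1Api().create_namespaced_job(namespace=namespace, body=manifest)


def job_state(succeeded, failed) -> str:
    """Interpret the succeeded/failed counters from a Job's status (None means 0)."""
    if succeeded:
        return "succeeded"
    if failed:
        return "failed"
    return "running"
```

The controller would periodically read the Job back (for example with BatchV1Api.read_namespaced_job_status), pass status.succeeded and status.failed to job_state, and delete the Job once it is no longer running.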
This has the advantage of being easier to develop locally (you can just run RabbitMQ on your laptop or in a container, but getting access to the Kubernetes API is harder). If you suddenly get swamped with a huge number of job submissions, you won't try to overload the cluster with thousands of jobs; they'll back up in RabbitMQ and you can do them one at a time. If you want the cluster to do more, you can kubectl scale deployment to get more workers. If you run out of jobs the worker pod(s) will sit idle but that's not really a problem.
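The controller/worker split can be sketched in-process with the standard library. Here queue.Queue is only a stand-in for RabbitMQ, and threads stand in for worker pods; a real deployment would use a RabbitMQ client library instead. The message format and function names are made up for the example.

```python
import queue
import threading


def controller(work_queue: queue.Queue, requests: list) -> None:
    """Post one message per incoming request; never touches the Kubernetes API."""
    for request in requests:
        work_queue.put(request)


def worker(work_queue: queue.Queue, results: list) -> None:
    """Take messages off the queue one at a time until a None sentinel arrives."""
    while True:
        message = work_queue.get()
        if message is None:
            break
        results.append(f"done:{message}")  # placeholder for the real task


def run_demo(requests: list, worker_count: int = 2) -> list:
    work_queue = queue.Queue()
    results = []
    threads = [
        threading.Thread(target=worker, args=(work_queue, results))
        for _ in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    controller(work_queue, requests)
    for _ in threads:
        work_queue.put(None)  # one shutdown sentinel per worker
    for thread in threads:
        thread.join()
    return results
```

Raising worker_count plays the role of kubectl scale deployment: pending messages simply wait in the queue until a worker is free, so a burst of requests never turns into a burst of pods.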

How to best run Apache Airflow tasks on a Kubernetes cluster?

What we want to achieve:
We would like to use Airflow to manage our machine learning and data pipelines while using Kubernetes to manage the resources and schedule the jobs. Airflow should orchestrate the workflow (e.g. task dependencies and re-running jobs on failure) and Kubernetes should orchestrate the infrastructure (e.g. cluster autoscaling and assigning individual jobs to nodes). In other words, Airflow tells the Kubernetes cluster what to do and Kubernetes decides how to distribute the work. At the same time, we want Airflow to be able to monitor the status of individual tasks. For example, if we have 10 tasks spread across a cluster of 5 nodes, Airflow should be able to communicate with the cluster and report something like: 3 “small tasks” are done, 1 “small task” has failed and will be scheduled to re-run, and the remaining 6 “big tasks” are still running.
Questions:
Our understanding is that Airflow has no Kubernetes-Operator, see open issues at https://issues.apache.org/jira/browse/AIRFLOW-1314. That being said we don’t want Airflow to manage resources like managing service accounts, env variables, creating clusters, etc. but simply send tasks to an existing Kubernetes cluster and let Airflow know when a job is done. An alternative would be to use Apache Mesos but it looks less flexible and less straightforward compared to Kubernetes.
I guess we could use Airflow’s bash_operator to run kubectl, but this does not seem like the most elegant solution.
Any thoughts? How do you deal with that?
Airflow has both a Kubernetes Executor and a Kubernetes Operator (the KubernetesPodOperator).
You can use the Kubernetes Operator to send tasks (in the form of Docker images) from Airflow to Kubernetes via whichever Airflow executor you prefer.
Based on your description though, I believe you are looking for the KubernetesExecutor to schedule all your tasks against your Kubernetes cluster. As you can see from the source code it has a much tighter integration with Kubernetes.
This will also allow you to not have to worry about creating the docker images ahead of time as is required with the Kubernetes Operator.

How to properly use Kubernetes for job scheduling?

I have the following system in mind: A master program that polls a list of tasks to see if they should be launched (based on some trigger information). The tasks themselves are container images in some repository. Tasks are executed as jobs on a Kubernetes cluster to ensure that they are run to completion. The master program is a container executing in a pod that is kept running indefinitely by a replication controller.
However, I have not stumbled upon this pattern of launching jobs from a pod. Every tutorial seems to be assuming that I just call kubectl from outside the cluster. Of course I could do this but then I would have to ensure the master program's availability and reliability through some other system. So am I missing something? Launching one-off jobs from inside an indefinitely running pod seems to me as a perfectly valid use case for Kubernetes.
Your master program can utilize the Kubernetes client libraries to perform operations on a cluster. Find a complete example here.
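A rough sketch of the polling master itself, with the launch step injected as a function so the loop stays independent of any particular client library. The Task fields, the trigger callables, and the 30-second interval are assumptions made for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    name: str
    image: str  # container image to run for this task
    should_launch: Callable[[], bool]  # hypothetical trigger check


def poll_once(tasks: List[Task], launch: Callable[[Task], None]) -> List[str]:
    """Check every task's trigger once and launch those that are due."""
    launched = []
    for task in tasks:
        if task.should_launch():
            launch(task)
            launched.append(task.name)
    return launched


def run_forever(tasks: List[Task], launch: Callable[[Task], None],
                interval_seconds: float = 30.0) -> None:
    """Main loop of the master pod; the pod's controller restarts it on a crash."""
    while True:
        poll_once(tasks, launch)
        time.sleep(interval_seconds)
```

In a real master program, launch would wrap a Kubernetes client call that creates a Job for task.image, for example BatchV1Api.create_namespaced_job from the Python client after config.load_incluster_config().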