Airflow Kubernetes Executor pods go into "NotReady" state instead of being deleted

I installed Airflow in Kubernetes using the repo https://airflow-helm.github.io/charts and the airflow-stable/airflow chart, version 8.1.3, so I have Airflow v2.0.1 installed. I have it set up with an external Postgres database and the Kubernetes executor.
What I have noticed is that when Airflow-related pods are done, they go into a "NotReady" status. This happens with the update-db pod at startup and also with pods launched by the Kubernetes executor. When I go into Airflow and look at the tasks, some are successful and some are failures, but either way the related pods end up in "NotReady" status. In the values file I set the options below, thinking they would delete the pods once they finish. I've gone through the logs and confirmed that one of the DAGs ran as intended and the related task succeeded, yet the related pod still went into "NotReady" status when it was done.
The values below are located in Values.airflow.config.
AIRFLOW__KUBERNETES__DELETE_WORKER_PODS: "true"
AIRFLOW__KUBERNETES__DELETE_WORKER_PODS_ON_FAILURE: "true"
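For reference, a minimal sketch of how these settings sit in the chart's values.yaml, assuming the layout of the airflow-helm community chart where entries under airflow.config become environment variables on the Airflow pods:
airflow:
  config:
    # passed through to the scheduler/worker pods as environment variables
    AIRFLOW__KUBERNETES__DELETE_WORKER_PODS: "true"
    AIRFLOW__KUBERNETES__DELETE_WORKER_PODS_ON_FAILURE: "true"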
So I'm not really sure what I'm missing. Has anyone seen this behavior? It's also really strange that the upgrade-db pod is doing this too.
[Screenshot of kubectl get pods for the namespace Airflow is deployed in, showing the "NotReady" pods]

Figured it out. The Kubernetes namespace had automatic injection of a linkerd sidecar container into each pod. I would have to either use the Celery executor or set up some sort of Kubernetes job to clean up completed pods and jobs, since they don't get cleaned up because the linkerd container keeps running forever in those pods.
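As an aside that is not part of the answer above, and assuming a standard linkerd installation: injection can also be skipped for individual pods by putting linkerd's inject annotation on the pod template, for example:
metadata:
  annotations:
    linkerd.io/inject: disabled  # tells the linkerd injector to skip this pod (standard linkerd annotation, not Airflow-specific)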

Related

Airflow Clean Up Pods

Currently we have a CronJob to clean up pods deployed by Airflow.
The cleanup CronJob in Airflow is defined as follows.
It cleans all completed pods (successful pods and pods that are marked as Error).
I have a requirement where the cleanup CronJob shouldn't clean pods that are marked as ERROR.
I checked the Airflow docs but couldn't find anything. Is there any other way I can achieve this?
There are two Airflow environment variables that might help.
AIRFLOW__KUBERNETES__DELETE_WORKER_PODS - If True, all worker pods will be deleted upon termination
AIRFLOW__KUBERNETES__DELETE_WORKER_PODS_ON_FAILURE - If False (and delete_worker_pods is True), failed worker pods will not be deleted so users can investigate them. This only prevents removal of worker pods where the worker itself failed, not when the task it ran failed
For more details, see the Airflow configuration reference.
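Separately, as a hedged sketch rather than anything from the answer above: if the cleanup CronJob simply shells out to kubectl, it could restrict itself to successfully completed pods with a field selector, which leaves pods in the Error state untouched for investigation (namespace name assumed):
kubectl delete pods -n airflow --field-selector=status.phase=Succeeded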

While deploying Kafka on on-premises k8s, the status of the pod is Pending for a long time

I am trying to use Helm charts to deploy Kafka and ZooKeeper in a local k8s cluster, but when checking the status of the respective pods they show Pending for a long time, and the pods are not assigned to any node even though I have 2 healthy worker nodes running.
I tried deleting the pods and redeploying, but I landed in the same situation and am not able to get the pods running. I need help on how I can run these pods.
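A hedged first diagnostic step (pod and namespace names here are placeholders): describe the pending pod and check the Events section for the scheduler's reason, such as insufficient CPU or memory or an unbound PersistentVolumeClaim:
kubectl describe pod <kafka-pod-name> -n <namespace>
kubectl get events -n <namespace> --sort-by=.metadata.creationTimestamp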

Kubernetes pod failed to update

We have a GitLab CI/CD pipeline to deploy pods via Kubernetes. However, the updated pod is always Pending and the deleted pod is always stuck at Terminating.
The controller and scheduler are both okay.
If I describe the pending pod, it shows that it is scheduled but nothing else.
These are the pending pod's logs:
$ kubectl logs -f robo-apis-dev-7b79ccf74b-nr9q2 -n xxx -f
Error from server (BadRequest): container "robo-apis-dev" in pod "robo-apis-dev-7b79ccf74b-nr9q2" is waiting to start: ContainerCreating
What could be the issue? Our Kubernetes cluster never had this issue before.
Okay, it turns out we used to have an NFS server backing our PVCs. We recently moved to AWS EKS and cleaned up the NFS servers, but maybe some resources on the nodes still referenced the NFS server. Once we temporarily rolled the NFS server back, the pods started moving to the RUNNING state.
The issue was discussed here - Orphaned pod https://github.com/kubernetes/kubernetes/issues/60987
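To check whether a node is hitting the same orphaned-pod symptom described in that issue (assuming a systemd-managed kubelet, which is an assumption about this cluster), the kubelet logs on the node can be searched for the orphaned-pod messages:
journalctl -u kubelet | grep -i "orphaned pod"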

AWS EKS Kubernetes deployments are not ready, NodePort and LoadBalancer are not reachable

I am trying to deploy pods on the EKS cluster. Below are some screenshots which show that the AWS EKS cluster is created and active and the node groups are also active. Now when I try to deploy any pod, like nginx, WordPress, or something else, they are not in the Ready state. I tried deploying the Kubernetes dashboard and it is in the Ready state, but I don't know why the others are not, and that's why their URLs are not reachable.
Also, while checking the logs it says the following:
Error from server (NotFound): pods "deployment-2048-64549f6964-87d59" not found
The pods are in the Pending state. If a pod is stuck in Pending, it means it cannot be scheduled onto a node. This can happen because there are insufficient resources of one type or another, which prevents the pods from being scheduled.
You can look at the output of kubectl describe <deployment/pod_name>. There will be messages from the scheduler about why it cannot schedule your pod.
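For example, with placeholder names (both the namespace and resource names here are assumptions):
kubectl describe deployment <deployment-name> -n <namespace>
kubectl describe pod <pod-name> -n <namespace>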

Can't shut down influxDB in Kubernetes

I have spun up a Kubernetes cluster in AWS using the official "kube-up" mechanism. By default, an addon that monitors the cluster and logs to InfluxDB is created. It has been noted in this post that InfluxDB quickly fills up disk space on nodes, and I am seeing this same issue.
The problem is, when I try to kill the InfluxDB replication controller and service, it "magically" comes back after a time. I do this:
kubectl delete rc --namespace=kube-system monitoring-influx-grafana-v1
kubectl delete service --namespace=kube-system monitoring-influxdb
kubectl delete service --namespace=kube-system monitoring-grafana
Then if I say:
kubectl get pods --namespace=kube-system
I do not see the pods running anymore. However after some amount of time (minutes to hours), the replication controllers, services, and pods are back. I don't know what is restarting them. I would like to kill them permanently.
You probably need to remove the manifest files for influxdb from the /etc/kubernetes/addons/ directory on your "master" host. Many of the kube-up.sh implementations use a service (usually at /etc/kubernetes/kube-master-addons.sh) that runs periodically and makes sure that all the manifests in /etc/kubernetes/addons/ are active.
You can also restart your cluster, running export ENABLE_CLUSTER_MONITORING=none before kube-up.sh. You can see other environment settings that affect the cluster kube-up.sh builds in cluster/aws/config-default.sh.
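Putting both options from this answer into concrete commands (the exact manifest filenames and subdirectory under /etc/kubernetes/addons/ vary between kube-up versions, so treat them as assumptions):
# Option 1: on the master host, find and remove the InfluxDB/Grafana addon manifests
ls /etc/kubernetes/addons/ | grep -i influx
# then delete the matching manifest files so the addon loop stops re-creating the resources
# Option 2: rebuild the cluster with cluster monitoring disabled
export ENABLE_CLUSTER_MONITORING=none
./cluster/kube-up.sh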