K8s: Converting a completed job into a pod? - kubernetes

I have a job that runs on deployment of our app. The job runs fine 99.9% of the time but every so often something goes wrong (in the application config) and we need to run commands by hand. Because the job has several initContainers it's not as simple as just running an instance of an application pod and execing into it.
We've considered creating a utility pod as part of the application (and this may be the way to go) but I was wondering if there was a good way to convert a completed job into a pod? I've experimented with getting the pod definition, editing by hand, and then applying; but since it's often urgent when we need to do this and it's quite possible to introduce errors when hand editing, this feels wrong.
I'm sure this can't be an unusual requirement. Are there tools, commands, or approaches to this problem that I'm simply ignorant of?

Option 1: Just re-submit the job
"Converting a job into a pod" is basically what happens when you submit a Job resource to Kubernetes...so one option is just to delete and re-create the job:
kubectl get job myjob -o json | kubectl replace --force -f-
Poof, you have a new running pod!
Option 2: Extract the pod template
You can use jq to extract .spec.template from the Job and attach the necessary bits to turn it into a Pod manifest:
kubectl get job myjob -o json |
jq '
{
"apiVersion": "v1",
"kind": "Pod",
"metadata": {"name": "example"}
} * .spec.template
' |
kubectl apply -f-
The above will create a pod named example; change the name attribute if you want to name it something else.


How to determine what the affinity/anti-affinity of programmatically created pod are?

We occasionally have a pod fail to be scheduled, with an error saying that all 16 nodes failed due to affinity/anti-affinity. We would not expect affinity to prevent scheduling on any of the nodes.
I'd like to determine the actual cause of the affinity failure in scheduling, and for that I think I need to know which affinities the pod was initialized with. However, I can't look at chart configuration files, since these particular pods are being scheduled programmatically at runtime. Is there a kubectl command I can use to view what the pod's affinity was set to, or to determine why every node is failing its affinity checks?
Figured this out on my own. The command I used was:
kubectl get pods <pod_name> -o json | jq '.spec.affinity'
I had to yum install jq for this to work. If you instead want to look at the affinity of all pods, I think you need to remove the pod name and add .items[] in front of .spec in the jq command.
For those curious my affinity has this
"key": "host",
"operator": "In",
"values": [
That "yes" doesn't seem quite right to me. So yeah something funky is happening in our pod creation.
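For the all-pods variant mentioned above, a minimal sketch could look like the following. The function name pod_affinities and the "default" namespace fallback are made up for illustration. The jq filter just puts .items[] in front, as described, and keeps the pod name next to each affinity so the output can be matched back to a pod.

```shell
# Hypothetical helper: print name + affinity for every pod in a namespace.
# Assumes kubectl and jq are installed; "default" is an assumed fallback.
pod_affinities() {
  local ns="${1:-default}"
  kubectl -n "$ns" get pods -o json |
    jq -c '.items[] | {name: .metadata.name, affinity: .spec.affinity}'
}
```

Pods without any affinity set show up with "affinity":null, which makes the odd ones easier to spot.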

GCP Alerting Policy for failed GKE CronJob

What would be the best way to set up a GCP monitoring alert policy for a Kubernetes CronJob failing? I haven't been able to find any good examples out there.
Right now, I have an OK solution based on monitoring logs in the Pod with ERROR severity. I've found this to be quite flaky, however. Sometimes a job will fail for some ephemeral reason outside my control (e.g., an external server returning a temporary 500) and on the next retry, the job runs successfully.
What I really need is an alert that is only triggered when a CronJob is in a persistent failed state. That is, Kubernetes has tried rerunning the whole thing, multiple times, and it's still failing. Ideally, it could also handle situations where the Pod wasn't able to come up either (e.g., downloading the image failed).
Any ideas here?
First of all, confirm the GKE version that you are running. The following commands will help you identify GKE's default version and the available versions:
Default version.
gcloud container get-server-config --flatten="channels" --filter="channels.channel=RAPID" \
Available versions.
gcloud container get-server-config --flatten="channels" --filter="channels.channel=RAPID" \
Now that you know your GKE version: since what you want is an alert that is only triggered when a CronJob is in a persistent failed state, GKE Workload Metrics used to be GCP's fully managed and highly configurable solution for sending all Prometheus-compatible metrics emitted by GKE workloads (such as a CronJob or a Deployment for an application) to Cloud Monitoring. However, it is deprecated as of GKE 1.24 and has been replaced by Google Cloud Managed Service for Prometheus, so the latter is the best option you have inside GCP: it lets you monitor and alert on your workloads using Prometheus, without having to manually manage and operate Prometheus at scale.
Outside of GCP you also have two options: Prometheus itself and Ranch’s Prometheus Push Gateway.
Finally, just FYI, this can also be done manually by querying the job, reading its start time, and comparing that to the current time. In bash:
START_TIME=$(kubectl -n=your-namespace get job your-job-name -o json | jq '.status.startTime')
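The comparison against the current time is not shown in the line above, so here is a rough sketch of how it might be done. The function name job_age_seconds is hypothetical, and the snippet assumes GNU date (date -d), so it will not work as-is with BSD/macOS date. It uses kubectl's jsonpath output instead of jq so the timestamp comes back without JSON quotes.

```shell
# Hypothetical helper: print how many seconds ago a Job started.
# Assumes GNU date (for `date -d`); namespace and job name are arguments.
job_age_seconds() {
  local ns="$1" job="$2" start start_epoch
  # jsonpath prints the raw RFC 3339 timestamp without JSON quotes.
  start=$(kubectl -n "$ns" get job "$job" -o jsonpath='{.status.startTime}') || return 1
  [ -n "$start" ] || return 1   # the Job has no start time yet
  start_epoch=$(date -d "$start" +%s) || return 1
  echo $(( $(date +%s) - start_epoch ))
}
```

Something like [ "$(job_age_seconds your-namespace your-job-name)" -gt 3600 ] would then flag a job that started more than an hour ago; the threshold is arbitrary.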
Or, you are able to get the job’s current status as a JSON blob, as follows:
kubectl -n=your-namespace get job your-job-name -o json | jq '.status'
You can see the following thread for more reference too.
Since the “Failed” state is the central point of your requirement, a bash script that uses kubectl and sends an email when it sees a job in the “Failed” state can be useful. Here are some examples:
while true; do if kubectl get jobs myjob -o jsonpath='{.status.conditions[?(@.type=="Failed")].status}' | grep -q True; then mail -s jobfailed email@address < /dev/null; else sleep 1; fi; done
For newer K8s:
while true; do kubectl wait --for=condition=failed job/myjob && mail -s jobfailed email@address < /dev/null; done
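Since a CronJob creates a new Job object for every run, checking a single fixed job name may not be enough. As a sketch, the same "Failed" condition check could be applied to every Job in a namespace. The function name failed_jobs is hypothetical. A Job only gets the Failed condition once Kubernetes has given up on it (for example after its backoffLimit is exhausted), which seems to match the "persistent failed state" asked about.

```shell
# Hypothetical helper: list Jobs in a namespace whose "Failed" condition is True.
failed_jobs() {
  local ns="$1"
  # One line per Job: <name><TAB><status of the Failed condition, if any>
  kubectl -n "$ns" get jobs \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Failed")].status}{"\n"}{end}' |
    awk -F'\t' '$2 == "True" { print $1 }'
}
```

The output is one job name per line, so it can be checked for emptiness and piped into mail or any other notifier from a cron entry.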

What is proper replacement for initContainer/job which runs once per deployment?

I need a job to run once every time I apply my configuration to Kubernetes. I tried to use a Job, but it does not allow updating the image version and requires deleting the previous job. I tried to use an initContainer, but that seems to be overkill since I don't need this to run every time a pod is started.
Essentially, I have a simple deployment with a single pod and need to run a job before the deployment happens every time I update the image in the deployment, and I am confused about the best way to achieve this. A Job seems perfect for this, but the issue is that I cannot submit the same config a second time with an updated image tag for the job.
I'm afraid there is no other option. I recently had a similar situation and this is how I'm doing the job recreation:
kubectl get job <job name> -o json | \
jq -r '.metadata.annotations."kubectl.kubernetes.io/last-applied-configuration"' | \
kubectl replace --save-config --force -f -
It uses the "kubectl.kubernetes.io/last-applied-configuration" annotation, which holds the initial job config that was applied, without the auto-generated/cluster fields. kubectl replace then completely replaces the current job with a copy of itself, simulating the "rerun" that you are looking for.
You can then add to the snippet above some code replacing the image name/tag, before the kubectl replace.
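As a sketch of that last step, an extra jq stage could set the image before the replace. The function name rerun_job_with_image is hypothetical, and the filter assumes the job's pod template has a single container (index 0). It also assumes the annotation is present, which is only the case if the job was created with kubectl apply or with --save-config.

```shell
# Hypothetical helper: re-create a Job from its last-applied configuration
# with a different image. Assumes a single container in the pod template.
rerun_job_with_image() {
  local job="$1" image="$2"
  kubectl get job "$job" -o json |
    # Extract the originally applied manifest (stored as a JSON string).
    jq -r '.metadata.annotations."kubectl.kubernetes.io/last-applied-configuration"' |
    # Overwrite the image of the first container.
    jq --arg image "$image" '.spec.template.spec.containers[0].image = $image' |
    kubectl replace --save-config --force -f -
}
```

It would be called as, say, rerun_job_with_image myjob myrepo/myimage:v2, where both names are placeholders.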

Knowing the replica count in Kubernetes

I have a batch distributed job I need to run. If I use a Job, a StatefulSet, or whatever, is there a way in K8s for the pod itself (via an env var or similar) to know that it is 1 of X pods running for this job?
I need to chunk up some data and have each process fetch the stuff it needs.
I guess the statefulset hostname setting is one way of doing it. Is there a better option?
This is planned but not yet implemented that I know of. You probably want to look into higher order layers like Argo Workflows or Airflow instead for now.
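In the meantime, the StatefulSet hostname idea from the question can be sketched roughly as below. StatefulSet pods are named <statefulset-name>-<ordinal>, so the ordinal can be cut from the hostname. The total number of pods is not exposed that way, so it would have to be passed in separately, for example through an env var kept in sync with the replica count (TOTAL_SHARDS in the usage line is a made-up name). Both function names are hypothetical.

```shell
# Hypothetical helper: derive a zero-based shard index from a StatefulSet
# pod name, which has the form <statefulset-name>-<ordinal>.
shard_index() {
  local host="${1:-$(hostname)}"
  echo "${host##*-}"   # keep only what follows the last "-"
}

# Hypothetical helper: succeed if item number $1 belongs to shard $2,
# given a total of $3 shards (simple modulo partitioning).
owns_item() {
  [ $(( $1 % $3 )) -eq "$2" ]
}
```

Inside a pod this could be used as: if owns_item "$item_id" "$(shard_index)" "$TOTAL_SHARDS"; then process it; fi. Modulo partitioning is just one way to chunk the data.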
You could write some infrastructure as code using Ansible that will perform the following tasks in order:
kubectl create -f jobs.yml
kubectl wait --for=condition=complete job/job1
kubectl wait --for=condition=complete job/job2
kubectl wait --for=condition=complete job/job3
kubectl create -f pod.yml
kubectl wait can be used in situations like this to halt progress until an action has been performed. In this case, a job has completed its run.
Here is a similar question that someone asked on StackOverflow before.
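If you would rather stay in plain shell than Ansible, the wait steps above could be wrapped in a small function. This is only a sketch: wait_for_jobs is a hypothetical name, and the explicit --timeout is added because kubectl wait otherwise gives up after its default of 30 seconds.

```shell
# Hypothetical helper: block until every named Job completes, stopping at
# the first one that does not complete within the timeout.
wait_for_jobs() {
  local timeout="$1" job
  shift
  for job in "$@"; do
    kubectl wait --for=condition=complete --timeout="$timeout" "job/$job" || {
      echo "job/$job did not complete within $timeout" >&2
      return 1
    }
  done
}
```

The whole sequence would then read: kubectl create -f jobs.yml && wait_for_jobs 600s job1 job2 job3 && kubectl create -f pod.yml, so the pod is only created if all three jobs completed.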

How can I see all Jobs, both successful and failed?

Here is a transcript:
LANELSON$ kubectl --kubeconfig foo get -a jobs
No resources found.
OK, fine; even with the -a option, no jobs exist. Cool! Oh, let's just be paranoid and check for one that we know was created. Who knows? Maybe we'll learn something:
LANELSON$ kubectl --kubeconfig foo get -a job emcc-poc-emcc-broker-mp-populator
emcc-poc-emcc-broker-mp-populator 1 0 36m
Er, um, what?
In this second case, I just happen to know the name of a job that was created, so I ask for it directly. I would have thought that kubectl get -a jobs would have returned it in its output. Why doesn't it?
Of course what I'd really like to do is get the logs of one of the pods that the job created, but kubectl get -a pods doesn't show any of that job's terminated pods either, and of course I don't know the name of any of the pods that the job would have spawned.
What is going on here?
Kubernetes 1.7.4 if it matters.
The answer is that Istio automatic sidecar injection happened to be "on" in the environment (I had no idea, nor should I have). When this happens, you can opt out of it, but otherwise all workloads are affected by default (!). If you don't opt out of it, and Istio's presence causes your Job not to be created for any reason, then your Job is technically uninitialized. If a resource is uninitialized, then it does not show up in kubectl get lists. To make an uninitialized resource show up in kubectl get lists, you need to include the --include-uninitialized option to get. So once I issued kubectl --kubeconfig foo get -a --include-uninitialized jobs, I could see the failed jobs.
My higher-level takeaway is that the initializer portion of Kubernetes, currently in alpha, is not at all ready for prime time yet.