Trigger Kubernetes/EKS cron job via HTTP call

I have a bunch of cron jobs that sit in an EKS cluster, and I would like to trigger them via an HTTP call. Does such an API exist in Kubernetes? If not, what else can be done?

Every action in Kubernetes can be invoked via a REST API call, as stated in the docs.
There is a full API reference for the Kubernetes API you can review.
In fact, kubectl uses HTTP under the hood. You can see those HTTP calls by passing the -v flag with a verbosity level. For example:
$ kubectl get pods -v=6
I1206 00:06:33.591871 19308 loader.go:372] Config loaded from file: /home/blue/.kube/config
I1206 00:06:33.826009 19308 round_trippers.go:454] GET https://mycluster.azmk8s.io:443/api?timeout=32s 200 OK in 233 milliseconds
...
So you can work out the call you need by watching how kubectl does it. But given that kubectl itself speaks HTTP, it may be easier to just use kubectl directly.

A cron job by definition is triggered by a time event (every hour, every month).
If you want to force a trigger, you can use:
kubectl create job --from=cronjob/<cron-job-name> <job-name> -n <namespace>
This goes through the Kubernetes API, which is a RESTful service, so I think it fulfills your request.
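If you want to see the raw HTTP equivalent, here is a minimal sketch that fetches the CronJob's jobTemplate and POSTs a new Job built from it. The API server URL, token, and resource names are placeholders, and the jq reshaping is an illustrative assumption rather than the exact requests kubectl sends (the batch/v1 paths assume Kubernetes 1.21+):
APISERVER=https://<your-cluster-endpoint>
TOKEN=<service-account-bearer-token>
NS=default
CRONJOB=my-cronjob
# 1) fetch the CronJob and build a Job manifest from its jobTemplate
curl -sk -H "Authorization: Bearer $TOKEN" \
  "$APISERVER/apis/batch/v1/namespaces/$NS/cronjobs/$CRONJOB" \
  | jq '{apiVersion: "batch/v1", kind: "Job", metadata: {name: "manual-run-1", namespace: "default"}, spec: .spec.jobTemplate.spec}' > job.json
# 2) POST the new Job to the Jobs collection endpoint
curl -sk -X POST -H "Content-Type: application/json" -H "Authorization: Bearer $TOKEN" \
  "$APISERVER/apis/batch/v1/namespaces/$NS/jobs" -d @job.json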

Related

GCP Alerting Policy for failed GKE CronJob

What would be the best way to set up a GCP monitoring alert policy for a Kubernetes CronJob failing? I haven't been able to find any good examples out there.
Right now, I have an OK solution based on monitoring logs in the Pod with ERROR severity. I've found this to be quite flaky, however. Sometimes a job will fail for some ephemeral reason outside my control (e.g., an external server returning a temporary 500) and on the next retry, the job runs successfully.
What I really need is an alert that is only triggered when a CronJob is in a persistent failed state. That is, Kubernetes has tried rerunning the whole thing, multiple times, and it's still failing. Ideally, it could also handle situations where the Pod wasn't able to come up either (e.g., downloading the image failed).
Any ideas here?
Thanks.
First of all, confirm the GKE version that you are running. The following commands will help you identify the default version and the available versions:
Default version.
gcloud container get-server-config --flatten="channels" --filter="channels.channel=RAPID" \
--format="yaml(channels.channel,channels.defaultVersion)"
Available versions.
gcloud container get-server-config --flatten="channels" --filter="channels.channel=RAPID" \
--format="yaml(channels.channel,channels.validVersions)"
Now that you know your GKE version: since what you want is an alert that triggers only when a CronJob is in a persistent failed state, GKE Workload Metrics used to be the GCP solution for this; it was a fully managed and highly configurable way of sending all Prometheus-compatible metrics emitted by GKE workloads (such as a CronJob or a Deployment) to Cloud Monitoring. However, it was deprecated in GKE 1.24 and replaced with Google Cloud Managed Service for Prometheus, so that is now the best option you've got inside GCP: it lets you monitor and alert on your workloads, using Prometheus, without having to manually manage and operate Prometheus at scale.
Plus, you have two options from outside of GCP: self-managed Prometheus, and the Prometheus Pushgateway.
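If you go the managed route, it can be enabled on an existing cluster with a single command; the flag assumes a reasonably recent gcloud, and the cluster name and zone below are placeholders:
gcloud container clusters update my-cluster --zone=us-central1-a --enable-managed-prometheus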
Finally, and just FYI, it can also be done manually by querying for the job, reading its start time, and comparing that to the current time, like this with bash:
START_TIME=$(kubectl -n=your-namespace get job your-job-name -o json | jq '.status.startTime')
echo $START_TIME
Or you can get the job's current status as a JSON blob, as follows:
kubectl -n=your-namespace get job your-job-name -o json | jq '.status'
You can see the following thread for more reference too.
Taking the “Failed” state as the crux of your requirement, setting up a bash script with kubectl that sends an email when it sees a job in the “Failed” state can be useful. Here are some examples:
while true; do if kubectl get jobs myjob -o jsonpath='{.status.conditions[?(@.type=="Failed")].status}' | grep -q True; then echo "myjob failed" | mail -s jobfailed email@address; break; else sleep 1; fi; done
For newer K8s:
kubectl wait --for=condition=failed job/myjob && echo "myjob failed" | mail -s jobfailed email@address
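If you want the alert gated strictly on "Kubernetes has exhausted its retries", you can also compare the Job's failed-pod count against its backoffLimit. A rough sketch with hypothetical names (backoffLimit defaults to 6 when unset, and the exact off-by-one of the retry accounting may vary by version):
FAILED=$(kubectl -n your-namespace get job your-job-name -o jsonpath='{.status.failed}')
LIMIT=$(kubectl -n your-namespace get job your-job-name -o jsonpath='{.spec.backoffLimit}')
if [ "${FAILED:-0}" -ge "${LIMIT:-6}" ]; then
  echo "job is in a persistent failed state"
fi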

GKE pod replica count in cluster

How can we obtain the count of pods running in a GKE cluster? I found there are ways to get the node count, but we need the pod count as well. It would be better if we can use something that doesn't require logging in GCP Operations.
You can do it with the Kubernetes Python client library, as shown in this question posted by Pradeep Padmanaban C, where he was looking for a more effective way of doing it. His example is actually the best you can do for such an operation, as there is no dedicated method that would let you count pods without retrieving their full manifests:
from kubernetes import client, config

# load credentials from ~/.kube/config
config.load_kube_config()
v1 = client.CoreV1Api()
# list pods across all namespaces and count them
ret_pod = v1.list_pod_for_all_namespaces(watch=False)
print(len(ret_pod.items))
You can also use a different method, which retrieves pods only from a specific namespace, e.g.:
print(len(v1.list_namespaced_pod("default").items))
With kubectl you can do it as follows (as proposed here by RammusXu):
kubectl get pods --all-namespaces --no-headers | wc -l
You can access the Kubernetes API directly with a RESTful call. Make sure you authenticate by including a bearer token in the request.
Once you can query the API server directly, you can use GET <master_endpoint>/api/v1/pods to list all the pods in the cluster. You can also scope it to a specific namespace with /api/v1/namespaces/<namespace>/pods.
Keep in mind that the kubectl CLI tool is just a wrapper around these API calls: each kubectl command forms a RESTful call in a format similar to the one above, so any interaction you have with the cluster through kubectl can also be achieved through RESTful API calls.
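For example, here is a sketch of that call with curl. It assumes kubectl 1.24+ for kubectl create token, jq for the counting, and -k to skip CA verification (pass --cacert instead in real use):
APISERVER=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')
TOKEN=$(kubectl create token default)   # any service account with permission to list pods
curl -sk -H "Authorization: Bearer $TOKEN" "$APISERVER/api/v1/pods" | jq '.items | length'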

Openshift deployment validation - QA

I wanted to know if there's any tool that can validate an OpenShift deployment. Say you have a deployment configuration file with different features (secrets, routes, services, environment variables, etc.), and you want to validate, after the deployment has finished and the pod(s) are created in OpenShift, that all those things are there as requested in the file. Like a tool for QA.
thanks
Readiness probes can execute HTTP requests against the pod to confirm its availability; they can also execute commands inside the container to confirm that the desired resources are present. See the Readiness probe documentation.
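For instance, attaching an HTTP readiness probe to an existing deployment is a one-liner; the deployment name, port, and path here are hypothetical:
$ oc set probe deployment/my-app --readiness --get-url=http://:8080/healthz --initial-delay-seconds=5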
Kubernetes has a --dry-run flag for resource creation that performs basic syntax verification and object schema validation without actually creating anything, so you can test all the objects defined in the deployment manifest file.
This is also achievable through the OpenShift client:
$ oc create -f deployment-app.yaml --dry-run
or
$ oc apply -f deployment-app.yaml --dry-run
You can find some useful OpenShift client commands in Developer CLI Operations documentation page.
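Note that newer clients split the bare --dry-run boolean into client and server variants; the server-side form additionally runs the manifest through admission and schema validation on the live cluster, which is closer to a real QA check:
$ oc apply -f deployment-app.yaml --dry-run=server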
For one-time validation, you can create a Job (OpenShift) with an Init Container (OpenShift) that waits until the deployment process is done, and then run a test/shell script with a sequence of kubectl/curl/other commands to ensure that every piece of the deployment is in place and in the desired state.
For continuous validation, you can create a CronJob (OpenShift) that will periodically create a test Job and report the result somewhere.
This answer can help you to create all that stuff.
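As a sketch of what the test script inside such a Job could look like, with every resource name below hypothetical (in practice the list would be derived from your deployment file):
#!/usr/bin/env bash
set -e
NS=my-project
# each check exits non-zero (failing the Job) if the resource is missing
oc get secret my-secret -n "$NS" >/dev/null
oc get route my-route -n "$NS" >/dev/null
oc get service my-service -n "$NS" >/dev/null
# verify a requested environment variable actually landed in the deployment
oc get deployment my-app -n "$NS" -o jsonpath='{.spec.template.spec.containers[0].env[?(@.name=="MY_VAR")].value}' | grep -q expected-value
echo "all checks passed"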

Configure the master api server to check cordon node and destroy if it has no jobs running

Team,
We need to roll out some drivers on worker nodes of a K8s cluster and our flow is as below:
cordon node [no more scheduling]
wait for jobs to complete
destroy
Is there a way I can automate this using K8s options themselves, instead of writing a bash script to do those checks every time? We don't know when the pods will complete. So, can we configure the master API server to check a cordoned node and destroy it if it has no jobs running?
You can write your own application using the Go client, Python client, or Java client, and basically do this:
$ kubectl apply -f yourjob.yaml
$ kubectl cordon <nodename>
$ kubectl wait --for=condition=complete job/myjob
$ kubectl drain <nodename>
# Terminate your node if drain returns successfully
If this is a frequent pattern, you could also probably leverage a custom controller (operator) with a custom resource definition (CRD) to do that. You will have to embed the code of your application that talks to the API server.
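As a stopgap before building an operator, the whole flow also scripts cleanly. The node and job names are placeholders, and the drain flags assume a recent kubectl:
#!/usr/bin/env bash
set -e
NODE=worker-1
JOB=myjob
kubectl cordon "$NODE"                 # no more scheduling on the node
kubectl wait --for=condition=complete "job/$JOB" --timeout=-1s   # negative timeout waits up to a week
kubectl drain "$NODE" --ignore-daemonsets --delete-emptydir-data
# terminate the node here via your cloud provider's CLI/API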

What is the Kubernetes ApiServer endpoint to upload any YAML file?

As you probably know, the Kubernetes Dashboard lets the user upload any YAML file to the API Server.
I browsed the API via the Swagger UI, but I cannot find any API Server endpoint to PUT/POST/DELETE a generic (and potentially multi-resource) Kubernetes YAML file, the way the dashboard does.
In other words, I need the API Server endpoint used by kubectl when the command is
kubectl create -f myResources.yaml
No such endpoint exists. Adding --v=6 to the kubectl command reveals the API calls it is making: it iterates over the objects in the file and POSTs each one individually.
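For example, a file containing a Deployment and a Service turns into two separate calls along these lines; the group/version paths are the real resource endpoints, while the server URL, token, and payload files are placeholders:
# the Deployment is POSTed to the apps/v1 deployments collection
curl -X POST -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  "$APISERVER/apis/apps/v1/namespaces/default/deployments" -d @deployment.json
# the Service is POSTed to the core v1 services collection
curl -X POST -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  "$APISERVER/api/v1/namespaces/default/services" -d @service.json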