How to measure the time between the creation of a job on Kubernetes by the user and the moment that job starts running on the node?

I am running a job as a Kubernetes pod and I need to measure the time between the creation of the job by the user and the moment the job starts running on the node.
I want to get it through some API.
Does anyone know how I can get it?

Monitoring Kubernetes (number of pending pods/jobs)
Use the kube-state-metrics package for monitoring, together with a small Go program called veneur-prometheus that scrapes the Prometheus metrics kube-state-metrics emits and publishes them as statsd metrics to your monitoring system.
For example, here’s a chart of the number of pending pods in the cluster over the last hour. Pending means that they’re waiting to be assigned a worker node to run on. You can see that the number spikes at 11am, because in this case a lot of cron jobs run at the 0th minute of the hour.
(Image: an example chart showing pending pods in a cluster over the last hour.)
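To get the raw numbers for a single job rather than an aggregate chart, the relevant timestamps are already exposed on the Job and Pod objects, so plain kubectl is enough. A minimal sketch, assuming the Job is called my-job (a placeholder name):

# When the user created the Job vs. when the job controller started it
kubectl get job my-job -o jsonpath='{.metadata.creationTimestamp} {.status.startTime}{"\n"}'

# Find the pod the Job created, then print pod creation, scheduling, and container start times
POD=$(kubectl get pods -l job-name=my-job -o jsonpath='{.items[0].metadata.name}')
kubectl get pod "$POD" -o jsonpath='{.metadata.creationTimestamp} {.status.conditions[?(@.type=="PodScheduled")].lastTransitionTime} {.status.containerStatuses[0].state.running.startedAt}{"\n"}'

The difference between the Job's creationTimestamp and the container's running.startedAt is the creation-to-running delay asked about above.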

Related

How to prevent or fix a Kubernetes pod occasionally getting stuck in ContainerCreating

I'm running AWS EKS on Fargate and using Kubernetes to orchestrate multiple cron jobs. I spin roughly 1000 pods up and down over the course of a day.
Very seldom (once every 3 weeks) one of the pods gets stuck in ContainerCreating and just hangs there, and because I have concurrency disabled that particular job will never run. The fix is simply terminating the job or the pod and having it restart, but this is a manual intervention.
Is there a way to get a pod to terminate or restart, if it takes too long to create?
The reason for the pod getting stuck varies quite a bit, so a solution would need to be general. It can be a time-based solution, as all the pods run the same code with different configurations, so the startup time is relatively consistent.
Sadly there is no mechanism to stop a job if it fails at image pulling or container creation. I also tried to do what you are trying to achieve.
You can set a backoffLimit in your Job spec, but it only covers retries of pods that fail while running; it won't help with a pod stuck in ContainerCreating.
What you can do is write a script that describes each pod in the namespace, parses the output, and restarts any pod that is stuck in ContainerCreating.
Or try to debug/trace what is causing this: run kubectl describe pod on the pod while it is in ContainerCreating to get more information.
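A rough sketch of that script, assuming jq is available and treating the namespace (default) and the 600-second "stuck" threshold as placeholders to adjust:

#!/usr/bin/env bash
# Delete any pod that has been stuck in ContainerCreating for longer than THRESHOLD seconds.
NAMESPACE=default
THRESHOLD=600

now=$(date +%s)
kubectl get pods -n "$NAMESPACE" -o json \
  | jq -r --argjson now "$now" --argjson threshold "$THRESHOLD" '
      .items[]
      | select(.status.phase == "Pending")
      | select([.status.containerStatuses[]?.state.waiting.reason] | index("ContainerCreating"))
      | select(($now - (.metadata.creationTimestamp | fromdateiso8601)) > $threshold)
      | .metadata.name' \
  | xargs -r -n1 kubectl delete pod -n "$NAMESPACE"

Running something like this on a schedule (for example as its own CronJob) turns the manual intervention into an automatic one.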

Cron job timing with new nodes

Say I have a Kubernetes CronJob that runs every day at 10am, but it needs to provision new nodes in order to run. Will k8s wait until 10am to start provisioning those resources (therefore actually running some time after 10am)? Or will it have everything ready to go at 10am? Are there settings to control this?
Will k8s wait until 10am to start provisioning those resources...
No. Kubernetes itself does not provision a compute node and make it join the cluster.
...will it have everything ready to go at 10am?
The Job will be created at the scheduled time and a pod will be spawned. If at that time your cluster has no node available to run the pod, the pod will enter the Pending state. At this point, if you have configured an autoscaler in your cluster, the autoscaler will request a new node to join the cluster so that the pending pod can run.
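To see this sequence in practice, you can watch the pod sit in Pending at 10am and then check the events; a small sketch (the exact event reasons depend on which autoscaler you run, so treat the grep filter as an assumption):

# Watch the CronJob's pod stay Pending until a node is available
kubectl get pods -l job-name -w

# Look for scheduling failures and autoscaler reactions;
# the cluster-autoscaler typically records a TriggeredScaleUp event
kubectl get events --sort-by=.lastTimestamp | grep -Ei 'FailedScheduling|TriggeredScaleUp'

So in practice the job starts some time after 10am, once the new node has joined and the pod has been scheduled onto it.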

Cron service in multiple k8s pods

I have an application running on multiple k8s pods. Forgive me for my lack of knowledge about k8s pods; from my understanding, k8s will route incoming traffic to a different pod just like a proxy.
What happens if my application is running a cron job that fetches data? Will the cron job be called multiple times based on how many pods are running? My concern is that there will be data duplication, because these pods will fetch the same data.
My question is how to avoid data duplication when a cron job fetches data. Can these pods be configured to act like workers? Let's say the cron job fetches 500 records; given 5 pods, each pod would fetch 100 records.
Ideally, it should not work that way; pods like these are meant to be running the workload itself, like API or socket servers.
For cron jobs there are other options in Kubernetes made especially for this purpose.
There are two relevant resources in Kubernetes: CronJobs and Jobs.
https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/
CronJobs get executed according to their cron schedule, while Jobs get executed as soon as you create them from YAML.
A CronJob indirectly creates Jobs at run time; you can check this with kubectl get jobs and kubectl get cronjobs.
Can these pods be configured to act like workers? Let's say the cron job fetches 500 records; given 5 pods, each pod would fetch 100 records.
Now this depends on your scenario: you can create a single CronJob that fetches all those 500 records, so only a single pod runs at each cron execution.
My question is how to avoid data duplication when a cron job fetches data?
Run that type of workload as a CronJob pod instead.
https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/
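For example, a minimal sketch of moving the fetch out of the application pods and into its own CronJob, so that only one pod performs the fetch per schedule (the image, schedule, and command are placeholders):

kubectl apply -f - <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: data-fetcher
spec:
  schedule: "0 * * * *"        # every hour, on the hour
  concurrencyPolicy: Forbid    # never run two fetches at the same time
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: fetcher
            image: myorg/fetcher:latest
            command: ["python", "fetch.py", "--limit", "500"]
EOF

With concurrencyPolicy: Forbid a new run is skipped while the previous one is still going, so the 500 records are only ever fetched by a single pod per schedule, while your API/socket pods keep serving traffic.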

Can I reduce the ContainerCreating period in Kubernetes?

When we run kubectl apply -f, we create a new pod in Kubernetes, but it takes about 5 seconds to reach Running status even though the image has already been pulled on the node. Before that, the pod is in ContainerCreating status. I ran kubectl describe to see the events and found that scheduling is very fast, but the gap between Scheduled and image pulling is about 3 seconds, and container starting takes about 2 seconds. I wonder if I can reduce the ContainerCreating time. Thank you!
The target latency from creation to Running is ~5 seconds (if the image is pre-pulled), so your pods' creation times are meeting both the scheduling-time goal and the API-latency goal. There was a discussion regarding that topic which resulted in the current SLA, and further Enhancement Proposals (example) have been rejected.
However, you may want to review Scheduler Performance Tuning, but bear in mind that it is mainly relevant for large Kubernetes clusters.
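If you want to see exactly where those ~5 seconds go for a specific pod, the pod's events carry a timestamp for each phase; a quick sketch (my-pod is a placeholder name):

# Shows the Scheduled, Pulling/Pulled, Created and Started events with their timestamps
kubectl get events --field-selector involvedObject.name=my-pod \
  --sort-by=.lastTimestamp \
  -o custom-columns=TIME:.lastTimestamp,REASON:.reason,MESSAGE:.message

That makes it easy to confirm which part of the ContainerCreating window (image pull, volume setup, container start) is taking the time.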

Set/configure the Kubernetes scheduler interval?

I use k8s to run large compute Jobs (to completion). There are not many jobs or nodes in my setup, so the scale is small.
I noticed that the scheduler does not start a pod immediately when a node is available. It takes between 5 and 40 seconds for a pod to be scheduled once a node is ready.
Is there a way to make the scheduler more "aggressive"? I can't find a setting for the interval in the default scheduler's custom policy. Is implementing my own scheduler the only way forward? Any tips appreciated!
There is a difference between pod scheduling and pod creation. The scheduler only finds a suitable node and schedules the pod onto it; the actual pod creation is done by the kubelet.
The kubelet watches the API server for the desired state, picks up the newly scheduled pod spec, and then creates the pod.
So this whole process can take the amount of time you describe in the question.
So I don't think writing a custom scheduler will help here.
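One way to confirm where the 5–40 seconds go is to compare the pod's own timestamps: creation to PodScheduled is the scheduler's share, PodScheduled to the container start is the kubelet's share. A small sketch assuming jq is installed, the pod is already Running, and my-pod is a placeholder name:

# Prints how long scheduling took vs. how long the kubelet took to start the container
kubectl get pod my-pod -o json | jq -r '
  (.metadata.creationTimestamp | fromdateiso8601) as $created
  | (.status.conditions[] | select(.type == "PodScheduled") | .lastTransitionTime | fromdateiso8601) as $scheduled
  | (.status.containerStatuses[0].state.running.startedAt | fromdateiso8601) as $started
  | "scheduling latency: \($scheduled - $created)s, startup latency: \($started - $scheduled)s"'

If most of the delay shows up as startup latency rather than scheduling latency, tweaking or replacing the scheduler will not make the pods start any faster.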