I'm wondering if there is an option to execute a task from a pod. Say I have a pod that is, for example, listening for requests and on each request delegates the task to another pod (a worker pod). The worker pod is not alive when there are no jobs to do, and if there is more than one job to do, more worker pods are created. After the job is done the worker pods are stopped. So a worker pod lives for the duration of one task, is then killed, and when a new task arrives a new worker pod is started. I hope I described it properly. Do you know if this is possible in Kubernetes? The worker pod could be started by, for example, a REST call from the main pod.
There are a few ways to achieve this behavior.
Pure Kubernetes Way
This solution requires ServiceAccount configuration. Please take a look at the Kubernetes documentation Configure Service Accounts for Pods.
Your application service/pod can handle different custom tasks. Applying a specific service account to your pod in order to perform a specific task is best practice. In Kubernetes, using a service account with predefined RBAC rules lets you handle this almost out of the box.
The main concept is to configure RBAC authorization rules for a specific service account by granting different permissions (get, list, watch, create) on different Kubernetes resources (pods, jobs).
In this scenario the main pod waits for incoming requests; after it receives a specific request, it can perform the corresponding task against the Kubernetes API.
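A minimal sketch of that flow using the official Kubernetes Python client (the job name, image and namespace below are placeholders, and it assumes the pod's service account is allowed to create Jobs in the batch API group):

```python
# Hypothetical example: the main pod receives a request and creates a one-off
# worker Job through the Kubernetes API. Image, namespace and naming scheme are
# placeholders; the pod's service account must be permitted to create Jobs.
from kubernetes import client, config

def launch_worker_job(task_id: str, namespace: str = "default") -> None:
    # Inside a pod this picks up the mounted service account token.
    config.load_incluster_config()

    job = client.V1Job(
        metadata=client.V1ObjectMeta(name=f"worker-{task_id}"),
        spec=client.V1JobSpec(
            backoff_limit=2,
            ttl_seconds_after_finished=300,  # clean up the Job after it finishes
            template=client.V1PodTemplateSpec(
                spec=client.V1PodSpec(
                    restart_policy="Never",
                    containers=[
                        client.V1Container(
                            name="worker",
                            image="registry.example.com/worker:latest",  # placeholder image
                            args=["--task-id", task_id],
                        )
                    ],
                )
            ),
        ),
    )
    client.BatchV1Api().create_namespaced_job(namespace=namespace, body=job)
```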
This can be extended, e.g. by using a sidecar container inside your working pod. More details about the sidecar concept can be found in this article.
Create a custom controller
Another way to achieve your goal is to use a custom controller.
The example presented in the Extending the Kubernetes Controller article shows how a custom controller watches the Kubernetes API in order to instrument the underlying worker pods (it watches the Kubernetes configuration for changes and then deletes the corresponding pods). In your setup, such a controller could watch your API for waiting/unprocessed requests and perform an additional task, like creating a Kubernetes Job inside the cluster.
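As a very rough illustration, the control loop could look something like this in Python (the endpoint URL is an assumption, and launch_worker_job refers to the helper from the sketch above; a real controller would use a watch/informer instead of polling):

```python
# Naive controller loop: poll a hypothetical application endpoint for
# unprocessed requests and create one worker Job per request.
import time

import requests

def controller_loop(api_url: str = "http://task-api.default.svc:8080/pending-tasks") -> None:
    while True:
        for task in requests.get(api_url, timeout=5).json():
            launch_worker_job(task["id"])  # helper from the previous sketch
        time.sleep(10)  # resync interval; a real controller would watch instead
```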
Use an existing solution, like Job Processing Using a Work Queue:
RabbitMQ on Kubernetes
Coarse Parallel Processing Using a Work Queue
Kubernetes Message Queue
KEDA
I'm running a web-server deployment in an EKS cluster. The deployment is exposed behind a NodePort service, ingress resource, and AWS Load Balancer controller.
This deployment is configured to run on "always-on" nodes, using a Node Selector.
The EKS cluster runs additional auto-scaled workloads which can also use spot instances if needed (in the same namespace).
Since the NodePort service exposes a static port across all nodes in the cluster, there are many targets in the corresponding ALB target group, and they are registered and deregistered whenever a node is added to or removed from the cluster.
What exactly happens if a request from the client is routed to the service on a node that is about to be scaled down?
I'm asking since I'm getting many 504 Gateway Timeouts from the ALB. Specifically, these requests do not reach our FE/BE pods and terminate at the ALB level.
Welcome to the community @gil-shelef!
Based on the AWS documentation, additional handlers should be used to add both resilience and cost savings.
Let's start with understanding how this works:
There is a specific node termination handler DaemonSet which runs a pod on each spot instance and listens for spot instance interruption notifications. This makes it possible to gracefully terminate any pods running on that node, drain the node from the load balancer, and let the Kubernetes scheduler reschedule the evicted pods on other instances.
The workflow looks like the following (taken from the AWS documentation on Spot Instance Interruption Handling; that link also has an example). It can be summarized as:
Identify that a Spot Instance is about to be interrupted in two minutes.
Use the two-minute notification window to gracefully prepare the node for termination.
Taint the node and cordon it off to prevent new pods from being placed on it.
Drain connections on the running pods.
Once the pods are removed from the endpoints, kube-proxy triggers an update of the iptables rules, which takes a little bit of time. To make this smoother for end users, you should consider adding a preStop pause of about 5-10 seconds, as sketched below. More information about how this happens and how you can mitigate it can be found in my answer here.
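If you want to add that pause, here is a hedged sketch using the Kubernetes Python client (the Deployment, namespace and container names are placeholders, and the image must actually contain a `sleep` binary):

```python
# Patch an existing Deployment so its container sleeps for a few seconds in a
# preStop hook, giving the load balancer / iptables time to drop the pod from
# rotation before the process receives SIGTERM.
from kubernetes import client, config

def add_prestop_pause(deployment: str = "web-server", namespace: str = "default",
                      container: str = "web", seconds: int = 10) -> None:
    config.load_kube_config()  # or load_incluster_config() when running in-cluster
    patch = {
        "spec": {
            "template": {
                "spec": {
                    "terminationGracePeriodSeconds": 30 + seconds,
                    "containers": [{
                        "name": container,  # strategic merge patch keys containers by name
                        "lifecycle": {
                            "preStop": {"exec": {"command": ["sleep", str(seconds)]}}
                        },
                    }],
                }
            }
        }
    }
    client.AppsV1Api().patch_namespaced_deployment(deployment, namespace, patch)
```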
Also here are links for these handlers:
Node termination handler
Cluster autoscaler on AWS
For your last question, please check this AWS KB article on how to troubleshoot 504 errors in EKS.
Here's my scenario,
I want to launch a Job in Kubernetes; the first container that runs will look through a list of custom resources and launch each of the containers defined in those resources to completion. I don't know what's in the list ahead of time, I only know it when the job is kicked off.
Is this possible? Can someone point me to something that shows how to do it?
You can use the Kubernetes Client Libraries to create any Kubernetes resource from inside your code (given that it has the correct service account of course if RBAC is configured in your cluster).
If you want to run a container to completion, a Kubernetes Job would be the best fit.
It is possible to manage jobs programmatically using the kubernetes client-go project.
Here are some examples.
To create a job that runs to completion, refer to:
Job APIs
JobInterface
Batch client APIs
Custom resource definitions can be managed using the kubernetes apiextensions-apiserver project.
To manage custom resource definitions, refer to:
CRD APIs
CRD API tests
To create custom resources, refer to:
This link has steps to access the Kubernetes API from inside a pod.
Example
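As a rough illustration of the overall flow (sketched with the official Python client rather than client-go; the CRD group/version/plural and the spec.image field are made-up placeholders for whatever your custom resource actually defines):

```python
# List instances of a hypothetical custom resource and start one Job per entry.
from kubernetes import client, config

def run_tasks_from_custom_resources(namespace: str = "default") -> None:
    config.load_incluster_config()  # the first container of the Job runs in-cluster

    crs = client.CustomObjectsApi().list_namespaced_custom_object(
        group="example.com", version="v1", namespace=namespace, plural="tasks"
    )

    batch = client.BatchV1Api()
    for item in crs["items"]:
        name = item["metadata"]["name"]
        image = item["spec"]["image"]  # assumed field on the custom resource
        job = client.V1Job(
            metadata=client.V1ObjectMeta(name=f"task-{name}"),
            spec=client.V1JobSpec(
                backoff_limit=0,
                template=client.V1PodTemplateSpec(
                    spec=client.V1PodSpec(
                        restart_policy="Never",
                        containers=[client.V1Container(name="task", image=image)],
                    )
                ),
            ),
        )
        batch.create_namespaced_job(namespace=namespace, body=job)
```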
I have a GKE cluster running (v1.12.8-gke.10). I am trying to set up a specific app that will work the way I want, but I can't seem to find any documentation to piece it together. What I am trying to accomplish may not even be possible.
I would like to set up a deployment (1 pod) using a Python Docker image that runs a looped Python script performing checks. If the checks all pass, I would like this deployment/pod to start/scale another deployment that will do a simple task and then kill the pod that was started.
I am not sure if I should be using a deployment or if I need an HPA mixed in somewhere in this process. I have also tried looking at KEDA, but it only supports specific triggers and doesn't fit what I am trying to do.
I am expecting two different deployments.
Deploy A = 1 pod constantly running a python script that is checking if it should be sending any commands to Deploy B.
Deploy B = listening for Deploy A to reach out to tell it to start a pod to run a task. After the task is completed, have the pod terminate.
The workflow you describe is possible. The controller would need access to the Kubernetes API, probably using the official Python client. When you received a request, you would create a Job, and probably pass information about what to run as command-line arguments. The process inside the Job's Pod would do the work and then exit normally. You'd then be responsible for monitoring the Job's status and noticing when it finished, but you wouldn't have to explicitly scale it down; deleting the completed Job would be polite.
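A minimal sketch of the monitoring and cleanup part with the official Python client (job name, namespace and polling interval are placeholders; a production version would inspect status.conditions rather than the bare counters):

```python
# Poll a Job's status until it succeeds or fails, then delete it and its pods.
import time

from kubernetes import client, config

def wait_and_clean_up(job_name: str, namespace: str = "default", poll: int = 5) -> bool:
    config.load_incluster_config()
    batch = client.BatchV1Api()

    while True:
        status = batch.read_namespaced_job(job_name, namespace).status
        if status.succeeded:   # at least one pod completed successfully
            ok = True
            break
        if status.failed:      # simplistic; check status.conditions in real code
            ok = False
            break
        time.sleep(poll)

    # propagation_policy="Background" also removes the Job's finished pods
    batch.delete_namespaced_job(
        job_name, namespace,
        body=client.V1DeleteOptions(propagation_policy="Background"),
    )
    return ok
```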
The architecture I'd generally recommend here would be to use a job queue like RabbitMQ. You'd have a Deployment for your controller, a separate Deployment for your worker, and a StatefulSet to run the job queue (or perhaps something like the stable/rabbitmq Helm chart). None of these would directly interact with the Kubernetes API. When a new request came in, the controller would post a message to RabbitMQ, and when the worker received a message off the queue, it would do a job.
This has the advantage of being easier to develop locally (you can just run RabbitMQ on your laptop or in a container, but getting access to the Kubernetes API is harder). If you suddenly get swamped with a huge number of job submissions, you won't try to overload the cluster with thousands of jobs; they'll back up in RabbitMQ and you can do them one at a time. If you want the cluster to do more, you can kubectl scale deployment to get more workers. If you run out of jobs the worker pod(s) will sit idle but that's not really a problem.
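For context, the worker side of that pattern might look roughly like this with the pika client (the queue name, host and job body are placeholders, not anything prescribed above; the controller would publish to the same queue with basic_publish):

```python
# Minimal RabbitMQ worker: take one message at a time, do the work, then ack.
import pika

def do_job(payload: bytes) -> None:
    print("processing", payload)  # real work goes here

def main() -> None:
    connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbitmq"))
    channel = connection.channel()
    channel.queue_declare(queue="jobs", durable=True)
    channel.basic_qos(prefetch_count=1)  # at most one unacknowledged job per worker

    def on_message(ch, method, properties, body):
        do_job(body)
        ch.basic_ack(delivery_tag=method.delivery_tag)  # ack only after success

    channel.basic_consume(queue="jobs", on_message_callback=on_message)
    channel.start_consuming()

if __name__ == "__main__":
    main()
```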
I would like to know in which parts of the Kubernetes code (https://github.com/kubernetes/kubernetes) the scheduler talks to the API server, and where the API server then sends the scheduling information out to the kubelet.
The scheduler registers an informer for specific resources (e.g. Pods, PVs, ...) and registers callback functions for events (e.g. add, delete, update, ...); this code is at https://github.com/kubernetes/kubernetes/blob/master/pkg/scheduler/eventhandlers.go#L319.
The event callbacks then put the pod spec into a queue; the scheduler checks the queue and uses a scheduling algorithm to assign the pod to a node. Finally, the scheduler updates the pod information in the API server.
The kubelet watches the API server to find which pods it needs to create or update, then creates the containers, mounts the volumes, and so on.
P.S. The whole lifecycle of how Kubernetes works is complex to understand; please specify exactly what you want to know.
I have Consul running in my cluster and each node runs a consul-agent as a DaemonSet. I also have other DaemonSets that interact with Consul and therefore require a consul-agent to be running in order to communicate with the Consul servers.
My problem is, if my DaemonSet is started before the consul-agent, the application will error as it cannot connect to Consul and subsequently get restarted.
I also notice the same problem with other DaemonSets, e.g. Weave, as it requires kube-proxy and kube-dns. If Weave is started first, it will constantly restart until the kube services are ready.
I know I could add retry logic to my application, but I was wondering if it was possible to specify the order in which DaemonSets are scheduled?
Kubernetes itself does not provide a way to specify dependencies between pods / deployments / services (e.g. "start pod A only if service B is available" or "start pod A after pod B").
The current approach (based on what I found while researching this) seems to be retry logic or an init container. To quote the docs:
They run to completion before any app Containers start, whereas app Containers run in parallel, so Init Containers provide an easy way to block or delay the startup of app Containers until some set of preconditions are met.
This means you can either add retry logic to your application (which I would recommend, as it might also help you in other situations such as a short service outage), or you can use an init container that polls a health endpoint via the Kubernetes service name until it gets a satisfying response.
Retry logic is preferred over startup dependency ordering, since it handles both the initial bring-up case and recovery from post-start outages.
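The same idea as a small Python sketch (the localhost address and Consul's default port 8500 are assumptions; in a DaemonSet you might point this at the node IP instead, and an init container could run essentially the same loop):

```python
# Block at startup until the local consul-agent answers, instead of crashing
# and relying on restarts.
import time

import requests

def wait_for_consul(url: str = "http://127.0.0.1:8500/v1/status/leader",
                    timeout_seconds: int = 300) -> None:
    deadline = time.time() + timeout_seconds
    while time.time() < deadline:
        try:
            if requests.get(url, timeout=2).ok:
                return  # agent is up, continue normal startup
        except requests.RequestException:
            pass  # agent not reachable yet
        time.sleep(2)
    raise RuntimeError("consul-agent did not become ready in time")
```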