Kubernetes job that consists of two pods (that must run on different nodes and communicate with each other) - kubernetes

I am trying to create a Kubernetes job that consists of two pods that have to be scheduled on separate nodes in our Hybrid cluster. Our requirement is that one of the pods runs on a Windows Server node and the other pod is running on a Linux node (thus we cannot just run two Docker containers from the same pod, which I know is possible, but would not work in our scenario). The Linux pod (which you can imagine as a client) will communicate over the network with the Windows pod (which you can imagine as a stateful server) exchanging data while the job runs. When the Linux pod terminates, we want to also terminate the Windows pod. However, if one of the pods fail, then we want to fail both pods (as they are designed to be a single job)
Our current design is to write a K8S service that handles the communication between the pods, and then apply the service and the two pods to the cluster to "emulate" a job. However, this is not ideal since the two pods are not tightly coupled as a single job and adds quite a bit of overhead to manually manage this setup (e.g. when failures or the job, we probably need to manually kill the service and deployment of the Windows pod). Plus we would need to deploy a new service for each "job", as we require the Linux pod to always communicate with the same Windows pod for the duration of the job due to underlying state (thus cannot use a single service for all Windows pods).
Any thoughts on how this could be best achieved on Kubernetes would be much appreciated! Hopefully this scenario is supported natively, and I would not need to resort in this kind of pod-service-pod setup that I described above.
Many thanks

I am trying to distinguish your distaste for creating and wiring the Pods from your distaste at having to do so manually. Because, in theory, a Job that creates Pods is very similar to what you are describing, and would be able to have almost infinite customization for those kinds of rules. With a custom controller like that, one need not create a Service for the client(s) to speak to their server, as the Job could create the server Pod first, obtain its Pod-specific-IP, and feed that to the subsequently created client Pods.
I would expect one could create a Job controller using only bash and either curl or kubectl: generate the json or yaml that describes the situation you wish to have, feed it to the kubernetes API (since the Job would have a service account - just like any other in-cluster container), and use normal traps to cleanup after itself. Without more of the specific edge cases loaded in my head it's hard to say if that's a good idea or not, but I believe it's possible.

Related

Running other non-cluster containers on k8s node

I have a k8s cluster that runs the main workload and has a lot of nodes.
I also have a node (I call it the special node) that some of special container are running on that that is NOT part of the cluster. The node has access to some resources that are required for those special containers.
I want to be able to manage containers on the special node along with the cluster, and make it possible to access them inside the cluster, so the idea is to add the node to the cluster as a worker node and taint it to prevent normal workloads to be scheduled on it, and add tolerations on the pods running special containers.
The idea looks fine, but there may be a problem. There will be some other containers and non-container daemons and services running on the special node that are not managed by the cluster (they belong to other activities that have to be separated from the cluster). I'm not sure that will be a problem, but I have not seen running non-cluster containers along with pod containers on a worker node before, and I could not find a similar question on the web about that.
So please enlighten me, is it ok to have non-cluster containers and other daemon services on a worker node? Does is require some cautions, or I'm just worrying too much?
Ahmad from the above description, I could understand that you are trying to deploy a kubernetes cluster using kudeadm or minikube or any other similar kind of solution. In this you have some servers and in those servers one is having some special functionality like GPU etc., for deploying your special pods you can use node selector and I hope you are already doing this.
Coming to running separate container runtime on one of these nodes you need to consider two points mainly
This can be done and if you didn’t integrated the container runtime with
kubernetes it will be one more software that is running on your server
let’s say you used kubeadm on all the nodes and you want to run docker
containers this will be separate provided you have drafted a proper
architecture and configured separate isolated virtual network
accordingly.
Now comes the storage part, you need to create separate storage volumes
for kubernetes and container runtime separately because if any one
software gets failed or corrupted it should not affect the second one and
also for providing the isolation.
If you maintain proper isolation starting from storage to network then you can run both kubernetes and container runtime separately however it is not a suggested way of implementation for production environments.

How to tell Kubernetes to not reschedule a pod unless it dies?

Kubernetes tends to assume apps are small/lightweight/stateless microservices which can be stopped on one node and restarted on another node with no downtime.
We have a slow starting (20min) legacy (stateful) application which, once run as a set of pod should not be rescheduled without due cause. The reason being all user sessions will be killed and the users will have to login again. There is NO way to serialize the sessions and externalize them. We want 3 instances of the pod.
Can we tell k8s not to move a pod unless absolutely necessary (i.e. it dies)?
Additional information:
The app is a tomcat/java monolith
Assume for the sake of argument we would like to run it in Kubernetes
We do have a liveness test endpoint available
There is no benefit, if you tell k8s to use only one pod. That is not the "spirit" of k8s. In this case, it might be better to use a dedicated machine for your app.
But you can assign a pod to a special node - Assigning Pods to Nodes. The should be necessary only, when special hardware requirements are needed (e.g. the AI-microservice needs a GPU, which is only on node xy).
k8s don't restart your pod for fun. It will restart it, when there is a reason (node died, app died, ...) and I never noticed a "random reschedule" in a cluster. It is hard to say, without any further information (like deployment, logs, cluster) what exactly happened to you.
And for your comment: There are different types of recreation, one of them starts a fresh instance and will kill the old one, when the startup was successfully. Look here: Kubernetes deployment strategies
All points together:
Don't enforce a node to your app - k8s will "smart" select the node.
There are normally no planned reschedules in k8s.
k8s will recreate pods only, if there is a reason. Maybe your app didn't answer on the liveness-endpoint? Or someone/something deleting your pod?

Deployment "A" checks a set of checks and scales deployment "B" to run tasks

I have a GKE cluster running (v1.12.8-gke.10). I am trying to set up a specific app that will work the way I want but I can't seem to find and documentation to piece it together. What I am trying to accomplish may not even be possible.
I would like to set up a deployment(1 pod) using a python docker image where it is running a looped pythons script performing checks. If the checks all pass, I would like this deployment/pod to start/scale another deployment that will do a simple task and then kill the pod that was started.
I am not sure if I should be using a deployment or if I need a HPA mixed somewhere in this process. I have also tried looking at KEDA but it only has specified triggers and doesn't fit what I am trying to do.
I am expecting two different deployments.
Deploy A = 1 pod constantly running a python script that is checking if it should be sending any commands to Deploy B.
Deploy B = listening for Deploy A to reach out to tell it to start a pod to run a task. After the task is completed, have the pod terminate.
The workflow you describe is possible. The controller would need access to the Kubernetes API, probably using the official Python client. When you received a request, you would create a Job, and probably pass information about what to run as command-line arguments. The process inside the Job's Pod would do the work and then exit normally. You'd then be responsible for monitoring the Job's status and noticing when it finished, but you wouldn't have to explicitly scale it down; deleting the completed Job would be polite.
The architecture I'd generally recommend here would be to use a job queue like RabbitMQ. You'd have a Deployment for your controller, and a separate Deployment for your worker, and a StatefulSet to run the job queue (or perhaps something like the stable/rabbitmq Helm chart. None of these would directly interact with the Kubernetes API. When a new request came in, the controller would post a message to RabbitMQ, and when the worker received a message off the queue, it would do a job.
This has the advantage of being easier to develop locally (you can just run RabbitMQ on your laptop or in a container, but getting access to the Kubernetes API is harder). If you suddenly get swamped with a huge number of job submissions, you won't try to overload the cluster with thousands of jobs; they'll back up in RabbitMQ and you can do them one at a time. If you want the cluster to do more, you can kubectl scale deployment to get more workers. If you run out of jobs the worker pod(s) will sit idle but that's not really a problem.

Kubernetes with hybrid containers on one VM?

I have played around a little bit with docker and kubernetes. Need some advice here on - Is it a good idea to have one POD on a VM with all these deployed in multiple (hybrid) containers?
This is our POC plan:
Customers to access (nginx reverse proxy) with a public API endpoint. eg., abc.xyz.com or def.xyz.com
List of containers that we need
Identity server Connected to SQL server
Our API server with Hangfire. Connected to SQL server
The API server that connects to Redis Server
The Redis in turn has 3 agents with Hangfire load-balanced (future scalable)
Setup 1 or 2 VMs?
Combination of Windows and Linux Containers, is that advisable?
How many Pods per VM? How many containers per Pod?
Should we attach volumes for DB?
Thank you for your help
Cluster size can be different depending on the Kubernetes platform you want to use. For managed solutions like GKE/EKS/AKS you don't need to create a master node but you have less control over our cluster and you can't use latest Kubernetes version.
It is safer to have at least 2 worker nodes. (More is better). In case of node failure, pods will be rescheduled on another healthy node.
I'd say linux containers are more lightweight and have less overhead, but it's up to you to decide what to use.
Number of pods per VM is defined during scheduling process by the kube-scheduler and depends on the pods' requested resources and amount of resources available on cluster nodes.
All data inside running containers in a Pod are lost after pod restart/deletion. You can import/restore DB content during pod startup using Init Containers(or DB replication) or configure volumes to save data between pod restarts.
You can easily decide which container you need to put in the same Pod if you look at your application set from the perspective of scaling, updating and availability.
If you can benefit from scaling, updating application parts independently and having several replicas of some crucial parts of your application, it's better to put them in the separate Deployments. If it's required for the application parts to run always on the same node and if it's fine to restart them all at once, you can put them in one Pod.

Can kubernetes schedule multiple unrelated pods on one host?

If I have 10 different services, each of which are independent from each other and run from their own container, can I get kubernetes to run all of those services on, say, 1 host?
This is unclear in the kubernetes documentation. It states that you can force it to schedule containers from the same pod onto one host, using a "multi-container pod", but it doesn't seem to approach the subject of whether you can have multiple pods running on one host.
In fact kubernetes will do exactly what you want by default. It is capable of running dozens if not hundreds of containers on a single host (depending on its specs).
If you want very advanced control over scheduling pods, there is an alpha feature for that, which introduces concept of node/pod (anti)affinities. But I would say it is a rather advanced k8s topic at the moment, so you are probably good with what is in stable/beta for most use cases.
Honorable mention: there is a nasty trick that allows you to control when pods can not be collocated on the same node. An that is when they both declare same hostPort in their ports section. It can be usefull for some cases, but be aware it affects ie. how rolling deployments happen in some situations.
You can use node selectors and assign the same node for each of the pod to the same node / host
http://kubernetes.io/docs/user-guide/node-selection/
Having said that, the whole point to Kubernetes is to manage a cluster where you can deploy apps / pods across them.