Kubernetes - waiting on Oracle DB container

Apologies for a basic question. I have a simple Kubernetes deployment with 3 containers (each in its own pod) deployed to a cluster.
The RESTapi container depends on the OracleDB container starting first. However, the OracleDB container takes a while to start up, and by then the RESTapi container has already restarted several times because it cannot connect, and it ends up in a CrashLoopBackOff state.
Is there a more elegant solution for this?
I’ve also noticed that when the RESTapi container goes into the back-off state it appears to stop retrying. Is that expected?

This is a community wiki answer posted for better visibility. Feel free to expand it.
The best approach in this case is to improve your “RESTapi” application so that it provides a more reliable, fault-tolerant service that can reconnect to the database whenever it becomes available.
From the Kubernetes production best practices:
When the app starts, it shouldn't crash because a dependency such as a database isn't ready. Instead, the app should keep retrying to connect to the database until it succeeds.
Kubernetes expects that application components can be started in any order.
Alternatively, you can use a solution based on Init Containers.
You can look at this question on Stack Overflow, which is just one of many about the practical use of Init Containers for the case described.
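For illustration, here is a minimal sketch of that approach, assuming the database is reachable through a Service named oracledb on Oracle's default listener port 1521 (the Service name and images are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: restapi
spec:
  replicas: 1
  selector:
    matchLabels:
      app: restapi
  template:
    metadata:
      labels:
        app: restapi
    spec:
      initContainers:
        # Block the RESTapi container until the database accepts TCP connections.
        - name: wait-for-oracledb
          image: busybox:1.36
          command: ['sh', '-c', 'until nc -z oracledb 1521; do echo waiting for oracledb; sleep 5; done']
      containers:
        - name: restapi
          image: my-restapi:latest   # placeholder image
          ports:
            - containerPort: 8080
```

The Init Container simply blocks until a TCP connection succeeds, so the RESTapi container only starts once the database is accepting connections.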

An elegant way to achieve this is by using a combination of Kubernetes Init Containers paired with k8s-wait-for scripts.
Essentially, what you would do is configure an Init Container for your RESTapi which uses k8s-wait-for. You configure k8s-wait-for to wait for a specific pod to be in a given state, in this case, you can provide the OracleDB pod and wait for it to be in a Ready state.
The resulting effect will be that the deployment of the RESTapi will be paused until the OracleDB is ready to go. That should alleviate the constant restarts.
https://github.com/groundnuty/k8s-wait-for
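As a rough sketch, the relevant part of the RESTapi pod template could look like the following. The app=oracledb label, the service account name and the image tag are assumptions; check the k8s-wait-for README for the current tag and for the RBAC rules the tool needs in order to list pods:

```yaml
spec:
  serviceAccountName: wait-for-sa          # needs RBAC permission to get/list pods
  initContainers:
    - name: wait-for-oracledb
      image: groundnuty/k8s-wait-for:v2.0  # check the repository for the current tag
      args: ["pod", "-lapp=oracledb"]      # wait until a pod matching this label is Ready
  containers:
    - name: restapi
      image: my-restapi:latest             # placeholder image
```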

Related

How to tell Kubernetes to not reschedule a pod unless it dies?

Kubernetes tends to assume apps are small/lightweight/stateless microservices which can be stopped on one node and restarted on another node with no downtime.
We have a slow-starting (20 min) legacy (stateful) application which, once running as a set of pods, should not be rescheduled without due cause. The reason is that all user sessions would be killed and the users would have to log in again. There is NO way to serialize the sessions and externalize them. We want 3 instances of the pod.
Can we tell k8s not to move a pod unless absolutely necessary (i.e. it dies)?
Additional information:
The app is a tomcat/java monolith
Assume for the sake of argument we would like to run it in Kubernetes
We do have a liveness test endpoint available
There is no benefit in telling k8s to use only one pod; that is not the "spirit" of k8s. In that case, it might be better to use a dedicated machine for your app.
But you can assign a pod to a specific node - Assigning Pods to Nodes. This should only be necessary when there are special hardware requirements (e.g. an AI microservice needs a GPU, which is only present on node xy).
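If you really do have such a hardware requirement, a nodeSelector is the simplest way to express it. A sketch, where the node label and images are only examples:

```yaml
# label the node once:  kubectl label nodes node-xy hardware=gpu
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-microservice
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ai-microservice
  template:
    metadata:
      labels:
        app: ai-microservice
    spec:
      nodeSelector:
        hardware: gpu                  # only schedule onto nodes carrying this label
      containers:
        - name: ai-microservice
          image: my-ai-service:latest  # placeholder image
```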
k8s doesn't restart your pods for fun. It restarts them when there is a reason (node died, app died, ...), and I have never noticed a "random reschedule" in a cluster. Without any further information (deployment, logs, cluster details) it is hard to say what exactly happened to you.
And for your comment: there are different types of recreation; one of them starts a fresh instance and only kills the old one once the startup was successful. Look here: Kubernetes deployment strategies
All points together:
Don't force a specific node onto your app - k8s will select a suitable node itself.
There are normally no planned reschedules in k8s.
k8s will recreate pods only if there is a reason. Maybe your app didn't answer on the liveness endpoint? Or did someone/something delete your pod?
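Since the app already exposes a liveness endpoint, make sure the probe timings leave room for the 20-minute startup, otherwise the kubelet itself will keep killing the pod. A hedged sketch of the container part of the pod spec, with an illustrative path, port and timings:

```yaml
containers:
  - name: legacy-app
    image: legacy-monolith:latest   # placeholder image
    livenessProbe:
      httpGet:
        path: /health               # the existing liveness test endpoint
        port: 8080
      initialDelaySeconds: 1500     # longer than the ~20 min startup
      periodSeconds: 30
      failureThreshold: 3           # roughly 90 s of consecutive failures before a restart
```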

How to create a dependency between Kubernetes Deployments or StatefulSets

I have a couple of Stateful Sets, one is dependent on another one, I need the pods in the first Stateful Set to be in Ready state before the 2nd Stateful Set is started to be scaled.
I'm looking to see if there is a way to link the two Stateful Sets to achieve this in an automatic way, instead of me manually doing it. I have the same concern about Deployments as well.
Currently I don't use Helm or any other tools, just kubectl.
I would advise against doing this as it would increase coupling between different services.
A better way of handling your problem is for containers in the dependent service to check whether the other service is available. If it isn't, it is fine to let the container crash, since Kubernetes provides self-healing by automatically restarting containers for you. This way, the dependent service keeps trying to connect; whenever the other service is unavailable, the dependent service crashes and is retried later with exponential back-off.
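To make that concrete, the dependent container can check its upstream once at startup and exit non-zero if it is unreachable; the kubelet then restarts it with an increasing delay (CrashLoopBackOff caps the delay at five minutes). A sketch with placeholder names, assuming the image ships a shell and nc:

```yaml
spec:
  containers:
    - name: dependent-app
      image: my-dependent-app:latest   # placeholder image
      command: ["sh", "-c"]
      args:
        - |
          # Exit non-zero if the upstream Service is not reachable yet;
          # Kubernetes restarts this container with exponential back-off.
          if ! nc -z upstream-svc 5432; then
            echo "upstream-svc not reachable yet, exiting"
            exit 1
          fi
          exec /app/start              # placeholder start command
```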

Specify scheduling order of a Kubernetes DaemonSet

I have Consul running in my cluster and each node runs a consul-agent as a DaemonSet. I also have other DaemonSets that interact with Consul and therefore require a consul-agent to be running in order to communicate with the Consul servers.
My problem is, if my DaemonSet is started before the consul-agent, the application will error as it cannot connect to Consul and subsequently get restarted.
I also notice the same problem with other DaemonSets, e.g. Weave, as it requires kube-proxy and kube-dns. If Weave is started first, it will constantly restart until the kube services are ready.
I know I could add retry logic to my application, but I was wondering if it was possible to specify the order in which DaemonSets are scheduled?
Kubernetes itself does not provide a way to specify dependencies between pods / deployments / services (e.g. "start pod A only if service B is available" or "start pod A after pod B").
The current approach (based on what I found while researching this) seems to be retry logic or an init container. To quote the docs:
They run to completion before any app Containers start, whereas app Containers run in parallel, so Init Containers provide an easy way to block or delay the startup of app Containers until some set of preconditions are met.
This means you can either add retry logic to your application (which I would recommend, as it might also help you in other situations such as a short service outage) or you can use an init container that polls a health endpoint via the Kubernetes service name until it gets a satisfying response.
Retry logic is preferred over startup dependency ordering, since it handles both the initial bring-up case and recovery from outages after startup.
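If you do go the init-container route for the Consul case, it could look roughly like the following; the consul-agent Service name and image are placeholders, and /v1/status/leader is Consul's standard status endpoint:

```yaml
spec:
  initContainers:
    - name: wait-for-consul
      image: busybox:1.36
      command:
        - sh
        - -c
        - "until wget -qO- http://consul-agent:8500/v1/status/leader; do echo waiting for consul; sleep 5; done"
  containers:
    - name: my-daemonset-app
      image: my-app:latest   # placeholder image
```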

Kubernetes job that consists of two pods (that must run on different nodes and communicate with each other)

I am trying to create a Kubernetes job that consists of two pods that have to be scheduled on separate nodes in our hybrid cluster. Our requirement is that one of the pods runs on a Windows Server node and the other pod runs on a Linux node (thus we cannot just run two Docker containers in the same pod, which I know is possible, but would not work in our scenario). The Linux pod (which you can imagine as a client) will communicate over the network with the Windows pod (which you can imagine as a stateful server), exchanging data while the job runs. When the Linux pod terminates, we want to also terminate the Windows pod. However, if one of the pods fails, then we want to fail both pods (as they are designed to be a single job).
Our current design is to write a K8S service that handles the communication between the pods, and then apply the service and the two pods to the cluster to "emulate" a job. However, this is not ideal since the two pods are not tightly coupled as a single job, and it adds quite a bit of overhead to manually manage this setup (e.g. when the job fails or finishes, we probably need to manually kill the Service and the Deployment of the Windows pod). Plus we would need to deploy a new service for each "job", as we require the Linux pod to always communicate with the same Windows pod for the duration of the job due to underlying state (thus we cannot use a single service for all Windows pods).
Any thoughts on how this could be best achieved on Kubernetes would be much appreciated! Hopefully this scenario is supported natively, and I would not need to resort in this kind of pod-service-pod setup that I described above.
Many thanks
I am trying to distinguish your distaste for creating and wiring the Pods from your distaste at having to do so manually. Because, in theory, a Job that creates Pods is very similar to what you are describing, and would be able to have almost infinite customization for those kinds of rules. With a custom controller like that, one need not create a Service for the client(s) to speak to their server, as the Job could create the server Pod first, obtain its Pod-specific-IP, and feed that to the subsequently created client Pods.
I would expect one could create such a Job controller using only bash and either curl or kubectl: generate the JSON or YAML that describes the situation you wish to have, feed it to the Kubernetes API (the Job would have a service account, just like any other in-cluster container), and use normal traps to clean up after itself. Without more of the specific edge cases loaded in my head it's hard to say whether that's a good idea or not, but I believe it's possible.
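Whichever way the two Pods end up being created (by hand, by a Job, or by a small bash controller as described above), the Windows/Linux placement itself can be expressed with the well-known kubernetes.io/os node label. A sketch with placeholder names, where the server's IP would be injected by whatever creates the pair:

```yaml
# Server half of the job, pinned to a Windows node.
apiVersion: v1
kind: Pod
metadata:
  name: job-server
  labels:
    app: job-server
spec:
  nodeSelector:
    kubernetes.io/os: windows
  restartPolicy: Never
  containers:
    - name: server
      image: my-windows-server:latest   # placeholder image
---
# Client half, pinned to a Linux node.
apiVersion: v1
kind: Pod
metadata:
  name: job-client
spec:
  nodeSelector:
    kubernetes.io/os: linux
  restartPolicy: Never
  containers:
    - name: client
      image: my-linux-client:latest     # placeholder image
      env:
        - name: SERVER_IP
          value: "<server pod IP, filled in by whatever creates the pair>"
```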

Ignite ReadinessProbe

Deploying an Ignite cluster within Kubernetes, I came across an issue that prevents cluster members from joining the group. If I use a readinessProbe and a livenessProbe, even with a delay as low as 10 seconds, the nodes never join each other. If I remove those probes, they find each other just fine.
So, my question is: can you use these probes to monitor node health, and if so, what are appropriate settings? On top of that, what would be good, fast health checks for Ignite anyway?
Update:
After posting on the ignite mailing list, it looks like StatefulSets are the way to go. (Thanks Dmitry!)
I think I'm going to leave in the below logic to self-heal any segmentation issues although hopefully it won't be triggered often.
Original answer:
We are having the same issue and I think we have a workable solution. The Kubernetes discovery SPI only lists pod addresses as they become ready (i.e. as they appear in the Service's endpoints).
This means that if there are no ready pods at startup time, each Ignite instance thinks that it is the first one and creates its own grid.
The cluster should be able to self-heal if we have a deterministic way to fail pods that aren't part of an 'authoritative' grid.
In order to do this, we keep a reference to the TcpDiscoveryKubernetesIpFinder and use it to periodically check the list of Ignite pods.
If the instance is part of a cluster that doesn't contain the alphabetically first IP in the list, we know we have a segmented topology. Killing the pods that get into that state should cause them to come up again, look at the service list and join the correct topology.
I am facing the same issue, using Ignite embedded in a Java Spring application.
As you said, the readinessProbe on the Kubernetes Deployment's spec.template.spec.containers has the side effect of preventing the Kubernetes Pods from being listed as Endpoints of the related Kubernetes Service.
Trying without any readinessProbe, it does indeed seem to work better (the Ignite nodes all join the same Ignite cluster).
Yet this has the undesired side effect of exposing the Kubernetes Pods before they are ready, as Spring has not yet fully started.
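For reference, this is the kind of readinessProbe configuration being discussed. Until the probe succeeds, the Pod is kept out of the Service's Endpoints, which is exactly the list TcpDiscoveryKubernetesIpFinder reads, so nodes that start alone form their own grid. The path, port and timings below are illustrative:

```yaml
spec:
  containers:
    - name: ignite-app
      image: my-ignite-spring-app:latest   # placeholder image
      ports:
        - containerPort: 8080
      readinessProbe:
        httpGet:
          path: /actuator/health           # illustrative; any HTTP health endpoint
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 5
      # Until this probe passes, the Pod is not added to the Service's
      # Endpoints, so the Kubernetes IP finder cannot see it.
```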