Kubernetes Lifecycle Hooks

I would like to take particular actions when a K8s Pod, or the node it's running on, crashes/restarts/etc. -- basically, notify another part of the application that this has happened. I also need this to be guaranteed to execute. Can a Kubernetes PreStop hook accomplish this? From my understanding, these are generally used to gracefully shut down containers when a pod is deleted, and the hook handler is guaranteed to run. It seems like most people use them in scenarios where they are shutting things down themselves.
Will the hooks also run when a node unexpectedly crashes? If not, is there a Kubernetes solution for what I'm trying to accomplish?

A PreStop hook doesn't work for nodes. A PreStop hook is a task that runs during termination of a container, and it executes a specific command or makes an HTTP request against a specific endpoint on the Container.
If you are interested in health monitoring of nodes, you may read about
node-problem-detector, which is installed by default on Kubernetes in GCE.
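For context, a minimal sketch of what a preStop handler looks like (the names, image, and commands here are only illustrative); a handler is either an exec command or an httpGet request, one per event:

```yaml
# Illustrative preStop handler: runs a command in the container right before
# Kubernetes terminates it (pod deletion, eviction, etc.).
apiVersion: v1
kind: Pod
metadata:
  name: prestop-demo
spec:
  containers:
  - name: app
    image: nginx:1.25
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh", "-c", "nginx -s quit; sleep 5"]
        # Alternatively, an HTTP handler instead of exec:
        # httpGet:
        #   path: /shutdown
        #   port: 8080
```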

Related

Kubernetes preStop hook

I think Kubernetes preStop hook capabilities are poorly defined.
My requirement is straightforward: call an API to identify when a pod has crashed. But I am not getting any logs, even though I have added an echo statement in the command section.
Can someone point me to an example that achieves similar requirements?
The preStop hook is designed for orderly shutdown, not for crash notifications (the kubelet will restart a crashed container automatically for you anyway). It runs only when Kubernetes terminates the container -- pod deletion, eviction, a failed liveness probe, and so on -- not when the container crashes or exits on its own. Also, hook handler output is not written to the container logs; if a handler fails, the kubelet broadcasts a FailedPreStopHook event that you can see with kubectl describe pod, which is why your echo never shows up in kubectl logs.
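If you need evidence that the hook ran, one option is to write a marker file to a mounted volume rather than relying on stdout. A minimal sketch, with illustrative names and paths:

```yaml
# Illustrative only: the preStop handler records a timestamp on a hostPath
# volume, since hook output does not appear in `kubectl logs`.
apiVersion: v1
kind: Pod
metadata:
  name: prestop-marker-demo
spec:
  volumes:
  - name: shutdown-info
    hostPath:                        # hostPath chosen purely for illustration
      path: /var/log/shutdown-info
      type: DirectoryOrCreate
  containers:
  - name: app
    image: busybox:1.36
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: shutdown-info
      mountPath: /shutdown-info
    lifecycle:
      preStop:
        exec:
          command: ["sh", "-c", "date > /shutdown-info/last-prestop"]
```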

Intercepting All Pod Shutdowns on Kubernetes to Perform Diagnostics

I would like to intercept all pod shutdowns to perform diagnostic actions like fetching logs. Is this possible either via the k8s API or some type of Linux hook on process exit?
I would encourage you to read about Logging Architecture in Kubernetes.
Application logs can help you understand what is happening inside your application. The logs are particularly useful for debugging problems and monitoring cluster activity.
Depending on your needs, you can configure it at the node level or the cluster level.
Cluster-level logging architectures require a separate backend to store, analyze, and query logs. Kubernetes does not provide a native storage solution for log data. Instead, there are many logging solutions that integrate with Kubernetes.
Depending on your environment (local or cloud) and your needs, you can use many integrated applications to centralize logs, like Fluentd, Stackdriver, Datadog, Logspout, etc.
In short, you would be able to get all the logs from deleted pods and find the root cause.
Another thing which might help you to achieve your goal is to use Container Lifecycle Hooks like PostStart and PreStop.
Analogous to many programming language frameworks that have component lifecycle hooks, such as Angular, Kubernetes provides Containers with lifecycle hooks. The hooks enable Containers to be aware of events in their management lifecycle and run code implemented in a handler when the corresponding lifecycle hook is executed.
If you want to implement them in your setup, you can check the Attach Handlers to Container Lifecycle Events documentation, which uses the postStart and preStop events.
Kubernetes sends the postStart event immediately after a Container is started, and it sends the preStop event immediately before the Container is terminated. A Container may specify one handler per event.
For example, you could configure the preStop event to write some last logs, errors, or the exit code to a file.
There is also an option to set a specific termination message path and write there the status or the reason why the pod was terminated. More details can be found in the Determine the Reason for Pod Failure documentation.
The last thing worth mentioning is the termination grace period. The grace period is the time the kubelet gives you to shut down gracefully (by handling TERM signals). You can find additional information in Termination of Pods. It might be the solution if a pod needs more than 30 seconds to shut down.
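Putting those pieces together, here is a minimal sketch (illustrative names) that combines a preStop handler, a custom termination message path, and an extended grace period:

```yaml
# Illustrative combination of the three knobs discussed above.
apiVersion: v1
kind: Pod
metadata:
  name: diagnostics-demo
spec:
  terminationGracePeriodSeconds: 120       # longer than the default 30s
  containers:
  - name: app
    image: busybox:1.36
    command: ["sh", "-c", "sleep 3600"]
    # Whatever is written to this file before the container exits shows up
    # in the container's status (visible with kubectl get pod -o yaml).
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    lifecycle:
      preStop:
        exec:
          command: ["sh", "-c", "echo \"stopping at $(date)\" > /dev/termination-log"]
```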
It is also worth mentioning that you can use a script such as Kubetail to get logs from pods.
A bash script that enables you to aggregate (tail/follow) logs from multiple pods into one stream. This is the same as running "kubectl logs -f" but for multiple pods.

With Kubernetes Is there a way to wait for a pod to finish its ongoing tasks before updating it?

I'm managing an application inside Kubernetes.
I have a front end (nginx, Flask) and a back end (Celery).
Long-running tasks are sent to the back end using middleware (RabbitMQ).
My issue here is that I can receive long-running tasks at any time, and I don't want them to disturb my plan of upgrading the version of my application.
I'm using the command kubectl apply -f $MY_FILE to deploy/update my application. But if I do it when a Celery pod is busy, the pod will be terminated and I'll lose the task.
I tried using the readiness probe, but the pods are still being terminated.
My question is: is there a way for Kubernetes to target only 'free' pods and wait for the busy ones to finish?
Thank you
You can use a preStop hook to complete ongoing tasks before the pod is terminated.
Kubernetes sends the preStop event immediately before the Container is terminated. Kubernetes’ management of the Container blocks until the preStop handler completes, unless the Pod’s grace period expires. For more details, see Termination of Pods.
https://kubernetes.io/docs/tasks/configure-pod-container/attach-handler-lifecycle-event/#define-poststart-and-prestop-handlers
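As a rough sketch of how that could look for a Celery worker -- the image name and the drain script are assumptions about your setup, not a prescribed implementation:

```yaml
# Illustrative: give the worker time to finish in-flight tasks before the
# kubelet sends SIGKILL.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: celery-worker                       # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: celery-worker
  template:
    metadata:
      labels:
        app: celery-worker
    spec:
      terminationGracePeriodSeconds: 600    # must cover the longest expected task
      containers:
      - name: worker
        image: registry.example.com/celery-worker:1.0   # hypothetical image
        lifecycle:
          preStop:
            exec:
              # Hypothetical drain script baked into the image: stop consuming
              # new tasks and block until in-flight tasks complete.
              command: ["/bin/sh", "-c", "/app/drain-and-wait.sh"]
```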
One way is to create another deployment with the new image and expose it as a service. Pass any new requests ONLY to this new deployment/service.
Meanwhile, the old deployment/service can still continue processing the existing requests and not take any new ones. Once all the requests are processed, the old deployment/service can be deleted.
The only problem with this approach is that roughly double the resources are required for some duration, as the old and new deployment/service run in parallel.
Something like A/B testing. FYI ... Istio makes this easy with traffic management.
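A minimal sketch of that idea with plain Kubernetes objects (all names and images are hypothetical): two Deployments carry a version label, and the Service selector decides which one receives new traffic.

```yaml
# Flipping the selector from v1 to v2 sends new requests to the new version
# while the old pods keep draining their existing work.
apiVersion: v1
kind: Service
metadata:
  name: task-api
spec:
  selector:
    app: task-api
    version: v2            # switch from v1 to v2 once the new Deployment is ready
  ports:
  - port: 80
    targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: task-api-v2
spec:
  replicas: 2
  selector:
    matchLabels:
      app: task-api
      version: v2
  template:
    metadata:
      labels:
        app: task-api
        version: v2
    spec:
      containers:
      - name: api
        image: registry.example.com/task-api:2.0    # hypothetical image
        ports:
        - containerPort: 8080
```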

Specify scheduling order of a Kubernetes DaemonSet

I have Consul running in my cluster and each node runs a consul-agent as a DaemonSet. I also have other DaemonSets that interact with Consul and therefore require a consul-agent to be running in order to communicate with the Consul servers.
My problem is, if my DaemonSet is started before the consul-agent, the application will error as it cannot connect to Consul and subsequently get restarted.
I also notice the same problem with other DaemonSets, e.g. Weave, as it requires kube-proxy and kube-dns. If Weave is started first, it will constantly restart until the kube services are ready.
I know I could add retry logic to my application, but I was wondering if it was possible to specify the order in which DaemonSets are scheduled?
Kubernetes itself does not provide a way to specify dependencies between pods / deployments / services (e.g. "start pod A only if service B is available" or "start pod A after pod B").
The current approach (based on what I found while researching this) seems to be retry logic or an init container. To quote the docs:
They run to completion before any app Containers start, whereas app Containers run in parallel, so Init Containers provide an easy way to block or delay the startup of app Containers until some set of preconditions are met.
This means you can either add retry logic to your application (which I would recommend, as it might help you in different situations such as a short service outage), or you can use an init container that polls a health endpoint via the Kubernetes service name until it gets a satisfying response.
Retry logic is preferred over startup dependency ordering, since it handles both the initial bring-up case and recovery from post-start outages.
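As a rough sketch of the init-container approach for the Consul case (the names, image, and the way the agent is reached are assumptions about your setup):

```yaml
# Illustrative: the init container blocks the main container until the local
# consul agent answers on its HTTP API port.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: my-consul-client              # hypothetical name
spec:
  selector:
    matchLabels:
      app: my-consul-client
  template:
    metadata:
      labels:
        app: my-consul-client
    spec:
      initContainers:
      - name: wait-for-consul
        image: busybox:1.36
        # Poll the consul agent on the node (reached via the node's hostIP here
        # as an assumption; adjust to however your agent is exposed).
        command:
        - sh
        - -c
        - |
          until wget -q -O- "http://${HOST_IP}:8500/v1/status/leader" > /dev/null; do
            echo "waiting for consul agent..."; sleep 2;
          done
        env:
        - name: HOST_IP
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
      containers:
      - name: app
        image: registry.example.com/my-app:1.0    # hypothetical image
```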

How to properly use Kubernetes for job scheduling?

I have the following system in mind: A master program that polls a list of tasks to see if they should be launched (based on some trigger information). The tasks themselves are container images in some repository. Tasks are executed as jobs on a Kubernetes cluster to ensure that they are run to completion. The master program is a container executing in a pod that is kept running indefinitely by a replication controller.
However, I have not stumbled upon this pattern of launching jobs from a pod. Every tutorial seems to be assuming that I just call kubectl from outside the cluster. Of course I could do this but then I would have to ensure the master program's availability and reliability through some other system. So am I missing something? Launching one-off jobs from inside an indefinitely running pod seems to me as a perfectly valid use case for Kubernetes.
Your master program can utilize the Kubernetes client libraries to perform operations on a cluster. Find a complete example here.
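For reference, a minimal sketch (hypothetical names and image) of the kind of Job object the master program would create through the client library for each task; running inside the cluster, the client authenticates with the pod's service account, which needs RBAC permission to create Jobs:

```yaml
# Illustrative Job spec the master program would submit via the API for each task.
apiVersion: batch/v1
kind: Job
metadata:
  generateName: task-        # let the API server append a unique suffix
spec:
  backoffLimit: 3            # retry the task a few times before giving up
  template:
    spec:
      restartPolicy: Never   # Jobs require Never or OnFailure
      containers:
      - name: task
        image: registry.example.com/task-runner:1.0   # hypothetical task image
        args: ["--task-id", "example"]                # hypothetical arguments
```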