I think the Kubernetes preStop hook's capabilities are poorly defined.
My requirement is straightforward: call an API to identify whether the pod crashed. However, I am not getting any logs, even though I have added an echo statement in the command section.
Can someone point me to an example that achieves a similar requirement?
The preStop hook is designed for orderly shutdown, not for crash notifications (the kubelet will restart the container automatically for you anyway).
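If you still want a hook to notify an API when the container is asked to stop, note two things: the output of an exec hook (your echo) is not written to the container log, and the hook only runs on termination requests (delete, eviction, rolling update), not after the container has already crashed. A minimal sketch using an httpGet handler instead (the host, port, path, and pod name are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: notify-on-stop                      # illustrative name
spec:
  containers:
  - name: app
    image: nginx
    lifecycle:
      preStop:
        httpGet:
          host: notifier.example.internal   # illustrative notification endpoint
          port: 8080
          path: /pod-stopping
```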
Related
I am using Kubernetes (specifically, I am using Azure Kubernetes Service, if that matters in this case). As far as I understand:
I can use the preStop hook to execute some code before my pod is terminated.
The preStop hook will be interrupted if it runs for longer than what is allowed by the terminationGracePeriodSeconds parameter.
The preStop hook is executed at least once in any situation in which the pod is terminated. That is what I understood from this documentation, although I may be making some mistaken assumptions.
I wanted to know: is there any case in which the preStop hook will not be called at all? For instance, if the pod is evicted or some other scenario.
Also: how safe is it to rely on this hook to execute some logic needed to save the state of the pod?
And, finally: I assume this hook will not be executed if the node (on which the pod "lives") crashes. Is that correct?
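For reference, a minimal manifest that combines a preStop hook with terminationGracePeriodSeconds might look like the sketch below (the image, script path, and timings are illustrative); the hook plus the container's own shutdown must finish within the grace period, otherwise the container is killed:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: graceful-app                   # illustrative name
spec:
  terminationGracePeriodSeconds: 60    # hook + shutdown must finish within this window
  containers:
  - name: app
    image: my-app:1.0                  # illustrative image
    lifecycle:
      preStop:
        exec:
          # Illustrative: flush in-memory state to disk before SIGTERM is delivered.
          command: ["/bin/sh", "-c", "/app/save-state.sh"]
```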
One of my namespaces is stuck in the Terminating state.
There are many posts that explain how to forcefully delete such namespaces, but the ultimate result is that everything in the namespace is gone. That is not what you want, especially if the termination was the result of a mistake or a bug (or might cause downtime of any kind).
Is it possible to tell Kubernetes not to try to delete that namespace anymore? Where is that state kept?
The Terminating state blocks me from recreating the whole stack with GitOps (installing a Helm chart into such a namespace is not possible).
I simply wish to remove the Terminating state; my FluxCD controller would fix everything else.
Is there a way to cancel namespace termination in Kubernetes?
As far as I know, unfortunately not. Termination is a one-way process. Note how Pod termination takes place:
You send a command or API call to terminate the Pod.
Kubernetes updates the Pod status to reflect the time after which the Pod is to be considered "dead" (the time of the termination request plus the grace period).
Kubernetes marks the Pod state as "Terminating" and stops sending traffic to the Pod.
Kubernetes sends a TERM signal to the Pod, indicating that the Pod should shut down.
When the grace period expires, Kubernetes issues a SIGKILL to any processes still running in the Pod.
Kubernetes removes the Pod from the API server on the Kubernetes Master.
So it is impossible to cancel the termination process.
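As an illustration of the TERM/SIGKILL part of that sequence, a container whose main process traps SIGTERM and exits within the grace period shuts down cleanly; if it ignored the signal, it would be SIGKILLed once the grace period expires. A small sketch (names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sigterm-demo                   # illustrative name
spec:
  terminationGracePeriodSeconds: 30
  containers:
  - name: app
    image: busybox
    # Trap the TERM signal and exit before the grace period runs out;
    # otherwise the process would be SIGKILLed when the period expires.
    command: ["/bin/sh", "-c", "trap 'echo got TERM, exiting; exit 0' TERM; while true; do sleep 1; done"]
```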
Is it possible to tell Kubernetes not to try to delete that namespace anymore?
There is no dedicated solution, but you can try to automate this process with custom scripts. Look at this example in Python and another one in Bash.
See also this question.
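As for where that state is kept: it lives on the Namespace object itself. Once a delete is requested, the API server sets metadata.deletionTimestamp (which cannot be unset) and status.phase becomes Terminating; the namespace stays there until its finalizer is cleared. A stuck namespace looks roughly like this (the name and timestamp are illustrative):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-namespace
  deletionTimestamp: "2023-01-01T00:00:00Z"   # set by the API server, cannot be removed
spec:
  finalizers:
  - kubernetes          # blocks final removal until all namespaced resources are gone
status:
  phase: Terminating
```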
I would like to intercept all pod shutdowns to perform diagnostic actions like fetching logs. Is this possible either via the k8s API or some type of Linux hook on process exit?
I would encourage you to read about Logging Architecture in Kubernetes.
Application logs can help you understand what is happening inside your application. The logs are particularly useful for debugging problems and monitoring cluster activity.
Depending on your needs, you can configure it at the node level or the cluster level.
Cluster-level logging architectures require a separate backend to store, analyze, and query logs. Kubernetes does not provide a native storage solution for log data. Instead, there are many logging solutions that integrate with Kubernetes.
Depending on your environment (local or cloud) and your needs, you can use many integrated applications to centralize logs, such as Fluentd, Stackdriver, Datadog, Logspout, etc.
In short, you would be able to get all logs from deleted pods and find the root cause.
Another thing that might help you achieve your goal is Container Lifecycle Hooks such as postStart and preStop.
Analogous to many programming language frameworks that have component lifecycle hooks, such as Angular, Kubernetes provides Containers with lifecycle hooks. The hooks enable Containers to be aware of events in their management lifecycle and run code implemented in a handler when the corresponding lifecycle hook is executed.
If you want to implement them in your setup, you can check the Attach Handlers to Container Lifecycle Events documentation. It uses the postStart and preStop events.
Kubernetes sends the postStart event immediately after a Container is started, and it sends the preStop event immediately before the Container is terminated. A Container may specify one handler per event.
For example, you could configure the preStop event to write some last logs, errors, or the exit code to a file.
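A minimal sketch of that idea (the image, paths, and collected fields are illustrative); since hook output itself does not end up in the container log, the handler should write to a file or volume, and note that an emptyDir survives container restarts but not Pod deletion:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: diagnostics-demo               # illustrative name
spec:
  containers:
  - name: app
    image: my-app:1.0                  # illustrative image
    volumeMounts:
    - name: diagnostics
      mountPath: /var/diagnostics
    lifecycle:
      preStop:
        exec:
          # Illustrative: record a timestamp and the tail of the application log
          # on a volume that outlives the container.
          command: ["/bin/sh", "-c", "date > /var/diagnostics/last-stop; tail -n 100 /var/log/app.log >> /var/diagnostics/last-stop"]
  volumes:
  - name: diagnostics
    emptyDir: {}
```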
There is also the option to set a specific termination message path and write there the status or information about why the pod was terminated. More details can be found in the Determine the Reason for Pod Failure documentation.
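That mechanism works like this: the container writes its final status to the file named by terminationMessagePath (default /dev/termination-log), and the contents then appear in the Pod's container status where kubectl can read them. A sketch (names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: termination-message-demo       # illustrative name
spec:
  containers:
  - name: app
    image: my-app:1.0                  # illustrative image
    terminationMessagePath: /tmp/termination-log
    # FallbackToLogsOnError uses the last lines of the container log
    # when nothing was written to the termination message file.
    terminationMessagePolicy: FallbackToLogsOnError
```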
The last thing worth mentioning is the termination grace period. The grace period is the time the kubelet gives you to shut down gracefully (by handling TERM signals). You can find additional information in Termination of Pods. Increasing it might be the solution if the pod needs more than the default 30 seconds to shut down.
It is also worth mentioning that you can use a script such as Kubetail to get logs from pods.
It is a Bash script that enables you to aggregate (tail/follow) logs from multiple pods into one stream. This is the same as running "kubectl logs -f <pod>" but for multiple pods.
I'm managing an application inside Kubernetes.
I have a frontend (nginx, Flask) and a backend (Celery).
Long-running tasks are sent to the backend using a middleware (RabbitMQ).
My issue here is that I can receive long-running tasks at any time, and I don't want them to disturb my plan of upgrading the version of my application.
I'm using the command kubectl apply -f $MY_FILE to deploy/update my application. But if I do it while a Celery pod is busy, the pod will be terminated and I'll lose the task.
I tried using the readiness probe, but the pods are still being terminated.
My question is: is there a way for Kubernetes to target only 'free' pods and wait for the busy ones to finish?
Thank you
You can use preStop hooks to complete ongoing tasks before the pod is terminated.
Kubernetes sends the preStop event immediately before the Container is terminated. Kubernetes’ management of the Container blocks until the preStop handler completes, unless the Pod’s grace period expires. For more details, see Termination of Pods.
https://kubernetes.io/docs/tasks/configure-pod-container/attach-handler-lifecycle-event/#define-poststart-and-prestop-handlers
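A rough sketch of that idea for a Celery worker Deployment is below (the image, app module, timings, and the drain check are illustrative assumptions; Celery's default warm shutdown on SIGTERM already waits for the task in progress, so the key point is a grace period longer than your longest task):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: celery-worker                  # illustrative name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: celery-worker
  template:
    metadata:
      labels:
        app: celery-worker
    spec:
      # Must exceed the longest task, or the worker is SIGKILLed mid-task.
      terminationGracePeriodSeconds: 3600
      containers:
      - name: worker
        image: my-app:1.0                              # illustrative image
        command: ["celery", "-A", "myapp", "worker"]   # illustrative app module
        lifecycle:
          preStop:
            exec:
              # Illustrative drain check: the application touches /tmp/idle
              # whenever it has no task in progress.
              command: ["/bin/sh", "-c", "until [ -f /tmp/idle ]; do sleep 5; done"]
```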
One way is to create another deployment with the new image and expose it as a service. Pass any new requests ONLY to this new deployment/service.
Meanwhile, the old deployment/service can continue processing the existing requests without taking any new ones. Once all the requests are processed, the old deployment/service can be deleted.
The only problem with this approach is that roughly double the resources are required for some duration, as the old and new deployments/services run in parallel.
This is something like A/B testing. FYI, Istio makes this easy with traffic management.
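With plain Kubernetes objects, the switch itself is usually just a label change on the Service selector: the old and new Deployments run side by side with different version labels, and flipping the selector sends new requests only to the new one (the names and labels below are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: task-api                 # illustrative name
spec:
  selector:
    app: myapp
    version: v2                  # flip from v1 to v2 to route new requests to the new Deployment
  ports:
  - port: 80
    targetPort: 8080
```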
I would like to take particular actions when a K8s Pod, or the node it's running on, crashes/restarts/etc. -- basically, notify another part of the application that this has happened. I also need this to be guaranteed to execute. Can a Kubernetes preStop hook accomplish this? From my understanding, these are generally used to gracefully shut down containers when a pod is deleted, and the hook handler is guaranteed to run. It seems like most people use them in scenarios where they are shutting things down themselves.
Will the hooks also run when a node unexpectedly crashes? If not, is there a kubernetes solution for what I'm trying to accomplish?
The preStop hook doesn't work for nodes. A preStop hook is a task that runs during the termination of containers, executing a specific command or an HTTP request against a specific endpoint on the Container.
If you are interested in health monitoring of nodes, you may read about node-problem-detector, which is already installed by default in Kubernetes on GCE.