Intercepting All Pod Shutdowns on Kubernetes to Perform Diagnostics

I would like to intercept all pod shutdowns to perform diagnostic actions like fetching logs. Is this possible either via the k8s API or some type of Linux hook on process exit?

I would encourage you to read about Logging Architecture in Kubernetes.
Application logs can help you understand what is happening inside your application. The logs are particularly useful for debugging problems and monitoring cluster activity.
Depending on your needs, you can configure it at the node level or the cluster level.
Cluster-level logging architectures require a separate backend to store, analyze, and query logs. Kubernetes does not provide a native storage solution for log data. Instead, there are many logging solutions that integrate with Kubernetes.
Depending on your environment (local or cloud) and your needs, you can use many integrated applications to centralize logs, like Fluentd, Stackdriver, Datadog, Logspout, etc.
In short, you would be able to get all logs from deleted pods and find the root cause.
Another thing which might help you to achieve your goal is to use Container Lifecycle Hooks like PostStart and PreStop.
Analogous to many programming language frameworks that have component lifecycle hooks, such as Angular, Kubernetes provides Containers with lifecycle hooks. The hooks enable Containers to be aware of events in their management lifecycle and run code implemented in a handler when the corresponding lifecycle hook is executed.
If you want to implement them in your setup, you can check the Attach Handlers to Container Lifecycle Events documentation, which uses postStart and preStop events.
Kubernetes sends the postStart event immediately after a Container is started, and it sends the preStop event immediately before the Container is terminated. A Container may specify one handler per event.
For example, you could configure a preStop hook to write final logs, errors, or the exit code to a file.
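A minimal sketch of such a hook; the busybox image, the /tmp/app.log path, and the /var/diagnostics hostPath are illustrative assumptions, not part of the original question:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: diagnostics-demo
spec:
  containers:
  - name: app
    image: busybox                 # placeholder application image
    # Simulated app writing a log file inside the container.
    command: ["/bin/sh", "-c", "touch /tmp/app.log; while true; do date >> /tmp/app.log; sleep 5; done"]
    volumeMounts:
    - name: diagnostics
      mountPath: /diagnostics
    lifecycle:
      preStop:
        exec:
          # Runs just before the container receives SIGTERM: copy whatever
          # diagnostic data the application keeps locally to a location that
          # survives the container.
          command: ["/bin/sh", "-c", "cp /tmp/app.log /diagnostics/app-$(date +%s).log || true"]
  volumes:
  - name: diagnostics
    hostPath:                      # illustrative; a PVC or a directory picked up by a node-level agent would also work
      path: /var/diagnostics
      type: DirectoryOrCreate
```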
There is also an option to set a specific termination message path and write status information or the reason why the pod was terminated there. More details can be found in the Determine the Reason for Pod Failure documentation.
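A small sketch of the termination message mechanism; the busybox image and the message text are just examples, and /dev/termination-log is already the default path:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: termination-demo
spec:
  containers:
  - name: app
    image: busybox
    # Write the reason for termination to the termination message file; it then
    # shows up in the Pod status under the container's terminated state message.
    command: ["/bin/sh", "-c", "sleep 10; echo 'expected shutdown after sleep' > /dev/termination-log; exit 0"]
    terminationMessagePath: /dev/termination-log    # this is also the default path
    terminationMessagePolicy: FallbackToLogsOnError # use last log lines if the file is empty
```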
The last thing worth mentioning is the termination grace period. The grace period is the time the kubelet gives you to shut down gracefully (by handling the TERM signal). You can find additional information in Termination of Pods. It might be the solution if a pod needs more than 30 seconds to shut down.
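The grace period is a single field on the Pod spec; 120 seconds below is just an example value:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: slow-shutdown-demo
spec:
  # The kubelet sends SIGTERM, waits up to this many seconds for the container
  # (and any preStop hook) to finish, then sends SIGKILL. The default is 30.
  terminationGracePeriodSeconds: 120
  containers:
  - name: app
    image: busybox   # placeholder
    command: ["/bin/sh", "-c", "sleep 3600"]
```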
It's also worth mentioning that you can use a script such as Kubetail to get logs from pods:
Bash script that enables you to aggregate (tail/follow) logs from multiple pods into one stream. This is the same as running "kubectl logs -f <pod>" but for multiple pods.

Related

Export logs of Kubernetes cronjob to a path after each run

I currently have a CronJob that schedules a job periodically and runs it in a pattern. I want to export the logs of each pod run to a file at the path temp/logs/FILENAME,
with FILENAME being the timestamp of the run. How am I going to do that? Hopefully someone can provide a solution. If you need to add a script, please use Python or a shell command. Thank you.
According to Kubernetes Logging Architecture:
In a cluster, logs should have a separate storage and lifecycle
independent of nodes, pods, or containers. This concept is called
cluster-level logging.
Cluster-level logging architectures require a separate backend to
store, analyze, and query logs. Kubernetes does not provide a native
storage solution for log data. Instead, there are many logging
solutions that integrate with Kubernetes.
Which brings us to Cluster-level logging architectures:
While Kubernetes does not provide a native solution for cluster-level
logging, there are several common approaches you can consider. Here
are some options:
Use a node-level logging agent that runs on every node.
Include a dedicated sidecar container for logging in an application pod.
Push logs directly to a backend from within an application.
Kubernetes does not provide log aggregation of its own. Therefore, you need a local agent to gather the data and send it to a central log management system. See some options below:
Fluentd
ELK Stack
You can find all the logs that pods generate at /var/log/containers/*.log
on each Kubernetes node. You could work with them manually if you prefer, using simple scripts, but you will have to keep in mind that pods can run on any node (if not restricted), and nodes may come and go.
Consider sending your logs to an external system like Elasticsearch or Grafana Loki and managing them there.
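To illustrate the sidecar option mentioned above, here is a minimal sketch following the pattern from the Kubernetes docs; the application is simulated by a shell loop writing to a file, and the sidecar streams that file to its own stdout so a node-level agent (or kubectl logs) can pick it up:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-logging-sidecar
spec:
  containers:
  - name: app
    image: busybox
    # Simulated application that writes timestamped lines to a log file.
    command: ["/bin/sh", "-c", "while true; do date >> /var/log/app/app.log; sleep 5; done"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  - name: log-streamer
    image: busybox
    # Sidecar: stream the shared log file to stdout so it ends up under
    # /var/log/containers/ on the node and can be shipped by Fluentd etc.
    command: ["/bin/sh", "-c", "tail -n+1 -F /var/log/app/app.log"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  volumes:
  - name: logs
    emptyDir: {}
```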

K8s limit for pod logs

When I try to retrieve logs from my pods, I notice that K8s does not print all the logs; I know this because the logs about microservice initialization are not present at the head of the output.
Considering that my pods print a lot of logs over a long observation period, does someone know if K8s has a limit on showing all logs?
I also tried to set --since parameter in the kubectl logs command to get all logs in a specific time range, but it seems to have no effect.
Thanks.
The container runtime engine typically manages container (pod) logs. Do check the settings on the runtime engine in use.
There was an issue with logging earlier; see this pull request: https://github.com/kubernetes/kubernetes/pull/78071
There are already some answers; I'll add more details and sources.
The answer is quite short: there is no limit other than free disk space. By default, Kubernetes is not responsible for log rotation:
An important consideration in node-level logging is implementing log
rotation, so that logs don't consume all available storage on the
node. Kubernetes is not responsible for rotating logs, but rather a
deployment tool should set up a solution to address that. For example,
in Kubernetes clusters, deployed by the kube-up.sh script, there is a
logrotate tool configured to run each hour. You can also set up a
container runtime to rotate an application's logs automatically.
As William stated, Kubernetes itself does not provide log aggregation of its own and relies on the container runtime by default.
When a container running on Kubernetes writes its logs to stdout or
stderr streams, they are picked up by the kubelet service running on
that node, and are delegated to the container engine for handling
based on the logging driver configured in Kubernetes.
In most cases, Docker container logs will end up in the
/var/log/containers directory on your host. Docker supports multiple
logging drivers but, unfortunately, Kubernetes API does not support
driver configuration.
Once a container terminates or restarts, kubelet keeps its logs on the
node. To prevent these files from consuming all of the host’s storage,
a log rotation mechanism should be set on the node.
Kubernetes doesn’t provide built-in log rotation, but this
functionality is available in many tools, such as Docker’s log-opt, or
standard file shippers or even a simple custom cron job. When a
container is evicted from the node, so are its corresponding log files.
That means you can try to find the full logs in /var/log/containers and /var/log/pods. This part from the official documentation is more precise:
By default, if a container restarts, the kubelet keeps one terminated
container with its logs. If a pod is evicted from the node, all
corresponding containers are also evicted, along with their logs.
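If the kubelet (rather than the Docker daemon) handles the log files, rotation can be tuned in the kubelet configuration; a hedged sketch, where the size and file-count values are arbitrary examples:

```yaml
# KubeletConfiguration fragment (e.g. /var/lib/kubelet/config.yaml);
# these settings apply when the container runtime logs via CRI.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
containerLogMaxSize: 10Mi   # rotate a container's log file once it reaches this size
containerLogMaxFiles: 5     # keep at most this many rotated files per container
```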
To have good visibility and accessibility of logs, you may consider a dedicated solution for log storage, e.g. a node-level logging agent or streaming to a sidecar container.
Please see these articles and the official Kubernetes documentation for concepts and examples:
Kubernetes logging architecture
Practical guide to Kubernetes

With Kubernetes Is there a way to wait for a pod to finish its ongoing tasks before updating it?

I'm managing an application inside Kubernetes.
I have a frontend (nginx, Flask) and a backend (Celery).
Long-running tasks are sent to the backend through a middleware (RabbitMQ).
My issue here is that I can receive long-running tasks at any time, and I don't want them to disturb my plan of upgrading the version of my application.
I'm using the command kubectl apply -f $MY_FILE to deploy/update my application. But if I do it while a Celery pod is busy, the pod will be terminated and I'll lose the task.
I tried using the readiness probe, but the pods are still being terminated.
My question is: is there a way for Kubernetes to target only 'free' pods and wait for the busy ones to finish?
Thank you
You can use preStop hooks to complete ongoing tasks before the pod is terminated.
Kubernetes sends the preStop event immediately before the container is terminated. Kubernetes' management of the container blocks until the preStop handler completes, unless the Pod's grace period expires. For more details, see Termination of Pods.
https://kubernetes.io/docs/tasks/configure-pod-container/attach-handler-lifecycle-event/#define-poststart-and-prestop-handlers
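A hedged sketch of what this could look like for a Celery worker; the wait_for_idle.sh script is hypothetical (you would implement it yourself, for example by polling celery inspect active), the image name is a placeholder, and the one-hour grace period is just an example upper bound:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: celery-worker
spec:
  replicas: 2
  selector:
    matchLabels:
      app: celery-worker
  template:
    metadata:
      labels:
        app: celery-worker
    spec:
      # Give long-running tasks up to an hour after deletion before the
      # kubelet force-kills the pod.
      terminationGracePeriodSeconds: 3600
      containers:
      - name: worker
        image: my-celery-image:latest        # placeholder image
        lifecycle:
          preStop:
            exec:
              # Hypothetical script that blocks until the worker has no active
              # tasks; Kubernetes waits for it (up to the grace period) before
              # sending SIGTERM to the container.
              command: ["/bin/sh", "-c", "/app/wait_for_idle.sh"]
```

Note that Celery's default warm shutdown on SIGTERM already lets in-flight tasks finish, so a sufficiently long terminationGracePeriodSeconds alone may be enough.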
One way is to create another deployment with the new image and expose it as a service. Pass any new requests ONLY to this new deployment/service.
Meanwhile, the old deployment/service can still continue processing the existing requests and not take any new ones. Once all the requests are processed, the old deployment/service can be deleted.
The only problem with this approach is that roughly double the resources are required for some duration, as the old and new deployment/service run in parallel.
This is something like A/B testing. FYI, Istio makes this easy with traffic management.

Specify scheduling order of a Kubernetes DaemonSet

I have Consul running in my cluster and each node runs a consul-agent as a DaemonSet. I also have other DaemonSets that interact with Consul and therefore require a consul-agent to be running in order to communicate with the Consul servers.
My problem is, if my DaemonSet is started before the consul-agent, the application will error as it cannot connect to Consul and subsequently get restarted.
I also notice the same problem with other DaemonSets, e.g. Weave, as it requires kube-proxy and kube-dns. If Weave is started first, it will constantly restart until the kube services are ready.
I know I could add retry logic to my application, but I was wondering if it was possible to specify the order in which DaemonSets are scheduled?
Kubernetes itself does not provide a way to specify dependencies between pods / deployments / services (e.g. "start pod A only if service B is available" or "start pod A after pod B").
The current approach (based on what I found while researching this) seems to be retry logic or an init container. To quote the docs:
They run to completion before any app Containers start, whereas app Containers run in parallel, so Init Containers provide an easy way to block or delay the startup of app Containers until some set of preconditions are met.
This means you can either add retry logic to your application (which I would recommend, as it might also help you in different situations such as a short service outage), or you can use an init container that polls a health endpoint via the Kubernetes service name until it gets a satisfying response.
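A minimal sketch of that init container approach for the Consul case in the question; the consul service name, the port 8500, and the status endpoint are assumptions and depend on how the agent is exposed in your cluster:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: consul-dependent-app
spec:
  selector:
    matchLabels:
      app: consul-dependent-app
  template:
    metadata:
      labels:
        app: consul-dependent-app
    spec:
      initContainers:
      - name: wait-for-consul
        image: busybox
        # Block the main container until Consul answers; the service name,
        # port, and path are examples for this sketch.
        command:
        - /bin/sh
        - -c
        - until wget -q -O /dev/null http://consul:8500/v1/status/leader; do echo waiting for consul; sleep 2; done
      containers:
      - name: app
        image: my-app:latest   # placeholder
```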
Retry logic is preferred over startup dependency ordering, since it handles both the initial bring-up case and recovery from post-start outages.

Kubernetes Lifecycle Hooks

I would like to take particular actions when a K8s Pod, or the node it's running on, crashes/restarts/etc., and basically notify another part of the application that this has happened. I also need this to be guaranteed to execute. Can a Kubernetes PreStop hook accomplish this? From my understanding, these are generally used to gracefully shut down containers when a pod is deleted, and the hook handler is guaranteed to run. It seems like most people use them in scenarios where they are shutting things down themselves.
Will the hooks also run when a node unexpectedly crashes? If not, is there a kubernetes solution for what I'm trying to accomplish?
A PreStop hook doesn't work for nodes. A PreStop hook is a task that runs during termination of a container and executes a specific command or an HTTP request against a specific endpoint on the container.
If you are interested in health monitoring of nodes, you may read about
node-problem-detector, which is already installed by default on Kubernetes in GCE.