Kubernetes - keeping the execution logs of a pod

I'm trying to keep the execution logs of containers in Kubernetes.
I added successfulJobsHistoryLimit: 5 and failedJobsHistoryLimit: 5 to my CronJob YAML in order to see the execution history, but when I try to view the logs of those pods I get this error.
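For reference, the relevant part of my CronJob spec looks roughly like this (the name, schedule and image are just placeholders):

apiVersion: batch/v1              # batch/v1beta1 on older clusters
kind: CronJob
metadata:
  name: my-cronjob                # placeholder name
spec:
  schedule: "*/5 * * * *"         # placeholder schedule
  successfulJobsHistoryLimit: 5   # keep the last 5 successful Jobs (and their pods)
  failedJobsHistoryLimit: 5       # keep the last 5 failed Jobs (and their pods)
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: my-task
            image: my-image:latest   # placeholder image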
I assume this is because the pods have been deleted, since when I go to a pod that is still running I can see its logs.
So is there a way of keeping these logs in this part of Kubernetes, or is there something I have to set up in order to have this functionality?
Sorry if this question has already been asked, but I didn't really find anything and I'm new to Kubernetes.
Thanks for the replies.

Looking at this problem from a bigger picture, it's generally a good idea to have your logs stored via logging agents or pushed directly into an external service, as per the official documentation.
Taking advantage of the Kubernetes logging architecture explained here, you can also try to fetch the logs directly from the log-rotate files on the node hosting the pods. Please note that this option might depend on the specific Kubernetes implementation, as log files might be deleted when pod eviction is triggered.
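As a rough sketch of that second approach (the exact paths can vary with the Kubernetes distribution and container runtime; the pod, namespace and node names are placeholders):

kubectl get pod <pod-name> -o wide                        # the NODE column shows where the pod ran
# then, on that node (for example over SSH):
sudo ls /var/log/pods/<namespace>_<pod-name>_<pod-uid>/   # per-container log directories
sudo ls /var/log/containers/                              # symlinks to the individual container log files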

Related

Get K8s deployment logs for a specific release

Is there a way I can get release logs for a particular K8s release within my K8s cluster, given that the ReplicaSets related to that deployment are no longer serving pods?
For example, kubectl rollout history deployment/pod1-dep would result in:
1
2 <- failed deploy
3 <- Latest deployment successful
If I want to pick out the logs related to the events in revision 2, would that be possible, or is there a way to achieve such functionality?
This is a Community Wiki answer, posted for better visibility, so feel free to edit it and add any additional details you consider important.
As David Maze rightly suggested in his comment above:
Once a pod is deleted, its logs are gone with it. If you have some sort of external log collector, that will generally keep historical logs for you, but you needed to have set that up before you attempted the update.
So the answer to your particular question is: no, you can't get such logs once those particular pods are deleted.
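What you can still inspect for a deleted revision is its pod template (not its logs), for example with the deployment name from the question:

kubectl rollout history deployment/pod1-dep --revision=2   # prints the pod template that revision used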

K8s: why is there no easy way to get notifications if a pod becomes unhealthy and is restarted?

Why is there no easy way to get notifications if a pod becomes unhealthy and is restarted?
To me, it suggests I shouldn't care that a pod was restarted, but why not?
If a pod/container crashes for some reason, Kubernetes is supposed to provide the reliability/availability of starting it somewhere else in the cluster. Having said that, you probably still want warnings and alerts (for example, if the pod goes into CrashLoopBackOff).
Although you can write your own tool that watches for specific events in your cluster and alerts/warns on them, you can also use some of these tools:
kubewatch
kube-slack (Slack tool).
Prometheus, the most popular K8s monitoring tool.
A paid tool like Sysdig.
Think of Pods as ephemeral entities - they can live in different nodes, they can crash, they can start again...
Kubernetes is responsible for handling the lifecycle of a pod. Your job is to tell it where to run (affinity rules) and how to tell whether a pod is healthy.
There are many ways of monitoring pod crashes. For example, Prometheus has a great integration with Kubernetes.
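As an illustration only (this assumes kube-state-metrics is installed, so the kube_pod_container_status_restarts_total metric is available, and the threshold is arbitrary), a Prometheus alerting rule for restarting pods could look roughly like this:

groups:
- name: pod-restarts
  rules:
  - alert: PodRestartingTooOften
    expr: increase(kube_pod_container_status_restarts_total[1h]) > 3
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "Pod {{ $labels.namespace }}/{{ $labels.pod }} restarted more than 3 times in the last hour"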
I wrote an open source tool to do this called Robusta. (Yes, it's named after the coffee.)
You can send the notifications to multiple destinations - here is a screenshot for Slack.
Under the hood we're using our own fork of Kubewatch to track APIServer events, but we're adding on multiple features like fetching logs.
You define in YAML the triggers and the actions:
- triggers:
  - on_pod_update: {}
  actions:
  - restart_loop_reporter:
      restart_reason: CrashLoopBackOff
  - image_pull_backoff_reporter:
      rate_limit: 3600
Each action is defined with a Python function, but you typically don't need to write them yourself because we have 50+ built-in actions. (See some examples here.)

Heapster status stuck in Container Creating or Pending status

I am new to Kubernetes and have been working with it for the past month.
When setting up the cluster, I sometimes see Heapster stuck in ContainerCreating or Pending status. When this happens, the only fix I have found is to re-install everything from scratch, which does solve the problem; after that, Heapster runs without any issue. But that is not an optimal solution every time, so please help me figure out how to fix this issue when it occurs again.
The Heapster image is pulled from GitHub for our use. Right now the cluster is running fine, so I can't provide a screenshot of Heapster stuck in ContainerCreating or Pending status.
Please suggest an alternative way to solve the problem if it occurs again.
Thanks in advance for your time.
A pod stuck in Pending state can mean more than one thing. Next time it happens you should run kubectl get pods and then kubectl describe pod <pod-name>. However, since it works sometimes, the most likely cause is that the cluster doesn't have enough resources on any of its nodes to schedule the pod. If the cluster is low on remaining resources you should get an indication of this from kubectl top nodes and kubectl describe nodes. (Or with GKE, if you are on Google Cloud, you often get a low-resource warning in the web UI console.)
(Or if in Azure then be wary of https://github.com/Azure/ACS/issues/29 )
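Putting those checks together, the next time it happens you could run something like the following (the pod name is a placeholder; Heapster typically runs in the kube-system namespace):

kubectl get pods -n kube-system | grep -i heapster
kubectl describe pod <heapster-pod-name> -n kube-system   # check the Events section at the bottom
kubectl top nodes        # needs a metrics source, which may itself be unavailable if Heapster is down
kubectl describe nodes   # compare Allocatable with the allocated resources per node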

kubernetes minikube faster uptime

I am using minikube and building my projects by tearing down the previous project and rebuilding it with
kubectl delete -f myprojectfiles
kubectl apply -f myprojectfiles
The files are a deployment and a service.
When I access my website I get a 503 error while I'm waiting for Kubernetes to bring up the deployment. Is there any way to speed this up? I can see that my application is already built because the logs show it is ready. However, it keeps returning 503 for what feels like a few minutes before everything in Kubernetes triggers and starts serving me the application.
What are some things I can do to speed up the uptime?
Configure what is called a readinessProbe. It won't speed up your boot time, but it will stop you getting a false sense that the application is up and running: traffic will only be sent to your application pod once it is ready to accept connections. Please read about it here.
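For example, a minimal sketch of a readinessProbe (this assumes a hypothetical HTTP service exposing a /healthz endpoint on port 8080; adjust the path, port and timings to your application):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp                   # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: myapp:latest     # placeholder image
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /healthz      # assumed health endpoint
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5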
FWIW, your application might be waiting on some dependency to be up and running; add this kind of health check to that dependency's pod as well.
You should not delete your Kubernetes resources. Use either kubectl apply or kubectl replace to update your project.
If you delete it, the nginx ingress controller won't find any upstream for a short period of time and puts it on a blacklist for some seconds.
Also, you should make sure that you use a Deployment that can do a rolling update without any downtime.
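A sketch of the Deployment fields involved (standard fields, meant to be merged into your existing Deployment spec): keeping maxUnavailable at 0 means old pods keep serving until their replacements pass the readinessProbe, which is what avoids the 503s during an update.

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never remove an old pod before its replacement is Ready
      maxSurge: 1         # allow one extra pod during the update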

What is stored in a kubernetes job and how do I check resource use of old job(s)?

This morning I learned about the (unfortunate) default in Kubernetes that all previously run CronJobs' Job instances are retained in the cluster. Mea culpa for not reading that detail in the documentation. I also notice that deleting jobs (kubectl delete job [<foo> or --all]) takes quite a long time. Further, I noticed that even a reasonably provisioned Kubernetes cluster with three large nodes appears to fail (timeouts of all sorts when trying to use kubectl) when there are just ~750 such old jobs in the system (plus some other active containers that otherwise had not entailed heavy load) [Correction: there were also ~7k pods associated with those old jobs that were also retained :-o]. (I did learn about the configuration settings to limit/avoid storing old jobs from cronjobs, so this won't be a problem [for me] in the future.)
So, since I couldn't find documentation for kubernetes about this, my (related) questions are:
what exactly is stored when kubernetes retains old jobs? (Presumably it's the associated pod's logs and some metadata, but this doesn't explain why they seemed to place such a load on the cluster.)
is there a way to see the resources (disk only, I assume, but maybe there is some other resource) that individual or collective old jobs are using?
why does deleting a kubernetes job take on the order of a minute?
I don't know if k8s provides that kind of detail about how much disk space each job is consuming, but here is something you can try.
Try to find the pods associated with the job:
kubectl get pods --selector=job-name=<job name> --output=jsonpath='{.items..metadata.name}'
Once you know the pod, find the Docker container associated with it:
kubectl describe pod <pod name>
In the output above, look for the Node and the Container ID. Then go to that node and look under the path /var/lib/docker/containers/<container id found above>, where you can do some investigation to find out what is wrong.
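Put together, a rough walk-through of those steps might look like this (the job name is a placeholder, and the docker path assumes the node uses the Docker runtime with its default data directory):

kubectl get pods --selector=job-name=<job name> --output=jsonpath='{.items..metadata.name}'
kubectl describe pod <pod name> | grep -E 'Node:|Container ID:'
# then, on the node printed above:
sudo du -sh /var/lib/docker/containers/<container id>   # rough on-disk footprint of that container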