I have been running a pod for more than a week, and there have been no restarts since it started. Still, I am unable to view the logs from when it started; it only gives me logs for the last two days. Is there a log rotation policy for the container, and how can I control the rotation, e.g. based on size or date?
I tried the command below, but it only shows the last two days of logs.
kubectl logs POD_NAME --since=0
Is there any other way?
Is there a log rotation policy for the container, and how can I control the rotation, e.g. based on size or date?
Log rotation is controlled by the Docker --log-driver and --log-opts flags (or their daemon.json equivalents), which on any sane system set file-size and file-count limits to prevent a runaway service from filling up the disk on the Docker host. That answer also assumes you are using Docker, but that's a fairly safe assumption.
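For example, with the default json-file driver, an /etc/docker/daemon.json along these lines (the limits here are purely illustrative, not something your host necessarily uses) caps each container at five 100 MB log files:
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "5"
  }
}
The Docker daemon needs a restart to pick this up, and it only applies to containers created afterwards.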
Is there any other way?
I strongly advise something like fluentd-elasticsearch, Graylog2, Sumo Logic, or Splunk (or whatever you prefer) in order to egress those logs from the hosts. No serious cluster would rely on infinite log disks, nor on running kubectl logs in a for loop to search the output of Pods. To say nothing of egressing the logs from the Kubernetes containers themselves, which is almost essential for keeping tabs on the health of the cluster.
Related
Background: I use glog to register a signal handler, but it cannot kill the init process (PID 1) with a kill syscall. As a result, even when fatal signals like SIGABRT are raised, the Kubernetes controller manager won't realize the pod is actually not functioning, and therefore won't kill the pod and start a new one.
My idea is to add logic to my readiness/liveness probe: check the log content of the current container and decide whether it's in a healthy state.
I tried looking at the logs on the container's local filesystem under /var/log, but haven't found anything useful.
I'm wondering whether it's possible to issue an HTTP request somewhere to get the complete log? I assume it's stored somewhere.
You can find the Kubernetes pod logs on the node where the pod runs, at:
/var/log/pods
or, if using Docker containers, under:
/var/lib/docker/containers
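On reasonably recent versions the layout under /var/log/pods is roughly one directory per pod named <namespace>_<pod-name>_<pod-uid>, with a subdirectory per container holding numbered log files, so from the node you can do something like this (a sketch; the exact layout depends on your Kubernetes and runtime versions):
ls /var/log/pods/
ls /var/log/pods/<namespace>_<pod-name>_<pod-uid>/<container-name>/
and expect to see files like 0.log plus any rotated ones.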
Containers are Ephemeral
Docker containers emit logs to the stdout and stderr output streams. Because containers are stateless, the logs are stored on the Docker host in JSON files by default.
The default logging driver is json-file. The logs are then annotated with the log origin, either stdout or stderr, and a timestamp. Each log file contains information about only one container.
As @Uri Loya said, you can find these JSON log files in the /var/lib/docker/containers/ directory on a Linux Docker host. Here's how you can access them:
/var/lib/docker/containers/<container id>/<container id>-json.log
You can collect the logs with a log aggregator and store them in a place where they'll be available forever. It's dangerous to keep logs on the Docker host because they can build up over time and eat into your disk space. That's why you should use a central location for your logs and enable log rotation for your Docker containers.
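If you don't want to hunt for the right container id by hand, Docker itself can tell you where a given container's json-file log lives (assuming the json-file driver is in use):
docker inspect --format '{{.LogPath}}' <container id>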
I have a bunch of Rancher clusters I take care of, and on some of them developers use PriorityClasses to ensure that some of the more important workloads get scheduled. The three PriorityClasses are in the three-digit range so they will not interfere with the default ones. However, at present none of the PriorityClasses is set as the default, and preemptionPolicy is not set either, so it defaults to PreemptLowerPriority.
None of the rancher, longhorn, prometheus, grafana, etc., workloads have priorityClassName set.
Long story short, I believe this causes havoc on the cluster when resources are in short supply.
Before I take my opinion to the developers I would like to collect some data to back up my story.
The question: How do I detect if the pod was Terminated due to Preemption?
I tried to google the subject but couldn't find anything. I was hoping kube-state-metrics would have something, but I didn't find anything there either.
Any help would be greatly appreciated.
You can try to look for convincing data, such as the pod termination reason, with the help of kubectl.
You can see the last restart logs of a container using the following command:
kubectl logs podname -c containername --previous
You can also use the following command to check the lifecycle events sent by the kubelet to the apiserver about the pod.
kubectl describe pod podname
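For preemption specifically, the scheduler normally records a Preempted event on the victim pod, so (assuming the events haven't been garbage-collected yet, which by default happens after about an hour) you can filter for them directly:
kubectl get events --all-namespaces --field-selector reason=Preempted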
Finally, you can also write a final message to /dev/termination-log, and this will show up as described in the docs.
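A minimal sketch of that approach, adapted from the Kubernetes termination-message docs (the image and the message are placeholders):
apiVersion: v1
kind: Pod
metadata:
  name: termination-demo
spec:
  containers:
  - name: termination-demo-container
    image: debian
    command: ["/bin/sh"]
    # write whatever you want surfaced as the termination message before exiting
    args: ["-c", "sleep 10 && echo Sleep expired > /dev/termination-log"]
Once the container exits, the message shows up under the container's lastState in kubectl describe pod or kubectl get pod -o yaml.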
To use kubectl commands with Rancher, refer to this documentation page.
I have many versions and tags of containers used by Deployments in k8s (and hence many log groups).
It would be nice if I could display the container image URI and tag in a readinessProbe or livenessProbe, so that it flows into persisted logging.
Basically, so that I know the Pod whose logs I am viewing is running on the correct image.
I thought of simply echoing it as a container variable, so I considered setting the container image URI as an environment variable in the Pod manifest.
The docs for the k8s EnvVarSource say it only supports certain fields for fieldRef; importantly, it doesn't support reading the spec.containers image field.
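For reference, fieldRef only exposes pod-level fields such as metadata.name, metadata.namespace, or status.podIP, e.g. (a minimal sketch):
env:
- name: POD_NAME
  valueFrom:
    fieldRef:
      fieldPath: metadata.name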
Does anyone have any smart ideas on how I might achieve this in other ways?
Or when/if will the Kubernetes team support this?
UPDATE:
I found that doing an echo under readinessProbe.exec.command works (the Pod reaches Ready status), but the echo output does not flow to the logs.
Only the application (server) output appears in the logs in my logging backend (CloudWatch).
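Roughly what this looks like (a sketch; IMAGE_URI is an env var set by hand in the manifest, since fieldRef cannot populate it from the image field):
env:
- name: IMAGE_URI
  value: "my-registry/my-app:1.2.3"   # placeholder; has to be kept in sync with the container image by templating
readinessProbe:
  exec:
    command: ["sh", "-c", "echo image=$IMAGE_URI"]
The stdout of an exec probe is captured by the kubelet as the probe result rather than written to the container's log stream, which would explain why it never reaches the logging backend.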
If I run, for example, kubectl logs --namespace kube-system kube-apiserver-XXXX | head -n 25, I can see output lines with only a timestamp and no date. I can't tell whether these are from the inception of the pod or not.
Generally speaking, how long do a pod's logs last in Kubernetes?
From the k8s unofficial docs:
Kubernetes performs log rotation daily, or if the log file grows beyond 10MB in size. Each rotation belongs to a single container; if the container repeatedly fails or the pod is evicted, all previous rotations for the container are lost. By default, Kubernetes keeps up to five logging rotations per container.
The container runtime attaches the current timestamp to every line produced by the application. You can display these timestamps by using the --timestamps=true option while running kubectl logs.
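Those defaults correspond to the kubelet's containerLogMaxSize and containerLogMaxFiles settings, so if you can edit the kubelet configuration on your nodes you can tune them, e.g. (a sketch of a KubeletConfiguration fragment; note these apply where the kubelet manages log rotation, i.e. CRI runtimes, not to Docker's own json-file rotation configured in daemon.json):
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
containerLogMaxSize: "10Mi"    # rotate once a container log file reaches this size
containerLogMaxFiles: 5        # keep at most this many files per container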
This morning I learned about the (unfortunate) default in Kubernetes that all job instances from previously run cronjobs are retained in the cluster. Mea culpa for not reading that detail in the documentation. I also noticed that deleting jobs (kubectl delete job [<foo> or --all]) takes quite a long time. Further, I noticed that even a reasonably provisioned Kubernetes cluster with three large nodes appears to fail (timeouts of all sorts when trying to use kubectl) when there are just ~750 such old jobs in the system (plus some other active containers that otherwise had not entailed heavy load) [Correction: there were also ~7k pods associated with those old jobs that were also retained :-o]. (I did learn about the configuration settings to limit/avoid storing old jobs from cronjobs, so this won't be a problem [for me] in the future.)
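For anyone else landing here, those settings are the successfulJobsHistoryLimit and failedJobsHistoryLimit fields on the CronJob spec; a minimal sketch (name, schedule, and image are placeholders):
apiVersion: batch/v1
kind: CronJob
metadata:
  name: example-cronjob
spec:
  schedule: "*/5 * * * *"
  successfulJobsHistoryLimit: 3   # keep only the 3 most recent successful Jobs
  failedJobsHistoryLimit: 1       # and only the most recent failed one
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: example
            image: busybox
            command: ["sh", "-c", "echo hello"]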
So, since I couldn't find documentation for kubernetes about this, my (related) questions are:
what exactly is stored when kubernetes retains old jobs? (Presumably it's the associated pod's logs and some metadata, but this doesn't explain why they seemed to place such a load on the cluster.)
is there a way to see the resources (disk only, I assume, but maybe there is some other resource) that individual or collective old jobs are using?
why does deleting a kubernetes job take on the order of a minute?
I don't know if k8s provides that kind of detail about how much disk space a job is consuming, but here is something you can try.
Try to find the pods associated with the job:
kubectl get pods --selector=job-name=<job name> --output=jsonpath={.items..metadata.name}
Once you know the pod, find the Docker container associated with it:
kubectl describe pod <pod name>
In the output above, look for the Node and the Container ID. Then go to that node, and on that node go to /var/lib/docker/containers/<container id found above>; there you can do some investigation to find out what is wrong.
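To get a rough per-container view of how much disk each container's directory (logs included) is using on that node, assuming the default json-file logging under /var/lib/docker/containers, something like this can help:
sudo du -sh /var/lib/docker/containers/* | sort -h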