Get log files from Pod (containers) before it get killed? - kubernetes

I am having 2 containers inside one pod, 1 is DB and 1 is application. When my application container started but, not ready to accept traffic by this time the container generates some log files and I want those log files for further application investigation. As the container does not passes readiness probe and it failed to start so, the pod is getting killed so, the log files also getting deleted so how I can get those log files before the pod is getting killed??

The quickest solution is probably to just mount a volume of type hostPath to your pod. Then, bind this volume to your log directory.
See the documentation here.
Just keep in mind that this solution is certainly not the cleanest one. It's just for debug purpose.

What about forward logs to STDOUT and STDERR? That would be the cleanest solution (however, it requires some changes in your code).
https://kubernetes.io/docs/reference/kubectl/cheatsheet/#interacting-with-running-pods

A one-off solution is to install stern and then run the following command in a separate terminal before starting your application container:
stern <pod_name>
You can then pipe the output to local storage for further analysis

You can get logs of a container inside a pod by using -c CONTAINER flag of oc logs command.
If you know the name of the container in your pod, you can get the logs of that container with a command like below
for i in {1..100}; do oc get pods -o name | grep -v "deploy" | xargs -i oc logs -p {} -c CONTAINER_NAME; done
Of course it will be good if you run this in an empty project only with your failing pod.

create a persistent volume and mount it on the log directory of containers.You will get logs even after pod is killed. Few volume type that you can be used for this task are -
azure disk
hostpath
gce persistent disk
The simplest one is hostpath but not preferred .

Related

How do we check container logs in kubernetes before they are written to the log file?

kubectl logs -f <pod-name>
This command shows the logs from the container log file.
Basically, I want to check the difference between "what is generated by the container" and "what is written to the log file".
I see some unusual binary logs, so I just want to find out if the container is creating those binary logs or the logs are not properly getting written to the log file.
"Unusual logs":
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\
Usually, containerized applications, do not write to the log files but send messages to stdout/stderr, there is no point in storing log files inside containers, as they will be deleted when the pod is deleted.
What you see when running
kubectl logs -f <pod-name>
are messages sent to stdout/stderr. There are no container specific logs here, only application logs.
If, for some reason, your application does write to the log file, you can check it by execing into pod with e.g.
kubectl exec -it <pod-name> -- /bin/bash
and read logs as you would in shell.
Edit
Application logs
A container engine handles and redirects any output generated to a containerized application's stdout and stderr streams. For example, the Docker container engine redirects those two streams to a logging driver, which is configured in Kubernetes to write to a file in JSON format.
Those logs are also saved to
/var/log/containers/
/var/log/pods/
By default, if a container restarts, the kubelet keeps one terminated container with its logs. If a pod is evicted from the node, all corresponding containers are also evicted, along with their logs.
Everything you see by issuing the command
kubectl logs <pod-name>
is what application sent to stdout/stderr, or what was redirected to stdout/stderr. For example nginx:
The official nginx image creates a symbolic link from /var/log/nginx/access.log to /dev/stdout, and creates another symbolic link from /var/log/nginx/error.log to /dev/stderr, overwriting the log files and causing logs to be sent to the relevant special device instead.
Node logs
Components that do not run inside containers (e.g kubelet, container runtime) write to journald. Otherwise, they write to .log fies inside /var/log/ directory.
Excerpt from official documentation:
For now, digging deeper into the cluster requires logging into the relevant machines. Here are the locations of the relevant log files. (note that on systemd-based systems, you may need to use journalctl instead)
Master
/var/log/kube-apiserver.log - API Server, responsible for serving the API
/var/log/kube-scheduler.log - Scheduler, responsible for making scheduling decisions
/var/log/kube-controller-manager.log - Controller that manages replication controllers
Worker Nodes
/var/log/kubelet.log - Kubelet, responsible for running containers on the node
/var/log/kube-proxy.log - Kube Proxy, responsible for service load balancing
The only way I could imagine this to work is to make use of some external logging facility like Syslog or Elastisearch or anything else. Configure your application to send logs directly to logging facility (avoiding agents like fluentd or logstash which parse logs from files).
All modern languages have support for external logging. You can also configure Docker to send logs to syslog server.
Simple way to check log is kubernets:
=> If pod have single container
kubectl logs POD_NAME
=> If pod have multiple containers
kubectl logs POD_NAME -c CONTAINER_NAME -n NAMESPACE

Where does K8s store the logs that it prints when running kubectl logs -f

I am pretty sure it writes it on disk somewhere. Otherwise if the container runs for several hours and logs a lot, then it would exceed what the stderr can hold I think. No?
Is it possible to compress and download the logs of kubectl logs?i.e. comparess on the container without downloading them?
Firstly take a look on logging-kubernetes official documentation.
In most cases, Docker container logs are put in the /var/log/containers directory on your host (host they are deployed on). Docker supports multiple logging drivers but Kubernetes API does not support driver configuration.
Once a container terminates or restarts, kubelet keeps its logs on the node. To prevent these files from consuming all of the host’s storage, a log rotation mechanism should be set on the node.
You can use kubectl logs to retrieve logs from a previous instantiation of a container with --previous flag, in case the container has crashed.
If you want to take a look at additional logs. For example, in Linux journald logs can be retrieved using the journalctl command:
$ journalctl -u docker
You can implement cluster-level logging and expose or push logs directly from every application but the implementation for such a logging mechanism is not in the scope of Kubernetes.
Also there are many tools offered for Kubernetes for logging management and aggregation - see: logs-tools.

Kubernetes: view logs of crashed Airflow worker pod

Pods on our k8s cluster are scheduled with Airflow's KubernetesExecutor, which runs all Tasks in a new pod.
I have a such a Task for which the pod instantly (after 1 or 2 seconds) crashes, and for which of course I want to see the logs.
This seems hard. As soon the pod crashes, it gets deleted, along with the ability to retrieve crash logs. I already tried all of:
kubectl logs -f <pod> -p: cannot be used since these pods are named uniquely
(courtesy of KubernetesExecutor).
kubectl logs -l label_name=label_value: I
struggle to apply the labels to the pod (if this is a known/used way of working, I'm happy to try further)
An shared nfs is mounted on all pods on a fixed log directory. The failing pod however, does not log to this folder.
When I am really quick I run kubectl logs -f -l dag_id=sample_dag --all-containers (dag_idlabel is added byAirflow)
between running and crashing and see Error from server (BadRequest): container "base" in pod "my_pod" is waiting to start: ContainerCreating. This might give me some clue but:
these are only but the last log lines
this is really backwards
I'm basically looking for the canonical way of retrieving logs from transient pods
You need to enable remote logging. Code sample below is for using S3. In airflow.cfg set the following:
remote_logging = True
remote_log_conn_id = my_s3_conn
remote_base_log_folder = s3://airflow/logs
The my_s3_conn can be set in airflow>Admin>Connections. In the Conn Type dropdown, select S3.

Difficulty with different kubernetes pods run using kubetctl apply running same container images sharing directories

I am attempting to run two separate pods using the same container image on a cluster by applying a config file. Despite there being no shared or persistent volume when both pods are active the same directory on both pods is updated with created files from the other pod and write access changes suddenly. The container being used is the jupyter-docker-stacks jupyter/minimal-notebook image being pulled directly from dockerhub. These pods running this container is created by applying a manifest. The two pods have different labels and names. A service with a unique name is created for each pod for access.
Do resources for containers persist over time on a cluster like in docker containers? I cannot find something equivalent to a --rm flag to be used alongside kubectl apply.
Thanks
If you want to delete the pod after the job is completed, you might want to apply job instead of pod. The idea of job in k8s is to launch a pod and do the job, and then the pod get stopped. For more info: https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/
$ kubectl apply -f <fileName> will create or make some changes in the pod. If you want to delete pod using apply you must use $ kubectl delete -f <fileName>
About sharing, if you have 2 separate manifest you can specify volumeMounts for each container. For more information please read the documentation depends on your needs.
Also as #Kaizhe Huang advised you can use Job if you want to execute something one time or try initContainers if you want to install something in POD before main container will be run. More about initContainers here.
You could check the dockerfile of your image. See if there are 'VOLUME' claimed. If have, maybe they share the same volume on host. Not sure, but you could check.

Can I get hold of a log file in a kubernetes pod?

Is there any way to get hold of the log file of the pod in Kubernetes cluster?
I know I can fetch logs using "kubectl exec log -f $POD_NAME" command but I want to get access to log file directly.
It depends on the logging driver you're using
I'm assuming you're using the default json logging driver here, but you can see the node the pod is scheduled on by using kubectl get po -o wide
Then, logon to that node and you'll see the docker logs of the container under /var/lib/docker/containers/<long_container_id>/<long_container_id>-json.log
You will need to use docker ps and docker inspect to determine the long container id.
Run kubectl get pod <pod_name> -n <namespace> -o jsonpath='{.spec.nodeName}' to get the node this Pod is running on.
ssh into the node and you'll find the logs for the Pod at /var/log/pods/<namespace>_<pod_name>_<pod_id>/<container_name>/.
The files within the /var/log/pods/<namespace>_<pod_name>_<pod_id>/<container_name>/ directory are symlinks to where your container runtime writes its container log files. So unlike jaxxstorm's answer, it doesn't matter which container runtime you're running.
I normally retrieve it from /var/log/containers where you will find all the containers' logs deployed on that particular machine