how to see kube-scheduler log? - kubernetes

I also want to see the logs of kube-scheduler go files. I tried different methods like
using kubeclt logs: this show the event logs of kube-scheduler but not all the logs in kube-scheduler.
using docker logs [kube-scheduler container id]: it does not have any log.
journalctl -u kubelet: only show the log of kubelet.
All of these do not work. Please let me know if you guys found the way to log them out.

Check this on Master /var/log/kube-scheduler.log
see also:
https://kubernetes.io/docs/tasks/debug-application-cluster/debug-cluster/

You can modify the verbosity level of kube-scheduler by going into config yaml file and add an argument --v N. I find 4 works for me. It will be showing logs for various scheduler go files. Then you can check kubectl logs for a specific kube-scheduler pod.
https://kubernetes.io/docs/reference/command-line-tools-reference/kube-scheduler/

I use kubectl get events for this.

Related

How to monitor pod preemption event

I have a bunch of Rancher clusters I take care of and on some of them developers use PriorityClasses to ensure that some of the more important workloads get scheduled. The 3 PriorityClasses are in 3 digits range so they will not interfere with the default ones. However, at present none of the PriorityClasses is set as default and neither is the preemptionPolicy set so it defaults to PreemptLowerPriority.
None of the rancher, longhorn, prometheus, grafana, etc., workloads have priorityClassName set.
Long story short, I believe this causes havoc on the cluster when resources are in short supply.
Before I take my opinion to the developers I would like to collect some data to back up my story.
The question: How do I detect if the pod was Terminated due to Preemption?
I tried to google the subject but couldn't find anything. I was hoping kube state metrics would have something but I didn't find anything.
Any help would be greatly appreciated.
You can try to look for convincing data like the pod termination reason with help of kubectl.
You can see the last restart logs of a container using the following command:
kubectl logs podname -c containername --previous
You can also use the following command to check the lifecycle events sent by the kubelet to the apiserver about the pod.
kubectl describe pod podname
Finally, You can also write a final message to /dev/termination-log, and this will show up as described in the docs.
To use kubectl commands with rancher kindly refer to this documentation page.

Kubernetes: view logs of crashed Airflow worker pod

Pods on our k8s cluster are scheduled with Airflow's KubernetesExecutor, which runs all Tasks in a new pod.
I have a such a Task for which the pod instantly (after 1 or 2 seconds) crashes, and for which of course I want to see the logs.
This seems hard. As soon the pod crashes, it gets deleted, along with the ability to retrieve crash logs. I already tried all of:
kubectl logs -f <pod> -p: cannot be used since these pods are named uniquely
(courtesy of KubernetesExecutor).
kubectl logs -l label_name=label_value: I
struggle to apply the labels to the pod (if this is a known/used way of working, I'm happy to try further)
An shared nfs is mounted on all pods on a fixed log directory. The failing pod however, does not log to this folder.
When I am really quick I run kubectl logs -f -l dag_id=sample_dag --all-containers (dag_idlabel is added byAirflow)
between running and crashing and see Error from server (BadRequest): container "base" in pod "my_pod" is waiting to start: ContainerCreating. This might give me some clue but:
these are only but the last log lines
this is really backwards
I'm basically looking for the canonical way of retrieving logs from transient pods
You need to enable remote logging. Code sample below is for using S3. In airflow.cfg set the following:
remote_logging = True
remote_log_conn_id = my_s3_conn
remote_base_log_folder = s3://airflow/logs
The my_s3_conn can be set in airflow>Admin>Connections. In the Conn Type dropdown, select S3.

Recovery from kubectl crash

What is the best way to troubleshoot when kubectl doesn't responde or exit with timeout? How to get it work again?
I'm having my kubectl as well as helm on my cluster down when installing a helm chart.
General advice:
Check if your kubectl is connecting to the correct kube-api endpoint. You could take a look at your kubeconfig. It is by default stored in $HOME/.kube. Try simple CURL to make sure that it is not DNS problem, etc.
Take a look at your nodes' logs by ssh into the nodes that you have: see this for more details instructions and log locations.
Once you have more information, you could get yourself started in the investigation of problems.

How to get to the end of the Kubernates containers log

By running command kubectl logs pod -c container
I am getting continuous autoscrolling list of logs. Is there any way I can get to the end or see the latest log. I don't want go through all the logs.
I have tried using -f as well. Any suggestion?
According to kubectl logs --help
you can use --tail
e.g. kubectl logs pod --tail=10
You have two ways to see the recent log files, based on number of lines and based on time:
kubectl logs --tail=20 nginx
It will show you 20 lines of most recent logs
kubectl logs --since=1h nginx
It will show you logs of last one hour.

How can I access the pod when it become CrashLoopBackOff?

Right now, I deployed some pods on my kubernetes cluster. But sometime, my image may has some bugs which make the pod cannot start correctly.
For example:
nats-1 0/1 CrashLoopBackOff 121 10h
I also cannot see any error in the kubectl log.
So is there any way to access this pod? Or is there any tools or tech can allow to to enter the container?
Thanks a lot all! :)
You can kubectl describe to get the events, it sometimes might show some errors there. Otherwise you can probably also make the deployment/pod run a command like sleep 3600 to keep it open for you to exec into it to investigate further.
Edited after clarification:
You could go into the worker (kubectl get pod <pod-name> -o wide to get which one) and access the node syslogs or pods' logs. That should show you a more detailed information of what happened.
But #ho-man approach is very valid and less cumbersome.