I have a python application that is running in Kubernetes. The app has a ping health-check which is called frequently via a REST call and checks that the call returns an HTTP 200. This clutters the Kubernetes logs when I view it through the logs console.
The function definition looks like this:
def ping():
    return jsonify({'status': 'pong'})
How can I silence a specific call from showing up in the log? Is there a way I can put this in code such as a python decorator on top of the health check function? Or is there an option in the Kubernetes console where I can configure to ignore this call?
In Kubernetes, everything a container writes to stdout or stderr ends up in the Kubernetes logs. The only way to exclude the health-check ping calls from the Kubernetes logs is to have your application redirect the output of those calls to some file, say under /var/log/. This effectively removes the output of that health check from stdout.
Once the output is no longer on the pod's stdout or stderr, the pod logs will not contain entries from that health check.
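Since the handler above uses jsonify, it looks like a Flask app, where the per-request lines come from the werkzeug logger. A rough sketch of the redirect idea, assuming that logger name and a hypothetical /var/log/healthcheck.log target (adjust both to your setup, e.g. use gunicorn.access when running under gunicorn):
import logging
import sys


class PingFilter(logging.Filter):
    """Keep or drop access-log records that mention the /ping endpoint."""

    def __init__(self, keep_ping=False):
        super().__init__()
        self.keep_ping = keep_ping

    def filter(self, record):
        is_ping = '/ping' in record.getMessage()
        return is_ping if self.keep_ping else not is_ping


access_logger = logging.getLogger('werkzeug')  # assumption: Flask dev server
access_logger.propagate = False  # stop the root logger from duplicating lines

# /ping request lines go to a file on the container filesystem...
ping_handler = logging.FileHandler('/var/log/healthcheck.log')
ping_handler.addFilter(PingFilter(keep_ping=True))
access_logger.addHandler(ping_handler)

# ...while everything else keeps flowing to stdout and into the pod logs.
stdout_handler = logging.StreamHandler(sys.stdout)
stdout_handler.addFilter(PingFilter(keep_ping=False))
access_logger.addHandler(stdout_handler)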
You can also use sidecar containers to stream your application logs, for example if you don't want all of the application's logs in the kubectl logs output; you can write those logs to a file instead.
As stated in the official Kubernetes docs:
By having your sidecar containers stream to their own stdout and stderr streams, you can take advantage of the kubelet and the logging agent that already run on each node. The sidecar containers read logs from a file, a socket, or the journald. Each individual sidecar container prints log to its own stdout or stderr stream.
This approach allows you to separate several log streams from different parts of your application, some of which can lack support for writing to stdout or stderr. The logic behind redirecting logs is minimal, so it’s hardly a significant overhead.
For more information on Kubernetes logging, please refer to the official docs:
https://kubernetes.io/docs/concepts/cluster-administration/logging/
clutters the Kubernetes logs when I view it through the logs console.
Since you said "view", I know it might not be the most accurate way to do this, but it works okay:
kubectl logs [pod] [options] | awk '!/some health-check pattern|or another annoying pattern to exclude|another pattern you dont want to see|and so on/ {print $0}'
I used awk to filter the logs; if you're not familiar with it, you can read about it here
!: means exclude the following pattern
/pattern/: enclose your pattern between // (standard awk syntax)
{print $0}: means print the current line (you can also use just print)
Note: There is always a risk of excluding something necessary. Use wisely.
Just invert the logic: log only on failure in the app. Modify the code or wrap the health check with a custom decorator.
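A rough sketch of what such a decorator could look like, assuming the logging you want to silence happens inside the handler itself (it will not suppress the web server's access log); the logger and function names are illustrative:
import functools
import logging

from flask import jsonify  # assumption: Flask, as in the question

logger = logging.getLogger(__name__)


def log_only_on_failure(func):
    """Run the handler silently and log only when it raises."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            logger.exception("health check %s failed", func.__name__)
            raise
    return wrapper


@log_only_on_failure
def ping():
    return jsonify({'status': 'pong'})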
We just migrated to a Kubernetes cluster. I was wondering if it is possible to automatically send a Kafka event when a container/pod finishes, with the stdout as the message. Right now we are using fluentd with Elasticsearch, but the output of one pod is used as input for the next one, so we need to constantly poll Elasticsearch to find out when the output is ready, and that causes performance issues on the overall execution.
I'm not sure of your current setup but my first thought would jump to:
Use something such as fluentd or Logstash in its own pod per node
Configure volume access to Kubernetes log folder /var/log/containers/*
Use the Kafka output for either fluentd or Logstash with file input (tail) on the logging folder
This approach would require the configuration above on each node, but it needs minimal configuration of logging locations, etc.
It's not something I've personally configured but have considered it for the future.
More info here
When I call kubectl logs pod_name, I get both the stdout/err combined. Is it possible to specify that I only want stdout or stderr? Likewise I am wondering if it is possible to do so through the k8s rest interface. I've searched for several hours and read through the repository but could not find anything.
Thanks!
No, this is not possible. To my knowledge, at the moment of writing this, Kubernetes supports only one logs API endpoint, which returns all logs (stdout and stderr combined).
If you want to access them separately, you should consider using a different logging driver or querying the logs directly from Docker.
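If the nodes run Docker and you can reach the Docker socket, a hedged sketch of that last option with the Docker SDK for Python (the container name is hypothetical):
import docker

client = docker.from_env()
container = client.containers.get('my-container')  # hypothetical container name/id

# The Docker API keeps the two streams apart, so either one can be requested alone.
stdout_only = container.logs(stdout=True, stderr=False)
stderr_only = container.logs(stdout=False, stderr=True)

print(stdout_only.decode())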
I'm trying to verify that shutdown is completing cleanly on Kubernetes, with a .NET Core 2.0 app.
I have an app which can run in two "modes" - one using ASP.NET Core and one as a kind of worker process. Both use Console logger output and JSON logger output (which ends up in Elasticsearch via a Filebeat sidecar container) to indicate startup and shutdown progress.
Additionally, I have console output which writes directly to stdout when a SIGTERM or Ctrl-C is received and shutdown begins.
Locally, the app works flawlessly - I get the direct console output, then the logger output flowing to stdout on Ctrl+C (on Windows).
My experiment scenario:
App deployed to GCS k8s cluster (using helm, though I imagine that doesn't make a difference)
Using kubectl logs -f to stream logs from the specific container
Killing the pod from GCS cloud console site, or deleting the resources via helm delete
Dockerfile is FROM microsoft/dotnet:2.1-aspnetcore-runtime and has ENTRYPOINT ["dotnet", "MyAppHere.dll"], so not wrapped in a bash process or anything
Not specifying terminationGracePeriodSeconds, so I guess it defaults to 30 seconds
Observing output returned
Results:
The API pod log streaming showed just the immediate console output, "[SIGTERM] Stop signal received", not the other Console logger output about shutdown process
The worker pod log streaming showed a little more - the same console output and some Console logger output about shutdown process
The JSON logs didn't seem to pick up any of the shutdown log output
My conclusions:
I don't know if Kubernetes is allowing the process to complete before terminating it, or just issuing SIGTERM and then killing things very quickly. I think it should be waiting, but then, why no complete console logger output?
I don't know if console output is cut off from the stdout log stream at some point before the process finally terminates.
I would guess that the JSON stuff doesn't come through to ES because filebeat running in the sidecar terminates even if there's outstanding stuff in files to send
I would like to know:
Can anyone advise on points 1,2 above?
Any ideas for a way to allow a little extra time or leeway for the sidecar to send stuff up, like a pod container termination order, delay on shutdown for that container, etc?
SIGTERM does indeed signal termination. The less obvious part is that when the SIGTERM handler returns, everything is considered finished.
The fix is to not return from the SIGTERM handler until the app has finished shutting down. For example, using a ManualResetEvent and Wait()ing it in the handler.
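The answer concerns .NET's ManualResetEvent, but the pattern itself is language-agnostic; purely for illustration, a rough Python sketch of "don't return from the handler until shutdown work is done" (the 25-second timeout is an arbitrary value chosen to stay under the default 30-second grace period):
import signal
import threading

shutdown_done = threading.Event()


def do_shutdown_work():
    # flush logs, drain in-flight requests, close connections, ...
    shutdown_done.set()


def handle_sigterm(signum, frame):
    print("[SIGTERM] Stop signal received", flush=True)
    threading.Thread(target=do_shutdown_work).start()
    # Block here, mirroring ManualResetEvent.Wait(): returning too early
    # lets the process be considered finished before cleanup completes.
    shutdown_done.wait(timeout=25)


signal.signal(signal.SIGTERM, handle_sigterm)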
I've started to look into this for my own purposes and have come across your question over a year after it was posted... This is a bit late, but have you tried GraceTerm?
There is an associated NuGet package for this.
From the description...
Graceterm middleware provides implementation to ensure graceful shutdown of AspNet Core applications. The basic concept is: After application received a SIGTERM (a signal asking it to terminate), Graceterm will hold it alive till all pending requests are completed or a timeout occur.
I haven't personally tried this yet, but it does look promising.
Try adding STOPSIGNAL SIGINT to your Dockerfile.
Our setup:
We are using kubernetes in GCP.
We have pods that write logs to a shared volume, with a sidecar container that sucks up our logs for our logging system.
We cannot just use stdout instead for this process.
Some of these pods are long lived and are filling up disk space because of no log rotation.
Question:
What is the easiest way to prevent the disk space from filling up here (without scheduling pod restarts)?
I have been attempting to install logrotate using RUN apt-get install -y logrotate in our Dockerfile and placing a logrotate config file in /etc/logrotate.d/dynamicproxy, but it doesn't seem to get run. /var/lib/logrotate/status never gets generated.
I feel like I am barking up the wrong tree or missing something integral to getting this working. Any help would be appreciated.
We ended up writing our own daemonset to properly collect the logs from the nodes instead of the container level. We then stopped writing to shared volumes from the containers and logged to stdout only.
We used fluentd to ship the logs around.
https://github.com/splunk/splunk-connect-for-kubernetes/tree/master/helm-chart/splunk-kubernetes-logging
In general, you should write logs to stdout and configure a log collection tool like the ELK stack. This is the best practice.
However, if you want to run logrotate as a separate process in your container, you may use Supervisor, which serves as a very simple init system and allows you to run as many parallel processes in a container as you want.
Simple example for using Supervisor for rotating Nginx logs can be found here: https://github.com/misho-kr/docker-appliances/tree/master/nginx-nodejs
If you write to the filesystem, the application creating the logs should be responsible for rotation. If you are running a Java application with logback or log4j, it is a simple configuration change. For other languages/frameworks it is usually similar.
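For instance, if the app writing to the shared volume is a Python service, the standard library can do the rotation itself; a small sketch with an assumed log path and assumed size limits:
import logging
from logging.handlers import RotatingFileHandler

# Rotate at ~10 MB and keep 5 old files so the shared volume stays bounded.
handler = RotatingFileHandler(
    '/var/log/app/service.log', maxBytes=10 * 1024 * 1024, backupCount=5
)
handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))

logger = logging.getLogger('app')
logger.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info('application started')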
If that is not an option, you could use a specialized tool to handle the rotation, piping the output to it. One example would be http://cr.yp.to/daemontools/multilog.html
As a method of last resort, you could investigate logging into a named pipe (FIFO) instead of a real file and having some other process handle the retrieval and writing of the data - including the rotation.
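A minimal sketch of the named-pipe variant in Python, with a hypothetical pipe path; note that a FIFO write blocks until some reader process has opened the other end, so the consumer must be running first:
import logging
import os

FIFO_PATH = '/var/log/app/app.pipe'  # hypothetical path on the shared volume

if not os.path.exists(FIFO_PATH):
    os.mkfifo(FIFO_PATH)

# delay=True postpones opening the FIFO until the first log record,
# so the app does not hang at startup if the reader is not up yet.
handler = logging.FileHandler(FIFO_PATH, delay=True)
logging.basicConfig(level=logging.INFO, handlers=[handler])

logging.getLogger('app').info('hello via the named pipe')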
Before looking at Kubernetes, we write all our logs to stdout (according to the 12-factor app) and use logspout to collect the logs into Logstash. In Logstash we then route logs to different targets:
InfluxDB+Grafana: to monitor application metrics (e.g., how long a certain calculation takes)
Riemann: to alert if some performance thresholds are crossed
How these things can be done in Kubernetes?
I know that with Heapster you can see JVM-level graphs (memory usage, etc.), or maybe Heapster can even send events to Riemann in order to alert on some system-level statistics (e.g., disk is full). But for stuff on the application level, what would be the right approach?
Heapster should be grabbing the stdout from the containers as well and can send the data to different backends (sinks). It would essentially be an API call with the data. Check out: https://github.com/kubernetes/heapster/blob/master/docs/sink-configuration.md
I'm not 100% sure on stdout being the only method for a 12-factor app, but we use an in-house logging lib that also streams the stdout to our logging engine (Graylog). That happens inside the app so that the log messages are preserved as a full 'event', vs Heapster or other stdout scrapings treating each line as an event.
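For reference, a sketch of that "one full event per line" idea in Python (not the in-house library mentioned above, just an illustration): multi-line details such as tracebacks are folded into a single JSON line on stdout, so line-based scrapers keep the whole event together.
import json
import logging
import sys


class JsonLineFormatter(logging.Formatter):
    """Render each record as one JSON line so stdout scrapers keep the full event."""

    def format(self, record):
        event = {
            'level': record.levelname,
            'logger': record.name,
            'message': record.getMessage(),
        }
        if record.exc_info:
            event['exception'] = self.formatException(record.exc_info)
        return json.dumps(event)


handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonLineFormatter())
logging.basicConfig(level=logging.INFO, handlers=[handler])

logging.getLogger('app').info('calculation finished')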