Fluentd Reading Logs Process K8s - kubernetes

I have a question about how Fluentd reads container logs in K8s.
/var/lib/docker/containers/* is mounted into the Fluentd containers of the DaemonSet to give Fluentd access to the logs of all the cluster's containers. So this is the location where the logs can be read from.
My confusion stems from the path argument of the in_tail plugin. The docs describe this argument as: "The path(s) to read. Multiple paths can be specified, separated by comma ','."
My questions:
Why does Fluentd read logs from the path specified in in_tail and not directly from /var/lib/docker/containers/*?
What puts the logs from /var/lib/docker/containers/* into the path from in_tail?
I couldn't find this explained anywhere and would be very happy if someone could help me out.
Yours
Tridelt

Related

Fluentd config to collect logs for each namespace separately

What would be the Fluentd configuration to collect logs and create a separate log file / folder path for each namespace?
I want to use a Fluentd instance and have a configuration that would help me segregate and group logs of each namespace separately, and then zip them separately to be sent over http.
The tag, which starts with kubernetes.var.log..., should contain the namespace, so you can filter on specific namespaces and decide how to handle those logs.
If, for any reason, the log path in your cluster does not contain the namespace, you can also use the kubernetes plugin.
It enriches your logs with metadata relevant to the cluster, which lets you extract the namespace the logs originated from and deal with them accordingly.
Refer to this blog for collecting logs with Fluentd.
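The idea of filtering on the namespace embedded in the tag can be sketched in Python. This is only an illustration of the grouping logic, not Fluentd configuration. It assumes tags of the form kubernetes.var.log.containers.<pod>_<namespace>_<container>-<64-hex container id>.log, which is what you typically get when in_tail reads /var/log/containers/*.log with a kubernetes.* tag; the function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed tag layout: kubernetes.var.log.containers.<pod>_<namespace>_<container>-<id>.log
TAG_PATTERN = re.compile(
    r"^kubernetes\.var\.log\.containers\."
    r"(?P<pod>[^_]+)_(?P<namespace>[^_]+)_(?P<container>.+)-(?P<container_id>[0-9a-f]{64})\.log$"
)


def namespace_of(tag):
    """Return the namespace embedded in a tag, or None if the tag does not match."""
    match = TAG_PATTERN.match(tag)
    return match.group("namespace") if match else None


def group_by_namespace(records):
    """Group (tag, line) pairs into a dict keyed by namespace."""
    groups = defaultdict(list)
    for tag, line in records:
        # Records whose tag does not match the assumed layout go to "unknown".
        groups[namespace_of(tag) or "unknown"].append(line)
    return dict(groups)
```

Each group could then be written to its own file or folder and compressed separately before being sent over HTTP.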

How to force all kubernetes services (proxy, kubelet, apiserver..., containers) to write logs to /var/logs

How to force all kubernetes services (proxy, kubelet, apiserver..., containers) to write logs to /var/logs?
For example:
/var/logs/apiServer.log
or:
/var/logs/proxy.log
Can I use syslog config to do that? What would be an example of that config?
I have already tried the journald setting ForwardToSyslog=yes.
The first thing that comes to my mind: create a sidecar container that gathers all the logs in one place.
The Complete Guide to Kubernetes Logging.
That's a pretty wide question that should be divided into a few parts. Kubernetes stores different types of logs in different places.
Kubernetes Container Logs (outside the scope of this question, but simply kubectl logs <podname>, plus -n for the namespace if it's not default, plus -c to specify a container inside the pod)
Kubernetes Node Logs
Kubernetes Cluster Logs
Kubernetes Node Logs
Depending on your operating system and services, there are various
node-level logs you can collect, such as kernel logs or systemd logs.
On nodes with systemd both the kubelet and container runtime write to
journald. If systemd is not present, they write to .log files in the
/var/log directory.
You can access systemd logs with the journalctl command.
Tutorial: Logging with journald has an extensive explanation of how you can configure journald to gather logs, both with log aggregation tools like ELK and without them. journald log filtering can simplify your life.
There are two ways of centralizing journal entries via syslog:
syslog daemon acts as a journald client (like journalctl or Logstash or Journalbeat)
journald forwards messages to syslog (via socket)
Option 1) is slower – reading from the journal is slower than reading from the socket – but captures all the fields from the journal.
Option 2) is safer (e.g. no issues with journal corruption), but the journal will only forward traditional syslog fields (like severity, hostname, message).
As for ForwardToSyslog=yes in /etc/systemd/journald.conf: it makes journald forward messages, in syslog format, to the socket /run/systemd/journal/syslog. A syslog daemon such as rsyslog can then read from this socket, and you can either process the logs there or have it write them to the desired place.
Kubernetes Cluster Logs
By default, system components outside a container write to journald, while components running in containers write to the /var/log directory. However, there is the option to configure the container engine to stream logs to a preferred location.
Kubernetes doesn’t provide a native solution for logging at cluster level. However, there are other approaches available to you:
Use a node-level logging agent that runs on every node
Add a sidecar container for logging within the application pod
Expose logs directly from the application.
P.S. I have NOT tried the approach below, but it looks promising. Check it out; maybe it will help you with your not-so-easy task.
The easiest way of setting up a node-level logging agent is to
configure a DaemonSet to run the agent on each node
helm install --name st-agent \
--set infraToken=xxxx-xxxx \
--set containerToken=xxxx-xxxx \
--set logsToken=xxxx-xxxx \
--set region=US \
stable/sematext-agent
This setup will, by default, send all cluster and container logs to a
central location for easy management and troubleshooting. With a tiny
bit of added configuration, you can configure it to collect node-level
logs and audit logs as well.

How do we check container logs in kubernetes before they are written to the log file?

kubectl logs -f <pod-name>
This command shows the logs from the container log file.
Basically, I want to check the difference between "what is generated by the container" and "what is written to the log file".
I see some unusual binary logs, so I just want to find out if the container is creating those binary logs or the logs are not properly getting written to the log file.
"Unusual logs":
\x00\x00\x00\x00\x00\x00\x00\x00 ... (a long run of \x00 bytes, truncated)
Usually, containerized applications do not write to log files but send messages to stdout/stderr. There is no point in storing log files inside containers, as they will be deleted when the pod is deleted.
What you see when running
kubectl logs -f <pod-name>
are messages sent to stdout/stderr. There are no container specific logs here, only application logs.
If, for some reason, your application does write to a log file, you can check it by exec-ing into the pod with e.g.
kubectl exec -it <pod-name> -- /bin/bash
and read logs as you would in shell.
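To compare "what the container generates" with "what ends up in the log file", one option is to scan both copies for runs of NUL bytes. The following is a minimal Python sketch of such a scan; the helper names are hypothetical, and it assumes you can read the file in question (inside the container, or on the node).

```python
import re

# One or more consecutive NUL bytes.
NUL_RUN = re.compile(rb"\x00+")


def nul_runs(data):
    """Return (offset, length) for every run of NUL bytes in a bytes object."""
    return [(m.start(), m.end() - m.start()) for m in NUL_RUN.finditer(data)]


def scan_file(path):
    """Read a file in binary mode and report its NUL runs."""
    with open(path, "rb") as handle:
        return nul_runs(handle.read())
```

If the application's own output is clean but the node-side log file contains NUL runs, the bytes are probably being introduced after the container wrote its output rather than by the application itself.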
Edit
Application logs
A container engine handles and redirects any output generated to a containerized application's stdout and stderr streams. For example, the Docker container engine redirects those two streams to a logging driver, which is configured in Kubernetes to write to a file in JSON format.
Those logs are also saved to
/var/log/containers/
/var/log/pods/
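With the Docker runtime, the entries under /var/log/containers/ are typically symbolic links that lead, via /var/log/pods/, to the JSON log files under /var/lib/docker/containers/. A small Python sketch like the following, run on a node, shows where each entry really points. The function name is hypothetical, and the default directory is an assumption about your node layout.

```python
import os


def resolve_log_links(log_dir="/var/log/containers"):
    """Map each entry in log_dir to the real file it ultimately points to."""
    resolved = {}
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        # realpath follows the whole chain of symbolic links.
        resolved[name] = os.path.realpath(path)
    return resolved


if __name__ == "__main__":
    # Only meaningful on a Kubernetes node; skipped elsewhere.
    if os.path.isdir("/var/log/containers"):
        for name, target in resolve_log_links().items():
            print(f"{name} -> {target}")
```

This is also why a log agent that tails /var/log/containers/*.log usually needs the link targets mounted as well: the links cannot be followed if the directory they point to is not visible inside the agent's container.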
By default, if a container restarts, the kubelet keeps one terminated container with its logs. If a pod is evicted from the node, all corresponding containers are also evicted, along with their logs.
Everything you see by issuing the command
kubectl logs <pod-name>
is what the application sent to stdout/stderr, or what was redirected to stdout/stderr. For example nginx:
The official nginx image creates a symbolic link from /var/log/nginx/access.log to /dev/stdout, and creates another symbolic link from /var/log/nginx/error.log to /dev/stderr, overwriting the log files and causing logs to be sent to the relevant special device instead.
Node logs
Components that do not run inside containers (e.g. kubelet, container runtime) write to journald on nodes with systemd. Otherwise, they write to .log files inside the /var/log/ directory.
Excerpt from official documentation:
For now, digging deeper into the cluster requires logging into the relevant machines. Here are the locations of the relevant log files. (note that on systemd-based systems, you may need to use journalctl instead)
Master
/var/log/kube-apiserver.log - API Server, responsible for serving the API
/var/log/kube-scheduler.log - Scheduler, responsible for making scheduling decisions
/var/log/kube-controller-manager.log - Controller that manages replication controllers
Worker Nodes
/var/log/kubelet.log - Kubelet, responsible for running containers on the node
/var/log/kube-proxy.log - Kube Proxy, responsible for service load balancing
The only way I could imagine this working is to make use of some external logging facility like Syslog, Elasticsearch or anything else. Configure your application to send logs directly to the logging facility (avoiding agents like fluentd or logstash, which parse logs from files).
All modern languages have support for external logging. You can also configure Docker to send logs to syslog server.
A simple way to check logs in Kubernetes:
=> If the pod has a single container
kubectl logs POD_NAME
=> If the pod has multiple containers
kubectl logs POD_NAME -c CONTAINER_NAME -n NAMESPACE

How to collect logs from java app (k8s) to fluentd(k8s)

I have a Java app in k8s and fluentd (DaemonSet). In the fluentd conf:
<source>
  @type forward
  port 24224
</source>
<match **>
  @type stdout
</match>
I am a little bit confused.
Do I need to use the fluentd-logger-java lib? I read in the docs that I need to add a remote host for fluentd, but here I don't use a Service at all.
How will the app send logs to the fluentd pods?
Thanks in advance!
Given that your Java application can log to stdout and stderr you’ll use fluentd to read that log and, in most cases, ship these logs to a system that can aggregate the logs.
This picture, from the official docs, shows a common pattern of configuring node-level logging in Kubernetes with e.g. fluentd as Pods deployed with a DaemonSet:
In the above picture, the logging-agent will be fluentd and my-pod will be your Pod with a container running your Java app. The Logging Backend, from a fluentd configuration perspective, is optional but of course highly recommended. Basically you can choose to output your logs via fluentd stdout.
For this to function properly, fluentd will need read access to the container logs. This is accomplished by mounting the log directory, e.g. /var/lib/docker/containers, into the fluentd container.
We’ve successfully used this fluentd example ConfigMap, with some modifications to read logs from the nodes and ship them to Elasticsearch. Check out the containers.input.conf part of that ConfigMap for more info on container logs and how to digest them.
Note that you shouldn't need to use the fluentd-logger-java library to start using fluentd, although you could use it as another type of logger in your Java application. Out-of-the-box you should be able to let Java log everything to stdout and stderr and read the logs with fluentd.
If you are only concerned with the live logs, you can try a product built on Fluentd, Elasticsearch and Kibana; you can get it at https://logdna.com.
Just add a tag and deploy the DaemonSet.
You can try its free trial for some days.

How to send json logs through fluentd to stackdriver

I have docker containers writing logs in json format. When they run on GKE, the logs are shown in StackDriver fine, but when I run the same containers on some VM with kubernetes (not GKE) and use fluentd to route the logs to StackDriver, the log messages arrive escaped and under "log" key.
Example: {"stream":"stdout","log":"{\"time\":\"2019-07-25T09:55:18.2393210Z\", ....
How can I configure fluentd to get the logs in the same format as on GKE (without "log": and unescaped)?
There are few things to consider:
You can configure fluentd's log format with this guide.
You can try some reverse engineering. The Fluentd config used by GKE can be studied at the following path on the fluentd Pod: /etc/google-fluentd/config.d/containers.input.conf
You can directly check the GKE config in a ConfigMap called fluentd-gcp-config-v1.2.5. There is some useful information regarding how to config fluentd as non-managed. More details here.
Please let me know if that helped.
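To make the desired transformation concrete, here is a minimal Python sketch of what "unescaped and without the log key" means for one line of Docker's JSON log output. It is only an illustration of the record reshaping, not Fluentd configuration; in Fluentd itself, a parser filter applied to the log key is one way to achieve a similar effect. The function name is hypothetical.

```python
import json


def unwrap_log_record(raw_line):
    """Parse one Docker JSON log line and lift a JSON payload out of its "log" key."""
    outer = json.loads(raw_line)
    payload = outer.pop("log", "")
    try:
        inner = json.loads(payload)
    except (TypeError, ValueError):
        inner = None
    if isinstance(inner, dict):
        # Merge the application's fields into the record; on a key clash
        # (for example "time") the application's value wins.
        merged = dict(outer)
        merged.update(inner)
        return merged
    # Not a JSON object: keep the original text under "log".
    outer["log"] = payload
    return outer
```

Lines whose payload is plain text pass through unchanged, so a mix of JSON and non-JSON output from the same container does not break the processing.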