K8S EFK (especially Fluentd) daemonset for docker journald logging driver - kubernetes

Question
Are there known available Fluentd daemonset for journald docker logging driver so that I can send K8S pod logs to Elasticsearch?
Background
As in add support to log in kubeadm, the default logging driver for K8S installed by kubeadm is journald.
the community is collectively moving away from files on disk at every place possible in general, and this would unfortunately be a step backwards. ...
You can edit you /etc/docker/daemon.json to set its default log to json files and set a max size and max files to take care of the log rotation. After that, logs wont be written to journald and you will be able to send your log files to ES.
However, the K8S EFK addon and Fluentd K8S or Aggregated Logging in Tectonic still expect to look for files in /var/log/containers in the host, if I understood correctly.
It looks Alternative fluentd docker image designed as a drop-in replacement for the fluentd-es-image looks to be adopting journald driver. However could not make it run the pods.

docker log driver journald send docker logs to systemd-journald.service
so, we need to make systemd-journald persistent save to /var/log/journal
edit /etc/systemd/journald.conf:
...
[Journal]
Storage=persistent
#Compress=yes
...
then restart to apply changes:
systemctl restart systemd-journald
ls -l /var/log/journal
as /var/log has been mounted into fluentd pod, its all done, restart fluentd pod it works for me #202104.
by the way, i am using fluentd yaml from:
https://github.com/fluent/fluentd-kubernetes-daemonset/blob/master/fluentd-daemonset-elasticsearch-rbac.yaml
and the env FLUENTD_SYSTEMD_CONF value should not be disable

Related

how to force all kubernetes services (proxy, kublet, apiserver..., containers) to write logs to /var/logs

How to force all kubernetes services (proxy, kublet, apiserver..., containers) to write logs to /var/logs?
For example:
/var/logs/apiServer.log
or:
/var/logs/proxy.log
Can I use syslog config to do that? What would be an example of that config?
I have already tried journald configuration forward to syslogs=yes.
Just first what comes to my mind - create sidecar container that will gather all the logs in 1 place.
The Complete Guide to Kubernetes Logging.
That's a pretty wide question that should be divided on few parts. Kubernets stores different types of logs in different places.
Kubernetes Container Logs (out of this question, but simply kubectl logs <podname> + -n for namespace, if its not default + -c for specifying container inside the pod)
Kubernetes Node Logs
Kubernetes Cluster Logs
Kubernetes Node Logs
Depending on your operating system and services, there are various
node-level logs you can collect, such as kernel logs or systemd logs.
On nodes with systemd both the kubelet and container runtime write to
journald. If systemd is not present, they write to .log files in the
/var/log directory.
You can access systemd logs with the journalctl command.
Tutorial: Logging with journald have a huge explanation how can you configure journalctl to gather logs. With agrregation logs tools like ELK and without them. journald log filtering can simplify your life.
There are two ways of centralizing journal entries via syslog:
syslog daemon acts as a journald client (like journalctl or Logstash or Journalbeat)
journald forwards messages to syslog (via socket)
Option 1) is slower – reading from the journal is slower than reading from the socket – but captures all the fields from the journal.
Option 2) is safer (e.g. no issues with journal corruption), but the journal will only forward traditional syslog fields (like severity, hostname, message..)
Talking about ForwardToSyslog=yes in /etc/systemd/journald.conf --> it will write messages, in syslog format, to /run/systemd/journal/syslog. You can pass processing then this file to rsyslog for example. Either you can manually process logs or move them to desired place..
Kubernetes Cluster Logs
By default, system components outside a container write files to journald, while components running in containers write to /var/log directory. However, there is the option to configure the container engine to stream logs to a preferred location.
Kubernetes doesn’t provide a native solution for logging at cluster level. However, there are other approaches available to you:
Use a node-level logging agent that runs on every node
Add a sidecar container for logging within the application pod
Expose logs directly from the application.
P.S. I have NOT tried below approach, but it looks promising - check it and maybe it will help you in your not easiest task.
The easiest way of setting up a node-level logging agent is to
configure a DaemonSet to run the agent on each node
helm install --name st-agent \
--set infraToken=xxxx-xxxx \
--set containerToken=xxxx-xxxx \
--set logsToken=xxxx-xxxx \
--set region=US \
stable/sematext-agent
This setup will, by default, send all cluster and container logs to a
central location for easy management and troubleshooting. With a tiny
bit of added configuration, you can configure it to collect node-level
logs and audit logs as well.

How do we check container logs in kubernetes before they are written to the log file?

kubectl logs -f <pod-name>
This command shows the logs from the container log file.
Basically, I want to check the difference between "what is generated by the container" and "what is written to the log file".
I see some unusual binary logs, so I just want to find out if the container is creating those binary logs or the logs are not properly getting written to the log file.
"Unusual logs":
\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\
Usually, containerized applications, do not write to the log files but send messages to stdout/stderr, there is no point in storing log files inside containers, as they will be deleted when the pod is deleted.
What you see when running
kubectl logs -f <pod-name>
are messages sent to stdout/stderr. There are no container specific logs here, only application logs.
If, for some reason, your application does write to the log file, you can check it by execing into pod with e.g.
kubectl exec -it <pod-name> -- /bin/bash
and read logs as you would in shell.
Edit
Application logs
A container engine handles and redirects any output generated to a containerized application's stdout and stderr streams. For example, the Docker container engine redirects those two streams to a logging driver, which is configured in Kubernetes to write to a file in JSON format.
Those logs are also saved to
/var/log/containers/
/var/log/pods/
By default, if a container restarts, the kubelet keeps one terminated container with its logs. If a pod is evicted from the node, all corresponding containers are also evicted, along with their logs.
Everything you see by issuing the command
kubectl logs <pod-name>
is what application sent to stdout/stderr, or what was redirected to stdout/stderr. For example nginx:
The official nginx image creates a symbolic link from /var/log/nginx/access.log to /dev/stdout, and creates another symbolic link from /var/log/nginx/error.log to /dev/stderr, overwriting the log files and causing logs to be sent to the relevant special device instead.
Node logs
Components that do not run inside containers (e.g kubelet, container runtime) write to journald. Otherwise, they write to .log fies inside /var/log/ directory.
Excerpt from official documentation:
For now, digging deeper into the cluster requires logging into the relevant machines. Here are the locations of the relevant log files. (note that on systemd-based systems, you may need to use journalctl instead)
Master
/var/log/kube-apiserver.log - API Server, responsible for serving the API
/var/log/kube-scheduler.log - Scheduler, responsible for making scheduling decisions
/var/log/kube-controller-manager.log - Controller that manages replication controllers
Worker Nodes
/var/log/kubelet.log - Kubelet, responsible for running containers on the node
/var/log/kube-proxy.log - Kube Proxy, responsible for service load balancing
The only way I could imagine this to work is to make use of some external logging facility like Syslog or Elastisearch or anything else. Configure your application to send logs directly to logging facility (avoiding agents like fluentd or logstash which parse logs from files).
All modern languages have support for external logging. You can also configure Docker to send logs to syslog server.
Simple way to check log is kubernets:
=> If pod have single container
kubectl logs POD_NAME
=> If pod have multiple containers
kubectl logs POD_NAME -c CONTAINER_NAME -n NAMESPACE

Where does K8s store the logs that it prints when running kubectl logs -f

I am pretty sure it writes it on disk somewhere. Otherwise if the container runs for several hours and logs a lot, then it would exceed what the stderr can hold I think. No?
Is it possible to compress and download the logs of kubectl logs?i.e. comparess on the container without downloading them?
Firstly take a look on logging-kubernetes official documentation.
In most cases, Docker container logs are put in the /var/log/containers directory on your host (host they are deployed on). Docker supports multiple logging drivers but Kubernetes API does not support driver configuration.
Once a container terminates or restarts, kubelet keeps its logs on the node. To prevent these files from consuming all of the host’s storage, a log rotation mechanism should be set on the node.
You can use kubectl logs to retrieve logs from a previous instantiation of a container with --previous flag, in case the container has crashed.
If you want to take a look at additional logs. For example, in Linux journald logs can be retrieved using the journalctl command:
$ journalctl -u docker
You can implement cluster-level logging and expose or push logs directly from every application but the implementation for such a logging mechanism is not in the scope of Kubernetes.
Also there are many tools offered for Kubernetes for logging management and aggregation - see: logs-tools.

How to collect logs from java app (k8s) to fluentd(k8s)

I have java app in k8s and fluentd (daemonset). In fluentd conf:
*`<source>
#type forward
port 24224
</source>
<match **>
#type stdout
</match>`*
I am little bit confused.
Do I need use fluentd-logger-java lib? I read in docs, that I need add remotehost for fluentd, but here i don't use service in general.
How app will send logs to fluentd pods?
Thanks in advance!
Given that your Java application can log to stdout and stderr you’ll use fluentd to read that log and, in most cases, ship these logs to a system that can aggregate the logs.
This picture, from the official docs, shows a common pattern of configuring node-level logging in Kubernetes with e.g. fluentd as Pods deployed with a DaemonSet:
In the above picture, the logging-agent will be fluentd and my-pod will be your Pod with a container running your Java app. The Logging Backend, from a fluentd configuration perspective, is optional but of course highly recommended. Basically you can choose to output your logs via fluentd stdout.
For this to function properly fluentd will need read access to the container logs, this is accomplished by mounting the log dir e.g. /var/lib/docker/containers into the fluentd container.
We’ve successfully used this fluentd example ConfigMap, with some modifications to read logs from the nodes and ship them to Elasticsearch. Check out the containers.input.conf part of that ConfigMap for more info on container logs and how to digest them.
Note that you shouldn't need to use the fluentd-logger-java library to start using fluentd, although you could use it as another type of logger in your Java application. Out-of-the-box you should be able to let Java log everything to stdout and stderr and read the logs with fluentd.
If you are just concerned with the live logs then you can try a product built on fluent,Elastic search and kibana ; you can get it https://logdna.com.
Just add a tag and deploy the demonset.
You can try its free trail for some days

gcloud container VMs logging strategy

I have an instance group of Container VMs running my app on a docker container.
I am trying to find a good strategy to manage the application logs for docker + MEAN + Google Cloud Compute Machines.
I can see the logs on individual containers running docker logs [container_id].
However, if I stop and start the VM I lose those logs. I also have VMs dynamically added by Auto scaler and would like to have a convenient way to access the logs.
Stack is MEAN and Logging tool is bunyan.
Is is possible to centralize or combine the logs from all VMS in one persistent location?
any suggestions?
UPDATES:
I installed fluentd agent and now I can see logs when I manually run thins on the shell: logger "some message for testing"
However, the logs from my container vm from my docker container never shows up on logs.
I still don't know how to get those docker logs to turn up on google cloud logs. It is supposed to be automatically collected.
cheers
Leo
Here is a yaml, Dockerfile and conf for a fluentd pod inside kubernetes.
Adjust the yaml to mount a disk:
https://github.com/GoogleCloudPlatform/kubernetes/tree/master/contrib/logging/fluentd-sidecar-gcp
Then adjust the config to log to the disk.
Build the container with the new configuration.
Deploy the new container.