Logging application logs in DataDog - kubernetes

Using datadog official docs, I am able to print the K8s stdout/stderr logs in DataDog UI, my motive is to print the app logs which are generated by spring boot application at a certain location in my pod.
Configurations done in cluster :
Created ServiceAccount in my cluster along with cluster role and cluster role binding
Created K8s secret to hold DataDog API key
Deployed the DataDog Agent as daemonset in all nodes
Configurations done in App :
Download datadog.jar and instrument it along with my app execution
Exposed ports 8125 and 8126
Added environment tags DD_TRACE_SPAN_TAGS, DD_TRACE_GLOBAL_TAGS in deployment file
Changed pattern in logback.xml
Added logs config in deployment file
Added env tags in deployment file
After doing above configurations I am able to log stdout/stderr logs where as I wanted to log application logs in datadog UI
If someone has done this please let me know what am I missing here.
If required, I can share the configurations as well. Thanks in advance

When installing Datadog in your K8s Cluster, you install a Node Logging Agent as a Daemonset with various volume mounts on the hosting nodes. Among other things, this gives Datadog access to the Pod logs at /var/log/pods and the container logs at /var/lib/docker/containers.
Kubernetes and the underlying Docker engine will only include output from stdout and stderror in those two locations (see here for more information). Everything that is written by containers to log files residing inside the containers, will be invisible to K8s, unless more configuration is applied to extract that data, e.g. by applying the side care container pattern.
So, to get things working in your setup, configure logback to log to stdout rather than /var/app/logs/myapp.log
Also, if you don't use APM there is no need to instrument your code with the datadog.jar and do all that tracing setup (setting up ports etc).

Related

how to force all kubernetes services (proxy, kublet, apiserver..., containers) to write logs to /var/logs

How to force all kubernetes services (proxy, kublet, apiserver..., containers) to write logs to /var/logs?
For example:
/var/logs/apiServer.log
or:
/var/logs/proxy.log
Can I use syslog config to do that? What would be an example of that config?
I have already tried journald configuration forward to syslogs=yes.
Just first what comes to my mind - create sidecar container that will gather all the logs in 1 place.
The Complete Guide to Kubernetes Logging.
That's a pretty wide question that should be divided on few parts. Kubernets stores different types of logs in different places.
Kubernetes Container Logs (out of this question, but simply kubectl logs <podname> + -n for namespace, if its not default + -c for specifying container inside the pod)
Kubernetes Node Logs
Kubernetes Cluster Logs
Kubernetes Node Logs
Depending on your operating system and services, there are various
node-level logs you can collect, such as kernel logs or systemd logs.
On nodes with systemd both the kubelet and container runtime write to
journald. If systemd is not present, they write to .log files in the
/var/log directory.
You can access systemd logs with the journalctl command.
Tutorial: Logging with journald have a huge explanation how can you configure journalctl to gather logs. With agrregation logs tools like ELK and without them. journald log filtering can simplify your life.
There are two ways of centralizing journal entries via syslog:
syslog daemon acts as a journald client (like journalctl or Logstash or Journalbeat)
journald forwards messages to syslog (via socket)
Option 1) is slower – reading from the journal is slower than reading from the socket – but captures all the fields from the journal.
Option 2) is safer (e.g. no issues with journal corruption), but the journal will only forward traditional syslog fields (like severity, hostname, message..)
Talking about ForwardToSyslog=yes in /etc/systemd/journald.conf --> it will write messages, in syslog format, to /run/systemd/journal/syslog. You can pass processing then this file to rsyslog for example. Either you can manually process logs or move them to desired place..
Kubernetes Cluster Logs
By default, system components outside a container write files to journald, while components running in containers write to /var/log directory. However, there is the option to configure the container engine to stream logs to a preferred location.
Kubernetes doesn’t provide a native solution for logging at cluster level. However, there are other approaches available to you:
Use a node-level logging agent that runs on every node
Add a sidecar container for logging within the application pod
Expose logs directly from the application.
P.S. I have NOT tried below approach, but it looks promising - check it and maybe it will help you in your not easiest task.
The easiest way of setting up a node-level logging agent is to
configure a DaemonSet to run the agent on each node
helm install --name st-agent \
--set infraToken=xxxx-xxxx \
--set containerToken=xxxx-xxxx \
--set logsToken=xxxx-xxxx \
--set region=US \
stable/sematext-agent
This setup will, by default, send all cluster and container logs to a
central location for easy management and troubleshooting. With a tiny
bit of added configuration, you can configure it to collect node-level
logs and audit logs as well.

Forwarding logs from kubernetes to splunk

I'm pretty much new to Kubernetes and don't have hands-on experience on it.
My team is facing issue regarding the log format pushed by kubernetes to splunk.
Application is pushing log to stdout in this format
{"logname" : "app-log", "level" : "INFO"}
Splunk eventually get this format (splunkforwarder is used)
{
"log" : "{\"logname\": \"app-log\", \"level\": \"INFO \"}",
"stream" : "stdout",
"time" : "2018-06-01T23:33:26.556356926Z"
}
This format kind of make things harder in Splunk to query based on properties.
Is there any options in Kubernetes to forward raw logs from app rather than grouping into another json ?
I came across this post in Splunk, but the configuration is done on Splunk side
Please let me know if we have any option from Kubernetes side to send raw logs from application
Kubernetes architecture provides three ways to gather logs:
1. Use a node-level logging agent that runs on every node.
You can implement cluster-level logging by including a node-level logging agent on each node. The logging agent is a dedicated tool that exposes logs or pushes logs to a backend. Commonly, the logging agent is a container that has access to a directory with log files from all of the application containers on that node.
The logs format depends on Docker settings. You need to set up log-driver parameter in /etc/docker/daemon.json on every node.
For example,
{
"log-driver": "syslog"
}
or
{
"log-driver": "json-file"
}
none - no logs are available for the container and docker logs does not
return any output.
json-file - the logs are formatted as JSON. The
default logging driver for Docker.
syslog - writes logging messages to
the syslog facility.
For more options, check the link
2. Include a dedicated sidecar container for logging in an application pod.
You can use a sidecar container in one of the following ways:
The sidecar container streams application logs to its own stdout.
The sidecar container runs a logging agent, which is configured to pick up logs from an application container.
By having your sidecar containers stream to their own stdout and stderr streams, you can take advantage of the kubelet and the logging agent that already run on each node. The sidecar containers read logs from a file, a socket, or the journald. Each individual sidecar container prints log to its own stdout or stderr stream.
3. Push logs directly to a backend from within an application.
You can implement cluster-level logging by exposing or pushing logs directly from every application.
For more information, you can check official documentation of Kubernetes
This week we had the same issue.
Using splunk forwarder DaemonSet
installing https://splunkbase.splunk.com/app/3743/ this plugin on splunk will solve your issue.
Just want to update with the solution what we tried, this worked for our log structure
SEDCMD-1_unjsonify = s/{"log":"(?:\\u[0-9]+)?(.*?)\\n","stream.*/\1/g
SEDCMD-2_unescapequotes = s/\\"/"/g
BREAK_ONLY_BEFORE={"logname":

Is there a way to track kubectl history from other users in gcp?

As the team gets more comfortable with the Google Cloud Platform and kubernetes, then the ability to track what changes are being applied to the environment gets more important. We're using kubectl apply yaml files (mostly deployments, services, and configmaps). Is there a way to see what changes are being applied via kubectl?
You can use kubernetes audits to do what you need.
If you're using GKE with a cluster version > 1.8.3 audit logging is available by default in stackdriver logging.
https://cloud.google.com/kubernetes-engine/docs/how-to/audit-logging
You could also read these logs using fluentd if you're not using GKE, by specifying the log dir in fluentd config.

How to change fluentd config for GKE-managed logging agent?

I have a container cluster in Google Container Engine with Stackdriver logging agent enabled. It is correctly pulling stdout logs from my containers. Now I would like to change the fluentd config to specify a log parser so that the logs shown in the GCP Logging view will have the correct severity and component.
Following this Stackdriver logging guide from kubernetes.io, I have attempted to:
Get the fluentd ConfigMap as a yml file
Added a new <filter> according to my log4js log format
Created a new ConfigMap named fluentd-cm-2 in kube-system namespace
Edited the DaemonSet for fluentd and set its ConfigMap to fluentd-cm-2. I did this using kubectl edit ds instead of kubectl replace -f because the latter failed with an error message: "the object has been modified", even after getting a fresh copy of the DaemonSet yaml.
Unexpected result: The DaemonSet is restarted, but its configuration is reverted back to the original ConfigMap, so my changes did not take effect.
I have also tried editing the ConfigMap directly (kubectl edit cm fluentd-gcp-config-v1.1 --namespace kube-system) and saved it, but it was also reverted.
I noticed that the DaemonSet and ConfigMap for fluentd are tagged with addonmanager.kubernetes.io/mode: Reconcile. I would conclude that GKE has overwritten my settings because of this "reconcile" mode.
So, my question is: how can I change the fluentd configuration in a Google Container Engine cluster, when the logging agent was installed by GKE on cluster provisioning?
Please take a look at the Prerequisites section on the documentation page you mentioned. It's mentioned there, that on GKE you cannot change the default Stackdriver Logging integration. The reason is that GKE maintains this configuration: updates the agent, watches its health and so on. It's not possible to provide the same level of support for all possible configurations.
However, you can always disable the default integration and deploy your own, patched version of DaemonSet. You can find out how to disable the default integration in the GKE documentation:
gcloud beta container clusters update [CLUSTER-NAME] \
--logging-service=none
Note, that after you disabled the default integration, you have to maintain the new deployment yourself: update the agent, set the resources, watch its health.
Here is a solution for using your own fluentd daemonset that is very much like the one included with GKE.
https://cloud.google.com/solutions/customizing-stackdriver-logs-fluentd

gcloud container VMs logging strategy

I have an instance group of Container VMs running my app on a docker container.
I am trying to find a good strategy to manage the application logs for docker + MEAN + Google Cloud Compute Machines.
I can see the logs on individual containers running docker logs [container_id].
However, if I stop and start the VM I lose those logs. I also have VMs dynamically added by Auto scaler and would like to have a convenient way to access the logs.
Stack is MEAN and Logging tool is bunyan.
Is is possible to centralize or combine the logs from all VMS in one persistent location?
any suggestions?
UPDATES:
I installed fluentd agent and now I can see logs when I manually run thins on the shell: logger "some message for testing"
However, the logs from my container vm from my docker container never shows up on logs.
I still don't know how to get those docker logs to turn up on google cloud logs. It is supposed to be automatically collected.
cheers
Leo
Here is a yaml, Dockerfile and conf for a fluentd pod inside kubernetes.
Adjust the yaml to mount a disk:
https://github.com/GoogleCloudPlatform/kubernetes/tree/master/contrib/logging/fluentd-sidecar-gcp
Then adjust the config to log to the disk.
Build the container with the new configuration.
Deploy the new container.