Fluentd config to collect logs for each namespace separately - kubernetes

What would be the Fluentd configuration to collect logs and create a separate log file / folder path for each namespace?
I want to use a Fluentd instance with a configuration that helps me segregate and group the logs of each namespace separately, and then zip them separately to be sent over HTTP.

Logs whose tag starts with (kubernetes.var.log...) contain the namespace in the tag, and therefore you can filter based on specific namespaces and decide how to handle those specific logs.
If, for any reason, the log path in your cluster does not contain the namespace, you can also use the kubernetes metadata filter plugin.
It will enrich your logs with metadata relevant to the cluster, and allow you to extract the namespace the logs originated from and deal with them accordingly.
Refer to this blog for collecting logs with Fluentd.
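For example, a minimal sketch of per-namespace file output, assuming fluent-plugin-kubernetes_metadata_filter is installed and the container logs are tailed with tags starting with kubernetes.* (paths, buffer settings, and the output directory are illustrative):
<filter kubernetes.**>
  @type kubernetes_metadata
</filter>
<match kubernetes.**>
  @type file
  # one directory per namespace, using the enriched metadata as a placeholder
  path /fluentd/log/${$.kubernetes.namespace_name}/app.%Y%m%d
  <buffer time, $.kubernetes.namespace_name>
    @type file
    path /fluentd/buffer/namespaces
    timekey 1h
    timekey_wait 5m
  </buffer>
</match>
Each namespace then ends up in its own folder, which you can compress and send over HTTP in a separate step (out_file also supports compress gzip).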

Related

BanzaiCloud Operator Output log from different namespace to S3

I have a specific namespace from which I am trying to output logs into Amazon S3 using the Banzai Cloud Logging operator. I am currently using ClusterFlow and ClusterOutput, but it is not working for me and I am not able to see the logs in the S3 bucket.
Here are the images of the code that I have:
https://imgur.com/a/s6MBpML
I have tried reaching out to the Banzai Cloud Slack to no avail. Can someone help me check the yaml files and validate them?
You should be using a Flow object and not a ClusterFlow.
The Flow is a namespaced resource, which means logs will only be collected from the namespace that the Flow is deployed in, whereas ClusterFlow is scoped at the cluster level and can configure log collection across all namespaces.
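A minimal sketch of what that could look like, with the namespace, bucket, region, and secret names as placeholders (exact fields may vary between Logging operator versions):
apiVersion: logging.banzaicloud.io/v1beta1
kind: Output
metadata:
  name: s3-output
  namespace: my-namespace
spec:
  s3:
    aws_key_id:
      valueFrom:
        secretKeyRef:
          name: s3-credentials
          key: awsAccessKeyId
    aws_sec_key:
      valueFrom:
        secretKeyRef:
          name: s3-credentials
          key: awsSecretAccessKey
    s3_bucket: my-log-bucket
    s3_region: us-east-1
    path: logs/${tag}/%Y/%m/%d/
    buffer:
      timekey: 10m
      timekey_wait: 30s
      timekey_use_utc: true
---
apiVersion: logging.banzaicloud.io/v1beta1
kind: Flow
metadata:
  name: s3-flow
  namespace: my-namespace
spec:
  match:
    - select: {}
  localOutputRefs:
    - s3-output
Both objects live in the namespace whose logs you want to ship, and the Flow references the Output by name.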

Fluentd Reading Logs Process K8s

I have a question pertaining to how Fluentd's process of reading containers' logs works in K8s.
/var/lib/docker/containers/* is mounted into the Fluentd containers of the DaemonSet to give Fluentd access to the logs of all the cluster's containers. So this is the location where the logs can be read from.
My confusion stems from the path argument of the in_tail plugin. The docs say for this argument: "The path(s) to read. Multiple paths can be specified, separated by comma ','."
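For reference, the in_tail source in the typical Fluentd Kubernetes DaemonSet config looks roughly like this (paths are the common defaults, not something specific to my cluster):
<source>
  @type tail
  # /var/log/containers/*.log are symlinks maintained by the kubelet that
  # resolve (via /var/log/pods) to files under /var/lib/docker/containers
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  read_from_head true
  <parse>
    @type json
  </parse>
</source>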
My questions:
Why does Fluentd read logs from the path specified in in_tail and not directly from /var/lib/docker/containers/*?
What puts the logs from /var/lib/docker/containers/* into the path from in_tail?
I couldn't find this explained anywhere and would be very happy if someone could help me out.
Yours
Tridelt

Fluency with forward plugin: how to add kubernetes metadata to logs

Hey, I have a question.
I'm using logback-more-appenders (the Fluency plugin) to send logs to an EFK stack (fluent-bit) running in a Kubernetes cluster, but the logs lack Kubernetes metadata (like node/pod names).
I know I can use <additionalField></additionalField> in logback.xml to add the service name (because this is static), but I cannot do it for dynamic parts like the node or pod name.
I tried to do it on the fluent-bit side using the kubernetes filter, but this works only with tail/systemd inputs, not a forward one (it parses the tag with the filename, which contains the namespace and pod name). I'm using the forward plugin to send logs from Java software to Elasticsearch, and in logback.xml I cannot enter a dynamic pod name (or I don't know if I can).
Any tips on how I can do it? I would prefer to send logs using Fluency instead of sniffing host container logs.
In my case, the best I could think of was to change from the forward to the tail plugin with structured logging (in JSON).
Have you tried passing the pod ID and node name as environment variables and referencing them in logback.xml as additional fields, so that you can attach the metadata to the log events?
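One way to do that is a sketch like the following, exposing the values through the Kubernetes Downward API in the Deployment (the variable names POD_NAME and NODE_NAME are illustrative):
# container spec snippet: expose pod and node name as environment variables
env:
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  - name: NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
In logback.xml you could then use ${POD_NAME} and ${NODE_NAME} as the values of the additional fields, since logback's variable substitution falls back to OS environment variables.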

Logging application logs in DataDog

Using the official Datadog docs, I am able to see the K8s stdout/stderr logs in the Datadog UI; my goal is to get the app logs which are generated by a Spring Boot application at a certain location in my pod.
Configurations done in cluster :
Created ServiceAccount in my cluster along with cluster role and cluster role binding
Created K8s secret to hold DataDog API key
Deployed the DataDog Agent as daemonset in all nodes
Configurations done in App :
Downloaded datadog.jar and instrumented my app execution with it
Exposed ports 8125 and 8126
Added environment tags DD_TRACE_SPAN_TAGS, DD_TRACE_GLOBAL_TAGS in deployment file
Changed pattern in logback.xml
Added logs config in deployment file
Added env tags in deployment file
After doing the above configurations I am able to see the stdout/stderr logs, whereas I want to see the application logs in the Datadog UI.
If someone has done this, please let me know what I am missing here.
If required, I can share the configurations as well. Thanks in advance.
When installing Datadog in your K8s Cluster, you install a Node Logging Agent as a Daemonset with various volume mounts on the hosting nodes. Among other things, this gives Datadog access to the Pod logs at /var/log/pods and the container logs at /var/lib/docker/containers.
Kubernetes and the underlying Docker engine will only include output from stdout and stderr in those two locations (see here for more information). Everything that is written by containers to log files residing inside the containers will be invisible to K8s, unless more configuration is applied to extract that data, e.g. by applying the sidecar container pattern.
So, to get things working in your setup, configure logback to log to stdout rather than /var/app/logs/myapp.log.
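A minimal logback.xml sketch of that (the appender name and pattern are only illustrative):
<configuration>
  <!-- write everything to stdout so the node-level Datadog agent can collect it -->
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>
  <root level="INFO">
    <appender-ref ref="STDOUT"/>
  </root>
</configuration>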
Also, if you don't use APM there is no need to instrument your code with the datadog.jar and do all that tracing setup (setting up ports etc).

How can I get a list of all namespaces within a specific Kubernetes cluster, using the Kubernetes API?

I need to get a list of all namespaces in a specific Kubernetes cluster, using the Kubernetes API. Because I need to loop through multiple clusters in my Python program, I need to specify the cluster every time I call the API.
One option is to use list_namespace(), as described in https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/CoreV1Api.md
However, this API doesn't allow me to specify the cluster. It picks up the cluster from the current-context in my .kube/config file. If I remove or rename the config file, the API call fails completely.
I also found an extensions API at https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/ExtensionsV1beta1Api.md
Unfortunately, there is no API there to retrieve a list of namespaces. Is there some other API that I am unaware of?
If you look at the source code of the kube_config module, you can see that you can use different arguments with the load_kube_config method to select your cluster:
def load_kube_config(config_file=None, context=None,
                     client_configuration=None,
                     persist_config=True):
    """Loads authentication and cluster information from kube-config file
    and stores them in kubernetes.client.configuration.
    :param config_file: Name of the kube-config file.
    :param context: set the active context. If is set to None, current_context
        from config file will be used.
    :param client_configuration: The kubernetes.client.Configuration to
        set configs to.
    :param persist_config: If True, config file will be updated when changed
        (e.g GCP token refresh).
    """
If I understood the code correctly, you can do something like the following:
from kubernetes import client, config

for file in files:
    config.load_kube_config(config_file=file)
    v1 = client.CoreV1Api()
    response = v1.list_namespace()
    print(response)
EDIT: Here is an example that uses the context argument with a single kubeconfig file to iterate over multiple clusters. In the Kubernetes docs there is an entry on Merging kubeconfig files. Basically, after merging your config files into a single file with multiple contexts, you can load that file with config.load_kube_config(config_file=file) and select a context with config.load_kube_config(config_file=file, context="context2").
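For instance, a sketch that iterates over all contexts of a merged kubeconfig and lists the namespaces of each cluster (the kubeconfig path is just an example):
import os
from kubernetes import client, config

kubeconfig = os.path.expanduser("~/.kube/config")  # merged config with multiple contexts
contexts, _active = config.list_kube_config_contexts(config_file=kubeconfig)
for ctx in contexts:
    # activate one context (i.e. one cluster) at a time
    config.load_kube_config(config_file=kubeconfig, context=ctx["name"])
    v1 = client.CoreV1Api()
    for ns in v1.list_namespace().items:
        print(ctx["name"], ns.metadata.name)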
P.S. You don't need to pass a config_file to config.load_kube_config() if you want to use the config file in the default path ('~/.kube/config') or if you set a path in the KUBECONFIG environment variable.
Would you check this example? There you can navigate between multiple contexts and list all pods within all namespaces.
Apparently you just need to replace
list_pod_for_all_namespaces()
with
list_namespace()
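For example, a small sketch of that substitution (the context name is illustrative):
from kubernetes import client, config

config.load_kube_config(context="cluster-1")  # pick the cluster via its context
v1 = client.CoreV1Api()
for ns in v1.list_namespace().items:
    print(ns.metadata.name)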