How to use fluentd+elasticsearch+grafana to display the first 12 characters of the container ID? - kubernetes

I need to use fluentd to collect Kubernetes logs and store them in Elasticsearch, and then use Grafana to display the logs and digests. However, a Docker container ID is 64 characters long. How can I configure fluentd, Elasticsearch, or Grafana so that Grafana displays only the first 12 characters of the container ID?
My config file is as follows:
https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/fluentd-elasticsearch/fluentd-es-configmap.yaml

Try something like this at the end of containers.input.conf:
<filter kubernetes.**>
  @type record_transformer
  enable_ruby
  <record>
    docker.container_id ${record["docker.container_id"][0,12]}
  </record>
</filter>
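If the ID in your records is nested under a docker key instead (which is how the kubernetes metadata filter usually emits it, i.e. record["docker"]["container_id"]), a rough, untested variation of the same idea rewrites the nested hash; the field layout here is an assumption:

<filter kubernetes.**>
  @type record_transformer
  enable_ruby true
  auto_typecast true   # keep the value below as a hash rather than a string
  <record>
    # replace the docker hash with one whose container_id is truncated to 12 characters
    docker ${ {"container_id" => record.dig("docker", "container_id").to_s[0, 12]} }
  </record>
</filter>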

If it is okay to only store 12-character IDs, you can add a parser filter (tested with Fluent Bit only):
parsers.conf

[PARSER]
    Name    dockerid_parser
    Format  regex
    Regex   ^(?<container_id>.{12})

fluent-docker.conf

[SERVICE]
    ...
    Parsers_File /full/path/to/parsers.conf
    ...

[FILTER]
    Name         parser
    Match        *
    Key_Name     container_id
    Parser       dockerid_parser
    Reserve_Data On
...
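With Reserve_Data On, the parser filter keeps all other keys in the record and only replaces the container_id value with the 12-character capture from the regex, so everything stored downstream (and therefore shown in Grafana) already carries the short ID.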

Related

Fluentd incorrectly routing logs to its own STDOUT

I have a GKE cluster, in which I'm using Fluentd Kubernetes Daemonset v1.14.3 (docker image: fluent/fluentd-kubernetes-daemonset:v1.14.3-debian-gcs-1.1) to collect the logs from certain containers and forward them to a GCS Bucket.
My configuration takes logs from /var/log/containers/*.log, filters the containers based on some Kubernetes annotations, and then uploads them to GCS using a plugin.
In most cases this works correctly, but I'm currently stuck with a weird issue:
certain containers' logs are sometimes printed into fluentd's own stdout. Let me elaborate:
Assume we have a container called helloworld which runs echo "HELLO WORLD".
Soon after the container starts, I can see in fluentd's own logs:
2022-07-20 13:29:04 +0000 [info]: #0 [in_tail_container_logs] following tail of /var/log/containers/my-pod_my-namespace_helloworld-7e6359514a5601e5ad1823d145fd3b73f7b65648f5cb760f2c1855dabe27d606.log
...
HELLO WORLD
...
This .log file contains the standard output of my Docker container ("HELLO WORLD" in a JSON-structured format).
Normally, fluentd should tail this log, and send the messages to the GCS plugin, which should upload them to the destination bucket. But sometimes, the logs are printed directly into fluentd's own output, instead of being passed to the plugin.
I'd appreciate any help to root-cause this issue.
Things I've looked into
I have increased the verbosity of the logs using fluentd -vv, but found nothing relevant.
This happens somewhat randomly, and only for certain containers: for some it never happens, while for others it sometimes does and sometimes doesn't.
There's nothing special about the containers presenting this issue.
We don't have any configuration for fluentd to stream the logs to stdout. We triple checked this.
The issue happens on a GKE cluster (1.21.12-gke.1700).
Below is the fluentd configuration being used:
<label @FLUENT_LOG>
  <match fluent.**>
    @type null
  </match>
</label>

<source>
  @type tail
  @id in_tail_container_logs
  path /var/log/containers/*.log
  exclude_path ["/var/log/containers/fluentd-*", "/var/log/containers/fluentbit-*", "/var/log/containers/kube-*", "/var/log/containers/pdsci-*", "/var/log/containers/gke-*"]
  pos_file /var/log/fluentd-containers.log.pos
  tag "kubernetes.*"
  refresh_interval 1s
  read_from_head true
  follow_inodes true
  <parse>
    @type json
    time_format %Y-%m-%dT%H:%M:%S.%NZ
    keep_time_key true
  </parse>
</source>

<filter kubernetes.**>
  @type kubernetes_metadata
  @id filter_kube_metadata
  kubernetes_url "#{'https://' + ENV.fetch('KUBERNETES_SERVICE_HOST') + ':' + ENV.fetch('KUBERNETES_SERVICE_PORT') + '/api'}"
  verify_ssl true
  ca_file "#{ENV['KUBERNETES_CA_FILE']}"
  watch false                   # Don't watch for changes in container metadata
  de_dot false                  # Don't replace dots in labels and annotations
  skip_labels false
  skip_master_url true
  skip_namespace_metadata true
  annotation_match ["app.*\/log-.+"]
</filter>

<filter kubernetes.**>
  @type grep
  <regexp>
    key $.kubernetes.namespace_name
    pattern /^my-namespace$/
  </regexp>
  <regexp>
    key $['kubernetes']['labels']['example.com/collect']
    pattern /^yes$/
  </regexp>
</filter>

<match kubernetes.**>
  # docs: https://github.com/daichirata/fluent-plugin-gcs
  @type gcs
  @id out_gcs
  project "#{ENV['GCS_BUCKET_PROJECT']}"
  bucket "#{ENV.fetch('GCS_BUCKET_PROJECT')}"
  object_key_format %Y%m%d/%H%M/${$.kubernetes.pod_name}_${$.kubernetes.container_name}_${$.docker.container_id}/%{index}.%{file_extension}
  store_as json
  <buffer time,$.kubernetes.pod_name,$.kubernetes.container_name,$.docker.container_id>
    @type file
    path /var/log/fluentd-buffers/gcs.buffer
    timekey 30
    timekey_wait 5
    timekey_use_utc true        # use utc
    chunk_limit_size 1MB
    flush_at_shutdown true
  </buffer>
  <format>
    @type json
  </format>
</match>

EKS - Fluent Bit to CloudWatch: unable to remove Kubernetes data from log entries

We have configured Fluent-bit to send the logs from our cluster directly to CloudWatch.
We have enabled the Kubernetes filter in order to set our log_stream_name as $(kubernetes['container_name']).
However, the logs are terrible.
Each CloudWatch line looks like this:
2022-06-23T14:17:34.879+02:00 {"kubernetes":{"redacted_redacted":"145236632541.lfl.ecr.region-#.amazonaws.com/redacted#sha256:59392fab7hsfghsfghsfghsfghsfghsfghc39c1bee75c0b4bfc2d9f4a405aef449b25","redacted_image":"145236632541.lfl.ecr.region-#.amazonaws.com/redacted:ve3b56a45","redacted_name":"redacted-redacted","docker_id":"b431f9788f46sd5f4ds65f4sd56f4sd65f4d336fff4ca8030a216ecb9e0a","host":"ip-0.0.0.0.region-#.compute.internal","namespace_name":"namespace","pod_id":"podpodpod-296c-podpod-8954-podpodpod","pod_name":"redacted-redacted-redacted-7dcbfd4969-mb5f5"},
2022-06-23T14:17:34.879+02:00 {"kubernetes":{"redacted_redacted":"145236632541.lfl.ecr.region-#.amazonaws.com/redacted#sha256:59392fab7hsfghsfghsfghsfghsfghsfghc39c1bee75c0b4bfc2d9f4a405aef449b25","redacted_image":"145236632541.lfl.ecr.region-#.amazonaws.com/redacted:ve3b56a45","redacted_name":"redacted-redacted","docker_id":"b431f9788f46sd5f4ds65f4sd56f4sd65f4d336fff4ca8030a216ecb9e0a","host":"ip-0.0.0.0.region-#.compute.internal","namespace_name":"namespace","pod_id":"podpodpod-296c-podpod-8954-podpodpod","pod_name":"redacted-redacted-redacted-7dcbfd4969-mb5f5"},
2022-06-23T14:17:34.879+02:00 {"kubernetes":{"redacted_redacted":"145236632541.lfl.ecr.region-#.amazonaws.com/redacted#sha256:59392fab7hsfghsfghsfghsfghsfghsfghc39c1bee75c0b4bfc2d9f4a405aef449b25","redacted_image":"145236632541.lfl.ecr.region-#.amazonaws.com/redacted:ve3b56a45","redacted_name":"redacted-redacted","docker_id":"b431f9788f46sd5f4ds65f4sd56f4sd65f4d336fff4ca8030a216ecb9e0a","host":"ip-0.0.0.0.region-#.compute.internal","namespace_name":"namespace","pod_id":"podpodpod-296c-podpod-8954-podpodpod","pod_name":"redacted-redacted-redacted-7dcbfd4969-mb5f5"},
2022-06-23T14:20:07.074+02:00 {"kubernetes":{"redacted_redacted":"145236632541.lfl.ecr.region-#.amazonaws.com/redacted#sha256:59392fab7hsfghsfghsfghsfghsfghsfghc39c1bee75c0b4bfc2d9f4a405aef449b25","redacted_image":"145236632541.lfl.ecr.region-#.amazonaws.com/redacted:ve3b56a45","redacted_name":"redacted-redacted","docker_id":"b431f9788f46sd5f4ds65f4sd56f4sd65f4d336fff4ca8030a216ecb9e0a","host":"ip-0.0.0.0.region-#.compute.internal","namespace_name":"namespace","pod_id":"podpodpod-296c-podpod-8954-podpodpod","pod_name":"redacted-redacted-redacted-7dcbfd4969-mb5f5"},
Which makes the logs unusable unless expanded, and once expanded the logs look like this:
2022-06-23T14:21:34.207+02:00
{
"kubernetes": {
"container_hash": "145236632541.lfl.ecr.region.amazonaws.com/redacted#sha256:59392fab7hsfghsfghsfghsfghsfghsfghc39c1bee75c0b4bfc2d9f4a405aef449b25",
"container_image": "145236632541.lfl.ecr.region-#.amazonaws.com/redacted:ve3b56a45",
"container_name": "redacted-redacted",
"docker_id": "b431f9788f46sd5f4ds65f4sd56f4sd65f4d336fff4ca8030a216ecb9e0a",
"host": "ip-0.0.0.0.region-#.compute.internal",
"namespace_name": "redacted",
"pod_id": "podpodpod-296c-podpod-8954-podpodpod",
"pod_name": "redacted-redacted-redacted-7dcbfd4969-mb5f5"
},
"log": "[23/06/2022 12:21:34] loglineloglinelogline\ loglineloglinelogline \n",
"stream": "stdout"
}
{"kubernetes":{"redacted_redacted":"145236632541.lfl.ecr.region-#.amazonaws.com/redacted#sha256:59392fab7hsfghsfghsfghsfghsfghsfghc39c1bee75c0b4bfc2d9f4a405aef449b25","redacted_image
Which is also a bit horrible because every line is flooded with Kubernetes data.
I would like to remove the Kubernetes data from the logs completely, but I would like to keep using $(kubernetes['container_name']) as the log stream name so that the logs are properly named.
I have tried using filters with Remove_key and Lua scripts that remove the Kubernetes data, but as soon as something removes it, the log stream can no longer be named $(kubernetes['container_name']).
I have found very little documentation on this, and have not found a proper way to remove the Kubernetes data while keeping my log_stream_name as my container_name.
Here is the raw file with the Fluent Bit config that I used:
https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/fluent-bit/fluent-bit-compatible.yaml
Any help would be appreciated.
There are instructions at https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-setup-logs-FluentBit.html under "(Optional) Reducing the log volume from Fluent Bit".
Just add a nest filter in the log config, e.g.:
user-api.conf: |
    [INPUT]
        Name                tail
        Tag                 user-api.*
        Path                /var/log/containers/user-api*.log
        Docker_Mode         On
        Docker_Mode_Flush   5
        Docker_Mode_Parser  container_firstline_user
        Parser              docker
        DB                  /var/fluent-bit/state/flb_user_api.db
        Mem_Buf_Limit       50MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Rotate_Wait         30
        storage.type        filesystem
        Read_from_Head      ${READ_FROM_HEAD}

    [FILTER]
        Name                kubernetes
        Match               user-api.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_Tag_Prefix     user-api.var.log.containers.
        Merge_Log           On
        Merge_Log_Key       log_processed
        K8S-Logging.Parser  On
        K8S-Logging.Exclude Off
        Labels              Off
        Annotations         Off

    [FILTER]
        Name                grep
        Match               user-api.*
        Exclude             log /.*"GET \/ping HTTP\/1.1" 200.*/

    [FILTER]
        Name                nest
        Match               user-api.*
        Operation           lift
        Nested_under        kubernetes
        Add_prefix          Kube.

    [FILTER]
        Name                modify
        Match               user-api.*
        Remove              kubernetes.kubernetes.host
        Remove              Kube.container_hash
        Remove              Kube.container_image
        Remove              Kube.container_name
        Remove              Kube.docker_id
        Remove              Kube.host
        Remove              Kube.pod_id

    [FILTER]
        Name                nest
        Match               user-api.*
        Operation           nest
        Wildcard            Kube.*
        Nested_under        kubernetes
        Remove_prefix       Kube.

    [OUTPUT]
        Name                cloudwatch_logs
        Match               user-api.*
        region              ${AWS_REGION}
        log_group_name      /aws/containerinsights/${CLUSTER_NAME}/user-api
        log_stream_prefix   app-
        auto_create_group   true
        extra_user_agent    container-insights
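To unpack what that sequence does: the first nest filter (Operation lift) flattens the kubernetes map into top-level keys prefixed with Kube., the modify filter removes the noisy ones (container hash, image, name, docker_id, host, pod_id), and the second nest filter folds whatever Kube.* keys remain back under kubernetes, so the records shipped to CloudWatch keep only the metadata you still want.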

kubernetes container_name got null in fluentd configuration

I'm trying to get logs from my application container and have attached a fluentd log agent as a sidecar container in my project. I want to see which log is coming from which application in my Kibana dashboard. That's why I configured fluentd like this:
<source>
  @id fluentd-containers.log
  @type tail
  path /var/log/containers/mylog*.log
  pos_file /var/log/es-containers.log.pos
  time_format %Y-%m-%dT%H:%M:%S.%NZ
  tag kubernetes.myapp.container
  read_from_head true
  <parse>
    @type none
  </parse>
</source>
<filter kubernetes**>
  @type record_transformer
  enable_ruby true
  <record>
    service_name ${tag_parts[1]}
    instance_name ${record["kubernetes"]["container_name"]}
    log_type ${tag_parts[2]}
    host_name ${hostname}
    send_to "ES"
  </record>
</filter>
<match kubernetes.**>
  @type stdout
</match>
But when I deployed it, ${record["kubernetes"]["container_name"]} got null, and fluentd reported that it could not expand the placeholder. Please help me resolve this, thanks.
I got this error message:
0 dump an error event: error_class=RuntimeError error="failed to expand record[\"kubernetes\"][\"container_name\"] : error = undefined method `[]' for nil:NilClass" location="/fluentd/vendor/bundle/ruby/2.6.0/gems/fluentd-1.11.2/lib/fluent/plugin/filter_record_transformer.rb:310:in `rescue in expand'" tag="kubernetes.myapp.container" time=2020-09-23 11:29:05.705209241 +0000 record={"message"=>"{\"log\":\"I0923 11:28:59.157177 1 main.go:71] Health check succeeded\\n\",\"stream\":\"stderr\",\"time\":\"2020-09-23T11:28:59.157256887Z\"}"}
The record doesn't contain the fields that you want to access, i.e. record["kubernetes"]["container_name"].
You need to make sure that it has those fields.
Please go through the Container Deployment docs and the kubernetes_metadata_filter plugin for detailed information on this.
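For illustration, a minimal sketch of what that usually looks like, assuming the fluent-plugin-kubernetes_metadata_filter gem is installed and the tail source keeps a file-path-derived tag (which that plugin parses to look up pod metadata); the paths are taken from the question, everything else is an assumption:

<source>
  @type tail
  path /var/log/containers/mylog*.log
  pos_file /var/log/es-containers.log.pos
  # keep the path-derived tag so the metadata filter can extract pod/container names from it
  tag kubernetes.*
  read_from_head true
  <parse>
    @type none
  </parse>
</source>

# adds record["kubernetes"] (container_name, namespace_name, ...) and record["docker"] to each event
<filter kubernetes.**>
  @type kubernetes_metadata
</filter>

# only now can the placeholder be expanded safely
<filter kubernetes.**>
  @type record_transformer
  enable_ruby true
  <record>
    instance_name ${record["kubernetes"]["container_name"]}
  </record>
</filter>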

td-agent is unable to ship logs from file when the file contains a single multiline log

td-agent is unable to ship logs from a file when the log file contains a single multiline log. The logs are not picked up by td-agent until a new line is added.
I installed td-agent on a Windows machine and configured the td-agent.conf file to pick up logs from a file containing a single multiline log. The logs are not shipped until a new line is added to the file.
td-agent.conf
<source>
  @type tail
  path "C:/abc.txt"
  pos_file etc/td-agent/pos/abc-file.pos
  tag abc-file-test
  multiline_flush_interval 5s
  format multiline
  <parse>
    @type multiline
    format_firstline /^2019*/
    format1 /^(?<message>.*)/
  </parse>
  read_from_head true
</source>
<filter abc-file-**>
  @type record_modifier
  <record>
    entity "abc"
    component ${tag}
    hostname "#{Socket.gethostname}"
  </record>
</filter>
<match abc-file-**>
  @type kafka_buffered
  brokers "localhost:9092"
  default_topic abc-topic
  flush_interval 5s
  kafka_agg_max_bytes 1000000
  max_send_limit_bytes 10000000
  discard_kafka_delivery_failed true
  output_data_type json
  compression_codec gzip
  max_send_retries 1
  required_acks 1
  get_kafka_client_log true
</match>
abc.txt log file:
2019-04-12 12:09:45 INFO abc.java exception occured at com.*************
at com.**************************
at com.************************
The logs should flow to Kafka, but they don't.
This is a limitation of the in_tail plugin.
How about using fluent-plugin-concat with the multiline_end_regexp parameter? A rough sketch of the concat approach is shown below.
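For reference, a rough sketch of that approach, assuming the fluent-plugin-concat gem is installed. Since a stack trace has no clear end marker, this sketch uses multiline_start_regexp with a flush timeout rather than multiline_end_regexp, and the regexp, interval, and label name are illustrative. The tail source reads raw lines and the concat filter groups them; flush_interval plus timeout_label make sure the last, still-open multiline event is emitted even if no further line ever arrives:

<source>
  @type tail
  path "C:/abc.txt"
  pos_file etc/td-agent/pos/abc-file.pos
  tag abc-file-test
  read_from_head true
  <parse>
    @type none   # read raw lines; the concat filter below groups them
  </parse>
</source>

<filter abc-file-**>
  @type concat
  key message
  # a line starting with a date begins a new event; other lines are appended to it
  multiline_start_regexp /^\d{4}-\d{2}-\d{2}/
  # flush a buffered event if no continuation line arrives within 5 seconds
  flush_interval 5s
  # events flushed by the timeout are re-emitted under this label
  timeout_label @KAFKA
</filter>

<label @KAFKA>
  <match abc-file-**>
    # same Kafka output as in the main pipeline
    @type kafka_buffered
    brokers "localhost:9092"
    default_topic abc-topic
  </match>
</label>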

Fluentd create a tag based on a field value

I have a Kubernetes cluster in which I'm trying to aggregate container logs on the nodes and send them to MongoDB. However, I need to be able to send the log records to different MongoDB servers based on values in the log record itself.
I'm using the fluent-plugin-kubernetes_metadata_filter plugin to attach additional information from Kubernetes to the log record. One of those fields is kubernetes_namespace_name. Is it possible to use that field to create a tag which I can match against in the MongoDB output plugin?
For example, below I'm using only one output, but the idea is to have multiple outputs and let fluentd send the logs to the right MongoDB database based on the value of the kubernetes_namespace_name field:
<source>
  @type tail
  @label @KUBERNETES
  path /var/log/containers/*.log
  pos_file /var/log/es-containers.log.pos
  time_format %Y-%m-%dT%H:%M:%S
  tag kubernetes.*
  format json
  keep_time_key true
  read_from_head true
</source>
<label @KUBERNETES>
  <filter kubernetes.**>
    @type kubernetes_metadata
    kubernetes_url "#{ENV['K8S_HOST_URL']}"
    bearer_token_file /var/run/secrets/kubernetes.io/serviceaccount/token
    ca_file /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    include_namespace_id true
  </filter>
  <filter kubernetes.**>
    @type flatten_hash
    separator _
  </filter>
  # < Tag 'kubernetes.namespace.default' is created here somehow >
  <match kubernetes.namespace.default>
    @type mongo
    host "#{ENV['MONGO_HOST']}"
    port "#{ENV['MONGO_PORT']}"
    database "#{ENV['MONGO_DATABASE']}"
    collection "#{ENV['MONGO_COLLECTION']}"
    capped
    capped_size 1024m
    user "#{ENV['MONGO_USER']}"
    password "#{ENV['MONGO_PASSWORD']}"
    time_key time
    flush_interval 10s
  </match>
</label>
Instead of using the tag, you can use the record content to do the filtering with Fluentd's grep filter. You can add the filter after the kubernetes metadata filter and before the data flattener. This allows you to specify the key kubernetes_namespace_name and then route according to the value within. Since you may have additional MongoDB outputs, using labels can help separate the processing workflows (a sketch of that layout follows the example below).
Documentation: https://docs.fluentd.org/v0.12/articles/filter_grep
Example:
<filter kubernetes.**>
  @type grep
  <regexp>
    key kubernetes_namespace_name
    pattern cool
  </regexp>
</filter>
<YOUR MONGO CONFIG HERE>
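If you do end up with several MongoDB destinations, one way to express that label-based layout is to duplicate the stream with copy into one label per namespace, each with its own grep filter and mongo output. A rough sketch (the label names, namespace values, and MONGO_HOST_DEFAULT variable are illustrative; the copy match replaces the mongo match inside the @KUBERNETES label, and the new label blocks live at the top level of the config):

<match kubernetes.**>
  @type copy
  <store>
    @type relabel
    @label @NS_DEFAULT
  </store>
  <store>
    @type relabel
    @label @NS_OTHER
  </store>
</match>

<label @NS_DEFAULT>
  <filter kubernetes.**>
    @type grep
    <regexp>
      key kubernetes_namespace_name
      pattern /^default$/
    </regexp>
  </filter>
  <match kubernetes.**>
    @type mongo
    host "#{ENV['MONGO_HOST_DEFAULT']}"
    # ... rest of the MongoDB settings for the "default" namespace ...
  </match>
</label>

# @NS_OTHER would look the same, grepping for its own namespace and pointing at its own MongoDB server.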