containerd multiline logs parsing with fluentbit - kubernetes

After shifting from Docker to containerd as the container runtime used by our Kubernetes cluster, we are no longer able to show multiline logs properly in our visualization app (Grafana), because containerd prepends some details to the container/pod logs (timestamp, stream and a tag; specifically it prepends something like the following, as shown in the sample below: 2022-07-25T06:43:17.20958947Z stdout F ), which confuses the developers and the application owners.
Here is a dummy sample of the logs generated by the application and how they end up on the Kubernetes nodes after containerd prepends the mentioned details.
The following logs are generated by the application (output of kubectl logs):
2022-07-25T06:43:17,309ESC[0;39m dummy-[txtThreadPool-2] ESC[39mDEBUGESC[0;39m
  ESC[36mcom.pkg.sample.ComponentESC[0;39m - Process message meta {
  timestamp: 1658731397308720468
  version {
      major: 1
      minor: 0
      patch: 0
  }
}
When I check the logs on the node filesystem (/var/log/containers/ABCXYZ.log):
2022-07-25T06:43:17.20958947Z stdout F 2022-07-25T06:43:17,309ESC[0;39m dummy-[txtThreadPool-2]
ESC[39mDEBUGESC[0;39m
ESC[36mcom.pkg.sample.ComponentESC[0;39m - Process message meta {
2022-07-25T06:43:17.20958947Z stdout F timestamp: 1658731449723010774
2022-07-25T06:43:17.209593379Z stdout F version {
2022-07-25T06:43:17.209595933Z stdout F major: 14
2022-07-25T06:43:17.209598466Z stdout F minor: 0
2022-07-25T06:43:17.209600712Z stdout F patch: 0
2022-07-25T06:43:17.209602926Z stdout F }
2022-07-25T06:43:17.209605099Z stdout F }
I am able to parse the multiline logs with Fluent Bit, but the problem is that I am not able to remove the details injected by containerd ( >> 2022-07-25T06:43:17.209605099Z stdout F .......). So is there any way to configure containerd not to prepend these details and to write the logs exactly as they are generated by the application/container?
On the other hand, is there any plugin that can remove such details on the Fluent Bit side? As far as I can tell, none of the existing plugins can manipulate or change the logs (which is logical, as the log agent should not change the logs).
Thanks in advance.

This is the workaround I followed to show the multiline log lines in Grafana, by applying extra Fluent Bit filters and a multiline parser.
1- First, a tail input receives the stream and parses it with a multiline parser (multilineKubeParser).
2- Then the kubernetes filter intercepts the stream and does further processing with a regex parser (kubeParser).
3- After that, a lua filter removes the details added by containerd (see filters.lua below).
fluent-bit.conf: |-
  [SERVICE]
      HTTP_Server     On
      HTTP_Listen     0.0.0.0
      HTTP_PORT       2020
      Flush           1
      Daemon          Off
      Log_Level       warn
      Parsers_File    parsers.conf

  [INPUT]
      Name              tail
      Tag               kube.*
      Path              /var/log/containers/*.log
      multiline.Parser  multilineKubeParser
      Exclude_Path      /var/log/containers/*_ABC-logging_*.log
      DB                /run/fluent-bit/flb_kube.db
      Mem_Buf_Limit     5MB

  [FILTER]
      Name                 kubernetes
      Match                kube.*
      Kube_URL             https://kubernetes.default.svc:443
      Merge_Log            On
      Merge_Parser         kubeParser
      K8S-Logging.Parser   Off
      K8S-Logging.Exclude  On

  [FILTER]
      Name    lua
      Match   kube.*
      call    remove_dummy
      Script  filters.lua

  [OUTPUT]
      Name                  grafana-loki
      Match                 kube.*
      Url                   http://loki:3100/api/prom/push
      TenantID              ""
      BatchWait             1
      BatchSize             1048576
      Labels                {job="fluent-bit"}
      RemoveKeys            kubernetes
      AutoKubernetesLabels  false
      LabelMapPath          /fluent-bit/etc/labelmap.json
      LineFormat            json
      LogLevel              warn

labelmap.json: |-
  {
    "kubernetes": {
      "container_name": "container",
      "host": "node",
      "labels": {
        "app": "app",
        "release": "release"
      },
      "namespace_name": "namespace",
      "pod_name": "instance"
    },
    "stream": "stream"
  }

parsers.conf: |-
  [PARSER]
      Name         kubeParser
      Format       regex
      Regex        /^([^ ]*).* (?<timeStamp>[^a].*) ([^ ].*)\[(?<requestId>[^\]]*)\] (?<severity>[^ ]*) (?<message>[^ ].*)$/
      Time_Key     time
      Time_Format  %Y-%m-%dT%H:%M:%S.%L%z
      Time_Keep    On
      Time_Offset  +0200

  [MULTILINE_PARSER]
      name           multilineKubeParser
      type           regex
      flush_timeout  1000
      rule  "start_state"  "/[^ ]* stdout .\s+\W*\w+\d\d\d\d-\d\d-\d\d \d\d\:\d\d\:\d\d,\d\d\d.*$/"  "cont"
      rule  "cont"         "/[^ ]* stdout .\s+(?!\W+\w+\d\d\d\d-\d\d-\d\d \d\d\:\d\d\:\d\d,\d\d\d).*$/"  "cont"
filters.lua: |-
  -- Strip the containerd/CRI prefix ("<timestamp> stdout F ") that is still
  -- embedded in the concatenated multiline record, then hand back the record.
  function remove_dummy(tag, timestamp, record)
      new_log = string.gsub(record["log"], "%d+-%d+-%d+T%d+:%d+:%d+.%d+Z%sstdout%sF%s", "")
      new_record = record
      new_record["log"] = new_log
      -- return code 2: record modified, original timestamp kept
      return 2, timestamp, new_record
  end
As I mentioned, this is a workaround until I can find another/better solution.

From the configuration options for containerd, it appears there's no way to configure logging in any way. You can see the config doc here.
Also, I looked at the logging code inside containerd, and it appears these details are prepended to the logs as they are redirected from the stdout of the container. You can see that the testcase here checks for the appropriate fields by "splitting" the received log line: it checks for a tag and a stream entry prepended to the content of the log. I suppose that's the way logs are processed in containerd.
The best thing to do would be to open an issue in the project with your design requirement and perhaps the team can develop configurable stdout redirection for you.

This might help. They use a custom regex to capture the message and the rest of the info in the logs, and then use the lift operation of the nest filter to flatten the JSON.
https://github.com/microsoft/fluentbit-containerd-cri-o-json-log
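For reference, a minimal sketch of that approach (the regex below is the generic CRI-style pattern from the Fluent Bit docs, not copied verbatim from the linked repository, and the filter values are illustrative):
[PARSER]
    # Capture the containerd-added prefix (time, stream, logtag) separately,
    # so only the application text ends up in the message field.
    Name        cri
    Format      regex
    Regex       ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)$
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L%z

[FILTER]
    # Flatten the nested kubernetes map into top-level keys, as the answer
    # describes, using the nest filter's lift operation.
    Name          nest
    Match         kube.*
    Operation     lift
    Nested_under  kubernetes
    Add_prefix    kubernetes_
With a parser like this applied on the tail input, the containerd prefix should never reach the stored log line, which would make the lua workaround above unnecessary.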

Related

Fluentd tail reads every time from the start of the file

I am trying to forward syslogs.
I start td-agent with the read_from_head param set to false, and on checking the logs nothing happens. I then added a line to the target file via
vscode, so there wasn't an issue with the inode changing.
The issue I face is that upon any addition of log lines to the target file, the
pos_file gets updated and td-agent reads from the top, and after adding
another line it collects from the top again.
How does fluentd's in_tail work?
<source>
  @type tail
  format none
  path /home/user/Documents/debugg/demo.log
  pos_file /var/log/td-agent/demo.log.pos
  read_from_head false
  follow_inodes true
  tag syslog
</source>

Error when trying to apply configmap to auth with EKS cluster

I have the following question: I am trying to connect to an EKS cluster using Terraform with GitLab CI/CD, and I receive the error message below, but when I try it on my own machine the error doesn't appear. Has anyone had the same error?
$ terraform output authconfig > authconfig.yaml
$ cat authconfig.yaml
<<EOT
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: "arn:aws:iam::503655390180:role/clusters-production-workers"
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
EOT
$ kubectl create -f authconfig.yaml -n kube-system
error: error parsing authconfig.yaml: error converting YAML to JSON: yaml: line 2: mapping values are not allowed in this context
The output includes the EOT (End Of Text) markers because it was originally generated as a multiline (heredoc) string.
As the documentation suggests (terraform doc link):
Don't use "heredoc" strings to generate JSON or YAML. Instead, use the
jsonencode function or the yamlencode function so that Terraform can
be responsible for guaranteeing valid JSON or YAML syntax.
Use jsonencode or yamlencode when building the output.
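For example, a minimal sketch of rebuilding the output with yamlencode (values are copied from the question; the attribute layout is illustrative and may need adjusting to your module):
output "authconfig" {
  value = yamlencode({
    apiVersion = "v1"
    kind       = "ConfigMap"
    metadata = {
      name      = "aws-auth"
      namespace = "kube-system"
    }
    data = {
      # mapRoles must itself be a YAML string, so it is encoded separately
      mapRoles = yamlencode([
        {
          rolearn  = "arn:aws:iam::503655390180:role/clusters-production-workers"
          username = "system:node:{{EC2PrivateDNSName}}"
          groups   = ["system:bootstrappers", "system:nodes"]
        }
      ])
    }
  })
}
Then terraform output -raw authconfig > authconfig.yaml produces a file without EOT markers.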
If you want to continue with what you have now, then try passing the -json or -raw option to terraform output:
terraform output -json authconfig > authconfig.yaml
or
terraform output -raw authconfig > authconfig.yaml
The error message tells you the authconfig.yaml file cannot be converted from YAML to JSON, suggesting it's not valid YAML.
The cat authconfig.yaml you're showing us includes the <<EOT and EOT tags. I would suggest removing those before running kubectl create -f.
Your comment suggests you knew this already - then why didn't you ask about Terraform, rather than showing us kubectl create failing? From your post, it really sounded like you copy/pasted the output of your job without even reading it.
So, obviously, the next step is terraform output -raw or -json; there are several mentions in their docs and knowledge base, and a Google search would point you to:
https://discuss.hashicorp.com/t/terraform-outputs-with-heredoc-syntax-leaves-eot-in-file/18584/7
https://www.terraform.io/docs/cli/commands/output.html
Last: we could ask why. Why terraform output > something, when you can have Terraform write the file itself?
As a general rule, whenever writing Terraform stdout/stderr to files, I strongly suggest using -no-color.
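A sketch of that alternative, using the hashicorp/local provider (the resource name and local.authconfig are illustrative placeholders for the same ConfigMap object shown above):
resource "local_file" "authconfig" {
  # write the rendered YAML straight to disk during terraform apply
  content  = yamlencode(local.authconfig)
  filename = "${path.module}/authconfig.yaml"
}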

Telegraf & InfluxDB: how to convert PROCSTAT's pid from field to tag?

Summary: I am using telegraf to get procstat into InfluxDB. I want to convert the pid from an integer field to a TAG so that I can do group by on it in Influx.
Details:
After a lot of searching I found the following on some site, but it seems to do the opposite (convert a tag into a field). I am not sure how to deduce the opposite conversion syntax from it:
[processors]
  [[processors.converter]]
    namepass = [ "procstat",]
    [processors.converter.tags]
      string = [ "cmdline",]
I'm using Influx 1.7.9
The correct processor configuration to convert pid into a tag is as below.
[processors]
  [[processors.converter]]
    namepass = [ "procstat"]
    [processors.converter.fields]
      tag = [ "pid"]
Please refer to the documentation of the converter processor plugin:
https://github.com/influxdata/telegraf/tree/master/plugins/processors/converter
In recent versions of Telegraf, the pid can be stored as a tag by specifying it in the input plugin configuration; a converter processor is not needed here.
Set pid_tag = true in the configuration, but be aware of the performance impact of having pid as a tag when processes are short-lived.
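A minimal example of that setting (the pattern value is just for illustration):
[[inputs.procstat]]
  ## match the processes to monitor
  pattern = "nginx"
  ## store the pid as a tag instead of a field
  pid_tag = true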
P.S.: You should try to upgrade your Telegraf version to 1.14.5; there is a performance improvement fix for the procstat plugin in this version.
Plugin configuration reference https://github.com/influxdata/telegraf/tree/master/plugins/inputs/procstat
Sample config.
# Monitor process cpu and memory usage
[[inputs.procstat]]
## PID file to monitor process
pid_file = "/var/run/nginx.pid"
## executable name (ie, pgrep <exe>)
# exe = "nginx"
## pattern as argument for pgrep (ie, pgrep -f <pattern>)
# pattern = "nginx"
## user as argument for pgrep (ie, pgrep -u <user>)
# user = "nginx"
## Systemd unit name
# systemd_unit = "nginx.service"
## CGroup name or path
# cgroup = "systemd/system.slice/nginx.service"
## Windows service name
# win_service = ""
## override for process_name
## This is optional; default is sourced from /proc/<pid>/status
# process_name = "bar"
## Field name prefix
# prefix = ""
## When true add the full cmdline as a tag.
# cmdline_tag = false
## Add the PID as a tag instead of as a field. When collecting multiple
## processes with otherwise matching tags this setting should be enabled to
## ensure each process has a unique identity.
##
## Enabling this option may result in a large number of series, especially
## when processes have a short lifetime.
# pid_tag = false

Logstash not reading from file input

I'm running this implementation of the ELK stack, which is pretty straightforward and easy to configure.
I can push TCP input through the stack using netcat like so:
nc localhost 5000 < /Users/me/path/to/logs/appOne.log
nc localhost 5000 < /Users/me/path/to/logs/appOneStackTrace.log
nc localhost 5000 < /Users/me/path/to/logs/appTwo.log
nc localhost 5000 < /Users/me/path/to/logs/appTwoStackTrace.log
But I cannot get Logstash to read the file paths I specify in the config:
input {
  tcp {
    port => 5000
  }
  file {
    path => [
      "/Users/me/path/to/logs/appOne.log",
      "/Users/me/path/to/logs/appOneStackTrace.log",
      "/Users/me/path/to/logs/appTwo.log",
      "/Users/me/path/to/logs/appTwoStackTrace.log"
    ]
    type => "log"
    start_position => "beginning"
  }
}
output {
  elasticsearch {
    hosts => "elasticsearch:9200"
  }
}
Here is the startup output from the stack regarding logstash input:
logstash_1 | [2019-01-28T17:44:33,206][INFO ][logstash.inputs.tcp ] Starting tcp input listener {:address=>"0.0.0.0:5000", :ssl_enable=>"false"}
logstash_1 | [2019-01-28T17:44:34,037][INFO ][logstash.inputs.file ] No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/usr/share/logstash/data/plugins/inputs/file/.sincedb_a1605b28f1bc77daf785a8805c32f578", :path=>["/Users/me/path/to/logs/appOne.log", "/Users/me/path/to/logs/appOneStackTrace.log", "/Users/me/path/to/logs/appTwo.log", "/Users/me/path/to/logs/appTwoStackTrace.log"]}
There is no indication the pipeline has any issues starting.
I've also checked that the log files have been updated since the TCP input came through, and they have. The last Logstash-specific log from the ELK stack comes from either startup or the TCP input.
Here is my entire Logstash start-up logging in case that's helpful:
logstash_1 | Sending Logstash logs to /usr/share/logstash/logs which is now configured via log4j2.properties
logstash_1 | [2019-01-29T13:32:19,391][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
logstash_1 | [2019-01-29T13:32:19,415][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"6.5.4"}
logstash_1 | [2019-01-29T13:32:23,989][INFO ][logstash.pipeline ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
logstash_1 | [2019-01-29T13:32:24,648][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://elasticsearch:9200/]}}
logstash_1 | [2019-01-29T13:32:24,908][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://elasticsearch:9200/"}
logstash_1 | [2019-01-29T13:32:25,046][INFO ][logstash.outputs.elasticsearch] ES Output version determined {:es_version=>6}
logstash_1 | [2019-01-29T13:32:25,051][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>6}
logstash_1 | [2019-01-29T13:32:25,108][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//elasticsearch:9200"]}
logstash_1 | [2019-01-29T13:32:25,229][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
logstash_1 | [2019-01-29T13:32:25,276][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"#timestamp"=>{"type"=>"date"}, "#version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
logstash_1 | [2019-01-29T13:32:25,327][INFO ][logstash.inputs.tcp ] Starting tcp input listener {:address=>"0.0.0.0:5000", :ssl_enable=>"false"}
logstash_1 | [2019-01-29T13:32:25,924][INFO ][logstash.inputs.file ] No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/usr/share/logstash/data/plugins/inputs/file/.sincedb_143c07d174c46eeab78b902edb3b1289", :path=>["/Users/me/path/to/logs/appOne.log", "/Users/me/path/to/logs/appOneStackTrace.log", "/Users/me/path/to/logs/appTwo.log", "/Users/me/path/to/logs/appTwoStackTrace.log"]}
logstash_1 | [2019-01-29T13:32:25,976][INFO ][logstash.pipeline ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x4d1515ce run>"}
logstash_1 | [2019-01-29T13:32:26,088][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
logstash_1 | [2019-01-29T13:32:26,106][INFO ][filewatch.observingtail ] START, creating Discoverer, Watch with file and sincedb collections
logstash_1 | [2019-01-29T13:32:26,432][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
I found the issue - I needed to map the log files from the host into the container (I'm a Docker noob). The local paths I was specifying in the Logstash config were fine for TCP, but they were unavailable inside the container without volume mapping.
First, I created the container's internal log directories in the Dockerfile for Logstash:
RUN mkdir /usr/share/appOneLogs
RUN mkdir /usr/share/appTwoLogs
Then I volume-mapped my host's log directories into them in the docker-elk/docker-compose.yml file where Logstash is configured:
logstash:
  build:
    context: logstash/
    args:
      ELK_VERSION: $ELK_VERSION
  volumes:
    - ./logstash/config/logstash.yml:/usr/share/logstash/config/logstash.yml:ro
    - ./logstash/pipeline:/usr/share/logstash/pipeline:ro
    - /Users/me/path/to/appOne/logs:/usr/share/appOneLogs # this bit
    - /Users/me/path/to/appTwo/logs:/usr/share/appTwoLogs # and this bit
  ports:
    - "5000:5000"
  ...
Finally, I replaced the paths in logstash/pipelines/logstash.config with the directories created in the Dockerfile:
file {
  path => [
    "/usr/share/appOneLogs",
    "/usr/share/appTwoLogs",
  ]
}
Also of note, I removed start_position => "beginning" from the file input definition, as it overrides the default behavior of treating files like live streams and thus starting at the end.

Fluentd parser plugin

I am trying to implement a parser plugin for fluentd. Below are the configuration file and the plugin file.
Fluentd config file.
<source>
  type syslog
  port 9010
  bind x.x.x.x
  tag flog
  format flog_message
</source>
Plugin file
module Fluent
  class TextParser
    class ElogParser < Parser
      Plugin.register_parser("flog_message", self)

      config_param :delimiter, :string, :default => " " # delimiter is configurable with " " as default
      config_param :time_format, :string, :default => nil # time_format is configurable

      # This method is called after config_params have read configuration parameters
      def configure(conf)
        super

        if @delimiter.length != 1
          raise ConfigError, "delimiter must be a single character. #{@delimiter} is not."
        end

        # TimeParser class is already given. It takes a single argument as the time format
        # to parse the time string with.
        @time_parser = TimeParser.new(@time_format)
      end

      def call(text)
        # decode text
        # ...
        # decode text
        yield result_hash
      end
    end
  end
end
However, the call method is never executed after running fluentd. Any help is greatly appreciated.
Since v0.12, use parse instead of call.
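A minimal sketch of the renamed method, with the decoding logic left elided as in the question (in the v0.12 parser API, parse yields the parsed time and record):
def parse(text)
  # decode text into a time and a record hash
  # ...
  yield time, record_hash
end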
docs.fluentd.org was outdated, so I just updated the article: http://docs.fluentd.org/articles/plugin-development#parser-plugins
Sorry for forgetting to update the document...