Fluentd sending to Splunk HEC: Want to set sourcetype to the namespace - kubernetes

Is it possible to programmatically set the sourcetype to be the namespace from where the logs were generated? I am using the Fluentd plugin to send data to the Splunk HTTP Event Collector. Elsewhere, it was recommended to use ${record['kubernetes']['namespace_name']} to set the index name to be the namespace name. When I do this for sourcetype, that literal text just shows up in Splunk rather than translating to the specific namespace names.
@include systemd.conf
@include kubernetes.conf
<match kubernetes.var.log.containers.fluentd**>
type null
</match>
<match **>
type splunk-http-eventcollector
all_items true
server host:port
token ****
index kubernetes
protocol https
verify false
sourcetype ${record['kubernetes']['namespace_name']}
source kubernetes
buffer_type memory
buffer_queue_limit 16
chunk_limit_size 8m
buffer_chunk_limit 150k
flush_interval 5s
</match>

If you have not defined a sourcetype in an appropriate props.conf (and associated transforms.conf), Splunk will try to determine the sourcetype based on heuristics.
Those heuristics are generally not very accurate on custom data sources.
Instead of trying to "programmatically set the sourcetype to be the namespace from where the logs were generated", add a field whose contents indicate the namespace from which the logs are generated (e.g. "namespace").
It's much simpler, extends your logging more efficiently, and doesn't require defining scores, hundreds, or thousands of individual sourcetypes.
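For illustration, here is a minimal sketch of how such a field could be added on the Fluentd side with the stock record_transformer filter, placed before the Splunk match block. It assumes the kubernetes metadata filter (included via kubernetes.conf) has already populated record['kubernetes']['namespace_name']; the field name "namespace" is purely illustrative:
<filter kubernetes.**>
  # copy the namespace from the kubernetes metadata into a top-level field,
  # so it travels to Splunk along with the rest of the record
  @type record_transformer
  enable_ruby true
  <record>
    namespace ${record["kubernetes"]["namespace_name"]}
  </record>
</filter>
With all_items true in the output (as in the question's config), the new field should be forwarded to HEC with the event, and you can then search on it in Splunk without defining per-namespace sourcetypes.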

Related

Fluentd incorrectly routing logs to its own STDOUT

I have a GKE cluster, in which I'm using Fluentd Kubernetes Daemonset v1.14.3 (docker image: fluent/fluentd-kubernetes-daemonset:v1.14.3-debian-gcs-1.1) to collect the logs from certain containers and forward them to a GCS Bucket.
My configuration takes logs from /var/log/containers/*.log, filters the containers based on some Kubernetes annotations, and then uploads them to GCS using a plugin.
In most cases, this works correctly, but I'm currently stuck with some weird issue:
certain containers' logs are sometimes printed into Fluentd's own stdout. Let me elaborate:
Assume we have a container called helloworld which runs echo "HELLO WORLD".
Soon after the container starts, I can see in fluentd's own logs:
2022-07-20 13:29:04 +0000 [info]: #0 [in_tail_container_logs] following tail of /var/log/containers/my-pod_my-namespace_helloworld-7e6359514a5601e5ad1823d145fd3b73f7b65648f5cb760f2c1855dabe27d606.log"
...
HELLO WORLD
...
This .log file contains the standard output of my Docker container ("HELLO WORLD" in some JSON-structured format).
Normally, Fluentd should tail this log and send the messages to the GCS plugin, which should upload them to the destination bucket. But sometimes, the logs are printed directly into Fluentd's own output instead of being passed to the plugin.
I'd appreciate any help to root-cause this issue.
Things I've looked into
I have increased the verbosity of the logs using fluentd -vv, but found nothing relevant.
This happens kind of randomly, but only for certain containers. It never happens for some containers, but for others it sometimes does, and sometimes it doesn't.
There's nothing special about the containers presenting this issue.
We don't have any configuration for fluentd to stream the logs to stdout. We triple checked this.
The issue happens on a GKE cluster (1.21.12-gke.1700).
Below the fluentd configuration being used:
<label @FLUENT_LOG>
<match fluent.**>
@type null
</match>
</label>
<source>
@type tail
@id in_tail_container_logs
path /var/log/containers/*.log
exclude_path ["/var/log/containers/fluentd-*", "/var/log/containers/fluentbit-*", "/var/log/containers/kube-*", "/var/log/containers/pdsci-*", "/var/log/containers/gke-*"]
pos_file /var/log/fluentd-containers.log.pos
tag "kubernetes.*"
refresh_interval 1s
read_from_head true
follow_inodes true
<parse>
@type json
time_format %Y-%m-%dT%H:%M:%S.%NZ
keep_time_key true
</parse>
</source>
<filter kubernetes.**>
@type kubernetes_metadata
@id filter_kube_metadata
kubernetes_url "#{'https://' + ENV.fetch('KUBERNETES_SERVICE_HOST') + ':' + ENV.fetch('KUBERNETES_SERVICE_PORT') + '/api'}"
verify_ssl true
ca_file "#{ENV['KUBERNETES_CA_FILE']}"
watch false # Don't watch for changes in container metadata
de_dot false # Don't replace dots in labels and annotations
skip_labels false
skip_master_url true
skip_namespace_metadata true
annotation_match ["app.*\/log-.+"]
</filter>
<filter kubernetes.**>
@type grep
<regexp>
key $.kubernetes.namespace_name
pattern /^my-namespace$/
</regexp>
<regexp>
key $['kubernetes']['labels']['example.com/collect']
pattern /^yes$/
</regexp>
</filter>
<match kubernetes.**>
# docs: https://github.com/daichirata/fluent-plugin-gcs
@type gcs
@id out_gcs
project "#{ENV['GCS_BUCKET_PROJECT']}"
bucket "#{ENV.fetch('GCS_BUCKET_PROJECT')}"
object_key_format %Y%m%d/%H%M/%{$.kubernetes.pod_name}_${$.kubernetes.container_name}_${$.docker.container_id}/%{index}.%{file_extension}
store_as json
<buffer time,$.kubernetes.pod_name,$.kubernetes.container_name,$.docker.container_id>
@type file
path /var/log/fluentd-buffers/gcs.buffer
timekey 30
timekey_wait 5
timekey_use_utc true # use utc
chunk_limit_size 1MB
flush_at_shutdown true
</buffer>
<format>
@type json
</format>
</match>

How to use Swagger in Quarkus with Ingress-Nginx Kubernetes

Good afternoon. I'm trying to use Swagger in Quarkus, and locally it works great for me. However, when I deploy it to the production environment, where I'm using Ingress-Nginx as a reverse proxy in a Kubernetes cluster, I run into a problem: it doesn't let me view the Swagger interface:
Postman Local:
Swagger Local:
Postman Kubernetes Environment with Ingress-Nginx:
Swagger-UI in Kubernetes Environment with Ingress-Nginx:
My application.properties:
quarkus.datasource.db-kind=oracle
quarkus.datasource.jdbc.driver=oracle.jdbc.driver.OracleDriver
#quarkus.datasource.jdbc.driver=io.opentracing.contrib.jdbc.TracingDriver
quarkus.datasource.jdbc.url=jdbc:oracle:thin:@xxxxxxxxxxxx:1522/IVR
quarkus.datasource.username=${USERNAME_CONNECTION_BD:xxxxxxxx}
quarkus.datasource.password=${PASSWORD_CONNECTION_BD:xxxxxxxx.}
quarkus.http.port=${PORT:8082}
quarkus.http.ssl-port=${PORT-SSl:8083}
# Send output to a trace.log file under the /tmp directory
quarkus.log.file.path=/tmp/trace.log
quarkus.log.console.format=%d{HH:mm:ss} %-5p [%c{2.}] (%t) %s%e%n
# Configure a named handler that logs to console
quarkus.log.handler.console."STRUCTURED_LOGGING".format=%e%n
# Configure a named handler that logs to file
quarkus.log.handler.file."STRUCTURED_LOGGING_FILE".enable=true
quarkus.log.handler.file."STRUCTURED_LOGGING_FILE".format=%e%n
# Configure the category and link the two named handlers to it
quarkus.log.category."io.quarkus.category".level=INFO
quarkus.log.category."io.quarkus.category".handlers=STRUCTURED_LOGGING,STRUCTURED_LOGGING_FILE
quarkus.ssl.native=true
quarkus.http.ssl.certificate.key-store-file=${UBICATION_CERTIFICATE_SSL:srvdevrma1.jks}
quarkus.http.ssl.certificate.key-store-file-type=${TYPE_CERTIFICATE_SSL:JKS}
quarkus.http.ssl.certificate.key-store-password=${PASSWORD_CERTIFICATE_SSL:xxxxxxx}
quarkus.http.ssl.certificate.key-store-key-alias=${ALIAS_CERTIFICATE_SSL:xxxxxxxxx}
quarkus.native.add-all-charsets=true
quarkus.swagger-ui.path=/api/FindPukCodeBS/swagger-ui
quarkus.smallrye-openapi.path=/api/FindPukCodeBS/swagger
mp.openapi.extensions.smallrye.info.title=FindPukCodeBS
%dev.mp.openapi.extensions.smallrye.info.title=FindPukCodeBS
%test.mp.openapi.extensions.smallrye.info.title=FindPukCodeBS
mp.openapi.extensions.smallrye.info.version=1.0.1
mp.openapi.extensions.smallrye.info.description=Servicio que consulta el codigo puk asociado a una ICCID (SIMCARD)
mp.openapi.extensions.smallrye.info.termsOfService=Your terms here
mp.openapi.extensions.smallrye.info.contact.email=xxxxxxxxxxxxxxxxxxxx.com
mp.openapi.extensions.smallrye.info.contact.name=xxxxxxxxxxxxxxxxxx@telefonica.com
mp.openapi.extensions.smallrye.info.contact.url=http://exampleurl.com/contact
mp.openapi.extensions.smallrye.info.license.name=Apache 2.0
mp.openapi.extensions.smallrye.info.license.url=https://www.apache.org/licenses/LICENSE-2.0.html
What can be done in these cases?
The Swagger-UI is included by default only in dev mode.
To enable it in your application, you must set this parameter:
quarkus.swagger-ui.always-include=true
This parameter is set at build time, so you can't change it at deploy time; you must set it in your application.properties before building.
Reference
https://quarkus.io/guides/all-config#quarkus-swagger-ui_quarkus-swagger-ui-swagger-ui
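For reference, a minimal application.properties sketch combining the flag with the Swagger paths already present in the question:
# build-time flag: include Swagger UI in non-dev builds as well
quarkus.swagger-ui.always-include=true
quarkus.swagger-ui.path=/api/FindPukCodeBS/swagger-ui
quarkus.smallrye-openapi.path=/api/FindPukCodeBS/swagger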

fluentbit writes to /var/log/messages

I'm running fluentbit (td-agent-bit) on a CentOS system in order to ship all logs to a centralized system. Every time fluentbit pushes a record to the remote location, it also adds a record to /var/log/messages, leading to a huge log file size.
Jul 21 08:48:53 hostname td-agent-bit: [2020/07/21 08:48:53] [ info] [out_azure] customer_id=XXXXXXXXXXXXXXXXXXXXXXXX, HTTP status=200
Any idea how I can stop the service (td-agent-bit) from writing to /var/log/messages? I couldn't find any relevant configuration parameter (e.g. verbose) in the fluentbit documentation. Thanks!
Your log_level is "info" which includes a lot of messages of the pipeline. You can either decrease the log level inside the output section of the plugin to "error" only, e.g:
[OUTPUT]
name azure
match *
log_level error
Note: you can also decrease the general log_level in the main [SERVICE] section, as sketched below.
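A minimal sketch of that service-level setting (only the log_level key is relevant here; everything else in your [SERVICE] section stays as it is):
[SERVICE]
    # only log errors from Fluent Bit itself; suppresses the per-flush info lines
    log_level error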

Fluentd create a tag based on a field value

I have a Kubernetes cluster in which I'm trying to aggregate container logs on the nodes and send them to MongoDB. However, I need to be able to send the log records to different MongoDB servers based on values in the log record itself.
I'm using the fluent-plugin-kubernetes_metadata_filter plugin to attach additional information from Kubernetes to the log record. One of those fields is kubernetes_namespace_name. Is it possible to use that field to create a tag which I can use to match against the MongoDB output plugin?
For example: below I'm using only one output, but the idea is to have multiple outputs and let Fluentd send the logs to the matching MongoDB database based on the value of the field kubernetes_namespace_name:
<source>
@type tail
@label @KUBERNETES
path /var/log/containers/*.log
pos_file /var/log/es-containers.log.pos
time_format %Y-%m-%dT%H:%M:%S
tag kubernetes.*
format json
keep_time_key true
read_from_head true
</source>
<label @KUBERNETES>
<filter kubernetes.**>
@type kubernetes_metadata
kubernetes_url "#{ENV['K8S_HOST_URL']}"
bearer_token_file /var/run/secrets/kubernetes.io/serviceaccount/token
ca_file /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
include_namespace_id true
</filter>
<filter kubernetes.**>
@type flatten_hash
separator _
</filter>
# < Tag 'kubernetes.namespace.default' is created here somehow >
<match kubernetes.namespace.default>
@type mongo
host "#{ENV['MONGO_HOST']}"
port "#{ENV['MONGO_PORT']}"
database "#{ENV['MONGO_DATABASE']}"
collection "#{ENV['MONGO_COLLECTION']}"
capped
capped_size 1024m
user "#{ENV['MONGO_USER']}"
password "#{ENV['MONGO_PASSWORD']}"
time_key time
flush_interval 10s
</match>
</label>
Instead of using the tag, you can use the message content to do the filtering, using Fluentd's grep filter. You can add the filter after the kubernetes metadata filter and before the data flattener. This allows you to specify the key kubernetes_namespace_name and then route according to the value within. Since you may have additional MongoDB outputs, using labels can help separate the processing workflows; see the sketch after the example below.
Documentation: https://docs.fluentd.org/v0.12/articles/filter_grep
Example:
<filter kubernetes.**>
@type grep
<regexp>
key kubernetes_namespace_name
pattern cool
</regexp>
</filter>
<YOUR MONGO CONFIG HERE>
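If you do add more MongoDB destinations, one way to keep the workflows separate is to copy the stream into one label per namespace and put a grep filter plus a mongo match inside each label. A rough sketch, assuming the flattened kubernetes_namespace_name field; the @NAMESPACE_* label names and the second namespace are purely illustrative:
<match kubernetes.**>
  @type copy
  <store>
    @type relabel
    @label @NAMESPACE_DEFAULT
  </store>
  <store>
    @type relabel
    @label @NAMESPACE_OTHER
  </store>
</match>
<label @NAMESPACE_DEFAULT>
  <filter kubernetes.**>
    @type grep
    <regexp>
      key kubernetes_namespace_name
      pattern /^default$/
    </regexp>
  </filter>
  <match kubernetes.**>
    # mongo output for the "default" namespace goes here
  </match>
</label>
<label @NAMESPACE_OTHER>
  # same structure: grep on the other namespace, second mongo output here
</label>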

Howto add a connector to rest-service-component

In my Mule 2.X configuration I need to post a message as a URL parameter. Therefore I created a service that has an inbound endpoint and then tries to send the message using the rest-service-component, as follows:
<service name="myService">
<inbound>
<vm:inbound-endpoint path="path/inbound" synchronous="true" connector-ref="myVmConnector"/>
</inbound>
<http:rest-service-component serviceUrl="http://www.domain.com/path/insert.asp" httpMethod="POST">
<http:payloadParameterName value="data_xml"/>
</http:rest-service-component>
</service>
But when I process a message through it I receive the following information:
Message : There are at least 2 connectors matching protocol "http", so the connector to use must be specified on the endpoint using the 'connector' property/attribute (java.lang.IllegalStateException)
Type : org.mule.transport.service.TransportFactoryException
Code : MULE_ERROR--2
http://www.mulesource.org/docs/site/current2/apidocs/org/mule/transport/service/TransportFactoryException.html
Normally this error can occur when you have multiple HTTP connectors configured, and then you need to specify the connector on the endpoint (connector-ref). But the rest-service-component does not have such an attribute or child element (http://www.mulesoft.org/documentation-3.2/display/MULE2USER/HTTP+Transport).
This is a very old bug that remained unresolved:
https://www.mulesoft.org/jira/browse/MULE-4272
Try to use http:outbound-endpoint instead.
UPDATE:
Try using a Groovy script component in place of the rest component to create calls to dynamic URLs in Mule 2. Something like this might work:
eventContext.sendEvent(message,"http://www.domain.com/path/insert.asp?data_xml=${payload}")
Depending on your payload, you may also have to URL-encode it.
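As a rough sketch only (this assumes the Mule 2 scripting module and its script namespace are available; the exact element names should be checked against your Mule version), the service could look something like this, with the script component replacing the rest-service-component:
<service name="myService">
  <inbound>
    <vm:inbound-endpoint path="path/inbound" synchronous="true" connector-ref="myVmConnector"/>
  </inbound>
  <script:component>
    <script:script engine="groovy">
      // URL-encode the payload and pass it as the data_xml query parameter
      def encoded = URLEncoder.encode(payload.toString(), "UTF-8")
      return eventContext.sendEvent(message, "http://www.domain.com/path/insert.asp?data_xml=" + encoded)
    </script:script>
  </script:component>
</service>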