I am trying to get my Filebeat to deliver logs to multiple instances of Logstash. Filebeat can be pointed at multiple hosts, but the logs are then load-balanced between them.
I have tried setting up filebeat with:
output.logstash.hosts: ["IP1:5044", "IP2:5044"]
output.logstash.loadbalance: false
Which sends everything to IP1 until that node fails, and then switches to IP2.
(loadbalance: false is the default when there are multiple hosts.)
Also, I have tried:
output.logstash.hosts: ["IP1:5044", "IP2:5044"]
output.logstash.loadbalance: true
Which load-balances the logs: one log entry goes to IP1, and the next one goes to IP2.
I am aiming for completely redundant ELK pipelines, where Filebeat supplies both at the same time.
What is the question here? Filebeat always waits for an acknowledgement from Logstash, and either method will retry if it tries to send data to an unreachable instance.
The loadbalance attribute only changes whether you send data to a single host (and switch to another on failure) or distribute it amongst all of them.
You can only send each event to one Logstash instance; otherwise you would end up with duplicate events in Elasticsearch.
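For reference, here is the same configuration in nested filebeat.yml form, as a minimal sketch (the IPs and port are placeholders); loadbalance only controls how events are distributed, it does not duplicate them:
output.logstash:
  # Both Logstash endpoints; each event still goes to exactly one of them.
  hosts: ["IP1:5044", "IP2:5044"]
  # true  = distribute events across both hosts
  # false = send everything to one host and fail over to the other (the default)
  loadbalance: true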
I would like to create a cronjob via Knative that sends healthcheck messages to my Kafka topic every 10 minutes.
Then we will have a separate endpoint that receives these messages and passes a response to a receiver (the healthcheck).
I tried using KafkaBinding, which seems suitable, but I cannot apply this template (the TLS one): https://github.com/knative/docs/tree/main/code-samples/eventing/kafka/binding
I also find it odd that the regular KafkaBinding template uses apiVersion: bindings.knative.dev/v1beta1, while the one with TLS uses sources.knative.dev/v1beta1.
I haven't found much documentation on how to create a CronJob that sends messages to Kafka, which are then picked up by a KafkaSource and passed via a Broker and Trigger to my service on K8s.
Anyone got it implemented right? Or found another way for this?
The idea is to test the whole flow, including KafkaSource, which seems a little bit unstable.
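For reference, this is roughly the shape of what I am trying, based on the non-TLS sample linked above (names, the image and the bootstrap server are placeholders; the idea is that the binding selects the Jobs created by the CronJob by label and injects KAFKA_BOOTSTRAP_SERVERS into their pods):
apiVersion: bindings.knative.dev/v1beta1
kind: KafkaBinding
metadata:
  name: healthcheck-binding                 # placeholder name
spec:
  subject:
    apiVersion: batch/v1
    kind: Job
    selector:
      matchLabels:
        app: kafka-healthcheck
  bootstrapServers:
    - my-cluster-kafka-bootstrap.kafka:9092   # placeholder bootstrap server
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: kafka-healthcheck
spec:
  schedule: "*/10 * * * *"                  # every 10 minutes
  jobTemplate:
    metadata:
      labels:
        app: kafka-healthcheck              # so the binding picks up each Job it creates
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: producer
              image: my-kafka-producer:latest   # placeholder image that reads KAFKA_BOOTSTRAP_SERVERS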
We just migrated to a Kubernetes cluster, and I was wondering if it is possible to automatically send a Kafka event when a container/pod finishes, with its stdout as the message. Right now we are using fluentd with Elasticsearch, but the output of one pod is used as input for the next one, so we have to constantly poll Elasticsearch for when the output is ready, and that causes performance issues in the overall execution.
I'm not sure of your current setup but my first thought would jump to:
Use something such as fluentd or Logstash in its own pod per node
Configure volume access to Kubernetes log folder /var/log/containers/*
Use the Kafka output for either fluentd or Logstash, with a file input (tail) on the logging folder (a rough sketch follows below)
This approach would require the configuration above on each node, but it needs minimal configuration of logging locations, etc.
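An untested sketch of that collector config, shown here with Filebeat instead of fluentd/Logstash (it is a similarly lightweight shipper with a plain-YAML config; the broker address and topic are placeholders):
filebeat.inputs:
  - type: container                 # tail the node's container log files
    paths:
      - /var/log/containers/*.log
output.kafka:
  hosts: ["my-kafka-broker:9092"]   # placeholder broker address
  topic: "container-logs"           # placeholder topic name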
It's not something I've personally configured but have considered it for the future.
More info here
I would like to know if it is possible to send a notification, using YAML config, if a Kubernetes job fails.
For example, I have a Kubernetes job which runs once every day. Currently I run a Jenkins job to check it and send a notification if it fails. Is there any option to get a notification directly from the Kubernetes job when it fails? It should be something we can add in the job YAML.
I'm not aware of any built-in notification support. That seems like the kind of feature you would find in dedicated external monitoring/notification tools such as Prometheus, or in a Logstash output.
For example, you can try this tutorial to leverage the Prometheus metrics generated by default in many Kubernetes clusters: https://medium.com/@tristan_96324/prometheus-k8s-cronjob-alerts-94bee7b90511
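A minimal sketch of that approach, assuming the Prometheus Operator and kube-state-metrics are installed (the rule name, threshold and wait time are placeholders); Alertmanager would then deliver the actual notification (email, Slack, etc.):
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: job-failure-alerts            # placeholder name
spec:
  groups:
    - name: batch-jobs
      rules:
        - alert: KubernetesJobFailed
          # kube-state-metrics exposes kube_job_status_failed per job
          expr: kube_job_status_failed > 0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Job {{ $labels.job_name }} has failed"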
Or you could theoretically set up Logstash to monitor incoming logs sent by Filebeat and conditionally send alerts as part of the output stage of the pipeline via the email output plugin.
Other methods exist as well, as mentioned in this similar question: How to send alerts based on Kubernetes / Docker events?
For reference, you may also wish to read this request as discussed in github: https://github.com/kubernetes/kubernetes/issues/22207
Is there any documentation out there on sending logs from containers in K8s to an external ELK cluster running on EC2 instances?
We're in the process of getting Kubernetes set up and I'm trying to figure out how to get the logging to work correctly. We already have an ELK stack set up on EC2 for current versions of the application, but most of the documentation out there seems to refer to ELK as it's deployed inside the K8s cluster.
I am also working on the same cause.
First you should know what logging driver your Docker containers use to manage the logs (json-file driver, journald, etc. - read here).
After that you should use a log collector in your architecture to send the logs to the Logstash endpoint. You can use Filebeat or Fluent Bit; they are lightweight alternatives to Logstash and fluentd respectively. You should use one of them rather than sending your logs directly to Logstash via syslog, since these log shippers have the special ability to enrich your logs with the Kubernetes metadata of the respective containers.
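As a rough sketch of what I mean on the Filebeat side (the Logstash host is a placeholder; the processor settings follow the stock Filebeat-on-Kubernetes examples, and NODE_NAME would be set from the Downward API in the DaemonSet spec):
filebeat.inputs:
  - type: container
    paths:
      - /var/log/containers/*.log
processors:
  # Enrich each event with the pod, namespace and labels of the container it came from
  - add_kubernetes_metadata:
      host: ${NODE_NAME}
      matchers:
        - logs_path:
            logs_path: "/var/log/containers/"
output.logstash:
  hosts: ["my-elk-on-ec2.example.com:5044"]   # placeholder: external Logstash on EC2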
There might be a lot of challenges after that: parsing the log data (multiline logs, for example), and so on. For an efficient pipeline, it's better to do most of the work (e.g. extracting the date object from the logs) on the log-sender side rather than in the shared Logstash, which might become a bottleneck.
Note that if the container logs are not sent to stdout/stderr but written elsewhere, you might need to run Filebeat/Fluent Bit as a sidecar with your containers.
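A hypothetical sketch of that sidecar pattern (image names and paths are placeholders; the filebeat.yml itself would be mounted from a ConfigMap, omitted here):
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-sidecar
spec:
  volumes:
    - name: app-logs
      emptyDir: {}                  # shared between the app and the shipper
  containers:
    - name: app
      image: my-app:latest          # placeholder: app writing its logs to /var/log/app
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
    - name: filebeat-sidecar
      image: docker.elastic.co/beats/filebeat:8.12.2   # placeholder version
      args: ["-c", "/etc/filebeat.yml", "-e"]
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
          readOnly: true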
As far as documentation links are concerned, I didn't find anything documented in a single place on this, but reading about the keywords I mentioned above taught me a lot.
Hope this helps.
Before looking at Kubernetes, we write all our logs to stdout (according to the 12-factor app) and use logspout to collect the logs into Logstash. In Logstash we then route logs to different targets:
InfluxDB+Grafana: to monitor application metrics (e.g., how long a certain calculation takes)
Riemann: to alert if some performance thresholds are crossed
How these things can be done in Kubernetes?
I know that with Heapster you can see JVM-level graphs (memory usage, etc.), and maybe Heapster can even send events to Riemann to alert on some system-level statistics (e.g., the disk is full). But for stuff at the application level, what would be the right approach?
Heapster should be grabbing the stdout from the containers as well, and it can send the data to different backends (sinks). It would essentially be an API call with the data. Check out: https://github.com/kubernetes/heapster/blob/master/docs/sink-configuration.md
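As a rough, untested illustration of how sinks are added as flags on the Heapster container (the InfluxDB URL and image version are placeholders; the exact flag format is in the doc linked above, and the RBAC/ServiceAccount pieces are omitted):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: heapster
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: heapster
  template:
    metadata:
      labels:
        k8s-app: heapster
    spec:
      serviceAccountName: heapster
      containers:
        - name: heapster
          image: k8s.gcr.io/heapster-amd64:v1.5.4   # placeholder version
          command:
            - /heapster
            - --source=kubernetes:https://kubernetes.default
            - --sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086   # placeholder sink URL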
I'm not 100% sure that stdout is the only method for a 12-factor app, but we use an in-house logging lib that also streams stdout to our logging engine (Graylog). That happens inside the app, so the log messages are preserved as a full 'event', versus Heapster or other stdout scraping treating each line as an event.