Kafka Streams vs Logstash [closed] - apache-kafka

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 2 years ago.
I am currently choosing between Kafka Streams and Logstash for real-time log collection, transformation, and enrichment, with the results finally sent to Elasticsearch. The logs come from different IT network devices such as firewalls, switches, access points, etc.
Since Kafka Streams and Logstash have similar functionality, are there benefits to choosing one over the other (performance, ease of deployment)?
Thanks

Kafka Streams and Logstash are two completely different things.
Kafka Streams is a client library that you use to write an application that streams and processes data stored in Kafka brokers; you need to write your own application in Java.
Logstash is an ETL tool that you can use to extract/receive data from multiple sources, process this data using a wide range of filters, and send it to different outputs, such as Elasticsearch, file, S3, Kafka, and many others.
It is very common to use Logstash and Kafka together, with Kafka working as a message queue for the messages that Logstash will consume and process: shippers like Filebeat send data to the Kafka brokers, and Logstash then consumes this data.
You can build your own applications in Java using the Kafka Streams library to collect, process, and ship the data to Elasticsearch, but this will be very complex compared with using the tools of the stack: Filebeat to collect logs, Logstash to receive and process them, and Elasticsearch to store them.

Related

Spring batch vs Kafka Streams [closed]

Closed 19 days ago.
I have to implement a solution that consists of processing a large amount of data by applying business requirement rules. The input and the output will be a file.
I haven't used Kafka before. I am wondering whether I can use Kafka Streams to process these rules, or use Spring Batch combined with Kafka Streams.
Are there any other frameworks/technologies that can be used in Java?
Thank you
Kafka Streams is a stream processing solution; what you're talking about is more of a batch workload. The difficulties you will encounter using KStreams are:
Kafka Streams doesn't have a good way of working with files as input and output.
In Stream Processing, there's no real concept of "beginning" and "end," whereas I gather from the nature of your question that you do indeed have a beginning and end in your use-case.
As such, I would recommend a batch solution instead.
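For a sense of what a plain batch job for this shape of problem could look like, here is a minimal sketch using only the Java standard library: read the input file line by line, apply the rules, and write the output file. The class name and both rules (skip blank lines and lines starting with `#`, then trim and upper-case) are hypothetical placeholders for real business rules. A framework such as Spring Batch adds restartability, chunking, and job management on top of this basic read-process-write loop.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class BatchRulesSketch {

    // Hypothetical filter rule: drop blank lines and lines starting with '#'.
    public static boolean keep(String line) {
        return !line.isBlank() && !line.startsWith("#");
    }

    // Hypothetical transformation rule: trim whitespace and upper-case the record.
    public static String transform(String line) {
        return line.trim().toUpperCase();
    }

    // Streams the input file to the output file one line at a time, so the
    // whole file never has to fit in memory. Returns the number of records written.
    public static long run(Path input, Path output) throws IOException {
        long written = 0;
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!keep(line)) {
                    continue;
                }
                writer.write(transform(line));
                writer.newLine();
                written++;
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: BatchRulesSketch <input-file> <output-file>");
            return;
        }
        long written = run(Path.of(args[0]), Path.of(args[1]));
        System.out.println("records written: " + written);
    }
}
```

Unlike a stream processor, this job has a clear beginning (open the input) and end (end of file), which is exactly the property the answer points out.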

Load testing kafka with multiple producers and multiple consumers [closed]

Closed 6 months ago.
I am trying to load test my Kafka cluster with multiple producers and multiple consumers. I came across a lot of available tools and articles, but all of them generate load (producer) from a single machine and similarly read (consumer) from a single machine.
I am looking for a tool that can be deployed across, or can spawn, multiple producers and consumers to load test a given Kafka cluster.
As input, we can give the number of producers and consumers.
It can then spawn that number of machines with producers and consumers (on AWS, Azure, or GCP). Or we can spawn the machines manually, and the tool can then start a producer or consumer on each of them.
After that, it load tests the target Kafka cluster.
At the end, it reports test results such as writes/sec, reads/sec, etc.
Tools/Articles I checked are:
https://www.blazemeter.com/blog/apache-kafka-how-to-load-test-with-jmeter
https://medium.com/selectstarfromweb/lets-load-test-kafka-f90b71758afb
Load testing with Kafka and Storm
Load test kafka consumer
Load testing a kafka consumer
The very first article neither mentions nor assumes any limitation on the number of consumers/producers.
Just put the Samplers for different Kafka instances (or different topics, or whatever your test scenario is) under different JMeter Thread Groups and you will be able to stress multiple endpoints concurrently.
If you prefer to do it from different machines, you can run JMeter in distributed mode and point different JMeter slave machines at different endpoints using a combination of the If Controller and the __machineName() or __machineIP() functions.

Which Messaging System to be used? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 1 year ago.
I would like to transfer data from one DB system to other DB systems. Which messaging system (Kafka, ActiveMQ, RabbitMQ, and the like) would be better for achieving this with high throughput and performance?
I guess the answer to this type of question is "it depends".
You can probably find plenty of information on the internet comparing these message brokers.
From our experience, Kafka and its ecosystem tools such as Kafka Connect provide the behavior you are asking for: source connectors and sink connectors, with Kafka in the middle.
Kafka Connect is a framework that lets you add plugins called connectors:
Sink connectors read from Kafka and send that data to a target system.
Source connectors read from a source store and write to Kafka.
Using Kafka Connect is "no code": you call a REST API to set the configuration of the connectors.
Kafka is a distributed system that supports very high throughput with low latency. It supports near-real-time streaming of data.
Kafka is widely adopted by some of the biggest companies around the world.
There are many tools and vendors that support your use case. They vary in price and support, and the right choice depends on which sources you need to read from, which targets you need to write to, and whether the copy should be CDC/near real time or "batch".
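As a rough illustration of the "no code" point above, registering a connector amounts to POSTing a JSON document to the Kafka Connect REST API (`POST /connectors`). The sketch below builds such a document in Java and, if a Connect URL is passed as an argument, sends it. The connector name, the database URL, the column name, and the choice of the Confluent JDBC source connector (which has to be installed separately as a plugin) are assumptions for illustration, not details from the question.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class ConnectorRegistrationSketch {

    // Minimal JSON string quoting (backslashes and double quotes only);
    // enough for this sketch, not a general-purpose JSON encoder.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds the request body expected by POST /connectors:
    // {"name": "...", "config": {"key": "value", ...}}
    public static String connectorJson(String name, Map<String, String> config) {
        String entries = config.entrySet().stream()
                .map(e -> quote(e.getKey()) + ":" + quote(e.getValue()))
                .collect(Collectors.joining(","));
        return "{\"name\":" + quote(name) + ",\"config\":{" + entries + "}}";
    }

    // Assumed example: a JDBC source connector copying rows into Kafka topics.
    // Host, database, and column names are hypothetical.
    public static Map<String, String> jdbcSourceConfig() {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("connector.class", "io.confluent.connect.jdbc.JdbcSourceConnector");
        config.put("connection.url", "jdbc:postgresql://source-db:5432/appdb");
        config.put("mode", "incrementing");
        config.put("incrementing.column.name", "id");
        config.put("topic.prefix", "appdb-");
        return config;
    }

    public static void main(String[] args) throws Exception {
        String json = connectorJson("appdb-source", jdbcSourceConfig());
        System.out.println(json);
        if (args.length == 1) {
            // args[0] is the Connect worker base URL, e.g. http://localhost:8083
            HttpRequest request = HttpRequest.newBuilder(URI.create(args[0] + "/connectors"))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(json))
                    .build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        }
    }
}
```

A sink connector for the target database would be registered the same way with a different `connector.class` and its own settings, so moving data from one store to another needs configuration only.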

Apache Kafka Streams API vs KSQL [closed]

Closed 1 year ago.
I want to learn about the differences between the two approaches. I developed a project that aggregates some data using the Apache Kafka Streams API. After that, I came across some solutions written with KSQL.
I have no experience with KSQL, so I would like to learn when to choose each approach for aggregating data. Could I use KSQL instead of Kafka Streams?
There's a blog post somewhere that talks about the "Kafka abstraction funnel".
KSQL doesn't provide as much flexibility as Kafka Streams, which in turn abstracts many details of the core consumer/producer API.
If you have people who are more familiar with SQL and less comfortable with other client libraries, you'd use KSQL. If you run into a feature not supported by KSQL (think custom, complex data types), or need to embed streaming logic into a larger application without remotely querying the ksqlDB REST API, use Kafka Streams.

Does kafka have any default web UI [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 3 years ago.
I have a couple of questions about Kafka.
1) Does Kafka have a default web UI?
2) How can we gracefully shut down a standalone Kafka server and the Kafka console-consumer/console-producer?
Any solutions will be highly appreciated.
Thank you.
1) No, Kafka does not have a default UI.
There are, however, a number of third-party tools that can graphically display Kafka resources. Just search for "kafka ui" and pick the tool that displays what you want and that you like the most.
2) To gracefully shut down a Kafka broker, just send a SIGTERM to the Kafka process and it will shut down properly. This can be done via the ./bin/kafka-server-stop.sh tool.
If the broker is part of a cluster, new leaders will be elected on other brokers; otherwise it will simply close all its resources cleanly. Note that depending on the number of partitions, this can take a few minutes.
You can try Landoop Kafka UI: https://github.com/Landoop/fast-data-dev
They provide a nice Web-UI for Kafka topics, Avro schemata, Kafka Connect and much more.