What is the difference between the KafkaSpout and KafkaBolt objects? KafkaSpout is usually used for reading data from Kafka producers, but why do we use KafkaBolt?
A KafkaBolt writes data to Kafka; a KafkaSpout is a Kafka consumer. The spout reads from the brokers directly, not from producers.
For example, you can use a spout to read anything, transform that data within the topology, and then set up a KafkaBolt to produce the results into Kafka.
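As a rough sketch of the producing side: in the storm-kafka-client module, a KafkaBolt is handed ordinary Kafka producer settings (through `withProducerProperties`), together with a topic selector and a tuple-to-Kafka mapper. The snippet below only builds those producer settings with plain `java.util.Properties`, so it runs without Storm or Kafka on the classpath. The class name, the broker address and the `acks` value are assumptions chosen for illustration.

```java
import java.util.Properties;

// Hypothetical helper: builds producer settings of the kind a KafkaBolt is configured with.
final class KafkaBoltProducerProps {

    // bootstrapServers is a placeholder such as "localhost:9092".
    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        // Standard Kafka producer configuration keys.
        props.put("bootstrap.servers", bootstrapServers);
        props.put("acks", "1"); // assumed value; tune for your durability needs
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        // Print the settings that would be passed on to the bolt.
        build("localhost:9092").forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

In a real topology, the spout would be registered with `setSpout`, the transforming bolt and the KafkaBolt with `setBolt`, and these properties would be given to the KafkaBolt.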
Related
Is it possible to send a message from a Java producer into a Kafka topic and consume the same message from the same topic through a Python consumer?
I'm only able to produce and consume data from Python, but I want to produce data from Java into a Kafka topic and consume it through a Python consumer.
Yes. A producer written in any language can send a message to a Kafka topic, and a consumer written in any language can read that same message from the same topic. In your case, those are Java and Python respectively.
I want to produce data from Java
That's exactly what the built-in bin/kafka-console-producer script does: it is a shell command wrapped around Java code.
If you can consume those records with your existing Python consumer, you already have what you are asking for.
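The reason this works across languages is that Kafka stores record keys and values as plain byte arrays, so both sides only need to agree on the encoding. Below is a minimal sketch of the Java side of that agreement, assuming the payload is a UTF-8 string (which is what Kafka's `StringSerializer` produces by default). The class name and the sample payload are made up for illustration, and no Kafka client is involved.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper showing the byte-level contract between producer and consumer.
final class CrossLanguagePayload {

    // What a string serializer on the Java producer side effectively does.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // The same decoding a consumer in any language would apply to the bytes.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] onTheWire = encode("{\"id\": 1, \"name\": \"caf\u00e9\"}");
        System.out.println(onTheWire.length + " bytes");
        System.out.println(decode(onTheWire));
    }
}
```

On the Python side, the equivalent step would be decoding the received bytes as UTF-8 (for example `value.decode("utf-8")`).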
I am trying to integrate MongoDB with Storm-Kafka. The Kafka producer publishes data from MongoDB, but the consumer side fails to fetch it.
Kafka version: 0.10.*
Storm version: 1.2.1
Do I need to add any functionality to the consumer?
What are the advantages of using Apache Storm's KafkaBolt in Apache Storm 1.2.2 instead of using the Kafka producer APIs directly from a bolt in the topology to publish to downstream Kafka topics?
Where are Kafka partitions and their corresponding offsets stored while consuming messages from Kafka using Apache Storm Trident? I could find something in Storm's ZooKeeper under ls /transactional/<StreamName>/coordinator/meta, but I am unable to understand what these offsets are and which partitions they belong to. How can I check consumer lag while running a Trident topology?
Both the "Kafka Spout" and the "Kafka Consumer" retrieve data from the Kafka brokers. As far as I know, the spout is for communicating with Storm, and the consumer is for everything else.
- But still, what is the difference technically?
- What would be the difference between pulling the data out with a consumer and then receiving it in a "Storm Spout", versus just using a "Kafka Spout" and adding it to my Storm TopologyBuilder via setSpout()?
- And when should I use a consumer, and when a Kafka spout?
The "Kafka Spout" is a Storm-specific adapter for reading data from Kafka into a Storm topology. Behind the scenes, the Kafka spout actually uses Kafka's built-in "Kafka consumer" client.
Technically, the difference is that the Kafka spout is a kind of Storm-aware "wrapper" on top of Kafka's consumer client.
In Storm, you should almost always use the included Kafka spout (see https://github.com/apache/storm/tree/master/external/storm-kafka or, for a spout implementation that uses Kafka's so-called "new" consumer client, https://github.com/apache/storm/tree/master/external/storm-kafka-client). Implementing your own would be a very rare case; perhaps the most likely reason would be a bug in the existing Kafka spout that you need to work around until the Storm project fixes it upstream.
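To make the "wrapper" idea more concrete, here is a toy model in plain Java. It is not the Storm or Kafka API: `ToyConsumer` and `ToySpout` are hypothetical stand-ins, and the single-partition log is an assumption for brevity. It illustrates the kind of bookkeeping a Storm-aware wrapper adds on top of a consumer: records are emitted as tuples, and an offset is committed only once every tuple up to it has been acked.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these types are Storm or Kafka classes.
final class ToySpoutModel {

    // Stand-in for a Kafka consumer reading a single partition.
    static final class ToyConsumer {
        private final List<String> log;
        private int position = 0;   // next offset to hand out
        private long committed = 0; // offset to resume from after a restart

        ToyConsumer(List<String> log) { this.log = log; }

        // Returns the offset of the next record, or -1 when the log is exhausted.
        long poll() { return position < log.size() ? position++ : -1; }

        String valueAt(long offset) { return log.get((int) offset); }
        long committed() { return committed; }
        void commit(long nextOffset) { committed = nextOffset; }
    }

    // Stand-in for the spout: wraps the consumer and adds ack tracking.
    static final class ToySpout {
        private final ToyConsumer consumer;
        private final Set<Long> acked = new HashSet<>();
        private final List<String> emitted = new ArrayList<>();

        ToySpout(ToyConsumer consumer) { this.consumer = consumer; }

        // The framework calls this repeatedly: pull one record and emit it.
        long nextTuple() {
            long offset = consumer.poll();
            if (offset >= 0) emitted.add(consumer.valueAt(offset));
            return offset;
        }

        // The framework calls this once a tuple is fully processed downstream.
        void ack(long offset) {
            acked.add(offset);
            long next = consumer.committed();
            // Advance only over the contiguous run of acked offsets.
            while (acked.remove(next)) next++;
            consumer.commit(next);
        }

        List<String> emitted() { return emitted; }
    }

    public static void main(String[] args) {
        ToyConsumer consumer = new ToyConsumer(Arrays.asList("a", "b", "c"));
        ToySpout spout = new ToySpout(consumer);
        spout.nextTuple();
        spout.nextTuple();
        spout.nextTuple();
        spout.ack(1); // offset 0 is still pending, so committed stays 0
        System.out.println(consumer.committed());
        spout.ack(0); // offsets 0 and 1 are both acked, so committed becomes 2
        System.out.println(consumer.committed());
    }
}
```

With a bare consumer you would have to write this ack and commit logic yourself inside a custom spout, which is the main reason to prefer the included one.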