Mirror data from a Kafka 0.8.2.1 cluster to a Kafka 2.2.0 cluster - apache-kafka

I want to use Apache Spark Structured Streaming with Kafka. Spark Structured Streaming supports Kafka 0.10 and above, but my Kafka cluster runs version 0.8.2.1. I want to replicate some of the topics from the current Kafka 0.8.2.1 cluster to a new Kafka cluster based on 2.2.0.
To do this I tried using kafka-console-consumer on the Kafka 2.2.0 cluster to listen for messages from the 0.8.2.1 cluster, and piped the output of kafka-console-consumer to kafka-console-producer on the 2.2.0 cluster. But that didn't work: the kafka-console-consumer on the Kafka 2.2.0 cluster was not able to receive any messages.

As of now I have solved this problem by reading the data from the Kafka 0.8.2.1 cluster using the Java client APIs and writing the data read from the older cluster (0.8.2.1) to the newer cluster (2.2.0), also using the client APIs.
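A minimal sketch of that approach, assuming the 0.8.2.1 client library is on the classpath (its high-level consumer reads the old cluster via ZooKeeper, and its producer can write to the 2.2.0 brokers, since newer brokers still accept older clients); the addresses, group id, and topic name below are placeholders:

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;
    import kafka.message.MessageAndMetadata;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class TopicMirror {
        public static void main(String[] args) {
            // Old 0.8.2.1 cluster: the high-level consumer bootstraps via ZooKeeper.
            Properties cons = new Properties();
            cons.put("zookeeper.connect", "old-zk:2181");  // placeholder
            cons.put("group.id", "topic-mirror");
            cons.put("auto.offset.reset", "smallest");     // copy from the earliest offset
            ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(cons));
            Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                connector.createMessageStreams(Collections.singletonMap("my-topic", 1));
            KafkaStream<byte[], byte[]> stream = streams.get("my-topic").get(0);

            // New 2.2.0 cluster: the 0.8.2.1 producer speaks an older protocol
            // version that the newer brokers still understand.
            Properties prod = new Properties();
            prod.put("bootstrap.servers", "new-broker:9092");  // placeholder
            prod.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
            prod.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
            KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(prod);

            // Copy every message, preserving key and value as raw bytes.
            ConsumerIterator<byte[], byte[]> it = stream.iterator();
            while (it.hasNext()) {
                MessageAndMetadata<byte[], byte[]> msg = it.next();
                producer.send(new ProducerRecord<>("my-topic", msg.key(), msg.message()));
            }
        }
    }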
Can anyone suggest some better ways to mirror two kafka clusters running different versions of Kafka?

Related

How to make a Data Pipeline from MQTT to KAFKA Broker to MongoDB?

How can I make a data pipeline? I am sending data from MQTT to a Kafka topic using a source connector, and on the other side I have connected the Kafka broker to MongoDB using a sink connector. I am having trouble making a data pipeline that goes from MQTT to Kafka and then to MongoDB. Both connectors work properly individually. How can I integrate them?
(Screenshots from the original post: the MQTT source connector config on node 1, a message published from MQTT, the Kafka consumer output, and the MongoDB sink connector config on node 2.)
It is hard to tell what exactly the problem is without more logs. Please provide your Connect config as well, and check the /status endpoint of your connectors. I still do not understand exactly what issue you are facing: you are saying that the MQTT source connector sends messages successfully to the Kafka topic, and your MongoDB sink connector successfully reads this Kafka topic and writes to your MongoDB, hence your pipeline. Where is the error? Is your Kafka the same Kafka, or two separate Kafka clusters? Both look like localhost, but is it the same machine?
Please elaborate and explain what you are expecting. What does "pipeline" mean in your words?
You need both connectors to share the same Kafka cluster. What do node 1 and node 2 mean, are they separate Kafka instances? Your connectors need to connect to the same Kafka node/cluster in order to share the data inside the Kafka topic, one for input and one for output. Share your bootstrap server parameters and the server.properties of your Kafka as well.
In order to run two different Connect clusters inside the same Kafka, you need to set different internal topics for each Connect cluster (see the example worker configs after this list):
config.storage.topic
offset.storage.topic
status.storage.topic
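For illustration, the worker properties of two distributed Connect clusters sharing one Kafka might diverge like this (topic names and group ids are made up; note that each distributed Connect cluster also needs its own group.id):

    # worker properties, Connect cluster A (illustrative values)
    group.id=connect-cluster-a
    config.storage.topic=connect-a-configs
    offset.storage.topic=connect-a-offsets
    status.storage.topic=connect-a-status

    # worker properties, Connect cluster B
    group.id=connect-cluster-b
    config.storage.topic=connect-b-configs
    offset.storage.topic=connect-b-offsets
    status.storage.topic=connect-b-status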

Apache Kafka spout is not working on Consumer Side

I am trying to integrate MongoDB and Storm-Kafka. The Kafka producer produces data from MongoDB, but fetching fails on the consumer side.
Kafka version: 0.10.*
Storm version: 1.2.1
Do I need to add any functionality in the consumer?
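For reference, a minimal consumer side with the storm-kafka-client spout on Storm 1.2.x looks roughly like this (a sketch only; the broker address, topic, and group id are placeholders, not the asker's setup):

    import org.apache.storm.kafka.spout.KafkaSpout;
    import org.apache.storm.kafka.spout.KafkaSpoutConfig;
    import org.apache.storm.topology.TopologyBuilder;

    public class SpoutWiring {
        public static void main(String[] args) {
            // Spout config pointing at the Kafka 0.10+ brokers (placeholder address).
            KafkaSpoutConfig<String, String> config =
                KafkaSpoutConfig.builder("localhost:9092", "my-topic")
                    .setGroupId("storm-consumer-group")  // consumer group used by the spout
                    .build();

            TopologyBuilder builder = new TopologyBuilder();
            // The spout is the "consumer side": it polls Kafka and emits tuples.
            builder.setSpout("kafka-spout", new KafkaSpout<>(config), 1);
            // Downstream bolts subscribe to "kafka-spout" here.
        }
    }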

Kafka producer API vs Apache Storm's KafkaBolt

What are the advantages of using Apache Storm's KafkaBolt in Apache Storm 1.2.2 instead of using the Kafka producer APIs directly from a bolt in the topology to publish to downstream Kafka topics?
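For context, this is roughly what wiring a KafkaBolt looks like (a sketch following the storm-kafka docs pattern; the broker address, topic, and component ids are placeholders). The bolt owns the producer lifecycle, maps tuple fields to Kafka records, and acks or fails tuples based on the send result, which is the main plumbing you would otherwise write yourself:

    import java.util.Properties;
    import org.apache.storm.kafka.bolt.KafkaBolt;
    import org.apache.storm.kafka.bolt.mapper.FieldNameBasedTupleToKafkaMapper;
    import org.apache.storm.kafka.bolt.selector.DefaultTopicSelector;
    import org.apache.storm.topology.TopologyBuilder;

    public class BoltWiring {
        public static void main(String[] args) {
            // Standard Kafka producer properties passed through to the bolt.
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");  // placeholder
            props.put("acks", "1");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            // Maps the tuple's "key"/"message" fields to the Kafka record
            // and publishes to a fixed topic.
            KafkaBolt<String, String> bolt = new KafkaBolt<String, String>()
                .withProducerProperties(props)
                .withTopicSelector(new DefaultTopicSelector("downstream-topic"))
                .withTupleToKafkaMapper(new FieldNameBasedTupleToKafkaMapper<String, String>());

            TopologyBuilder builder = new TopologyBuilder();
            // Attach the bolt downstream of whatever component produces the tuples.
            builder.setBolt("to-kafka", bolt, 1).shuffleGrouping("upstream-bolt");
        }
    }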

Can we use ZooKeeper to store offsets in Kafka with Apache MetaModel

The Apache MetaModel Kafka consumer is not working with ZooKeeper offset storage.
I am using Apache MetaModel 5.1 and Kafka version 0.10.2.1. I am facing an issue with the Kafka consumer (MetaModel's internal consumer): it is not consuming any messages from the topic.
In my test environment the Kafka offset storage is ZooKeeper. When I tried changing the offset storage to KAFKA (on a different environment), the consumer worked fine.
As of now I don't want to change the offset storage to Kafka, so is there any other way to fix this issue on the Apache MetaModel Kafka consumer side?

Confluent 5.0.0 kafka connect hdfs sink: Cannot describe the kafka connect consumer group lag after upgrade

We upgraded from Confluent 4.0.0 to 5.0.0. After upgrading, we cannot list the Kafka Connect HDFS sink connector's consumer lag.
$ /opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server <hostname>:9092 --list | grep scribe_log_backend
Note: This will not show information about old Zookeeper-based consumers.
connect-kconn_scribe_log_backend
$ /opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server <hostname>:9092 --group connect-kconn_scribe_log_backend --describe
Note: This will not show information about old Zookeeper-based consumers.
$
Were there any modifications to the consumer group command in Kafka 2.0 / Confluent 5.0.0? How do I track the lag? We need to alert based on it.
Our brokers run Kafka version 1.1.0.
Also, the Connect consumer group is no longer visible in Kafka Manager after the upgrade.
There is no issue with Kafka Connect itself, as the connector is able to write to HDFS.
Thanks.
The Kafka Connect HDFS connector no longer commits consumer offsets, so there is nothing to base a lag calculation on.
PS.
Recovery is based on the file naming convention in HDFS; the file names carry the partition and offset info.
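For illustration, the HDFS connector's data files follow (if I recall the convention correctly) a <topic>+<partition>+<startOffset>+<endOffset>.<format> naming pattern, e.g. a hypothetical mytopic+0+0000000000+0000000999.avro, which is what recovery parses to resume from the right offset.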