Streaming data from Postgres with Debezium to Azure Event Hubs - debezium

I have a Postgres DB and have created an Event Hubs namespace on Azure.
I need to create a docker-compose file with Debezium settings that will capture DB data changes, transform messages and topics, and send them to Azure Event Hubs.
Which Docker images should I use for this flow? Debezium Server, Kafka Connect...?
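One common setup (a sketch only, not the only option) is to run Debezium Server, which reads the Postgres WAL and writes straight to Event Hubs through its Kafka-compatible endpoint, so no separate Kafka or Kafka Connect cluster is needed. Below is a rough docker-compose sketch assuming the quay.io/debezium/server image, a namespace called my-namespace, and a connection string supplied via the EH_CONNECTION_STRING environment variable; all names, versions, and credentials are placeholders.
version: "3.8"
services:
  postgres:
    image: postgres:15
    command: postgres -c wal_level=logical    # logical decoding is required by Debezium
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: inventory
    ports:
      - "5432:5432"
  debezium-server:
    image: quay.io/debezium/server:2.5
    depends_on:
      - postgres
    volumes:
      - debezium-data:/debezium/data
    environment:
      # source: PostgreSQL connector
      DEBEZIUM_SOURCE_CONNECTOR_CLASS: io.debezium.connector.postgresql.PostgresConnector
      DEBEZIUM_SOURCE_DATABASE_HOSTNAME: postgres
      DEBEZIUM_SOURCE_DATABASE_PORT: "5432"
      DEBEZIUM_SOURCE_DATABASE_USER: postgres
      DEBEZIUM_SOURCE_DATABASE_PASSWORD: postgres
      DEBEZIUM_SOURCE_DATABASE_DBNAME: inventory
      DEBEZIUM_SOURCE_PLUGIN_NAME: pgoutput
      DEBEZIUM_SOURCE_TOPIC_PREFIX: myserver
      DEBEZIUM_SOURCE_OFFSET_STORAGE_FILE_FILENAME: /debezium/data/offsets.dat
      DEBEZIUM_SOURCE_OFFSET_FLUSH_INTERVAL_MS: "10000"
      # sink: Event Hubs through its Kafka-compatible endpoint (port 9093)
      DEBEZIUM_SINK_TYPE: kafka
      DEBEZIUM_SINK_KAFKA_PRODUCER_BOOTSTRAP_SERVERS: my-namespace.servicebus.windows.net:9093
      DEBEZIUM_SINK_KAFKA_PRODUCER_SECURITY_PROTOCOL: SASL_SSL
      DEBEZIUM_SINK_KAFKA_PRODUCER_SASL_MECHANISM: PLAIN
      DEBEZIUM_SINK_KAFKA_PRODUCER_SASL_JAAS_CONFIG: org.apache.kafka.common.security.plain.PlainLoginModule required username="$$ConnectionString" password="${EH_CONNECTION_STRING}";
      DEBEZIUM_SINK_KAFKA_PRODUCER_KEY_SERIALIZER: org.apache.kafka.common.serialization.StringSerializer
      DEBEZIUM_SINK_KAFKA_PRODUCER_VALUE_SERIALIZER: org.apache.kafka.common.serialization.StringSerializer
volumes:
  debezium-data:
Debezium Server maps environment variables such as DEBEZIUM_SOURCE_DATABASE_HOSTNAME onto properties like debezium.source.database.hostname; if that mapping gives you trouble, the same keys can go into a mounted conf/application.properties instead. Message and topic transforms can be added through the debezium.transforms.* properties, and running Kafka Connect (the debezium/connect image) against the Event Hubs Kafka endpoint is an equally valid alternative if you prefer managing connectors over the REST API.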

Related

MongoDB Kafka connector is running but the data is not getting published to the sink cluster

I was using the MongoDB Kafka connector on Confluent Cloud, with both the source and sink clusters in MongoDB. Although the connectors on Confluent Cloud are running, and the source connector shows a spike and a count of messages processed each time data is inserted into the source cluster, the data is not getting published to the sink cluster (the source and sink clusters belong to two different MongoDB accounts). Can somebody tell me why it is not able to transmit the data?
As both connectors are connected successfully and are up and running, I was expecting that data added to the MongoDB source cluster would be reflected in the sink cluster.

Schema Registry URL for IIDR CDC Kafka subscription

I have created an Amazon MSK cluster. I have also created an EC2 instance and installed Kafka on it to create a topic in Amazon MSK. I am able to produce/consume messages on the topic using the Kafka scripts.
I have also installed the IIDR Replication agent on an EC2 instance. The plan is to migrate DB2 table data into the Amazon MSK topic.
In the IIDR Management Console, I am able to add the IIDR replication server as the target.
Now, when creating the subscription, it asks for a ZooKeeper URL and a Schema Registry URL. I can get the ZooKeeper endpoints from Amazon MSK.
What value should I provide for the Schema Registry URL, as there's none created?
Thanks for your help.
If you do not need to specify a schema registry (because, say, you are using a KCOP that generates JSON), just put in a dummy value. Equally, if you are specifying a list of Kafka brokers in the kafkaconsumer.properties and kafkaproducer.properties files in the CDC instance.conf directory, you can put in dummy values for the ZooKeeper fields.
Hope this helps
Robert
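For example, a minimal kafkaproducer.properties in the CDC instance.conf directory might look like the sketch below; the broker addresses are placeholders for your MSK bootstrap brokers, the same bootstrap.servers line goes into kafkaconsumer.properties, and the ZooKeeper and Schema Registry fields in the subscription dialog can then be filled with dummy values as described above.
# kafkaproducer.properties (placeholder MSK bootstrap brokers)
bootstrap.servers=b-1.my-msk-cluster.abc123.c2.kafka.us-east-1.amazonaws.com:9092,b-2.my-msk-cluster.abc123.c2.kafka.us-east-1.amazonaws.com:9092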

Stream both schema and data changes from MySQL to MySQL using Kafka Connect

How can we stream schema and data changes, along with some kind of transformations, into another MySQL instance using a Kafka Connect source connector?
Is there also a way to propagate schema changes if I use Kafka's Python library (confluent_kafka) to consume and transform messages before loading them into the target DB?
You can use Debezium to stream MySQL binlogs into Kafka. Debezium is built upon the Kafka Connect framework.
From there, you can use whatever client you want, including Python, to consume and transform the data.
If you want to write to MySQL, you can use the Kafka Connect JDBC sink connector.
Here is an old post on this topic - https://debezium.io/blog/2017/09/25/streaming-to-another-database/
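As a rough illustration of that pipeline (all names, hosts, and credentials below are placeholders, and the properties follow the general pattern of the linked post rather than being copied from it), the Debezium MySQL source and the JDBC sink could be registered with configurations along these lines.
Source connector (Debezium MySQL):
{
  "name": "mysql-source",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "source-mysql",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "dbz",
    "database.server.id": "184054",
    "database.server.name": "dbserver1",
    "database.include.list": "inventory",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "schema-changes.inventory"
  }
}
Sink connector (Kafka Connect JDBC sink):
{
  "name": "mysql-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "topics": "dbserver1.inventory.customers",
    "connection.url": "jdbc:mysql://target-mysql:3306/inventory?user=target&password=target",
    "transforms": "unwrap",
    "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
    "auto.create": "true",
    "auto.evolve": "true",
    "insert.mode": "upsert",
    "pk.mode": "record_key"
  }
}
Note that, as far as I know, the JDBC sink only propagates the schema changes it can express through auto.create/auto.evolve (new tables and added columns); drops and renames do not flow through automatically, and the ExtractNewRecordState transform is needed to flatten Debezium's change-event envelope before the sink writes rows.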

Debezium Embedded Engine with AWS Kinesis - PostgreSQL snapshot load and Transaction metadata stream

I'd like to use the Debezium Embedded Engine with AWS Kinesis in order to load an initial snapshot of a PostgreSQL database and then continuously perform CDC.
I know that with Kafka Connect I'll have the transaction metadata topic out of the box in order to check transaction boundaries.
How about the same with the Debezium Embedded Engine and AWS Kinesis (https://debezium.io/blog/2018/08/30/streaming-mysql-data-changes-into-kinesis/)? Will I have a Kinesis transaction metadata stream in this case? Also, will the Debezium Embedded Engine perform an initial snapshot of the existing PostgreSQL data?
UPDATED
I implemented a test EmbeddedEngine application with PostgreSQL:
engine = EmbeddedEngine.create()
        .using(config)
        .using(this.getClass().getClassLoader())
        .using(Clock.SYSTEM)
        .notifying(this::sendRecord)
        .build();
Right now, inside my sendRecord(SourceRecord record) method, I can see the correct topics for each database table that participates in the transaction, for example:
private void sendRecord(SourceRecord record) {
    String streamName = streamNameMapper(record.topic());
    System.out.println("streamName: " + streamName);
which results in the following output:
streamName: kinesis.public.user_states
streamName: kinesis.public.tasks
all within the same txId=1510,
but I still can't see a transaction metadata stream.
How do I correctly get the transaction metadata stream with the Debezium EmbeddedEngine?
If you are not set on using just the Debezium Embedded Engine, then there is an option provided by Debezium itself called Debezium Server (internally, I believe it makes use of the Debezium Engine).
It is a good alternative to using Kafka, and as of now it supports Kinesis, Google Pub/Sub, and Apache Pulsar as sinks for CDC.
Here is an article that you can refer to:
https://xyzcoder.github.io/2021/02/19/cdc-using-debezium-server-mysql-kinesis.html
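Back on the embedded-engine side, the piece that is usually missing is the connector option provide.transaction.metadata, which is off by default. A sketch of the configuration, assuming a PostgreSQL connector set up roughly like the one in the question (all hostnames, credentials, and paths are placeholders):
Configuration config = Configuration.create()
        .with("name", "kinesis")
        .with("connector.class", "io.debezium.connector.postgresql.PostgresConnector")
        .with("offset.storage", "org.apache.kafka.connect.storage.FileOffsetBackingStore")
        .with("offset.storage.file.filename", "/tmp/offsets.dat")
        .with("database.hostname", "localhost")
        .with("database.port", 5432)
        .with("database.user", "postgres")
        .with("database.password", "postgres")
        .with("database.dbname", "inventory")
        .with("database.server.name", "kinesis")
        // emit BEGIN/END transaction boundary records and enrich data events with transaction info
        .with("provide.transaction.metadata", true)
        .build();
With that flag on, the handler passed to notifying() should also receive the transaction boundary records, under a topic derived from the server name (here presumably kinesis.transaction), so streamNameMapper can route them to a dedicated Kinesis stream. As for the snapshot part of the question: the initial snapshot is performed by the connector itself (snapshot.mode defaults to initial), so the embedded engine gets it the same way Kafka Connect does.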

Debezium CDC connector to EventHub

I am trying to use the Debezium CDC connector to send data from SQL Server to Azure Event Hubs, but the topic for the table is not getting created in Event Hubs from SQL Server. I am not getting any error either. All the default topics were created in Event Hubs when I started the connector.
I followed this doc https://learn.microsoft.com/en-us/azure/event-hubs/event-hubs-kafka-connect-tutorial and it worked fine. Will the CDC connector work fine with Event Hubs? Any idea?
Debezium today only supports Kafka as an out-of-the-box connector. If you need to write notifications to any other sink, you need to implement it in embedded mode. See https://debezium.io/documentation/reference/1.0/operations/embedded.html
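That said, the route in the question (Kafka Connect workers pointed at the Event Hubs Kafka endpoint) generally does work, since Event Hubs speaks the Kafka protocol. When the per-table topic never shows up, the usual culprits are SQL Server CDC not being enabled on the database and table, a table.include.list (table.whitelist on older 1.x versions) that doesn't match, or the database history producer/consumer missing the Event Hubs SASL settings. For comparison, here is a connector registration sketch with placeholder names, hosts, and credentials:
{
  "name": "sqlserver-source",
  "config": {
    "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
    "database.hostname": "my-sql-server",
    "database.port": "1433",
    "database.user": "sa",
    "database.password": "<password>",
    "database.dbname": "testDB",
    "database.server.name": "server1",
    "table.include.list": "dbo.customers",
    "database.history.kafka.bootstrap.servers": "my-namespace.servicebus.windows.net:9093",
    "database.history.kafka.topic": "dbhistory.testDB",
    "database.history.producer.security.protocol": "SASL_SSL",
    "database.history.producer.sasl.mechanism": "PLAIN",
    "database.history.producer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"$ConnectionString\" password=\"<connection string>\";",
    "database.history.consumer.security.protocol": "SASL_SSL",
    "database.history.consumer.sasl.mechanism": "PLAIN",
    "database.history.consumer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"$ConnectionString\" password=\"<connection string>\";"
  }
}
If the table is being captured, the data topic should then appear under a name like server1.dbo.customers, with one event hub created per topic.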