Debezium CDC connector to Azure Event Hubs

I am trying to use the Debezium CDC connector to send data from SQL Server to Azure Event Hubs, but the table's topic never gets created in Event Hubs, and I am not getting any error either. All of the default topics are created in Event Hubs when I start the connector.
I followed this doc https://learn.microsoft.com/en-us/azure/event-hubs/event-hubs-kafka-connect-tutorial and it worked fine.
Does the CDC connector work with Event Hubs? Any idea?

Out of the box, Debezium today only supports Kafka as a sink. If you need to write change notifications to any other sink, you need to run Debezium in embedded mode and deliver the events yourself. See https://debezium.io/documentation/reference/1.0/operations/embedded.html
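As a rough illustration of that embedded approach, here is a minimal sketch using the DebeziumEngine API from newer Debezium releases (the 1.0 documentation linked above is based on the older EmbeddedEngine class); the connection properties, table list and the Event Hubs delivery step are placeholders you would replace with your own.

import io.debezium.engine.ChangeEvent;
import io.debezium.engine.DebeziumEngine;
import io.debezium.engine.format.Json;

import java.util.Properties;
import java.util.concurrent.Executors;

public class SqlServerCdcEmbedded {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.setProperty("name", "sqlserver-cdc-engine");
        props.setProperty("connector.class", "io.debezium.connector.sqlserver.SqlServerConnector");
        props.setProperty("offset.storage", "org.apache.kafka.connect.storage.FileOffsetBackingStore");
        props.setProperty("offset.storage.file.filename", "/tmp/offsets.dat");
        props.setProperty("database.hostname", "sqlserver-host");   // placeholder
        props.setProperty("database.port", "1433");
        props.setProperty("database.user", "sa");                   // placeholder
        props.setProperty("database.password", "***");
        props.setProperty("database.dbname", "testDB");             // placeholder
        props.setProperty("database.server.name", "server1");
        props.setProperty("table.include.list", "dbo.customers");   // placeholder
        // ...plus the schema-history properties the SQL Server connector requires

        // The engine hands every change event to this callback; forward it to
        // Event Hubs (or any other sink) with whatever client you prefer.
        try (DebeziumEngine<ChangeEvent<String, String>> engine =
                     DebeziumEngine.create(Json.class)
                             .using(props)
                             .notifying(record -> System.out.println(record.value()))
                             .build()) {
            Executors.newSingleThreadExecutor().execute(engine);
            Thread.sleep(60_000);   // keep the sketch alive for a while
        }
    }
}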

Related

Stream both schema and data changes from MySQL to MySQL using Kafka Connect

How can we stream schema and data changes, along with some transformations, into another MySQL instance using a Kafka Connect source connector?
Is there a way to propagate schema changes as well if I use Kafka's Python library (confluent_kafka) to consume and transform messages before loading them into the target DB?
You can use Debezium to stream MySQL binlogs into Kafka. Debezium is built on the Kafka Connect framework.
From there, you can use whatever client you want, including Python, to consume and transform the data.
If you want to write to MySQL, you can use the Kafka Connect JDBC sink connector.
Here is an old post on this topic: https://debezium.io/blog/2017/09/25/streaming-to-another-database/
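If it helps, here is a rough sketch of registering such a JDBC sink through the Kafka Connect REST API from Java; the Connect URL, topic name, JDBC URL and credentials are placeholders.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterJdbcSink {
    public static void main(String[] args) throws Exception {
        // Connector config as the JSON body the Connect REST API expects.
        String body = """
                {
                  "name": "mysql-jdbc-sink",
                  "config": {
                    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
                    "topics": "dbserver1.inventory.customers",
                    "connection.url": "jdbc:mysql://target-mysql:3306/inventory",
                    "connection.user": "user",
                    "connection.password": "***",
                    "insert.mode": "upsert",
                    "pk.mode": "record_key",
                    "auto.create": "true",
                    "auto.evolve": "true"
                  }
                }
                """;

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8083/connectors"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}

With auto.evolve enabled, the sink can apply compatible schema changes (such as new columns) to the target table. In practice you would also configure Debezium's ExtractNewRecordState SMT, either on the source or on this sink, so the sink receives flat row images rather than the full Debezium change envelope; the linked blog post walks through that.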

Use Kafka Connect with AWS DocumentDB

I'm trying to use AWS DocumentDB as a sink for storing data received from Kafka, and was wondering if the MongoDB Kafka connector works with DocumentDB, since its documentation mentions that DocumentDB is compatible with MongoDB drivers.
https://www.mongodb.com/docs/kafka-connector/current/
https://aws.amazon.com/documentdb/
If not this connector, what is the alternative, other than building a custom Kafka Connect plugin?
You can use the MongoDB Kafka connector with DocumentDB as a source as well as a sink.
A Kafka Connect worker (with the MongoDB Kafka connector installed) can be run in distributed mode in containers as well as on EC2 hosts.
You can refer to the blog here, which has step-by-step details:
https://aws.amazon.com/blogs/database/stream-data-with-amazon-documentdb-and-amazon-msk-using-a-kafka-connector/
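For reference, the sink side of that setup might look roughly like the properties below; the cluster endpoint, credentials, database and topic names are placeholders, and the TLS options depend on how your DocumentDB cluster is configured.

import java.util.Map;

public class DocumentDbSinkConfig {
    // MongoDB Kafka sink connector properties pointed at a DocumentDB endpoint,
    // as you would submit them to a Connect worker. All names below are placeholders.
    static final Map<String, String> CONFIG = Map.of(
            "connector.class", "com.mongodb.kafka.connect.MongoSinkConnector",
            "topics", "orders",
            "connection.uri", "mongodb://user:***@my-docdb-cluster.cluster-xxxx.us-east-1.docdb.amazonaws.com:27017/?tls=true&replicaSet=rs0&readPreference=secondaryPreferred",
            "database", "ordersdb",
            "collection", "orders",
            "key.converter", "org.apache.kafka.connect.storage.StringConverter",
            "value.converter", "org.apache.kafka.connect.json.JsonConverter",
            "value.converter.schemas.enable", "false"
    );
}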

How can I increase tasks.max for the Debezium SQL connector?

I tried setting the following property in the Debezium MySQL connector configuration:
'tasks.max=50'
But the connector logs show the error below:
'java.lang.IllegalArgumentException: Only a single connector task may be started'
I am using an MSK Connector with a Debezium custom plugin, Debezium version 1.8.
It's not possible.
The database binlog must be read sequentially by a single task.
If you want to distribute the workload, run multiple connectors for different tables, as in the sketch below.
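For example, the two connectors might be split by table like this (host, credentials, schema and topic names are placeholders); note that each connector needs its own database.server.id, server name and history topic.

import java.util.Map;

public class SplitMySqlConnectors {
    // First connector: only the order-related tables.
    static final Map<String, String> ORDERS_CONNECTOR = Map.of(
            "connector.class", "io.debezium.connector.mysql.MySqlConnector",
            "tasks.max", "1",                                        // always 1 for Debezium MySQL
            "database.hostname", "mysql-host",
            "database.user", "debezium",
            "database.password", "***",
            "database.server.id", "184054",                          // unique per connector
            "database.server.name", "mysqlsrv-orders",               // distinct topic prefix
            "table.include.list", "shop.orders,shop.order_items",
            "database.history.kafka.bootstrap.servers", "broker:9092",
            "database.history.kafka.topic", "schema-changes.orders"
    );

    // Second connector: only the customer-related tables, with its own id and history topic.
    static final Map<String, String> CUSTOMERS_CONNECTOR = Map.of(
            "connector.class", "io.debezium.connector.mysql.MySqlConnector",
            "tasks.max", "1",
            "database.hostname", "mysql-host",
            "database.user", "debezium",
            "database.password", "***",
            "database.server.id", "184055",
            "database.server.name", "mysqlsrv-customers",
            "table.include.list", "shop.customers,shop.addresses",
            "database.history.kafka.bootstrap.servers", "broker:9092",
            "database.history.kafka.topic", "schema-changes.customers"
    );
}

Keep in mind that each connector still reads the whole binlog and filters it down to its include list, so this distributes the downstream processing but not the binlog reads themselves.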

Kafka Connector to IBM DB2

I'm currently working in a mainframe environment where we store the data in IBM DB2.
We have a new requirement for a scalable process to migrate the data to a new messaging platform, including a new database. For that we have identified Kafka as a suitable solution, together with either ksqlDB or MongoDB.
Can someone tell me, or point me toward, how we can connect to IBM DB2 from Kafka to import the data and place it in either ksqlDB or MongoDB?
Any help is much appreciated.
To import the data from IBM DB2 into Kafka, you need a connector such as the Debezium connector for Db2. Information about that connector and its configuration can be found here:
https://debezium.io/documentation/reference/connectors/db2.html
You can also use the JDBC source connector for the same functionality. The following post is helpful for the configuration and includes a simple diagram of how events flow from an RDBMS into a Kafka topic:
https://www.confluent.io/blog/kafka-connect-deep-dive-jdbc-source-connector/
After the data is in Kafka, you need to move it on to MongoDB. Use the MongoDB sink connector to transfer the data from Kafka to MongoDB:
https://www.mongodb.com/blog/post/getting-started-with-the-mongodb-connector-for-apache-kafka-and-mongodb-atlas
https://www.confluent.io/hub/mongodb/kafka-connect-mongodb
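As an illustration, the Db2 source side could look roughly like the connector properties below (shown as a plain map, using Debezium 1.x property names); the host, credentials, database and table names are placeholders.

import java.util.Map;

public class Db2SourceConfig {
    // Debezium Db2 source connector properties (Debezium 1.x names), as you would
    // submit them to a Connect worker. Host, credentials and names are placeholders.
    static final Map<String, String> CONFIG = Map.of(
            "connector.class", "io.debezium.connector.db2.Db2Connector",
            "database.hostname", "db2-host",
            "database.port", "50000",
            "database.user", "db2inst1",
            "database.password", "***",
            "database.dbname", "TESTDB",
            "database.server.name", "db2server1",                   // also the topic prefix
            "table.include.list", "MYSCHEMA.CUSTOMERS",
            "database.history.kafka.bootstrap.servers", "broker:9092",
            "database.history.kafka.topic", "schema-changes.db2"
    );
}

Once the change events are in a Kafka topic, ksqlDB can consume them directly by declaring a stream over that topic, and the MongoDB sink connector linked above can move them on to MongoDB.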

AWS: What is the right way to integrate PostgreSQL with Kinesis?

The aim that I want to achieve is to be notified about DB data updates. For this reason, I want to build the following chain: PostgreSQL -> Kinesis -> Lambda.
But I am not sure how to notify Kinesis properly about DB changes.
I saw a few examples where people try to use PostgreSQL triggers to send data to Kinesis, and some people use the wal2json approach.
I have some doubts about which option to choose, which is why I am looking for advice.
You can leverage Debezium to do this.
Debezium connectors can also be integrated into your own code using the Debezium Engine, so you can add transformation or filtering logic (if you need it) before pushing the changes out to Kinesis. Here's a link that explains the Debezium Postgres connector.
There is also Debezium Server (internally, I believe it makes use of the Debezium Engine). As of now it supports Kinesis, Google Pub/Sub and Apache Pulsar as sinks for CDC from the databases that Debezium supports.
Here is an article you can refer to for a step-by-step configuration of Debezium Server:
https://xyzcoder.github.io/2021/02/19/cdc-using-debezium-server-mysql-kinesis.html
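To give a feel for the Debezium Engine option mentioned above, here is a minimal sketch that forwards Postgres change events to Kinesis with the AWS SDK v2; the connection properties, stream name and filtering logic are placeholders you would adapt.

import io.debezium.engine.ChangeEvent;
import io.debezium.engine.DebeziumEngine;
import io.debezium.engine.format.Json;
import software.amazon.awssdk.core.SdkBytes;
import software.amazon.awssdk.services.kinesis.KinesisClient;
import software.amazon.awssdk.services.kinesis.model.PutRecordRequest;

import java.util.Properties;
import java.util.concurrent.Executors;

public class PostgresToKinesis {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.setProperty("name", "pg-cdc-engine");
        props.setProperty("connector.class", "io.debezium.connector.postgresql.PostgresConnector");
        props.setProperty("plugin.name", "pgoutput");
        props.setProperty("offset.storage", "org.apache.kafka.connect.storage.FileOffsetBackingStore");
        props.setProperty("offset.storage.file.filename", "/tmp/pg-offsets.dat");
        props.setProperty("database.hostname", "postgres-host");    // placeholder
        props.setProperty("database.port", "5432");
        props.setProperty("database.user", "postgres");             // placeholder
        props.setProperty("database.password", "***");
        props.setProperty("database.dbname", "appdb");              // placeholder
        props.setProperty("database.server.name", "pg1");

        KinesisClient kinesis = KinesisClient.create();

        // Every change event passes through this callback, where you can filter or
        // transform it before writing it to the Kinesis stream.
        try (DebeziumEngine<ChangeEvent<String, String>> engine =
                     DebeziumEngine.create(Json.class)
                             .using(props)
                             .notifying(record -> {
                                 if (record.value() == null) {
                                     return;                         // e.g. skip tombstone events
                                 }
                                 kinesis.putRecord(PutRecordRequest.builder()
                                         .streamName("db-changes")   // placeholder stream name
                                         .partitionKey(record.destination())
                                         .data(SdkBytes.fromUtf8String(record.value()))
                                         .build());
                             })
                             .build()) {
            Executors.newSingleThreadExecutor().execute(engine);
            Thread.sleep(Long.MAX_VALUE);                            // keep the sketch running
        }
    }
}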