Adding a new Debezium connector to Apache Kafka restarts the snapshot

I'm using Debezium/Kafka (v1.0), which uses Apache Kafka 2.4.
I have deployed a Debezium MySQL connector that monitors some tables and is configured to take a snapshot at startup; up to this point everything is fine.
Some time later I needed to start monitoring other tables, so I created another connector, this time without a snapshot because it is not needed.
Creating the second connector makes the first connector start taking its snapshot again.
Is this expected behavior?
What is the procedure for monitoring new tables without making the other connectors take their snapshots again?
Thanks in advance.
EDIT: configs added.
First connector:
{
  "name": "orion_connector_con_snapshot_prod_v1",
  "connector.class": "io.debezium.connector.mysql.MySqlConnector",
  "tasks.max": "1",
  "key.converter": "org.apache.kafka.connect.json.JsonConverter",
  "value.converter": "org.apache.kafka.connect.json.JsonConverter",
  "database.hostname": "my_host",
  "database.port": "3306",
  "database.user": "my_db",
  "database.password": "*********************",
  "database.server.name": "orion_kafka",
  "database.history.kafka.bootstrap.servers": "kafka:9092",
  "database.history.kafka.topic": "history_orion",
  "database.history.skip.unparseable.ddl": "true",
  "database.history.store.only.monitored.tables.ddl": "true",
  "table.whitelist": "my_db.my_table_1,my_db.my_table_2,my_db.my_table_3",
  "snapshot.mode": "when_needed",
  "snapshot.locking.mode": "none"
}
The second connector, which triggers the problem:
{
  "name": "nexo_impactos_connector_sin_snapshot_v1",
  "connector.class": "io.debezium.connector.mysql.MySqlConnector",
  "tasks.max": "1",
  "key.converter": "org.apache.kafka.connect.json.JsonConverter",
  "value.converter": "org.apache.kafka.connect.json.JsonConverter",
  "database.hostname": "my_host",
  "database.port": "3306",
  "database.user": "my_db",
  "database.password": "*********************",
  "database.server.name": "nexo_kafka",
  "database.history.kafka.bootstrap.servers": "kafka:9092",
  "database.history.kafka.topic": "nexo_impactos",
  "database.history.skip.unparseable.ddl": "true",
  "database.history.store.only.monitored.tables.ddl": "true",
  "table.whitelist": "other_db.other_table",
  "snapshot.mode": "schema_only",
  "snapshot.locking.mode": "none"
}
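Note the "snapshot.mode": "when_needed" on the first connector: in this mode the connector re-runs a snapshot whenever it decides it cannot resume from its stored position (for example, when the database history topic no longer covers what it needs), so the task restart caused by registering a new connector may trigger a new snapshot. An alternative to creating a second connector is to add the new tables to the whitelist of the existing one through the Connect REST API. A minimal sketch, assuming Connect listens on localhost:8083 and my_db.my_new_table stands in for the real new table, is to PUT the updated config to http://localhost:8083/connectors/orion_connector_con_snapshot_prod_v1/config with a body like:

{
  "connector.class": "io.debezium.connector.mysql.MySqlConnector",
  "tasks.max": "1",
  "key.converter": "org.apache.kafka.connect.json.JsonConverter",
  "value.converter": "org.apache.kafka.connect.json.JsonConverter",
  "database.hostname": "my_host",
  "database.port": "3306",
  "database.user": "my_db",
  "database.password": "*********************",
  "database.server.name": "orion_kafka",
  "database.history.kafka.bootstrap.servers": "kafka:9092",
  "database.history.kafka.topic": "history_orion",
  "database.history.skip.unparseable.ddl": "true",
  "database.history.store.only.monitored.tables.ddl": "true",
  "table.whitelist": "my_db.my_table_1,my_db.my_table_2,my_db.my_table_3,my_db.my_new_table",
  "snapshot.mode": "when_needed",
  "snapshot.locking.mode": "none"
}

Whether the newly added table gets its own snapshot after such an update depends on the Debezium version and snapshot settings, so this is worth testing on a non-production instance first.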

Related

MongoDbConnector: publish multiple collections to only one Kafka topic

Below is my MongoDbConnector configuration:
{
  "connector.class": "io.debezium.connector.mongodb.MongoDbConnector",
  "collection.include.list": "dbname.messages,dbname.comments",
  "mongodb.password": "mongodbpassword",
  "tasks.max": "1",
  "database.history.kafka.topic": "dev.dbhistory.unwrap_with_key_id_8",
  "mongodb.user": "mongodbuser",
  "heartbeat.interval.ms": "90000",
  "mongodb.name": "analytics",
  "snapshot.delay.ms": "120000",
  "key.converter.schemas.enable": "false",
  "poll.interval.ms": "3000",
  "value.converter.schemas.enable": "false",
  "mongodb.authsource": "admin",
  "errors.tolerance": "all",
  "value.converter": "org.apache.kafka.connect.json.JsonConverter",
  "mongodb.hosts": "rs0/ip:27017",
  "key.converter": "org.apache.kafka.connect.json.JsonConverter",
  "database.include.list": "dbname",
  "snapshot.mode": "initial"
}
I need this to publish to only one topic, but it creates two topics: analytics.dbname.messages and analytics.dbname.comments. How can I do this?
My English is not good! Thanks!
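By default the MongoDB connector writes one topic per collection (logicalName.database.collection). If you want everything in a single topic, one option is the RegexRouter single message transform, which rewrites the topic name as records pass through Connect. A minimal sketch, where the merged topic name analytics.dbname.all is a hypothetical choice, added to the connector config above:

"transforms": "mergetopics",
"transforms.mergetopics.type": "org.apache.kafka.connect.transforms.RegexRouter",
"transforms.mergetopics.regex": "analytics\\.dbname\\.(messages|comments)",
"transforms.mergetopics.replacement": "analytics.dbname.all"

Keep in mind that merging collections into one topic mixes their keys and schemas, so consumers need some way to tell the records apart (for example, the collection name carried in the message's source metadata).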

How to configure Kafka Connect to use an Avro schema?

I have started to learn Avro and I want to implement it in Kafka Connect. I use a configuration like the following. Is this the right configuration?
{
  "name": "surveyWawancara-connector",
  "config": {
    "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
    "key.deserializer": "org.apache.kafka.connect.json.JsonDeserializer",
    "database.user": "Acquisition.ro",
    "database.dbname": "acquisition",
    "value.deserializer": "org.apache.kafka.connect.json.JsonDeserializer",
    "tasks.max": "1",
    "key.converter": "io.confluent.connect.avro.AvroConverter",
    "key.converter.schema.registry.url": "http://localhost:8081",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://localhost:8081",
    "database.history.kafka.bootstrap.servers": "beta-kafka-brokers.amq-streams-beta.svc:9092",
    "database.history.kafka.topic": "schema-changes.sl.surveyWawancara",
    "time.precision.mode": "connect",
    "database.server.name": "beta-sl-bn",
    "database.port": "1433",
    "table.whitelist": "dbo.SurveyWawancara",
    "key.converter.schemas.enable": "true",
    "database.hostname": "10.7.76.62",
    "database.password": "Acquisition_ro231!",
    "value.converter.schemas.enable": "true",
    "name": "surveyWawancara-connector",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "key.converter": "org.apache.kafka.connect.json.JsonConverter"
  }
}
You've duplicated the converter fields, but yes, these properties are correct:
"key.converter": "io.confluent.connect.avro.AvroConverter",
"key.converter.schema.registry.url": "http://localhost:8081",
"value.converter": "io.confluent.connect.avro.AvroConverter",
"value.converter.schema.registry.url": "http://localhost:8081",
Avro always has a schema, so the *.schemas.enable properties do nothing and can be removed. Similarly, the deserializer class configs are not applicable to Connect and are superseded by the converter configs, so they should be removed as well.
Worth mentioning that the key format does not have to (and often doesn't) match the value's.
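Putting those points together, a cleaned-up sketch of the same config, with the duplicated JSON keys, the deserializer entries, and the *.schemas.enable flags removed, would look like:

{
  "name": "surveyWawancara-connector",
  "config": {
    "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
    "tasks.max": "1",
    "database.hostname": "10.7.76.62",
    "database.port": "1433",
    "database.user": "Acquisition.ro",
    "database.password": "Acquisition_ro231!",
    "database.dbname": "acquisition",
    "database.server.name": "beta-sl-bn",
    "table.whitelist": "dbo.SurveyWawancara",
    "time.precision.mode": "connect",
    "database.history.kafka.bootstrap.servers": "beta-kafka-brokers.amq-streams-beta.svc:9092",
    "database.history.kafka.topic": "schema-changes.sl.surveyWawancara",
    "key.converter": "io.confluent.connect.avro.AvroConverter",
    "key.converter.schema.registry.url": "http://localhost:8081",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://localhost:8081"
  }
}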

Debezium connector: read from the beginning, and temporarily stopping a working connector

I am trying to use Debezium to connect to my Postgres database. I would like to copy data from a specific table. With this configuration I only copy the newest data. Is changing snapshot.mode enough?
"name": "prod-contact-connect",
"config": {
"connector.class": "io.debezium.connector.postgresql.PostgresConnector",
"database.user": "user",
"database.dbname": "db_name",
"slot.name": "debezium_contact",
"tasks.max": "1",
"database.history.kafka.bootstrap.servers": "localhost:9092",
"publication.name": "dbz_publication",
"transforms": "unwrap",
"database.server.name": "connect.prod.contact",
"database.port": "5432",
"plugin.name": "pgoutput",
"table.whitelist": "specific_table_name",
"database.sslmode": "disable",
"database.hostname": "localhost",
"database.password": "pass",
"name": "prod-contact-connect",
"transforms.unwrap.add.fields": "op,table,schema,name",
"transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
"snapshot.mode": "never"
}
}
By the way, how can I stop the Debezium connector for a moment? Is there some enable flag?
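For the first part: yes, "snapshot.mode": "never" skips the initial snapshot entirely, so only changes arriving after the connector starts are copied. Switching to "initial" makes the connector snapshot the whitelisted tables before streaming, but the snapshot only runs when the connector starts with no previously stored offsets, so an already-registered connector typically has to be deleted and re-created under a new name for the snapshot to happen. The change itself is one line:

"snapshot.mode": "initial"

For stopping a connector temporarily there is no config flag, but the Connect REST API has pause and resume endpoints (assuming Connect on localhost:8083):

PUT http://localhost:8083/connectors/prod-contact-connect/pause
PUT http://localhost:8083/connectors/prod-contact-connect/resume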

Debezium stops after initial sync

The initial sync works as expected, but then the connector just stops and no longer picks up further table changes. There are no errors thrown and the connector is still marked as active and running.
Database: Amazon Postgres v10.7
Debezium config:
"name": "postgres_cdc",
"config": {
"connector.class": "io.debezium.connector.postgresql.PostgresConnector",
"database.hostname": "...",
"database.port": "5432",
"database.user": "...",
"database.password": "...",
"database.dbname": "...",
"database.server.name": "...",
"table.whitelist": "public.table1,public.table2,public.table3",
"plugin.name": "pgoutput",
"key.converter": "org.apache.kafka.connect.storage.StringConverter",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"value.converter.schemas.enable": "false",
"transforms": "unwrap, route, extractId",
"transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
"transforms.unwrap.drop.tombstones": false,
"transforms.route.type": "org.apache.kafka.connect.transforms.RegexRouter",
"transforms.route.regex": "[^.]+\\.[^.]+\\.(.+)",
"transforms.route.replacement": "postgres_$1",
"transforms.extractId.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
"transforms.extractId.field": "id"
}
}
Any thoughts about what the problem could be?
Edit:
Error logs:
ERROR WorkerSourceTask{id=postgres_cdc-0} Failed to flush, timed out while waiting for producer to flush outstanding 75687 messages (org.apache.kafka.connect.runtime.WorkerSourceTask)
ERROR WorkerSourceTask{id=postgres_cdc-0} Failed to commit offsets (org.apache.kafka.connect.runtime.SourceTaskOffsetCommitter)
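Those errors mean the Connect worker hit its offset-flush timeout while its producer still had a large backlog of unsent records, which is common during or right after a big initial snapshot. A commonly suggested mitigation is to give the flush more time and the producer more memory in the Connect worker configuration (these are worker properties, not connector JSON; the values below are examples, not defaults):

offset.flush.timeout.ms=60000
producer.buffer.memory=268435456

If the backlog never drains, it is also worth checking connectivity and throughput from the worker to the brokers.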

Multiple replication slots for Debezium connectors

I want to create multiple Debezium connectors with different replication slots, but I am unable to create multiple replication slots for the Postgres Debezium connector.
I am using Docker containers for Postgres and Kafka. I tried setting max_replication_slots = 2 in the postgresql.conf file and also a different slot.name, but it still did not create two replication slots for me.
{
  "config": {
    "batch.size": "49152",
    "buffer.memory": "100663296",
    "compression.type": "lz4",
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.dbname": "Db1",
    "database.hostname": "DBhost",
    "database.password": "dbpwd",
    "database.port": "5432",
    "database.server.name": "serve_name",
    "database.user": "usename",
    "decimal.handling.mode": "double",
    "hstore.handling.mode": "json",
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "name": "debezium-702771",
    "plugin.name": "wal2json",
    "schema.refresh.mode": "columns_diff_exclude_unchanged_toast",
    "slot.drop_on_stop": "true",
    "slot.name": "debezium1",
    "table.whitelist": "tabel1",
    "time.precision.mode": "adaptive_time_microseconds",
    "transforms": "Reroute",
    "transforms.Reroute.topic.regex": "(.*).public.(.*)",
    "transforms.Reroute.topic.replacement": "$1.$2",
    "transforms.Reroute.type": "io.debezium.transforms.ByLogicalTableRouter",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://schema-registry:8081"
  },
  "name": "debezium-702771",
  "tasks": [],
  "type": "source"
}
{
  "config": {
    "batch.size": "49152",
    "buffer.memory": "100663296",
    "compression.type": "lz4",
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.dbname": "Db1",
    "database.hostname": "DBhost",
    "database.password": "dbpwd",
    "database.port": "5432",
    "database.server.name": "serve_name",
    "database.user": "usename",
    "decimal.handling.mode": "double",
    "hstore.handling.mode": "json",
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "name": "debezium-702772",
    "plugin.name": "wal2json",
    "schema.refresh.mode": "columns_diff_exclude_unchanged_toast",
    "slot.drop_on_stop": "true",
    "slot.name": "debezium2",
    "table.whitelist": "tabel1",
    "time.precision.mode": "adaptive_time_microseconds",
    "transforms": "Reroute",
    "transforms.Reroute.topic.regex": "(.*).public.(.*)",
    "transforms.Reroute.topic.replacement": "$1.$2",
    "transforms.Reroute.type": "io.debezium.transforms.ByLogicalTableRouter",
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://schema-registry:8081"
  },
  "name": "debezium-702772",
  "tasks": [],
  "type": "source"
}
It creates multiple connectors but not multiple replication slots, even after giving different slot names. Do I need to do anything else here?
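A distinct slot.name per connector should be all Debezium needs; a few things are worth checking on the Postgres side, though. Changing max_replication_slots in postgresql.conf only takes effect after a server restart (and on Postgres 10 and later the default is already 10, so setting it to 2 actually lowers the limit), and these configs set "slot.drop_on_stop": "true", which removes a slot whenever its connector stops, so a slot can disappear before you look for it. To see which slots actually exist while both connectors are running, a sketch of the check on the database:

SELECT slot_name, plugin, active FROM pg_replication_slots;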