Whenever I restart the Debezium kafka-connect container, or deploy another instance, I get the following error:
io.debezium.jdbc.JdbcConnectionException: ERROR: replication slot "debezium" already exists
at io.debezium.connector.postgresql.connection.PostgresReplicationConnection.initReplicationSlot(PostgresReplicationConnection.java:136)
at io.debezium.connector.postgresql.connection.PostgresReplicationConnection.<init>(PostgresReplicationConnection.java:79)
at io.debezium.connector.postgresql.connection.PostgresReplicationConnection.<init>(PostgresReplicationConnection.java:38)
at io.debezium.connector.postgresql.connection.PostgresReplicationConnection$ReplicationConnectionBuilder.build(PostgresReplicationConnection.java:349)
at io.debezium.connector.postgresql.PostgresTaskContext.createReplicationConnection(PostgresTaskContext.java:80)
at io.debezium.connector.postgresql.RecordsStreamProducer.<init>(RecordsStreamProducer.java:75)
at io.debezium.connector.postgresql.PostgresConnectorTask.start(PostgresConnectorTask.java:112)
at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:157)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.postgresql.util.PSQLException: ERROR: replication slot "debezium" already exists
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2412)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2125)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:297)
at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:428)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354)
at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:301)
at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:287)
at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:264)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:260)
at org.postgresql.replication.fluent.logical.LogicalCreateSlotBuilder.make(LogicalCreateSlotBuilder.java:48)
at io.debezium.connector.postgresql.connection.PostgresReplicationConnection.initReplicationSlot(PostgresReplicationConnection.java:102)
... 14 more
I'm using this image: https://github.com/debezium/docker-images/tree/master/connect/0.8
And I have a config for it like this:
{
"name":"record-loader-connector",
"config":{
"connector.class":"io.debezium.connector.postgresql.PostgresConnector",
"database.dbname":"record_loader?ssl",
"database.user":"postgres",
"database.hostname":"redacted",
"database.history.kafka.bootstrap.servers":"redacted",
"database.history.kafka.topic":"dbhistory.recordloader",
"database.password":"redacted",
"name":"record-loader-connector",
"database.server.name":"recordLoaderDb",
"database.port":"20023",
"table.whitelist":".*sync"
},
"tasks":[
{
"connector":"record-loader-connector",
"task":0
}
],
"type":"source"
}
I've noticed these two config options (slot.name and slot.drop_on_stop), but it is not clear to me whether or how I should change them:
http://debezium.io/docs/connectors/postgresql/#connector-properties
If you deploy multiple instances of the Debezium Postgres connector, you must make sure to use distinct replication slot names. You can specify a name when setting up the connector:
{
"name": "inventory-connector",
"config": {
"connector.class": "io.debezium.connector.postgresql.PostgresConnector",
"tasks.max": "1",
"database.hostname": "postgres",
"database.port": "5432",
"database.user": "postgres",
"database.password": "postgres",
"database.dbname" : "postgres",
"database.server.name": "dbserver1",
"database.whitelist": "inventory",
"database.history.kafka.bootstrap.servers": "kafka:9092",
"database.history.kafka.topic": "schema-changes.inventory",
"slot.name" : "my-slot-name"
}
}
I can't reproduce the issue you describe when restarting a given connector instance. It should detect that the slot already exists and re-use it (one possible cause may be that you also changed the logical decoding plug-in, "decoderbufs" vs. "wal2json"). If you have a reproducer for this, could you please open an entry in our bug tracker?
To proceed, you can manually delete the slot in Postgres:
select pg_drop_replication_slot('debezium');
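Before dropping anything, you can also check which slots exist and whether they are currently in use; a quick look via the standard catalog view:
select slot_name, plugin, slot_type, active from pg_replication_slots;
Note that dropping a slot will fail while a connector is still actively streaming from it.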
Related
I'm using the Confluent Kafka HTTP sink connector, and I want to filter on this condition: if item.productID = 12, allow the event; otherwise ignore it. But my filter condition is not working:
"transforms.filterExample1.filter.condition": "$.[*][?(#.item.productID == '12')]",
What am I doing wrong here? Could you please help me fix this issue?
{
"connector.class": "io.confluent.connect.http.HttpSinkConnector",
"confluent.topic.bootstrap.servers": "localhost:9092",
"topics": "http-messages",
"tasks.max": "1",
"http.api.url": "http://localhost:8080/api/messages",
"reporter.bootstrap.servers": "localhost:9092",
"transforms.filter.type": "org.apache.kafka.connect.transforms.Filter",
"transforms": "filterExample1",
"transforms.filterExample1.type": "io.confluent.connect.transforms.Filter$Value",
"transforms.filterExample1.filter.condition": "$.[*][?(#.item.productID == '12')]",
"transforms.filterExample1.filter.type": "include",
"transforms.filterExample1.missing.or.null.behavior": "fail",
"reporter.error.topic.name": "error-responses",
"reporter.result.topic.name": "success-responses",
"reporter.error.topic.replication.factor": "1",
"confluent.topic.replication.factor": "1",
"value.converter.schemas.enable": "false",
"name": "HttpSink",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"reporter.result.topic.replication.factor": "1"
}
My Event
[
{
"name":"apple",
"salary":"3243",
"item":{
"productID":"12"
}
}
]
Since your data is in a JSON array, this transform won't work.
I tested with local data using the latest version of that transform and saw this log from the Connect server:
Caused by: org.apache.kafka.connect.errors.DataException: Only Map objects supported in absence of schema for [filtering record without schema], found: java.util.ArrayList
at io.confluent.connect.transforms.util.Requirements.requireMap(Requirements.java:30) ~[?:?]
at io.confluent.connect.transforms.Filter.shouldDrop(Filter.java:218) ~[?:?]
at io.confluent.connect.transforms.Filter.apply(Filter.java:161) ~[?:?]
"Map Objects" implying JSON objects
Also, you have a transforms.filter.type setting that isn't doing anything, since the transform alias you declared in transforms is filterExample1, not filter.
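For comparison, a record whose value is a single JSON object, rather than an array, is the shape this transform can evaluate; a minimal sketch of the same payload reshaped:
{
  "name": "apple",
  "salary": "3243",
  "item": {
    "productID": "12"
  }
}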
I am completely new to Kafka and am trying to get data from MySQL using Kafka. For that I have used two jars, kafka-connect-jdbc-5.3.1 and mysql-connector-java-8.0.17, placed inside the libs folder of Kafka
(Path: ....\kafka_2.12-2.4.1\libs). I am using Windows 10 with Java 8.
I followed this tutorial : https://www.youtube.com/watch?v=r7LUbtOFcQI
Once I start the connector, I get the issue below. Any solutions will be appreciated.
E:\2020\kafka_2.12-2.4.1>.\bin\windows\connect-distributed.bat .\config\connect-distributed.properties
[2020-04-30 17:46:49,773] WARN could not get type for name org.jboss.resource.adapter.jdbc.vendor.MySQLExceptionSorter from any class loader (org.reflections.Reflections)
org.reflections.ReflectionsException: could not get type for name org.jboss.resource.adapter.jdbc.vendor.MySQLExceptionSorter
at org.reflections.ReflectionUtils.forName(ReflectionUtils.java:390)
at org.reflections.Reflections.expandSuperTypes(Reflections.java:381)
at org.reflections.Reflections.<init>(Reflections.java:126)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader$InternalReflections.<init>(DelegatingClassLoader.java:428)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanPluginPath(DelegatingClassLoader.java:327)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanUrlsAndAddPlugins(DelegatingClassLoader.java:263)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initPluginLoader(DelegatingClassLoader.java:211)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:204)
at org.apache.kafka.connect.runtime.isolation.Plugins.<init>(Plugins.java:60)
at org.apache.kafka.connect.cli.ConnectDistributed.startConnect(ConnectDistributed.java:91)
at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:78)
Caused by: java.lang.ClassNotFoundException: org.jboss.resource.adapter.jdbc.vendor.MySQLExceptionSorter
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.reflections.ReflectionUtils.forName(ReflectionUtils.java:388)
... 10 more
[2020-04-30 17:46:49,837] WARN could not get type for name org.osgi.framework.BundleListener from any class loader (org.reflections.Reflections)
org.reflections.ReflectionsException: could not get type for name org.osgi.framework.BundleListener
at org.reflections.ReflectionUtils.forName(ReflectionUtils.java:390)
at org.reflections.Reflections.expandSuperTypes(Reflections.java:381)
at org.reflections.Reflections.<init>(Reflections.java:126)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader$InternalReflections.<init>(DelegatingClassLoader.java:428)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanPluginPath(DelegatingClassLoader.java:327)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanUrlsAndAddPlugins(DelegatingClassLoader.java:263)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initPluginLoader(DelegatingClassLoader.java:211)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:204)
at org.apache.kafka.connect.runtime.isolation.Plugins.<init>(Plugins.java:60)
at org.apache.kafka.connect.cli.ConnectDistributed.startConnect(ConnectDistributed.java:91)
at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:78)
Caused by: java.lang.ClassNotFoundException: org.osgi.framework.BundleListener
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.reflections.ReflectionUtils.forName(ReflectionUtils.java:388)
... 10 more
[2020-04-30 17:46:49,848] WARN could not get type for name io.netty.internal.tcnative.CertificateVerifier from any class loader (org.reflections.Reflections)
org.reflections.ReflectionsException: could not get type for name io.netty.internal.tcnative.CertificateVerifier
The POST request I am sending:
{
"name": "jdbc_source_mysql_03",
"config": {
"connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
"connection.url": "jdbc:mysql://localhost:3306/user",
"connection.user": "root",
"connection.password": "root",
"topic.prefix": "mysql-02-",
"table.whitelist": "test",
"mode": "incrementing",
"incrementing.column.name": "id",
"poll.interval.ms": "1000",
"timestamp.column.name": "modified",
"tasks.max": "2",
"name": "jdbc_source_mysql_03"
},
"tasks": [],
"type": "source"
}
Using the Confluent Docker all-in-one package with tag 5.4.1, I am struggling to get a JDBC sink connector up and running.
When I launch the following connector:
{
"name": "mySink",
"connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"topics": [
"myTopic"
],
"connection.url": "jdbc:sqlserver://sqlserver:1433;databaseName=myDB",
"connection.user": "user",
"connection.password": "**********",
"dialect.name": "SqlServerDatabaseDialect",
"insert.mode": "insert",
"table.name.format": "TableSink",
"pk.mode": "kafka",
"fields.whitelist": [
"offset",
"value"
],
"auto.create": "true"
}
(Some attributes edited)
I get the following exception from the connect container:
ERROR Plugin class loader for connector: 'io.confluent.connect.jdbc.JdbcSinkConnector' was not found.
I have the following environment variable for connect:
CONNECT_PLUGIN_PATH: "/usr/share/java,/usr/share/confluent-hub-components"
And if I check in the container I have:
/usr/share/java/kafka-connect-jdbc
With the following files:
./jtds-1.3.1.jar
./common-utils-5.4.0.jar
./slf4j-api-1.7.26.jar
./kafka-connect-jdbc-5.4.0.jar
./postgresql-9.4.1212.jar
./sqlite-jdbc-3.25.2.jar
./mssql-jdbc-8.2.0.jre8.jar
./mssql-jdbc-8.2.0.jre13.jar
./mssql-jdbc-8.2.0.jre11.jar
The only changes I have made from the base connect image are the mssql jdbc drivers. These are working fine for a jdbc source connector.
Extra information as requested:
Output from curl -s localhost:8083/connector-plugins|jq '.[].class'
"io.confluent.connect.activemq.ActiveMQSourceConnector"
"io.confluent.connect.elasticsearch.ElasticsearchSinkConnector"
"io.confluent.connect.gcs.GcsSinkConnector"
"io.confluent.connect.ibm.mq.IbmMQSourceConnector"
"io.confluent.connect.jdbc.JdbcSinkConnector"
"io.confluent.connect.jdbc.JdbcSourceConnector"
"io.confluent.connect.jms.JmsSourceConnector"
"io.confluent.connect.s3.S3SinkConnector"
"io.confluent.connect.storage.tools.SchemaSourceConnector"
"io.confluent.kafka.connect.datagen.DatagenConnector"
"org.apache.kafka.connect.file.FileStreamSinkConnector"
"org.apache.kafka.connect.file.FileStreamSourceConnector"
"org.apache.kafka.connect.mirror.MirrorCheckpointConnector"
"org.apache.kafka.connect.mirror.MirrorHeartbeatConnector"
"org.apache.kafka.connect.mirror.MirrorSourceConnector"
docker-compose from:
https://github.com/confluentinc/examples/tree/5.4.1-post/cp-all-in-one
Images:
confluentinc/cp-ksql-cli:5.4.1
confluentinc/cp-enterprise-control-center:5.4.1
confluentinc/cp-ksql-server:5.4.1
cnfldemos/cp-server-connect-datagen:0.2.0-5.4.0
confluentinc/cp-kafka-rest:5.4.1
confluentinc/cp-schema-registry:5.4.1
confluentinc/cp-server:5.4.1
confluentinc/cp-zookeeper:5.4.1
I have been trying to use the Confluent kafka-connect image to connect to our on-prem S3. We have successfully written to S3 from the same box using Boto3, so we know it is not a connection issue.
Depending on which converters I use, I get different errors.
Here are the environment variables running in the docker container.
CONNECT_CONFIG_STORAGE_TOPIC=__kafka-connect-config
CONNECT_OFFSET_STORAGE_TOPIC=__kafka-connect-offsets
CONNECT_STATUS_STORAGE_TOPIC=__kafka-connect-status
CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR=3
CONNECT_CONFIG_STORAGE_PARTITIONS=1
CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR=3
CONNECT_OFFSET_STORAGE_PARTITIONS=1
CONNECT_STATUS_STORAGE_REPLICATION_FACTOR=3
CONNECT_STATUS_STORAGE_PARTITIONS=1
CONNECT_REST_ADVERTISED_HOST_NAME=hostname
CONNECT_REST_ADVERTIZED_LISTENER=listener
CONNECT_INTERNAL_KEY_CONVERTER=org.apache.kafka.connect.json.JsonConverter
CONNECT_INTERNAL_VALUE_CONVERTER=org.apache.kafka.connect.json.JsonConverter
CONNECT_KEY_CONVERTER=org.apache.kafka.connect.json.JsonConverter
CONNECT_VALUE_CONVERTER=io.confluent.connect.avro.AvroConverter
CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL=http://schema-registry:8081
CONNECT_KEY_CONVERTER_SCHEMAS_ENABLED=false
CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLED=true
CONNECT_REST_ADVERTISED_PORT=8083
CONNECT_REPLICATION_FACTOR=2
CONNECT_GROUP_ID=APP-CONNECT
CONNECT_CONSUMER_BOOTSTRAP_SERVERS=SASL_SSL://server-1.com:9092,SASL_SSL://server-2.com:9092,SASL_SSL://server-3.com:9092
CONNECT_BOOTSTRAP_SERVERS=SASL_SSL://server-1.com:9092,SASL_SSL://server-2.com:9092,SASL_SSL://server-3.com:9092
CONNECT_CONSUMER_SECURITY_PROTOCOL=SASL_SSL
CONNECT_CONSUMER_SASL_JAAS_CONFIG=org.apache.kafka.common.security.plain.PlainLoginModule required username='admin' password='pw';
CONNECT_CONSUMER_SSL_PROTOCOL=SSL
CONNECT_CONSUMER_SSL_TRUSTSTORE_LOCATION=/etc/kafka/secrets/kafka.client.truststore.jks
CONNECT_CONSUMER_SSL_TRUSTSTORE_PASSWORD=password
CONNECT_CONSUMER_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM=
CONNECT_CONSUMER_SASL_MECHANISM=PLAIN
CONNECT_LOG4J_OPTS=-Dlog4j.configuration=file:/etc/kafka_connect/log4j/log4j.properties
CONNECT_OFFSET_FLUSH_INTERVAL_MS=10000
CONNECT_PLUGIN_PATH=/usr/share/java,/usr/share/confluent-hub-components
CONNECT_REST_PORT=8083
CONNECT_SECURITY_PROTOCOL=SASL_SSL
CONNECT_SASL_JAAS_CONFIG=org.apache.kafka.common.security.plain.PlainLoginModule required username='admin' password='pw';
CONNECT_SASL_MECHANISM=PLAIN
CONNECT_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM=
CONNECT_SSL_PROTOCOL=SSL
CONNECT_SSL_TRUSTSTORE_LOCATION=/etc/kafka/secrets/kafka.client.truststore.jks
CONNECT_SSL_TRUSTSTORE_PASSWORD=password
CONNECT_ZOOKEEPER_CONNECT=SASL_SSL://server-1.com:9092,SASL_SSL://server-2.com:9092,SASL_SSL://server-3.com:9092
The connector configuration I am submitting:
{
"connector.class": "io.confluent.connect.s3.S3SinkConnector",
"consumer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username='admin' password='pw';",
"flush.size": "1500",
"topics": "inventory",
"tasks.max": "2",
"rotate.interval.ms": "1000",
"consumer.sasl.mechanism": "PLAIN",
"store.url": "http://s3-server:9020",
"format.class": "io.confluent.connect.s3.format.avro.AvroFormat",
"internal.key.converter.schemas.enable": "false",
"internal.value.converter.schemas.enable": "false",
"value.converter": "org.apache.kafka.connect.storage.StringConverter",
"value.converter.schema.registry.url": "http://schema-registry:8081",
"key.converter.schemas.enabled": "false",
"value.converter.schemas.enabled": "true",
"partitioner.class": "io.confluent.connect.storage.partitioner.DefaultPartitioner",
"schema.generator.class": "io.confluent.connect.storage.hive.schema.DefaultSchemaGenerator",
"name": "inventory-2",
"consumer.security.protocol": "SASL_SSL",
"storage.class": "io.confluent.connect.s3.storage.S3Storage",
"s3.bucket.name": "inventory-stage"
}
I get what appears to be a successful startup. However, when I check the bucket, I do not have any objects there. I have confirmed using the kafka-avro-console-consumer that Avro messages do exist in the topic.
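For reference, a topic check of that kind typically looks like the following (the broker address and the client.properties file holding the SASL_SSL client settings are assumptions here):
kafka-avro-console-consumer --bootstrap-server server-1.com:9092 --consumer.config client.properties --topic inventory --from-beginning --property schema.registry.url=http://schema-registry:8081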
[2019-04-11 18:14:52,612] INFO [Consumer clientId=consumer-42, groupId=connect-inventory-2] Resetting offset for partition inventory-0 to offset 9. (org.apache.kafka.clients.consumer.internals.Fetcher)
[2019-04-11 18:14:52,614] INFO Opening record writer for: topics/inventory/partition=2/inventory+2+0000000008.avro (io.confluent.connect.s3.format.avro.AvroRecordWriterProvider)
[2019-04-11 18:14:52,621] INFO [Consumer clientId=consumer-42, groupId=connect-inventory-2] Resetting offset for partition inventory-1 to offset 8. (org.apache.kafka.clients.consumer.internals.Fetcher)
[2019-04-11 18:14:52,621] WARN Property 'rotate.interval.ms' is set to '1000ms' but partitioner is not an instance of 'io.confluent.connect.storage.partitioner.TimeBasedPartitioner'. This property is ignored. (io.confluent.connect.s3.TopicPartitionWriter)
[2019-04-11 18:14:52,621] WARN Property 'rotate.interval.ms' is set to '1000ms' but partitioner is not an instance of 'io.confluent.connect.storage.partitioner.TimeBasedPartitioner'. This property is ignored. (io.confluent.connect.s3.TopicPartitionWriter)
[2019-04-11 18:14:52,626] INFO Opening record writer for: topics/inventory/partition=1/inventory+1+0000000008.avro (io.confluent.connect.s3.format.avro.AvroRecordWriterProvider)
[2019-04-11 18:14:52,645] INFO Opening record writer for: topics/inventory/partition=0/inventory+0+0000000009.avro (io.confluent.connect.s3.format.avro.AvroRecordWriterProvider)
I then changed the value converter to the AvroConverter, thinking that the messages are in Avro and will need to be deserialized before being consumed by the Connect API:
{
"connector.class": "io.confluent.connect.s3.S3SinkConnector",
"consumer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username='admin' password='pw';",
"flush.size": "1500",
"topics": "inventory",
"tasks.max": "2",
"rotate.interval.ms": "1000",
"consumer.sasl.mechanism": "PLAIN",
"store.url": "http://s3-server:9020",
"format.class": "io.confluent.connect.s3.format.avro.AvroFormat",
"internal.key.converter.schemas.enable": "false",
"internal.value.converter.schemas.enable": "false",
"value.converter": "io.confluent.connect.avro.AvroConverter",
"value.converter.schema.registry.url": "http://schema-registry:8081",
"key.converter.schemas.enabled": "false",
"value.converter.schemas.enabled": "true",
"partitioner.class": "io.confluent.connect.storage.partitioner.DefaultPartitioner",
"schema.generator.class": "io.confluent.connect.storage.hive.schema.DefaultSchemaGenerator",
"name": "inventory-2",
"consumer.security.protocol": "SASL_SSL",
"storage.class": "io.confluent.connect.s3.storage.S3Storage",
"s3.bucket.name": "inventory-stage"
}
The error below indicates that the Avro converter cannot find the schema with ID 41, even though that ID exists in the schema registry (see the registry output after the stack trace).
[2019-04-11 18:26:56,813] ERROR WorkerSinkTask{id=inventory-2-1} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask)
org.apache.kafka.connect.errors.ConnectException: Tolerance exceeded in error handler
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:178)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:104)
at org.apache.kafka.connect.runtime.WorkerSinkTask.convertAndTransformRecord(WorkerSinkTask.java:514)
at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:491)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:322)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:226)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:194)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.connect.errors.DataException: inventory
at io.confluent.connect.avro.AvroConverter.toConnectData(AvroConverter.java:103)
at org.apache.kafka.connect.runtime.WorkerSinkTask.lambda$convertAndTransformRecord$1(WorkerSinkTask.java:514)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162)
... 13 more
Caused by: org.apache.kafka.common.errors.SerializationException: Error retrieving Avro schema for id 41
Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Subject not found.; error code: 40401
at io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:209)
at io.confluent.kafka.schemaregistry.client.rest.RestService.httpRequest(RestService.java:235)
at io.confluent.kafka.schemaregistry.client.rest.RestService.lookUpSubjectVersion(RestService.java:302)
at io.confluent.kafka.schemaregistry.client.rest.RestService.lookUpSubjectVersion(RestService.java:290)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.getVersionFromRegistry(CachedSchemaRegistryClient.java:129)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.getVersion(CachedSchemaRegistryClient.java:230)
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.schemaVersion(AbstractKafkaAvroDeserializer.java:184)
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:153)
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserializeWithSchemaAndVersion(AbstractKafkaAvroDeserializer.java:215)
at io.confluent.connect.avro.AvroConverter$Deserializer.deserialize(AvroConverter.java:139)
at io.confluent.connect.avro.AvroConverter.toConnectData(AvroConverter.java:87)
at org.apache.kafka.connect.runtime.WorkerSinkTask.lambda$convertAndTransformRecord$1(WorkerSinkTask.java:514)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:104)
at org.apache.kafka.connect.runtime.WorkerSinkTask.convertAndTransformRecord(WorkerSinkTask.java:514)
at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:491)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:322)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:226)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:194)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2019-04-11 18:26:56,814] ERROR WorkerSinkTask{id=inventory-2-1} Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask)
[2019-04-11 18:26:56,815] INFO [Consumer clientId=consumer-44, groupId=connect-inventory-2] Sending LeaveGroup request to coordinator localhost:9092 (id: 2147483644 rack: null) (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
{
"subject": "inventory-com.company.dcp.event.schema.shotify.SongCreatedEvent",
"version": 1,
"id": 41,
"schema": "{\"type\":\"record\",\"name\":\"SongCreatedEvent\",\"namespace\":\"com.company.dcp.event.schema.shotify\",\"doc\":\"Information about the Song Added event\",\"fields\":[{\"name\":\"eventHeader\",\"type\":{\"type\":\"record\",\"name\":\"EventHeader\",\"namespace\":\"com.company.commons.shotify\",\"fields\":[{\"name\":\"id\",\"type\":{\"type\":\"string\",\"avro.java.string\":\"String\"}},{\"name\":\"time\",\"type\":{\"type\":\"long\",\"logicalType\":\"timestamp-millis\"}},{\"name\":\"type\",\"type\":{\"type\":\"string\",\"avro.java.string\":\"String\"}},{\"name\":\"source\",\"type\":{\"type\":\"string\",\"avro.java.string\":\"String\"}}]}},{\"name\":\"song\",\"type\":{\"type\":\"record\",\"name\":\"Song\",\"namespace\":\"com.company.commons.shotify\",\"fields\":[{\"name\":\"title\",\"type\":{\"type\":\"string\",\"avro.java.string\":\"String\"},\"doc\":\"Title of the Song\"},{\"name\":\"artist\",\"type\":{\"type\":\"string\",\"avro.java.string\":\"String\"},\"doc\":\"The song composer\"},{\"name\":\"duration\",\"type\":\"int\",\"doc\":\"Song Duration in minutes\"},{\"name\":\"bitrate\",\"type\":\"int\",\"doc\":\"Song Bitrate, measured in kilobytes per second\"},{\"name\":\"lyrics\",\"type\":{\"type\":\"string\",\"avro.java.string\":\"String\"},\"doc\":\"Lyrics of the Song\"},{\"name\":\"fileURL\",\"type\":{\"type\":\"string\",\"avro.java.string\":\"String\"},\"doc\":\"Unique file Reference to the song\"}]}}],\"version\":\"2\"}"
}
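The entry above can be fetched straight from the Schema Registry REST API (using the registry URL configured on the connector), for example:
curl -s http://schema-registry:8081/subjects/inventory-com.company.dcp.event.schema.shotify.SongCreatedEvent/versions/1
curl -s http://schema-registry:8081/schemas/ids/41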
I am trying to ignore bad messages in a sink connector with the errors.tolerance: all option. Full connector configuration:
{
"name": "crm_data-sink_pandora",
"config": {
"connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
"tasks.max": 6,
"topics": "crm_account_detail,crm_account_on_competitors,crm_event,crm_event_participation",
"connection.url": "jdbc:postgresql://dburl/service?prepareThreshold=0",
"connection.user": "pandora.app",
"connection.password": "*******",
"dialect.name": "PostgreSqlDatabaseDialect",
"insert.mode": "upsert",
"pk.mode": "record_value",
"pk.fields": "guid",
"table.name.format": "pandora.${topic}",
"errors.tolerance": "all",
"errors.log.enable":true,
"errors.log.include.messages":true,
"errors.deadletterqueue.topic.name":"crm_data_deadletterqueue",
"errors.deadletterqueue.context.headers.enable":true
}
}
Target table DDL:
create table crm_event_participation
(
guid char(36) not null
constraint crm_event_participation_pkey
primary key,
created_on timestamp,
created_by_guid char(36),
modified_on timestamp,
modified_by_guid char(36),
process_listeners integer,
event_guid char(36),
event_response varchar(250),
note varchar(500),
is_from_group boolean,
contact_guid char(36),
target_item integer,
account_guid char(36),
employer_id integer
);
The connector starts successfully, but it fails if an error occurs (e.g. a missing field).
curl -X GET http://kafka-connect:9092/connectors/crm_data-sink_pandora/status:
{
"name": "crm_data-sink_pandora",
"connector": {
"state": "RUNNING",
"worker_id": "192.168.2.254:10900"
},
"tasks": [
{
"state": "FAILED",
"trace":
"org.apache.kafka.connect.errors.ConnectException: Exiting WorkerSinkTask due to unrecoverable exception.
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:586)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:322)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:225)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:193)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.connect.errors.ConnectException: Table \"pandora\".\"crm_event_participation\" is missing fields ([SinkRecordField{schema=Schema{STRING}, name='event_id', isPrimaryKey=false}, SinkRecordField{schema=Schema{STRING}, name='event_response_guid', isPrimaryKey=false}]) and auto-evolution is disabled
at io.confluent.connect.jdbc.sink.DbStructure.amendIfNecessary(DbStructure.java:140)
at io.confluent.connect.jdbc.sink.DbStructure.createOrAmendIfNecessary(DbStructure.java:73)
at io.confluent.connect.jdbc.sink.BufferedRecords.add(BufferedRecords.java:84)
at io.confluent.connect.jdbc.sink.JdbcDbWriter.write(JdbcDbWriter.java:65)
at io.confluent.connect.jdbc.sink.JdbcSinkTask.put(JdbcSinkTask.java:73)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:564)
... 10 more",
"id": 0,
"worker_id": "192.168.2.254:10900"
}
...
]
}
Log with exception:
[2019-03-29 16:59:30,924] INFO Unable to find fields [SinkRecordField{schema=Schema{INT32}, name='process_listners', isPrimaryKey=false}] among column names [employer_id, modified_on, modified_by_guid, contact_guid, target_item, guid, created_on, process_listeners, event_guid, created_by_guid, is_from_group, account_guid, event_response, note] (io.confluent.connect.jdbc.sink.DbStructure)
[2019-03-29 16:59:30,924] ERROR WorkerSinkTask{id=crm_data-sink_pandora-1} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted. (org.apache.kafka.connect.runtime.WorkerSinkTask)
org.apache.kafka.connect.errors.ConnectException: Table "pandora"."crm_event_participation" is missing fields ([SinkRecordField{schema=Schema{INT32}, name='process_listners', isPrimaryKey=false}]) and auto-evolution is disabled
at io.confluent.connect.jdbc.sink.DbStructure.amendIfNecessary(DbStructure.java:140)
at io.confluent.connect.jdbc.sink.DbStructure.createOrAmendIfNecessary(DbStructure.java:73)
at io.confluent.connect.jdbc.sink.BufferedRecords.add(BufferedRecords.java:84)
at io.confluent.connect.jdbc.sink.JdbcDbWriter.write(JdbcDbWriter.java:65)
at io.confluent.connect.jdbc.sink.JdbcSinkTask.put(JdbcSinkTask.java:73)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:564)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:322)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:225)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:193)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Please explain what could be wrong in my connector configuration. I use Kafka 2.0.0 and JdbcSinkConnector 5.1.0.
In your Kafka message you have a field process_listners, but a column with that name is not present in your table.
I think you have a typo: in the table the column is process_listeners, not process_listners.
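A quick way to double-check the columns the sink actually sees is a standard information_schema query against the target table:
select column_name from information_schema.columns where table_schema = 'pandora' and table_name = 'crm_event_participation';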
The errors.tolerance property applies only to errors that occur while converting messages.
You can read more about errors.tolerance here: kafka connect - jdbc sink sql exception
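Also note that, per the log above, the failed task will not recover until it is manually restarted; once the field name is fixed, it can be restarted through the Connect REST API (using the same worker address as your status call), e.g.:
curl -X POST http://kafka-connect:9092/connectors/crm_data-sink_pandora/tasks/0/restart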