Error when defining Camel S3 source connector

I am trying to define a Camel S3 Source Connector in our Confluent environment.
This is the configuration I am using:
{
"name": "CamelAWSS3SourceConnector",
"config": {
"connector.class": "org.apache.camel.kafkaconnector.awss3.CamelAwss3SourceConnector",
"key.converter": "org.apache.kafka.connect.storage.StringConverter",
"value.converter": "org.apache.camel.kafkaconnector.awss3.converters.S3ObjectConverter",
"camel.source.maxPollDuration": "10000",
"topics": "TEST-S3-SOURCE-POC",
"camel.source.path.bucketNameOrArn": "json-poc",
"camel.component.aws-s3.region": "us-east-1",
"tasks.max": "2",
"camel.source.endpoint.autocloseBody": "true"
}
}
And this is the error I receive when I try to define the connector:
{
"error_code": 405,
"message": "HTTP 405 Method Not Allowed"
}
And this is the connector status:
{
"name": "CamelAWSS3SourceConnector",
"connector": {
"state": "RUNNING",
"worker_id": "confluent-connect-server2:8083"
},
"tasks": [{
"id": 0,
"state": "FAILED",
"worker_id": "confluent-connect-server2",
"trace": "org.apache.kafka.connect.errors.ConnectException: Failed to create and start Camel context
at org.apache.camel.kafkaconnector.CamelSourceTask.start(CamelSourceTask.java:118)
at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:215)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:184)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:234)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Option bucketNameOrArn is required when creating endpoint uri with syntax aws-s3://bucketNameOrArn
at org.apache.camel.support.component.EndpointUriFactorySupport.buildPathParameter(EndpointUriFactorySupport.java:53)
at org.apache.camel.component.aws.s3.S3EndpointUriFactory.buildUri(S3EndpointUriFactory.java:103)
at org.apache.camel.kafkaconnector.utils.TaskHelper.buildUrl(TaskHelper.java:68)
at org.apache.camel.kafkaconnector.CamelSourceTask.start(CamelSourceTask.java:98)
... 8 more"
}],
"type": "source"
}
What could be the cause of the above error?
I am told that since the Connect server is an EC2 instance, I don't have to define AWS parameters here. Is that correct?
Thank you.
Note: I wanted to add that there are two Connect servers, and the error shows up for only one of them in the output.

Just wanted to post an answer here in case someone else runs into this.
The issue was that when using PUT to add or update a connector, the URL of the curl request needs to be of the form http(s)://<serverurl>/connectors/<connectorName>/config. My JSON contained the "name" key, and the URL was just http(s)://<serverurl>. Initially I created the connector using POST, and then I was trying to use PUT to update (actually, add) the bucketNameOrArn key, which in reality was never applied.
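For reference, here is a minimal sketch of the two request shapes (the host and port localhost:8083 are placeholders; adjust them for your environment):
# Create a new connector: POST the "name" + "config" wrapper to /connectors
curl -X POST -H "Content-Type: application/json" http://localhost:8083/connectors \
  -d '{"name": "CamelAWSS3SourceConnector", "config": {"connector.class": "org.apache.camel.kafkaconnector.awss3.CamelAwss3SourceConnector", "tasks.max": "2"}}'
# Create or update a connector: PUT only the bare config map (no "name" wrapper) to /connectors/<connectorName>/config
curl -X PUT -H "Content-Type: application/json" http://localhost:8083/connectors/CamelAWSS3SourceConnector/config \
  -d '{"connector.class": "org.apache.camel.kafkaconnector.awss3.CamelAwss3SourceConnector", "tasks.max": "2", "camel.source.path.bucketNameOrArn": "json-poc"}'
A PUT sent to a path that only accepts GET and POST (for example the bare /connectors endpoint) is the kind of request that comes back with HTTP 405 Method Not Allowed, as in the first error above.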

Related

Kafka Connect Mongo Sink connector is failing due to schema

I'm taking my first steps with Kafka Connect, and things are not going as expected.
I created a Mongo sink connector (Kafka to Mongo), but it fails to consume the messages with the following error:
Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask)
org.apache.kafka.connect.errors.ConnectException: Tolerance exceeded in error handler
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:178)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:104)
at org.apache.kafka.connect.runtime.WorkerSinkTask.convertAndTransformRecord(WorkerSinkTask.java:489)
at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:469)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:325)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:228)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:196)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:184)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:234)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.connect.errors.DataException: Failed to deserialize data for topic test_topic to Avro:
at io.confluent.connect.avro.AvroConverter.toConnectData(AvroConverter.java:114)
at org.apache.kafka.connect.storage.Converter.toConnectData(Converter.java:87)
at org.apache.kafka.connect.runtime.WorkerSinkTask.lambda$convertAndTransformRecord$1(WorkerSinkTask.java:489)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162)
... 13 more
I don't want to use any schema, but for some reason it fails on deserialization, probably because the schema config is missing.
My configuration:
{
"name": "kafka-to-mongo",
"config": {
"connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
"tasks.max": "1",
"topics": "test_topic",
"connection.uri": "mongodb://mongodb4:27017/test-db",
"database": "auto-resolve",
"value.converter.schemas.enable": "false",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"collection": "test_collection",
"document.id.strategy": "com.mongodb.kafka.connect.sink.processor.id.strategy.PartialValueStrategy",
"document.id.strategy.partial.value.projection.list": "id",
"document.id.strategy.partial.value.projection.type": "AllowList",
"writemodel.strategy": "com.mongodb.kafka.connect.sink.writemodel.strategy.ReplaceOneBusinessKeyStrategy"
}
}
Example of a message in the topic:
{
"_id": "62a6e4d88c1e1e0011902616",
"type": "alert",
"key": "test444",
"timestamp": 1655104728,
"source_system": "api.test",
"tags": [
{
"type": "aaa",
"value": "bbb"
},
{
"type": "fff",
"value": "rrr"
}
],
"ack": "no",
"main": "nochange",
"all_ids": []
}
Any ideas? Thanks!

Kafka sink connector --> postgres, fails with avro JSON data

I set up a Kafka JDBC sink to send events to PostgreSQL.
I wrote this simple producer that sends JSON-with-schema (Avro) data to a topic as follows:
producer.py (kafka-python)
biometrics = {
"heartbeat": self.pulse, # integer
"oxygen": self.oxygen,# integer
"temprature": self.temprature, # float
"time": time # string
}
avro_value = {
"schema": open(BASE_DIR+"/biometrics.avsc").read(),
"payload": biometrics
}
producer.send("biometrics",
key="some_string",
value=avro_value
)
Value Schema:
{
"type": "record",
"name": "biometrics",
"namespace": "athlete",
"doc": "athletes biometrics"
"fields": [
{
"name": "heartbeat",
"type": "int",
"default": 0
},
{
"name": "oxygen",
"type": "int",
"default": 0
},
{
"name": "temprature",
"type": "float",
"default": 0.0
},
{
"name": "time",
"type": "string"
"default": ""
}
]
}
Connector config (without hosts, passwords etc)
{
"name": "jdbc_sink",
"connector.class": "io.aiven.connect.jdbc.JdbcSinkConnector",
"tasks.max": "1",
"key.converter": "org.apache.kafka.connect.storage.StringConverter ",
"value.converter": "io.confluent.connect.avro.AvroConverter",
"topics": "biometrics",
"insert.mode": "insert",
"auto.create": "true"
}
But my connector fails hard, with three errors, and I am unable to spot the reason for any of them:
TL;DR log version
(Error 1) Caused by: org.apache.kafka.connect.errors.DataException: biometrics
(Error 2) Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id -1
(Error 3) Caused by: org.apache.kafka.common.errors.SerializationException: Unknown magic byte!
Full log
org.apache.kafka.connect.errors.ConnectException: Tolerance exceeded in error handler
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:206)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:132)
at org.apache.kafka.connect.runtime.WorkerSinkTask.convertAndTransformRecord(WorkerSinkTask.java:498)
at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:475)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:325)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:229)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:201)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:185)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:235)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.kafka.connect.errors.DataException: biometrics
at io.confluent.connect.avro.AvroConverter.toConnectData(AvroConverter.java:98)
at org.apache.kafka.connect.storage.Converter.toConnectData(Converter.java:87)
at org.apache.kafka.connect.runtime.WorkerSinkTask.lambda$convertAndTransformRecord$1(WorkerSinkTask.java:498)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:156)
at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:190)
... 13 more
Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id -1
Caused by: org.apache.kafka.common.errors.SerializationException: Unknown magic byte!
Could someone help me understand those errors and the underlying reason?
The error is because you need to use the JsonConverter class with value.converter.schemas.enable=true in your connector, since that is what was produced. However, the schema payload is not an Avro schema representation of the payload, so it might still fail with those changes alone...
If you want to actually send Avro, then use the AvroProducer in the confluent-kafka library, which requires running the Schema Registry.
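For the first option, the converter settings in the sink config would look roughly like this (a sketch only; the remaining connector properties stay as posted):
"key.converter": "org.apache.kafka.connect.storage.StringConverter",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"value.converter.schemas.enable": "true"
Note that with schemas.enable=true the JsonConverter expects each value to be a {"schema": ..., "payload": ...} envelope where "schema" is in Connect's own JSON schema format (struct type and fields), not the Avro .avsc text the producer currently embeds, which is why it might still fail.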

Kafka Connect JDBC Sink "schema does not exist"

I am trying to sink data into PostgreSQL with Kafka Connect, but I am getting an error that the schema does not exist.
Is it possible that the topic name, which includes dots, is causing the problem? The error says that the schema "logstash" does not exist, and that is exactly the string up to the first dot.
ERROR:
org.apache.kafka.connect.errors.ConnectException: Exiting WorkerSinkTask due to unrecoverable exception.
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:568)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:326)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:228)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:196)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:184)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:234)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.connect.errors.ConnectException: java.sql.SQLException: org.postgresql.util.PSQLException: ERROR: schema \"logstash\" does not exist
Position: 14
at io.confluent.connect.jdbc.sink.JdbcSinkTask.put(JdbcSinkTask.java:87)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:546)
... 10 more
Caused by: java.sql.SQLException: org.postgresql.util.PSQLException: ERROR: schema \"logstash\" does not exist
Position: 14
... 12 more
Sink config:
{
"name": "jdbc.apache.access.log.sink",
"config": {
"connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
"tasks.max": "1",
"key.converter": "org.apache.kafka.connect.storage.StringConverter",
"value.converter": "io.confluent.connect.avro.AvroConverter",
"topics": "logstash.apache.access.log",
"connection.url": "jdbc:postgresql://<IP_OF_POSTGRESQL>:5432/kafka",
"connection.user": "kafka",
"connection.password": "<PASSWORD>",
"insert.mode": "upsert",
"pk.mode": "kafka",
"auto.create": true,
"auto.evolve": true,
"value.converter.schema.registry.url": "http://schema-registry:8081",
"key.converter.schemas.enable": "false",
"value.converter.schemas.enable": "true"
}
}
Schema (called with API):
{
"subject": "logstash.apache.access.log-value",
"version": 3,
"id": 3,
"schema": "{\"type\":\"record\",\"name\":\"log\",\"namespace\":\"value_logstash.apache.access\",\"fields\":[{\"name\":\"clientip\",\"type\":[\"null\",\"string\"],\"default\":null},{\"name\":\"verb\",\"type\":[\"null\",\"string\"],\"default\":null},{\"name\":\"response\",\"type\":[\"null\",\"string\"],\"default\":null},{\"name\":\"request\",\"type\":[\"null\",\"string\"],\"default\":null},{\"name\":\"bytes\",\"type\":[\"null\",\"string\"],\"default\":null}]}"
}
EDITED:
I tried creating a new topic with underscores, and it looks like the dots really are the cause of the error. Is there any way to avoid this, or did I make a mistake in my configuration?
You should be able to use a RegexRouter SMT to rename the topic to something without periods before the sink performs the database write.
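A sketch of what that could look like added to the sink connector config (the transform alias renameTopic is arbitrary):
"transforms": "renameTopic",
"transforms.renameTopic.type": "org.apache.kafka.connect.transforms.RegexRouter",
"transforms.renameTopic.regex": "logstash\\.apache\\.access\\.log",
"transforms.renameTopic.replacement": "logstash_apache_access_log"
With table.name.format left at its default of ${topic}, the sink should then write to a table named logstash_apache_access_log instead of trying to resolve logstash as a PostgreSQL schema.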

S3 sink record field TimeBasedPartitioner not working

I am trying to deploy an S3 sink connector where the S3 partitions need to be based on a field in the data, props.eventTime.
Following is my config:
{
"name" : "test_timeBasedPartitioner",
"connector.class": "io.confluent.connect.s3.S3SinkConnector",
"partition.duration.ms": "3600000",
"s3.region": "us-east-1",
"topics.dir": "dream11",
"flush.size": "50000",
"topics": "test_topic",
"s3.part.size": "5242880",
"tasks.max": "5",
"timezone": "Etc/UTC",
"locale": "en",
"format.class": "io.confluent.connect.s3.format.json.JsonFormat",
"partitioner.class": "io.confluent.connect.storage.partitioner.TimeBasedPartitioner",
"schema.generator.class": "io.confluent.connect.storage.hive.schema.DefaultSchemaGenerator",
"storage.class": "io.confluent.connect.s3.storage.S3Storage",
"rotate.schedule.interval.ms": "1800000",
"path.format": "'EventDate'=YYYYMMdd",
"s3.bucket.name": "test_bucket",
"partition.duration.ms": "86400000",
"timestamp.extractor": "RecordField",
"timestamp.field": "props.eventTime"
}
Following is a sample JSON message present in the Kafka topic:
{
"eventName": "testEvent",
"props": {
"screen_resolution": "1436x720",
"userId": 0,
"device_name": "1820",
"eventTime": "1565792661712"
}
}
And the exception that I am getting is:
org.apache.kafka.connect.errors.ConnectException: Exiting WorkerSinkTask due to unrecoverable exception.
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:546)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:302)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:205)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:173)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: Invalid format: "1564561484906" is malformed at "4906"
at org.joda.time.format.DateTimeParserBucket.doParseMillis(DateTimeParserBucket.java:187)
at org.joda.time.format.DateTimeFormatter.parseMillis(DateTimeFormatter.java:826)
at io.confluent.connect.storage.partitioner.TimeBasedPartitioner$RecordFieldTimestampExtractor.extract(TimeBasedPartitioner.java:281)
at io.confluent.connect.s3.TopicPartitionWriter.executeState(TopicPartitionWriter.java:199)
at io.confluent.connect.s3.TopicPartitionWriter.write(TopicPartitionWriter.java:176)
at io.confluent.connect.s3.S3SinkTask.put(S3SinkTask.java:195)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:524)
... 10 more
Is there something that I am missing here in the configuration?
Any help would be appreciated.
Your field props.eventTime is coming in as microseconds, not milliseconds.
This can be identified from the stack trace and by inspecting the relevant code: the org.joda.time doParseMillis method, which is used by the connector's TimeBasedPartitioner and its RecordFieldTimestampExtractor (the extractor that reads the timestamp from the message payload) when the timestamp.field is a STRING:
Caused by: java.lang.IllegalArgumentException: Invalid format: "1564561484906" is malformed at "4906"
at org.joda.time.format.DateTimeParserBucket.doParseMillis(DateTimeParserBucket.java:187)
at org.joda.time.format.DateTimeFormatter.parseMillis(DateTimeFormatter.java:826)
at io.confluent.connect.storage.partitioner.TimeBasedPartitioner$RecordFieldTimestampExtractor.extract(TimeBasedPartitioner.java:281)
You could follow one of these solutions:
Write your own TimestampExtractor that supports microseconds. You can check how to write a custom TimestampExtractor here.
Change/transform your source data so it includes a field in milliseconds instead of microseconds.
Follow up on the existing issues where the flexibility of the default TimestampExtractor is discussed, and suggest or contribute a change so it supports your use case.

Kafka Connect JDBC sink connector not working

I am trying to use the Kafka Connect JDBC sink connector to insert data into Oracle, but it is throwing an error. I have tried all the possible schema configurations; below are my configuration files and the errors.
Please suggest if I am missing anything.
Case 1 - First configuration:
internal.value.converter.schemas.enable=false
With this I get the following:
[2017-08-28 16:16:26,119] INFO Sink task WorkerSinkTask{id=oracle_sink-0} finished initialization and start (org.apache.kafka.connect.runtime.WorkerSinkTask:233)
[2017-08-28 16:16:26,606] INFO Discovered coordinator dfw-appblx097-01.prod.walmart.com:9092 (id: 2147483647 rack: null) for group connect-oracle_sink. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:597)
[2017-08-28 16:16:26,608] INFO Revoking previously assigned partitions [] for group connect-oracle_sink (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:419)
[2017-08-28 16:16:26,609] INFO (Re-)joining group connect-oracle_sink (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:432)
[2017-08-28 16:16:27,174] INFO Successfully joined group connect-oracle_sink with generation 26 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:399)
[2017-08-28 16:16:27,176] INFO Setting newly assigned partitions [DJ-7, DJ-6, DJ-5, DJ-4, DJ-3, DJ-2, DJ-1, DJ-0, DJ-9, DJ-8] for group connect-oracle_sink (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:262)
[2017-08-28 16:16:28,580] ERROR Task oracle_sink-0 threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerSinkTask:455)
org.apache.kafka.connect.errors.ConnectException: No fields found using key and value schemas for table: DJ
at io.confluent.connect.jdbc.sink.metadata.FieldsMetadata.extract(FieldsMetadata.java:190)
at io.confluent.connect.jdbc.sink.metadata.FieldsMetadata.extract(FieldsMetadata.java:58)
at io.confluent.connect.jdbc.sink.BufferedRecords.add(BufferedRecords.java:65)
at io.confluent.connect.jdbc.sink.JdbcDbWriter.write(JdbcDbWriter.java:62)
at io.confluent.connect.jdbc.sink.JdbcSinkTask.put(JdbcSinkTask.java:66)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:435)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:251)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:180)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:146)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:190)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Case 2 - Second configuration:
internal.key.converter.schemas.enable=true
internal.value.converter.schemas.enable=true
Log:
[2017-08-28 16:23:50,993] INFO Revoking previously assigned partitions [] for group connect-oracle_sink (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:419)
[2017-08-28 16:23:50,993] INFO (Re-)joining group connect-oracle_sink (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:432)
[2017-08-28 16:23:51,260] INFO (Re-)joining group connect-oracle_sink (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:432)
[2017-08-28 16:23:51,381] INFO Successfully joined group connect-oracle_sink with generation 29 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:399)
[2017-08-28 16:23:51,384] INFO Setting newly assigned partitions [DJ-7, DJ-6, DJ-5, DJ-4, DJ-3, DJ-2, DJ-1, DJ-0, DJ-9, DJ-8] for group connect-oracle_sink (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:262)
[2017-08-28 16:23:51,727] ERROR Task oracle_sink-0 threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:148)
org.apache.kafka.connect.errors.DataException: JsonConverter with schemas.enable requires "schema" and "payload" fields and may not contain additional fields. If you are trying to deserialize plain JSON data, set schemas.enable=false in your converter configuration.
at org.apache.kafka.connect.json.JsonConverter.toConnectData(JsonConverter.java:308)
Oracle connector.properties looks like
name=oracle_sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=DJ
connection.url=jdbc:oracle:thin:#hostname:port:sid
connection.user=username
connection.password=password
#key.converter=org.apache.kafka.connect.json.JsonConverter
#value.converter=org.apache.kafka.connect.json.JsonConverter
auto.create=true
auto.evolve=true
Connect-Standalone.properties
My JSON looks like -
{"Item":"12","Sourcing Reason":"corr","Postal Code":"l45","OrderNum":"10023","Intended Node Distance":1125.8,"Chosen Node":"34556","Quantity":1,"Order Date":1503808765201,"Intended Node":"001","Chosen Node Distance":315.8,"Sourcing Logic":"reducesplits"}
Per the documentation
The sink connector requires knowledge of schemas, so you should use a suitable converter e.g. the Avro converter that comes with the schema registry, or the JSON converter with schemas enabled.
So if your data is JSON you would have the following configuration:
[...]
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"value.converter.schemas.enable": "true",
[...]
The error you see in the second instance is pertinent: JsonConverter with schemas.enable requires "schema" and "payload" fields, and the JSON you share does not meet this required format.
Here's a simple example of a valid JSON message with schema and payload:
{
"schema": {
"type": "struct",
"fields": [{
"type": "int32",
"optional": true,
"field": "c1"
}, {
"type": "string",
"optional": true,
"field": "c2"
}, {
"type": "int64",
"optional": false,
"name": "org.apache.kafka.connect.data.Timestamp",
"version": 1,
"field": "create_ts"
}, {
"type": "int64",
"optional": false,
"name": "org.apache.kafka.connect.data.Timestamp",
"version": 1,
"field": "update_ts"
}],
"optional": false,
"name": "foobar"
},
"payload": {
"c1": 10000,
"c2": "bar",
"create_ts": 1501834166000,
"update_ts": 1501834166000
}
}
What's your source for the data that you're trying to land in Oracle? If it's Kafka Connect inbound, then simply using the same converter configuration (Avro + Confluent Schema Registry) would be easier and more efficient. If it's a custom application, you'll need to get it to either (a) use the Confluent Avro serialiser or (b) write the JSON in the required format above, providing the schema of the payload inline with the message.
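If the producer uses the Confluent Avro serialiser, the sink-side converter settings would then look roughly like this (a sketch, assuming a Schema Registry reachable at http://schema-registry:8081; substitute your own URL):
"value.converter": "io.confluent.connect.avro.AvroConverter",
"value.converter.schema.registry.url": "http://schema-registry:8081"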
I had the same problem; after reading this post, I resolved it with the JDBC sink and MySQL.
Below is my Kafka Connect configuration, as additional information:
curl --location --request POST 'http://localhost:8083/connectors/' \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "jdbc-sink",
"config": {
"connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
"tasks.max": "1",
"topics": "ttib-transactions",
"connection.url": "jdbc:mysql://172.17.0.1:6603/tt-tran?verifyServerCertificate=true&useSSL=false",
"connection.user": "root",
"connection.password": "*******",
"value.converter.schema.registry.url": "https://psrc-j55zm.us-central1.gcp.confluent.cloud",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"value.converter.schemas.enable": "true",
"key.converter": "org.apache.kafka.connect.storage.StringConverter",
"key.converter.schemas.enable": "false",
"insert.mode": "insert",
"batch.size":"0",
"table.name.format": "${topic}",
"pk.fields" :"id"
}
}'