Kafka Connect HDFS Sink Connector - Class io.confluent.connect.hdfs.string.StringFormat could not be found

Hi, I am trying to move CSV data from Kafka to HDFS using the HDFS sink connector, and below are the properties I used.
Connect.properties
name=hdfs-sink
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
format.class=io.confluent.connect.hdfs.string.StringFormat
tasks.max=1
topics=topic_name
hadoop.conf.dir=/etc/hadoop/conf
hdfs.url=hdfs://nameservice1/dir
flush.size=3
hdfs.authentication.kerberos=true
connect.hdfs.principal=principal
connect.hdfs.keytab=principal.keytab
hdfs.namenode.principal=principal
partitioner.class=io.confluent.connect.hdfs.partitioner.TimeBasedPartitioner
partition.duration.ms=300000
path.format='year'=YYYY/'month'=MM/'day'=dd
locale=en
timezone=EST
Worker properties
bootstrap.servers=kafkaserver
plugin.path=/opt/confluent/share/java
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
offset.storage.file.filename=/tmp/connect.offsets
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
consumer.group.id=connect_group
consumer.auto.offset.reset=earliest
I am using confluent-5.0.1, but I get the exception below when I run Kafka Connect:
java.util.concurrent.ExecutionException: org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector configuration is invalid and contains the following 1 error(s):
Invalid value io.confluent.connect.hdfs.string.StringFormat for configuration format.class: Class io.confluent.connect.hdfs.string.StringFormat could not be found.
You can also find the above list of errors at the endpoint /{connectorType}/config/validate
at org.apache.kafka.connect.util.ConvertingFutureCallback.result(ConvertingFutureCallback.java:79)
at org.apache.kafka.connect.util.ConvertingFutureCallback.get(ConvertingFutureCallback.java:66)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:110)
Caused by: org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector configuration is invalid and contains the following 1 error(s):
Invalid value io.confluent.connect.hdfs.string.StringFormat for configuration format.class: Class io.confluent.connect.hdfs.string.StringFormat could not be found.
You can also find the above list of errors at the endpoint /{connectorType}/config/validate
at org.apache.kafka.connect.runtime.AbstractHerder.maybeAddConfigErrors(AbstractHerder.java:423)
at org.apache.kafka.connect.runtime.standalone.StandaloneHerder.putConnectorConfig(StandaloneHerder.java:189)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:107)
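A quick sanity check for this kind of "class could not be found" error is to confirm whether the StringFormat class actually ships in the installed connector jar, and to re-run the validation the error message points at. A hedged sketch: the jar path is an assumption based on plugin.path=/opt/confluent/share/java above, and the worker is assumed to expose the default REST port 8083.
# list the Format classes bundled in the installed kafka-connect-hdfs jar
unzip -l /opt/confluent/share/java/kafka-connect-hdfs/kafka-connect-hdfs-*.jar | grep -i format
# re-run config validation via the REST endpoint named in the error message
curl -s -X PUT -H "Content-Type: application/json" \
  http://localhost:8083/connector-plugins/io.confluent.connect.hdfs.HdfsSinkConnector/config/validate \
  -d '{"connector.class":"io.confluent.connect.hdfs.HdfsSinkConnector","format.class":"io.confluent.connect.hdfs.string.StringFormat","topics":"topic_name","hdfs.url":"hdfs://nameservice1/dir","flush.size":"3"}'
If the class is absent from the jar listing, the installed connector version simply does not include it, and format.class has to name a class that jar actually provides.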

Related

kafka SMT keeps failing to extract json field to use as message key

I am using the lenses.io S3 source connector to read JSON files and am trying to set the message key using an SMT.
Here is the config used for the connector on AWS MSK:
connector.class=io.lenses.streamreactor.connect.aws.s3.source.S3SourceConnector
tasks.max=1
topics=topic_3
connect.s3.vhost.bucket=true
connect.s3.aws.auth.mode=Credentials
connect.s3.aws.access.key=<<access key>>
connect.s3.aws.region=eu-central-1
connect.s3.aws.secret.key=<<secret key>>
schema.enable=false
connect.s3.kcql=INSERT INTO topic_3 SELECT * FROM bucket1:json STOREAS `JSON` WITH_FLUSH_COUNT = 1
aws.region=eu-central-1
aws.custom.endpoint=https://s3.eu-central-1.amazonaws.com
transforms.createKey.type=org.apache.kafka.connect.transforms.ValueToKey
transforms=createKey
key.converter.schemas.enable=false
transforms.createKey.fields=id
value.converter.schemas.enable=false
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter=org.apache.kafka.connect.storage.StringConverter
I can't get the SMT to work and am running into the error below:
[Worker-0d3e3af50908b12ee] [2022-04-13 11:43:08,461] ERROR [dev2-s3-source-connector-4|task-0] Error encountered in task dev2-s3-source-connector-4-0. Executing stage 'TRANSFORMATION' with class 'org.apache.kafka.connect.transforms.ValueToKey'. (org.apache.kafka.connect.runtime.errors.LogReporter:66)
[Worker-0d3e3af50908b12ee] org.apache.kafka.connect.errors.DataException: Only Map objects supported in absence of schema for [copying fields from value to key], found: java.lang.String
P.S. If the SMT entries are removed from the config, the JSON files are read into the Kafka topic with no issues (but the message key is empty).
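For context on the error: ValueToKey copies fields out of the record value, and when the value has no schema it requires the value to be a Map. The log shows the value reaching the transform as java.lang.String, so whatever the connector emits is not being deserialized into a map by the time the SMT runs. Below is a minimal sketch of the SMT chain commonly used once the value really is a schemaless map, with an ExtractField step added so the key becomes the bare id rather than a single-field map; the field name id is taken from the config above, everything else is an assumption:
# hedged sketch: build the key from the value, then unwrap it to a primitive
transforms=createKey,extractKey
transforms.createKey.type=org.apache.kafka.connect.transforms.ValueToKey
transforms.createKey.fields=id
transforms.extractKey.type=org.apache.kafka.connect.transforms.ExtractField$Key
transforms.extractKey.field=id
# the value converter must deserialize the payload into a map for
# ValueToKey to see fields at all
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=false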

Confluent Kafka Connector throws "No suitable driver found for jdbc:postgresql"

I am working on a multi-node Confluent Kafka setup with 3 Kafka brokers, a separate node for KSQL and Schema Registry, and the connector running on its own node. All the mapping between those brokers was successful.
I am trying to load Avro-format data into a PostgreSQL table from the KSQL stream topic "twitter-csv_avro" using the Confluent JDBC (PostgreSQL) sink connector. Here is the postgres-sink.properties file:
# key.converter=org.apache.kafka.connect.storage.StringConverter
# key.converter.schema.registry.url=http://host:8081
# value.converter=io.confluent.connect.avro.AvroConverter
# value.converter.schema.registry.url=http://host:8081
# schema.registry.url=http://host:8081
#Postgres sink connector id name which must be unique for each connector
name=twitter-streaming-postgres-sink
# Name of the connector class to be run
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
# Max number of tasks to spawn for this connector instance
tasks.max=1
# Kafka topic name which is input data to postgresql table
topics=TWITTER_CSV_AVRO
# Postgresql Database connection details
connection.url=jdbc:postgresql://host:5432/d2insights?user=postgres&password=*******
key.converter.schemas.enable=false
value.converter.schemas.enable=true
auto.create=false
auto.evolve=true
key.ignore=true
# Postgresql Table name
table.name.format=twitter
# Primary key configuration that
# pk.mode=none
#record_key=??
Here is the connect-avro-standalone.properties file
bootstrap.servers=XXX.XXX.XX.XX:9092,XXX.XXX.XX.XX:9092,XXX.XXX.XX.XX:9092
key.converter=org.apache.kafka.connect.storage.StringConverter
key.converter.schema.registry.url=http://XXX.XXX.XX.XX:8081
key.converter.enhanced.avro.schema.support=true
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://XXX.XXX.XX.XX:8081
value.converter.enhanced.avro.schema.support=true
offset.storage.file.filename=/tmp/connect.offsets
rest.port=9080
plugin.path=/usr/share/java/kafka-connect-jdbc
As I am using Confluent 5.2.1, I am using the default postgresql-42.2.10.jar, which is under /usr/share/java/kafka-connect-jdbc, against a PostgreSQL 10.12 database.
Here is the command I am using to run the connector:
connect-standalone /etc/schema-registry/connect-avro-standalone-postgres.properties /etc/kafka/connect-postgresql-sink1.properties
Although the PostgreSQL jar exists, I am getting the exception below:
[2020-08-25 15:42:29,378] INFO Unable to connect to database on attempt 2/3. Will retry in 10000 ms. (io.confluent.connect.jdbc.util.CachedConnectionProvider:99)
java.sql.SQLException: No suitable driver found for jdbc:postgresql://XXX.XXX.XX.XX:5432/d2insights
at java.sql.DriverManager.getConnection(DriverManager.java:689)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
at io.confluent.connect.jdbc.dialect.GenericDatabaseDialect.getConnection(GenericDatabaseDialect.java:224)
at io.confluent.connect.jdbc.util.CachedConnectionProvider.newConnection(CachedConnectionProvider.java:93)
at io.confluent.connect.jdbc.util.CachedConnectionProvider.getConnection(CachedConnectionProvider.java:62)
at io.confluent.connect.jdbc.sink.JdbcDbWriter.write(JdbcDbWriter.java:56)
at io.confluent.connect.jdbc.sink.JdbcSinkTask.put(JdbcSinkTask.java:74)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:539)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:322)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:224)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:192)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Can someone tell me what I am doing wrong here?
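"No suitable driver found" generally means the driver class never got registered with java.sql.DriverManager, i.e. the driver jar is not visible to the classloader that loads the connector. A hedged first check, assuming the paths from the question:
# confirm the driver jar really sits next to the connector jars that
# plugin.path=/usr/share/java/kafka-connect-jdbc points at
ls /usr/share/java/kafka-connect-jdbc | grep -i postgresql
# if it is missing, copy it in and restart the worker (the source path is an
# assumption; use wherever postgresql-42.2.10.jar actually lives)
cp /tmp/postgresql-42.2.10.jar /usr/share/java/kafka-connect-jdbc/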

Not able to create jdbc-sink connector: ERROR Plugin class loader for connector: 'io.confluent.connect.jdbc.JdbcSinkConnector'

I am trying to create a JDBC sink connector on Kafka Connect (not on Confluent), and I am struggling with the following error:
ERROR Plugin class loader for connector: 'io.confluent.connect.jdbc.JdbcSinkConnector'
I have set the paths below in CLASSPATH and placed all the JDBC-connector-related JARs there:
C:\Oviyan\Software\kafka_2.12-2.3.1\libs
C:\Oviyan\Software\confluentinc-kafka-connect-jdbc-5.3.1\lib
Config properties:
connect-standalone.properties:
bootstrap.servers=localhost:9092
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
offset.storage.file.filename=/tmp/connect.offsets
offset.flush.interval.ms=10000
plugin.path=C:<user-name>\Software\kafka_2.12-2.3.1\libs\kafka-connect-jdbc
rest.port=8086
rest.host.name=localhost
Jdbc-connect-config.properties:
name=test-sink-jdbc-autoincrement
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=test-topic
connection.url=jdbc:postgres://localhost:5432/<db-name>
connection.user=
connection.password=
auto.create=true
Error Log:
ERROR Plugin class loader for connector: 'io.confluent.connect.jdbc.JdbcSinkConnector' was not found.
Returning: org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader#3a1dd365
org.apache.kafka.connect.errors.ConnectException: java.sql.SQLException: No suitable driver found for jdbc:postgres://localhost:5432/kafka-test1
at java.lang.Thread.run(Unknown Source)
Caused by: java.sql.SQLException: No suitable driver found for jdbc:postgres://localhost:5432/kafka-test1
Appreciate any help!
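One detail worth noting in the log: the URL scheme is jdbc:postgres://, but the PostgreSQL JDBC driver only registers itself for the jdbc:postgresql: scheme, so DriverManager will never match it regardless of the classpath. A hedged fix for the connection line, keeping the placeholder database name from the question:
# the PostgreSQL driver answers to jdbc:postgresql:, not jdbc:postgres:
connection.url=jdbc:postgresql://localhost:5432/<db-name>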

MongoDb Debezium - "Connector config contains no connector type"

I am trying to do a POC for Kafka and Debezium.
I have started Kafka and ZooKeeper and they are working. Now, when I try to load Kafka Connect (I am kind of new to this), I get an error and I just can't understand what I am doing wrong.
Note: I have tested all of this with the Debezium tutorial Docker images, but I would like to connect from a remote server, and I thought it would be easier to install everything without Docker so I could play with the configuration.
I am starting Connect with the following command:
./connect-standalone.sh ~/kafka/config/connect-standalone.properties ~/kafka/config/connect-standalone-worker.properties ~/kafka/config/debezium-connector.properties
connect-standalone.properties
bootstrap.servers=localhost:9092
key.converter.schemas.enable=true
value.converter.schemas.enable=true
offset.flush.interval.ms=10000
plugin.path=/home/ubuntu/kafka/plugins
connect-standalone-worker.properties
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
offset.storage.file.filename=/home/user/offest
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter=org.apache.kafka.connect.json.JsonConverter
debezium-connector.properties
name=my-connector
connector.class=io.debezium.connector.mongodb.MongoDbConnector
include.schema.changes=false
mongodb.name=mymongo
collection.whitelist=my.collection
tasks.max=1
mongodb.hosts=A.B.C.D:27017
I get the following when running Connect:
[2018-12-27 15:31:41,995] ERROR Failed to create job for /home/ubuntu/kafka/config/connect-standalone-worker.properties (org.apache.kafka.connect.cli.ConnectStandalone:102)
[2018-12-27 15:31:41,996] ERROR Stopping after connector error (org.apache.kafka.connect.cli.ConnectStandalone:113)
java.util.concurrent.ExecutionException: org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector config {internal.key.converter=org.apache.kafka.connect.json.JsonConverter, offset.storage.file.filename=/home/user/offest, internal.value.converter.schemas.enable=false, internal.value.converter=org.apache.kafka.connect.json.JsonConverter, value.converter=org.apache.kafka.connect.json.JsonConverter, internal.key.converter.schemas.enable=false, key.converter=org.apache.kafka.connect.json.JsonConverter} contains no connector type
at org.apache.kafka.connect.util.ConvertingFutureCallback.result(ConvertingFutureCallback.java:79)
at org.apache.kafka.connect.util.ConvertingFutureCallback.get(ConvertingFutureCallback.java:66)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:110)
Caused by: org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector config {internal.key.converter=org.apache.kafka.connect.json.JsonConverter, offset.storage.file.filename=/home/user/offest, internal.value.converter.schemas.enable=false, internal.value.converter=org.apache.kafka.connect.json.JsonConverter, value.converter=org.apache.kafka.connect.json.JsonConverter, internal.key.converter.schemas.enable=false, key.converter=org.apache.kafka.connect.json.JsonConverter} contains no connector type
at org.apache.kafka.connect.runtime.AbstractHerder.validateConnectorConfig(AbstractHerder.java:259)
at org.apache.kafka.connect.runtime.standalone.StandaloneHerder.putConnectorConfig(StandaloneHerder.java:189)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:107)
[2018-12-27 15:31:41,997] INFO Kafka Connect stopping (org.apache.kafka.connect.runtime.Connect:65)
connect-standalone.properties and connect-standalone-worker.properties need to be one file.
The error is saying that connect-standalone-worker.properties has no connector.class value (and it shouldn't, because it is a worker properties file, not a connector config).
The command you're trying to run should look like
connect-standalone worker.properties connector1.properties [connector2.properties ... ]
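A minimal sketch of the merged worker file, combining the two worker files from the question verbatim (nothing added beyond them):
bootstrap.servers=localhost:9092
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
offset.storage.file.filename=/home/user/offest
offset.flush.interval.ms=10000
plugin.path=/home/ubuntu/kafka/plugins
With that saved as connect-standalone.properties, the original command works once the second worker file is dropped:
./connect-standalone.sh ~/kafka/config/connect-standalone.properties ~/kafka/config/debezium-connector.properties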

Connector config contains no connector type

I'm trying to use the JDBC connector to connect to a PostgreSQL database on my cluster (the database is not directly managed by the cluster).
I've been calling Kafka Connect with the following command:
connect-standalone.sh worker.properties jdbc-connector.properties
This is the content of the worker.properties file:
class=io.confluent.connect.jdbc.JdbcSourceConnector
name=test-postgres-1
tasks.max=1
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
offset.storage.file.filename=/home/user/offest
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter=org.apache.kafka.connect.json.JsonConverter
connection.url=jdbc:postgresql://database-server.url:port/database?user=user&password=password
And this is the content of the jdbc-connector.properties file:
mode=incrementing
incrementing.column.name=id
topic.prefix=test-postgres-jdbc-
When I try to launch the connector with the above command it crashes with the following error:
[2018-04-16 11:39:08,164] ERROR Failed to create job for jdbc.properties (org.apache.kafka.connect.cli.ConnectStandalone:88)
[2018-04-16 11:39:08,166] ERROR Stopping after connector error (org.apache.kafka.connect.cli.ConnectStandalone:99)
java.util.concurrent.ExecutionException: org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector config {mode=incrementing, incrementing.column.name=pdv, topic.prefix=test-postgres-jdbc-} contains no connector type
at org.apache.kafka.connect.util.ConvertingFutureCallback.result(ConvertingFutureCallback.java:80)
at org.apache.kafka.connect.util.ConvertingFutureCallback.get(ConvertingFutureCallback.java:67)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:96)
Caused by: org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector config {mode=incrementing, incrementing.column.name=id, topic.prefix=test-postgres-jdbc-} contains no connector type
at org.apache.kafka.connect.runtime.AbstractHerder.validateConnectorConfig(AbstractHerder.java:233)
at org.apache.kafka.connect.runtime.standalone.StandaloneHerder.putConnectorConfig(StandaloneHerder.java:158)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:93)
After noting that the connector causing the error displayed only information from jdbc-connector.properties, I tried merging the two files together, but then the command terminates abruptly (without creating a topic or an offset file) with the following output:
[SLF4J infos...]
[2018-04-16 11:48:54,620] INFO Usage: ConnectStandalone worker.properties connector1.properties [connector2.properties ...] (org.apache.kafka.connect.cli.ConnectStandalone:59)
You need to have most of those properties in the jdbc-connector.properties, not the worker.properties. See https://docs.confluent.io/current/connect/connect-jdbc/docs/source_config_options.html for a full list of config options that go in the connector configuration (jdbc-connector.properties in your example).
Try this:
worker.properties:
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
offset.storage.file.filename=/home/user/offest
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter=org.apache.kafka.connect.json.JsonConverter
jdbc-connector.properties:
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
name=test-postgres-1
tasks.max=1
mode=incrementing
incrementing.column.name=id
topic.prefix=test-postgres-jdbc-
connection.url=jdbc:postgresql://database-server.url:port/database?user=user&password=password
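With the files split this way, the original invocation works unchanged (a usage sketch, assuming the two files above):
connect-standalone.sh worker.properties jdbc-connector.properties
Note that the connector file needs the full property key connector.class, not class; a connector config that lacks connector.class is exactly what produces the "contains no connector type" error shown above.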
You can see some more examples with Kafka Connect here:
https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-1/
https://www.confluent.io/blog/blogthe-simplest-useful-kafka-connect-data-pipeline-in-the-world-or-thereabouts-part-2/
https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-3/