Hi, I am trying to move CSV data from Kafka to HDFS using the HDFS sink connector, and below are the properties I used.
Connect.properties
name=hdfs-sink
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
format.class=io.confluent.connect.hdfs.string.StringFormat
tasks.max=1
topics=topic_name
hadoop.conf.dir=/etc/hadoop/conf
hdfs.url=hdfs://nameservice1/dir
flush.size=3
hdfs.authentication.kerberos=true
connect.hdfs.principal=principal
connect.hdfs.keytab=principal.keytab
hdfs.namenode.principal=principal
partitioner.class=io.confluent.connect.hdfs.partitioner.TimeBasedPartitioner
partition.duration.ms=300000
path.format='year'=YYYY/'month'=MM/'day'=dd
locale=en
timezone=EST
worker.properties
bootstrap.servers=kafkaserver
plugin.path=/opt/confluent/share/java
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
offset.storage.file.filename=/tmp/connect.offsets
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
consumer.group.id=connect_group
consumer.auto.offset.reset=earliest
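For reference, a standalone worker launch that consumes these two files would look like this (a sketch; the file names here are assumptions):

connect-standalone worker.properties connect.properties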
I am using confluent-5.0.1, but I get the exception below when I run Kafka Connect:
java.util.concurrent.ExecutionException: org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector configuration is invalid and contains the following 1 error(s):
Invalid value io.confluent.connect.hdfs.string.StringFormat for configuration format.class: Class io.confluent.connect.hdfs.string.StringFormat could not be found.
You can also find the above list of errors at the endpoint /{connectorType}/config/validate
    at org.apache.kafka.connect.util.ConvertingFutureCallback.result(ConvertingFutureCallback.java:79)
    at org.apache.kafka.connect.util.ConvertingFutureCallback.get(ConvertingFutureCallback.java:66)
    at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:110)
Caused by: org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector configuration is invalid and contains the following 1 error(s):
Invalid value io.confluent.connect.hdfs.string.StringFormat for configuration format.class: Class io.confluent.connect.hdfs.string.StringFormat could not be found.
You can also find the above list of errors at the endpoint /{connectorType}/config/validate
    at org.apache.kafka.connect.runtime.AbstractHerder.maybeAddConfigErrors(AbstractHerder.java:423)
    at org.apache.kafka.connect.runtime.standalone.StandaloneHerder.putConnectorConfig(StandaloneHerder.java:189)
    at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:107)
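A quick way to check whether that format class actually ships with the installed connector is to list the jar contents (a sketch; the jar path and name are assumptions, adjust to your install):

jar tf /opt/confluent/share/java/kafka-connect-hdfs/kafka-connect-hdfs-5.0.1.jar | grep StringFormat

If grep prints nothing, the class does not exist in this connector version, and format.class needs to name a format class that the installed jar actually provides (or the connector needs to be upgraded to a version that includes StringFormat).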
I am using the lenses.io S3 source connector to read JSON files and I am trying to set the message key using an SMT.
Here is the config used for the connector on AWS MSK:
connector.class=io.lenses.streamreactor.connect.aws.s3.source.S3SourceConnector
tasks.max=1
topics=topic_3
connect.s3.vhost.bucket=true
connect.s3.aws.auth.mode=Credentials
connect.s3.aws.access.key=<<access key>>
connect.s3.aws.region=eu-central-1
connect.s3.aws.secret.key=<<secret key>>
schema.enable=false
connect.s3.kcql=INSERT INTO topic_3 SELECT * FROM bucket1:json STOREAS `JSON` WITH_FLUSH_COUNT = 1
aws.region=eu-central-1
aws.custom.endpoint=https://s3.eu-central-1.amazonaws.com
transforms.createKey.type=org.apache.kafka.connect.transforms.ValueToKey
transforms=createKey
key.converter.schemas.enable=false
transforms.createKey.fields=id
value.converter.schemas.enable=false
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter=org.apache.kafka.connect.storage.StringConverter
I can't get the SMT to work and am running into the error below:
[Worker-0d3e3af50908b12ee] [2022-04-13 11:43:08,461] ERROR [dev2-s3-source-connector-4|task-0] Error encountered in task dev2-s3-source-connector-4-0. Executing stage 'TRANSFORMATION' with class 'org.apache.kafka.connect.transforms.ValueToKey'. (org.apache.kafka.connect.runtime.errors.LogReporter:66)
[Worker-0d3e3af50908b12ee] org.apache.kafka.connect.errors.DataException: Only Map objects supported in absence of schema for [copying fields from value to key], found: java.lang.String
P.S. If the SMT entries are removed from the config, the JSON files are read into the Kafka topic with no issues (but the message key is empty).
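The DataException points at the transform, not the connector: as the message itself says, schemaless ValueToKey only accepts a java.util.Map value, and here the value arriving at the transform is a plain java.lang.String, meaning the raw JSON text was never parsed into a map, so there is no id field to copy. One way to confirm what the source actually emits is to consume the topic with the SMT removed (a sketch; the bootstrap address is an assumption):

kafka-console-consumer --bootstrap-server localhost:9092 --topic topic_3 --from-beginning --max-messages 1

If the value prints as a quoted, escaped JSON string rather than a bare JSON object, the connector is handing Connect string payloads, and the payload has to arrive as a parsed map before ValueToKey can apply.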
I am working on a multinode Confluent Kafka setup with 3 Kafka brokers, a separate node for KSQL and Schema Registry, and the connector running on another node. All the mappings between those brokers were successful.
I am trying to load Avro-format data into a PostgreSQL table from the KSQL stream topic "twitter-csv_avro" using the Confluent PostgreSQL sink connector. Here is the postgres-sink.properties file:
# key.converter=org.apache.kafka.connect.storage.StringConverter
# key.converter.schema.registry.url=http://host:8081
# value.converter=io.confluent.connect.avro.AvroConverter
# value.converter.schema.registry.url=http://host:8081
# schema.registry.url=http://host:8081
#Postgres sink connector id name which must be unique for each connector
name=twitter-streaming-postgres-sink
# Name of the connector class to be run
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
# Max number of tasks to spawn for this connector instance
tasks.max=1
# Kafka topic name which is input data to postgresql table
topics=TWITTER_CSV_AVRO
# Postgresql Database connection details
connection.url=jdbc:postgresql://host:5432/d2insights?user=postgres&password=*******
key.converter.schemas.enable=false
value.converter.schemas.enable=true
auto.create=false
auto.evolve=true
key.ignore=true
# Postgresql Table name
table.name.format=twitter
# Primary key configuration
# pk.mode=none
#record_key=??
Here is the connect-avro-standalone.properties file:
bootstrap.servers=XXX.XXX.XX.XX:9092,XXX.XXX.XX.XX:9092,XXX.XXX.XX.XX:9092
key.converter=org.apache.kafka.connect.storage.StringConverter
key.converter.schema.registry.url=http://XXX.XXX.XX.XX:8081
key.converter.enhanced.avro.schema.support=true
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://XXX.XXX.XX.XX:8081
value.converter.enhanced.avro.schema.support=true
offset.storage.file.filename=/tmp/connect.offsets
rest.port=9080
plugin.path=/usr/share/java/kafka-connect-jdbc
As I am using Confluent 5.2.1, I am using the default postgresql-42.2.10.jar, which is under /usr/share/java/kafka-connect-jdbc, against a PostgreSQL 10.12 database.
Here is the command that I am using to run the connector:
connect-standalone /etc/schema-registry/connect-avro-standalone-postgres.properties /etc/kafka/connect-postgresql-sink1.properties
Although the PostgreSQL jar exists, I am getting the exception below:
[2020-08-25 15:42:29,378] INFO Unable to connect to database on attempt 2/3. Will retry in 10000 ms. (io.confluent.connect.jdbc.util.CachedConnectionProvider:99)
java.sql.SQLException: No suitable driver found for jdbc:postgresql://XXX.XXX.XX.XX:5432/d2insights
at java.sql.DriverManager.getConnection(DriverManager.java:689)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
at io.confluent.connect.jdbc.dialect.GenericDatabaseDialect.getConnection(GenericDatabaseDialect.java:224)
at io.confluent.connect.jdbc.util.CachedConnectionProvider.newConnection(CachedConnectionProvider.java:93)
at io.confluent.connect.jdbc.util.CachedConnectionProvider.getConnection(CachedConnectionProvider.java:62)
at io.confluent.connect.jdbc.sink.JdbcDbWriter.write(JdbcDbWriter.java:56)
at io.confluent.connect.jdbc.sink.JdbcSinkTask.put(JdbcSinkTask.java:74)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:539)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:322)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:224)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:192)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Can someone tell me what I am doing wrong here?
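One thing worth double-checking (an assumption, since the driver jar reportedly exists): plugin.path should point at the directory that contains the plugin directories, not at one plugin's own directory. When it points directly at kafka-connect-jdbc, each jar inside can be treated as a separate plugin, so the connector classes and the driver end up in different classloaders, which can produce exactly this "No suitable driver" error. A sketch of the relevant worker settings:

# point plugin.path at the parent directory that holds the plugin folders
plugin.path=/usr/share/java
# and confirm the driver really sits next to the connector jars
ls /usr/share/java/kafka-connect-jdbc/postgresql-42.2.10.jar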
I am trying to create a JDBC sink connector on Kafka Connect (not on Confluent), and I am struggling with the following error:
ERROR Plugin class loader for connector: 'io.confluent.connect.jdbc.JdbcSinkConnector'
I have set the below paths in CLASSPATH and placed all the JDBC Connect related JARs there:
C:\Oviyan\Software\kafka_2.12-2.3.1\libs
C:\Oviyan\Software\confluentinc-kafka-connect-jdbc-5.3.1\lib
Config properties:
connect-standalone.properties:
bootstrap.servers=localhost:9092
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
offset.storage.file.filename=/tmp/connect.offsets
offset.flush.interval.ms=10000
plugin.path=C:<user-name>\Software\kafka_2.12-2.3.1\libs\kafka-connect-jdbc
rest.port=8086
rest.host.name=localhost
Jdbc-connect-config.properties:
name=test-sink-jdbc-autoincrement
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=test-topic
connection.url=jdbc:postgres://localhost:5432/<db-name>
connection.user=
connection.password=
auto.create=true
Error Log:
ERROR Plugin class loader for connector: 'io.confluent.connect.jdbc.JdbcSinkConnector' was not found. Returning: org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader@3a1dd365
org.apache.kafka.connect.errors.ConnectException: java.sql.SQLException: No suitable driver found for jdbc:postgres://localhost:5432/kafka-test1
    at java.lang.Thread.run(Unknown Source)
Caused by: java.sql.SQLException: No suitable driver found for jdbc:postgres://localhost:5432/kafka-test1
Appreciate any help!
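One concrete problem is visible in the log: the JDBC URL scheme for PostgreSQL is jdbc:postgresql://, not jdbc:postgres://. The PostgreSQL driver does not register for the latter, which by itself yields "No suitable driver found". The corrected line (database name left elided as in the question):

connection.url=jdbc:postgresql://localhost:5432/<db-name>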
I am trying to do a POC for Kafka and Debezium.
I have started Kafka and ZooKeeper and they are working. Now, when I try to load Kafka Connect (I am kind of new to this), I get an error and I just can't understand what I am doing wrong.
Note: I have tested all of this with the Debezium tutorial Docker images, but I would like to connect from a remote server, and I thought it would be easier to install everything without Docker so that I could play with the configuration.
I am starting Connect with the following command:
./connect-standalone.sh ~/kafka/config/connect-standalone.properties ~/kafka/config/connect-standalone-worker.properties ~/kafka/config/debezium-connector.properties
connect-standalone.properties
bootstrap.servers=localhost:9092
key.converter.schemas.enable=true
value.converter.schemas.enable=true
offset.flush.interval.ms=10000
plugin.path=/home/ubuntu/kafka/plugins
connect-standalone-worker.properties
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
offset.storage.file.filename=/home/user/offest
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter=org.apache.kafka.connect.json.JsonConverter
debezium-connector.properties
name=my-connector
connector.class=io.debezium.connector.mongodb.MongoDbConnector
include.schema.changes=false
mongodb.name=mymongo
collection.whitelist=my.collection
tasks.max=1
mongodb.hosts=A.B.C.D:27017
I get the following when running Connect:
[2018-12-27 15:31:41,995] ERROR Failed to create job for /home/ubuntu/kafka/config/connect-standalone-worker.properties (org.apache.kafka.connect.cli.ConnectStandalone:102)
[2018-12-27 15:31:41,996] ERROR Stopping after connector error (org.apache.kafka.connect.cli.ConnectStandalone:113)
java.util.concurrent.ExecutionException: org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector config {internal.key.converter=org.apache.kafka.connect.json.JsonConverter, offset.storage.file.filename=/home/user/offest, internal.value.converter.schemas.enable=false, internal.value.converter=org.apache.kafka.connect.json.JsonConverter, value.converter=org.apache.kafka.connect.json.JsonConverter, internal.key.converter.schemas.enable=false, key.converter=org.apache.kafka.connect.json.JsonConverter} contains no connector type
at org.apache.kafka.connect.util.ConvertingFutureCallback.result(ConvertingFutureCallback.java:79)
at org.apache.kafka.connect.util.ConvertingFutureCallback.get(ConvertingFutureCallback.java:66)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:110)
Caused by: org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector config {internal.key.converter=org.apache.kafka.connect.json.JsonConverter, offset.storage.file.filename=/home/user/offest, internal.value.converter.schemas.enable=false, internal.value.converter=org.apache.kafka.connect.json.JsonConverter, value.converter=org.apache.kafka.connect.json.JsonConverter, internal.key.converter.schemas.enable=false, key.converter=org.apache.kafka.connect.json.JsonConverter} contains no connector type
at org.apache.kafka.connect.runtime.AbstractHerder.validateConnectorConfig(AbstractHerder.java:259)
at org.apache.kafka.connect.runtime.standalone.StandaloneHerder.putConnectorConfig(StandaloneHerder.java:189)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:107)
[2018-12-27 15:31:41,997] INFO Kafka Connect stopping (org.apache.kafka.connect.runtime.Connect:65)
connect-standalone.properties and connect-standalone-worker.properties need to be one file.
The error is saying that connect-standalone-worker.properties has no connector.class value (which it shouldn't, because it is a worker config, not a connector config).
The command you're trying to run should look like
connect-standalone worker.properties connector1.properties [connector2.properties ... ]
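Concretely, for the files in the question, that means folding connect-standalone-worker.properties into connect-standalone.properties (a sketch built only from the values already shown above):

bootstrap.servers=localhost:9092
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
offset.storage.file.filename=/home/user/offest
offset.flush.interval.ms=10000
plugin.path=/home/ubuntu/kafka/plugins

and then start Connect with the merged worker file plus the connector file:

./connect-standalone.sh ~/kafka/config/connect-standalone.properties ~/kafka/config/debezium-connector.properties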
I'm trying to use the JDBC connector to connect to a PostgreSQL database on my cluster (the database is not directly managed by the cluster).
I've been calling Kafka Connect with the following command:
connect-standalone.sh worker.properties jdbc-connector.properties
This is the content of the worker.properties file:
class=io.confluent.connect.jdbc.JdbcSourceConnector
name=test-postgres-1
tasks.max=1
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
offset.storage.file.filename=/home/user/offest
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter=org.apache.kafka.connect.json.JsonConverter
connection.url=jdbc:postgresql://database-server.url:port/database?user=user&password=password
And this is the content of the jdbc-connector.properties file:
mode=incrementing
incrementing.column.name=id
topic.prefix=test-postgres-jdbc-
When I try to launch the connector with the above command, it crashes with the following error:
[2018-04-16 11:39:08,164] ERROR Failed to create job for jdbc.properties (org.apache.kafka.connect.cli.ConnectStandalone:88)
[2018-04-16 11:39:08,166] ERROR Stopping after connector error (org.apache.kafka.connect.cli.ConnectStandalone:99)
java.util.concurrent.ExecutionException: org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector config {mode=incrementing, incrementing.column.name=pdv, topic.prefix=test-postgres-jdbc-} contains no connector type
at org.apache.kafka.connect.util.ConvertingFutureCallback.result(ConvertingFutureCallback.java:80)
at org.apache.kafka.connect.util.ConvertingFutureCallback.get(ConvertingFutureCallback.java:67)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:96)
Caused by: org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector config {mode=incrementing, incrementing.column.name=id, topic.prefix=test-postgres-jdbc-} contains no connector type
at org.apache.kafka.connect.runtime.AbstractHerder.validateConnectorConfig(AbstractHerder.java:233)
at org.apache.kafka.connect.runtime.standalone.StandaloneHerder.putConnectorConfig(StandaloneHerder.java:158)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:93)
After noticing that the failing connector config contained only information from jdbc-connector.properties, I tried merging the two files, but then the command terminates abruptly (without creating a topic or an offset file) with the following output:
[SLF4J infos...]
[2018-04-16 11:48:54,620] INFO Usage: ConnectStandalone worker.properties connector1.properties [connector2.properties ...] (org.apache.kafka.connect.cli.ConnectStandalone:59)
You need to have most of those properties in jdbc-connector.properties, not worker.properties. See https://docs.confluent.io/current/connect/connect-jdbc/docs/source_config_options.html for a full list of the config options that go in the connector configuration (jdbc-connector.properties in your example).
Try this:
worker.properties:
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
offset.storage.file.filename=/home/user/offest
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter=org.apache.kafka.connect.json.JsonConverter
jdbc-connector.properties:
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
name=test-postgres-1
tasks.max=1
mode=incrementing
incrementing.column.name=id
topic.prefix=test-postgres-jdbc-
connection.url=jdbc:postgresql://database-server.url:port/database?user=user&password=password
You can see some more examples with Kafka Connect here:
https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-1/
https://www.confluent.io/blog/the-simplest-useful-kafka-connect-data-pipeline-in-the-world-or-thereabouts-part-2/
https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-3/