I am trying out Kafka with a Postgres sink using the JDBC sink connector.
Exception:
INFO Unable to connect to database on attempt 1/3. Will retry in 10000 ms. (io.confluent.connect.jdbc.util.CachedConnectionProvider:91)
java.sql.SQLException: No suitable driver found for jdbc:postgresql://localhost:5432/casb
at java.sql.DriverManager.getConnection(DriverManager.java:689)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
at io.confluent.connect.jdbc.util.CachedConnectionProvider.newConnection(CachedConnectionProvider.java:85)
at io.confluent.connect.jdbc.util.CachedConnectionProvider.getValidConnection(CachedConnectionProvider.java:68)
at io.confluent.connect.jdbc.sink.JdbcDbWriter.write(JdbcDbWriter.java:56)
at io.confluent.connect.jdbc.sink.JdbcSinkTask.put(JdbcSinkTask.java:69)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:495)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:288)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:198)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:166)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Sink.properties:
name=test-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=fp_test
connection.url=jdbc:postgresql://localhost:5432/casb
connection.user=admin
connection.password=***
auto.create=true
I have set plugin.path=/usr/share/java/kafka-connect-jdbc
In /usr/share/java/kafka-connect-jdbc I have the following files:
kafka-connect-jdbc-4.0.0.jar, postgresql-9.4-1206-jdbc41.jar, sqlite-jdbc-3.8.11.2.jar, and some other jars that come packaged with Confluent.
I then downloaded the Postgres JDBC driver jar postgresql-42.2.2.jar, copied it into the same folder, and tried again. Still the same exception.
Kindly help me out with this.
Setting plugin.path=/usr/share/java and CLASSPATH=/usr/share/java/kafka-connect-jdbc/ solved the issue.
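For reference, a minimal sketch of that setup, assuming a default Confluent package layout and a standalone worker (paths and file names are placeholders):
# in the worker config, e.g. connect-standalone.properties
plugin.path=/usr/share/java
# put the directory containing the JDBC driver on the worker classpath, then start the worker
export CLASSPATH=/usr/share/java/kafka-connect-jdbc/
bin/connect-standalone etc/kafka/connect-standalone.properties sink.properties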
I referred to the following documentation on how to connect Kafka and Cassandra:
http://itechseeker.com/en/tutorials-2/apache-cassandra/install-apache-cassandra/
https://lenses.io/blog/2018/03/cassandra-to-kafka/#:~:text=Adding%20the%20Cassandra%20Source%20connector,Connect%20through%20the%20REST%20API
The connection between Kafka and Cassandra was established successfully, but when I tried to insert data from Kafka into Cassandra it threw a CassandraJsonWriter error. I am new to this concept; how can I fix this issue?
connect-standalone.properties:
plugin.path=/kafka/plugins
key.converter=org.apache.kafka.connect.storage.StringConverter
key.converter.schemas.enable=false
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=false
Error Log:
[2022-06-28 16:45:03,699] ERROR [cassandra-sink-orders|task-0] There was an error writing the records Boxed Error (com.datamountaineer.streamreactor.connect.cassandra.sink.CassandraJsonWriter:172)
java.util.concurrent.ExecutionException: Boxed Error
at scala.concurrent.impl.Promise$.resolver(Promise.scala:87)
at scala.concurrent.impl.Promise$.scala$concurrent$impl$Promise$$resolveTry(Promise.scala:79)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:284)
at scala.concurrent.Promise.complete(Promise.scala:53)
at scala.concurrent.Promise.complete$(Promise.scala:52)
at scala.concurrent.impl.Promise$DefaultPromise.complete(Promise.scala:187)
at scala.concurrent.Promise.failure(Promise.scala:104)
at scala.concurrent.Promise.failure$(Promise.scala:104)
at scala.concurrent.impl.Promise$DefaultPromise.failure(Promise.scala:187)
at com.datamountaineer.streamreactor.common.concurrent.ExecutorExtension$RunnableWrapper$.$anonfun$submit$1(ExecutorExtension.scala:34)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ExceptionInInitializerError
at com.datamountaineer.streamreactor.common.converters.ToJsonWithProjections$.apply(ToJsonWithProjections.scala:97)
at com.datamountaineer.streamreactor.connect.cassandra.sink.CassandraJsonWriter.$anonfun$insert$2(CassandraJsonWriter.scala:185)
at com.datamountaineer.streamreactor.connect.cassandra.sink.CassandraJsonWriter.$anonfun$insert$2$adapted(CassandraJsonWriter.scala:179)
at scala.collection.immutable.Map$Map1.foreach(Map.scala:154)
at com.datamountaineer.streamreactor.connect.cassandra.sink.CassandraJsonWriter.insert(CassandraJsonWriter.scala:179)
at com.datamountaineer.streamreactor.connect.cassandra.sink.CassandraJsonWriter.$anonfun$write$4(CassandraJsonWriter.scala:159)
at com.datamountaineer.streamreactor.connect.cassandra.sink.CassandraJsonWriter.$anonfun$write$4$adapted(CassandraJsonWriter.scala:157)
Cassandra connector properties:
name=cassandra-sink-orders
connector.class=com.datamountaineer.streamreactor.connect.cassandra.sink.CassandraSinkConnector
tasks.max=1
topics=five-topic
connect.cassandra.port=9042
connect.cassandra.contact.points=localhost
connect.cassandra.key.space=testkeyspace
connect.cassandra.username=cassandra
connect.cassandra.password=cassandra
connect.cassandra.kcql=INSERT INTO emp SELECT * FROM five-topic
connect.progress.enabled=true
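For context, with JsonConverter and schemas.enable=false the sink expects plain JSON records whose fields map onto the columns of the target emp table. A minimal sketch of producing a test message (the column names id and name are assumptions):
bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic five-topic
{"id": 1, "name": "alice"}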
I get the error below when executing the following command; my Kafka messages are not flowing into HDFS:
/bin/connect-distributed.sh /etc/schema-registry/connect-avro-distributed.properties /kafka-connect/quickstart-hdfs.properties
ERROR Failed to start task local-file-source-0 (org.apache.kafka.connect.runtime.Worker:456)
org.apache.kafka.common.config.ConfigException: Invalid value io.confluent.kafka.serializers.subject.TopicNameStrategy for configuration key.subject.name.strategy: Class io.confluent.kafka.serializers.subject.TopicNameStrategy could not be found.
at org.apache.kafka.common.config.ConfigDef.parseType(ConfigDef.java:718)
at org.apache.kafka.common.config.ConfigDef$ConfigKey.<init>(ConfigDef.java:1063)
at org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:148)
at org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:168)
at org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:207)
at org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:369)
at org.apache.kafka.common.config.ConfigDef.define(ConfigDef.java:382)
at io.confluent.kafka.serializers.AbstractKafkaSchemaSerDeConfig.baseConfigDef(AbstractKafkaSchemaSerDeConfig.java:153)
at io.confluent.connect.avro.AvroConverterConfig.<init>(AvroConverterConfig.java:27)
at io.confluent.connect.avro.AvroConverter.configure(AvroConverter.java:63)
at org.apache.kafka.connect.runtime.isolation.Plugins.newConverter(Plugins.java:263)
at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:434)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:865)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1600(DistributedHerder.java:110)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:880)
at org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:876)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
connect-distributed only takes one properties file, not two. The quickstart files are meant to be used with connect-standalone.
The actual error you're getting about the topic name strategy is likely an issue/bug with the version of the Avro serializers you're using.
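To illustrate the difference, a rough sketch (file names are placeholders; a connector properties file has to be converted to JSON before it can be posted to the REST API):
# standalone: worker properties followed by one or more connector properties files
bin/connect-standalone.sh config/connect-standalone.properties /kafka-connect/quickstart-hdfs.properties
# distributed: only the worker properties; connectors are then created over the REST API
bin/connect-distributed.sh /etc/schema-registry/connect-avro-distributed.properties
curl -X POST -H "Content-Type: application/json" --data @quickstart-hdfs.json http://localhost:8083/connectors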
I have a question about Debezium; I have been stuck on it for more than two weeks.
I use Debezium to pull data from a remote PostgreSQL database every minute, and I always see the error below in the output:
"option "include-unchanged-toast" = "0" is unknown".
The environment is as follows:
Debezium 0.9.5.Final
PostgreSQL 10.8
Kafka 2.1.1
The connector configuration is as follows:
name=diansheng340831
slot.name=diansheng340831
connector.class=io.debezium.connector.postgresql.PostgresConnector
database.hostname=*.*.*
database.port=5432
database.user=*
database.password=*
database.dbname=*
database.history.kafka.bootstrap.servers=localhost:9092
database.server.name=diansheng340831
plugin.name=wal2json
table.whitelist=*
transforms=dropFieldByValue
transforms.dropFieldByValue.type=org.apache.kafka.connect.transforms.ReplaceField$Value
transforms.dropFieldByValue.whitelist=after
max.batch.size=5120
snapshot.fetch.size=5120
database.tcpKeepAlive=true
schema.refresh.mode=columns_diff_exclude_unchanged_toast
max.queue.size=202400
snapshot.mode=never
The error is as follows:
ERROR unexpected exception while streaming logical changes (io.debezium.connector.postgresql.RecordsStreamProducer:147)
org.postgresql.util.PSQLException: ERROR: option "include-unchanged-toast" = "0" is unknown
Where: slot "guangdian10831", output plugin "wal2json", in the startup callback
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2440)
at org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:1116)
at org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:1035)
at org.postgresql.core.v3.CopyDualImpl.readFromCopy(CopyDualImpl.java:41)
at org.postgresql.core.v3.replication.V3PGReplicationStream.receiveNextData(V3PGReplicationStream.java:155)
at org.postgresql.core.v3.replication.V3PGReplicationStream.readInternal(V3PGReplicationStream.java:124)
at org.postgresql.core.v3.replication.V3PGReplicationStream.read(V3PGReplicationStream.java:70)
at io.debezium.connector.postgresql.connection.PostgresReplicationConnection$1.read(PostgresReplicationConnection.java:264)
at io.debezium.connector.postgresql.RecordsStreamProducer.streamChanges(RecordsStreamProducer.java:140)
at io.debezium.connector.postgresql.RecordsStreamProducer.lambda$start$0(RecordsStreamProducer.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2019-08-22 11:16:23,425] INFO WorkerSourceTask{id=guangdian10831-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSourceTask:397)
[2019-08-22 11:16:23,426] INFO WorkerSourceTask{id=guangdian10831-0} flushing 0 outstanding messages for offset commit (org.apache.kafka.connect.runtime.WorkerSourceTask:414)
[2019-08-22 11:16:23,426] ERROR WorkerSourceTask{id=guangdian10831-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:177)
org.apache.kafka.connect.errors.ConnectException: An exception ocurred in the change event producer. This connector will be stopped.
at io.debezium.connector.base.ChangeEventQueue.throwProducerFailureIfPresent(ChangeEventQueue.java:170)
at io.debezium.connector.base.ChangeEventQueue.poll(ChangeEventQueue.java:151)
at io.debezium.connector.postgresql.PostgresConnectorTask.poll(PostgresConnectorTask.java:161)
at org.apache.kafka.connect.runtime.WorkerSourceTask.poll(WorkerSourceTask.java:244)
at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:220)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Can somebody help me?
I have solved the problem.
I made the following change to the Debezium source code and rebuilt it:
@Override
public ChainedLogicalStreamBuilder tryOnceOptions(ChainedLogicalStreamBuilder builder) {
    // return builder.withSlotOption("include-unchanged-toast", 0);  // original
    return builder;  // modified
}
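For completeness, a rough sketch of the rebuild, assuming a Maven checkout of the Debezium 0.9.5.Final sources (module and jar names are assumptions based on the standard Debezium layout):
# rebuild only the PostgreSQL connector module, skipping tests
mvn clean package -DskipTests -DskipITs -pl debezium-connector-postgres -am
# replace the connector jar on the Connect worker's plugin path and restart the worker
cp debezium-connector-postgres/target/debezium-connector-postgres-0.9.5.Final.jar /path/to/plugins/debezium-connector-postgres/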
I started a JDBC sink some weeks ago. Everything was fine until the logs started to throw this error:
[2019-06-27 11:35:44,121] WARN Write of 500 records failed, remainingRetries=10 (io.confluent.connect.jdbc.sink.JdbcSinkTask:68)
java.sql.BatchUpdateException: [Teradata JDBC Driver] [TeraJDBC 16.20.00.10] [Error 1338] [SQLState HY000] A failure occurred while executing a PreparedStatement batch request. Details of the failure can be found in the exception chain that is accessible with getNextException.
at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeBatchUpdateException(ErrorFactory.java:149)
at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeBatchUpdateException(ErrorFactory.java:138)
at com.teradata.jdbc.jdbc_4.TDPreparedStatement.executeBatchDMLArray(TDPreparedStatement.java:276)
at com.teradata.jdbc.jdbc_4.TDPreparedStatement.executeBatch(TDPreparedStatement.java:2754)
at io.confluent.connect.jdbc.sink.BufferedRecords.flush(BufferedRecords.java:99)
at io.confluent.connect.jdbc.sink.BufferedRecords.add(BufferedRecords.java:78)
at io.confluent.connect.jdbc.sink.JdbcDbWriter.write(JdbcDbWriter.java:62)
at io.confluent.connect.jdbc.sink.JdbcSinkTask.put(JdbcSinkTask.java:66)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:429)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:250)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:179)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:139)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:182)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
I have already tried lowering the batch.size property, even as low as 100, and it is still failing.
Added connector status:
{"name":"teradata-sink-K_C_OSUSR_DGL_DFORM_I1-V2",
"connector":{
"state":"RUNNING",
"worker_id":"10.28.148.64:41029"},
"tasks":[{"state":"FAILED","trace":"org.apache.kafka.connect.errors.ConnectException: Exiting WorkerSinkTask due to unrecoverable exception.
org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:451)\n\tat org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:250)
org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:179)\n\tat org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148)
org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:139)
org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:182)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)\n","id":0,"worker_id":"10.28.148.64:41029"}]}
I faced a similar issue while trying to write to Teradata from Spark using JDBC. Apparently this happens with Teradata only, when you try to write to one table using multiple JDBC connections in parallel. In my case, Spark was forking multiple JDBC connections to Teradata while writing the data.
Your options are:
You can restrict your application from making multiple JDBC connections; you can monitor that from Viewpoint if you have access.
You can try providing the JDBC parameter TYPE=FASTLOAD and see if it works, like the sketch below.
Or go for TPT, which handles this issue internally.
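If you try the FASTLOAD route with the JDBC sink, a sketch of what the connection URL might look like (host and database are placeholders); FASTLOAD generally expects a single writer, so keep tasks.max at 1:
connection.url=jdbc:teradata://<host>/DATABASE=<database>,TYPE=FASTLOAD
tasks.max=1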
I am starting to work with NiFi, and this is my first exercise.
I am trying to load a CSV file into a Postgres table. I defined my database driver as shown in the picture.
The error is:
can't load jdbc driver class
In my log file I have this message:
ERROR [StandardProcessScheduler Thread-1] o.a.n.c.s.StandardControllerServiceNode DBCPConnectionPool[id=c25f8f91-0161-1000-a496-8910832bdbd8] F$
org.apache.nifi.reporting.InitializationException: Can't load Database Driver
at org.apache.nifi.dbcp.DBCPConnectionPool.getDriverClassLoader(DBCPConnectionPool.java:249)
at org.apache.nifi.dbcp.DBCPConnectionPool.onConfigured(DBCPConnectionPool.java:198)
at sun.reflect.GeneratedMethodAccessor437.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:137)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:125)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:70)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotation(ReflectionUtils.java:47)
at org.apache.nifi.controller.service.StandardControllerServiceNode$2.run(StandardControllerServiceNode.java:409)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.postgresql.Driver
The Postgres JDBC driver doesn't come packaged with NiFi. You must:
download the driver (jar), e.g. postgresql-42.2.24.jar
place it in the nifi/lib folder
restart NiFi
open your DBCPConnectionPool controller service properties
set Database Driver Class Name to org.postgresql.Driver
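A rough sketch of those steps from the command line (the NiFi install path and driver version are assumptions):
# download the Postgres JDBC driver and drop it into NiFi's lib folder
wget https://jdbc.postgresql.org/download/postgresql-42.2.24.jar
cp postgresql-42.2.24.jar /opt/nifi/lib/
# restart NiFi so the driver is picked up
/opt/nifi/bin/nifi.sh restart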
Resources
https://jdbc.postgresql.org/download.html
https://docs.oracle.com/cd/E19509-01/820-3497/agqka/index.html
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-dbcp-service-nar/1.14.0/org.apache.nifi.dbcp.DBCPConnectionPool/index.html
The asker found the root cause:
I had inserted a return (newline) after the JDBC driver class name.