Can't load JDBC driver class: PostgreSQL + NiFi

I've started working with NiFi and this is my first exercise: trying to load a CSV file into a Postgres table. I defined my database driver as shown in the picture.
The error is:
can't load jdbc driver class
In my log file I have this message:
ERROR [StandardProcessScheduler Thread-1] o.a.n.c.s.StandardControllerServiceNode DBCPConnectionPool[id=c25f8f91-0161-1000-a496-8910832bdbd8] F$
org.apache.nifi.reporting.InitializationException: Can't load Database Driver
at org.apache.nifi.dbcp.DBCPConnectionPool.getDriverClassLoader(DBCPConnectionPool.java:249)
at org.apache.nifi.dbcp.DBCPConnectionPool.onConfigured(DBCPConnectionPool.java:198)
at sun.reflect.GeneratedMethodAccessor437.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:137)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:125)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:70)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotation(ReflectionUtils.java:47)
at org.apache.nifi.controller.service.StandardControllerServiceNode$2.run(StandardControllerServiceNode.java:409)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.postgresql.Driver

The PostgreSQL JDBC driver doesn't come packaged with NiFi. You must:
1. Download the driver jar, e.g. postgresql-42.2.24.jar
2. Place it in the nifi/lib folder
3. Restart NiFi
4. Open your DBCPConnectionPool controller service properties
5. Set Database Driver Class Name to org.postgresql.Driver
A shell sketch of the first three steps follows below.
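For example (a minimal sketch, assuming a Linux install under /opt/nifi; adjust the paths and driver version to your environment):

cd /opt/nifi/lib
curl -O https://jdbc.postgresql.org/download/postgresql-42.2.24.jar   # fetch the driver jar
/opt/nifi/bin/nifi.sh restart                                         # restart so NiFi picks up the new jar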
Resources
https://jdbc.postgresql.org/download.html
https://docs.oracle.com/cd/E19509-01/820-3497/agqka/index.html
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-dbcp-service-nar/1.14.0/org.apache.nifi.dbcp.DBCPConnectionPool/index.html

The asker found the root cause:
I had inserted a line break (return) after the JDBC driver class name.
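In other words, the Database Driver Class Name property must contain nothing but the class name; a trailing newline makes the class lookup fail with ClassNotFoundException. A hypothetical before/after:

Database Driver Class Name = org.postgresql.Driver\n   (trailing return: fails to load)
Database Driver Class Name = org.postgresql.Driver     (correct)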

Related

Sink Kafka topic to MongoDB

I have a project where I need to read data from JSON files using Java, sink it into a Kafka topic, and then sink that data from the topic to MongoDB.
I found the Kafka-MongoDB connector, but the documentation covers connecting only via the Confluent platform.
I have tried:
1. Downloaded mongo-kafka-connect-1.2.0.jar from Maven.
2. Put the file in /kafka/plugins.
3. Added the line plugin.path=C:\kafka\plugins to connect-standalone.properties.
4. Created MongoSinkConnector.properties:
name=mongo-sink
topics=test
connector.class=com.mongodb.kafka.connect.MongoSinkConnector
tasks.max=1
key.ignore=true
# Specific global MongoDB Sink Connector configuration
connection.uri=mongodb://localhost:27017
database=student_kafka
collection=students
max.num.retries=3
retries.defer.timeout=5000
type.name=kafka-connect
and then I ran the command
.\bin\windows\connect-standalone.bat .\config\connect-standalone.properties .\config\MongoSinkConnector.properties
I got this error:
[2020-08-09 20:18:30,329] ERROR Stopping after connector error (org.apache.kafka.connect.cli.ConnectStandalone)
java.util.concurrent.ExecutionException: java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: com/mongodb/ConnectionString
at org.apache.kafka.connect.util.ConvertingFutureCallback.result(ConvertingFutureCallback.java:115)
at org.apache.kafka.connect.util.ConvertingFutureCallback.get(ConvertingFutureCallback.java:99)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:118)
Caused by: java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: com/mongodb/ConnectionString
at com.mongodb.kafka.connect.sink.MongoSinkConfig.createConfigDef(MongoSinkConfig.java:248)
at com.mongodb.kafka.connect.sink.MongoSinkConfig.<clinit>(MongoSinkConfig.java:139)
at com.mongodb.kafka.connect.MongoSinkConnector.config(MongoSinkConnector.java:72)
at org.apache.kafka.connect.runtime.AbstractHerder.validateConnectorConfig(AbstractHerder.java:366)
at org.apache.kafka.connect.runtime.AbstractHerder.lambda$validateConnectorConfig$1(AbstractHerder.java:326)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.NoClassDefFoundError: com/mongodb/ConnectionString
... 10 more
Caused by: java.lang.ClassNotFoundException: com.mongodb.ConnectionString
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
... 10 more
EDIT:
Thanks to BogdanSucaciu for the help; I found a solution. You need to add the following jars to the kafka/lib folder:
- mongodb-driver-3.12.7.jar
- mongodb-driver-core-3.12.7.jar
- mongo-java-driver-3.12.6.jar
- mongo-kafka-connect-1.0.1.jar
PS: I had some problems working with the latest mongo-kafka-connect, so I had to work with this version.
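For reference, a sketch of fetching those versions from Maven Central (the coordinates below are my assumption of where the artifacts live; verify them before use):

cd /kafka/lib
curl -O https://repo1.maven.org/maven2/org/mongodb/mongodb-driver/3.12.7/mongodb-driver-3.12.7.jar
curl -O https://repo1.maven.org/maven2/org/mongodb/mongodb-driver-core/3.12.7/mongodb-driver-core-3.12.7.jar
curl -O https://repo1.maven.org/maven2/org/mongodb/mongo-java-driver/3.12.6/mongo-java-driver-3.12.6.jar
curl -O https://repo1.maven.org/maven2/org/mongodb/kafka/mongo-kafka-connect/1.0.1/mongo-kafka-connect-1.0.1.jar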
You are missing the MongoDB driver. The MongoDB connector jar contains only the classes relevant to Kafka Connect, but it still needs a driver to be able to connect to a MongoDB instance. You need to download that driver and copy the jar file to the same path where you've published your connector (C:\kafka\plugins).
To keep things clean, you should also create another folder inside that plugins directory (e.g. C:\kafka\plugins\mongodb) and move everything relevant to this connector there.
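The resulting layout would look something like this (a sketch; exact file names depend on the versions you download):

C:\kafka\plugins\
    mongodb\
        mongo-kafka-connect-1.2.0.jar
        mongodb-driver-3.12.7.jar
        mongodb-driver-core-3.12.7.jar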
Later edit:
I went through an older setup that I had with Kafka Connect and the MongoDB sink connector, and found several additional jars alongside the connector (listed in a screenshot in the original post). This makes me believe that the kafka-connect-mongodb jar and the mongodb-driver alone won't be enough. You can give it a try, though.

NiFi Writing to HDFS Error: java.lang.IllegalArgumentException: Can not create a Path from an empty string

I am facing a problem with NiFi writing to HDFS. I am getting an error:
ERROR [Timer-Driven Process Thread-10] o.apache.nifi.processors.hadoop.PutHDFS PutHDFS[id=4af43efa-a8ff-18ac-0000-00002377fba5] Failed to properly initialize Processor. If still scheduled to run, NiFi will attempt to initialize and run the Processor again after the 'Administrative Yield Duration' has elapsed. Failure is due to java.lang.reflect.InvocationTargetException: java.lang.reflect.InvocationTargetException
java.lang.reflect.InvocationTargetException: null
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:137)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:125)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:70)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotation(ReflectionUtils.java:47)
at org.apache.nifi.controller.StandardProcessorNode.lambda$initiateStart$1(StandardProcessorNode.java:1364)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: Can not create a Path from an empty string
at org.apache.hadoop.fs.Path.checkPathArg(Path.java:126)
at org.apache.hadoop.fs.Path.<init>(Path.java:134)
at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.getConfigurationFromResources(AbstractHadoopProcessor.java:225)
at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.resetHDFSResources(AbstractHadoopProcessor.java:254)
at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.abstractOnScheduled(AbstractHadoopProcessor.java:205)
... 15 common frames omitted
My HDFS configuration is shown in the picture.
Note: the same configuration was applied to PutFile and it worked perfectly (kafka.topic was not empty).
It seems that when the PutHDFS processor tries to save the file, it looks for the kafka.topic attribute associated with the flowfile, but the attribute has no value. Make sure the kafka.topic attribute has a value associated with it; you can use an UpdateAttribute processor before PutHDFS to add the attribute, as sketched below.
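For example, if the PutHDFS Directory property uses Expression Language such as /data/${kafka.topic}, an empty kafka.topic yields an empty path string. A sketch of an UpdateAttribute property that supplies a default (the fallback value default_topic is made up):

kafka.topic = ${kafka.topic:isEmpty():ifElse('default_topic', ${kafka.topic})}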

JDBC Sink connector throwing java.sql.BatchUpdateException

I started a JDBC sink connector some weeks ago. Everything was fine until the logs started to throw this error:
[2019-06-27 11:35:44,121] WARN Write of 500 records failed, remainingRetries=10 (io.confluent.connect.jdbc.sink.JdbcSinkTask:68)
java.sql.BatchUpdateException: [Teradata JDBC Driver] [TeraJDBC 16.20.00.10] [Error 1338] [SQLState HY000] A failure occurred while executing a PreparedStatement batch request. Details of the failure can be found in the exception chain that is accessible with getNextException.
at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeBatchUpdateException(ErrorFactory.java:149)
at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeBatchUpdateException(ErrorFactory.java:138)
at com.teradata.jdbc.jdbc_4.TDPreparedStatement.executeBatchDMLArray(TDPreparedStatement.java:276)
at com.teradata.jdbc.jdbc_4.TDPreparedStatement.executeBatch(TDPreparedStatement.java:2754)
at io.confluent.connect.jdbc.sink.BufferedRecords.flush(BufferedRecords.java:99)
at io.confluent.connect.jdbc.sink.BufferedRecords.add(BufferedRecords.java:78)
at io.confluent.connect.jdbc.sink.JdbcDbWriter.write(JdbcDbWriter.java:62)
at io.confluent.connect.jdbc.sink.JdbcSinkTask.put(JdbcSinkTask.java:66)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:429)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:250)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:179)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:139)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:182)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
I have already tried lowering the batch.size property, even as low as 100, and it is still failing.
Added connector status:
{"name":"teradata-sink-K_C_OSUSR_DGL_DFORM_I1-V2",
 "connector":{
   "state":"RUNNING",
   "worker_id":"10.28.148.64:41029"},
 "tasks":[{"id":0,"worker_id":"10.28.148.64:41029","state":"FAILED",
   "trace":"org.apache.kafka.connect.errors.ConnectException: Exiting WorkerSinkTask due to unrecoverable exception.
     at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:451)
     at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:250)
     at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:179)
     at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148)
     at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:139)
     at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:182)
     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
     at java.lang.Thread.run(Thread.java:748)"}]}
I faced a similar issue while trying to write to Teradata from Spark using JDBC. Apparently this happens with Teradata only, when you try to write to one table using multiple JDBC connections in parallel. In my case, Spark was forking multiple JDBC connections while writing the data to Teradata.
Your options are:
- Restrict your application so it does not open multiple JDBC connections; you can monitor connections from Viewpoint if you have access.
- Try providing the JDBC parameter TYPE=FASTLOAD and see if it works (see the sketch below).
- Or go for TPT, which handles this issue internally.
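A minimal sketch of the FASTLOAD option for a Kafka Connect JDBC sink connection URL (the host and database names are made up):

connection.url=jdbc:teradata://my-teradata-host/DATABASE=my_db,TYPE=FASTLOAD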

Configuring a replica set of a Spring Boot back-end that uses a PostgreSQL db

I am quite new to Spring Boot. I have a typical application that consists of a front-end, a back-end, and a PostgreSQL database (see below).
I deployed this app to a local OpenShift with one pod for each. The app worked fine, but when I tried to replicate the Spring Boot back-end pods I noticed that only one of the replicas can access the database. I thought at first that this was due to the RWO access mode of the PostgreSQL volume, so I changed it to shared access (RWX), but I am still having the problem.
error:
java.lang.IllegalArgumentException: Sensor type ST0001 already exists
at com.foo.boo.SupplyChainController.addSensorType(SupplyChainController.kt:138) ~[classes!/:na]
at com.foo.boo.SupplyChainController.initializeSupplyChain(SupplyChainController.kt:203) ~[classes!/:na]
at com.foo.boo.service.SupplyChainService.instatiateSupplyChain(SupplyChainService.java:41) ~[classes!/:na]
at com.foo.boo.websocket.SupplyChainSocketController.instatiateSupplyChain(SupplyChainSocketController.java:32) ~[classes!/:na]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_131]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_131]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_131]
5.0.8.RELEASE.jar!/:5.0.8.RELEASE]
at org.springframework.messaging.support.ExecutorSubscribableChannel$SendTask.run(ExecutorSubscribableChannel.java:138) [spring-messaging-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_131]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]
Does anybody know what's really happening here? I can provide more info if needed.

Kafka Connect: No suitable driver found

I am trying Kafka with a Postgres sink using the JDBC sink connector.
Exception:
INFO Unable to connect to database on attempt 1/3. Will retry in 10000 ms. (io.confluent.connect.jdbc.util.CachedConnectionProvider:91)
java.sql.SQLException: No suitable driver found for jdbc:postgresql://localhost:5432/casb
at java.sql.DriverManager.getConnection(DriverManager.java:689)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
at io.confluent.connect.jdbc.util.CachedConnectionProvider.newConnection(CachedConnectionProvider.java:85)
at io.confluent.connect.jdbc.util.CachedConnectionProvider.getValidConnection(CachedConnectionProvider.java:68)
at io.confluent.connect.jdbc.sink.JdbcDbWriter.write(JdbcDbWriter.java:56)
at io.confluent.connect.jdbc.sink.JdbcSinkTask.put(JdbcSinkTask.java:69)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:495)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:288)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:198)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:166)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Sink.properties:
name=test-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=fp_test
connection.url=jdbc:postgresql://localhost:5432/casb
connection.user=admin
connection.password=***
auto.create=true
I have set plugin.path=/usr/share/java/kafka-connect-jdbc.
In /usr/share/java/kafka-connect-jdbc I have the following files: kafka-connect-jdbc-4.0.0.jar, postgresql-9.4-1206-jdbc41.jar, sqlite-jdbc-3.8.11.2.jar, and some other jars that come packaged with Confluent.
I then downloaded the Postgres JDBC driver jar postgresql-42.2.2.jar, copied it into the same folder, and tried again. Still the same exception.
Kindly help me out with this.
Setting plugin.path=/usr/share/java and CLASSPATH=/usr/share/java/kafka-connect-jdbc/ solved the issue.
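For reference, a sketch of that setup (paths follow the Confluent package layout; presumably plugin.path must point at the parent directory so each subdirectory is scanned as a plugin, while CLASSPATH lets java.sql.DriverManager find the Postgres driver):

# in connect-standalone.properties (or connect-distributed.properties)
plugin.path=/usr/share/java

# exported in the shell before starting the Connect worker
export CLASSPATH=/usr/share/java/kafka-connect-jdbc/
connect-standalone /etc/kafka/connect-standalone.properties Sink.properties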