We have modified the Kafka Connect JDBC connector to support a custom converter that turns a single SinkRecord into multiple SinkRecords, in order to support transactional inserts. When creating a sink, one can specify in the configuration properties a class that implements SinkRecordConverter.
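As a rough sketch, the converter contract looks something like the following (the SinkRecordConverter interface belongs to our modified connector, not the public Kafka Connect API, and the method name here is illustrative):

import java.util.List;
import org.apache.kafka.connect.sink.SinkRecord;

// Hypothetical contract from our modified kafka-connect-jdbc; not part of the official API.
// An implementation splits one incoming record into the set of records that should be
// inserted together in a single transaction.
public interface SinkRecordConverter {
    List<SinkRecord> convert(SinkRecord record);
}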
We then packaged an uber JAR with the implementation of this custom converter and tried to deploy it in two ways:
We placed it in the same folder as kafka-connect-jdbc.
We set plugin.path in connect-distributed.properties to /usr/local/share/java and placed our converter in /usr/local/share/java/myconverter/myconverter-1.0.jar.
Then we tried to deploy the sink, but in both cases the code that tries to instantiate this converter via reflection fails with a java.lang.ClassNotFoundException.
We tried to debug the classloading issue by placing a breakpoint where the failure occurs, in both cases:
In the first case, the JAR appears among the JARs on the URL classpath.
In the second case, it does not even appear among the JARs on the URL classpath.
What is the correct way to add custom converters to kafka-connect-jdbc?
We had two issues in one:
To assemble the JARs we were using an SBT plugin named oneJar, which creates a custom classloader.
We needed to access those classes from inside an existing Kafka connector (JDBC), not only from Kafka Connect itself.
The solution we found is the following:
We abandoned the uber JAR and now deploy all the libraries on the Kafka Connect instance using sbt pack.
We place the JARs in the same folder where kafka-connect-jdbc is located.
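For illustration, the resulting layout looks roughly like this (the install path and the connector version are examples, not necessarily ours):

/usr/local/share/java/kafka-connect-jdbc/
    kafka-connect-jdbc-x.y.z.jar
    myconverter-1.0.jar
    (plus the dependency JARs emitted by sbt pack)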
Related
We are planning to create our own repository of connectors (sink or source plugins) for Apache Kafka, like the one here.
We searched for documentation or help on how to create a plugin JAR for Kafka.
There is no mention of developing a plugin in the official Apache Kafka documentation.
Any help or pointers would be appreciated; we can share the result back with the open community once developed.
Here is a guide on How to Build a Connector.
There is also a Connector Developer Guide.
Developing a connector only requires implementing two interfaces, Connector and Task.
Refer to the example source code for a full, simple example.
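As a minimal sketch (class names and the trivial method bodies are illustrative; a real connector adds configuration, work partitioning and error handling), a source connector boils down to the following; a sink connector is analogous:

import java.util.Collections;
import java.util.List;
import java.util.Map;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.Task;
import org.apache.kafka.connect.source.SourceConnector;
import org.apache.kafka.connect.source.SourceRecord;
import org.apache.kafka.connect.source.SourceTask;

public class MySourceConnector extends SourceConnector {
    private Map<String, String> config;

    @Override public void start(Map<String, String> props) { this.config = props; }
    @Override public Class<? extends Task> taskClass() { return MySourceTask.class; }
    @Override public List<Map<String, String>> taskConfigs(int maxTasks) {
        // Give every task the same configuration; real connectors usually partition the work here.
        return Collections.nCopies(maxTasks, config);
    }
    @Override public void stop() { }
    @Override public ConfigDef config() { return new ConfigDef(); }
    @Override public String version() { return "1.0"; }
}

class MySourceTask extends SourceTask {
    @Override public void start(Map<String, String> props) { }
    @Override public List<SourceRecord> poll() throws InterruptedException {
        // A real task reads from the external system and returns SourceRecords here.
        return Collections.emptyList();
    }
    @Override public void stop() { }
    @Override public String version() { return "1.0"; }
}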
Once you’ve developed and tested your connector, you must package it so that it can be easily installed into Kafka Connect installations. The two techniques described here both work with Kafka Connect’s plugin path mechanism.
If you plan to package your connector and distribute it for others to use, you are obligated to properly license and copyright your own code and to adhere to the licensing and copyrights of all libraries your code uses and that you include in your distribution.
Creating an Archive
The most common approach to packaging a connector is to create a tarball or ZIP archive. The archive should contain a single directory whose name will be unique relative to other connector implementations, and will therefore often include the connector’s name and version. All of the JAR files and other resource files needed by the connector, including third party libraries, should be placed within that top-level directory. Note, however, that the archive should never include the Kafka Connect API or runtime libraries.
To install the connector, a user simply unpacks the archive into the desired location. Having the name of the archive's top-level directory be unique makes it easier to unpack the archive without overwriting existing files. It also makes it easy to place this directory on the plugin path (see Installing Connect Plugins) or, for older Kafka Connect installations, to add the JARs to the CLASSPATH.
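For example, such an archive might unpack to a layout like this (names and versions are purely illustrative):

my-connector-1.2.0/
    my-connector-1.2.0.jar
    third-party-lib-3.4.jar
    another-dependency-0.9.jar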
Creating an Uber JAR
An alternative approach is to create an uber JAR that contains all of the connector's JAR files and other resource files. No internal directory structure is necessary.
To install, a user simply places the connector’s uber JAR into one of the directories listed in Installing Connect Plugins.
I created a shadow JAR which includes the Kafka client library, but this JAR fails to work when placed in the $CASSANDRA_HOME/conf/trigger directory. Is there a way to add the external Kafka JAR separately and then link it to the main JAR?
It might have to be added to Cassandra's CLASSPATH environment variable. You can do this by adding a line in the cassandra-env.sh file, referencing its location:
CLASSPATH="$CLASSPATH:$CASSANDRA_HOME/lib/cassandra-ldap-3.11.4.jar"
The above line allows the use of Instaclustr's Cassandra LDAP Authenticator, referencing it from within Cassandra's lib/ dir (on each node). Give that a try.
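For the Kafka client JAR from the question, an analogous line might look like this (the JAR name and version are only an example; match whatever your shadow JAR actually expects):

CLASSPATH="$CLASSPATH:$CASSANDRA_HOME/lib/kafka-clients-2.3.0.jar"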
I'm building a job in Scala to run on a Flink cluster that will store data in AWS S3, and I have some problems related to dependencies.
I've checked most of the questions previously asked here, and to fix this I needed to add the flink-s3-fs-hadoop-1.9.1.jar file to $FLINK_HOME/plugins in order to run my job successfully.
My question is: should this be detected when it is inside the fat JAR generated by sbt assembly? The files are inside the JAR, but for some reason the Flink cluster can't see them.
I know the documentation says that flink-s3-fs-hadoop-1.9.1.jar should be downloaded to the $FLINK_HOME/plugins folder.
Filesystems cannot be bundled in the user JAR; they must be present in either /lib or /plugins.
The components that use filesystems aren't necessarily aware of the user-jar.
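A rough sketch of the plugin layout (the subdirectory name is illustrative; check the Flink documentation for your exact version):

mkdir -p $FLINK_HOME/plugins/s3-fs-hadoop
cp flink-s3-fs-hadoop-1.9.1.jar $FLINK_HOME/plugins/s3-fs-hadoop/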
I am using Kafka Connect to create an MQTT-to-Kafka connection. I put all the MQTT-connector-specific JARs downloaded from the Confluent site into the "/data" folder, and updated the "connect-standalone.properties" file accordingly to reflect the plugin path, i.e.
plugin.path=/opt/kafka_2.11-2.1.1/libs,/data
When I run Kafka Connect:
./connect-standalone.sh ../config/connect-standalone.properties ../config/connect-mqtt-source.properties
I get the following error:
[2019-07-18 10:26:05,823] INFO Loading plugin from:
/data/kafka-connect-mqtt-1.2.1.jar
(org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:220)
[2019-07-18 10:26:05,829] ERROR Stopping due to error
(org.apache.kafka.connect.cli.ConnectStandalone:128)
java.lang.NoClassDefFoundError:
com/github/jcustenborder/kafka/connect/utils/VersionUtil
at io.confluent.connect.mqtt.MqttSourceConnector.version(MqttSourceConnector.java:29)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.versionFor(DelegatingClassLoader.java:344)
Please note that "connect-utils-0.3.140.jar" is present in the "/data" folder.
Now, if I instead create soft links to (or copy) all the JARs from the "/data" folder into the Kafka libs directory and update the plugin path to:
plugin.path=/opt/kafka_2.11-2.1.1/libs
Kafka Connect works perfectly fine.
Can anyone explain why it does not work in the first scenario, i.e. with the connector-specific JARs in a separate folder?
From the Kafka Connect user guide on the Confluent page:
...
Kafka Connect isolates each plugin from one another so that libraries in one plugin are not affected by the libraries in any other plugins. This is very important when mixing and matching connectors from multiple providers.
A Kafka Connect plugin is:
an uber JAR containing all of the classfiles for the plugin and its third-party dependencies in a single JAR file; or
a directory on the file system that contains the JAR files for the plugin and its third-party dependencies.
In your case you have to put the plugin JARs in their own folder, e.g. /data/pluginName, not directly in /data/.
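A possible layout, where the subdirectory name mqtt-connector is just an example and plugin.path keeps pointing at the parent folder:

/data/mqtt-connector/
    kafka-connect-mqtt-1.2.1.jar
    connect-utils-0.3.140.jar
    (plus the other third-party JARs the connector ships with)

plugin.path=/opt/kafka_2.11-2.1.1/libs,/data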
More details can be found here: Installing Plugins
I have a very specific problem dealing with GWT. I have a web application and a JAR file which contains the business logic. Inside this JAR I use Dozer mapper, and the related config file is inside the JAR itself, under META-INF/dozer_mappings.xml. While in hosted mode it works perfectly, in web mode it has a problem. It says:
Unable to locate dozer mapping file [/META-INF/dozer_mappings.xml] in the classpath!
Actually I don't understand why this should change: if the file were not on the classpath it should fail in both environments. Of course all my libraries are in the WEB-INF/lib folder, and the one with the Dozer configuration is there as well.
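For context, the mapping file is registered roughly like this (a sketch assuming Dozer 5.x; the actual wiring inside the business-logic JAR may differ):

import java.util.Collections;
import org.dozer.DozerBeanMapper;
import org.dozer.Mapper;

public class MapperFactory {
    // Dozer resolves mapping files as classpath resources, so the name is
    // typically given relative to the classpath root, without a leading slash.
    public static Mapper create() {
        DozerBeanMapper mapper = new DozerBeanMapper();
        mapper.setMappingFiles(Collections.singletonList("META-INF/dozer_mappings.xml"));
        return mapper;
    }
}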