Kafka Connect HBase sink

I am trying to deploy the HBase sink connector for Kafka (https://github.com/mravi/kafka-connect-hbase). I downloaded and configured HBase and Confluent Platform as per steps 1 and 2.
Then it says,
Copy hbase-sink.jar and hbase-sink.properties from the project build location to $CONFLUENT_HOME/share/java/kafka-connect-hbase
But I don't see hbase-sink.jar or hbase-sink.properties anywhere in the HBase or Confluent directories. Where can I get them?

But I don't see hbase-sink.jar and hbase-sink.properties
It sounds like you haven't cloned that repo, run mvn clean package, and then looked in the target directory.
As the other answer says, that project seems abandoned.
Try looking at this one too https://docs.lenses.io/connectors/sink/hbase.html
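A minimal sketch of the copy step in Python, assuming the repo has already been cloned and built with mvn clean package. The function name and the glob patterns are illustrative only; the real artifact names and where they land under the build tree depend on the project's pom.xml, so check the target directory rather than trusting these defaults.

```python
import shutil
from pathlib import Path


def copy_build_artifacts(repo_dir, plugin_dir,
                         patterns=("hbase-sink*.jar", "hbase-sink*.properties")):
    """Copy files matching `patterns` from anywhere under repo_dir into plugin_dir.

    The default patterns are assumptions based on the file names quoted in the
    README; adjust them to whatever the Maven build actually produces.
    """
    repo = Path(repo_dir)
    dest = Path(plugin_dir)
    dest.mkdir(parents=True, exist_ok=True)

    copied = []
    for pattern in patterns:
        # rglob walks the whole tree, so files under target/ are found too.
        for src in sorted(repo.rglob(pattern)):
            if src.is_file():
                shutil.copy2(src, dest / src.name)
                copied.append(src.name)
    return copied
```

Called with the cloned repo as repo_dir and $CONFLUENT_HOME/share/java/kafka-connect-hbase (expanded to a real path) as plugin_dir, it returns the names of the files it copied. An empty list means the build did not produce anything matching the patterns.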

Have you seen the HBase connector from Confluent, kafka-connect-hbase? The one you are using seems to be abandoned (no commits in the last 4 years).
kafka-connect-hbase documentation

Related

Kafka Connect won't pick up custom connector

I am trying to use Kafka Connect in a Docker container with a custom connector (PROGRESS_DATADIRECT_JDBC_OE_ALL.jar) to connect to an OpenEdge database.
I have put the JAR file in the plugin path (/usr/share/java), but it won't load as a connector.
COPY Openedge/PROGRESS_DATADIRECT_JDBC_OE_ALL.jar /usr/share/java/progress
I can load another (standard) connector by putting it in the plugin path. This works:
COPY confluentinc-kafka-connect-jdbc-10.3.2 /usr/share/java/confluentinc-kafka-connect-jdbc-10.3.2
I'm a little lost on how to move forward, and I'm very new to Kafka. My main sources of information are
openedge to kafka streaming and How to use Kafka connect
@OneCricketeer had the solution. As a retro for me, and hopefully helpful to someone else, here are the steps I took to make this work.
Copy the JDBC connector to CONNECT_PLUGIN_PATH and install it with confluent-hub install:
COPY confluentinc-kafka-connect-jdbc-10.3.2.zip /usr/share/java
RUN confluent-hub install --no-prompt /usr/share/java/confluentinc-kafka-connect-jdbc-10.3.2.zip
Copy the driver (I ended up using openedge.jar) to the directory where the connector's other JARs are located (such as the SQLite driver), following @OneCricketeer's suggestion:
COPY Openedge/openedge.jar /usr/share/confluent-hub-components/confluentinc-kafka-connect-jdbc/lib
Verify this by enabling DEBUG logging, as suggested by this page.
Finally, add a .properties file to create the connector. In my case, it was based on the one in the “openedge to kafka streaming” link above.
JDBC Drivers are not Connect plugins, nor are they connectors themselves.
You'd need to set the JVM CLASSPATH environment variable for detecting JDBC Drivers, as with any Java process.
The instructions on the linked site suggest you should copy the JDBC Drivers into the directory for the existing Confluent JDBC connector. While you could use a Docker COPY command, the better way would be to use confluent-hub install

How to Integrate Rest API Source Connector with Kafka Connect?

I have Confluent 5.0 on my local machine and am trying to read data from a REST API using the REST API source connector, which is not part of Confluent. Until now I have only used Confluent's built-in connectors. The REST API source connector is open source and available on GitHub: https://github.com/llofberg/kafka-connect-rest
I have downloaded this connector from GitHub and got stuck here.
Can anybody tell me the process to integrate this connector with Confluent, or how I can use it to pull data from a REST API?
Disclaimer: There is no single way to add an external Kafka Connect plugin; Confluent provides the Kafka Connect Maven plugin, but that doesn't mean people use it, or even Maven, to package their code.
If it is not on the Confluent Hub, then you'll have to build it by hand.
1) Clone the repo, and build it (install Git and Maven first)
git clone https://github.com/llofberg/kafka-connect-rest && cd kafka-connect-rest
mvn clean package
2) Create a directory for it on all Connect workers, similar to the other Connectors of Confluent Platform
mkdir $CONFLUENT_HOME/share/java/kafka-connect-rest
3) Find each of the shaded JARs (this connector happens to make multiple JARs, I don't know why...)
find . -iname "*shaded.jar" -type f
./kafka-connect-transform-from-json/kafka-connect-transform-from-json-plugin/target/kafka-connect-transform-from-json-plugin-1.0-SNAPSHOT-shaded.jar
./kafka-connect-transform-add-headers/target/kafka-connect-transform-add-headers-1.0-SNAPSHOT-shaded.jar
./kafka-connect-transform-velocity-eval/target/kafka-connect-transform-velocity-eval-1.0-SNAPSHOT-shaded.jar
./kafka-connect-rest-plugin/target/kafka-connect-rest-plugin-1.0-SNAPSHOT-shaded.jar
4) Copy each of these files into the $CONFLUENT_HOME/share/java/kafka-connect-rest folder created in step 2 for each Connect worker
5) Make sure your plugin.path of the connect-*.properties file points at the full path to $CONFLUENT_HOME/share/java
At this point, you've done all the steps that are listed in the README to build the thing and set up the plugin path, just not in Docker.
6) Start Connect (Distributed)
7) Hit GET /connector-plugins to verify the thing loaded.
8) Configure and send JSON payload to POST /connectors
I have not used this connector before, so I do not know how to configure it. Maybe see the examples, or follow along with @rmoff's blog post up to the KSQL part.
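Steps 7 and 8 can be sketched in Python with only the standard library. This assumes a Connect worker listening on its default REST port 8083 on localhost; the function names are made up for illustration, and the connector configuration is left to the caller because it depends on the connector's own documentation.

```python
import json
import urllib.request

CONNECT_URL = "http://localhost:8083"  # assumption: worker on the default REST port


def plugins_request(base_url=CONNECT_URL):
    """Build the GET /connector-plugins request from step 7."""
    return urllib.request.Request(f"{base_url}/connector-plugins", method="GET")


def create_connector_request(name, config, base_url=CONNECT_URL):
    """Build the POST /connectors request from step 8."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def loaded_plugin_classes(base_url=CONNECT_URL):
    """Send the GET request and return the class names the worker reports."""
    with urllib.request.urlopen(plugins_request(base_url)) as resp:
        return [plugin["class"] for plugin in json.load(resp)]
```

If the REST connector's class does not appear in loaded_plugin_classes(), the worker did not pick up the JARs and plugin.path is the first thing to check. The config dict passed to create_connector_request must contain at least connector.class, set to one of the class names the worker reported; pass the resulting request to urllib.request.urlopen to submit it.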

Kafka Confluent REST API: Kafka Included?

I have an existing Kafka cluster. I want to install the Kafka REST Proxy:
https://github.com/confluentinc/kafka-rest
If I install Confluent, does that come with Kafka? I am afraid that if I install it on my master Kafka node, Confluent will override all my settings and mess up my Kafka cluster.
How do you install Kafka REST when you have an existing Kafka cluster?
This is not made clear on their website. I have CentOS and was going to try:
sudo yum install confluent-platform-oss-2.11
Any help would be great....
Download the Confluent Platform tarball and extract it (or preferably use APT/YUM), then configure and run only the REST Proxy via kafka-rest-start.
I wouldn't recommend using APT/YUM to install the entire confluent platform if you already have an existing Kafka. You might be able to only install kafka-rest using it, though.
Alternatively, back up your existing Kafka and Zookeeper property files, then place the Confluent Platform on top of the existing files, keeping the original files. If your Kafka is an old release, take this as a good opportunity to schedule an upgrade. Downloading Confluent isn't going to overwrite anything in the upstream Apache project's version for the corresponding release. If anything, it's an extension.
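As a rough illustration of "only configure and run the REST proxy", the sketch below renders a small properties file that points the proxy at brokers that already exist. The broker host names are hypothetical placeholders. bootstrap.servers and listeners are REST Proxy settings, but older releases may also expect zookeeper.connect, so check the configuration reference for the version you install.

```python
def render_properties(settings):
    """Render a dict as Java .properties lines (key=value, one per line)."""
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


# Hypothetical broker addresses; replace them with the existing cluster's brokers.
kafka_rest_settings = {
    "bootstrap.servers": "PLAINTEXT://broker1:9092,PLAINTEXT://broker2:9092",
    "listeners": "http://0.0.0.0:8082",
}

print(render_properties(kafka_rest_settings), end="")
```

Saving that output to a file and passing its path to kafka-rest-start starts only the proxy; nothing in it touches the brokers' own server.properties.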

Connect Confluent with already existing three kafka brokers

I'm new to the Confluent world. I know how to start Kafka and Zookeeper from Confluent, but that's not what I need.
I already have 3 Kafka nodes and 2 Zookeepers installed by Ambari. Afterwards I downloaded version 3.0.0 of Confluent, and now I want to connect Confluent to the already running Kafka and Zookeeper. I don't want to start the new Kafka or Zookeeper servers that Confluent provides.
Does anyone have an idea how to accomplish that, what to actually run from Confluent, and what to change?
So far I have only been changing files in ./etc/kafka or ./etc/zookeeper, which are in the Confluent directory. Thank you!
clarify some basics about Confluent and how to manage communication between Confluent and Kafka
First things first, there is no single application called "Confluent" that can be started all on its own.
There is nothing to configure for Kafka or Zookeeper. The Confluent Platform doesn't add anything on top of the existing Apache Kafka you have (presumably, via Hortonworks or Cloudera).
In fact, those companies do add patches to Kafka that would be slightly different than the base Apache versions you would get from Confluent.
That being said, if you read through each of the extra services that Confluent provides, you'll notice either a Zookeeper or a Bootstrap server configuration option. Fill out those fields, start the respective services, and you're good to go.
what to actually run from Confluent
Look in the bin directory; you can find all the start scripts there. From the comments, it looks like you're trying to use Connect Distributed (which should already be installed by any recent Kafka installation; it's not Confluent-specific) and Schema Registry. You'll have to be more specific about the errors that you get, but the config files are all in the etc path.
Unless you're using KSQL, REST Proxy or Control Center, there's not much to run because, as mentioned, Kafka Connect is included with the base Apache Kafka project and Hortonworks is maintaining their own Schema Registry project.
2 zookeepers installed by Ambari
This setup is strongly discouraged. Please install an odd number of Zookeepers, preferably 3 or 5.
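The reasoning behind the odd-number advice is majority arithmetic: a ZooKeeper ensemble stays available only while a strict majority of its servers is running. A small sketch (function names are illustrative):

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can fail while a majority is still running."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (1, 2, 3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

With 2 servers the quorum is 2, so losing either one stops the ensemble; that is no better than a single server. Going from 2 to 3 servers tolerates one failure, and 5 servers tolerate two, while an even count such as 4 tolerates no more failures than 3.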

How to update running kafka connector

I have Kafka Connect running in a Marathon container. If I want to update the connector plugin (JAR), I have to upload the new one and then restart the Connect task.
Is it possible to do that without a restart or downtime?
The updated jar for the connector plugin needs to be added to the classpath and then the classloader for the worker needs to pick it up. The best way to do this currently is to take an outage as described here.
Depending on your connector, you might be able to do rolling upgrades, but the generic answer is that if you need to upgrade the connector plugin, you currently have to take an outage.