I am using Windows 10 and a Docker container to run Confluent Control Center.
I am trying to upload one of the pre-built connectors that can be found on Confluent Hub: https://www.confluent.io/product/connectors/?_ga=2.268912561.1564485000.1614024157-1461284509.1612365443
I am getting the following error: "Invalid connector class. Check the connector configuration file."
I am trying to upload the connector with the following .properties file:
name=hdfs-sink
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
tasks.max=1
topics=test_hdfs
hdfs.url=hdfs://localhost:9000
flush.size=3
You cannot upload a properties file that references a connector class that isn't already available on that page (i.e. installed on the Connect worker).
The HDFS 2 Sink connector no longer comes bundled with Confluent Platform due to security issues.
If you're using Docker, you need to install the connector into the container (manually build your own Connect Docker image, use a volume mount, or use confluent-hub); then you should see it in the list of connectors on that page.
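For example, a minimal sketch of a Dockerfile that bakes the HDFS connector into your own Connect image (the base image tag and connector version here are assumptions; adjust to your setup):
FROM confluentinc/cp-kafka-connect:6.1.0
RUN confluent-hub install --no-prompt confluentinc/kafka-connect-hdfs:latest
Build that image, run your Connect worker from it, and the HdfsSinkConnector class should then show up in Control Center.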
Good day,
Based on https://docs.confluent.io/kafka-connect-jdbc/current/index.html#installing-jdbc-drivers ,
I used the following command to install the JDBC connector:
confluent-hub install confluentinc/kafka-connect-jdbc:latest
The command ran successfully, and I can see the confluentinc-kafka-connect-jdbc folder created under <confluent-platform>/share/confluent-hub-components.
Here is a screenshot showing the result of my install command:
After that, I followed the next instruction and uploaded the JDBC driver JAR file to share/java/kafka-connect-jdbc.
Then I went to https://docs.confluent.io/kafka-connect-jdbc/current/source-connector/index.html to load the DB connector. As a first step, I listed the connectors I have with the following command:
confluent local services connect connector list
The output is shown below:
[meow#localhost confluent-7.0.1]$ confluent local services connect connector list
The local commands are intended for a single-node development environment only,
NOT for production usage. https://docs.confluent.io/current/cli/index.html
Bundled Connectors:
file-sink
file-source
replicator
There is no connector named jdbc-source in the list, so I can't proceed to the next step.
May I know what mistake I made in my steps?
After running confluent-hub install you must restart the Kafka Connect worker for it to pick up the new connector.
Since you're using the Confluent CLI the commands are:
confluent local services connect stop
confluent local services connect start
Edit: your screenshot shows that you told the Confluent Hub client not to update any of the Kafka Connect worker configurations. Therefore the worker will not pick up the connector that you've installed.
You should run the Confluent Hub client again and tell it to update the Kafka Connect worker configurations when prompted, and then restart the Kafka Connect worker. After that it will pick up the new connector.
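As a sketch, the full sequence would look something like this (the version is pinned to latest here only as an illustration; answer "y" when the client asks about updating the worker configurations):
confluent-hub install confluentinc/kafka-connect-jdbc:latest
confluent local services connect stop
confluent local services connect start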
When I tried to run Kafka Connect without Schema Registry, I got an error like:
CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL is required.
Command [/usr/local/bin/dub ensure CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL] FAILED !
Is Schema Registry mandatory for setting up Kafka Connect? I didn't find anything saying so in the official Confluent documentation.
The Registry is only required if you're using Confluent Converters that do need it, for example their AvroConverter, ProtobufConverter, or JsonSchemaConverter; it is not required for Connect itself.
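For example, with the cp-kafka-connect Docker image you could switch both converters to the built-in JSON converter, which needs no registry; a rough sketch of the relevant environment variables (the schemas.enable flags are optional):
CONNECT_KEY_CONVERTER=org.apache.kafka.connect.json.JsonConverter
CONNECT_VALUE_CONVERTER=org.apache.kafka.connect.json.JsonConverter
CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE=false
CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE=false
With those set, no CONNECT_*_SCHEMA_REGISTRY_URL variables are required.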
If you want to run Connect in a container with minimal dependencies, see my image here - https://github.com/OneCricketeer/apache-kafka-connect-docker
I installed Neo4j and I can access the server. I can make nodes through Cypher.
Now I want to use it for data streams, but I'm not sure how to do so. I just started with Neo4j and I'm struggling with installing the Streams plugin.
Any help is highly appreciated.
You should copy the JAR files for the Neo4j Streams plugin directly into your /plugins folder and configure the connections to Kafka and Zookeeper, as well as other Neo4j property values, in the neo4j.conf file as described here. For example:
kafka.zookeeper.connect=zookeeper-host:2181
kafka.bootstrap.servers=kafka-host:9092
Alternatively, if you are looking only for a sink connection from Kafka (i.e. moving records from Kafka topics into Neo4j), you can also use Kafka Connect with the supported Kafka Connect Neo4j Sink. More at https://www.confluent.io/hub/neo4j/kafka-connect-neo4j
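If you go the Kafka Connect route, a rough sketch of creating the sink through the Connect REST API would look like this (host names, credentials, topic, and the Cypher statement are placeholders; double-check the exact property names against the connector's documentation):
curl -X POST -H "Content-Type: application/json" http://localhost:8083/connectors -d '{
  "name": "neo4j-sink",
  "config": {
    "connector.class": "streams.kafka.connect.sink.Neo4jSinkConnector",
    "topics": "my-topic",
    "neo4j.server.uri": "bolt://neo4j-host:7687",
    "neo4j.authentication.basic.username": "neo4j",
    "neo4j.authentication.basic.password": "<password>",
    "neo4j.topic.cypher.my-topic": "MERGE (p:Person {name: event.name})"
  }
}'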
When I create a source or sink connector using Confluent Control Center where does it save the settings related to that connector? Are there files I can browse? We are planning to create 50+ connectors and at one point we need to copy them from one environment to another, I was wondering if there is an easy way to do that.
Kafka Connect in distributed mode uses Kafka topics for storing configuration.
Kafka Connect provides a REST API. You can use this for viewing existing connector configurations, creating new ones (including programmatically/automatically for 50+ new connectors), starting/stopping connectors, etc.
The REST API is documented here.
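For example, a few typical calls (host, port, and connector name are placeholders):
curl -s http://connect-host:8083/connectors
curl -s http://connect-host:8083/connectors/my-jdbc-source/config
curl -s -X POST -H "Content-Type: application/json" --data @my-connector.json http://connect-host:8083/connectors
Dumping each connector's /config output and POSTing it to the other environment's Connect cluster is one straightforward way to copy your 50+ connectors between environments.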
Kafka Connect distributed mode is started with a property file. That property file defines a "config topic".
The connectors you're able to load, however, are not stored there - that's only for the running source/sink configurations.
The classes themselves are bundled as JAR files in the classpaths of the individual Connect workers, and Control Center has no current way of provisioning new connector classes. In other words, you must use something like Ansible, or manually connect to each worker, download the connector you want, and extract it next to the other connectors.
For example, let's pretend you wanted the Syslog connector.
You'd already have folders for these under /usr/share/java in the Confluent installation:
kafka-connect-hdfs
kafka-connect-jdbc
...
So, you download or build that Syslog connector, make a kafka-connect-syslog folder, and drop all necessary jar libraries there.
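A rough sketch of that step, assuming the default Confluent package layout (exact paths depend on your installation):
mkdir /usr/share/java/kafka-connect-syslog
cp /path/to/downloaded/syslog-connector/*.jar /usr/share/java/kafka-connect-syslog/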
Once you do this for all connect instances, you'll need to also restart the Kafka Connect process on those machines.
Once Control Center connects back to the Connect server, you'll be able to configure your new connector classes.
I am trying to upgrade from Apache Kafka to Confluent Kafka.
As the storage of the temp folder is quite limited, I have changed log.dirs in server.properties to a custom folder:
log.dirs=<custom location>
Then I try to start the Kafka server via the Confluent CLI (version 4.0) using the command below:
bin/confluent start kafka
However, when I check the Kafka data folder, the data is still persisted under the temp folder instead of the customized one.
I have tried to start the Kafka server directly, without using the Confluent CLI:
bin/kafka-server-start etc/kafka/server.properties
and then I can see that the config is picked up properly.
Is this a bug with the Confluent CLI, or is it supposed to work this way?
I am trying to upgrade from Apache Kafka to Confluent Kafka
There is no such thing as "confluent kafka".
You can refer to the Apache or Confluent Upgrade documentation steps for switching Kafka versions, but at the end of the day, both are Apache Kafka.
On a related note: You don't need Kafka from the Confluent site to run other parts of the Confluent Platform.
The confluent command, though, will read its own embedded config files for running on localhost only, and is not intended to integrate with external brokers/zookeepers.
Therefore, kafka-server-start is the production way to run Apache Kafka.
Confluent CLI is meant to be used during development with Confluent Platform. Therefore, it currently gathers all the data and logs under a common location in order for a developer to be able to easily inspect (with confluent log or manually) and delete (with confluent destroy or manually) such data.
You are able to change this common location by setting
export CONFLUENT_CURRENT=<top-level-logs-and-data-directory>
and get which location is used any time with:
confluent current
The rest of the properties are used as set in the various .properties files for each service.
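As a sketch, assuming the same CLI version as in your question and an illustrative directory:
export CONFLUENT_CURRENT=/data/confluent
bin/confluent start kafka
bin/confluent current
The data and logs for each service will then be created under /data/confluent (in a confluent.<id> subfolder) instead of the temp folder.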