According to the Kafka documentation:
Connector configurations are simple key-value mappings. For standalone
mode these are defined in a properties file and passed to the Connect
process on the command line.
Most configurations are connector dependent, so they can't be outlined
here. However, there are a few common options:
name - Unique name for the connector. Attempting to register again with the same name will fail.
I have 10 connectors running in standalone mode like this:
bin/connect-standalone.sh config/connect-standalone.properties connector1.properties connector2.properties ...
My question is: can a connector load its own name at runtime?
Thanks in advance.
Yes, you can get the name of the connector at runtime.
When the connector starts, all properties are passed to Connector::start(Map<String, String> props). The connector can read those properties, validate them, save them, and later pass them to its Tasks. Whether they are used depends on the connector implementation.
The connector name property is name.
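For example, here is a minimal sketch of a source connector that reads its own name in start(); the class names are illustrative, not from a real codebase:
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.Task;
import org.apache.kafka.connect.source.SourceConnector;
import org.apache.kafka.connect.source.SourceRecord;
import org.apache.kafka.connect.source.SourceTask;

public class MySourceConnector extends SourceConnector {
    private Map<String, String> props;

    @Override
    public void start(Map<String, String> props) {
        this.props = props;
        // The worker includes the connector's own name in the config map.
        String connectorName = props.get("name");
        System.out.println("Started connector: " + connectorName);
    }

    @Override
    public List<Map<String, String>> taskConfigs(int maxTasks) {
        // Forward the full config (including "name") to each task.
        List<Map<String, String>> configs = new ArrayList<>();
        for (int i = 0; i < maxTasks; i++) {
            configs.add(props);
        }
        return configs;
    }

    @Override public Class<? extends Task> taskClass() { return MySourceTask.class; }
    @Override public void stop() { }
    @Override public ConfigDef config() { return new ConfigDef(); }
    @Override public String version() { return "1.0"; }

    // A do-nothing task, just to keep the sketch self-contained.
    public static class MySourceTask extends SourceTask {
        @Override public String version() { return "1.0"; }
        @Override public void start(Map<String, String> props) { }
        @Override public List<SourceRecord> poll() { return null; } // no records in this sketch
        @Override public void stop() { }
    }
}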
I have successfully set up Kafka Connect in distributed mode locally with the Confluent BigQuery connector. The topics are being made available to me by another party; I am simply moving these topics into my Kafka Connect on my local machine, and then to the sink connector (and thus into BigQuery).
Because the topics are created by someone else, the schema registry is also managed by them. So in my config I set "schema.registry.url": "https://url-to-schema-registry", but we have multiple topics which all use the same schema entry, located at, let's say, https://url-to-schema-registry/subjects/generic-entry-value/versions/1.
What is happening, however, is that Connect is looking for the schema entry based on the topic name. So let's say my topic is my-topic. Connect is looking for the entry at this URL: https://url-to-schema-registry/subjects/my-topic-value/versions/1. But instead, I want to use the entry located at https://url-to-schema-registry/subjects/generic-entry-value/versions/1, and I want to do so for any and all topics.
How can I make this change? I have tried looking at this doc: https://docs.confluent.io/platform/current/schema-registry/serdes-develop/index.html#configuration-details as well as this class: https://github.com/confluentinc/schema-registry/blob/master/schema-serializer/src/main/java/io/confluent/kafka/serializers/subject/TopicRecordNameStrategy.java
but this looks to be a config parameter for the schema registry itself (which I have no control over), not the sink connector. Unless I'm not configuring something correctly.
Is there a way for me to configure my sink connector to look for a specified schema entry like generic-entry-value/versions/..., instead of the default format topic-name-value/versions/...?
The strategy is configurable at the connector level.
e.g.
value.converter.value.subject.name.strategy=...
There are only built-in strategies for Topic and/or RecordName lookups, however. You'll need to write your own class for static lookups of "generic-entry" if you otherwise cannot copy this "generic-entry-value" schema into new subjects, e.g.
# get the raw schema from this endpoint into a file
curl ... https://url-to-schema-registry/subjects/generic-entry-value/versions/1/schema
# wrap it as {"schema": "<escaped schema string>"} in schema.json, then
# upload it again, where "new-entry" is the name of the other topic
curl -XPOST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -d @schema.json https://url-to-schema-registry/subjects/new-entry-value/versions
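If you do write your own class, here is a rough, untested sketch, assuming the SubjectNameStrategy interface from recent schema-registry versions; the class name StaticSubjectNameStrategy is made up:
import java.util.Map;
import io.confluent.kafka.schemaregistry.ParsedSchema;
import io.confluent.kafka.serializers.subject.strategy.SubjectNameStrategy;

public class StaticSubjectNameStrategy implements SubjectNameStrategy {
    @Override
    public void configure(Map<String, ?> configs) {
        // Nothing to configure; the subject is fixed.
    }

    @Override
    public String subjectName(String topic, boolean isKey, ParsedSchema schema) {
        // Every topic resolves to the same registry subject.
        return "generic-entry-value";
    }
}
Put the compiled class on the connector's classpath and point value.converter.value.subject.name.strategy at its fully qualified name.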
What are the possible parameters that can be passed in a connector's JSON config file?
Where can I find info on creating my own connector JSON files?
The properties that are valid for all connectors (not the workers) include the following:
name
connector.class
key.converter / value.converter
tasks.max
(among others)
See https://kafka.apache.org/documentation/#connectconfigs and scroll down to see the differences between worker configs and source/sink connector properties.
Beyond that, each connector.class has its own possible configuration values that should be documented elsewhere. For example, Confluent Hub links to specific connector property pages if you are searching in there.
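For example, here is a minimal JSON body for registering a connector through the distributed-mode REST API, using the FileStreamSource connector that ships with Kafka (the file and topic values are placeholders):
{
  "name": "my-file-source",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "tasks.max": "1",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.storage.StringConverter",
    "file": "/tmp/input.txt",
    "topic": "my-topic"
  }
}
In standalone mode the same keys go into a flat .properties file instead.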
If you are trying to create your own Connector, then you would have a config() method returning a ConfigDef where the properties are defined.
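For instance, a minimal sketch of such a ConfigDef; the property names are made up for illustration:
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigDef.Importance;
import org.apache.kafka.common.config.ConfigDef.Type;

public class MyConnectorConfig {
    // Each define() call declares one property the connector accepts,
    // with its type, optional default, importance, and doc string.
    public static final ConfigDef CONFIG_DEF = new ConfigDef()
        .define("my.endpoint", Type.STRING, Importance.HIGH,
                "URL of the upstream system")
        .define("my.batch.size", Type.INT, 100, Importance.MEDIUM,
                "Number of records to fetch per poll");
}
The connector returns this object from config(), and the worker uses it to validate submitted configurations.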
I am starting to play with CDC and Kafka Connect.
After countless hours of trying, I have come to understand the logic:
Set the Kafka Connect worker configuration (config/connect-standalone.properties) with your cluster information
Download your Kafka connector (in this case MySQL from Debezium)
Configure the connector properties in whatevername.properties
In order to run a worker with a Kafka connector, you need to run:
./bin/connect-standalone.sh config/connect-standalone.properties
which answers with:
INFO Usage: ConnectStandalone worker.properties connector1.properties [connector2.properties ...] (org.apache.kafka.connect.cli.ConnectStandalone:62)
I know we need to run:
./bin/connect-standalone.sh config/connect-standalone.properties myconfig.properties
My issue is that I cannot find any format description or example of that myconfig.properties file.
【Extra Info】
Debezium configuration properties list:
https://docs.confluent.io/debezium-connect-mysql-source/current/mysql_source_connector_config.html#mysql-source-connector-config
https://debezium.io/documentation/reference/1.5/connectors/mysql.html
【Question】
Where can I find an example of the connector properties?
Thanks!
I'm not sure if I understood your question, but here is an example of properties for this connector:
connector.class=io.debezium.connector.mysql.MySqlConnector
name=someuniquename
database.hostname=192.168.99.100
database.port=3306
database.user=debezium-user
database.password=debezium-user-pw
database.server.id=184054
database.server.name=fullfillment
database.include.list=inventory
database.history.kafka.bootstrap.servers=kafka:9092
database.history.kafka.topic=dbhistory.fullfillment
include.schema.changes=true
The original config is the one from the documentation, which I converted from JSON to properties: https://debezium.io/documentation/reference/1.5/connectors/mysql.html#mysql-example-configuration
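Assuming you save this as, say, mysql-source.properties (the file name is arbitrary), you would pass it as the second argument to the standalone script:
./bin/connect-standalone.sh config/connect-standalone.properties mysql-source.properties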
I want to run MirrorMaker as a standalone connector.
So far I haven't found any documentation about the configuration.
As far as I can imagine, the following would replicate myTopic.
Now, in the destination cluster, I need the topic to have another name, foo (not the automatic rename).
Is this directly supported by MirrorSourceConnector or do I need some other means for that?
connector.class = org.apache.kafka.connect.mirror.MirrorSourceConnector
tasks.max = 2
topics = myTopic
source.cluster.bootstrap.servers = sourceHost:9092
target.cluster.bootstrap.servers = sinkHost:9092
So the Kafka Mirror Maker source code has a decent readme.md.
How you configure it is different depending on if you're running MM2 directly or in Kafka Connect. You said directly, which is in the linked readme.md.
Basically:
By default, replicated topics are renamed based on "source cluster aliases":
topic-1 --> source.topic-1
This can be customized by overriding the replication.policy.separator property (default is a period). If you need more control over how remote topics are defined, you can implement a custom ReplicationPolicy and override replication.policy.class (default is DefaultReplicationPolicy).
This unfortunately means that you cannot rename the topic through configuration alone (the DefaultReplicationPolicy class only allows you to specify the separator and nothing else). This is probably because the topics to mirror are specified as a regular expression, not a single topic name (even if your source-cluster topics property is just the name of a topic, it is still treated as a regular expression).
So, back to the docs: ReplicationPolicy is a Java interface in the Kafka Connect source code, so you would need to implement a custom Java class that implements ReplicationPolicy and then ensure it is on the classpath when you run MM2.
Let's imagine you write such a class and call it com.moffatt.kafka.connect.mirror.FooReplicationPolicy. A good template for your class is the default (and apparently only) replication policy class that comes with Kafka Connect: DefaultReplicationPolicy. You can see that building your own would not be too difficult. You could easily add a Map, either hard-coded or configured, that looks up specific topic names and maps them to the target topic name.
You use your new class by specifying it in the config as:
replication.policy.class = com.moffatt.kafka.connect.mirror.FooReplicationPolicy
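For illustration, an untested sketch of such a policy, extending DefaultReplicationPolicy with a hard-coded rename map (the mapping itself is an assumption):
package com.moffatt.kafka.connect.mirror;

import java.util.Map;
import org.apache.kafka.connect.mirror.DefaultReplicationPolicy;

public class FooReplicationPolicy extends DefaultReplicationPolicy {
    // Hard-coded renames; these could instead be read in configure().
    private static final Map<String, String> RENAMES = Map.of("myTopic", "foo");

    @Override
    public String formatRemoteTopic(String sourceClusterAlias, String topic) {
        String renamed = RENAMES.get(topic);
        // Fall back to the default "alias.topic" naming for unmapped topics.
        return renamed != null
                ? renamed
                : super.formatRemoteTopic(sourceClusterAlias, topic);
    }
}
Note that a complete policy would also keep the reverse lookups (topicSource and upstreamTopic) consistent for renamed topics.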
I am developing a custom Kafka source connector. I would like to access the worker configuration, such as the key converter, value converter, schema registry URL, or ZooKeeper URL, but I haven't found a way to do that. Any idea? Is that possible?
To be more specific: in my implementation of the connector and task, can I access the worker's configuration? I checked, and the only thing I can get from the connector implementation is a ConnectorContext, which has only one useful method, for requesting reconfiguration, and that is not what I want.