How does the Kafka GroupCoordinator play a role in Kafka Connect?

In Kafka Connect distributed mode, we submit Kafka Connect source and sink configurations through the REST API. How does Kafka Connect leverage the Kafka group coordinator to form a consumer group?

Kafka sink connectors form consumer groups using the regular consumer API, with a group.id of connect-<name>, where <name> is the name you've configured in the connector config.
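For example, here's a minimal sketch (the connector class, topic, and connection URL are placeholders, not from the question): POSTing this sink config to the Connect REST API results in a consumer group named connect-jdbc-sink.

```json
{
  "name": "jdbc-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "topics": "orders",
    "connection.url": "jdbc:postgresql://db:5432/orders"
  }
}
```

You can then inspect that group with the standard tooling, e.g. kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group connect-jdbc-sink.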

Related

Kafka connector to read from a topic and write to a topic

I want to build a Kafka connector that reads from a Kafka topic, calls a gRPC service to fetch some data, and writes the whole result into another Kafka topic.
I have written a Kafka sink connector that reads from a topic and calls a gRPC service, but I'm not sure how to redirect this data into a Kafka topic.
Kafka Streams can read from topics, call external services as necessary, then forward this data to a new topic in the same cluster.
MirrorMaker2 can be used between different clusters, but using Connect transforms is generally not recommended with external services.
Or you could make your gRPC service into a Kafka producer.
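A minimal Kafka Streams sketch of this read-enrich-write pattern (the topic names and the gRPC call are placeholders, not part of the question):

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class GrpcEnrichApp {

    // Placeholder for the gRPC call; swap in a real blocking-stub invocation.
    static String enrichViaGrpc(String value) {
        return value; // assumption: the real service returns enriched data
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "grpc-enricher");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        builder.<String, String>stream("input-topic")    // source topic (placeholder name)
               .mapValues(GrpcEnrichApp::enrichViaGrpc)  // per-record external call
               .to("output-topic");                      // sink topic (placeholder name)

        new KafkaStreams(builder.build(), props).start();
    }
}
```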

How to configure a Debezium connector for multiple tables to different topics

Debezium Connect was configured on PostgreSQL to stream the payments (schema) outbox (table) to the Kafka topic payments-transactions. In other words, Kafka Connect performs the following:
payments.outbox ==streams=> payments-transactions topic
As part of a new CDC flow, we plan to configure the webhooks (schema) outbox (table) to a different Kafka topic (webhooks), so that Kafka Connect performs:
payments.outbox ==streams=> payments-transactions topic
webhooks.outbox ==streams=> webhooks topic
What does the Debezium Kafka Connect configuration look like? Is it possible to configure two tables with separate topics?
If you configure table.whitelist (table.include.list in newer Debezium releases) with multiple schema.table entries, each table will generate a corresponding output topic.
https://debezium.io/documentation/reference/stable/connectors/postgresql.html#postgresql-topic-names
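A hedged sketch of what that could look like, with the connector name, host, credentials, and server name as placeholders:

```json
{
  "name": "postgres-outbox-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "postgres",
    "database.port": "5432",
    "database.user": "debezium",
    "database.password": "secret",
    "database.dbname": "appdb",
    "database.server.name": "appserver",
    "table.whitelist": "payments.outbox,webhooks.outbox"
  }
}
```

Note that with Debezium's default naming this produces topics like appserver.payments.outbox and appserver.webhooks.outbox; to map them onto exactly payments-transactions and webhooks, you would additionally apply a topic-routing transform such as the Connect RegexRouter SMT.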

Enable Kafka source connector idempotency

How can I enable the Kafka source connector idempotency feature?
I know that in Confluent Platform we can override producer configs with producer.* properties in the worker configuration, but what about Kafka itself? Is it the same?
After setting these configs, where can I see the applied configs for my Connect worker?
Confluent doesn't modify the base Kafka Connect properties.
For the producers used by Kafka source tasks and the consumers used by Kafka sink tasks, the same parameters can be used, but they need to be prefixed with producer. and consumer. respectively.
Starting with 2.3.0, client configuration overrides can be configured individually per connector, using the prefixes producer.override. and consumer.override. for Kafka sources and sinks respectively.
https://kafka.apache.org/documentation/#connect_running
However, Kafka Connect sources aren't idempotent - KAFKA-7077 & KIP-308
After setting these configs, where can I see the applied configs for my Connect worker?
In the logs: Connect prints the full ProducerConfig or ConsumerConfig values when the tasks start.
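Putting the above together, a sketch of both approaches, assuming a hypothetical connector named my-source (the file/topic values are placeholders). Cluster-wide, in the worker config:

```properties
# connect-distributed.properties
# Applies to the producer used by every source task on this worker:
producer.enable.idempotence=true
# Required on the worker (Kafka 2.3+) before per-connector overrides are accepted:
connector.client.config.override.policy=All
```

Or per-connector, in the connector config submitted via REST:

```json
{
  "name": "my-source",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "file": "/tmp/input.txt",
    "topic": "my-topic",
    "producer.override.enable.idempotence": "true"
  }
}
```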

Apply a Quota to a Kafka Connect consumer group

I have Kafka Connect JDBC sink connectors writing to various databases and I'd like to throttle the traffic to one database. The Kafka quotas feature can set a consumer_byte_rate quota for a client ID, but Kafka Connect client IDs look like consumer-1234 and are dynamically assigned to connectors. So if my sink connector is rebalanced, it will be assigned all new client IDs. I tried setting a quota using my sink connector consumer group ID as the client ID, but that doesn't work. Is there any way to set a quota for a Kafka Connect consumer group?
If you upgrade to Apache Kafka 2.3 you'll benefit from KIP-411: Make default Kafka Connect worker task client IDs distinct. You can see an example of it in action here. However, you'd have to test whether the client ID is deterministic, since quotas can't be wildcarded.
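For example, if KIP-411 gives your sink's task 0 a stable client ID along the lines of connector-consumer-<connector-name>-0 (confirm the exact value in your worker logs), you could try a per-client quota with kafka-configs.sh. This assumes a recent broker; older releases applied quota changes via --zookeeper instead.

```sh
bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter \
  --add-config 'consumer_byte_rate=1048576' \
  --entity-type clients --entity-name connector-consumer-my-jdbc-sink-0
```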

Kafka consumer API (no ZooKeeper configuration)

I am using the Kafka client library that comes with Kafka 0.11.0.1. I noticed that KafkaConsumer no longer needs a ZooKeeper configuration. Does that mean the ZooKeeper server will automatically be located via the Kafka bootstrap server?
Since Kafka 0.9, the KafkaConsumer implementation stores offset commits and consumer-group information in the Kafka brokers themselves. This eliminates the ZooKeeper dependency and improves the scalability of the consumers.
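A minimal sketch against the 0.11 consumer API (the broker address, group, and topic are placeholders): only bootstrap.servers is configured, the group coordinator broker handles membership and offsets, and no ZooKeeper address appears anywhere.

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class NoZkConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // brokers only; no zookeeper.connect
        props.put("group.id", "my-group");                // group state lives in __consumer_offsets
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(100); // poll(Duration) in 2.0+
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("%s -> %s%n", record.key(), record.value());
                }
            }
        }
    }
}
```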