Confluent Cloud Kafka - Audit Log Cluster: Sink Connector - apache-kafka

For Kafka clusters hosted in Confluent Cloud, an audit log cluster gets created. It seems to be possible to hook a sink connector up to this cluster and drain the events out of the "confluent-audit-log-events" topic.
However, I am running into the error below when I run the connector to do so.
org.apache.kafka.common.errors.TopicAuthorizationException: Not authorized to access topics: [connect-offsets]
In my connect-distributed.properties file, I have the following settings:
offset.storage.topic=connect-offsets
offset.storage.replication.factor=3
offset.storage.partitions=3
What extra permission(s) need to be granted so that the connector can create the required topics in the cluster? The key/secret used in the connect-distributed.properties file is valid and is associated with the service account for this cluster.
Also, when I run the console consumer using the same key as above, I am able to read the audit log events just fine.
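(For context: a distributed Connect worker has to create and use its three internal topics, so on an ordinary Confluent Cloud cluster the worker's service account would typically need CREATE/READ/WRITE/DESCRIBE on those topics plus READ on the worker's consumer group. Below is a minimal sketch of granting that via the confluent_kafka Admin API; it assumes the credentials used are allowed to manage ACLs, and the endpoint, service-account ID and group id are placeholders. As the answer below notes, this does not help for the audit log cluster itself.)

from confluent_kafka.admin import (AdminClient, AclBinding, AclOperation,
                                   AclPermissionType, ResourcePatternType,
                                   ResourceType)

admin = AdminClient({
    "bootstrap.servers": "<bootstrap-endpoint>:9092",   # placeholder
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<ADMIN_API_KEY>",                  # placeholder
    "sasl.password": "<ADMIN_API_SECRET>",               # placeholder
})

principal = "User:sa-xxxxxx"  # placeholder: service account used by the Connect worker

# Allow the worker to create and use topics prefixed with "connect-"
# (connect-offsets, connect-configs, connect-status), plus read its own group.
acls = [
    AclBinding(ResourceType.TOPIC, "connect-", ResourcePatternType.PREFIXED,
               principal, "*", op, AclPermissionType.ALLOW)
    for op in (AclOperation.CREATE, AclOperation.READ,
               AclOperation.WRITE, AclOperation.DESCRIBE)
]
acls.append(AclBinding(ResourceType.GROUP, "connect-cluster",  # default worker group.id
                       ResourcePatternType.LITERAL, principal, "*",
                       AclOperation.READ, AclPermissionType.ALLOW))

for binding, future in admin.create_acls(acls).items():
    future.result()  # raises if a binding could not be created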

It's been confirmed that this feature (hooking a connector up to the audit log cluster) is not supported at the moment in Confluent Cloud. It may become available later this year.
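Until then, the events can still be drained with a plain consumer against the audit log cluster, which is essentially what the working console consumer above does. A minimal confluent_kafka sketch (endpoint, key/secret and group id are placeholders):

from confluent_kafka import Consumer, KafkaException

consumer = Consumer({
    "bootstrap.servers": "<audit-log-cluster-bootstrap>:9092",  # placeholder
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<API_KEY>",      # the same key that works in the console consumer
    "sasl.password": "<API_SECRET>",
    "group.id": "audit-log-drainer",   # placeholder group id
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["confluent-audit-log-events"])

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None:
            continue
        if msg.error():
            raise KafkaException(msg.error())
        # Forward the audit event to whatever sink you need here.
        print(msg.value().decode("utf-8"))
finally:
    consumer.close()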

Related

Schema Registry URL for IIDR CDC Kafka subscription

I have created an Amazon MSK cluster. I also created an EC2 instance and installed Kafka on it to create a topic in Amazon MSK. I am able to produce/consume messages on the topic using the Kafka scripts.
I have also installed the IIDR Replication agent on an EC2 instance. The plan is to migrate DB2 table data into the Amazon MSK topic.
In the IIDR Management Console, I am able to add the IIDR replication server as the target.
Now, when creating the subscription, it asks for a ZooKeeper URL and a Schema Registry URL. I can get the ZooKeeper endpoints from Amazon MSK.
What value should I provide for the Schema Registry URL, given that none has been created?
Thanks for your help.
If you do not need to specify a schema registry, because (say) you are using a KCOP that generates JSON, just put in a dummy value. Equally, if you are specifying the list of Kafka brokers in the kafkaconsumer.properties and kafkaproducer.properties files in the CDC instance's conf directory, you can put dummy values in the ZooKeeper fields.
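As a sketch (broker endpoints are placeholders; both files sit in the CDC instance's conf directory and take standard Kafka client properties):

# kafkaproducer.properties and kafkaconsumer.properties
bootstrap.servers=b-1.mymsk.example.amazonaws.com:9092,b-2.mymsk.example.amazonaws.com:9092

In the subscription dialog itself, placeholder values such as dummy:2181 for the ZooKeeper URL and http://dummy:8081 for the Schema Registry URL can then be used.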
Hope this helps
Robert

Cannot create new kafka topic in confluent cloud

I'm trying to create a new Kafka topic in Confluent Cloud, but it gives me an 'authorization failed' error.
Give yourself the CloudClusterAdmin role from the Confluent Cloud UI under Accounts & access; then you should be able to manage the cluster.
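Once the role (or equivalent ACLs) has been granted, topic creation through the Admin API should stop failing with the authorization error. A minimal confluent_kafka sketch, with placeholder endpoint and credentials:

from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({
    "bootstrap.servers": "<bootstrap-endpoint>:9092",  # placeholder
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<API_KEY>",
    "sasl.password": "<API_SECRET>",
})

futures = admin.create_topics([NewTopic("my-new-topic",
                                        num_partitions=6,
                                        replication_factor=3)])
futures["my-new-topic"].result()  # raises if the cluster still rejects the request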

confluent_kafka, how to specify cluster for producer config

I have two environments in a dev Confluent Cloud account, each with a single cluster. I can see that both clusters have the same bootstrap server, and this is documented:
In the Confluent Cloud Console, you may see the same bootstrap server
for different clusters. This is working as designed; it occurs because
Confluent Cloud clusters are multi-tenant.
My problem is that when attempting to produce to a topic, it appears the producer is connected to the wrong cluster; I get:
cimpl.KafkaException: KafkaError{code=_UNKNOWN_TOPIC,val=-188,str="Unable to produce message: Local: Unknown topic"}
And producer.list_topics() shows the topics from the other cluster, not the one I'm working on.
So how do I specify the exact cluster, which will have the right topics? I was expecting to be able to provide cluster.id in my configuration, but that returns:
KafkaError{code=_INVALID_ARG,val=-186,str="No such configuration property: "cluster.id""}
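For what it's worth, librdkafka (which confluent_kafka wraps) has no cluster.id client property, which is why the _INVALID_ARG error comes back. On Confluent Cloud's multi-tenant endpoints the cluster is effectively selected by the credentials you authenticate with, since each Kafka API key is scoped to one cluster. A sketch of the producer config under that assumption (endpoint and key/secret are placeholders; the key must be the one created for the cluster that owns the topic):

from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "pkc-xxxxx.<region>.<provider>.confluent.cloud:9092",  # placeholder
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<API_KEY_FOR_THE_INTENDED_CLUSTER>",  # key created on that cluster
    "sasl.password": "<API_SECRET>",
})

producer.produce("my-topic", value=b"hello")  # placeholder topic
producer.flush()

With the right key, producer.list_topics() should then report the topics of the intended cluster.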

Same consumer group (S3 sink connector) across two different Kafka Connect clusters

I'm migrating Kafka connectors from an ECS cluster to a new cluster running on Kubernetes. I successfully migrated the Postgres source connectors by deleting them and recreating them on the exact same replication slots. They keep writing to the same topics in the same Kafka cluster, and the S3 connector in the old cluster continues to read from those topics and write records into S3. Everything works as usual.
But now, to move the AWS S3 sink connectors, I first created a non-critical S3 connector in the new cluster with the same name as the one in the old cluster. I was going to wait a few minutes before deleting the old one to avoid missing data. To my surprise, it looks like (based on the UI provided by akhq.io) the worker for that new S3 connector joins the same existing consumer group. I was fully expecting to end up with duplicated data. Based on the Confluent doc,
All Workers in the cluster use the same three internal topics to share
connector configurations, offset data, and status updates. For this
reason all distributed worker configurations in the same Connect
cluster must have matching config.storage.topic, offset.storage.topic,
and status.storage.topic properties.
So from this "same Connect cluster" wording, I thought having the same consumer group ID only worked within the same Connect cluster. But from my observation, it seems you can have consumers in different Connect clusters belonging to the same consumer group?
Based on this article, __consumer_offsets is used by consumers and, unlike the other internal "offset"-related topics, it doesn't carry any Connect cluster name designation.
Does that mean I could simply create the S3 sink connectors in the new Kubernetes cluster and then delete the ones in the ECS cluster without duplicating or missing data (as long as they have the same name, and therefore the same consumer group)? I'm not sure if this is the pattern people usually use.
I'm not familiar with using a Kafka Connect cluster, but I understand that it is a cluster of connectors that is independent of the Kafka cluster.
In that case, since the connectors are using the same Kafka cluster and you are just moving them from ECS to k8s, it should work as you describe. The consumer offsets and the internal Kafka Connect offsets are stored in the Kafka cluster, so it doesn't really matter where the connectors run as long as they connect to the same Kafka cluster. They should restart from the same position, or behave as additional replicas of the same connector, regardless of where they are running.
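One way to convince yourself of that before deleting anything: a sink connector commits its offsets under the consumer group connect-<connector name>, so you can read the committed positions from the Kafka cluster directly. A confluent_kafka sketch (connector/topic names, partition count and connection details are placeholders):

from confluent_kafka import Consumer, TopicPartition

# Creating a Consumer and calling committed() does not join the group,
# so this won't trigger a rebalance of the running connector.
c = Consumer({
    "bootstrap.servers": "<bootstrap>:9092",   # placeholder
    "group.id": "connect-my-s3-sink",          # "connect-" + connector name
    "enable.auto.commit": False,
})

partitions = [TopicPartition("my-topic", p) for p in range(3)]  # placeholder topic/partitions
for tp in c.committed(partitions, timeout=10):
    print(tp.topic, tp.partition, tp.offset)
c.close()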

Debezium fails when using it with Kerberos

I'm trying to configure the Oracle connector (Debezium 1.9) with a Kerberized Kafka cluster (from Cloudera Private CDP) and am running into some weird trouble.
I first tried to configure Debezium with the PLAINTEXT security protocol (using Apache Kafka 3.1.0) to validate that everything was fine (Oracle, Connect config...), and everything ran perfectly.
Next, I deployed the same connector, using the same Oracle DB instance, on my on-premises Cloudera CDP platform, which is Kerberized, updating the connector config by adding:
"database.history.kafka.topic": "schema-changes.oraclecdc",
"database.history.consumer.sasl.jaas.config": "com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true keyTab=\"/tmp/debezium.keytab\" principal=\"debezium#MYREALM\";",
"database.history.consumer.security.protocol": "SASL_PLAINTEXT",
"database.history.consumer.sasl.kerberos.service.name": "kafka",
"database.history.producer.sasl.jaas.config": "com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true keyTab=\"/tmp/debezium.keytab\" principal=\"debezium#MYREALM\";",
"database.history.producer.security.protocol": "SASL_PLAINTEXT",
"database.history.producer.sasl.kerberos.service.name": "kafka"
In this case, the topic schema-changes.oraclecdc is automatically created when the connector starts (auto-creation is enabled) and the DDL definitions are correctly reported. But that's it. So I suppose the JAAS config is OK and the producer config is correctly set, since the connector was able to create the topic and publish something to it.
But I can't get my updates/inserts/deletes published, and the corresponding topics are not created. Instead, Kafka Connect reports that the producer is disconnected as soon as the connector starts.
Activating the TRACE level in Kafka Connect, I can see that the updates/inserts/... are correctly detected by Debezium from the redo log.
The fact that the producer is being disconnected makes me think there's an authentication problem. But if I understand the Debezium documentation correctly, the producer config is the same for both the schema history topic and the table CDC topics. So I can't understand why the schema changes topic is created and has messages published to it, while the CDC mechanism doesn't create any topics...
What am I missing here?
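One thing worth checking (an assumption based on how Connect splits responsibilities, not something confirmed here): the database.history.producer.* / database.history.consumer.* settings only apply to Debezium's internal schema history client, while the actual change-event records are written by the Connect worker's own producer, which takes its security settings from the worker configuration. If the worker is still configured for PLAINTEXT, its producer would be rejected by the Kerberized brokers even though the schema history topic works. A sketch of the worker-side (connect-distributed.properties) entries that would also need Kerberos, reusing the same keytab/principal as placeholders:

security.protocol=SASL_PLAINTEXT
sasl.kerberos.service.name=kafka
sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true keyTab="/tmp/debezium.keytab" principal="debezium@MYREALM";
producer.security.protocol=SASL_PLAINTEXT
producer.sasl.kerberos.service.name=kafka
producer.sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true keyTab="/tmp/debezium.keytab" principal="debezium@MYREALM";
consumer.security.protocol=SASL_PLAINTEXT
consumer.sasl.kerberos.service.name=kafka
consumer.sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true keyTab="/tmp/debezium.keytab" principal="debezium@MYREALM";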