Confluent Schema Registry 7.0.1 - apache-kafka

I am working with confluentinc/cp-schema-registry:7.0.1.
What can I say, it's a little temperamental. It crashes every time with the message:
ERROR The retention policy of the schema topic _schemas is incorrect. You must configure the topic to 'compact' cleanup
Within my Spring Boot app, I have configured the TopicBuilder to suggest
.config(TopicConfig.CLEANUP_POLICY_CONFIG, "compact")
But the problem is that while running gradle composeUp, the container simply crashes with no possibility of recovery. I am not running an external Kafka instance, so I cannot configure the topic with the kafka-topics command or the cleanup.policy option. Not that I haven't tried providing some of these options in the docker-compose.yml, but they are all ignored. Short of running Kafka and the registry in my local environment and in the pipeline, is there any other alternative to this Confluent container at all?
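For what it's worth, if you can get a shell inside the broker container before the registry starts, one workaround is to flip the topic's cleanup policy with kafka-configs. The container name and bootstrap address below are assumptions; adjust them to match your docker-compose.yml:

```shell
# Assumption: the Kafka broker container is named "broker" and listens on localhost:9092.
# Switch the auto-created _schemas topic to compaction so Schema Registry
# passes its startup check on the next restart.
docker exec broker kafka-configs --bootstrap-server localhost:9092 \
  --alter --entity-type topics --entity-name _schemas \
  --add-config cleanup.policy=compact
```

After altering the policy, restarting the cp-schema-registry container should let it come up cleanly.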

Related

Kafka connect - completely removing a connector

My question is split in two. I've read Kafka Connect - Delete Connector with configs?. I'd like to completely remove a connector, offsets and all, so I can recreate it with the same name later. Is this possible? To my understanding, a tombstone message will kill this connector indefinitely.
The second part is: is there a way to have the kafka-connect container automatically delete all connectors it created when the container is brought down?
Thanks
There is no single command to completely clean up connector state. For sink connectors, you can use kafka-consumer-groups to reset the connector's offsets. For source connectors, it's not as straightforward, as you'll need to manually produce data into the Connect-managed offsets topic.
The config and status topics also persist historical data, but shouldn't prevent you from recreating the connector with the same name/details.
The Connect containers published by Confluent and Debezium always use distributed mode. You'll need to override the entrypoint of the container to use standalone mode so that connector metadata is not persisted in Kafka topics (this won't be fault tolerant, but it'll be fine for testing).
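To sketch the two cases above: Connect names the consumer group for a sink connector connect-&lt;connector-name&gt;, so kafka-consumer-groups can rewind it once the connector is stopped. The broker address and connector names below are placeholders, and the source-connector tombstone key shown is only illustrative, since its exact shape depends on how that connector encodes its source partitions:

```shell
# Sink connector: reset the consumer group Connect created for it
# (naming convention: "connect-<connector-name>"); stop the connector first.
kafka-consumer-groups --bootstrap-server localhost:9092 \
  --group connect-my-sink \
  --reset-offsets --to-earliest --all-topics --execute

# Source connector: write a NULL-value (tombstone) record into the Connect
# offsets topic, keyed the same way Connect keyed the original offset record.
# kcat's -Z flag turns the empty value after the '|' delimiter into NULL.
echo '["my-source",{"partition":"placeholder"}]|' | \
  kcat -b localhost:9092 -t connect-offsets -P -Z -K '|'
```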

Migration Cloudera Kafka (CDK) to Apache Kafka

I am looking to migrate a small 4-node Kafka cluster with about 300GB of data on each broker to a new cluster. The problem is that we are currently running Cloudera's flavor of Kafka (CDK) and we would like to run Apache Kafka. For the most part CDK is very similar to Apache Kafka, but I am trying to figure out the best way to migrate. I originally looked at using MirrorMaker, but to my understanding it will re-process messages once we cut the consumers over to the new cluster, so I think that is out. I was wondering if we could spin up a new Apache Kafka cluster and add it to the CDK cluster (not sure how or whether this will work yet), then decommission the CDK servers one at a time. Otherwise I am out of ideas, other than spinning up a new Apache Kafka cluster and making code changes to every producer/consumer to point to the new cluster, which I am not really a fan of as it will cause downtime.
We are currently running CDK 3.1.0, which is equivalent to Apache Kafka 1.0.1.
MirrorMaker would copy the data, but not consumer offsets, so they'd be left at their configured auto.offset.reset policies.
I was wondering if we could spin up a new Apache Kafka cluster and add it to the CDK cluster
If possible, that would be the most effective way to migrate the cluster. For each new broker, give it a unique broker ID and the same Zookeeper connection string as the others, then it'll be part of the same cluster.
Then, you'll need to manually run the partition reassignment tool to move all existing topic partitions off of the old brokers and onto the new ones, as data will not automatically be replicated.
Alternatively, you could try shutting down the CDK cluster, backing up the data directories onto new brokers, then starting the same version of Kafka from your CDK on those new machines (as the stored log format is important).
Also make sure that you back up a copy of the server.properties files for the new brokers.
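The partition-reassignment step can be sketched as follows; the topic name, new broker IDs, and ZooKeeper address are placeholders, and Kafka 1.0.x still takes the --zookeeper flag:

```shell
# List the topics whose partitions should move to the new brokers
cat > topics.json <<'EOF'
{"version":1,"topics":[{"topic":"my-topic"}]}
EOF

# Generate a candidate plan that places replicas only on the new brokers (IDs 5-8 here)
bin/kafka-reassign-partitions.sh --zookeeper zk:2181 \
  --topics-to-move-json-file topics.json \
  --broker-list "5,6,7,8" --generate

# Save the "Proposed partition reassignment configuration" JSON it prints
# as plan.json, then apply it:
bin/kafka-reassign-partitions.sh --zookeeper zk:2181 \
  --reassignment-json-file plan.json --execute

# Re-run with --verify until every partition reports the reassignment completed
bin/kafka-reassign-partitions.sh --zookeeper zk:2181 \
  --reassignment-json-file plan.json --verify
```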

Migrating topics, ACLs, and messages from Apache Kafka to Confluent Platform

We are migrating our application from Apache Kafka to Confluent Platform.
Apache Kafka version: 1.1.0
Confluent Platform: 4.1.0
Tried these options:
- Manually copying the ZooKeeper logs and Kafka logs: not an optimal way because of the data volume and correctness concerns.
- MirrorMaker: this will replicate newly created topics and ACLs, but it will not migrate the data that already exists in Apache Kafka.
Please suggest better approaches for this.
You can keep your existing Kafka and Zookeeper installation.
Confluent does not change how these run or manage data in any way.
You can configure the REST Proxy, Schema Registry, Control Center, KSQL, etc. to use your existing bootstrap servers or Zookeeper connection; nothing should need to be migrated, you're only adding extra consumer/producer services that happen to be provided by Confluent.
If you later plan on upgrading your brokers, then you can start up new ones from the Confluent package, migrate the partitions, then shut down the old ones. Similarly for Zookeeper, but make sure that you have at least two up during this process, and always have an odd number of them available after your transition.
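As an illustration, pointing Schema Registry at an existing cluster only takes a few kafkastore properties; the host names below are placeholders for your existing brokers:

```properties
# schema-registry.properties (hypothetical hosts; point at your existing brokers)
listeners=http://0.0.0.0:8081
kafkastore.bootstrap.servers=PLAINTEXT://broker1:9092,PLAINTEXT://broker2:9092
kafkastore.topic=_schemas
```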

Kafka logs configuration is not picked up when starting Kafka via Confluent CLI

I am trying to upgrade from the apache kafka to the confluent kafka
As the storage of the temp folder is quite limited, I have changed log.dirs in server.properties to a custom folder:
log.dirs=<custom location>
Then I try to start the Kafka server via the Confluent CLI (version 4.0) using the command below:
bin/confluent start kafka
However, when I check the Kafka data folder, the data is still persisted under the temp folder instead of the customized one.
I have tried starting the Kafka server directly, without the Confluent CLI:
bin/kafka-server-start etc/kafka/server.properties
and then the config is picked up properly.
Is this a bug with the Confluent CLI, or is it supposed to work this way?
I am trying to upgrade from the apache kafka to the confluent kafka
There is no such thing as "confluent kafka".
You can refer to the Apache or Confluent Upgrade documentation steps for switching Kafka versions, but at the end of the day, both are Apache Kafka.
On a related note: You don't need Kafka from the Confluent site to run other parts of the Confluent Platform.
The confluent command, though, reads its own embedded config files for running on localhost only, and is not intended to integrate with external brokers/ZooKeepers.
Therefore, kafka-server-start is the production way to run Apache Kafka.
Confluent CLI is meant to be used during development with Confluent Platform. Therefore, it currently gathers all the data and logs under a common location in order for a developer to be able to easily inspect (with confluent log or manually) and delete (with confluent destroy or manually) such data.
You are able to change this common location by setting
export CONFLUENT_CURRENT=<top-level-logs-and-data-directory>
and find out which location is in use at any time with:
confluent current
The rest of the properties are used as set in the various .properties files for each service.

DCOS/Mesos Kafka command to increase partition

I have a Kafka cluster running on Mesos. I'm trying to increase the number of partitions on a topic. That usually works with the bin/kafka-topics.sh --alter command. Is this option exposed via the dcos CLI or the kafka-mesos REST API? From what I can see, it's not exposed.
If not, what is the best way to access kafka's cli within mesos installation?
Right now I use the dcos CLI to get a broker IP and then, in an ad hoc way, get to
/var/lib/mesos/slave/slaves/11aaafce-f12f-4aa8-9e5c-200b2a657225-S3/frameworks/11aaafce-f12f-4aa8-9e5c-200b2a657225-0001/executors/broker-1-7cf26bed-aa40-464b-b146-49b45b7800c7/runs/849ba6fb-b99e-4194-b90b-8c9b2bfabd7c/kafka_2.10-0.9.0.0/bin/kafka-console-consumer.sh
Is there a more direct way?
We've just released a new version of the Kafka framework with DC/OS 1.7. The new version supports changing the partition count via dcos kafka [--name=frameworkname] topic partitions <topicname> <count>.
See also: Service documentation ("Alter Topic Partition Count" reference)