Kafka log configuration is not picked up when starting Kafka via the Confluent CLI - apache-kafka

I am trying to upgrade from Apache Kafka to Confluent Kafka.
Since the storage of the temp folder is quite limited, I have changed log.dirs in server.properties to a custom folder:
log.dirs=<custom location>
Then I try to start the Kafka server via the Confluent CLI (version 4.0) using the command below:
bin/confluent start kafka
However, when I check the Kafka data folder, the data is still persisted under the temp folder instead of the customized one.
I have tried starting the Kafka server directly, without using the Confluent CLI:
bin/kafka-server-start etc/kafka/server.properties
and then the config is picked up properly.
Is this a bug in the Confluent CLI, or is it supposed to work this way?

I am trying to upgrade from Apache Kafka to Confluent Kafka.
There is no such thing as "Confluent Kafka".
You can refer to the Apache or Confluent Upgrade documentation steps for switching Kafka versions, but at the end of the day, both are Apache Kafka.
On a related note: You don't need Kafka from the Confluent site to run other parts of the Confluent Platform.
The confluent command, though, will read its own embedded config files for running on localhost only, and is not intended to integrate with external brokers/zookeepers.
Therefore, kafka-server-start is the production way to run Apache Kafka.

The Confluent CLI is meant to be used during development with Confluent Platform. Therefore, it currently gathers all the data and logs under a common location so that a developer can easily inspect (with confluent log or manually) and delete (with confluent destroy or manually) such data.
You can change this common location by setting:
export CONFLUENT_CURRENT=<top-level-logs-and-data-directory>
and check which location is in use at any time with:
confluent current
The rest of the properties are used as set in the various .properties files for each service.
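For example, a minimal sketch (the directory below is just an illustration; pick any location with enough space):
export CONFLUENT_CURRENT=/data/confluent
bin/confluent start kafka
confluent current   # prints the directory the CLI is actually using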


In Kafka DNS configuration, are there any source applications to look at?

Environment
Apache Kafka 2.7
Apache Flume 1.9.0
Problem
I'm going to set up a Kafka cluster with version 2.7.
Using the DNS attribute (use_all_dns_ips), I want to perform traffic switching by splitting the DNS between the two clusters in case of DR.
But right now I'm using the Apache Flume tailing source, and I cannot apply the DNS attribute values in Flume because the Flume channel doesn't support DNS properties.
I also looked for a Kafka source connector, but in the official version there seems to be only a spooldir version and no tailing-source version.
Is there any tailing-source application you can recommend that would let me add a DNS attribute?
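For reference, the client setting in question looks like this in a plain Kafka client configuration (a minimal sketch; the bootstrap hostname is just an example, and the setting is available in Kafka clients 2.1 and later):
bootstrap.servers=dr-alias.example.com:9092
client.dns.lookup=use_all_dns_ips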

Where do user-supplied Kafka connectors live?

We've got a managed Kafka setup (Confluent Platform, Kafka Connect 5.5.1), streaming data from ~40 topics across 8 to 10 connectors. A few weeks ago I noticed that, for some of those topics, we don't have any consumers assigned. The consumers that should be reading from or writing to those topics are ones our org has written, and they have not changed in months.
Looking through our connector hosts (AWS EC2 instances), I actually cannot see where our connector JAR files exist, which surprises me a lot. We've got all the other connectors there, and when I used confluent-hub to install the BigQuery connector, it was put under /usr/share/java as one would expect.
Where should home-grown connectors live on the filesystem?
For the record, when I query :8083 using the appropriate calls, I can see the connector, and it does have an allegedly running task.
They are picked up from the Java CLASSPATH and from the plugin.path setting.
As for where they should exist: anywhere the user account running the Connect process has permission to read those files.
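As an illustration (the paths here are hypothetical; plugin.path is a standard Connect worker setting, and the worker's REST API lists the plugins it actually loaded):
plugin.path=/usr/share/java,/opt/our-connectors
curl -s http://localhost:8083/connector-plugins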

How to use Kafka with Neo4j community edition

I installed Neo4j and I can access the server. I can create nodes through Cypher.
Now I want to use it for data streams, but I'm not sure how to do so. I just started with Neo4j and I'm struggling with installing the Streams plugin.
Any help is highly appreciated.
You should copy the jar files for the Neo4j Streams plugin directly into your /plugins folder and configure the connections to Kafka and Zookeeper, as well as the other Neo4j property values, in the neo4j.conf file as described here. For example:
kafka.zookeeper.connect=zookeeper-host:2181
kafka.bootstrap.servers=kafka-host:9092
Alternatively, if you are only looking for a sink connection from Kafka (i.e. moving records from Kafka topics into Neo4j), you can also use Kafka Connect with the supported Kafka Connect Neo4j Sink. More at https://www.confluent.io/hub/neo4j/kafka-connect-neo4j
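As a rough sketch of that alternative (the connector class and property names below are assumptions that may differ between connector versions, so check the Confluent Hub page above for the exact keys), you would register the sink through the Kafka Connect REST API along these lines:
curl -X POST http://localhost:8083/connectors -H "Content-Type: application/json" -d '{
  "name": "neo4j-sink",
  "config": {
    "connector.class": "streams.kafka.connect.sink.Neo4jSinkConnector",
    "topics": "my-topic",
    "neo4j.server.uri": "bolt://neo4j-host:7687",
    "neo4j.authentication.basic.username": "neo4j",
    "neo4j.authentication.basic.password": "<password>",
    "neo4j.topic.cypher.my-topic": "MERGE (n:Record {id: event.id})"
  }
}'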

Using confluent cp-schema-registry, does it have to talk to the same Kafka you are using for producers/consumers?

We already have Kafka running in production, and unfortunately it's an older version, 0.10.2. I want to start using cp-schema-registry from the community edition of Confluent Platform. That would mean installing the older 3.2.2 image of Schema Registry for compatibility with our old Kafka.
From what I've read in the documentation, it seems that Confluent Schema Registry uses Kafka as its backend for storing its state, but the clients that are producing to or reading from Kafka topics talk to Schema Registry independently of Kafka.
So I am wondering if it would be easier to manage in production to run Schema Registry/Kafka/Zookeeper together in one container, independent of our main Kafka cluster. Then I could use the latest version of everything. The other benefit is that standing up this new service component could not cause any unexpected negative consequences for the existing Kafka cluster.
I find the documentation doesn't really explain the pros/cons of each deployment strategy well. Can someone offer guidance on how they have deployed Schema Registry in an environment with an existing Kafka? What is the main advantage of connecting Schema Registry to your main Kafka cluster?
Newer Kafka clients are backwards compatible with Kafka 0.10, so there's no reason you couldn't use a newer Schema Registry than 3.2.
From the docs:
Schema Registry that is included in Confluent Platform 3.2 and later is compatible with any Kafka broker that is included in Confluent Platform 3.0 and later
I would certainly avoid putting everything in one container... That's not how these components are meant to be used, and there's no reason you would need another Zookeeper server.
Having a secondary Kafka cluster only to hold one topic of schemas seems unnecessary when you could store the same information on your existing cluster.
the clients that are producing to/reading from Kafka topics talk to Schema Registry independently of Kafka
Clients talk to both. Only Avro schemas are sent over HTTP before your regular client code reaches the topic. And no, schemas and client data do not have to be part of the same Kafka cluster.
Any time anyone deploys Schema Registry, it's being added to "an existing Kafka"; the only difference is that yours might have more data in it.
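As a minimal sketch of pointing Schema Registry at an existing cluster (the broker address is hypothetical; recent releases use kafkastore.bootstrap.servers, while 3.2-era releases used the Zookeeper-based kafkastore.connection.url setting instead, so check the docs for your version):
listeners=http://0.0.0.0:8081
kafkastore.bootstrap.servers=PLAINTEXT://existing-broker:9092
kafkastore.topic=_schemas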

Migrating topics, ACLs and messages from Apache Kafka to Confluent Platform

We are migrating our application from Apache Kafka to Confluent Platform.
Apache Kafka version: 1.1.0
Confluent: 4.1.0
We have tried these options:
Manually copying the Zookeeper logs and Kafka logs - not an optimal way because of the data volume and data correctness.
MirrorMaker - this will replicate newly created topics and ACLs, but it will not migrate the old data already in Apache Kafka.
Please suggest better approaches for this.
You can keep your existing Kafka and Zookeeper installation.
Confluent does not change the way these run or manage data.
You can configure the REST Proxy, Schema Registry, Control Center, KSQL, etc. to use your existing bootstrap servers or Zookeeper connection; nothing should need to be migrated. You're only adding extra consumer/producer services, which just happen to be provided by Confluent.
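For instance, a minimal sketch of pointing the REST Proxy at the existing cluster (the broker and Schema Registry addresses are hypothetical, and property names may vary by REST Proxy version, so verify against the docs; the other Confluent services have analogous bootstrap-server or Zookeeper settings):
bootstrap.servers=PLAINTEXT://existing-broker:9092
schema.registry.url=http://schema-registry-host:8081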
If you later plan on upgrading your brokers, you can start up new ones from the Confluent package, migrate the partitions, then shut down the old ones. Similarly for Zookeeper, but make sure that you have at least 2 up during this process, and always keep an odd number of them available after your transition.
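If you do go that broker-replacement route, a rough sketch of the partition-migration step with the standard reassignment tool (the topic name and new broker IDs 4,5,6 are hypothetical; with Kafka 1.1 the tool still connects via Zookeeper):
cat > topics-to-move.json <<'EOF'
{"version": 1, "topics": [{"topic": "my-topic"}]}
EOF
bin/kafka-reassign-partitions --zookeeper zk-host:2181 --topics-to-move-json-file topics-to-move.json --broker-list "4,5,6" --generate
# save the proposed assignment it prints to reassignment.json, then:
bin/kafka-reassign-partitions --zookeeper zk-host:2181 --reassignment-json-file reassignment.json --execute
bin/kafka-reassign-partitions --zookeeper zk-host:2181 --reassignment-json-file reassignment.json --verify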