Does zookeeper need to be stopped when cleaning old snapshots using zkCleanup.sh script? - apache-zookeeper

Does zookeeper need to be stopped before deleting old snapshots using zkCleanup.sh script?

No.
Neither "How to purge old zookeeper directory files" by HortonWorks nor "Maintaining a ZooKeeper Server" by Cloudera mentions any need to stop ZooKeeper in order to purge old snapshots.
Since HortonWorks and Cloudera represent most of the commercial installations of Hadoop, I'd treat this as a reliable source of truth.
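For reference, a manual purge can be run against a live server. A minimal sketch, assuming the stock bin/zkCleanup.sh and conf/zoo.cfg that ship with ZooKeeper (the retention count and the idea of enabling autopurge are examples, not values from the question):

# keep the 5 most recent snapshots/transaction logs; dataDir and
# dataLogDir are read from conf/zoo.cfg by the script
./bin/zkCleanup.sh -n 5

# or let the running server purge on its own schedule, via conf/zoo.cfg
autopurge.snapRetainCount=5
autopurge.purgeInterval=24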

Related

zookeeper-server-start.sh actually uses zookeeper jar from /kafka/libs not /zookeeper/lib. How to upgrade zookeeper?

We have installed zookeeper 3.6.2 and Kafka 2.13-2.6.0.
Recently I noticed that zookeeper-server-start.sh is actually using the zookeeper jar from /kafka/libs/, which is zookeeper-3.5.8.jar.
How do I upgrade zookeeper to 3.6.2? Do I have to find a version of Kafka that has it bundled in the tar?
Why do they ask you to download and install zookeeper if the jar is already bundled in Kafka and that is the one that Kafka uses?
You will need to separately install (and start) a Zookeeper cluster outside of the versions and scripts provided by Kafka if you want to upgrade it, but there are risks involved.
Kafka does come bundled with Zookeeper and is tested against specific versions, but it is not upgraded as frequently as the Zookeeper project does releases. You should follow the Kafka upgrade notes before blindly upgrading dependency components; there are notes around version 2.6 that mention situations where Zookeeper upgrades can fail and what settings need to be added to fix them. And unless you can guarantee a Zookeeper server upgrade will be backwards compatible, remember that Kafka still ships a version-specific Zookeeper client for network communications.
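For example, the upgrade notes for the Kafka 2.4–2.6 era describe 3.4.x to 3.5.x Zookeeper server upgrades failing when the 3.4 data directory contains transaction logs but no snapshot file; the documented workaround is a temporary property in the Zookeeper config (shown here as a sketch, verify it against the notes for your exact versions):

# zookeeper.properties / zoo.cfg — let a 3.5.x server start from a
# snapshot-less 3.4.x data directory; remove again once the upgraded
# ensemble has written its own snapshots
snapshot.trust.empty=true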
Worth pointing out that Kafka KRaft (Zookeeper-less mode) is already in its preview release phase with the newest 3.x releases, and is scheduled to be closer to production-ready by the end of the year, so I would hold off on upgrading Zookeeper unless you actually need it for something else.

Which ZooKeeper to use with Apache Kafka?

I have seen that I can either
(1) Use the ZooKeeper that comes with Kafka, or
(2) Use the ZooKeeper from Apache itself.
Which is the preferred method (if there is one), and why? My use case is for a small application, so it will be a 3-node ZooKeeper ensemble/cluster. I am using Windows 10 for my test. The ZooKeeper version I am using is 3.5.6. The Apache Kafka version I am using is 2.12-2.3.0.
Note:
I have tried both ways, i.e. (1) and (2), and both work.
UPDATE:
Found what I was looking for. For use case (2), if I want to use Kafka 3.0.0, ideally I will use it with ZooKeeper 3.6.3, as that's what it has been tested with, as noted here:
ZooKeeper has been upgraded to version 3.6.3.
Kafka is tested against the Zookeeper version it comes with.
If you want to upgrade, you'll need to verify Zookeeper itself is backwards compatible with older clients/protocols that Kafka may use.
It's unlikely that jumping from 3.4.x to 3.5.x is a compatible change, but if you stay within the same minor release, it should be fine.
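If you want to confirm which versions are actually in play, a quick check (the paths and port below are assumptions, adjust them to your install):

# ZooKeeper client jar bundled with the Kafka distribution
ls $KAFKA_HOME/libs/ | grep zookeeper

# version reported by a running ZooKeeper server ('srvr' is in the
# default four-letter-word whitelist on 3.5+)
echo srvr | nc localhost 2181 | grep -i version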

Kafka provided Zookeeper vs separate download

I am setting up Kafka in a production environment. I found that the Kafka tar file comes with ZooKeeper. What are the pros and cons of using the ZooKeeper that is shipped with Kafka vs a compatible version of ZooKeeper downloaded directly from Apache?
What is recommended?

How to start confluent platform?

I have the following problem.
I am using Java 8 for this, and ZooKeeper is not working.
Downloads$ cd confluent-5.2.2/
roshni#roshni-HP-Pavilion-15-Notebook-PC:~/Downloads/confluent-5.2.2$
/home/roshni/Downloads/confluent-5.2.2/bin/confluent start
This CLI is intended for development only, not for production
https://docs.confluent.io/current/cli/index.html
WARNING: Java version 1.8 or 1.11 is recommended.
See https://docs.confluent.io/current/installation/versions-interoperability.html
What you did is correct. The warning suggests you're not using Java 8 or 11, though, so check your JAVA_HOME variable, for example.
You could try running confluent logs to see if you get more information, and just do confluent start kafka if you're truly only trying to run Kafka.
Otherwise, you could explicitly start ZooKeeper and Kafka manually, as well as other Confluent services, using the other scripts in the bin folder, just the same as any other Kafka install.
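As a sketch of that manual route (the script and config paths below are the ones a Confluent 5.x tarball normally lays out; verify them against your download, and the JDK path is a placeholder):

# point the scripts at a supported JDK first
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

# from ~/Downloads/confluent-5.2.2: start ZooKeeper, then a broker
bin/zookeeper-server-start etc/kafka/zookeeper.properties
bin/kafka-server-start etc/kafka/server.properties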

Download Kafka or Confluent Platform in package-distributed CDH 5.16

I have installed CDH 5.16 Express using packages on a RHEL server. I am trying to install Kafka now, and I observed that it can be installed only if CDH is installed via parcels.
1) Is it possible to install Kafka or the Confluent Platform separately on the server and use it along with the CDH components?
2) Is there any other workaround to install Kafka using Cloudera Manager?
In order to use CDK 4.0 (the Cloudera distribution of Kafka) with Cloudera 5.13, I was forced to install it as a parcel.
I had a Cloudera quickstart Docker VM that I downloaded; it runs without Kerberos authentication. After starting the quickstart VM, I separately installed Kafka from the Apache Kafka website. This was required because the Kafka packaged within Cloudera was an older version. Since this was a non-Kerberos environment, the Kafka server on startup used the ZooKeeper that was already running in the quickstart VM. This way I connected Kafka to the Cloudera VM.
If you are new to CDH/CM, I suggest you first try the Kafka service that is bundled within Cloudera. Go to 'Add Service' in the Cloudera drop-down and select Kafka. Enabling this Kafka service will give you a set of brokers for Kafka to run on. Also, Kafka needs ZooKeeper to run, and ZooKeeper comes by default with Cloudera, so you would get a working cluster with Kafka enabled in it. You can think about changing to the latest version of Kafka (using the approach mentioned above) once you are comfortable with the built-in tools of CDH/CM.
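If you do go the standalone route described above, the key wiring is pointing the downloaded broker at the ZooKeeper quorum that CDH already runs. A minimal server.properties sketch (hostnames, ports, log directory, and the chroot path are placeholders, not values from the question):

# config/server.properties for a standalone Apache Kafka broker
broker.id=0
listeners=PLAINTEXT://0.0.0.0:9092
log.dirs=/var/local/kafka/data
# ZooKeeper ensemble managed by Cloudera Manager (placeholder hosts/chroot)
zookeeper.connect=cdh-node1:2181,cdh-node2:2181,cdh-node3:2181/kafka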