Upgrading Kafka client from 0.8.2.0 to 0.11.0.0

Currently, at my company we are migrating from Kafka 0.8 to 0.11. The broker migration steps are clearly stated in the Kafka documentation here.
What I am stuck on is upgrading the Kafka clients (producers, consumers, spark-streaming). I can't find any documentation or articles clearly listing the required changes or steps to follow to upgrade the clients; all I found is the Javadoc for the Producer Client.
What I did so far was change the Kafka client version in my Gradle build to kafka-clients-0.11.0.0, and everything compiled fine with no code changes at all.
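For illustration, a plain new-producer-API snippet like the following compiles against both 0.8.2 and 0.11.0 of kafka-clients, which is why a pure version bump can build cleanly (broker address and topic are placeholders):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class PlainProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");  // placeholder address
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            // The same API surface exists in 0.8.2 and 0.11.0,
            // hence no code changes on a pure dependency bump
            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("my-topic", "key", "value"));
            }
        }
    }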
What I'm seeking help with is: are there any expected problems I should take care of, or any pointers for client changes other than the kafka-clients version?

I went through lots of experiments to get this done.
For the consumers and producers, I just used the 0.11.0 Kafka consumer and producer clients.
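A minimal sketch of the 0.11-style consumer side (address, group id and topic are placeholders):

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class PlainConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");  // placeholder
            props.put("group.id", "my-group");               // placeholder
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("my-topic"));
                while (true) {
                    // poll(long) is the 0.11 signature; poll(Duration) arrived later, in 2.0
                    ConsumerRecords<String, String> records = consumer.poll(100);
                    for (ConsumerRecord<String, String> record : records) {
                        System.out.printf("offset=%d key=%s value=%s%n",
                                record.offset(), record.key(), record.value());
                    }
                }
            }
        }
    }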
The tricky part was replacing spark-streaming: the latest spark-streaming version only supports up to Kafka 0.10.x, which doesn't contain any of the updates related to the new broker.
What I recommend here: if you are about to write an application from scratch and your main goal is real-time streaming, go for the Kafka Streams API, it is just AWESOME! If you already have a Spark Streaming app (which was my case), you have to judge which is more important: staying stuck on the Kafka 0.10.x broker so you can keep Spark Streaming (which was experimental, btw), or upgrading the broker and replacing the Spark Streaming part.
The benefits of having the streaming inside Kafka rather than Spark are the following:
Kafka Streams is a normal JAR that can be injected into any Java application, so you don't have to care that much about deployment and environment (see the sketch after this list).
Auto-scaling is so easy when using Kafka Streams with any scale set provided by a cloud service provider, unlike scaling an HDP cluster.
Monitoring with something like Prometheus is much easier.
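As promised above, a sketch of the "normal JAR" point: a complete word-count topology embedded in a plain Java application (topic names and ids are placeholders; this uses the StreamsBuilder API from Kafka 1.0+, while 0.11 still has the older KStreamBuilder):

    import java.util.Arrays;
    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.Produced;

    public class WordCountApp {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-app");    // placeholder
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");  // placeholder
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, String> lines = builder.stream("input-topic");
            lines.flatMapValues(line -> Arrays.asList(line.toLowerCase().split("\\W+")))
                 .groupBy((key, word) -> word)
                 .count()
                 .toStream()
                 .to("output-topic", Produced.with(Serdes.String(), Serdes.Long()));

            // Runs inside this process like any other library code:
            // no cluster submission step as with Spark
            new KafkaStreams(builder.build(), props).start();
        }
    }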

Related

Is Kafka Streams library v1.1.0 forward compatible with Kafka cluster v2.8.1?

I'm running a Kafka Streams application. Here's the current version compatibility:
kafka-streams v1.1.0 (with custom changes on top of it)
kafka cluster v2.1.1
I'm planning to upgrade my Kafka cluster to v2.8.1. Is the same kafka-streams library compatible with the newer Kafka version? I'm not able to find any official compatibility matrix for Streams. I could find one from Confluent, but none from the official Kafka website.
I ran my Kafka Streams app against Kafka v2.8.1 and it does seem to run. I'd like to avoid any surprises at a later point in time, hence I'm looking for any pointers to compatibility or support.
The Kafka Streams API uses the Kafka consumer and producer APIs underneath. The Kafka protocol is backward compatible, so you should not have any problem (i.e., a new Kafka cluster works with old Kafka clients).
Just take into account that starting with Kafka 4.0.0 (not released yet), they are planning to remove the old client protocol API versions. See KIP-896 for more details here: https://cwiki.apache.org/confluence/display/KAFKA/KIP-896%3A+Remove+old+client+protocol+API+versions+in+Kafka+4.0

MirrorMaker2 compatibility with Kafka 2.0.0

Trying to understand the compatibility of the MM2 libraries packaged as part of the core Kafka 2.4 release with a Kafka cluster running Kafka core version 2.0.0.
The major improvements to MM2 were implemented as part of KIP-382 and merged via this PR.
In this PR I don't see any changes being made to core Kafka; the changes are only in the Kafka Connect and MirrorMaker related code.
As far as I know, Kafka Connect is a client application to Kafka (similar to any standard consumer or producer application), and MM2 is built on top of the Kafka Connect architecture.
Though the MirrorMaker2 libraries are bundled with the core Kafka libraries (i.e., the Kafka 2.4 release packages) as part of the release cycle, I think MM2 doesn't have an absolute dependency on the same version of the Kafka broker (i.e., one built with Kafka 2.4), the reason being that MM2 is a client-side component.
Is it valid to assume that we can set up MM2 using the Kafka core 2.4 release packages to work with a Kafka broker set up with Kafka 2.0.0?
Any responses, comments or any materials in this regard are highly appreciated. Thank you.
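For context, MM2 in the 2.4 release is driven entirely from the client side: you describe the clusters in a properties file and run it with the tooling shipped in the 2.4 package, independently of the brokers' own versions. A sketch (cluster aliases and addresses are illustrative):

    # mm2.properties -- illustrative values
    clusters = source, target
    source.bootstrap.servers = source-broker:9092
    target.bootstrap.servers = target-broker:9092

    # replicate all topics from source to target
    source->target.enabled = true
    source->target.topics = .*

This is launched with bin/connect-mirror-maker.sh mm2.properties from the 2.4 distribution; nothing in it is deployed on the brokers themselves.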

How to upgrade Apache Kafka 2.0 to Apache Kafka 2.6 in running environment?

We are using Apache Kafka 2.0 in our production environment and are now planning to upgrade the Kafka version from 2.0 to 2.6.
We are running a three-broker cluster setup.
I have the questions below.
1) Is it possible to upgrade Kafka from one version to a higher version?
2) Can any data loss happen while upgrading?
3) Is it possible to perform the upgrade while the cluster is running?
4) How do we roll back to the lower version if something goes wrong?
Can you share your thoughts on these questions? It would be helpful for our setup.
Yes, upgrades are possible - http://kafka.apache.org/26/documentation.html#upgrade
Data that's already written to the topics shouldn't get lost if you follow the guide. Active clients might experience network exceptions, retries, and potential dropped packets while individual brokers are restarting.
A rolling upgrade is possible to prevent downtime.
Depending on the exact version, rollbacks are not possible due to internal log format changes (as indicated in the documentation)
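For reference, the rolling-upgrade procedure in the linked guide boils down to a couple of broker config settings; a sketch of the server.properties changes, assuming the documented 2.0 -> 2.6 path (the message-format line only matters if you had overridden it before):

    # Step 1: before swapping in the 2.6 binaries, pin the current versions
    inter.broker.protocol.version=2.0
    log.message.format.version=2.0

    # Step 2: upgrade and restart the brokers one at a time (rolling restart)

    # Step 3: once every broker runs 2.6, bump the protocol version
    # and do one more rolling restart
    inter.broker.protocol.version=2.6

This also explains the rollback answer: a downgrade is only straightforward before step 3; once inter.broker.protocol.version has been bumped, the guide states it is no longer possible.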

Kafka - Confluent Hub - Exploit only part of it

I already saw a similar question on SO, but it doesn't clearly answer my doubts.
We have different Kafka clusters and a lot of operational habits around them: our own way to start/stop the clusters, lots of operations scripts that help maintain them, etc.
Now we would like to use Kafka Connect connectors for new needs, but from what I saw, Kafka Connect is extremely coupled to Confluent Hub.
It's like I can't even use the connectors without installing a fully operational Confluent Hub.
This makes it very difficult for us to use Kafka Connect connectors. I understand that Confluent Hub might be a framework that helps run those connectors, but it's like we can't even use a dissociated Kafka cluster (one not operated through Confluent Hub).
But maybe I'm missing something...
Do you know if there is any way to properly use Kafka connectors on an already existing Kafka cluster, completely independent from Confluent Hub?
EDITED:
It's more a question regarding the tight coupling between Confluent Hub and Kafka Connect. All the features that come with Kafka Connect (distributed workers to handle different failover scenarios, etc.) seem unusable without Confluent Hub, hence a "need" to have the Kafka cluster running exclusively via Confluent Hub, which is not an easy task when you already have an existing big Kafka cluster with lots of ops habits around it.
Kafka Connect is part of Apache Kafka. It's a pluggable framework for streaming integration between systems in and out of Kafka.
To use Kafka Connect you need connectors for the specific technology with which you want to integrate. For example, S3 sink, Elasticsearch sink, JDBC source or sink, and so on.
The connector API is part of Apache Kafka, and available for anyone who wants to develop a connector.
Connectors are written by various people and organisations, and are available in various different ways. How you obtain a connector depends on which connector you want, how it's licensed, and how the author has made it available for distribution. It could be that you go to GitHub, clone the repo and build the JAR. It could be that you can download the JAR directly.
All that Confluent Hub does is make lots of these connectors available for you in one place, easily searchable, and with an optional CLI tool that will install them for you.
Do you have to use Confluent Hub? No, not at all. Might it make your life easier in locating connectors that you want to use, and make it easier to install them? Hopefully :)
Disclaimer: I work for Confluent.
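Concretely, running connectors against your own, pre-existing cluster only takes a plain Apache Kafka Connect worker pointed at it, with the connector JARs (obtained from GitHub, a direct download, or wherever) dropped on the worker's plugin.path. A sketch of a distributed worker config (addresses, topic names and paths are illustrative):

    # connect-distributed.properties -- illustrative values
    bootstrap.servers=your-broker:9092
    group.id=connect-cluster
    key.converter=org.apache.kafka.connect.json.JsonConverter
    value.converter=org.apache.kafka.connect.json.JsonConverter
    config.storage.topic=connect-configs
    offset.storage.topic=connect-offsets
    status.storage.topic=connect-status

    # directory where the connector JARs live -- no Confluent Hub involved
    plugin.path=/opt/connect-plugins

Started with bin/connect-distributed.sh connect-distributed.properties from the plain Apache Kafka distribution, this gives you the distributed workers and failover behaviour with no Confluent tooling at all.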

Kafka broker 1.1.0, clients using API 0.10.2

Should we update our Scala Kafka client library dependency (currently 0.10.2) to match the Kafka version of the broker (v1.1.0) ?
The Kafka 0.10.2 documentation mentions:
"Starting with version 0.10.2, Java clients (producer and consumer) have acquired the ability to communicate with older brokers. Version 0.10.2 clients can talk to version 0.10.0 or newer brokers."
Are there any adverse effects when the client API version lags behind the server version? More importantly, can we safely update our Kafka client library from 0.10.2 to 1.1.0?
While the brokers are now compatible with older clients, there are a few drawbacks to using older clients.
The main one is message conversion. Between 0.10.2 and 1.1, the record format changed. So, by default, older clients will force the brokers to convert messages when producing and consuming. Conversion is pretty memory intensive and has a performance cost. See http://kafka.apache.org/documentation/#upgrade_11_message_format
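If you have to keep 0.10.2 clients around for a while after upgrading the brokers, the upgrade notes linked above describe pinning the on-disk message format so the brokers don't have to down-convert for old consumers (broker server.properties, illustrative):

    # keep records in the 0.10.2 format until the clients are upgraded
    log.message.format.version=0.10.2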
Then, obviously, old clients are unable to use new features. Between 0.10.2 and 1.1, there are a ton of nice features like exactly-once semantics, better authentication feedback on failure, admin operations, etc.
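For example, exactly-once semantics needs both a 0.11+ broker and a 0.11+ client; with the brokers already on 1.1, it's the old client that is the blocker, since the transactional producer API simply doesn't exist in 0.10.2. A minimal sketch (broker address, transactional id and topic are placeholders):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class TransactionalProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");  // placeholder
            props.put("transactional.id", "my-txn-id");      // placeholder; implies idempotence
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                producer.initTransactions();
                producer.beginTransaction();
                producer.send(new ProducerRecord<>("my-topic", "key", "value"));
                producer.commitTransaction();  // none of these methods exist on a 0.10.2 client
            }
        }
    }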