Kafka log4j 1.x Obsolete Vulnerability - apache-kafka

As we all know, Kafka still ships log4j 1.x jar files even though log4j 1.x reached end of life in 2015 and is no longer supported. This leaves Kafka with an obsolete, vulnerable dependency.
Is there any way to replace log4j 1.x in current Kafka (Docker images), or is there development work going on to replace log4j 1.x? If so, is there any ETA from the Kafka team?

Yes, there is active work to replace log4j, if it has not already been done; please search the Kafka JIRA.
Also refer to the Kafka CVE list - https://kafka.apache.org/cve-list
And Confluent's own announcements - https://support.confluent.io/hc/en-us/articles/4412615410580-December-2021-Log4j-Vulnerabilities-Advisory
For out-of-support versions of Kafka that you're not willing to upgrade, you can try replacing log4j directly with reload4j (I've not tried it myself), but of course this will not handle nested dependencies very accurately.
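As a minimal illustration of why the swap can work (the class name here is hypothetical): reload4j is a drop-in fork of log4j 1.2.x that keeps the org.apache.log4j package names, so existing 1.x logging code and log4j.properties files should keep working once the jar on the classpath is replaced.

// Hypothetical check class: this compiles against log4j 1.x and runs unchanged
// against reload4j, because reload4j preserves the org.apache.log4j packages.
import org.apache.log4j.Logger;

public class ReloadCheck {
    private static final Logger LOG = Logger.getLogger(ReloadCheck.class);

    public static void main(String[] args) {
        LOG.info("logging through the log4j 1.x API, backed by reload4j");
    }
}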

Related

Why does the Strimzi Kafka operator have supported Kafka versions?

Why does the Strimzi Kafka operator have a list of supported Kafka versions? Why do I care about this if the Kafka version is being managed by the operator?
Is this only mentioned for client support?
The Apache Kafka versions supported by the different Strimzi versions are listed on the Strimzi website. Supported in this case means the versions for which we ship container images and which we have tested. There are several reasons why we don't support more versions:
While you might not care which version of Kafka is being managed, the operator still does: it encodes the operational knowledge, so it needs to understand what it operates.
Like any other software, Apache Kafka evolves: APIs (for example around the Admin APIs) and configurations change (e.g. new options are added in different versions, and the operator needs to understand them to validate or update them). So supporting old versions is not easy without added code complexity.
We have limited resources to build and test the software, both in terms of contributors and CI resources to run the build and test pipelines.
The current Strimzi commitment on which Kafka versions it supports is listed here. If you are interested, you can always join the project and help make things better. Since Strimzi is open source, you can also try to add other Kafka versions yourself and build and test them.
Kafka consumers and producers normally have very good backwards/forwards compatibility, so you do not necessarily need to use the same client version as the brokers.

How to upgrade Apache Kafka 2.0 to Apache Kafka 2.6 in a running environment?

We are using Apache Kafka 2.0 in our production environment and are now planning to upgrade from 2.0 to 2.6.
We are running a three-broker cluster.
I have the questions below.
1) Is it possible to upgrade Kafka from one version to a higher version?
2) Can any data loss happen while upgrading?
3) Is it possible to perform the upgrade while the cluster is running?
4) How do we roll back to the lower version if something goes wrong?
Can you share your thoughts on these questions? It would be helpful for our setup.
Yes, upgrades are possible - http://kafka.apache.org/26/documentation.html#upgrade
Data that's already written to the topics shouldn't get lost if you follow the guide. Active clients might experience network exceptions, retries, and potentially dropped requests while individual brokers are restarting.
A rolling upgrade is possible to prevent downtime (see the sketch after this list).
Depending on the exact versions, rollbacks are not possible due to internal log format changes (as indicated in the documentation).
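As a rough sketch of the documented two-phase rolling upgrade (the property names are from the upgrade guide; the version values follow this question), each broker's server.properties changes like this:

# Phase 1: before swapping binaries, pin the current versions on every broker
inter.broker.protocol.version=2.0
log.message.format.version=2.0
# ...then restart the brokers one at a time on the 2.6 binaries.
# Phase 2: once all brokers run 2.6, bump the protocol and roll-restart again
inter.broker.protocol.version=2.6
# Optional, once you are certain you will not roll back:
log.message.format.version=2.6

The rollback window closes in phase 2: once inter.broker.protocol.version (and especially log.message.format.version) has been bumped, downgrading the brokers is no longer possible.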

Kafka Streams client library compatibility with kafka broker version

I am using the Kafka client & Streams library version 2.7.0 to build my application. However, the Kafka brokers (two different clusters) are on older versions (2.4.1 & 2.6.0).
As I understand it, we can use the latest client & Streams library and it should run fine with older versions of the Kafka brokers. Am I correct? Is there any compatibility matrix between the client & Streams library and the Kafka brokers?
I tried running my application (with the 2.7.0 client library) in a local environment (with Kafka 2.6.0) and it worked fine, but I wanted to confirm the supported compatibility between them.
Update: As onecricketeer has helpfully pointed out, you can refer to the Kafka Compatibility Matrix. He also notes:
There is a general answer: clients above 0.10.2 work with brokers down to that version for all basic functionality until stated otherwise. Extra functionality includes transactions/idempotence and record headers, which Spring may depend on, but on which Kafka Streams natively has no dependency.
Additionally, the upgrade section of the Kafka documentation provides guidance on the upgrade order for various Kafka versions.
The compatibility matrix provided by the spring-cloud-stream project may also be of assistance.
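As a minimal sketch (the application id, bootstrap address, and topic names are placeholders), a pass-through topology built with the 2.7.0 Streams library needs nothing version-specific to run against a 2.4.1 or 2.6.0 broker, since the client negotiates supported API versions when it connects:

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class CompatCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "compat-check");   // placeholder id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092"); // e.g. the 2.4.1 or 2.6.0 cluster
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // Pass-through topology: read from one topic, write to another.
        StreamsBuilder builder = new StreamsBuilder();
        builder.<String, String>stream("input-topic").to("output-topic"); // placeholder topics

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}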

Wrong package reference for TopicNameMatches class in both Apache and Confluent kafka documentation

I tried the Kafka Connect transform predicate examples with the Debezium connector for MS SQL and ran into an issue with the Kafka Connect documentation. The examples in both sets of documentation mention the wrong org.apache.kafka.connect.predicates.TopicNameMatches instead of the correct org.apache.kafka.connect.transforms.predicates.TopicNameMatches:
http://kafka.apache.org/documentation.html#connect_predicates
https://docs.confluent.io/platform/current/connect/transforms/regexrouter.html#predicate-examples
predicates=IsFoo
predicates.IsFoo.type=org.apache.kafka.connect.predicates.TopicNameMatches
predicates.IsFoo.pattern=foo
while in both distributions the package is the same:
package org.apache.kafka.connect.transforms.predicates;
https://github.com/a0x8o/kafka/blob/master/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/predicates/TopicNameMatches.java
https://github.com/confluentinc/kafka/blob/master/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/predicates/TopicNameMatches.java
Should a KIP for a documentation improvement then be issued for both?
You are correct: it really is a mistake.
For the Apache Kafka docs I already made a fix, but I don't know why it hasn't been applied (I asked about it in the PR).
Update: the fix will be applied in release 2.8.
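For reference, the corrected example with the full transforms package reads:

predicates=IsFoo
predicates.IsFoo.type=org.apache.kafka.connect.transforms.predicates.TopicNameMatches
predicates.IsFoo.pattern=foo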

Upgrading Kafka client from 0.8.2.0 to 0.11.0.0

Currently, at my company we are migrating from Kafka 0.8 to 0.11; the broker migration steps are clearly stated in the Kafka documentation here.
What I am stuck on is upgrading the Kafka clients (producers, consumers, spark-streaming). I can't find any documentation/articles clearly listing the required changes or steps to follow to upgrade the clients; all I found is the Javadoc for the Producer client.
What I did so far is change the Kafka client version in my Gradle build to kafka-clients-0.11.0.0, and from a compilation point of view everything went fine with no code changes at all.
What I am seeking help with is: are there any expected problems I should take care of, or any pointers for client changes other than the kafka-clients version?
I went through a lot of experimentation to get this done.
For the consumers and producers, I just used the Kafka consumers and producers from 0.11.0.
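As a minimal sketch (the broker address and topic are placeholders): the producer API itself is unchanged, which matches the experience above of compiling with no code changes. The notable 0.11 addition is the idempotent producer, which requires 0.11 brokers, so enable it only after the cluster upgrade is complete.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class UpgradedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092"); // placeholder address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // New in 0.11: idempotent (exactly-once per partition) writes; needs 0.11 brokers.
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("demo-topic", "key", "value")); // placeholder topic
        }
    }
}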
The tricky part was replacing spark-streaming: the latest spark-streaming version only supports up to Kafka 0.10.x, which doesn't contain any of the updates related to the new broker.
What I recommend here: if you are about to write an application from scratch and your main goal is real-time streaming, go for the Kafka Streams API, it is just AWESOME! If you already have a Spark streaming app (which was my case), you have to judge which matters more: staying stuck on the Kafka 0.10.x broker with spark-streaming (which was experimental, btw) or moving to Kafka Streams.
The benefits of having the streaming inside Kafka rather than Spark are the following:
Kafka Streams is a normal jar that can be embedded in any Java application, so you don't have to care much about deployment and environment.
Auto-scaling is easy with Kafka Streams using any scale set provided by a cloud service provider, unlike scaling an HDP cluster.
Monitoring with something like Prometheus is much easier.